Essays on Knowledge Spillovers

by

Jie Cai

B. E. Computer Science, Renmin University of China, 1999
M. A. Economics, Peking University, 2002

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

Doctor of Philosophy

in

THE FACULTY OF GRADUATE STUDIES
(Economics)

The University of British Columbia
(Vancouver)

June 2010

© Jie Cai, 2010

Abstract

This thesis studies three issues involving knowledge diffusion across firms. The first chapter explains two data facts related to firm size distribution. First, it uses sector-specific inter-firm knowledge spillovers to explain sectoral differences in firm size heterogeneity. Greater inter-firm knowledge spillovers in a sector induce firms in that sector to invest relatively more in imitation. Greater imitation also causes faster catch-up by lagging firms and a firm growth rate that declines with firm size. Hence, the sectoral firm size distribution becomes more homogeneous in sectors with greater knowledge spillovers. Second, in a multi-sector version of this environment, I use inter-sector knowledge spillovers to explain the observed dependent Pareto size distributions in every subset of the economy. I test the model using patent citation data and find support for both its sectoral and aggregate predictions.

The second chapter rationalizes firms' motivation to build directed links with each other and formalizes the dynamic formation process that generates the observed network structure, including triple power-law degree distributions, in the patent citation networks. Networks allow firms to become more specialized without losing customers, because having more firms in the market results not only in competitors but also in potential partners who redirect customers. Using firm citation panel data from the NBER Patent Citation Database, I estimate the model's parameters and simulate networks that exhibit structural features similar to those of the corresponding real networks.
The third chapter documents a new empirical fact, that larger firms update information faster than smaller firms in patent citation data, and addresses its macroeconomic implications. In a model with size-dependent reaction time lag and Pareto firm size distribution, the gradual spread of a firm-level technology shock generates a persistent and hump-shaped aggregate output growth rate. Greater information heterogeneity across firms de-synchronizes the co-movement among firms of different sizes, and hence causes a less volatile, smoother and longer aggregate business cycle. The model is well suited to explaining several timing relations of the business cycle. For example, productivity dispersion is pro-cyclical, the top firm's growth rate predicts future GDP growth, and investment leads hiring over the business cycle.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgments
Dedication
1 Knowledge Spillovers and Firm Size Heterogeneity
  1.1 Introduction
    1.1.1 Literature Review
  1.2 One-Sector Model
    1.2.1 Consumer
    1.2.2 Firms
    1.2.3 Determinants of Firm Size Distribution
    1.2.4 Growth Rate Volatility Decomposition
    1.2.5 General Equilibrium
  1.3 Empirical Results
    1.3.1 Data
    1.3.2 Implication 1: Scale-Independency of Firm Growth Rate
    1.3.3 Implication 2: Determinants of Firm Size Heterogeneity
    1.3.4 Implication 3: Knowledge Spillovers Efficiency and Firm Size Heterogeneity
  1.4 Multi-Sector Model
    1.4.1 Facts about Multi-Sector Firms
    1.4.2 Model
  1.5 Conclusion
2 Dynamic Formation of Directed Networks
  2.1 Introduction
  2.2 The Model
    2.2.1 Differentiated Goods Market
    2.2.2 Methods of Building Networks
    2.2.3 Dynamic Networks Formation Process
  2.3 Welfare
    2.3.1 Risk-Neutral Consumers
    2.3.2 Risk-Averse Consumers
  2.4 Data
    2.4.1 Data Description
    2.4.2 Stylized Facts about Citation Networks
    2.4.3 Degree Distribution Heterogeneity and Ratio r in JR
    2.4.4 Simulation
  2.5 Conclusion
3 Information Heterogeneity by Firm Size and Aggregate Fluctuations
  3.1 Introduction
  3.2 Literature
  3.3 Information Heterogeneity in Patent Citation Data
    3.3.1 Greater Information Heterogeneity after 1980s
  3.4 The Model
    3.4.1 Consumer
    3.4.2 Heterogeneous Firms
    3.4.3 Aggregate Fluctuation under Information Heterogeneity
  3.5 Conclusion and Future Research
Bibliography
Appendix A. Robustness of Sectoral Firm Size Heterogeneity
Appendix B. Detailed One-Sector Model
  B.1 General Equilibrium
Appendix C. Robustness Checks
  C.1 Random Citation Data
  C.2 G7 Country Citation Data
Appendix D. New Imitation Production Function
Appendix E. Related Questions
Appendix F. Technical Details
Appendix G. Simulating Networks for Every Sector
Appendix H. Calculation Detail
Appendix I. Goodness-of-Fit Test for Power-Law Distribution

List of Tables

1.1 OLS Regressions with U.S. Citations
1.2 Summary of Variables
1.3 Correlation between Variables
1.4 Table 2 in Broda and Weinstein (8)
1.5 Multi-Sector U.S. Patent Owners
1.6 Share of Cross-Sector Citations: Example
1.7 Intensity of Cross-Sector Citations: Example
1.8 Example Sector Names
1.9 Correlation between Different Measures of Firm Size Heterogeneity - French 4-Digit NAICS Manufacturing Sectors 1997
1.10 Correlation between Different Measures of Firm Size Heterogeneity - Chilean 4-Digit SITC Manufacturing Sectors 1996
1.11 OLS Regressions - Random Citations
1.12 Summary of Variables - Random Citations
1.13 Correlation between Variables - Random Citations
1.14 OLS Regressions - G7 Citations
1.15 Summary of Variables - G7 Citations
1.16 Correlation between Variables - G7 Citations
2.1 Stylized Facts (a), (d) and (e) for Firm Citation Networks
2.2 Stylized Facts (b) and (c) for Firm Citation Networks
3.1 Cox Proportional Hazards Model 1985-2000; Group Variable: Patent Class

List of Figures

1.1 Firm Size Distribution in Different Sectors - French Manufacturing Firms
1.2 Firm Size Distribution in Different Sectors - U.S. Patent Owners
1.3 Five-Year Firm Growth Rate - French Manufacturing Firms
1.4 Five-Year Firm Growth Rate - U.S. Patent Owners
1.5 Scale-Dependency of Firm Growth Rate - French Manufacturing Firms
1.6 Scale-Dependency of Firm Growth Rate - U.S. Patent Owners
1.7 Scale-Dependency of Imitation Rate and Innovation Rate - French Manufacturing Firms
1.8 Scale-Dependency of Imitation Rate and Innovation Rate - U.S. Patent Owners
1.9 Cross-Firm Citation's Share in Total Citation by Sectors
1.10 Cross-Firm Citation's Share in Total Citation by Year
1.11 Average Citation Distance by Year
1.12 Sectoral Firm Size Heterogeneity in U.S. 1997 and 2002
1.13 Sectoral Firm Size Heterogeneity in France 1997 and 2005
1.14 Sectoral Firm Size Heterogeneity in Chile 1979 and 1996
1.15 Sectoral Firm Size Heterogeneity in U.S. and France 1997
1.16 Sectoral Firm Size Heterogeneity in U.S. and France 2002
1.17 Simulated Cross-Firm Citation's Share in Total Citation by Sectors
1.18 Scale-Independency of Innovation Rate by Sectors
1.19 Scale-Dependency of Imitation Rate by Sectors
1.20 Determinant of Firm Size Heterogeneity
1.21 Cross-Firm Citation's Share in Total Citation by Sectors G7 Countries
2.1 An Example of Firm Citation Networks
2.2 Mu-In and Log(Random Friends/Network-Based Friends)
2.3 Mu-Out and Log(Random Friends/Network-Based Friends)
2.4 Mu-Total and Log(Random Friends/Network-Based Friends)
2.5 CC-TT and Log(Random Friends/Network-Based Friends)
2.6 CC and Log(Random Friends/Network-Based Friends)
2.7 CC-Avg. and Log(Random Friends/Network-Based Friends)
2.8 Mu-In of Real Networks and Mu-In of Simulated Networks
2.9 Mu-Out of Real Networks and Mu-Out of Simulated Networks
2.10 Mu-Total of Real Networks and Mu-Total of Simulated Networks
2.11 CC-TT of Real Networks and CC-TT of Simulated Networks
2.12 CC of Real Networks and CC of Simulated Networks
2.13 CC-Avg. of Real Networks and CC-Avg. of Simulated Networks
2.14 CC of Real Networks and CC of Simulated Networks
2.15 CC-TT and Mu-Out in Real Networks
2.16 CC-TT and Mu-Out in Simulated Networks
3.1 The Changing Ratio of Citation Share to Patent Stock Share
3.2 Information Heterogeneity and Growth Rate IR to Technology Shock
3.3 Figure 7 in Davis, Haltiwanger, Jarmin, and Miranda (19)
3.4 Cross-Correlation between Top Firms' Growth and GDP Growth
3.5 Figure 8 in Davis, Haltiwanger, Jarmin, and Miranda (19)
3.6 Productivity Dispersion after Technology Shock

Acknowledgments

I am heartily thankful to my supervisor, Amartya Lahiri, and my thesis committee members, Paul Beaudry and Patrick Francois, whose encouragement, guidance and support from the initial to the final level enabled me to develop an understanding of the subject. I would like to show my gratitude to the attendees at the UBC Macroeconomics and DIET lunch seminars for their wonderful comments. I am indebted to many of my colleagues who supported and encouraged me during the PhD study. Lastly, I offer my regards and blessings to all those who supported me in any respect during the completion of the thesis.

Dedication

To my parents and husband.

Chapter 1

Knowledge Spillovers and Firm Size Heterogeneity

1.1 Introduction

Increasingly, firm- or establishment-level data show that firm size distributions within narrowly defined sectors and within the overall economy are widely dispersed and follow a Pareto distribution. Two important related questions are not well understood in the literature on firm growth dynamics. First, why is firm size heterogeneity¹ different across sectors? Second, why does a Pareto firm size distribution exist in every subset of the economy, and why are firm size variables in different sectors dependent on each other? This paper uses intra-sector and inter-sector knowledge spillovers, respectively, to answer these two questions.
In a one-sector model, cross-sector differences in firm size heterogeneity can be attributed to sector-specific intra-sector knowledge diffusion efficiency. The one-sector model used in this paper extends the endogenous innovation model of Klette and Kortum (48) by giving firms the option to imitate. In sectors with more abundant knowledge spillovers, firms invest relatively more in imitation, as compared to innovation. Imitation then contributes a greater share to the gross growth rate. Since an equal opportunity to learn provides a stronger impetus to small firms, the firm growth rate drops faster as the firm becomes larger. The sectoral firm size distribution is more homogeneous if small firms have more opportunities to catch up with the leaders. The model implications are confirmed using NBER (National Bureau of Economic Research) Patent Citation Data, which provides a measure of knowledge spillovers and the appropriate information for distinguishing the contributions of imitation and innovation to the firm growth rate. The one-sector model suggests that optimal intellectual property rights depend on the trade-off between a higher imitation rate and a lower private return of knowledge.

¹ Using French, Chilean and U.S. firm-level data, Appendix A shows that sector-specific firm size heterogeneity is robust to different proxies of firm size, stable over time in the same country, and highly correlated across different countries in the same year.

A multi-sector model incorporates two additional facts that are absent in the one-sector model: firms develop products in multiple sectors, and inter-sector knowledge spillovers integrate the firm growth dynamics in all sectors. Firm growth dynamics in any subset of the economy are subject to a similar influence from all sectors, and the firm size distribution therefore converges universally to a Pareto distribution, not only within narrowly defined sectors but in the economy overall.
Beyond its implications for the firm size distribution, the multi-sector model suggests one more channel than the one-sector model through which intellectual property rights protection promotes economic growth. Stronger intellectual property rights directly raise the private return of the sectors that contribute knowledge most intensively more than that of other sectors, therefore attracting a larger share of research investment to the knowledge giver sectors. Indirectly, a better cross-sector R&D resource allocation that is justified by sectoral knowledge externality boosts the economic growth rate.

The one-sector model in this paper extends that of Klette and Kortum (48) by allowing firms to create new goods by imitation, as well as by innovation. The difference between innovation and imitation is that innovation relies on a firm's private knowledge (measured by its current number of goods), while imitation relies on a sector's public knowledge (measured by the average firm size in the sector). Both types of R&D are subject to independent and identically distributed (i.i.d.) shocks, which are necessary to induce the Pareto firm size distribution. Sector-specific knowledge diffusion efficiency is given by a firm's productivity in utilizing private knowledge in innovation and public knowledge in imitation. When it is relatively more efficient to imitate than to innovate, firms invest relatively more in imitation; as a result, the imitation rate contributes a greater share to the overall growth rate for the firm and the entire sector.

A firm's scale-dependent growth rate is the sum of the scale-independent innovation rate and the scale-dependent imitation rate. As specified in the Cobb-Douglas production function of new goods, output is proportional to the input. The innovation rate is independent of firm size, because the number of new goods generated from private knowledge is proportional to the firm's private knowledge capital.
In contrast, the imitation rate decreases with firm size, because the number of imitated new goods is proportional to the public knowledge pool. When imitated output is divided by firm size, smaller firms obtain higher imitation rates. In sectors where imitation accounts for a greater share of the total growth rate, firms' growth rates drop faster as firms become larger. A larger growth rate gap between smaller and larger firms causes faster firm size mean reversion and a more homogeneous firm size distribution.

According to Kesten (44), when the innovation shock and imitation shock are i.i.d. across time and firms, the firm size distribution within the sector converges to a Pareto distribution with scale parameter µ, while in the Klette and Kortum (48) environment without imitation, the firm size distribution converges to a logarithmic distribution. When the innovation risk follows a log-normal distribution, the closed form solution of the firm size heterogeneity measure, 1/µ, increases with the volatility of the innovation shock and decreases with imitation's contribution to the gross growth rate of the sector. Intuitively, the innovation shocks generate the firm size dispersion, while imitation shocks alleviate firms' size differences by allowing firms to learn and catch up.

The one-sector model used here has three testable implications. First, the scale-dependent firm growth rate arises purely from the scale-dependent imitation rate. Specifically, a surviving firm's imitation rate drops as its size increases, while a firm's innovation rate is independent of firm size. Moreover, a firm's growth rate is more scale-dependent in sectors with more abundant knowledge spillovers and more homogeneous firm size. Second, firm size heterogeneity decreases with imitation's contribution to the gross growth rate, and increases with the volatility of innovation risk. Third, knowledge diffuses faster in sectors with a more homogeneous firm size distribution.
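The Kesten-type mechanism behind the second implication can be illustrated with a stylized simulation. The sketch below is not the thesis's model: it assumes that a firm's size relative to the sector average follows s' = (1 - m) * eps * s + m, where `imitation_share` (m) is a hypothetical stand-in for imitation's contribution to gross growth and eps is a mean-one log-normal innovation shock with log-volatility `sigma`. Under these assumptions, Kesten's condition E[A^kappa] = 1 with A = (1 - m) * eps gives the tail index kappa = 1 - 2 ln(1 - m) / sigma^2, so the heterogeneity measure 1/kappa rises with sigma and falls with m. All function and parameter names are illustrative.

```python
import math
import random
import statistics


def kesten_tail_index(imitation_share, sigma):
    """Tail index kappa solving E[A**kappa] = 1 for A = (1 - m) * eps,
    where eps is log-normal with mean one and log-volatility sigma.
    This functional form is an assumption made for illustration."""
    return 1.0 - 2.0 * math.log(1.0 - imitation_share) / sigma ** 2


def simulate_log_size_dispersion(imitation_share, sigma, n_firms=1000,
                                 n_periods=200, seed=0):
    """Iterate s' = (1 - m) * eps * s + m for every firm and return the
    cross-sectional standard deviation of log relative size."""
    rng = random.Random(seed)
    sizes = [1.0] * n_firms
    for _ in range(n_periods):
        sizes = [
            (1.0 - imitation_share)
            * math.exp(sigma * rng.gauss(0.0, 1.0) - 0.5 * sigma ** 2) * s
            + imitation_share
            for s in sizes
        ]
    return statistics.pstdev([math.log(s) for s in sizes])


if __name__ == "__main__":
    for m in (0.1, 0.4):
        kappa = kesten_tail_index(m, sigma=0.3)
        dispersion = simulate_log_size_dispersion(m, sigma=0.3)
        print(f"m={m}: 1/kappa={1.0 / kappa:.3f}, sd(log size)={dispersion:.3f}")
```

Raising `imitation_share` while holding `sigma` fixed should lower both 1/kappa and the simulated dispersion of log size, which is the qualitative pattern the second implication describes.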
The challenge in testing the first two implications is to distinguish the imitation rate from the innovation rate in the total firm growth rate. This is done by differentiating between citations given to the citing firm's old patents (inside citations) and citations given to other firms' old patents (outside citations). A firm's quality adjusted growth rate of patent stock is split into innovation rate and imitation rate according to the ratio between inside citations and outside citations. Each outward citation is weighted by the importance of the cited firm² to control for the quality of knowledge spillovers embodied in each citation. When calculating a firm's quality adjusted growth rate, each new patent is also weighted by its number of quality adjusted citations received in the 10 years after application.

In the regression of innovation (imitation) rate on firm size, the coefficient is rarely (always) statistically significantly different from zero across the 42 sectors, as predicted by the first model implication. This result supports the idea that the imitation rate declines with firm size, while the innovation rate is independent of firm size. Moreover, the regression coefficient of imitation rate on firm size decreases with firm size heterogeneity, which confirms the first model prediction, that the imitation rate is more scale-dependent in sectors with a more homogeneous firm size distribution. With an estimated innovation rate and imitation rate for every firm within a sector, I can derive the sector-level imitation rate's share in the gross growth rate and the variance of the log scale innovation rate. The second implication is also supported by the data: firm size heterogeneity is negatively related to the imitation rate's share in the gross growth rate and positively related to the volatility of innovation risk.

To test the third implication, I employ within-sector patent citations in NBER Patent Citation Data as a measure of intra-sector knowledge spillovers.
I only include citations made by firms located in the United States. The cross-firm knowledge diffusion speed and the percentage of cross-firm citations among all citations are negatively correlated with the firm size distribution heterogeneity. In the regressions, I control for the geographic distance between the citing and the cited patent, the sizes of the citing and cited organizations, and sector size. The knowledge spillover speed is measured by citation time lag³. As a robustness check, the cross-sector difference in knowledge spillovers also holds in the citation data including all G7 countries.

² I measure the importance of a citing firm f in sector s by its hub weight in the firm networks connected by cross-firm patent citations within sector s. The hub weight of each firm is calculated according to Kleinberg (45), which measures a node's ability to absorb information through the networks.

I now turn to the second question: why does firm size distribution follow a Pareto distribution not only within narrowly defined sectors but also in the entire economy? According to Jessen and Mikosch (40), summation or pooling of independent Pareto-distributed variables induces a new Pareto-distributed variable. However, the scale parameter of the new Pareto distribution should be equal to the smallest scale parameter of the component distributions. In contrast, firm-level data (Figure 1.1 and Figure 1.2) show that the scale parameter of the size distribution for the whole economy is in the middle of the range of the component distributions' scale parameters. Therefore, some mechanism must make firm size and firm growth dynamics in different sectors dependent on each other.

The multi-sector model adds two important elements to the one-sector model. First, many firms develop products in multiple sectors.
In the NBER Patent database, every organization, on average, applied for patents in 6.5 out of 42 patent categories; moreover, larger organizations cover more categories (see Table 1.5). Second, inter-sector knowledge spillovers are as important as within-sector knowledge diffusion. Inter-sector citations amount to 37% of total citations in the citation data. In the multi-sector model, when firms invent new products in a single sector, they can apply their private knowledge capital from all sectors. Also, a firm's growth in a single sector is affected by its previous knowledge capital in all sectors. As a result, a firm's overall size, which is the summation of its branches in all sectors, is influenced by its private knowledge capital in all sectors. Since a firm's growth dynamics in any subset of the economy follow a similar formula, firm size distribution converges to a Pareto distribution universally in any subset of the economy.

³ The citation lag, the time difference between the application time of the citing patent and that of the cited patent, indicates the time needed for the knowledge to travel between the citing patent inventor and the cited patent inventor.

The models demonstrate policy implications for optimal intellectual property rights protection (IPRP). In the one-sector model, the growth maximizing degree of protection depends on the trade-off between a higher imitation rate and a lower private return of knowledge. In the multi-sector model, cross-sector resource allocation is another factor in the trade-off that promotes better IPRP. A strengthened IPRP increases the private return of major knowledge contributing sectors more than other sectors, hence encouraging more research investment to enter the knowledge giver sectors. Indirectly, economic growth benefits from a cross-sector resource allocation that agrees with each sector's knowledge contribution to the economy.
Similarly, in the open economy setting of Cai and Li (11), the multi-sector model discovers a new channel through which trade cost harms growth: the distorted R&D resource allocation across sectors, because trade cost reduces the private return of those sectors contributing intensive knowledge spillovers more than other sectors.

1.1.1 Literature Review

In the literature on firm growth dynamics, I have identified two strands in the theoretical debate about sources of firm size heterogeneity. The first strand, represented by the work of Lucas (51), Jovanovic (43), and Klette and Raknerud (49), emphasizes the impact of a manager's various talents in creating permanent differences in firm efficiency. The second strand, elaborated by Hopenhayn (35), Ericson and Pakes (25), Klepper (46), Klette and Kortum (48), Klepper and Thompson (47), and Luttmer (54), contends that firm size dispersion is caused by accumulated idiosyncratic shocks over a firm's life cycle. Seker (64) incorporates both of these strands and tries to distinguish the contribution of each.

In addition to exploring the origins of firm size heterogeneity, the literature on firm size dynamics tries to explain interesting stylized facts observed in firm-level data. First, the firm size distribution follows a Pareto distribution, both within individual sectors and in the entire economy. Meanwhile, the heterogeneity of firm size distribution varies across sectors (see Figure 1.1 and Figure 1.2), as documented by Axtell (4); Helpman, Melitz, and Yeaple (33); Rossi-Hansberg and Wright (63) (RW henceforth); and Luttmer (54). Second, a surviving firm's expected growth rate drops with firm size, or put differently, firm growth rate is scale-dependent⁴.

⁴ See Evans (26); Hall (30); Dunne, Roberts, and Samuelson (22); Sutton (68); Klette and Kortum (48); and Luttmer (54). Again, there are cross-sector differences in growth rate scale dependency, as demonstrated in RW.

Studies take various approaches when modeling the scale-dependent growth rate of surviving firms. Cooley and Quadrini (18), Cabral and Mata (10), Albuquerque and Hopenhayn (2) and Clementi and Hopenhayn (14) show that financial market friction can induce the firm growth rate to decline with firm size. In Klette and Kortum (48), every firm has the same unconditional growth rate, while small firms have a higher growth rate conditional on survival, because they are less likely to survive than large firms. Selection is the key to having a scale-dependent firm growth rate. Klepper and Thompson (47) use creation and de-construction of sub-markets to explain firm size dynamics. The size decrement due to sub-market de-construction is proportional to firm size, while the size increment due to the creation of emerging markets is independent of firm size. As the firm operates in more sub-markets, the expected proportional increase in size declines. RW shows that small firms grow faster because households want to accumulate industry-specific human capital more rapidly in small firms where the marginal return of human capital is still high. Luttmer (54) assumes that new firms enter with a high-quality blueprint, but that the blueprint's quality depreciates and becomes obsolete over time. Hence, firms choose to replicate their blueprints faster when they are smaller and their blueprints' quality is still high.

Only RW pays attention to the cross-sector differences in firm growth rate scale dependence and firm size heterogeneity, which it explains in terms of sector-specific capital intensiveness. The firm growth rate drops faster in more capital intensive or less human capital intensive sectors because the marginal return to human capital decreases faster there. When small firms are more likely to catch up with large firms, firm size distribution becomes more homogeneous.
Capital intensiveness can explain cross-sector differences in firm size heterogeneity for broad sector divisions (for instance, between education, construction and manufacturing), but it fails to explain differences across a more refined division of manufacturing sectors. The patent citation data used in this paper primarily cover manufacturing and show that examining knowledge spillovers efficiency is a more promising way to account for differences across manufacturing sectors.

Luttmer (53) and RW also consider the role of knowledge spillovers in shaping firm size distribution. In Luttmer (53), only entrants learn from incumbents, and as new entrants can learn more from incumbents, the firm size distribution is more homogeneous. In the learning-by-doing (LBD) extension of RW, the industrial total output enters the accumulation function of industry-specific human capital. The authors show that a larger externality in LBD leads to a faster mean reversion in human capital stock and a more homogeneous firm size distribution. The conclusion of the extension is that capital intensiveness and the LBD externality jointly determine firm size dispersion. In addition, the firm size distribution in their paper converges to a log-normal distribution, instead of a Pareto distribution. Unfortunately, unlike this paper, neither of the above two papers provides empirical evidence to support its theoretical predictions on knowledge spillovers.

This paper differs from Luttmer (53) in that it allows every firm, instead of just new entrants, to imitate. It is closer to RW's extension featuring LBD externality, but this paper uses a micro-founded approach, while RW uses a macroeconomic approach. In some sense, knowledge spillovers are one reason for the depreciation of a blueprint's quality in Luttmer (54). Expecting that others will 'steal' its blueprint in the future, the owner of a blueprint chooses to replicate it faster before others imitate it.
No prior research has studied the universal Pareto firm size distribution that is found across all subsets of the economy. In addition, the multi-sector model makes a distinct contribution by providing growth policy suggestions concerning intellectual property rights and trade policy through the cross-sector resource allocation channel.

The remainder of the paper is organized as follows. In section 2, the one-sector model shows that sector-specific intra-sector knowledge diffusion efficiency determines a firm's choice of endogenous innovation and imitation inputs, which in turn affects the firm size heterogeneity. In section 3, I test the implications of the one-sector model with NBER Patent Citation Data. The multi-sector model is presented in section 4, where I show that inter-sector knowledge spillovers integrate growth dynamics in all sectors and induce a Pareto size distribution in all subsets of the economy. Section 5 concludes.

1.2 One-Sector Model

1.2.1 Consumer

The representative consumer faces the following problem:

\[
U = \max_{\{x_{i,t}\}} \sum_{t=0}^{\infty} \rho^{t} \left[\log(C_t)\right] \tag{1.1}
\]
\[
\text{s.t.} \quad P_t C_t + \int_0^{M_F} S_{f,t+1} V_{f,t}\, df = L + \int_0^{M_F} S_{f,t}\left(V_{f,t} + D_{f,t}\right) df
\]
\[
C_t = \left(\int_0^{I_t} x_{i,t}^{\frac{\sigma-1}{\sigma}}\, di\right)^{\frac{\sigma}{\sigma-1}}.
\]

$\rho$ is the time preference of the representative consumer; $C_t$ is the consumption of final goods; the consumption and price of intermediate good $i$ are $x_{i,t}$ and $p_{i,t}$, respectively. $P_t$ is the aggregate price index. The representative consumer inelastically supplies $L$ hours of labor each period. The wage rate is normalized to 1. $M_F$ is the total number of firms in the economy. $S_{f,t}$ is the equity share of firm $f$ held by the consumer at time $t$. $V_{f,t}$ is the price of firm $f$ equity at time $t$. $D_{f,t}$ is the dividend to firm $f$ shareholders at time $t$. There are $I_t$ intermediate goods in the economy at time $t$. $\sigma > 1$ is the elasticity of substitution between intermediate goods.
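As a quick numerical check on the within-period part of this problem, the sketch below uses the standard CES (Dixit-Stiglitz) demand system, x_i = C (P / p_i)^sigma with price index P = (sum of p_i^(1 - sigma))^(1 / (1 - sigma)), and verifies that these demands reproduce the consumption aggregate C at total expenditure P * C. It assumes a finite number of goods in place of the continuum, and the function names and the example prices are illustrative choices, not values from the thesis.

```python
def ces_price_index(prices, sigma):
    """Aggregate price index for a finite set of goods (discrete stand-in
    for the integral over i in the text)."""
    return sum(p ** (1.0 - sigma) for p in prices) ** (1.0 / (1.0 - sigma))


def ces_demand(prices, consumption, sigma):
    """Demand for each good given aggregate consumption C."""
    price_index = ces_price_index(prices, sigma)
    return [consumption * (price_index / p) ** sigma for p in prices]


def ces_aggregate(quantities, sigma):
    """CES consumption aggregator applied to a finite list of quantities."""
    exponent = (sigma - 1.0) / sigma
    return sum(x ** exponent for x in quantities) ** (1.0 / exponent)


if __name__ == "__main__":
    prices = [1.0, 1.5, 2.0]      # hypothetical prices
    consumption, sigma = 10.0, 3.0
    demand = ces_demand(prices, consumption, sigma)
    spending = sum(p * x for p, x in zip(prices, demand))
    print("aggregate from demands:", ces_aggregate(demand, sigma))
    print("spending vs P*C:", spending, ces_price_index(prices, sigma) * consumption)
```

With sigma > 1, goods with lower prices receive more than proportionally higher demand, which is what makes the constant markup pricing in the firm's problem well defined.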
Consumer demand for intermediate goods and the aggregate price index are

$$x_{i,t} = C_t \left(\frac{P_t}{p_{i,t}}\right)^{\sigma}, \qquad P_t = \left(\int_0^{I_t} p_{i,t}^{1-\sigma}\, di\right)^{\frac{1}{1-\sigma}}.$$

From the first-order conditions for the consumer's problem, the equity price of firm $f$ is given by

$$V_{f,t} = \rho\, \frac{C_t P_t}{C_{t+1} P_{t+1}} \left(V_{f,t+1} + D_{f,t+1}\right).$$

1.2.2 Firms

There is only one sector with $M_F$ firms in the economy. $M_F$ is a large number, so each firm is tiny relative to the economy. Firm $f$ hires one unit of production labor to produce one unit of goods. The wage rate is the numeraire. According to Dixit and Stiglitz (20), the profit-maximizing price for every product is $\frac{\sigma}{\sigma-1}$. In the monopolistically competitive market, the profit from each product is $\frac{1}{\sigma}\frac{Y_t}{I_t}$. Firm $f$ produces the $z_{f,t}$ goods that it has invented by time $t$. The total number of goods in the economy is $I_t = \int_0^{M_F} z_{f,t}\, df$.

Firms grow by inventing new goods. Firm $f$ invents new goods through two types of R&D: innovation and imitation. Innovation uses the firm's private knowledge capital and $N_{f,t}$ units of research hours. Here the size of firm $f$'s private knowledge capital is measured by $z_{f,t}$, which represents firm $f$'s experience in R&D. In contrast, imitation uses public knowledge capital $\bar{Z}_t$ (footnote 5) and $M_{f,t}$ units of research hours. The size of the public knowledge pool is measured by the average firm size $\bar{Z}_t$ in the industry. This assumption implies that learning is time-consuming and that no firm can afford to acquire all outside knowledge in one period. In other words, inter-firm learning is many-to-many rather than all-to-all. A number of pairs of firms are randomly matched to learn from each other. What one firm expects to learn from a random peer is the average firm size in the sector.

The assumption that public knowledge is proxied by average firm size, rather than by total industry size, is supported by the firm citation network data.
If the total industry size were the right measure of the size of the public knowledge pool, we would observe each firm linked to every other firm, or to a significant proportion of them. In fact, network density, measured by the average number of cited firms per firm, is far smaller than the total number of firms. For example, in 1976 there were 5,749 U.S. patenting firms, which on average cited patents from 2.81 firms; in 1996 there were 20,522 U.S. patenting firms, which on average cited patents from 11.02 firms. Although network density increases with the total number of patenting firms, each firm still learns from only a limited number of peers. Additionally, this assumption ensures that in general equilibrium the economic growth rate is a constant that is independent of the total number of firms and of population size.

[Footnote 5: In Appendix A.4, firms use both private and public knowledge to imitate.]

Imitation here does not refer to simple reverse engineering and replication; it means improvements and upgrades of other firms' products. This assumption reflects the fact that patent law does not acknowledge simple replication. In order to gain a new patent, a firm must upgrade existing patented goods to demonstrate enough originality and creativity. Firms' private knowledge diffuses to the public knowledge pool through many channels (footnote 6). Firms may or may not voluntarily reveal their private knowledge to the public, but interactions between firms always generate a steady flow of knowledge from each firm's private pool to the public pool.

I borrow the Cobb-Douglas production function from Klette and Kortum (48) to describe the production of new goods. The expected number of new goods depends on the number of hours invested in research and on knowledge capital:

$$E\left(\Delta z^{N}_{f,t}\right) = A_N N_{f,t}^{\alpha} z_{f,t}^{1-\alpha},$$

$$E\left(\Delta z^{M}_{f,t}\right) = A_M M_{f,t}^{\alpha} \bar{Z}_t^{1-\alpha}.$$

$\Delta z^{N}_{f,t}$ ($\Delta z^{M}_{f,t}$) is the number of new goods invented by innovation (imitation). $0 < \alpha < 1$ is the labor share in knowledge production.
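As a small numerical illustration of the two knowledge production functions above, the following sketch evaluates the expected number of new goods from innovation and imitation. The function name `expected_new_goods` and all parameter values are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
def expected_new_goods(productivity, research_hours, knowledge, alpha):
    """Cobb-Douglas knowledge production: A * hours^alpha * knowledge^(1 - alpha)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return productivity * research_hours ** alpha * knowledge ** (1.0 - alpha)


# Hypothetical illustrative values (assumptions, not taken from the paper).
A_N, A_M, ALPHA = 0.10, 0.05, 0.5   # innovation and imitation productivity, labor share
z_f, Z_bar = 16.0, 4.0              # private knowledge (firm size), public knowledge pool

# Innovation combines research hours with the firm's own knowledge z_f.
innovation = expected_new_goods(A_N, 4.0, z_f, ALPHA)
# Imitation combines research hours with the public pool Z_bar.
imitation = expected_new_goods(A_M, 4.0, Z_bar, ALPHA)

print(f"E[new goods from innovation] = {innovation:.3f}")
print(f"E[new goods from imitation]  = {imitation:.3f}")
```

Because the exponents sum to one, doubling both research hours and the relevant knowledge stock doubles the expected number of new goods, while doubling research hours alone raises it by less than double.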
The productivity of innovation (imitation) $A_N$ ($A_M$) is sector-specific. This assumption captures the fact that technology is more standardized or codified in some sectors than in others. Standardized industrial technology is easier to transplant across firms, while firm-specific technology is only suited for application within the inventing firm. Additionally, standardized industrial technology enables workers to change employers within the same sector. If knowledge capital is embodied in workers, a high labor turnover rate also helps to disseminate one firm's private knowledge to other firms.

Within a sector, the productivities of the two types of R&D, $A_N$ and $A_M$, can be different, which implies that firms employ private and public knowledge at different costs. These costs include the cost of searching for related existing knowledge, the reverse engineering cost of absorbing the existing knowledge, and the creation cost of adding novelty to the existing knowledge. Normally, firm borders block knowledge spillovers, and it is therefore more efficient to use private knowledge instead of public knowledge, i.e. $A_N > A_M$ (footnote 7).

[Footnote 6: In Duguet and MacGarvie (21), there are 12 channels listed: external R&D, cooperative R&D, patents and licenses, analysis of competition, experts, equipment acquisition, hiring employees, communication with suppliers, communication with customers, mergers and acquisitions, joint ventures and alliances, and personnel exchange.]

The Cobb-Douglas knowledge production function hinges on two assumptions. First, R&D research hours $N_{f,t}$ and $M_{f,t}$ have decreasing marginal productivity. Second, with the same amount of innovative research hours $N_{f,t}$, larger firms invent more new goods due to greater R&D experience. Similarly, with the same amount of imitative research hours $M_{f,t}$, firms with access to a deeper public knowledge pool $\bar{Z}_t$ create more new goods.
When these two forces offset each other, the amount of research hours that each firm spends on innovation (imitation) is proportional to the size of the private (public) knowledge pool $z_{f,t}$ ($\bar{Z}_t$).

Firm $f$ chooses inputs in innovation $N_{f,t}$ and imitation $M_{f,t}$ to maximize its firm value $V(z_{f,t})$:

$$V(z_{f,t}) = \max_{N_{f,t},\, M_{f,t}} \; \frac{P_t C_t}{\sigma}\frac{z_{f,t}}{I_t} - \frac{N_{f,t} + M_{f,t}}{\bar{Z}_t} + \frac{\rho\, C_t P_t}{C_{t+1} P_{t+1}}\, E\left[V(z_{f,t+1})\right] \tag{1.2}$$

subject to

$$z_{f,t+1} = z_{f,t} + \Delta z^{N}_{f,t} + \Delta z^{M}_{f,t} \tag{1.3}$$

$$\frac{\Delta z^{N}_{f,t}}{z_{f,t}} = \frac{A_N N_{f,t}^{\alpha} z_{f,t}^{1-\alpha}}{z_{f,t}} + \varepsilon^{n}_{f,t} \tag{1.4}$$

$$\frac{\Delta z^{M}_{f,t}}{\bar{Z}_t} = \frac{A_M M_{f,t}^{\alpha} \bar{Z}_t^{1-\alpha}}{\bar{Z}_t} + \varepsilon^{m}_{f,t} \tag{1.5}$$

$\frac{z_{f,t}}{I_t}$ represents firm $f$'s market share in terms of both number of goods and sales, because every good is sold in the same quantity. Firm $f$'s investments in R&D determine the expected success rates of innovation and imitation, but the actual realizations in (1.4) and (1.5) are subject to i.i.d. shocks $\varepsilon^{n}_{f,t}$ and $\varepsilon^{m}_{f,t}$ (footnote 8). When firm $f$'s manager chooses research inputs at the beginning of time $t$, she knows the distributions of $\varepsilon^{n}_{f,t}$ and $\varepsilon^{m}_{f,t}$ but not their actual realizations. Firm $f$ discounts future firm value at the same rate as the consumer, $\frac{\rho\, C_t P_t}{C_{t+1} P_{t+1}}$.

[Footnote 7: On average, firms tend to use private knowledge more frequently and sooner than public knowledge. In the NBER Patent Citation Data, every organization on average owns 0.17% of the old patent stock in the industry, but the rate at which organizations cite their own old patents is disproportionately high at 11.1%. If the cited and citing patents are owned by the same organization, the average citation time lag is 5.86 years; otherwise the average time lag is 9.06 years. Citation lag is defined as the application year of the citing patent minus the application year of the cited patent, which indicates how long it takes the citing firm to acquire and make use of the knowledge embodied in the cited patent.]
Labor productivity in R&D grows as fast as the public knowledge pool size $\bar{Z}_t$, because workers acquire human capital from their employers; when workers turn over across firms, their productivity in R&D is equal to the average knowledge capital among firms (footnote 9). Since each firm is tiny relative to the entire sector, firm $f$ takes $I_t$, $Y_t$ and $P_t$ as given. I assume that firms receive full liquidation value in case of exit, so that their current innovation and imitation decisions are independent of exit risk in the future.

One educated guess for the firm value is a linear function of the form

$$V(z_{f,t}) = v_t \frac{z_{f,t}}{I_t} + u_t - F. \tag{1.6}$$

$v_t$ is the private marginal value of market share. I will show later that $u_t$ represents the rent from the public knowledge externality. $F$ is the fixed cost of entering the market. The first-order conditions are:

$$N_{f,t} = \left(\frac{A_N \alpha v_{t+1} \rho'_t}{M_F}\right)^{\frac{1}{1-\alpha}} z_{f,t} \tag{1.7}$$

$$M_{f,t} = \left(\frac{A_M \alpha v_{t+1} \rho'_t}{M_F}\right)^{\frac{1}{1-\alpha}} \bar{Z}_t \tag{1.8}$$

$$v_t = \frac{P_t C_t}{\sigma} - M_F \left(\frac{A_N \alpha v_{t+1} \rho'_t}{M_F}\right)^{\frac{1}{1-\alpha}} + \rho'_t v_{t+1} \left[1 + A_N \left(\frac{A_N \alpha v_{t+1} \rho'_t}{M_F}\right)^{\frac{\alpha}{1-\alpha}}\right] \tag{1.9}$$

where $\rho'_t = \frac{\rho\, I_t C_t P_t}{I_{t+1} C_{t+1} P_{t+1}}$.

[Footnote 8: $\varepsilon^{n}_{f,t}$ and $\varepsilon^{m}_{f,t}$ are zero-mean random variables bounded from below, such that $\frac{\Delta z^{N}_{f,t}}{z_{f,t}}$ and $\frac{\Delta z^{M}_{f,t}}{\bar{Z}_t}$ are always positive.]

[Footnote 9: This assumption keeps the number of R&D workers constant in general equilibrium while the number of goods can grow at a constant rate. Moreover, the endogenous growth rate of the economy is independent of the population size under this assumption.]

Equation (1.7) (respectively (1.8)) equates the marginal cost of innovation (imitation) to the expected marginal return from innovation (imitation). A firm's optimal labor input in innovation $N_{f,t}$ is proportional to the firm's private knowledge $z_{f,t}$, and the labor input in imitation $M_{f,t}$ is proportional to public knowledge $\bar{Z}_t$. Equation (1.9) means that the marginal value of current market share is the current marginal profit plus the discounted future profit from innovation.
Notice that $v$ in (1.9) is the private return to knowledge capital accumulation, which is smaller than the social return to knowledge because of the knowledge externality through imitation. The externality is captured by $\frac{I_t}{I_{t+1}}$ in $\rho'$: a higher growth rate of the number of goods, due to larger imitation productivity $A_M$, erodes an existing product's future market share.

The constant component of the firm value function, $u_t$, is given by

$$u_t = -\left(\frac{A_M \alpha v_{t+1} \rho'_t}{M_F}\right)^{\frac{1}{1-\alpha}} + \rho'_t \frac{I_t}{I_{t+1}} u_{t+1} + \rho'_t \frac{1}{\alpha} \left(\frac{A_M \alpha v_{t+1} \rho'_t}{M_F}\right)^{\frac{1}{1-\alpha}}. \tag{1.10}$$

$u_t$ is equal to the expected discounted future profit from the imitated products. In other words, $u_t$ measures the public knowledge pool's externality to each firm. In the equilibrium with free entry, $u_t$ must be smaller than or equal to the fixed entry cost $F$. Since potential entrants are firms with zero products, their expected profit from entering is purely the public knowledge externality $u_t$. When the externality $u_t$ is greater than the entry cost $F$, new entrants keep entering and diluting the public knowledge pool $\bar{Z}_t$. The average firm size $\bar{Z}_t$ shrinks until $u_t$ is equal to $F$ and no more entry occurs. Substituting $u_t = F$ into (1.10) shows that the total number of firms $M_F$ decreases with the entry cost $F$.

Firm $f$'s period $t$ dividend is

$$D_{f,t} = \frac{P_t C_t}{\sigma}\frac{z_{f,t}}{I_t} - \left[\left(\frac{A_N \alpha v_{t+1} \rho'_t}{M_F}\right)^{\frac{1}{1-\alpha}} \frac{z_{f,t}}{\bar{Z}_t} + \left(\frac{A_M \alpha v_{t+1} \rho'_t}{M_F}\right)^{\frac{1}{1-\alpha}}\right].$$

A negative dividend means that firm $f$ finances its R&D cost from shareholders at a zero interest rate.

Firm Size Dynamics

Substituting (1.7) and (1.8) into (1.3), (1.4), and (1.5), the firm size dynamic process in (1.3) can be summarized by

$$z_{f,t+1} = R_{f,t+1} z_{f,t} + L_{f,t+1}, \tag{1.11}$$

where

$$R_{f,t+1} \equiv \frac{I_t}{I_{t+1}} \left(1 + A_N^{\frac{1}{1-\alpha}} \left[\frac{\alpha v \rho'_t}{M_F}\right]^{\frac{\alpha}{1-\alpha}}\right) + \varepsilon^{n}_{f,t+1},$$

$$L_{f,t+1} \equiv \frac{I_t}{I_{t+1}} A_M^{\frac{1}{1-\alpha}} \left[\frac{\alpha v \rho'_t}{M_F}\right]^{\frac{\alpha}{1-\alpha}} + \varepsilon^{m}_{f,t+1}. \tag{1.12}$$

I decompose a firm's expected growth rate $g_{f,t}$ into an innovation rate $r_{f,t}$ and an imitation rate $l_{f,t}$.
$$E(r_{f,t}) \equiv \frac{E\left(\Delta z^{N}_{f,t}\right)}{z_{f,t}} = A_N^{\frac{1}{1-\alpha}} \left[\frac{\alpha v \rho'_t}{M_F}\right]^{\frac{\alpha}{1-\alpha}} \tag{1.13}$$

$$E(l_{f,t}) \equiv \frac{E\left(\Delta z^{M}_{f,t}\right)}{z_{f,t}} = A_M^{\frac{1}{1-\alpha}} \left[\frac{\alpha v \rho'_t}{M_F}\right]^{\frac{\alpha}{1-\alpha}} \frac{\bar{Z}_t}{z_{f,t}} \tag{1.14}$$

The expected innovation rate $E(r_{f,t})$ is a constant and independent of firm size $z_{f,t}$, but the expected imitation rate $E(l_{f,t})$ is scale-dependent. As firm $f$ grows larger, its imitation rate declines simply because the public knowledge pool $\bar{Z}_t$ becomes smaller relative to firm $f$'s size $z_{f,t}$.

In total, the expected firm growth rate $E(g_{f,t}) \equiv E(r_{f,t}) + E(l_{f,t})$ declines with firm size purely because of the scale-dependent imitation rate:

$$E(g_{f,t}) = A_N^{\frac{1}{1-\alpha}} \left[\frac{\alpha v \rho'_t}{M_F}\right]^{\frac{\alpha}{1-\alpha}} + A_M^{\frac{1}{1-\alpha}} \left[\frac{\alpha v \rho'_t}{M_F}\right]^{\frac{\alpha}{1-\alpha}} \frac{\bar{Z}_t}{z_{f,t}}.$$

Moreover, by (1.14), the expected imitation rate $E(l_{f,t})$ declines faster with firm size (is more scale-dependent) when cross-firm knowledge spillovers are more efficient ($A_M$ is greater). As a result, the firm's expected growth rate $E(g_{f,t})$ also drops faster in a sector with more abundant knowledge spillovers than in other sectors.

Model Implication 1: A firm's imitation rate declines with firm size while a firm's innovation rate is independent of firm size. Moreover, a firm's imitation rate drops faster in sectors with more abundant knowledge spillovers than in other sectors, which causes the cross-sector difference in the scale dependence of the firm growth rate.

Notice that (1.13) and (1.14) also provide insights into the sector-specific optimal research and development policy. There are two types of R&D: innovation, which relies on intra-firm knowledge diffusion, and imitation, which depends on inter-firm knowledge diffusion. Moreover, both R&D outputs have increasing returns to their productivities $A_M$ and $A_N$ (the exponents $\frac{1}{1-\alpha}$ and $\frac{1}{1-\beta}$ exceed 1). If knowledge diffuses faster within a firm than across firms ($A_N > A_M$) and $\alpha = \beta$, increasing $A_N$ by 1% causes a greater growth rate increment than increasing $A_M$ by the same amount, and vice versa.
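The scale dependence described by (1.13) and (1.14) can be sketched numerically. In the sketch below, `k` stands for the common bracketed factor $[\alpha v \rho'_t / M_F]^{\alpha/(1-\alpha)}$ and is treated as a given number; the function name and every parameter value are hypothetical and serve only to illustrate the shape of the relationship.

```python
def expected_rates(a_n, a_m, alpha, k, z, z_bar):
    """Return (innovation rate, imitation rate, growth rate) as in (1.13)-(1.14).

    k is a stand-in for the common factor [alpha * v * rho' / M_F]^(alpha / (1 - alpha)).
    """
    power = 1.0 / (1.0 - alpha)
    innovation_rate = a_n ** power * k              # independent of firm size z
    imitation_rate = a_m ** power * k * z_bar / z   # falls as z rises relative to z_bar
    return innovation_rate, imitation_rate, innovation_rate + imitation_rate


# Hypothetical values: two sectors that differ only in imitation productivity A_M.
sizes = [0.5, 1.0, 2.0, 4.0]   # firm size relative to an average size of 1
low_spillover = [expected_rates(0.10, 0.05, 0.5, 1.0, z, 1.0) for z in sizes]
high_spillover = [expected_rates(0.10, 0.08, 0.5, 1.0, z, 1.0) for z in sizes]

for z, low, high in zip(sizes, low_spillover, high_spillover):
    print(f"z = {z:3.1f}   growth (low A_M) = {low[2]:.4f}   growth (high A_M) = {high[2]:.4f}")
```

Under these assumed values the innovation rate is the same at every size, while the growth rate falls with size in both sectors and falls by more in the sector with the higher $A_M$.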
The reason is that firms endogenously allocate more R&D input to the type with a comparative advantage in knowledge diffusion.

Take the Natural Gas and Petroleum industry as an example. Since the oil fields in Alberta, Canada have a different chemical composition from the oil fields in Texas, U.S., firms in this industry may find other firms' technology useless in their own production or research. A strengthened intellectual property rights policy that encourages firms to use their private knowledge in R&D is more effective than policies that facilitate the sharing of information between firms. In the Computer and Office Accounting Machinery industry, however, firms obey a common industrial standard and technologies are codified in most cases. An R&D policy that encourages firms to share information is better suited to this industry. In summary, a tailored sector-specific policy that favors the R&D type allowing for more efficient knowledge diffusion helps to achieve a higher economic growth rate. Policies to support imitation (increase $A_M$) include subsidizing cross-firm R&D cooperation, facilitating labor turnover, encouraging universities to disseminate knowledge to the public, and so on. Strict intellectual property rights protection supports innovation (increases $A_N$).

1.2.3 Determinants of Firm Size Distribution

To provide economic context for (1.11), I compare it with an AR(1) process. $R_{f,t}$ here is a random variable, while for a typical AR(1) process $R$ is a constant. In an AR(1) process $z_{t+1} = R z_t + L_t$, $R$ measures persistence and $L$ represents the randomness of the stochastic process. Similarly, in (1.11) $R_{f,t}$ measures the persistence of firm size, that is, the extent to which current firm size affects future firm size by providing private knowledge capital for future innovation. $L_{f,t}$ indicates how much firms can learn from public knowledge capital, which is independent of current firm size.
If $R$ is estimated using panel data on firm size, controlling for firm fixed effects, it varies between 0.6 and 0.96 across sectors. Moreover, the estimated $R$ is higher in sectors with a more heterogeneous firm size distribution, which confirms that more persistent firm size dynamics are associated with a more dispersed size distribution.

Imagine an economy without imitation, which means eliminating $L_{f,t}$ in (1.11). Starting from a sector with many equally sized firms and repeating the process $z_{f,t+1} = R_{f,t+1} z_{f,t}$ for numerous periods, firms end up with different sizes because they have had different 'luck' in their innovation history. Over time, firm size dispersion grows without bound. The volatility of the innovation shocks $\varepsilon^{n}_{f,t}$ determines how quickly the size dispersion explodes. In the real world, with chances to learn from other firms, $L_{f,t}$ constrains and attenuates the size dispersion generated by innovation shocks.

In equilibrium with imitation, firm size heterogeneity, measured by the standard deviation (s.d.) of log-scale firm size, is constant over time. Over time, the firm size distribution moves forward, with the lower bound of the distribution rising at a constant growth rate. For a Pareto distribution, the shape parameter that determines the firm size dispersion does not change with the lower bound of the distribution. Therefore, even with a steady growth rate of the average firm size, the firm size distribution maintains its shape.

Proposition 1. According to theorem 5 in Kesten (44), the firm size distribution $\{z_{f,t}\}$ in a given sector follows a Pareto distribution with shape parameter $\mu$ such that $E\left(R_{f,t}^{\mu}\right) = 1$, if $\{R_{f,t}, L_{f,t}\}$ in the market size dynamics (1.11) are independently and identically distributed over time and across firms (footnote 10).
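The thought experiment above, in which size dispersion keeps growing without imitation but stays bounded with it, can be illustrated with a stylized simulation of $z_{f,t+1} = R_{f,t+1} z_{f,t} + L$. This is only a sketch under assumptions that are not in the paper: $R$ is drawn from a log-normal distribution, the imitation term $L$ is a constant, and all parameter values and names are hypothetical.

```python
import math
import random
import statistics


def simulate_log_size_sd(n_firms, n_periods, mean_log_r, sd_log_r, imitation, seed=0):
    """Simulate z' = R * z + imitation for many firms; return the s.d. of log firm size."""
    rng = random.Random(seed)          # fixed seed so that runs are reproducible
    sizes = [1.0] * n_firms            # all firms start out equally sized
    for _ in range(n_periods):
        sizes = [rng.lognormvariate(mean_log_r, sd_log_r) * z + imitation for z in sizes]
    return statistics.pstdev([math.log(z) for z in sizes])


# Hypothetical parameters: log R ~ N(-0.035, 0.1^2), so that E(R) is slightly below 1.
MEAN_LOG_R, SD_LOG_R = -0.035, 0.1

sd_without = simulate_log_size_sd(1000, 300, MEAN_LOG_R, SD_LOG_R, imitation=0.0)
sd_with = simulate_log_size_sd(1000, 300, MEAN_LOG_R, SD_LOG_R, imitation=0.05)

print(f"s.d. of log size without imitation: {sd_without:.2f}")
print(f"s.d. of log size with imitation:    {sd_with:.2f}")
```

Without the additive term, log size is a random walk, so its cross-sectional dispersion grows with the number of simulated periods; with the additive term, the dispersion settles at a much smaller value.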
Lemma 2. When $\{\log(R_{f,t})\}$ follows a normal distribution with variance $\sigma_r^2$, and $\{R_{f,t}, L_{f,t}\}$ in the market size dynamics (1.11) are independently and identically distributed over time and across firms, there is a closed-form solution for $\mu$:

$$\mu = 1 - \frac{2 \ln\left\{E(R_{f,t+1})\right\}}{\sigma_r^2} \approx 1 + \frac{2\, \frac{\bar{l}}{1+\bar{r}+\bar{l}}}{\sigma_r^2}, \qquad \bar{l} \equiv \frac{1}{M_F} \sum_{f=1}^{M_F} l_{f,t}, \quad \bar{r} \equiv \frac{1}{M_F} \sum_{f=1}^{M_F} r_{f,t}.$$

$\bar{l}$ ($\bar{r}$) is the cross-firm average imitation (innovation) rate in the sector. The growth rate of the number of goods is defined as $g = \bar{l} + \bar{r}$. Over time, the average firm size $\bar{Z}_t$ keeps growing at a constant rate $g$, but the size dispersion measure $\frac{1}{\mu}$ is constant.

Lemma 2 highlights two offsetting forces shaping the firm size distribution: the volatility of innovation shocks creates firm size differences, while imitation reduces them. As mentioned in the last paragraph, the innovation shock's volatility $\sigma_r^2$ determines how quickly firm size dispersion explodes without imitation. On the other hand, imitation's relative contribution to the gross growth rate, $\frac{\bar{l}}{1+\bar{r}+\bar{l}}$, defines the power of mean reversion that keeps firm size dispersion from exploding. In total, firm size heterogeneity (footnote 11) $\frac{1}{\mu}$ declines with the relative magnitude of these two offsetting forces. Since abundant cross-firm knowledge spillovers, or high $A_M$, increase imitation's relative contribution to the gross growth rate $\frac{\bar{l}}{1+\bar{r}+\bar{l}}$, a sector with more abundant cross-firm knowledge spillovers also exhibits a more homogeneous firm size distribution than other sectors.

[Footnote 10: The independence assumption is unnecessary according to Goldie (29).]

[Footnote 11: For a Pareto distribution with shape parameter $\mu$, $\frac{1}{\mu}$ is equal to the standard deviation of log-scale firm size, which is commonly used as a measure of firm size heterogeneity in the literature.]

Model Implication 2: Firm size heterogeneity declines with the ratio of imitation's contribution to the gross growth rate to the volatility of innovation risk, $\frac{\bar{l}/(1+\bar{r}+\bar{l})}{\sigma_r^2}$.
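The closed form in Lemma 2 can be checked numerically. The sketch below assumes, as in the lemma, that $\log R$ is normal with variance $\sigma_r^2$, and uses $E(R) = (1+\bar{r})/(1+\bar{r}+\bar{l})$, which follows from (1.12) with $g = \bar{r} + \bar{l}$. The function names and the numerical values are hypothetical.

```python
import math


def pareto_shape_exact(mean_r, var_log_r):
    """mu = 1 - 2 ln(E[R]) / sigma_r^2, the closed form stated in Lemma 2."""
    return 1.0 - 2.0 * math.log(mean_r) / var_log_r


def pareto_shape_approx(r_bar, l_bar, var_log_r):
    """First-order approximation: mu ~ 1 + 2 * [l / (1 + r + l)] / sigma_r^2."""
    return 1.0 + 2.0 * (l_bar / (1.0 + r_bar + l_bar)) / var_log_r


def lognormal_moment(mu, mean_r, var_log_r):
    """E[R^mu] when log R is normal with variance var_log_r and R has mean mean_r."""
    mean_log_r = math.log(mean_r) - var_log_r / 2.0
    return math.exp(mu * mean_log_r + mu ** 2 * var_log_r / 2.0)


# Hypothetical sector: 3% average innovation rate, 2% average imitation rate.
R_BAR, L_BAR, VAR_LOG_R = 0.03, 0.02, 0.01
MEAN_R = (1.0 + R_BAR) / (1.0 + R_BAR + L_BAR)

mu_exact = pareto_shape_exact(MEAN_R, VAR_LOG_R)
mu_approx = pareto_shape_approx(R_BAR, L_BAR, VAR_LOG_R)

print(f"exact mu  = {mu_exact:.3f}")
print(f"approx mu = {mu_approx:.3f}")
print(f"E[R^mu] at the exact mu = {lognormal_moment(mu_exact, MEAN_R, VAR_LOG_R):.6f}")
```

A larger imitation share or a smaller $\sigma_r^2$ raises $\mu$ in both formulas, which corresponds to a lower heterogeneity measure $1/\mu$.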
Model Implication 3: Sectors with more abundant cross-firm knowledge spillovers have a more homogeneous firm size distribution than other sectors.

1.2.4 Growth Rate Volatility Decomposition

$$Var(g_{f,t}) = Var(R_{f,t}) + \frac{Var(L_{f,t})}{(z_{f,t})^2} \tag{1.15}$$

According to the market share dynamics (1.11) and (1.12), every firm's growth rate is subject to two shocks: innovation risk and imitation risk. The relative weight of these two risks in a firm's growth volatility differs across firms. For a large firm, the main risk component is innovation risk, while for a small firm, the major component is imitation risk. Overall, a firm's growth volatility declines with its size, because innovation risk is the same across firms, while imitation risk's contribution to total volatility decreases with firm size.

Decomposing firm volatility into innovation risk and imitation risk sheds light on recent discoveries about the converging firm growth volatility of small private firms and large public firms. Comin and Mulani (16) and Davis, Haltiwanger, Jarmin, and Miranda (19) find that the volatility of large publicly traded U.S. firms has risen, while the volatility of small private firms has declined over the last several decades. One possible mechanism to explain these two concurrent facts is that certain policy or technology changes encouraged firms to invest more in innovation and less in imitation ($A_N$ rises and/or $A_M$ decreases). As a result of this change, firms undertake riskier projects, allocating more generous funds to innovation; the opposite happens to imitation projects, which receive more limited funds. Such changes induce innovation volatility $Var(R_{f,t})$ to rise and imitation volatility $Var(L_{f,t})$ to drop at the same time.
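The decomposition in (1.15) can be illustrated with a few lines of code. The function name and the variance values below are hypothetical; the sketch only shows how the two risk components are weighted at different firm sizes.

```python
def growth_variance(var_r, var_l, z):
    """Split Var(g) = Var(R) + Var(L) / z^2 into its two components, as in (1.15)."""
    innovation_risk = var_r
    imitation_risk = var_l / z ** 2
    return innovation_risk, imitation_risk, innovation_risk + imitation_risk


# Hypothetical variances; firm size z is expressed relative to the sector average.
VAR_R, VAR_L = 0.01, 0.01

for z in (0.25, 1.0, 4.0):
    innovation_risk, imitation_risk, total = growth_variance(VAR_R, VAR_L, z)
    share = innovation_risk / total
    print(f"z = {z:4.2f}   Var(g) = {total:.4f}   innovation share = {share:.2%}")
```

With these assumed values, total volatility falls with firm size and the share attributable to innovation risk rises, so a simultaneous rise in `VAR_R` and fall in `VAR_L` would raise volatility for large firms and lower it for small ones.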
Since $Var(R_{f,t})$ is the major risk component for a large firm and $\frac{Var(L_{f,t})}{(z_{f,t})^2}$ is the major risk component for a small firm, the increase in $Var(R_{f,t})$ dominates the decrease in $\frac{Var(L_{f,t})}{(z_{f,t})^2}$ in a large firm's volatility; meanwhile, the decrease in $\frac{Var(L_{f,t})}{(z_{f,t})^2}$ outweighs the increase in $Var(R_{f,t})$ for small firms.

There is an existing literature on declining knowledge spillovers, which deter imitation and encourage innovation. Caballero and Jaffe (9) and Rosell and Agrawal (62) find that the potency of spillovers from old ideas to new knowledge generation has been declining over the last century. The policy changes started with the Bayh-Dole Act (35 USC 200-212) of 1980, which grants patents to inventors who are funded by federal assistance. Since then, U.S. patent law has been amended several times to include increasingly broad definitions of infringement. These policy changes all encourage innovation and limit imitation. Even universities, whose traditional role was to disseminate knowledge, have become more and more commercially oriented.

Another related discovery is the divergence between moderating aggregate volatility and rising firm-level volatility for publicly traded firms. Comin and Mulani (16) propose an explanation that is also based on changing R&D activity: firms spend more resources on embodied innovations and less on disembodied innovations. The first type of R&D is patentable, so firms can appropriate all the benefits it generates. The second type of R&D is difficult to patent and easy to reverse engineer. The firm that develops a disembodied innovation cannot appropriate the benefits enjoyed by other firms that adopt it. The co-movement across firms weakens when there are fewer disembodied innovations to be imitated by everyone simultaneously. Since total output volatility is the sum of individual firms' volatilities and the covariances between firms' growth rates, weaker co-movement reduces GDP volatility.
Aggregating the firm dynamics in (1.11) to the sector level may provide a coherent understanding of both the volatility convergence between large and small firms at the firm level and the moderating aggregate volatility at the macro level. The key is firms' changing R&D patterns: less imitation and more innovation.

1.2.5 General Equilibrium

In general equilibrium, the average firm value $\frac{v}{M_F}$, the growth rate of the number of goods $g$, and the number of firms $M_F$ are solved using (1.16) to (1.18).

$$\left(1 - \frac{\rho}{1+g}\right) \frac{v}{M_F} = \frac{PC}{M_F} + \frac{1-\alpha}{\alpha} \left(\frac{A_N \rho \alpha v}{(1+g) M_F}\right)^{\frac{1}{1-\alpha}} \tag{1.16}$$

$$g = \left(A_N^{\frac{1}{1-\alpha}} + A_M^{\frac{1}{1-\alpha}}\right) \left[\frac{\rho \alpha v}{(1+g) M_F}\right]^{\frac{\alpha}{1-\alpha}} \tag{1.17}$$

$$F = u = \frac{1-\alpha}{\alpha \left(1 - \frac{\rho}{1+g}\right)} \left(\frac{A_M \rho \alpha v}{(1+g) M_F}\right)^{\frac{1}{1-\alpha}} \tag{1.18}$$

$$PC = L + \frac{PC}{\sigma} - \frac{\rho \alpha v g}{1+g} \tag{1.19}$$

In (1.18), the average firm value $\frac{v}{M_F}$ increases with the entry cost $F$ but decreases with imitation productivity $A_M$, because $v$ is the private return to knowledge capital, which shrinks with a larger externality. In (1.16) and (1.17), a higher $A_M$ has two conflicting effects on the growth rate $g$: first, it raises the imitation R&D input for a given average firm value $\frac{v}{M_F}$; second, it reduces the average firm value $\frac{v}{M_F}$ for a given R&D input, because emerging new products squeeze the current products' market share. A higher $A_N$, however, always boosts economic growth, because it increases both the innovation input and the private return to knowledge $v$.

(1.19) is the consumer's budget constraint. $\frac{PC}{\sigma} - \frac{\rho \alpha v g}{1+g}$ is the total dividend. The total R&D labor input $\frac{\rho \alpha v g}{1+g}$ increases with the private return to knowledge capital $v$, the consumer's patience $\rho$, the labor share in the knowledge production function $\alpha$, and the growth rate $g$.

Note that the economic growth rate, that is, the growth rate of the total number of goods $g$, is independent of population size $L$. As $L$ increases, market size $PC$ and marginal firm value $v$ increase proportionally. Meanwhile, a larger market also accommodates more firms, as indicated in (1.18), which means that $\frac{v}{M_F}$ remains unchanged.

This model allows policy to affect economic growth.
Suppose intellectual property rights protection (IPRP) policy cannot change innovation efficiency $A_N$, but it can constrain a firm's utilization of public knowledge and reduce $A_M$. Define $\gamma \equiv \frac{A_M}{A_N}$, which is smaller if IPRP is stronger. On the one hand, stronger IPRP reduces the imitation rate and harms the growth rate; on the other hand, it raises the private value of knowledge capital $v$ and indirectly increases economic growth. The growth-maximizing $\gamma$ is determined by the balance of these two factors.

1.3 Empirical Results

Before testing the three implications listed above, I introduce the data briefly.

1.3.1 Data

The NBER Patent Citation Data comprise detailed information on almost three million U.S. patents granted between January 1963 and December 1999, more than 16 million citations made to these patents between 1975 and 1999, and around 20,000 patent assignees, 92% of which are non-governmental organizations. I refer to all the organizations as 'firms' henceforth. Each patent contains highly detailed information on the innovation itself, the inventors, the assignee, etc. Moreover, patents have very wide industry and geographic coverage. The patents are classified into 42 broad SIC (Standard Industrial Classification) sectors. The percentage of U.S. patents awarded to foreign inventors has risen from about 20% in the early 1960s to about 45% in the late 1990s (footnote 12).

The citation data are well suited to this paper's purpose because the citations provide detailed paper trails of intellectual interactions across firms and sectors. Aggregated at the industry level, the average values of the time lag, the geographic distance, and the percentage of cross-firm citations indicate the pace and abundance of knowledge diffusion in each sector. Aggregated at the firm level, cross-firm citations describe the sources of knowledge in each firm's R&D process.
The firm-level aggregation allows for the distinction between the contributions of imitation and innovation to each firm's overall growth rate, which is critical for testing the first two implications of the one-sector model. The industry aggregation allows for testing the model's third implication.

[Footnote 12: See Hall, Jaffe, and Trajtenberg (31) for more details.]

Here, I use patent citations to measure knowledge flow (footnote 13). However, citations do not represent a one-to-one mapping of direct knowledge flows. A high proportion of noise may exist, because only some citations are made by the applicant, while others are made by the patent examiner. Jaffe, Trajtenberg, and Fogarty (39) and Duguet and MacGarvie (21) justify the use of aggregate patent citations as an indicator of knowledge spillovers based on surveys of patent inventors in the U.S. and of firms in France. They conclude that some of the citations are associated with real knowledge flow, and that patent citations aggregated at the industrial or regional level are valid measures of knowledge flow.

1.3.2 Implication 1: Scale-Independency of Firm Growth Rate

The one-sector model's first implication is: A firm's imitation rate declines with firm size, while a firm's innovation rate is independent of firm size. Moreover, a firm's imitation rate drops faster in a sector with more abundant knowledge spillovers than in other sectors, which causes the cross-sector difference in the scale dependence of the firm growth rate.

In this subsection, I first demonstrate that the growth rate of a surviving firm exhibits different degrees of scale dependence in different sectors. I then attribute this phenomenon to the scale-dependent imitation rate and its cross-sector differences.

Figure 1.3 and Figure 1.4 (footnote 14) show that the firm growth rate in the 'Petroleum and natural gas extraction and refining' sector is almost independent of firm size, while in 'Office computing and accounting machinery' it drops rapidly as firm size increases.
Notice that the former sector has a more heterogeneous firm size distribution than the latter. In Figure 1.3, firm size (growth rate) is measured by French manufacturing firms' total revenues (growth rate of total revenues) in the Amadeus Database, while in Figure 1.4, firm size (growth rate) is measured by a firm's number of patents (growth rate of the number of patents) in the NBER Patent Citation Data. In the model, the firm growth rate is the same whether it is measured in terms of the number of goods or total revenues.

[Footnote 13: Patents cite other patents as 'prior art', with citations describing the property rights conferred. While a patent grants the assignee the right to exclude others from practising the invention described in the patent, it does not necessarily grant the owner the right to use the invention without the permission of cited assignees.]

[Footnote 14: The x-axis values are discounted by the sector average so that the two sectors have similar domains in firm size.]

For every sector, I run the following regression with both the NBER Patent Citation Data and the French manufacturing firm data:

$$g_{f,t} = a_{s,t} - b_{s,t} \ln\left(ps_{f,t}\right),$$

where $g_{f,t}$ is firm $f$'s growth rate at time $t$, and $ps_{f,t}$ is the number of patents granted to firm $f$ by the beginning of time $t$ (or firm $f$'s total revenue at time $t$ in the French firm data set). $a_{s,t}$ and $b_{s,t}$ are sector-specific. A larger $b_{s,t}$ means that the firm growth rate drops faster with firm size (the firm growth rate is more scale-dependent) in sector $s$ at time $t$.

In Figure 1.5 (Figure 1.6), each scatter point represents one sector, and the numbers label the four-digit NAICS (North American Industry Classification System) 2002 industry classification (the SIC (Standard Industrial Classification)). The firm size dispersion measure is the standard deviation of log-scale firm revenue (patent stock) in Figure 1.5 (Figure 1.6). In both graphs, $b_s$ declines with the firm size heterogeneity measure.
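The sector-level regression of growth on log firm size can be sketched with ordinary least squares. The implementation below is a minimal illustration, not the paper's estimation code: the function name is hypothetical, and the data are synthetic numbers generated from an assumed relationship rather than values from the NBER or Amadeus data.

```python
import math


def estimate_scale_dependence(sizes, growth_rates):
    """Fit g = a - b * ln(size) by ordinary least squares and return (a, b)."""
    log_sizes = [math.log(s) for s in sizes]
    n = len(log_sizes)
    mean_x = sum(log_sizes) / n
    mean_g = sum(growth_rates) / n
    sxx = sum((x - mean_x) ** 2 for x in log_sizes)
    sxg = sum((x - mean_x) * (g - mean_g) for x, g in zip(log_sizes, growth_rates))
    slope = sxg / sxx
    intercept = mean_g - slope * mean_x
    return intercept, -slope   # b is reported with the sign convention of the regression


# Synthetic sector: growth falls by 0.1 for each unit increase in log patent stock.
patent_stocks = [1, 2, 5, 10, 20, 50, 100]
growth = [0.5 - 0.1 * math.log(p) for p in patent_stocks]

a_hat, b_hat = estimate_scale_dependence(patent_stocks, growth)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

A larger estimated `b_hat` indicates a sector in which growth falls off more steeply with firm size; with real data one would also want standard errors, which this sketch omits.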
In other words, the firm growth rate is more scale-dependent in a sector with a more homogeneous firm size distribution than in other sectors.

This implication also predicts that when the firm growth rate is broken down into an innovation rate and an imitation rate, the scale dependence of the firm growth rate derives purely from the imitation rate (1.14). A challenge in testing this implication is to estimate a firm's innovation rate $r_{f,t}$ and imitation rate $l_{f,t}$. Typically, we only observe a firm's overall growth rate, and it is difficult to tell what share is attributable to the firm's private knowledge and what share originates from public knowledge. The use of patent citations is a promising approach to solving this problem, because citations indicate the source of knowledge used during the invention of a patent. Within-firm citations (cross-firm citations) indicate that the citing firm uses its private (public) knowledge when creating a new patent.

At time $t$, firm $f$'s patent stock growth rate $g_{f,t}$ is split into an innovation rate $r_{f,t}$ and an imitation rate $l_{f,t}$, according to the ratio between within-firm citations and cross-firm citations (footnote 15):

$$\hat{g}_{f,t} = \frac{\text{No. of new patents}_{f,t}}{\text{patent stock}_{f,t}}$$

$$\hat{l}_{f,t} = \hat{g}_{f,t}\, \frac{\text{No. of cross-firm citations}_{f,t}}{\text{No. of total citations}_{f,t}}$$

$$\hat{r}_{f,t} = \hat{g}_{f,t}\, \frac{\text{No. of within-firm citations}_{f,t}}{\text{No. of total citations}_{f,t}}$$

Consider the following example. Firm $f$ had ten patents at the beginning of year $t$. It obtained five new patents during year $t$. In these five patent applications, firm $f$'s scientists cited 30 patents held by other firms and cited firm $f$'s own patents 20 times. Firm $f$'s patent stock growth rate in year $t$ is $\hat{g}_{f,t} = \frac{5}{10} = 50\%$; the imitation rate is $\hat{l}_{f,t} = \hat{g}_{f,t} \times \frac{30}{30+20} = 30\%$; and the innovation rate is $\hat{r}_{f,t} = \hat{g}_{f,t} \times \frac{20}{30+20} = 20\%$.
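The citation-based split can be written as a short function. It follows the definitions of $\hat{g}_{f,t}$, $\hat{l}_{f,t}$ and $\hat{r}_{f,t}$ given above, under which cross-firm citations map to the imitation rate and within-firm citations map to the innovation rate. The function name is hypothetical, and the inputs are the counts from the ten-patent example.

```python
def split_growth_rate(new_patents, patent_stock, cross_firm_citations, within_firm_citations):
    """Split patent stock growth into (growth, innovation rate, imitation rate)."""
    total_citations = cross_firm_citations + within_firm_citations
    if patent_stock <= 0 or total_citations <= 0:
        raise ValueError("patent stock and total citations must be positive")
    growth = new_patents / patent_stock
    imitation_rate = growth * cross_firm_citations / total_citations     # public knowledge
    innovation_rate = growth * within_firm_citations / total_citations   # private knowledge
    return growth, innovation_rate, imitation_rate


# Ten existing patents, five new patents, 30 cross-firm and 20 within-firm citations.
g_hat, r_hat, l_hat = split_growth_rate(5, 10, 30, 20)
print(f"growth = {g_hat:.0%}, innovation rate = {r_hat:.0%}, imitation rate = {l_hat:.0%}")
```

By construction the two rates always sum to the overall growth rate, so the split only reallocates observed growth between the two knowledge sources.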
In order to reflect the quality of the information transmitted in each citation count, I adjust the pure citation count by assigning a greater weight to a citation with a shorter time lag or one given by a more important citing firm. For example, if the citation time lag is n years, the citation is given a weight of $(1-\delta)^{n}$, where $\delta$ is the knowledge capital depreciation rate. The three implications are virtually unaffected if I let the discount rate vary between 0 and 0.9. I use $\delta = 0.1$ in the following regressions. One reason to add the time discount is that citations with a shorter time lag transfer more frontier knowledge on average. The other reason is that firms usually cite inside patents sooner than outside patents. Without the time discount adjustment, I underestimate the inside knowledge flow and overestimate the outside knowledge flow. Since large firms cite themselves more intensively than small firms, without weighting citations by the importance of the citing firm, I will underestimate the innovation rate for large firms and overestimate the imitation rate for small firms.

I run the following two regressions for every sector s and time t. Again, a larger $b^{r}_{s,t}$ ($b^{l}_{s,t}$) means the innovation rate (imitation rate) is more scale-dependent in sector s at time t.

$$\hat{r}_{f,t} = a^{r}_{s,t} - b^{r}_{s,t} \ln(p^{s}_{f,t})$$

$$\hat{l}_{f,t} = a^{l}_{s,t} - b^{l}_{s,t} \ln(p^{s}_{f,t})$$

[15] I use only within-sector citations made and received by U.S. firms.

Figure 1.7 shows the results for 1990. Similar patterns are exhibited in other years. Each scatter point represents one sector and the numbers label the SIC patent classification in the U.S. Patent Office. The imitation rate scale-dependencies $\{\hat{b}^{l}_{s,t}\}$ are around 0.1 to 0.3 and significantly different from 0 for all sectors. In contrast, the innovation scale-dependencies $\{\hat{b}^{r}_{s,t}\}$ lie between 0 and 0.05 [16] and are, for most sectors, not statistically significant.
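The sector-level scale-dependency regressions can be sketched as a simple OLS fit of a rate on log patent stock. The sketch below is an assumption-laden illustration rather than the estimation code behind the figures: the function name is hypothetical, it uses the closed-form solution for a one-regressor OLS, and it reports no standard errors.

```python
import math


def scale_dependency(rates, patent_stocks):
    """Fit rate = a - b * ln(patent_stock) by OLS for one sector-year.

    Returns (a, b). A larger b means the rate falls faster with firm size.
    Hypothetical helper: closed-form simple regression, no standard errors.
    """
    if len(rates) != len(patent_stocks) or len(rates) < 2:
        raise ValueError("need at least two matched observations")
    x = [math.log(p) for p in patent_stocks]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(rates) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("patent stocks must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, rates))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The regression is written with a minus sign, so b = -slope.
    return intercept, -slope


# Synthetic sector in which the imitation rate falls with firm size.
stocks = [1, 10, 100, 1000]
imitation_rates = [0.5 - 0.1 * math.log(p) for p in stocks]
a_l, b_l = scale_dependency(imitation_rates, stocks)
print(a_l, b_l)
```

Running the same function on innovation rates and on imitation rates for each sector and year would give the two sets of coefficients that the figure compares.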
In addition, the scale-dependency of the imitation rate $\hat{b}^{l}_{s,t}$ decreases with the sectoral firm size heterogeneity measure. This implies that the imitation rate is more scale-dependent in sectors with a more homogeneous firm size distribution, while the scale-dependency of the innovation rate $\hat{b}^{r}_{s,t}$ is independent of the sectoral firm size heterogeneity measure.

[16] The outlier, sector 51, has only 20 firms applying for patents in that year. $\hat{b}^{r}$ is not statistically significant.

In summary, when the growth rate is split into an innovation rate and imitation rate, the scale dependence of the growth rate comes only from the scale dependence of the imitation rate, since the innovation rate is independent of firm size. The firm growth rate drops faster in sectors with a more homogeneous firm size distribution because the imitation rate falls more quickly in those sectors. Appendix 1.5 discusses why larger firms cite fewer outside patents and why firm growth rate declines with firm size.

1.3.3 Implication 2: Determinants of Firm Size Heterogeneity

The one-sector model’s second implication is: Firm size heterogeneity declines with the relative magnitude between imitation’s contribution to gross growth rate and innovation risk’s volatility, $\frac{\bar{l}}{1+\bar{r}+\bar{l}} \big/ \sigma^{2}_{r}$.

For a Pareto distribution, the commonly used measure of firm size heterogeneity, i.e. the standard deviation of log-scale firm size, is the reciprocal of the Pareto distribution scale parameter $\mu$. (2) predicts that $\mu$ increases with imitation’s contribution to gross growth rate, $\frac{\bar{l}}{1+\bar{r}+\bar{l}}$, and decreases with innovation shock’s volatility, $\sigma^{2}_{r}$.

In sector s and year t, $\bar{r}_{s,t}$ and $\bar{l}_{s,t}$ are given by the average of $\hat{r}_{f,t}$ and $\hat{l}_{f,t}$ for all firms that applied for patents in sector s and year t; $\hat{\sigma}^{2}_{rs,t}$ is estimated by the standard deviation of $\ln\left(\frac{1+\hat{r}_{f,t}}{1+\bar{r}_{s,t}+\bar{l}_{s,t}}\right)$ in (1.12); and $\mathrm{sdlnp}_{s}$ is the standard deviation of log-scale patent stock.
Therefore, the model predicts that $\mathrm{sdlnp}_{s}$ increases in $\hat{\sigma}^{2}_{rs,t}$ and decreases in $\frac{\bar{l}}{1+\bar{r}+\bar{l}}$. The ratio $\frac{\bar{l}}{1+\bar{r}+\bar{l}} \big/ \hat{\sigma}^{2}_{rs,t}$ is the relative magnitude of these two offsetting forces.

In Figure 1.8, each scatter point represents one sector and the numbers label the SIC patent classification in the U.S. Patent Office. In this figure, the y-axis is the firm size heterogeneity measure (s.d. of log scale patent stock) and the x-axis is $\frac{\bar{l}}{1+\bar{r}+\bar{l}} \big/ \hat{\sigma}^{2}_{rs,t}$. The figure illustrates exactly what the model predicts. Therefore, the result supports the claim that when imitation’s force dominates that of innovation risk’s volatility, firm size distribution becomes more homogeneous.

1.3.4 Implication 3: Knowledge Spillovers Efficiency and Firm Size Heterogeneity

The one-sector model’s third implication is: Knowledge spillovers are more abundant in sectors with a more homogeneous firm size distribution.

I measure knowledge spillovers efficiency by the percentage of cross-firm citations among total citations and the citation time lag of cross-firm citations. The share of cross-firm citations among all citations indicates how likely it is that knowledge spillovers cross firm borders. Figure 1.9 shows that the proportion of cross-firm citations is negatively correlated with sectoral firm size heterogeneity.

The citation time lag, the interval between the application time of the citing patent and the application time of the cited patent, indicates the time needed for knowledge to travel between the inventors of these two patents. A shorter citation time lag for cross-firm citations indicates more efficient knowledge spillovers. Notice that the two inventors may take longer to exchange information if the geographic distance between them is larger. The great circle distance between the first inventor of the citing patent and the first inventor of the cited patent measures how far knowledge travels [17].

[17] The patent inventors are required to report their mailing address. From the Census 2000 U.S. Gazetteer Files, I identify over 90% of U.S. inventors’ geographic locations by their five-digit ZIP code’s latitude and longitude. Using both sides’ latitude and longitude data, the great circle distance between the citing patent and the cited patent is calculated by the method in Sinnott (66).

Take the ‘Office computing and accounting machinery’ and ‘Petroleum and natural gas extraction and refining’ industries, for example. Figure 1.10 and Figure 1.11 show that knowledge diffusion is more likely to overcome firm borders, and does so faster, in the former industry. Notice that the former sector has a more homogeneous firm size distribution than the latter sector. The gap between these two sectors shrinks as the time lag becomes longer, but still exists even after a lag of 20 years.

The fixed-effects OLS regressions in Table 1.1 give the determinants of cross-firm citation time lag with U.S. citations. Since the time lag of repetitive citations overestimates the knowledge spillovers time lag, I only include a citation the first time the citing firm cites the cited patent. First-time citations account for around 70% of all citations. The regression results are similar and more significant when all citations are included.

In the first regression, there are state-pair fixed effects to capture time-invariant unobserved variables that may have an impact on information diffusion between the citing state and the cited state. In the second regression, sector fixed effects are included to take care of sector-specific time-invariant elements that may affect within-sector knowledge spillovers. If sectoral firm size heterogeneity (s.d. of log(patent stock)) changes over time, the model implies that citation time lag should move in the same direction. In the third regression, both types of fixed effects are considered. In the fourth column, I control for citing-firm and cited-firm pair fixed effects.
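The great circle distance that enters these regressions as Log(dist) can be sketched with the haversine formula. This assumes that the method cited as Sinnott (66) is the haversine formula; the Earth radius constant, the function name and the coordinates in the usage lines are illustrative assumptions, not values from the data.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, in kilometres


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great circle distance between two points given in decimal degrees.

    Uses the haversine formula, which stays numerically stable for
    small separations. Function name and radius are illustrative.
    """
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    h = min(1.0, h)  # guard against rounding slightly above 1
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


# Hypothetical ZIP-code centroids for a citing and a cited inventor.
citing = (37.44, -122.14)
cited = (42.36, -71.06)
print(great_circle_km(*citing, *cited))
```

With ZIP-code centroids, two inventors in the same ZIP code get a distance of zero, so some adjustment is needed before taking logs; the text does not state which adjustment is used.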
Year dummies for the citing patent application year are included in all regressions. In all regressions, the citation time lag is longer if the geographic distance is larger, the citing organization is smaller, the cited organization is smaller, the sector size is smaller, or the sectoral firm size distribution is more heterogeneous. Distance delays the exchange of knowledge because it increases communication cost. Larger firms are quicker to acquire information, because they are on average older and have better connections due to a more established social network. A larger industry tends to have faster knowledge diffusion. Table 1.1 shows that the sectoral firm size heterogeneity has the predicted positive effect on citation time lag. A one-standard-deviation change in s.d. of log(patent stock) (0.53) causes the citation time lag to increase by 0.44 (0.766*0.53) to 1.55 (2.93*0.53) years, keeping other conditions constant.

In summary, there is a greater proportion of cross-firm citations when the sectoral firm size distribution is more homogeneous. Additionally, among the cross-firm citations, citation time lag is shorter in sectors with a more homogeneous firm size distribution, controlling for the size of the citing and cited firms, the size of the sector, and state-pair, sector and firm-pair fixed effects. These results support the third implication of the theoretical model: knowledge spillovers are more abundant in sectors with a more homogeneous firm size distribution.

1.4 Multi-Sector Model

1.4.1 Facts about Multi-Sector Firms

When firm size is measured by the number of patents, firm size distribution within each sector follows a distinct Pareto distribution with scale parameter ranging from 0.29 to 3 [18]. When all the patenting firms are pooled, the firm size distribution also follows a Pareto distribution with a scale parameter close to 1.68.
Note that one firm may apply for patents in multiple sectors; the firm size in the pooled distribution of the whole economy is the sum of its number of patents in all sectors. This result corroborates the stylized facts in Helpman, Melitz, and Yeaple (33), with firm size measured by number of employees. In their paper, every sector s follows a Pareto firm size distribution with a sector-specific scale parameter $\mu_{s}$, while the aggregate economy also follows a Pareto firm size distribution with scale parameter $\mu$ close to 1. To this point, no research has been conducted to explain the universal Pareto distribution of firm size in each sector and in the entire economy.

[18] Estimated by French Manufacturing Firm Data from Bureau van Dijk’s Amadeus Database.

The following phenomena inspire me to consider firm size dynamics from a multi-sector perspective. First, Table 1.4 and Table 1.5 show that many firms develop products in multiple sectors. Moreover, larger firms operate in more sectors. Table 1.4 is borrowed from Broda and Weinstein (8) [19], which highlights the multi-product nature of firms in these markets. It demonstrates that firms with higher sales in dollar value also sell a greater number of goods and sell in more sectors. Table 1.5 shows a similar result in patent data: organizations that own more patents also apply for patents in more patent categories.

[19] Streitwieser (67), Jovanovic (43) and Bernard, Redding, and Schott (7) also found a similar extent of industry diversification in U.S. firms or plants. UPCs in the second column means Universal Product Codes, commonly referred to as bar codes. Share in the last column means the total market share of the firms within each group.

Second, in the NBER Patent Citation Data, 37% of all citations are cross-sector citations; the percentage becomes higher and approaches 100% when the sector division is finer. This suggests that knowledge spillovers exist not only within but also across sectors.
In Table 1.6, the row index represents the citing sector, and the column index represents the cited sector. The (i, j) element of the matrix is the percentage of citations given by sector i to sector j. There are 42 sectors in total, from which I selected 6 for illustration. Every sector gives a large proportion of citations to the patents in the same sector, but also allocates a small proportion of citations to patents in every other sector. In Table 1.7, I adjust the original percentage of cross-sector citations by the cited sector’s patent stock share in the data set of a given year. The table shows that every sector cites itself over-proportionally and cites other sectors under-proportionally in most cases, but there are some sectors that receive over-proportional citations (the blue cells). These blue cells indicate that the cited sector contributes above-average knowledge to the citing sectors.

Third, inter-sector knowledge spillovers cause firm size dynamics in each sector to be dependent on each other and generate a pooled firm size distribution similar to the data. The multi-sector model expands upon the one-sector model by adding inter-sector knowledge spillovers. With inter-sector knowledge spillovers, a firm’s growth dynamics in every sector and in the entire economy are subject to the impacts of all sectors. The similarity in growth dynamics implies that firm size distributions, whether measured within one sector or in the entire economy, all converge to the Pareto distribution. The one-sector model is a special case of the multi-sector model, when cross-sector knowledge spillovers do not exist and sectors are independent.

Without inter-sector knowledge spillovers, firm growth dynamics in different sectors would be independent. The sum of several independent Pareto distributed variables is still Pareto distributed, but the scale parameter is equal to the minimum of the component distribution scale parameters [20].
In contrast, the firm- or establishment-level data (Figure 1.1 and Figure 1.2) show that the scale parameter of all firms’ distribution lies between the component sectors’ scale parameter values. Therefore, the firm size dynamics in different sectors must be dependent.

[20] See Jessen and Mikosch (40) and Gabaix (27).

1.4.2 Model

A representative firm f operates in K sectors. Firm f’s size in all sectors at time t is summarized by a K-dimensional real vector $z_{f,t}$. The kth element of $z_{f,t}$, $z^{k}_{f,t}$, represents the number of products in the kth sector invented by firm f.

Firm f can apply its private knowledge capital in sector i, $z^{i}_{f,t}$, to the innovation in any sector j, where $i, j \in \{1, 2, ..., K\}$, using the production function $A^{ij}_{N} \left(N^{ij}_{f,t}\right)^{\alpha} \left(z^{i}_{f,t}\right)^{1-\alpha}$. $A^{ij}_{N}$ is the ability to apply sector i’s knowledge to innovate in sector j (call it ij type innovation). $N^{ij}_{f,t}$ is firm f’s research hours spent in the ij type of innovation.

Firm f utilizes public knowledge capital, $\bar{Z}^{i}_{t}$, for imitation in any sector j with the production function $A^{ij}_{M} \left(M^{ij}_{f,t}\right)^{\beta} \left(\bar{Z}^{i}_{t}\right)^{1-\beta}$. $A^{ij}_{M}$ is the ability to apply sector i public knowledge for imitation in sector j. $M^{ij}_{f,t}$ is firm f’s research hours spent in the ij type of imitation.

Notice that the cross-sector knowledge spillovers happen both within firm borders and across firm borders. $\{A^{ij}_{N}\}$, $i \neq j$, measures the cross-sector but within-firm-border knowledge spillovers efficiency, while $\{A^{ij}_{M}\}$ includes both the cross-sector and cross-firm knowledge spillovers. This assumption accords with the data for cross-firm citations, where cross-firm citations occur both within and across sectors. For simplicity, I assume $A_{M} = \gamma A_{N}$, where a smaller $\gamma$ represents a stronger protection of intellectual property rights.

One striking feature of the knowledge diffusion matrix $A_{N}$ is its asymmetry.
Some sectors contribute intensive knowledge spillovers to a large number of other sectors, for example, Electronic components and accessories and communications equipment; Office computing and accounting machines; and Professional and scientific instruments. Other sectors are almost isolated from the rest of the economy, for example, Ship and boat building and repairing, and Railroad equipment.

Given the heterogeneous sectoral knowledge contribution to the economy, any growth policy should consider its impact on the relative private returns among sectors, because firms allocate research efforts across sectors according to the relative private returns. More importantly, in the general equilibrium section of Appendix A.2, I show that the cross-sector resource allocation is important for growth, and that when resources are allocated according to a sector’s knowledge contribution to the economy, the economy obtains a higher growth rate.

Firm f’s manager chooses the $2K^{2}$ types of R&D inputs $\{N^{ij}_{f,t}, M^{ij}_{f,t}\}$, $i, j \in \{1, 2, ..., K\}$, so that the expected marginal returns from the $2K^{2}$ types of R&D are equal to their marginal costs. Solving a similar but more complicated firm’s problem than (1.2) (see Appendix A.2 for details), the dynamics of firm size in all K sectors can be summarized by

$$z_{f,t+1} = R_{f,t} z_{f,t} + L_{f,t}. \quad (1.20)$$

$R_{f,t}$ is a $K \times K$ random matrix, and the (i, j) element $R^{ij}_{f,t}$ measures the success rate when firm f uses its private knowledge from sector j to innovate in the creation of new products in sector i. $L_{f,t}$ is a K-dimensional random vector. The kth element of $L_{f,t}$ is the number of imitated products in sector k which firm f has invented. For instance, the size dynamics of firm f’s branch in sector k are

$$z^{k}_{f,t+1} = R^{k1}_{t} z^{1}_{f,t} + ... + R^{kk}_{t} z^{k}_{f,t} + ... + R^{kK}_{t} z^{K}_{f,t} + L^{k}_{t}, \quad (1.21)$$

$$L^{k}_{t} = L^{k1}_{t} \bar{Z}^{1}_{t} + ... + L^{kk}_{t} \bar{Z}^{k}_{t} + ... + L^{kK}_{t} \bar{Z}^{K}_{t}.$$

Proposition 3 If $\{z\}$ follows the dynamics in (1.20) and the random matrices $R_{f,t}$ and $L_{f,t}$ in (1.20) satisfy the restrictions in Kesten (44) (4.9), then for any vector $x \in \mathbb{R}^{K}$ with $|x| = 1$, there exists some $\mu$ such that $\{x'z\}$ follows a Pareto distribution with parameter $\mu$.

The above proposition means that a Pareto distribution exists in any subset of the economy. For example, when studying the firm size distribution of the kth sector, pick $x = (0, ..., 0, 1, 0, ..., 0)$ with the kth element equal to one and all others set to zero. When studying the size distribution of all firms in the entire economy, pick $x = \frac{1}{\sqrt{K}}(1, ..., 1)$; here the total firm size is the sum of the firm’s branch sizes in all K sectors.

Besides the implication about the universal Pareto distribution, the knowledge diffusion matrix A also determines the allocation of research resources across sectors. The knowledge of a sector k is more valuable, and sector k attracts more research investment, if sector k contributes more intensive knowledge spillovers to the entire economy (see Appendix A.2 for details).

The multi-sector model suggests one more reason to protect intellectual property rights. When policy makers strengthen intellectual property rights so that $\gamma$ is smaller, the private return of knowledge accumulation in each sector becomes higher, but the return of major knowledge-contributing sectors increases more than that of other sectors. As a result, firms invest a greater share of their research funds in the major knowledge givers than in other sectors. Economic growth indirectly benefits from the better cross-sector R&D resource allocation. In addition to the trade-off between a higher imitation rate and a lower private return of knowledge in the one-sector model, the growth-maximizing $\gamma$ here also takes into account its impact on cross-sector resource allocation.
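As an illustration of the size dynamics in (1.20) and of the projections $x'z$ used in Proposition 3, the following sketch performs one deterministic update for a firm in K = 2 sectors. All numbers, the function names, and the fixed (non-random) values of R and L are assumptions chosen for illustration; in the model R and L are random draws.

```python
import math


def step(z, R, L):
    """One period of z_{t+1} = R z_t + L for a single firm.

    z: list of K sector sizes; R: K x K nested list; L: list of K terms.
    R and L are fixed here for illustration; the model treats them as random.
    """
    K = len(z)
    return [sum(R[k][j] * z[j] for j in range(K)) + L[k] for k in range(K)]


def project(x, z):
    """Inner product x'z, e.g. one sector's size or the scaled total size."""
    return sum(xi * zi for xi, zi in zip(x, z))


# Hypothetical two-sector firm.
z = [10.0, 4.0]
R = [[1.10, 0.05],   # row k: contributions of each sector's knowledge to sector k
     [0.02, 1.00]]
L = [0.5, 0.2]       # imitation terms for each sector

z_next = step(z, R, L)

K = len(z)
sector_1 = project([1.0, 0.0], z_next)               # size in sector 1 only
aggregate = project([1 / math.sqrt(K)] * K, z_next)  # unit-norm aggregate direction
print(z_next, sector_1, aggregate)
```

The off-diagonal entries of R are what tie the sectors together: setting them to zero makes each sector evolve independently, which corresponds to the one-sector special case.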
1.5 Conclusion

This paper employs knowledge spillovers to examine two questions about firm size distribution: Why is firm size heterogeneity different across sectors? And why do firm size distributions follow dependent Pareto distributions in every subset of the economy?

The one-sector model answers the first question using sector-specific inter-firm knowledge spillovers efficiency. In sectors with abundant knowledge spillovers, firms invest more in imitation and less in innovation; therefore imitation contributes more substantially to the overall growth rate. Since every firm has an equal chance to learn from public knowledge, imitation has a stronger influence on smaller firms’ growth rates, which leads to a firm growth rate that declines with firm size. Faster catch-up of smaller firms generates a more homogeneous firm size distribution.

The one-sector model implies that, in sectors with a more homogeneous firm size distribution, knowledge spillovers are more abundant, firm growth rate declines faster with firm size, and imitation contributes more to the gross growth rate. These three testable implications are supported by the NBER Patent Citation Data. The advantage of this data set is that it keeps track of inter-firm knowledge spillovers, which allows for the measurement of the speed of knowledge diffusion and the separation of the shares of the innovation and imitation rates in the overall growth rate of the firm.

To answer the second question, the multi-sector model improves upon the one-sector model with two additional features: firms develop products in multiple sectors, and cross-sector knowledge spillovers allow for dynamics to interact across all sectors. As a result, the firm growth dynamic in any subset of the economy evolves in a pattern similar to that of the whole economy. This induces a Pareto firm size distribution with different scale parameters in any subset of the economy.
At the aggregate level, the micro-founded models lead to policy suggestions relevant to intellectual property rights and trade. The one-sector model suggests that the optimal level of intellectual property rights protection depends on the trade-off between a higher imitation rate and a lower private return of innovation. The multi-sector model suggests one more channel by which policies affect economic growth: the cross-sector R&D resource allocation. Strong intellectual property rights promote economic growth by encouraging firms to invest a larger share of their research funds in the intensive knowledge-contributing sectors, so that resource allocation is justified by each sector’s knowledge externality to the economy. In a similar manner, Cai and Li (11) show that trade cost harms economic growth by distorting the relative private returns of innovation between sectors of heterogeneous knowledge contributions to the economy and the cross-sector research resource allocation.

Table 1.1: OLS Regressions with U.S. Citations
Dependent variable: citation lag

Independent variable  1           2           3                     4
S. D. of log(ps) (a)  .776**      2.934**     1.262**               2.443***
                      (.372)      (1.236)     (.563)                (.455)
Log(dist) (b)         .137***     .122***     .142***               .030***
                      (.017)      (.013)      (.008)                (.010)
Log(PSciting) (c)     -.147***    -.110***    -.099***              .555***
                      (.008)      (.021)      (.004)                (.063)
Log(PScited) (d)      -.152***    -.080***    -.113***              -2.621***
                      (.008)      (.014)      (.005)                (.146)
Log(PSindustry) (e)   -.355***    -3.470***   -3.215***             1.208***
                      (.014)      (.429)      (.125)                (.128)
Fixed effects         State pair  Sector      State pair by sector  Firm pair
No. of observations   1132505     1132505     1132505               1132505
No. of groups         2626        42          47375                 719298
R square              .095        .093        .173                  .855

(a) Standard deviation of log scale patent stock for all firms in the sector.
(b) Log scale great circle distance between the citing patent and the cited patent.
(c) Log scale patent stock of the citing firm.
(d) Log scale patent stock of the cited firm.
(e) Log scale patent stock of the sector.
Table 1.2: Summary of Variables

Variable          Obs.      Mean     Std. Dev.  Min.    Max.
Log(dist)         1132640   6.711    1.784      0       9.460
Log(PSciting)     1132640   2.539    2.574      0       10.490
Log(PScited)      1132640   2.769    2.618      0       10.280
Log(PSindustry)   1132640   10.611   1.052      4.111   12.382
S. D. of Log(PS)  1132640   1.667    0.217      0.786   8.495

Table 1.3: Correlation between Variables

Variable          Lag       Log(dist)  Log(PSciting)  Log(PScited)  Log(PSindustry)
Lag               1
Log(dist)         0.0304    1
Log(PSciting)     -0.1211   0.0133     1
Log(PScited)      -0.1106   -0.0137    0.371          1
Log(PSindustry)   -0.0605   0.055      0.2123         0.2123        1
S. D. of Log(PS)  0.0511    -0.021     0.1906         0.2139        0.0942

Table 1.4: Table 2 in Broda and Weinstein (8)

Table 1.5: Multi-Sector U.S. Patent Owners

Number of Patents  Average Number of Patent Categories
1 - 10             1.34
11 - 100           3.89
101 - 1000         8.93
1001 - 10000       15.17
10000-             25.57

Source: NBER Patent Citation Data 1999

Table 1.6: Share of Cross-Sector Citations: Example

                       The Cited Sector
The Citing Sector (%)  1       2       6       7       8       9
1                      80.34   0.00    0.00    6.74    0.00    8.15
2                      0.33    38.59   0.33    0.66    8.70    0.00
6                      0.11    0.42    60.30   10.72   1.27    0.42
7                      0.46    0.41    5.16    58.46   4.52    14.06
8                      0.00    1.44    1.32    7.38    66.33   0.06
9                      1.09    0.30    0.30    14.68   0.24    67.73

Table 1.7: Intensity of Cross-Sector Citations: Example

                         The Cited Sector
The Citing Sector (%/%)  1       2       6       7       8       9
1                        80.34   0.00    0.00    6.74    0.00    8.15
2                        0.33    38.59   0.33    0.66    8.70    0.00
6                        0.11    0.42    60.30   10.72   1.27    0.42
7                        0.46    0.41    5.16    58.46   4.52    14.06
8                        0.00    1.44    1.32    7.38    66.33   0.06
9                        1.09    0.30    0.30    14.68   0.24    67.73

Table 1.8: Example Sector Names

1  Food and kindred products
2  Textile mill products
6  Industrial inorganic chemistry
7  Industrial organic chemistry
8  Plastic materials and synthetic resins
9  Agricultural chemicals

Table 1.9: Correlation between Different Measures of Firm Size Heterogeneity - French 4-Digit NAICS Manufacturing Sectors 1997

Measure   sd(lnl)  sd(lny)  sd(lns)  sd(lnva)
sd(lnl)   1
sd(lny)   0.964    0.961
sd(lns)   0.961    0.998    0.1
sd(lnva)  0.970    0.963    0.964    1

Table 1.10: Correlation between Different Measures of Firm Size Heterogeneity - Chilean 4-Digit SITC Manufacturing Sectors 1996

Measure   sd(lnl)  sd(lny)  sd(lns)  sd(lnva)
sd(lnl)   1
sd(lny)   0.825    1
sd(lns)   0.769    0.933    1
sd(lnva)  0.715    0.910    0.896    1

Table 1.11: OLS Regressions - Random Citations
Dependent variable: citation lag (a)

Independent variable                 1          2          3
S. D. of Log(PS)                     .424***    .160**     .168***
                                     (.059)     (.078)     (.066)
Log(dist)                            .065       -.051      0.039
                                     (.042)     (.026)     (0.026)
Log(PSciting)                        -.040***   -.002      -.001
                                     (.005)     (.002)     (.002)
Log(PScited)                         -.170***   -.108      -.160***
                                     (.010)     (.036)     (.007)
Log(PSindustry)                      -.175***   -3.603***  -3.341***
                                     (.014)     (.179)     (.096)
Year fixed effect                    Yes        Yes        Yes
State pair fixed effects             Yes        No         No
Industry fixed effects               No         Yes        No
State pair - industry fixed effects  No         No         Yes
No. of observations                  2120904    2120904    2120904
No. of groups                        2626       42         47375

(a) Robust standard errors clustered by sector are reported in the brackets. Year dummies are included.

Table 1.12: Summary of Variables - Random Citations

Variable          Obs.      Mean     Std. Dev.  Min.    Max.
Citation lag      2120904   8.332    6.385      0       94
Log(dist)         2120904   7.043    1.1897     0       9.530
Log(PSciting)     2120904   3.473    2.695      0       9.649
Log(PScited)      2120904   3.134    2.667      0       9.563
Log(PSindustry)   2120904   10.645   1.055      4.111   12.382
S. D. of Log(PS)  2120904   1.700    .243       .787    8.495

Table 1.13: Correlation between Variables - Random Citations

Variable          Lag       Log(dist)  Log(PSciting)  Log(PScited)  Log(PSindustry)
Lag               1
Log(dist)         -.009     1
Log(PSciting)     -.0317    -.047      1
Log(PScited)      -.062     -.040      .172           1
Log(PSindustry)   .020      .078       .211           .219          1
S. D. of Log(PS)  .054      -.042      .190           .199          .048

Table 1.14: OLS Regressions - G7 Citations
Dependent variable: citation lag (a)

Independent variable  1           2           3                  4
S. D. of Log(PS)      .576        .492        1.99**             1.917***
                      (.353)      (1.164)     (.774)             (.440)
Log(dist)             .078***     .122***     .104***            .044***
                      (.020)      (.050)      (.022)             (.007)
Log(PSciting)         -.149***    -.134***    -.112***           .562***
                      (.014)      (.008)      (.008)             (.050)
Log(PScited)          -.183***    -.179***    -.149***           -2.464***
                      (.044)      (.059)      (.051)             (.164)
Log(PSindustry)       -.335***    -2.787***   -2.939***          1.215***
                      (.089)      (.365)      (.373)             (.104)
Fixed effects         State pair  Sector      State pair-sector  Firm pair
No. of observations   2158761     2158761     2158761            2158761
No. of groups         49          42          1884               1238745
R square              .089        .098        .111               .805

(a) Robust standard errors clustered by sector are reported in the brackets. Year dummies are included.

Table 1.15: Summary of Variables - G7 Citations

Variable          Obs.      Mean     Std. Dev.  Min.    Max.
Citation lag      2158761   6.866    5.010      0       95
Log(dist)         2158761   7.310    2.036      0       9.760
Log(PSciting)     2158761   3.321    2.778      0       10.491
Log(PScited)      2158761   3.451    2.689014   0       10.280
Log(PSindustry)   2158761   10.651   1.033      4.111   12.382
S. D. of Log(PS)  2158761   1.687    .231       .786    8.494

Table 1.16: Correlation between Variables - G7 Citations

Variable          Lag       Log(dist)  Log(PSciting)  Log(PScited)  Log(PSindustry)
Lag               1
Log(dist)         0.038     1
Log(PSciting)     -0.155    0.035      1
Log(PScited)      -0.162    0.028      0.423          1
Log(PSindustry)   -0.058    0.025      0.273          0.278         1
S. D. of Log(PS)  0.026     0.024      0.157          0.181         0.069

Figure 1.1: Firm Size Distribution in Different Sectors - French Manufacturing Firms
Figure 1.2: Firm Size Distribution in Different Sectors - U.S. Patent Owners
Figure 1.3: Five-Year Firm Growth Rate - French Manufacturing Firms
Figure 1.4: Five-Year Firm Growth Rate - U.S. Patent Owners
Figure 1.5: Scale-Dependency of Firm Growth Rate - French Manufacturing Firms
Figure 1.6: Scale-Dependency of Firm Growth Rate - U.S. Patent Owners
Figure 1.7: Scale-Dependency of Imitation Rate and Innovation Rate - French Manufacturing Firms
Figure 1.8: Scale-Dependency of Imitation Rate and Innovation Rate - U.S. Patent Owners
Figure 1.9: Cross-Firm Citation’s Share in Total Citation by Sectors
Figure 1.10: Cross-Firm Citation’s Share in Total Citation by Year
Figure 1.11: Average Citation Distance by Year
Figure 1.12: Sectoral Firm Size Heterogeneity in U.S. 1997 and 2002
Figure 1.13: Sectoral Firm Size Heterogeneity in France 1997 and 2005
Figure 1.14: Sectoral Firm Size Heterogeneity in Chile 1979 and 1996
Figure 1.15: Sectoral Firm Size Heterogeneity in U.S. and France 1997
Figure 1.16: Sectoral Firm Size Heterogeneity in U.S. and France 2002
Figure 1.17: Simulated Cross-Firm Citation’s Share in Total Citation by Sectors
Figure 1.18: Scale-Independency of Innovation Rate by Sectors
Figure 1.19: Scale-Dependency of Imitation Rate by Sectors
Figure 1.20: Determinant of Firm Size Heterogeneity
Figure 1.21: Cross-Firm Citation’s Share in Total Citation by Sectors - G7 Countries

Chapter 2

Dynamic Formation of Directed Networks

2.1 Introduction

The literature on dynamic network formation among firms has primarily been concerned with understanding the formation of non-directed networks. It has been successful in explaining many empirical characteristics observed in actual social networks [1]. However, in reality, many networks transferring goods and information flows are directed. Examples of such networks are websites connected by hyperlinks, people connected by phone calls or emails, and firms connected by patent citations. In such directed networks, two nodes connected by one link are not symmetric but play different roles: initiator and receiver. People who send emails are not symmetric with those receiving them; patent citers are not the same as those being cited. In fact, for most networks, the symmetry of nodes is at best a
In fact, for most networks, the symmetry of nodes is at best a 1Jackson and Rogers (38) summarize five characteristics: (1) small average shortest distance between nodes; (2) positive clustering coefficients (clustering coefficients measure how often two nodes with a common friend are also friends); (3) power-law degree distribution. A quantity x obeys a Power-law if it is drawn from a probability distribution p(x)∝ x−α , where α is a constant parameter of the distribution known as the exponent or scaling parameter. In real-world situations the scaling parameter typically lays in the range 2<α < 3, although there are occasional exceptions; (4) positive correlation between degrees of linked nodes; and (5) negative correlation between the local clustering coefficient of a node’s neighborhood and the node’s degree. 55 simplifying assumption. There has, until now, been no model of dynamic networks formation for directed network. Any such model would need to provide insights into what leads selfish individuals to engage in network building, and to match the observed triple Power-law degree distributions of networks, i.e., the in-degree, out-degree and total-degree all follow Power-law degree distribution. This paper builds a dynamic model of directed network formation and demon- strates its application to a real-world directed network: a firm citation network. The model uses profit sharing between a firm with access to a customer and a firm that possesses the technology to produce what the customer wants, to explain individual firms’ incentives to build directed networks. When firms and customers are ran- domly matched, a representative firm may not be able to produce what its customer wants, but can gain by referring this customer to another firm whose production ca- pability can meet the demand. The preconditions are that firms know what others in the networks produce and that they can profit, perhaps via commission fees, by directing consumers towards them. 
The fact that firm i knows what firm j produces corresponds to a directed link pointing from firm i to firm j in the networks. With profit sharing as the form of commission, firms want to know other firms, so that they earn more commission fees, while at the same time they seek to become known by more other firms in order to obtain re-directed customers. In equilibrium, the dynamic process of network formation is decided by a firm’s trade-off between the benefit and cost of building networks. Johari, Mannor, and Tsitsiklis (41) use cost compensation to explain why private post offices build directed networks to deliver mail packages, but their model is static.

In a similar manner, we can understand a firm’s incentive to build a directed knowledge network: to discover a new good, a firm needs complementary knowledge from two random fields. In most cases, one firm only masters the knowledge in one field and needs to learn the other field’s knowledge from its peers. Since future knowledge demand is uncertain, each firm wants to link with more knowledge providers. But why are firms willing to provide knowledge to others without being paid directly? The key is the long-term collaboration feature of the networks and the uncertainty of future knowledge demand. A teacher today has an equal chance to learn from her/his learner in the future. Moreover, through communication with the learner, the teacher has a chance to build new links with unknown firms introduced by the learner. Therefore, the teacher is indirectly compensated by the chance to learn from more peers in the future. In other environments, as long as a firm’s economic activities need two complementary resources and each firm only owns one of them, a network that links the owners of both resources is a long-term solution.

This paper extends the dynamic network formation of non-directed networks in Holme and Kim (34), Vazquez (69), and Jackson and Rogers (38) (JR henceforth) into directed networks[2].
In non-directed networks, nodes build a new link either through the network-based method (knowing a friend’s friend) or the randomized method (knowing random unknown people). In directed networks, there is another layer of complexity within the network-based network formation method: the two directions of links. Nodes build new links via both old links in the same direction and old links in the opposite direction, at different success rates. The first success rate depends on a node’s tendency to maintain its current role as an initiator or receiver in the networks; the second success rate relies on a node’s possibility of switching its current role to the opposite role in the networks. According to Kesten (44), when the success rates in building new links through different methods are subject to i.i.d. ‘popularity’ shocks, the in-degree, out-degree, and total-degree distributions all converge to Power-law distributions, as seen in real directed networks. The inter-temporal causality between the two types of links in the bilateral network-based network formation is the key to generating the triple Power-law distribution.

[2] Network-based network formation means that two unconnected nodes with a common neighbor in the last period connect with each other in the current period (‘knowing a friend’s friend’). Its opposite is randomized network formation, where two randomly picked nodes connect with each other.

The model can be extended to understand the dynamic formation of more complex networks, where there are multiple types of nodes and links; for example, an exporter-market network and a buyer-seller network. The key to handling complex networks is the modeling of the inter-temporal causality between different types of links.

I illustrate the application of the model by considering firm citation network panel data. I construct the network data from the National Bureau of Economic Research (NBER) Patent Citation Database for 42 sectors from 1985 to 1994. This is clearly a directed network: one inter-firm citation corresponds to one directed link from the citing firm[3] to the cited firm[4]. With multiple years of data, I can observe the inter-temporal change in each sectoral citation network. I can determine both the links that are newly built and the method by which the new links are built: whether a new link is introduced by an old link in the same direction, introduced by an old link in the opposite direction, or created by the random meeting of two previously unconnected nodes. Identifying the method through which a node builds new connections allows me to infer the success rates of building new links via the different methods for each node. Knowing the distribution of these success rates, I am able to simulate the dynamic network formation process and compare the simulated networks with the real sectoral citation networks.

The simulation allows me to test whether the simple model of the degree[5] dynamic process also mimics other structural features in the real networks. The simulated network for sector s starts from a randomly generated network. In each period, new links are built by a mixture of network-based and randomized network formation methods. In the bilateral network-based network formation, every node introduces its unconnected friends to each other. A representative node i at time t is assigned an i.i.d. ‘popularity draw’ $po^{xy}_{it}$ from the estimated success rate distribution of building an x type new link from a y type old link in sector s, where $x, y \in \{\text{inward}, \text{outward}\}$. A higher popularity draw $po^{xy}_{it}$ means a y type old link is more likely to introduce an x type new link to node i. In the randomized network formation, two unconnected nodes i and j are randomly connected by a link from i to j with probability $r_s$, which is the estimated success rate of random matching in sector s. This process is repeated for 50 periods, which is long enough for the degree distributions to converge to Power-law.
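The simulation loop just described can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the estimation code behind the reported results: the node-specific popularity draws are replaced by fixed probabilities (`p_same`, `p_opposite`), only new outward links are generated, contact depreciation is applied through a hypothetical `delta` argument, and all function and parameter names are invented for the example.

```python
import random


def simulate_period(edges, n, p_same, p_opposite, r, delta, rng):
    """Advance a directed network by one period; an edge (i, j) means i knows j."""
    # Assumption: each old link survives with probability 1 - delta.
    kept = {e for e in sorted(edges) if rng.random() >= delta}
    out_nbrs = {i: set() for i in range(n)}
    in_nbrs = {i: set() for i in range(n)}
    for i, j in kept:
        out_nbrs[i].add(j)
        in_nbrs[j].add(i)

    new_edges = set(kept)
    for i in range(n):
        # Same direction: old links i -> j and j -> k may introduce i -> k.
        for j in sorted(out_nbrs[i]):
            for k in sorted(out_nbrs[j]):
                if k != i and rng.random() < p_same:
                    new_edges.add((i, k))
        # Opposite direction: old links j -> i and j -> k may introduce i -> k.
        for j in sorted(in_nbrs[i]):
            for k in sorted(out_nbrs[j]):
                if k != i and rng.random() < p_opposite:
                    new_edges.add((i, k))
    # Randomized formation: each unconnected ordered pair links with probability r.
    for i in range(n):
        for j in range(n):
            if i != j and (i, j) not in new_edges and rng.random() < r:
                new_edges.add((i, j))
    return new_edges


def simulate(n, periods, p_same, p_opposite, r, delta, seed=0):
    """Start from a sparse random network and iterate for `periods` periods."""
    rng = random.Random(seed)
    edges = {(i, j) for i in range(n) for j in range(n)
             if i != j and rng.random() < r}
    for _ in range(periods):
        edges = simulate_period(edges, n, p_same, p_opposite, r, delta, rng)
    return edges
```

Calling `simulate(n=200, periods=50, p_same=0.02, p_opposite=0.01, r=0.001, delta=0.05)` returns the final set of directed links, from which in-degree and out-degree counts can be tabulated. The parameter values here are arbitrary placeholders, not estimates.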
The simulated networks match actual networks not only in degree distributions, but also in clustering coefficients and other structural features.

[3] The firm that applies for a patent and cites other existing patents.
[4] A firm whose existing patents are cited by other patent applications.
[5] A node’s degree is the number of nodes with which it directly connects.

The remainder of the paper is organized as follows. The model section describes a firm’s motivation to build a directed social network and the methods of building new links. The data section introduces the NBER Patent Citation Data and illustrates how to infer the distribution of popularity draws in each network formation method. With the estimated distribution of popularity draws, I simulate artificial networks and compare them with the real sectoral citation networks. Lastly, the conclusion summarizes the paper.

2.2 The Model

2.2.1 Differentiated Goods Market

There are N firms evenly distributed on [0,1]. Each firm plays two roles: producer and dealer. As a producer, firm i produces one unit of a good in the interval $\left[i - \frac{1}{2N},\, i + \frac{1}{2N}\right]$ with one unit of labor. A consumer wants a random good on [0,1] each period. A customer for good j is willing to pay $P(1 - T|i - j|)$ dollars for good i. T measures the consumer’s intolerance to product difference. Assume $T > 2N$, so that consumers only accept the closest substitute available.

As a dealer, firm i may receive a query from a randomly matched customer j for good $j \in [0,1]$. If $j \in \left[i - \frac{1}{2N},\, i + \frac{1}{2N}\right]$, firm i serves customer j itself. Otherwise, firm i acts as a dealer and checks other producers on its contact list $C_{it} = \{k \mid i \text{ knows } k\}$ at time t. If there exists one producer such that $k \in C_{it}$ and $j \in \left[k - \frac{1}{2N},\, k + \frac{1}{2N}\right]$, dealer i introduces this customer to producer k and earns a commission fee $\theta(P - W)$. Producer k earns $(1 - \theta)(P - W)$. The wage rate W is normalized to 1. $\theta$ represents a dealer’s bargaining power. The market size is L.
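The dealer and producer roles above can be illustrated with a small routing function. This is a minimal sketch under assumptions that are not in the text: firm k (indexed from 0) is taken to cover the interval [k/N, (k+1)/N), the taste discount $T|i - j|$ is ignored so that every realized trade yields the margin P − W, and the names `owner_of`, `route_customer` and `contacts` are hypothetical.

```python
def owner_of(good, n_firms):
    """Index of the firm whose product interval contains `good` in [0, 1]."""
    return min(int(good * n_firms), n_firms - 1)


def route_customer(dealer, good, n_firms, contacts, price, theta, wage=1.0):
    """Return {firm: payoff} for one customer query received by `dealer`."""
    producer = owner_of(good, n_firms)
    margin = price - wage
    if producer == dealer:
        # The firm can serve the customer itself and keeps the whole margin.
        return {dealer: margin}
    if producer in contacts.get(dealer, set()):
        # Referral: the dealer earns theta * (P - W), the producer the rest.
        return {dealer: theta * margin, producer: (1.0 - theta) * margin}
    # No known producer can serve the customer, so no trade takes place.
    return {}
```

For instance, with four firms and `contacts = {0: {2}}`, a query for good 0.6 received by firm 0 is referred to firm 2, while a query for good 0.9 is lost because firm 0 does not know firm 3.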
More connections in the producer-dealer networks bring a higher expected profit to the firm. Denote the number of producers that i knows at time t (the number of elements in $C_{it}$) as $p_{it}$. Denote the number of dealers that know i at time t (the number of firms k such that $i \in C_{kt}$) as $d_{it}$. At time t, the expected profit for firm i that knows $p_{it}$ producers and is known by $d_{it}$ dealers is:

$$\pi_{it} = \frac{L}{N}\left(1 - \frac{T}{4N}\right)(P - 1)\left[1 + \theta p_{it} + (1 - \theta) d_{it}\right]. \tag{2.1}$$

The first part of the profit occurs when firm i serves the random customer itself. The second part originates from the commission fee, when firm i cannot serve the customer itself, but one of its producers can. The third part of the profit derives from the business introduced by the $d_{it}$ dealers that know firm i, when the dealers cannot serve their customers themselves. When there is a greater number of firms, each firm acquires a smaller market share $\frac{L}{N}$, but every realized trade brings higher profit $\left(1 - \frac{T}{4N}\right)$. This is because firms are more specialized and consumers will pay a higher price for a product tailored to their specific taste.

This dealer-producer relationship connects all firms by directed links: there is one directed link from firm i to its producer k. In network theory, $p_{it}$ is called the out-degree of firm i and $d_{it}$ is called the in-degree of firm i. Firm i’s profit (2.1) increases linearly with $p_{it}$ and $d_{it}$.

2.2.2 Methods of Building Networks

According to the degree-dependent income flow in (2.1), any profit-maximizing firm wants to expand its connections with other firms. As a dealer, there are three ways for firm i to know more producers: (1a) knowing a new producer k through a current producer j ($j \in C_{it}$, $k \in C_{jt}$ and $k \notin C_{it}$); (2a) knowing a new producer k through a current dealer j ($i \in C_{jt}$ and $k \in C_{jt}$); and (3a) randomly encountering a producer’s advertisement on the street.
As a producer, there are three ways for firm i to acquire new dealers: (1b) ask a current dealer j to forward firm i’s information to j’s producer k ($i \in C_{jt}$, $k \in C_{jt}$ and $i \notin C_{kt}$); (2b) ask a current producer j to forward firm i’s information to j’s producer k ($j \in C_{it}$ and $k \in C_{jt}$); and (3b) send its advertisement to a random firm on the street.

Methods (1a), (2a), (1b), and (2b) belong to network-based network formation, where firm i’s new connections built today depend on its position in yesterday’s networks. Methods (3a) and (3b) are called randomized network formation, in which the new connections built today are independent of yesterday’s network topology.

2.2.3 Dynamic Networks Formation Process

Communication Technology

To communicate with current producers and dealers, firm i spends $l^{p}_{it}$ hours listening and $t^{p}_{it}$ hours talking to a producer; $l^{d}_{it}$ hours listening and $t^{d}_{it}$ hours talking to a dealer; and $l^{r}_{it}$ hours listening and $t^{r}_{it}$ hours talking to a random unknown firm. While listening, firm i receives information about more producers from the talkers; while talking, firm i broadcasts its own or another firm’s information to the listeners.

Denote the number of new y gained by communication with current x as $\Delta^{yx}_{ijt}$, in which $x \in \{\text{producer, dealer, random firm}\}$, $y \in \{\text{producer, dealer}\}$, i is the talker, and j is the listener. For example, $\Delta^{pp}_{ijt}$ represents firm i’s number of new producers known by listening to producer j ($j \in C_{it}$) at time t.

Since the communication is bilateral, the outcome depends on the time inputs from both sides. A higher number of current links means firm i is more experienced in communication, which helps it make more links in the future. Suppose firm j is firm i’s producer, firm k is firm i’s dealer,[6] and firm r is a random firm unknown to firm i.
The expected numbers of new links built through different methods are as follows:

\begin{align*}
\Delta^{pp}_{jit} &= A_{pp}\,(t^{d}_{jt})^{\beta}\,(l^{p}_{it})^{\alpha} \\
\Delta^{dp}_{ijt} &= A_{dp}\,(t^{p}_{it})^{\beta}\,(l^{d}_{jt})^{\alpha} \\
\Delta^{pd}_{kit} &= A_{pd}\,(t^{p}_{kt})^{\beta}\,(l^{d}_{it})^{\alpha} \\
\Delta^{dd}_{ikt} &= A_{dd}\,(t^{d}_{it})^{\beta}\,(l^{p}_{kt})^{\alpha} \\
\Delta^{pr}_{ilt} &= A_{pr}\,(t^{r}_{rt})^{\beta}\,(l^{r}_{it})^{\alpha} \\
\Delta^{dr}_{ilt} &= A_{dr}\,(t^{r}_{it})^{\beta}\,(l^{r}_{rt})^{\alpha}
\end{align*}

Here $0 < \alpha < 1$, $0 < \beta < 1$, and $0 < \alpha + \beta < 1$. $A_{xy}$ is the technology of knowing new x through current y, in which $y \in \{\text{producer, dealer, random unknown firm}\}$ and $x \in \{\text{producer, dealer}\}$. $\alpha$ and $\beta$ measure the listener’s and talker’s share in the communication outcome, respectively. When $x, y \in \{\text{producer, dealer}\}$, $A_{xy}$ measures the likelihood of trusting and building a link with an introduced firm. $A_{xr}$ measures the likelihood of trusting and building links with random unknown firms.

[6] Notice that meanwhile firm i is also firm j’s dealer and firm i is firm k’s producer.

Institutions and social norms affect the efficiency of building networks by different methods directly and determine the network structure indirectly. For example, assume the government publishes each firm’s credit history, so that any firm can check an unknown firm’s record at a cost $\phi$. A smaller $\phi$ makes it easier to trust an unknown firm, which corresponds to a higher $A_{xr}$. In a society where people discriminate against others of different backgrounds, firm owners may be very selective when building linkages with other firms. They may prefer to connect with others of similar background, and may trust an introduced firm more than a random firm. This type of social norm increases the productivity of network-based methods $A_{xy}$, $x, y \in \{\text{producer, dealer}\}$, and lowers the productivity of randomized methods $A_{xr}$.

Firm’s Problem

Firm i maximizes firm value by choosing hour inputs $l^{p}_{it}$, $t^{p}_{it}$, $l^{d}_{it}$, $t^{d}_{it}$, $l^{r}_{it}$, and $t^{r}_{it}$.
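\begin{align*}
V_t(p_{it}, d_{it}) = \max\; & \frac{L}{N}\left(1 - \frac{1}{4N}\right)\left[P + \theta P p_{it} + \left((1 - \theta)P - 1\right) d_{it}\right] + \rho E\left[V_{it+1}(p_{it+1}, d_{it+1})\right] \\
& - p_{it}\left(l^{p}_{it} + t^{p}_{it}\right) - d_{it}\left(l^{d}_{it} + t^{d}_{it}\right) - N l^{r}_{it} - N t^{r}_{it}
\end{align*}

such that

\begin{align}
p_{it+1} &= (1 - \delta)\, p_{it} + \sum_{j \in C_i} A_{pp}\,(t^{d}_{jt})^{\beta}(l^{p}_{it})^{\alpha} + \varepsilon^{pp}_{it}\, p_{it} \tag{2.2} \\
&\quad + \sum_{i \in C_k} A_{pd}\,(t^{p}_{kt})^{\beta}(l^{d}_{it})^{\alpha} + \varepsilon^{pd}_{it}\, d_{it} + \sum_{r \notin C_i} A_{pr}\,(t^{r}_{rt})^{\beta}(l^{r}_{it})^{\alpha} + \varepsilon^{pr}_{it}\, N_t \tag{2.3} \\
d_{it+1} &= (1 - \delta)\, d_{it} + \sum_{j \in C_i} A_{dp}\,(t^{p}_{it})^{\beta}(l^{d}_{jt})^{\alpha} + \varepsilon^{dp}_{it}\, p_{it} \tag{2.4} \\
&\quad + \sum_{i \in C_k} A_{dd}\,(t^{d}_{it})^{\beta}(l^{p}_{kt})^{\alpha} + \varepsilon^{dd}_{it}\, d_{it} + \sum_{r \notin C_i} A_{pr}\,(t^{r}_{it})^{\beta}(l^{r}_{rt})^{\alpha} + \varepsilon^{dr}_{it}\, N_t \tag{2.5}
\end{align}

Firm value $V_t(p_{it}, d_{it})$ is a function of its in-degree and out-degree. (2.2) and (2.4) are the network formation functions of out-degree and in-degree. $\rho$ is the firm’s discount rate. $\delta$ is the depreciation rate of contact information. In the general equilibrium, $\delta$ is set to the average speed at which dealers get to know new producers, so that the average number of producers per dealer is constant over time. Otherwise, when $\delta$ is too small, the network degenerates to a fully connected network; or when $\delta$ is too big, the network breaks down. $\{\varepsilon^{pp}_{it}, \varepsilon^{pd}_{it}, \varepsilon^{pr}_{it}, \varepsilon^{dp}_{it}, \varepsilon^{dd}_{it}, \varepsilon^{dr}_{it}\}$ are the i.i.d. zero-mean shocks that firm i receives in different types of social activities at time t. They capture firm i’s random popularity in different types of social communication.

Firm i pays its representatives to gain more connections. In the first-order conditions (2.6) to (2.11), the expected marginal profit equals the marginal cost. Every firm takes other firms’ time inputs as given.

\begin{align}
l^{p}_{it} &= \left(\rho V_p \alpha A_{pp}\right)^{\frac{1}{1-\alpha}} \left(t^{d}_{jt}\right)^{\frac{\beta}{1-\alpha}} \tag{2.6} \\
l^{d}_{it} &= \left(\rho V_p \alpha A_{pd}\right)^{\frac{1}{1-\alpha}} \left(t^{p}_{kt}\right)^{\frac{\beta}{1-\alpha}} \tag{2.7} \\
l^{r}_{it} &= \left(\rho V_p \alpha A_{pr} N_t\right)^{\frac{1}{1-\alpha}} \left(t^{r}_{rt}\right)^{\frac{\beta}{1-\alpha}} \tag{2.8} \\
t^{p}_{it} &= \left(\rho V_d \beta A_{dp}\right)^{\frac{1}{1-\beta}} \left(l^{d}_{kt}\right)^{\frac{\alpha}{1-\beta}} \tag{2.9} \\
t^{d}_{it} &= \left(\rho V_d \beta A_{dd}\right)^{\frac{1}{1-\beta}} \left(l^{p}_{kt}\right)^{\frac{\alpha}{1-\beta}} \tag{2.10} \\
t^{r}_{it} &= \left(\rho V_d \beta A_{dr} N_t\right)^{\frac{1}{1-\beta}} \left(l^{r}_{rt}\right)^{\frac{\alpha}{1-\beta}} \tag{2.11}
\end{align}

An educated guess for the firm value function is $V(p, d) = v_p\, p + v_d\, d + u$.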
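\begin{align}
v_p &= \frac{L}{N}\left(1 - \frac{1}{4N}\right)(P - 1)\,\theta \tag{2.12} \\
&\quad + \rho v_p\left(1 - \delta + A_{pp}\,(t^{d}_{jt})^{\beta}(l^{p}_{it})^{\alpha}\right) + \rho v_d A_{pp}\,(t^{d}_{jt})^{\beta}(l^{p}_{it})^{\alpha} - \left(l^{p}_{it} + t^{p}_{it}\right) \tag{2.13} \\
v_d &= \frac{L}{N}\left(1 - \frac{1}{4N}\right)(P - 1)(1 - \theta) \tag{2.14} \\
&\quad + \rho v_p A_{pd}\,(t^{p}_{kt})^{\beta}(l^{d}_{it})^{\alpha} + \rho v_d\left(1 - \delta + A_{dp}\,(t^{p}_{it})^{\beta}(l^{d}_{jt})^{\alpha}\right) - \left(l^{d}_{it} + t^{d}_{it}\right) \tag{2.15}
\end{align}

$$u = \frac{L}{N}\left(1 - \frac{1}{4N}\right)P + \rho v_p A_{pr}\,(t^{r}_{rt})^{\beta}(l^{r}_{it})^{\alpha} + \rho v_d A_{pr}\,(t^{r}_{it})^{\beta}(l^{r}_{rt})^{\alpha} - N l^{r}_{it} - N t^{r}_{it}$$

To firm i, one link’s marginal value is equal to the discounted value of future profit from it and from new links introduced by that link, minus the cost of communication with linked firms. With free entry, the number of firms reaches the equilibrium when u is equal to the fixed entry cost $f_e$. For a potential entrant that has no linkage with any incumbent, its firm value includes only u, which is equal to the profit from exactly matched consumers plus the profit from links made with random firms, minus the cost of communication with random firms.

First-order conditions (2.6) to (2.11) show that firm i exerts more effort in knowing new producers (dealers) when its marginal value $v_p$ ($v_d$) is higher. In general equilibrium, every firm chooses the same time input portfolio $\{l^{p*}_{it}, t^{d*}_{it}, l^{d*}_{it}, t^{p*}_{it}, l^{r*}_{it}, t^{r*}_{it}\}$.

\begin{align}
l^{p*}_{it} &= \rho^{\frac{1}{1-\alpha-\beta}} \left(v_p \alpha A_{pp}\right)^{\frac{1-\beta}{1-\alpha-\beta}} \left(v_d \beta A_{dd}\right)^{\frac{\beta}{1-\alpha-\beta}} \tag{2.16} \\
t^{d*}_{it} &= \rho^{\frac{1}{1-\alpha-\beta}} \left(v_d \beta A_{dd}\right)^{\frac{1-\alpha}{1-\alpha-\beta}} \left(v_p \alpha A_{pp}\right)^{\frac{\alpha}{1-\alpha-\beta}} \tag{2.17} \\
l^{d*}_{it} &= \rho^{\frac{1}{1-\alpha-\beta}} \left(v_p \alpha A_{pd}\right)^{\frac{1-\beta}{1-\alpha-\beta}} \left(v_d \beta A_{dp}\right)^{\frac{\beta}{1-\alpha-\beta}} \tag{2.18} \\
t^{p*}_{it} &= \rho^{\frac{1}{1-\alpha-\beta}} \left(v_d \beta A_{dp}\right)^{\frac{1-\alpha}{1-\alpha-\beta}} \left(v_p \alpha A_{pd}\right)^{\frac{\alpha}{1-\alpha-\beta}} \tag{2.19} \\
l^{r*}_{it} &= \rho^{\frac{1}{1-\alpha-\beta}} \left(v_p \alpha A_{pr}\right)^{\frac{1-\beta}{1-\alpha-\beta}} \left(v_d \beta A_{dr}\right)^{\frac{\beta}{1-\alpha-\beta}} \tag{2.20} \\
t^{r*}_{it} &= \rho^{\frac{1}{1-\alpha-\beta}} \left(v_d \beta A_{dr}\right)^{\frac{1-\alpha}{1-\alpha-\beta}} \left(v_p \alpha A_{pr}\right)^{\frac{\alpha}{1-\alpha-\beta}} \tag{2.21}
\end{align}

\begin{align*}
v_p\left(1 - \rho(1 - \delta)\right) &= \frac{L}{N}\left(1 - \frac{1}{4N}\right)P\theta + \left(\frac{1}{\alpha} - 1\right) l^{p*}_{it} + \left(\frac{1}{\beta} - 1\right) t^{p*}_{it} \\
v_d\left(1 - \rho(1 - \delta)\right) &= \frac{L}{N}\left(1 - \frac{1}{4N}\right)\left(P(1 - \theta) - 1\right) + \left(\frac{1}{\alpha} - 1\right) l^{d*}_{it} + \left(\frac{1}{\beta} - 1\right) t^{d*}_{it} \\
f_e = u &= \frac{L}{N}\left(1 - \frac{1}{4N}\right)P + N\left(\frac{1}{\alpha} - 1\right) l^{r*}_{it} + N\left(\frac{1}{\beta} - 1\right) t^{r*}_{it}
\end{align*}

The equilibrium communication inputs $l^{p*}_{it}$, $t^{d*}_{it}$, $l^{d*}_{it}$, $t^{p*}_{it}$, $l^{r*}_{it}$, and $t^{r*}_{it}$, the firm value function parameters $v_p$ and $v_d$, and the number of firms N are jointly determined by the 9 equations above.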
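As a sanity check on the algebra, the closed forms (2.16) and (2.17) can be verified numerically against the first-order conditions (2.6) and (2.10). The sketch below is illustrative only: it treats $V_p = v_p$ and $V_d = v_d$ as given numbers, the parameter values are arbitrary, and the function names are hypothetical.

```python
import math


def equilibrium_pair(rho, alpha, beta, v_listener, a_listener, v_talker, a_talker):
    """Closed-form (listen, talk) inputs in the form of (2.16)-(2.17)."""
    e = 1.0 - alpha - beta
    x = v_listener * alpha * a_listener
    y = v_talker * beta * a_talker
    listen = rho ** (1 / e) * x ** ((1 - beta) / e) * y ** (beta / e)
    talk = rho ** (1 / e) * y ** ((1 - alpha) / e) * x ** (alpha / e)
    return listen, talk


def listen_best_response(rho, alpha, beta, v, a, talk):
    """Listening input implied by a first-order condition of the form (2.6)."""
    return (rho * v * alpha * a) ** (1 / (1 - alpha)) * talk ** (beta / (1 - alpha))


def talk_best_response(rho, alpha, beta, v, a, listen):
    """Talking input implied by a first-order condition of the form (2.10)."""
    return (rho * v * beta * a) ** (1 / (1 - beta)) * listen ** (alpha / (1 - beta))


# Arbitrary illustrative values satisfying 0 < alpha + beta < 1.
rho, alpha, beta = 0.95, 0.3, 0.4
v_p, v_d, a_pp, a_dd = 2.0, 1.5, 0.8, 0.6

l_star, t_star = equilibrium_pair(rho, alpha, beta, v_p, a_pp, v_d, a_dd)
# At the fixed point each input is a best response to the other.
fixed_point_ok = (
    math.isclose(l_star, listen_best_response(rho, alpha, beta, v_p, a_pp, t_star))
    and math.isclose(t_star, talk_best_response(rho, alpha, beta, v_d, a_dd, l_star))
)
```

The check works because substituting one first-order condition into the other raises the input to the power $\frac{(1-\alpha)(1-\beta) - \alpha\beta}{(1-\alpha)(1-\beta)}$, and $(1-\alpha)(1-\beta) - \alpha\beta = 1 - \alpha - \beta$, which is the common denominator of the exponents in (2.16) to (2.21).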
There is another way to explain the second part of $v_p$ and $v_d$, namely $\left(\frac{1}{\alpha} - 1\right) l^{p*}_{it} + \left(\frac{1}{\beta} - 1\right) t^{p*}_{it}$ and $\left(\frac{1}{\alpha} - 1\right) l^{d*}_{it} + \left(\frac{1}{\beta} - 1\right) t^{d*}_{it}$. They capture the externality from other linked firms. The reason is that building new links depends on the efforts of both the listener and the talker, but each party only pays for its own effort. One party’s higher input also induces a larger input from the other party. As a result, in a network where every firm puts more time into communicating with others, the value of each link is higher due to the bilateral externality.

Notice that $v_p$ is still positive even if a producer does not share profit with a dealer ($\theta = 0$), because the dealer is rewarded with the externality from the producer’s communication effort, which is the chance to know new dealers and producers while talking and listening to the producer. In the firm citation networks, a dealer acts as a knowledge teacher, who provides the type of knowledge that is complementary to the producer’s knowledge. $\theta$ measures the strength of intellectual property rights protection (IPRP). Under the worst institution of IPRP ($\theta = 0$), although firms’ communication efforts are not maximized, the knowledge network is still sustainable because the teacher is rewarded with the possibility of knowing new teachers and learners in the future. Nevertheless, an extremely strong IPRP, with $\theta$ approaching 1, is not optimal either.

In a special case where $\alpha = \beta$ and matrix A is symmetric, firms exert the largest communication efforts when $\theta = 0.5$ and $v_p = v_d$. That is due to the complementarities between the efforts from the two parties: when $\theta$ deviates from 0.5, the party that gains a smaller share of profit reduces its communication input; the other party also reduces its effort because it enjoys a smaller externality.

Substituting the Nash equilibrium time inputs (2.16) to (2.21) into the network formation functions (2.4) and (2.2) solves the dynamic network formation processes for firm i.
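$$\begin{pmatrix} p_{it+1} \\ d_{it+1} \end{pmatrix} = F_t \begin{pmatrix} p_{it} \\ d_{it} \end{pmatrix} + R_t \tag{2.22}$$

$$F_t = \begin{pmatrix} F^{pp} & F^{pd} \\ F^{dp} & F^{dd} \end{pmatrix}$$

\begin{align*}
F^{pp}_t &= 1 - \delta + \left(v_p \alpha A_{pp}\right)^{\frac{\alpha}{1-\alpha-\beta}} \left(v_d \beta A_{dd}\right)^{\frac{\beta}{1-\alpha-\beta}} + \varepsilon^{pp}_{it} \\
F^{pd}_t &= \left(v_p \alpha A_{pd}\right)^{\frac{\alpha}{1-\alpha-\beta}} \left(v_d \beta A_{dp}\right)^{\frac{\beta}{1-\alpha-\beta}} + \varepsilon^{pd}_{it} \\
F^{dp}_t &= \left(v_p \alpha A_{pd}\right)^{\frac{\alpha}{1-\alpha-\beta}} \left(v_d \beta A_{dp}\right)^{\frac{\beta}{1-\alpha-\beta}} + \varepsilon^{dp}_{it} \\
F^{dd}_t &= 1 - \delta + \left(v_p \alpha A_{pp}\right)^{\frac{\alpha}{1-\alpha-\beta}} \left(v_d \beta A_{dd}\right)^{\frac{\beta}{1-\alpha-\beta}} + \varepsilon^{dd}_{it}
\end{align*}

$$R_t = \begin{pmatrix} \left(v_p \alpha A_{pr}\right)^{\frac{\alpha}{1-\alpha-\beta}} \left(v_d \beta A_{dr}\right)^{\frac{\beta}{1-\alpha-\beta}} + \varepsilon^{pr}_{it} \\ \left(v_p \alpha A_{pr}\right)^{\frac{\alpha}{1-\alpha-\beta}} \left(v_d \beta A_{dr}\right)^{\frac{\beta}{1-\alpha-\beta}} + \varepsilon^{dr}_{it} \end{pmatrix}$$

Notice that the expectation of matrix $F_t$ is center-symmetric, because every firm’s inputs are the same. For example, firm i spends as much time listening to its producer j as its dealer k listens to firm i. Firm j also spends as much time talking to firm i as firm i talks to firm k, which is why the upper-left and lower-right elements of $E(F)$ are the same. Similarly, the upper and lower elements of $E(R)$ are symmetric, because firm i spends the same time listening (talking) to a random firm as a random firm spends listening (talking) to firm i.

Proposition 4 According to Kesten (44), when $\{\varepsilon^{pp}_{it}, \varepsilon^{pd}_{it}, \varepsilon^{pr}_{it}, \varepsilon^{dp}_{it}, \varepsilon^{dd}_{it}, \varepsilon^{dr}_{it}\}$ are independently and identically distributed across firms and time, for any two-dimensional vector x with $|x| = 1$, as $t \to \infty$, $x' \begin{pmatrix} p_{it} \\ d_{it} \end{pmatrix}$ follows a Pareto distribution with parameter $\mu_x$.

By choosing $x = (1, 0)$, $(0, 1)$, and $\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$, I obtain the distributions of the out-degree ($p_{it}$), in-degree ($d_{it}$) and total degree ($p_{it} + d_{it}$) of the networks. Since the matrices F and R in (2.22) are symmetric, $p_{it}$ and $d_{it}$ have the same Pareto distribution parameter $\mu$. Notice that the Power-law distribution is the discrete counterpart of the Pareto distribution. Since the number of links is a discrete number, the out-degree, in-degree and total degree exhibit Power-law distributions.

2.3 Welfare

In this simple environment, network density, the number of firms (degree of specialization), and the heterogeneity of degree distribution jointly determine consumer welfare.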
Communication technologies, the fixed market entry cost, IPRP institutions, and social norms influencing firms’ attitudes to unknown firms and introduced firms are the fundamentals that set the above network features.

2.3.1 Risk-Neutral Consumers

The social welfare $L\,\frac{E(d)}{N}\left(1 - \frac{T}{4N}\right)$ depends on the network density[7] $\frac{E(d)}{N}$ and the degree of specialization $\frac{1}{N}$. Higher network density improves the chance of realizing a trade, while specialization increases consumer utility from each trade.

[7] The average number of outward (inward) links of each firm in the network.

For networks with a constant network density, the contact depreciation rate $\delta$ must be equal to the rate at which new links are built.

$$\delta = A_{pp}\,(t^{d}_{jt})^{\beta}(l^{p}_{it})^{\alpha} + A_{pd}\,(t^{p}_{kt})^{\beta}(l^{d}_{it})^{\alpha} + \frac{N}{E(d)}\, A_{pr}\,(t^{r}_{rt})^{\beta}(l^{r}_{it})^{\alpha}$$

Therefore, network density decreases with the depreciation rate of links $\delta$ and increases with the productivity of building links via different methods, $A_{pp}$, $A_{pd}$ and $A_{pr}$. In a knowledge network, an IPRP that divides profit equally between learner and teacher maximizes the communication inputs from both sides; in the meantime, network density is also maximized. As long as network density is constant, more firms in the industry lead to deepening specialization without shrinking each firm’s market share or likelihood of trade.

2.3.2 Risk-Averse Consumers

When consumers are risk-averse in terms of trade likelihood, the variance of the degree distribution enters the welfare function. When a consumer wants a random good j each period, a more heterogeneous degree distribution lowers the consumer’s expected utility at a given network density, because the trade likelihood varies too greatly across types of goods.

Referring to the discussion in the ‘Degree Distribution Heterogeneity and Ratio r’ section below, heterogeneity of the degree distribution decreases with the relative productivity of the randomized method compared with the network-based method of building new links.
For example, if firms can easily trust a random unknown firm, $A_{dr}$ is higher relative to $A_{dd}$ and $A_{dp}$, and therefore a higher percentage of links are built through the randomized method instead of network-based methods. When the randomized method dominates, all firms have a relatively equal number of links. In contrast, when network-based methods dominate, the well-connected firms today will be even better connected tomorrow. As the rich become richer, the degree distribution becomes more dispersed. Referring to the discussion in the ‘Communication Technology’ section above, a higher cost $\phi$ to check an unknown firm’s credit record or a discriminative social norm induces a lower $A_{dr}$ relative to $A_{dd}$ and $A_{dp}$, and hence a more heterogeneous degree distribution.

In summary, regardless of consumer risk attitudes, social welfare increases with the productivity of building and preserving linkages among firms and decreases with the fixed market entry cost. When consumers are risk-averse, institutions and social norms that encourage firms to trust and connect with random unknown firms lead to a more homogeneous degree distribution and higher consumer welfare.

2.4 Data

2.4.1 Data Description

The NBER Patent Citation Database published by the U.S. Patent Office reports patent applications in 42 broad SIC classifications from 1962 to 2002. With multiple years, it allows me to track the inter-temporal change of networks. With 42 sectors, it is also convenient for comparing across sectors. I use within-sector inter-firm citations made between U.S. firms from 1985 to 1995 to construct sectoral citation networks for 42 sectors.

Figure 2.1 shows the three-dimensional graph of a real firm citation network based on the Refrigeration and Service Industry Machinery sector during the period 1990-1994. Each berry represents a firm, and a link with an arrow indicates a citation from the citing firm to the cited firm.
There are numerous layers in the networks: firms with more links lie closer to the center of the networks, while firms with fewer links stay in the periphery of the networks.

A sectoral citation network is constructed as follows. Every firm is a node in the networks. Every citation is a directed link pointing from the citing firm to the cited firm. At time t, an n by n adjacency matrix $M_{st}$ summarizes sector s’ citation networks, where n is the total number of firms. $M_{st}(i, j) = 1$ if firm i cites firm j; otherwise $M_{st}(i, j) = 0$. Firm i’s out-degree (number of outward links, or producers $p_{it}$ in the model) is the number of ones in the ith row of $M_{st}$. Firm i’s in-degree (number of inward links, or dealers $d_{it}$ in the model) is the number of ones in the ith column of $M_{st}$. The total number of inward and outward links is called total-degree. Denote it as $t_{it} = p_{it} + d_{it}$.

2.4.2 Stylized Facts about Citation Networks

Newman (59) and JR summarize five stylized facts that socially generated networks share. Sectoral firm citation networks have all these characteristics. Notice that these five facts are all static, not dynamic, features of networks. I report (a), (d) and (e) in Table 2.1, and (b) and (c) in Table 2.2.

(a) Average shortest distance between pairs of nodes is small.
(b) As with other social networks, clustering coefficients[8] are larger than those in randomly generated networks.
(c) Power-law in-degree (d), out-degree (p), and total-degree (t) distribution (triple Power-law).
(d) Positive sorting. Degrees and patent stocks of linked firms are positively correlated. Average geographic distances between the citing and the cited firms are much shorter than the average distance between two randomly picked firms in the same sector.
(e) The clustering among the neighbors of a given node is inversely related to the node’s degree.

[8] They measure how likely it is that two nodes with a common connected node are also connected.
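The adjacency-matrix construction of Section 2.4.1 translates directly into code. The following minimal sketch uses plain Python lists; the function names and the toy citation list are hypothetical. The local clustering function ignores link direction, which is an assumption made for illustration and may differ from the clustering measures reported in the tables.

```python
def build_adjacency(n, citations):
    """n-by-n matrix with M[i][j] = 1 if firm i cites firm j, else 0."""
    adj = [[0] * n for _ in range(n)]
    for citing, cited in citations:
        if citing != cited:  # only inter-firm citations form links
            adj[citing][cited] = 1
    return adj


def degrees(adj):
    """Out-degree (row sums), in-degree (column sums) and total-degree."""
    n = len(adj)
    out_deg = [sum(row) for row in adj]
    in_deg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    total = [p + d for p, d in zip(out_deg, in_deg)]
    return out_deg, in_deg, total


def local_clustering(adj, i):
    """Share of pairs of i's neighbors that are linked, ignoring direction."""
    n = len(adj)
    nbrs = [j for j in range(n) if j != i and (adj[i][j] or adj[j][i])]
    k = len(nbrs)
    if k < 2:
        return 0.0
    linked = sum(
        1
        for a in range(k)
        for b in range(a + 1, k)
        if adj[nbrs[a]][nbrs[b]] or adj[nbrs[b]][nbrs[a]]
    )
    return linked / (k * (k - 1) / 2)


# Toy example: four firms and four inter-firm citations.
M = build_adjacency(4, [(0, 1), (0, 2), (1, 2), (3, 0)])
p, d, t = degrees(M)
```

In the toy example firm 0 cites two firms and is cited by one, so its out-degree is 2, its in-degree is 1 and its total-degree is 3.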
2.4.3 Degree Distribution Heterogeneity and Ratio r in JR

JR predicts that the network degree distribution is more heterogeneous and the network structure is more clustered when nodes build more new links with a friend’s friend, and form fewer new links with a random node. Because a connection with a friend’s friend is a type of ‘preferential attachment’, nodes with more links today acquire more new links tomorrow. In contrast, every node has an equal chance of building new links with random nodes in randomized network formation, which tends to eliminate current differences in degree numbers.

The patent citation data contain directed firm network information for more than 30 years and 42 sectors, which permits me to test these predictions across sectors. For sector s at time t, I estimate the key parameter $r_{st}$ in JR, the ratio of new links with a random node to new links with a friend’s friend. I also estimate the Power-law degree distribution parameter $\mu_{st}$ and calculate the three measures of clustering coefficient $C^{TT}_{st}$, $C_{st}$, and $C^{Avg}_{st}$ listed in JR. I give the details of estimating $r_{st}$, $\mu^{in}_{st}$, $\mu^{out}_{st}$, $\mu^{total}_{st}$, $C^{TT}_{st}$, $C_{st}$, and $C^{Avg}_{st}$ in Appendix 2.1.

As predicted in their paper, in Figures 2.2 to 2.4, $\mu^{in}_{st}$, $\mu^{out}_{st}$, and $\mu^{total}_{st}$ are higher (the degree distribution is more homogeneous) when $r_{st}$ is higher in sector s. In Figure 2.5 to Figure 2.7, $C^{TT}_{st}$, $C_{st}$, and $C^{Avg}_{st}$ are smaller when $r_{st}$ is higher in sector s.

2.4.4 Simulation

To test whether the network formation process specified in the model section mimics the real network formation process, I simulate a directed network for every sector and compare the simulated networks with the real sectoral networks. Before simulating, I must estimate the distributions of the random matrices $F_s$ and $R_s$ in (2.22), $G_{F_s}(F_s)$ and $G_{R_s}(R_s)$, for every sector s. I give the details for estimating $G_{F_s}(F_s)$ and $G_{R_s}(R_s)$ in Appendix B.
I then fit $\{F_{ft}\}$ and $\{R_{ft}\}$ with log-normal distributions. $\delta_s$ is set to be the average growth rate of new links. The detailed simulation process is listed in Appendix C.

In Figure 2.8 to Figure 2.13, I compare the degree distribution parameters $\mu^{in}_{st}(t)$, $\mu^{out}_{st}(t)$, and $\mu^{total}_{st}(t)$ as well as the clustering coefficients $C^{TT}_{st}(t)$, $C_{st}(t)$, and $C^{Avg}_{st}(t)$ in the simulated networks with their values in the real sectoral networks. Each dot represents a sector. The straight line is the 45 degree line. The $\mu^{in}_{st}(t)$, $\mu^{out}_{st}(t)$, $\mu^{total}_{st}(t)$, $C^{TT}_{st}(t)$, $C_{st}(t)$, and $C^{Avg}_{st}(t)$ reported for simulated networks are the average values of the last 20 periods. The values reported for real networks are the five-year averages from 1991 to 1995.

The simulated networks mimic the real networks in terms of degree distributions and clustering coefficients ($\mu^{in}_{st}(t)$, $\mu^{out}_{st}(t)$, $\mu^{total}_{st}(t)$, and $C^{TT}_{st}(t)$). $C_{st}(t)$ and $C^{Avg}_{st}(t)$ in the simulated networks deviate from their counterparts in the real networks, but the rank across sectors is still retained. The more highly clustered sectors in the real world are still more highly clustered in the simulated world.

In Figure 2.14, I compare the simulated value and real value of the correlation between the clustering among the neighbors of a given node and the node’s degree for all sectors. Although most sectors still exhibit negative correlation between the clustering among the neighbors of a given node and the node’s degree, the simulated networks do not preserve the cross-sector rank found among real networks.

Among the real networks, the more highly clustered sectors also have a more heterogeneous degree distribution (lower $\mu^{in}_{st}(t)$, $\mu^{out}_{st}(t)$, and $\mu^{total}_{st}(t)$), as displayed in Figure 2.15. Figure 2.16 shows that this rank is also maintained in the simulated world.

In conclusion, the simulated sectoral networks have a structure similar to their corresponding real networks.
The cross-sector ranks in many structural measures are preserved in all but one case.

2.5 Conclusion

This paper extends the current literature on dynamic network formation to directed networks. The model uses profit sharing to explain a firm’s motivation to build directed networks. Firms with customer access may not have the technology to produce what the consumer wants. Within a directed network, the firm with customer access introduces the customer to the firm with the required technology and receives a commission fee as an incentive. Since the communication outcome depends on the efforts of both parties, there is an externality from staying in touch with linked firms. Moreover, a knowledge network is still sustainable even when the knowledge teacher is not rewarded by profit sharing; instead the reward comes from the opportunity to know new teachers and learners in the future through the communication with the current learner.

The model extends the network-based network formation method in Jackson and Rogers (38) by modeling the inter-temporal causality between two types of links. A current link in one direction may introduce new links in both directions. The inter-temporal causality between links in two directions is the key to generating the triple Power-law degree distribution of in-degree, out-degree and total-degree, as observed in real directed networks.

Networks allow firms to become more specialized without losing customers. The reason is that a higher number of firms in the market also brings more potential dealers who redirect customers. Social welfare increases with the efficiency of building and preserving linkages among firms, because of higher network density and trade likelihood.
Due to the complementarities between the communication efforts of the two parties, an intellectual property rights institution that divides the profit equally between the knowledge owner and the learner maximizes the communication efforts of both parties and the network density. A lower fixed market entry cost also improves welfare by hosting a larger number of more specialized firms. When consumers are risk-averse, institutions and social norms that encourage firms to trust and connect with random unknown firms lead to a more homogeneous degree distribution, which increases consumer welfare by reducing the variance of cross-good trade likelihood.

The empirical part of the paper constructs the sectoral firm citation networks from the NBER Patent Citation Database and estimates the model parameters from the panel network data. The simulated networks have a structure similar to their real counterparts. The empirical section also confirms the predictions in Jackson and Rogers (38).

In future research, the extended model can be used to understand the dynamic formation of more complex networks with multiple types of nodes and links.
Table 2.1: Stylized Facts (a), (d) and (e) for Firm Citation Networks

Columns: (1) patent category; (2) SIC87 code; (3) average shortest distance between two nodes (links); (4) correlation in total degree; (5) correlation in patent stock (significance level); (6) great circle distance between random nodes (kilometers); (7) great circle distance between linked nodes (kilometers); (8) correlation between local clustering and total degree.

(1)   (2)    (3)     (4)             (5)             (6)        (7)        (8)
1     20     3.568   0.256 (0.00)    0.174 (0.00)    6941.487   1061.766   -0.059
2     22     4.161   0.335 (0.00)    0.274 (0.00)    6991.833   793.612    -0.057
6     281    4.347   0.223 (0.00)    0.259 (0.00)    6487.991   988.637    -0.052
7     286    4.897   0.334 (0.00)    0.309 (0.00)    7159.114   724.757    -0.062
8     282    3.833   0.262 (0.00)    0.219 (0.00)    6724.499   724.359    -0.048
9     287    4.065   0.326 (0.00)    0.317 (0.00)    6835.288   839.865    -0.059
11    284    3.311   0.318 (0.00)    0.285 (0.00)    6250.190   804.956    -0.051
12    285    4.130   0.189 (0.00)    0.222 (0.00)    6735.441   662.364    -0.048
13    289    5.133   0.374 (0.00)    0.216 (0.00)    6643.541   844.491    -0.056
14    283    3.888   0.198 (0.00)    0.143 (0.00)    6366.191   1498.119   -0.058
15    1329   3.249   0.175 (0.00)    0.276 (0.00)    6566.865   894.687    -0.051
16    30     3.311   0.250 (0.00)    0.367 (0.00)    6950.680   994.404    -0.057
17    32     5.817   0.336 (0.00)    0.428 (0.00)    6890.830   918.441    -0.061
19    331+   2.903   0.215 (0.00)    0.127 (0.00)    6534.431   606.400    -0.036
20    333+   3.484   0.267 (0.00)    0.197 (0.00)    6769.626   963.375    -0.021
21    34-    5.574   0.259 (0.00)    0.308 (0.00)    6766.991   1064.213   -0.058
23    351    3.580   0.159 (0.00)    0.370 (0.00)    6766.586   939.101    -0.063
24    352    5.963   0.270 (0.00)    0.285 (0.00)    7044.228   1033.380   -0.036
25    353    5.962   0.197 (0.00)    0.383 (0.00)    6777.036   918.523    0.000
26    354    5.424   0.180 (0.00)    0.296 (0.00)    6983.690   941.041    -0.048
27    357    3.580   0.167 (0.00)    0.191 (0.00)    6687.346   1618.286   -0.062
29    355    6.037   0.201 (0.00)    0.328 (0.00)    6648.464   891.189    -0.058
30    356    5.947   0.180 (0.00)    0.315 (0.00)    6695.607   929.471    -0.060
31    358    4.035   0.223 (0.00)    0.214 (0.00)    7215.740   1116.276   -0.025
32    359    1.675   0.294 (0.00)    0.243 (0.00)    6127.337   1262.372   -0.045
35    361+   3.626   0.105 (0.00)    0.211 (0.00)    6974.893   1305.647   -0.050
36    362    3.978   0.152 (0.00)    0.246 (0.00)    7042.136   1016.173   -0.057
38    363    2.123   0.191 (0.00)    0.289 (0.00)    6359.496   670.735    -0.017
39    364    4.759   0.285 (0.00)    0.378 (0.00)    6888.798   1174.515   -0.057
40    369    4.059   0.233 (0.00)    0.299 (0.00)    6863.550   1238.117   -0.055
42    365    3.315   0.144 (0.00)    0.183 (0.00)    7250.125   1236.902   -0.061
43    366+   4.280   0.170 (0.00)    0.251 (0.00)    6953.626   1325.210   -0.059
46    371    3.764   0.215 (0.00)    0.457 (0.00)    7060.468   944.429    -0.056
47    376    2.404   -0.061 (0.00)   0.249 (0.00)    7256.875   1975.178   0.212
49    373    1.967   0.268 (0.00)    0.102 (0.00)    6917.394   1721.915   0.056
50    374    3.891   0.115 (0.00)    0.273 (0.00)    5475.854   775.826    -0.045
51    375    2.238   0.194 (0.00)    0.199 (0.00)    6505.385   1770.709   0.191
52    379-   1.589   0.044 (0.00)    0.065 (0.00)    5743.242   1252.619   NaN
53    348+   4.113   0.257 (0.00)    0.271 (0.00)    6884.717   1379.926   -0.018
54    372    2.872   -0.002 (0.95)   0.336 (0.00)    5551.671   1193.121   -0.053
55    38-    6.257   0.193 (0.00)    0.412 (0.00)    6945.959   1379.452   -0.058
56    99     6.860   0.277 (0.00)    0.270 (0.00)    6904.848   1336.209   -0.054

Table 2.2: Stylized Facts (b) and (c) for Firm Citation Networks

Columns: (1) patent category; (2) SIC87 code; (3) Mu-in [9]; (4) Mu-out; (5) Mu-total; (6) Mu-ps; (7) CTT; (8) C; (9) Cavg.

(1)   (2)    (3)            (4)            (5)            (6)            (7)    (8)    (9)
1     20     .666 (.97)     .592 (.50)     .639 (.88)     1.941 (.88)    .045   .043   .139
2     22     .817 (.91)     .724 (.10)     .791 (.38)     1.856 (.31)    .050   .049   .178
6     281    .744 (.93)     .675 (.73)     .717 (.90)     1.839 (.61)    .056   .054   .151
7     286    .791 (.78)     .714 (.62)     .756 (.97)     1.635 (0)      .055   .052   .229
8     282    .656 (.26)     .576 (.10)     .638 (.72)     1.663 (.02)    .056   .055   .173
9     287    .794 (.17)     .723 (.47)     .768 (.01)     1.584 (.01)    .055   .053   .235
11    284    .612 (.98)     .521 (.10)     .603 (.96)     1.843 (1.00)   .049   .047   .135
12    285    .876 (.64)     .753 (.01)     .825 (.84)     1.486 (.30)    .053   .052   .150
13    289    .692 (.92)     .653 (.59)     .680 (.95)     1.898 (.61)    .049   .048   .161
14    283    .925 (.74)     .905 (.27)     .910 (.49)     1.710 (.37)    .046   .046   .188
15    1329   .568 (.61)     .508 (.10)     .551 (.66)     1.713 (.087)   .041   .040   .122
16    30     .888 (.45)     .849 (.99)     .866 (.76)     1.803 (.24)    .054   .051   .243
17    32     .867 (.84)     .749 (.05)     .829 (.24)     2.087 (.95)    .050   .050   .174
19    331+   .885 (.56)     .720 (.10)     .837 (.65)     1.881 (.27)    .023   .022   .055
20    333+   .850 (.40)     .714 (.64)     .804 (.32)     2.263 (.83)    .038   .041   .085
21    34-    1.035 (.93)    .961 (.92)     1.003 (.69)    2.052 (.20)    .054   .052   .223
23    351    .567 (.32)     .539 (.51)     .571 (.31)     1.722 (.77)    .051   .050   .178
24    352    1.043 (.65)    .933 (.70)     1.010 (.89)    2.136 (.76)    .040   .039   .104
25    353    .907 (.97)     .852 (.93)     .891 (.77)     2.139 (.78)    .053   .049   .161
26    354    1.146 (.60)    1.010 (.10)    1.091 (.38)    1.844 (.58)    .042   .042   .133
27    357    .572 (.13)     .566 (.41)     .580 (.17)     1.731 (.60)    .056   .053   .298
29    355    .983 (.84)     .936 (.19)     .967 (.80)     1.697 (.65)    .054   .050   .198
30    356    .958 (.98)     .859 (.56)     .920 (.10)     1.787 (.20)    .053   .051   .234
31    358    1.204 (.51)    1.018 (.10)    1.155 (.03)    1.870 (.12)    .042   .039   .074
32    359    1.494 (.10)    1.021 (.10)    1.300 (.10)    1.937 (.95)    .008   .008   .038
35    361+   .826 (.94)     .803 (.85)     .817 (.87)     1.855 (.89)    .059   .057   .177
36    362    .896 (.45)     .836 (.84)     .890 (.54)     1.963 (.95)    .048   .048   .179
38    363    .780 (.52)     .566 (.10)     .744 (.01)     1.867 (.21)    .017   .012   .028
39    364    .953 (.15)     .858 (.38)     .907 (.89)     1.850 (.28)    .047   .044   .139
40    369    .750 (.97)     .659 (.22)     .722 (.74)     1.713 (.11)    .050   .050   .166
42    365    .727 (.95)     .709 (.10)     .731 (.41)     1.500 (.28)    .048   .045   .173
43    366+   .626 (.99)     .614 (.65)     .622 (.82)     1.759 (.68)    .056   .052   .295
46    371    .649 (.10)     .608 (.10)     .649 (.88)     1.760 (.93)    .055   .054   .172
47    376    .963 (.10)     .742 (.01)     .911 (.50)     1.524 (.52)    .010   .009   .008
49    373    1.259 (.10)    .915 (.10)     1.122 (.39)    1.906 (.57)    .015   .013   .022
50    374    .797 (.10)     .625 (.10)     .743 (.05)     2.170 (.44)    .018   .017   .063
51    375    1.540 (.01)    .986 (.10)     1.345 (.72)    1.660 (.01)    .012   .007   .014
52    379-   1.157 (.99)    .683 (.10)     1.005 (.442)   1.730 (.01)    .000   .000   .000
53    348+   .867 (.61)     .741 (.10)     .844 (.77)     1.801 (.50)    .034   .036   .069
54    372    .809 (.35)     .689 (.10)     .782 (.53)     1.567 (.17)    .015   .015   .059
55    38-    .681 (.50)     .696 (.93)     .692 (.97)     1.859 (.74)    .056   .051   .282
56    99     .931 (.74)     .869 (.32)     .911 (.25)     1.839 (.19)    .049   .047   .185

Figure 2.1: An Example of Firm Citation Networks
Figure 2.2: Mu-In and Log(Random Friends/Network-Based Friends)
Figure 2.3: Mu-Out and Log(Random Friends/Network-Based Friends)
Figure 2.4: Mu-Total and Log(Random Friends/Network-Based Friends)
Figure 2.5: CC-TT and Log(Random Friends/Network-Based Friends)
Figure 2.6: CC and Log(Random Friends/Network-Based Friends)
Figure 2.7: CC-Avg. and Log(Random Friends/Network-Based Friends)
Figure 2.8: Mu-In of Real Networks and Mu-In of Simulated Networks
Figure 2.9: Mu-Out of Real Networks and Mu-Out of Simulated Networks
Figure 2.10: Mu-Total of Real Networks and Mu-Total of Simulated Networks
Figure 2.11: CC-TT of Real Networks and CC-TT of Simulated Networks
Figure 2.12: CC of Real Networks and CC of Simulated Networks
Figure 2.13: CC-Avg. of Real Networks and CC-Avg. of Simulated Networks
Figure 2.14: CC of Real Networks and CC of Simulated Networks
Figure 2.15: CC-TT and Mu-Out in Real Networks
Figure 2.16: CC-TT and Mu-Out in Simulated Networks

Chapter 3

Information Heterogeneity by Firm Size and Aggregate Fluctuations

3.1 Introduction

This paper is motivated by two concurrent facts: first, information heterogeneity among firms increases after the 1980s; second, aggregate output becomes less volatile and more persistent after the 1980s. The length of each full business cycle also increases from 58.2 months during 1960-1980 to 109 months after the mid-1980s, as announced by the National Bureau of Economic Research (NBER) Official Business Cycle Dating Committee. When I measure the information updating speed by the patent citation time lag in the NBER Patent Data, I find that smaller firms cite older existing patents than larger firms. It is rational for smaller firms to update information more slowly than larger firms due to their limited processing capacity.
In a model with size-dependent reaction time lag and Pareto firm size distribution, the gradual cross-firm diffusion of a micro-level technology shock generates a persistent and hump-shaped aggregate output growth rate. Greater information heterogeneity across firms de-synchronizes the co-movement among firms of different sizes, and hence causes less volatile, more persistent and longer aggregate business cycles. Among the literature about information heterogeneity, the model with size-dependent reaction time lag to shocks is well suited to explain several facts about the business cycle's timing relation. For example, productivity dispersion is pro-cyclical, the top firms' growth rate predicts future GDP growth, investment leads hiring over the business cycle, and labor-share is counter-cyclical.

Patent citation data show that large firms acquire information faster than smaller firms due to their ability to access better and broader knowledge sources, obtain more updated information from the same knowledge source, and locate closer to R&D centers. From the 1980s to 1990s, information heterogeneity in terms of citation time lag widens in two ways: firms obtain knowledge from a more and more exclusive clique, and the citation time lag gap between large and small firms broadens. Meanwhile, the dynamic formation process of firm networks also changes; firms build new connections more often through the introduction of existing links, and are less likely to connect with a random unconnected firm. All these changes block knowledge from diffusing evenly across all firms and lead to greater information heterogeneity. Changes in intellectual property protection towards more strengthened protection may indirectly cause slower information diffusion across firms and greater information heterogeneity.
Combining the size-dependent reaction time lag with a Melitz (57) type model, I derive an analytic solution to illustrate information heterogeneity's impact on aggregate fluctuation. I assume that a z-sized firm's reaction time lag follows a Poisson distribution with mean $\lambda(z) = c - d\log(z)$. The Poisson lag distribution works well with the Pareto firm size distribution, so there is an explicit solution. Without an aggregate-level Total Factor Productivity shock, I show that the gradual diffusion of a firm-level productivity shock [1] can generate a hump-shaped and persistent aggregate growth rate.

The Pareto firm size distribution is also important for generating an evenly hump-shaped impulse response. The Pareto p.d.f. $f(x) = \mu x^{-\mu-1} x_{min}^{\mu}$ [2] is special because, for $\mu \approx 1$, the product of a given firm size $x$ and the mass of firms of about that size is roughly constant over $x$: the firms in a logarithmic size interval around $x$ have mass proportional to $x f(x)$, and hence total size proportional to $x \cdot x f(x) = \mu x_{min}^{\mu} x^{1-\mu}$. Since the mean reaction lag is linear in log size, in each period after the shock a similar share of total production is affected by the shock. Other distributions, for example the log-normal distribution, with smaller density at the lower end of the firm size distribution, still generate a hump-shaped impulse response function, but the response function plunges shortly after the peak, instead of persistently and slowly decreasing to zero.

[1] Firm productivity shock comes from the shocks to firm-level innovation and the production process. For example, a successful process innovation is a positive shock, while a strike is a negative shock.
[2] $x_{min}$ is the minimum size of a firm. For example, the smallest firm contains one worker, the owner himself or herself.

Compared to previous literature about information heterogeneity, the model featuring cross-firm differences in reaction time lag is well designed to explain several timing relations relevant to the business cycle.
For example, Comin and Mulani (16), Comin and Gertler (15), and Davis, Haltiwanger, Jarmin, and Miranda (19) show that cross-section firm productivity dispersions are pro-cyclical over business cycles. When a positive technology shock hits the economy, large firms upgrade their technology more quickly than small firms; the productivity gap between large and small firms widens temporarily until small firms also learn the improved technology at a later date. In contrast, when the economy is hit by a negative technology shock, large firms contract earlier than small firms, and therefore the productivity gap between them shrinks temporarily until the small firms are also adversely affected. Gabaix (28) shows that the top 100 firms' growth rate predicts the future GDP growth rate. In this model, large firms' growth rate leads GDP simply because these firms react to shocks faster than the rest of the economy. For example, when a new technology emerges, large firms adopt it earlier than small firms. As a result, the aggregate growth rate reacts to a technology shock with a longer delay than the top firms' growth rate does.

When I vary the degree of information heterogeneity d, I show that greater information heterogeneity reduces aggregate volatility by desynchronizing the co-movement among firms of different sizes. The length of one full business cycle also becomes longer: the cycle ends when the smallest firms finally adopt the new technology, and this takes longer when information diffuses more slowly (d is larger). In addition, top firms' growth rates lead GDP for more periods and pro-cyclical productivity dispersion becomes more significant in the simulations with larger information heterogeneity d.

In future work, I will improve this model in the following ways. First, firms in the model are pure competitors, without complementarities through an input-output matrix or through specialization and cooperation.
The simple competition relation limits the number of shock propagation channels in the model. Second, the citation time lag is measured in years, which is too long to study monthly or quarterly fluctuations. I have to assume that the relation between reaction time lag and firm size still holds at shorter time horizons. In addition, there is no capital in the model, and it therefore cannot capture capital-related stylized facts about the business cycle, such as the lead and lag relation between investment and hours, or the counter-cyclical labor share in total value added.

3.2 Literature

There are two streams of literature related to this topic. One stream concerns information 'stickiness' and its macroeconomic implications; see, for example, Carroll (12), Mankiw and Reis (56), Sims (65), Woodford (70), Moscarini (58), Ball, Mankiw, and Reis (6), Reis (61), Luo (52), and Abel, C., and Panageas (1). They point out that an individual is too small to gather and analyze all available information. Information stickiness therefore exists, because agents choose to be inattentive for some interval. Carroll (12) also shows that agents from different demographic groups update information at different intervals and form heterogeneous expectations about the state of the economy. The other papers, however, assume that all agents are ex ante identical in information processing capability and that the ex post information heterogeneity is completely due to the randomness of private signals. In contrast, information heterogeneity across firms in this paper originates from firms' differing abilities to absorb and gain from updated knowledge, conditional on firm size.

The other literature stream uses idiosyncratic firm- or sector-level fluctuations to explain aggregate shocks; see, for example, Jovanovic (42), Durlauf (24), Bak, Chen, Scheinkman, and Woodford (5), Nirei (60), Gabaix (27), Long and Plosser (50), Dupor (23), Horvath (36), Horvath (37) and Conley and Dupor (17).
The structure of the input-output matrix (Long and Plosser (50), Horvath (36) and Dupor (23)), non-linear interaction between firms (Jovanovic (42)), non-convex cost technology (Bak, Chen, Scheinkman, and Woodford (5)), and the type of technology shock (Comin and Mulani (16)) all matter when studying the impact of independent micro-level shocks on aggregate fluctuations. Gabaix (28) shows that a Pareto firm size distribution breaks the law of large numbers when the macro shock comes from the aggregation of individual firm-level shocks. The granular nature of the economy implies that idiosyncratic shocks to a number of top firms are sufficient to explain part of the aggregate-level fluctuation.

In this model, the shock propagation channel is the firm's size-dependent reaction time lag. An exogenous firm-level technology shock diffuses gradually across firm networks. Large firms update information more often than smaller firms, because their processing cost is lower or their gain from newer knowledge is greater. The Pareto firm size distribution helps generate a persistent and even hump-shaped impulse response. Furthermore, the special ability of this model is to explain the timing relations during the business cycle.

3.3 Information Heterogeneity In Patent Citation Data

Information heterogeneity is defined here by the speed of knowledge acquisition and processing. Specifically, in patent citation data the speed is represented by the citation time lag, which is equal to the application year of the citing patent minus the application year of the cited patent. Smaller firms have a disadvantage in acquiring and absorbing information compared to larger firms, in that they cite older patents than those cited by larger firms.

The information heterogeneity observed in the NBER Patent Citation Data provides empirical evidence relevant to the literature on information heterogeneity and rational inattention.
Slower learning can be a smaller firm's optimal choice, because the cost of obtaining the most up-to-date information outweighs the benefit, given the limited capacity of small firms to process information. Smaller firms may also benefit less from fresh news than large firms do. Since information processing cost is more or less a fixed cost, large firms can cover the fixed cost by applying the information to a wider scope of products.

In detail, information heterogeneity (IH) among U.S. firms encompasses three aspects: the chances to learn from better sources, the speed at which it is possible to learn from the same source, and the geographic distance to research centers.

First, firms choose to learn from peers similar to themselves. Firms make more citations to peers in their own sector [3]. Larger firms tend to cite larger firms [4]. Better connected firms tend to cite better connected firms [5]. These facts echo the assortative mixing exhibited among other types of social networks documented in Newman (59). This pattern arises if the cost of absorbing external knowledge increases with the gap between the learner and the source. As a result, on average, larger firms learn from better knowledge sources than smaller firms.

Second, among the citing firms that cite patents owned by the same cited firm, larger citing firms tend to cite newer patents than smaller citing firms. This supports the idea that larger firms are faster than smaller firms in acquiring and adopting outside knowledge, given that they are able to reach the same knowledge source.

Third, firms located more closely together learn from each other more quickly. Firms within the same city and the same state cite each other faster than they cite other firms. Unfortunately, smaller firms are less likely than large firms to locate in the top R&D centers.
In summary, larger firms have an advantage over smaller firms, in that they can reach larger knowledge sources, acquire more current knowledge from the same source, and locate closer to research centers.

The information heterogeneity examined among firms in this paper differs from the information stickiness in previous literature. Here, different agents update information at different frequencies, according to their processing capacity. This is similar to the Carroll (12) model, where households from different demographic groups form heterogeneous inflation expectations due to their different frequencies of updating information.

[3] 67% of citations made between U.S. organizations are within the same sector.
[4] The correlation between the log-scale patent numbers of the citing and the cited firm is 0.25 when both firms belong to the same sector.
[5] The correlation between the log-scale hub weights of the citing and the cited firm is 0.38; the correlation between the log-scale authority weights of the citing and the cited firm is 0.21.

3.3.1 Greater Information Heterogeneity after 1980s

The information heterogeneity among U.S. firms became stronger during the 1980s. The citation lag differences due to firm size, distance and sector border enlarge over time, due to the change of network structure [6]. Since the citation lag is a nonnegative integer, which counts the number of consecutive periods without a successful search for information, I use a Cox Proportional Hazards model to estimate the likelihood of a successful search per period. Observations are grouped by patent class. The specification of the Cox regression is as follows:

$$\log(HR_t) = b_{0,t} + b_{1,t}\ln(\text{patent no}_{\text{citing},t}) + b_{2,t}\,\text{samecity} + b_{3,t}\,\text{samestate} + b_{4,t}\,\text{cross} + c_{\text{patent class}}$$

$HR_t$ is the hazard rate of a successful search per period. $\ln(\text{patent no}_{\text{citing},t})$ and $\ln(\text{patent no}_{\text{cited},t})$ are the log numbers of patents owned by the citing firm and the cited firm at year t, respectively. $\ln(\text{dist})$ is the log-scale great circle distance in kilometers between the inventors of the citing patent and the cited patent. samecity and samestate are dummy variables indicating whether the citing firm and the cited firm are located in the same city and the same state. cross is a dummy variable indicating whether the citing patent and the cited patent belong to different sectors. $c_{\text{patent class}}$ is the sector or patent class fixed effect. The citations are grouped by citing year, and there is one such regression per year.

[6] Since one firm may cite a patent multiple times, I include only the citations in which the citing firm cites a given patent for the first time. The first-time citation indicates the earliest time that the citing firm learns from the cited patent. First-time citations also mark new knowledge flows between firms.

Among the firms that applied for patents in the same patent class, the hazard rate is larger (the citation time lag is shorter) if the citing firm is larger, if the citing firm is located closer to the cited firm, and if the two patents belong to the same patent class. The regression results are reported in Table 3.1.

More importantly, the hazard rate or citation lag differences due to size and location grew over time during the 1980s and 1990s. Suppose firm A is 10 times larger than firm B, and both are located in the same city and apply for patents in the same industry. In 1985, firm A is 4.04% more likely than firm B to find the right knowledge per period, while in 2000, firm A's hazard rate of a successful search is 12.14% higher than firm B's. Despite improving telecommunication technology, the obstacle of distance became stronger. In 1985, locating in the same city and the same state as the cited firm increased the hazard rate by 3.0% and 2.1%, respectively, while in 2000, the differences enlarged to 14.9% and 13.8%. The hazard rate difference due to sector border increased slightly, but is stable around 10%. Large firms learn much faster than smaller firms.
Knowledge takes longer to spread to faraway places.

From another perspective, the dynamic formation process of firm networks during the same period also demonstrates that information is shared less evenly across firms. Each firm shares more information with closely connected peers and less with random, disconnected firms. In detail, in every period a firm may cite other firms that it did not cite the previous year. I divide these new connections into two types following Jackson and Rogers (38): friend's friend (FF) and random friend (RF). The FF/RF ratio rises steadily from 0.6 in 1984 to 0.77 in 1995. That is, when searching for a new knowledge source, a firm relies more and more on existing connections; the chance of randomly building a link with an unknown firm is declining [7]. This trend means that firms are more constrained when looking for new knowledge sources. As a result, information is shared within a small, already well connected neighborhood, and communication with random outsiders declines. Information heterogeneity across the entire network becomes greater over time.

[7] The cause of this phenomenon could be that firms become more specialized, so they only find information from closely related firms useful. It could also happen if the communication technology has changed in such a way that it strengthens the tie between already connected firms, but does not help to build trust between two unknown firms.

Overall, all citation lags became longer relative to the age of patent stocks after the 1980s. For instance, in Figure 3.1, if I use the ratio between citation share and patent stock share to measure the inward citation intensity for patents of different ages, in the 1980s firms cited the most recent patents over-proportionally, whereas in the 1990s firms cited the most recent patents under-proportionally [8].

[8] The citations used in this figure include only first-time cross-firm citations.

3.4 The Model

This model is based on an autarky version of Melitz (57).

3.4.1 Consumer

The representative consumer faces the following problem:

$$U = \max_{\{x_{i,t}\}} \int_0^{\infty} \rho^{t}\,[\log(Y_t)]\,dt \tag{3.1}$$

$$Y_t = \left(\int_0^1 x_{i,t}^{\frac{\sigma-1}{\sigma}}\,di\right)^{\frac{\sigma}{\sigma-1}}.$$

$\rho$ is the time preference of the representative consumer; $Y_t$ is the consumption of final goods; $x_{i,t}$ is the consumption of intermediate good i; product i is sold at price $p_{i,t}$. $P_t$ is the aggregate price index. The mass of intermediate goods firms is 1. $\sigma > 1$ is the elasticity of substitution between intermediate goods.

Consumer demand for intermediate goods is

$$x_{i,t} = Y_t\left(\frac{P_t}{p_{i,t}}\right)^{\sigma}, \qquad P_t = \left(\int_0^{I_t} p_{i,t}^{1-\sigma}\,di\right)^{\frac{1}{1-\sigma}}.$$

3.4.2 Heterogeneous Firms

In a monopolistically competitive market, there are numerous intermediate goods firms of various sizes due to heterogeneous labor productivity. The mass of firms is 1. Productivity follows a Pareto distribution

$$F(A_{i,t}) = 1 - \left(\frac{A_{i,t}}{A_{min}}\right)^{-\mu}.$$

$A_{min}$ is the smallest productivity able to survive given the fixed operation cost. $\mu$ is the shape parameter, with a larger $\mu$ meaning that the firm size distribution is more homogeneous. A firm's productivity is constant over time unless it is hit by technology shocks. Firms can flexibly adjust their prices according to their productivity at any time.

Production uses only labor. Firm i's production function is $x_{i,t} = A_{i,t}L_{i,t}$. The wage rate is normalized to 1. Firm i's optimal price is

$$p_{i,t} = \frac{\sigma}{(\sigma-1)A_{i,t}}.$$

The aggregate price level is

$$P_t = \tilde{p}_t = \frac{\sigma}{(\sigma-1)\tilde{A}_t},$$

which is equal to the optimal price of a firm with productivity

$$\tilde{A}_t = \left(\int_{A_{min}}^{\infty} A_{i,t}^{\sigma-1}\,dF(A_{i,t})\right)^{\frac{1}{\sigma-1}} = \left(\frac{\mu}{\mu-\sigma+1}\right)^{\frac{1}{\sigma-1}} A_{min},$$

with $\mu - \sigma + 1 > 0$ to ensure that $\tilde{A}_t$ is finite. $\tilde{A}_t$ represents the quantity-weighted average firm productivity. Firm i's production becomes

$$x_{i,t} = Y_t\left(\frac{A_{i,t}}{\tilde{A}_t}\right)^{\sigma}.$$

Firm size heterogeneity originates from productivity heterogeneity.

In addition to productivity dispersion, firms are also heterogeneous in their network connectivity. Cross-firm patent citations link all firms as a network.
The number of other firms with which a firm is directly connected is called the degree of that firm. The degree distribution across firms follows a Power-law distribution, which is the discrete version of the Pareto distribution. A firm that owns more patents also has more linkages in the network.

Size Dependent Reaction Time Lag

Information dispersion associated with firm size heterogeneity is embodied in the model by a reaction time lag that declines with firm size. Specifically, the time lag that firm f with productivity $A_{f,t}$ requires to react to a piece of news is a random draw from a Poisson distribution with parameter

$$\lambda_t(A_{f,t}) = c_t - d_t\ln(A_{f,t}).$$

A larger $d_t$ means that larger firms react to news much faster than smaller firms, hence greater information heterogeneity in the economy. Setting $c_t = c_0 + d_t\ln(A_{max})$ [9] ensures that even the largest firm, with productivity $A_{max}$, takes $c_0$ periods on average to gather information. If $A_{min} = 1$, the smallest firm expects to hear the news after $c_t$ periods. If every firm processes information at the same speed ($d_t = 0$), $c_t$ is the reciprocal of the information updating frequency, or $1/\mu$ in Reis (61). The Poisson distribution of the reaction time lag works well with the Pareto distribution of firm size, in the sense that there is an analytical solution for total production k periods after the shock.

[9] According to Newman (59), the length (in number of edges) of the longest geodesic path between any two nodes in a network increases as fast as $\ln(N)$, where N is the number of nodes in the network. When the largest firm's productivity $A_{max}$ is proportional to the number of firms N or to population size, the assumption that $c_t \propto \ln(A_{max})$ also means that the longest reaction time in the network satisfies $\lambda(A_{min}) = c_t \propto \ln(N)$.

The origin of a positive technology shock is an individual firm's innovation and imitation activity. An example is a successful new innovation of a general purpose technology. The inventor of this new technology is the first firm to apply it.
All the firms that are directly linked to the inventor firm apply it in the first wave; those that are indirectly connected to the inventor firm apply it with longer lags, depending on their distances from the inventor. If one measures each patent's impact by the number of citations it receives, Hall, Jaffe, and Trajtenberg (31) show that the distribution of each technology shock's impact is as skewed as a Pareto distribution. That means only very few new technologies are able to attract attention from all peers and eventually induce macro-level fluctuations.

A negative productivity shock could happen when some new distraction decreases employees' productivity at work; examples are instant messaging software and social network web sites. It could also happen because of a temporary infrastructure dysfunction, a strike, or an infectious disease. A larger firm is also more likely to receive a negative productivity shock earlier than a small firm, because it hires more workers, each of whom is equally likely to introduce a new distraction or infectious disease to the company; a large firm also generates more tasks that demand public service and high quality infrastructure. Gabaix (28) also uses a strike or the threat of a potential strike as a firm-specific negative productivity shock.

Suppose a 1% positive technology shock hits the economy at time t. Denote the share of firms with productivity A that are just informed at time t+k as

$$JI(k,A) \equiv \frac{\lambda_t(A)^{k}\,e^{-\lambda_t(A)}}{k!} = \frac{[c_t - d_t\ln(A)]^{k}\,A^{d_t}}{k!\,e^{c_t}}.$$

By definition, $JI(k,A)$ is the probability mass function of the Poisson distribution with $\lambda(A) = c_t - d_t\ln(A)$. At time t+k, the share of just-informed firms in the economy is

$$JI(k) = \int_{A_{min}}^{\infty} JI(k,A)\,dF(A) = \int_{A_{min}}^{\infty} \frac{(c_t - d_t\ln(A))^{k}\,A_{min}^{\mu}\,\mu\,A^{d_t-\mu-1}}{k!\,e^{c_t}}\,dA = \frac{\mu}{\mu-d_t}\sum_{g=0}^{k}\left(-\frac{d_t}{\mu-d_t}\right)^{k-g} JI(g,A_{min}).$$

At time t+k, the share of all informed firms is

$$I(k) = \sum_{g=1}^{k} JI(g).$$

$JI(k,A)$ and $I(k)$ summarize the aggregate status of information diffusion among firms.
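
The closed form for the share of just-informed firms can be checked numerically. The following is a minimal sketch, not the procedure used in the thesis: the parameter values (c = 5, d = 0.3, mu = 1.1, A_min = 1) and all function names are illustrative assumptions. It compares the finite sum with a direct Simpson's-rule integration of JI(k, A) against the Pareto density, after the change of variable u = ln A.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability mass at k for a mean lam > 0 (log form avoids overflow)."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def just_informed_closed_form(k, c, d, mu, a_min=1.0):
    """Closed form: mu/(mu-d) * sum_{g=0..k} (-d/(mu-d))^(k-g) * JI(g, a_min)."""
    lam_min = c - d * math.log(a_min)   # mean lag of the smallest firm
    ratio = -d / (mu - d)
    total = 0.0
    for g in range(k + 1):
        total += ratio ** (k - g) * poisson_pmf(g, lam_min)
    return mu / (mu - d) * total


def just_informed_numeric(k, c, d, mu, a_min=1.0, width=60.0, steps=20000):
    """Integrate JI(k, A) dF(A) over u = ln A with Simpson's rule.

    The upper limit is truncated at ln(a_min) + width (an assumption); the
    integrand decays like exp(-(mu - d) * u), so the omitted tail is negligible.
    """
    u0 = math.log(a_min)
    h = width / steps                    # steps must be even for Simpson's rule

    def integrand(u):
        # JI(k, e^u) * f(e^u) * e^u, simplified analytically
        poly = (c - d * u) ** k / math.factorial(k)
        return mu * poly * math.exp(-c + mu * u0 - (mu - d) * u)

    total = integrand(u0) + integrand(u0 + width)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(u0 + i * h)
    return total * h / 3.0


if __name__ == "__main__":
    for k in range(6):
        exact = just_informed_closed_form(k, c=5.0, d=0.3, mu=1.1)
        approx = just_informed_numeric(k, c=5.0, d=0.3, mu=1.1)
        print(k, exact, approx)
```

One caveat of this sketch: the integral runs over all A above A_min, including sizes for which c - d ln(A) is negative, so JI(k, A) is a proper Poisson probability only below A = exp(c/d). Tying c to the largest firm's productivity, as in the text, keeps the mean lag positive over the relevant range.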
When $d_t = 0$, $JI(k) = JI(k,A_{\min})$, which is the probability mass function of a Poisson distribution with $\lambda = c_t$: each firm has the same chance of learning the news at time $t+k$ as the smallest firm. In general, $JI(k,A_m)$ is the probability mass function of a Poisson distribution with parameter $\lambda = c_t - d_t \ln(A_{\min})$; for example, when $A_m = 1$, $\lambda = c_t$. Very slow information transmission within the firm network (large $\lambda$) means that $JI(k,A_{\min})$ and $I(k)$ are smooth and symmetric over $k$, because in every period only a tiny group of firms is informed.

3.4.3 Aggregate Fluctuation under Information Heterogeneity

In this subsection, I vary the degree of information heterogeneity $d$ to illustrate its impact on several aspects of aggregate fluctuation. First, a larger $d$ causes a more persistent, less volatile and longer hump-shaped impulse response of total output. Second, a larger $d$ induces a situation in which the top 100 firms' growth rate leads the GDP growth rate by more periods. Third, stronger information heterogeneity produces more strongly pro-cyclical cross-sectional dispersion of firm productivity.

Hump-Shaped Impulse Response

Firms adjust their labor input and productivity according to the information they possess; therefore the shapes of $JI(k,A)$ and $I(k)$ govern the cyclical pattern of aggregate production and productivity dispersion over a cycle. Suppose the news is a positive technology shock: informed firms can produce at $1+\Delta$ times their original productivity, while uninformed firms still produce at their original productivity. Firms informed early enjoy excess profit until all firms learn the news and every firm's profit returns to the normal level justified by its initial relative productivity. The benefit of learning of a positive productivity shock earlier is twofold: a firm enjoys excess profit for a longer duration, and the excess profit is much higher in earlier periods, because average productivity is still low while few firms are aware of the better technology.
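The mechanism just described, in which informed firms produce at $1+\Delta$ times their original productivity and larger firms tend to be informed sooner, can be sketched numerically. This is a hedged illustration rather than the thesis's own computation. It assumes that output is proportional to the CES productivity aggregate $\left[\int A^{\sigma-1} dF\right]^{1/(\sigma-1)}$, that the Pareto distribution is truncated at $A_{\max}$ so that $\lambda(A) \ge c_0$, that a firm counts as informed at $t+k$ once its Poisson lag is at most $k$ (so firms with lag zero respond on impact), and that $c_0 = 1$. All function names are hypothetical.

```python
import math


def poisson_cdf(k, lam):
    """P(lag <= k) for a Poisson reaction lag with mean lam >= 0."""
    term = math.exp(-lam)
    total = term
    for g in range(1, k + 1):
        term *= lam / g
        total += term
    return min(total, 1.0)


def output_path(d, c0=1.0, ln_a_max=18.0, mu=1.06, sigma=1.05,
                delta=0.01, horizon=30, grid=600):
    """Return (baseline, [Y_0, ..., Y_horizon]) up to a common constant factor.

    Firms are integrated over x = ln(A) on [0, ln_a_max] (A_min = 1), and a
    firm counts as informed at t+k if its Poisson lag is at most k.
    """
    h = ln_a_max / grid
    c = c0 + d * ln_a_max                      # c_t = c_0 + d_t * ln(A_max)
    gain = (1.0 + delta) ** (sigma - 1.0) - 1.0
    xs = [(i + 0.5) * h for i in range(grid)]
    # A^(sigma-1) dF(A) in logs, dropping constants common to every period
    weights = [math.exp((sigma - 1.0 - mu) * x) * h for x in xs]
    power = 1.0 / (sigma - 1.0)
    baseline = sum(weights) ** power           # output before the shock
    path = []
    for k in range(horizon + 1):
        agg = sum(w * (1.0 + gain * poisson_cdf(k, c - d * x))
                  for w, x in zip(weights, xs))
        path.append(agg ** power)
    return baseline, path


def growth_rates(path):
    """Period-on-period growth g_{t+k} for k = 1..horizon."""
    return [path[k] / path[k - 1] - 1.0 for k in range(1, len(path))]


if __name__ == "__main__":
    for d in (0.0, 0.2):
        _, path = output_path(d)
        g = growth_rates(path)
        peak = max(range(len(g)), key=g.__getitem__) + 1
        print(f"d = {d}: growth peaks {peak} periods after the shock")
```

Because constants common to all periods are dropped, only ratios of the returned values are meaningful: the long-run level relative to the baseline is $1+\Delta$, and the growth path shows how the timing of the adjustment shifts as $d$ changes.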
Aggregate production at $t+k$ depends on how many firms are informed and how large these firms are:

$$Y_{t+k} = \frac{1}{P_{t+k}} = \frac{(\sigma-1)\tilde{A}_{t+k}}{\sigma} = \frac{\sigma-1}{\sigma} \left[ \frac{(1+\Delta)^{\sigma-1}-1}{\mu-\sigma+1-d_t} \sum_{g=0}^{k} \sum_{j=0}^{g} \left(-\frac{d_t}{\mu-\sigma+1-d_t}\right)^{k-g} JI(g,A_{\min}) + \frac{\mu}{\mu-\sigma+1} \right]^{\frac{1}{\sigma-1}} A_m.$$

$Y_{t+k}$ and its growth rate $g_{t+k}$ have curvature similar to that of the mass of informed firms $I(k)$ and the mass of just-informed firms $JI(k)$, respectively; see Figure 3.2. With information heterogeneity ($d_t > 0$), $g_{t+k}$ exhibits a hump-shaped curve over $k$. Growth at the beginning comes mainly from the just-informed top firms; each of them is a giant, but their mass is small under a Pareto distribution. At later stages, $JI(k)$ first rises and then drops, while the average size of just-informed firms always declines. The peak of $g_{t+k}$ comes when the product of the mass of just-informed firms $JI(k)$ and their average size reaches its maximum. In contrast, when $d_t = 0$ there is no hump-shaped impulse response of the growth rate: because every firm faces the same time lag distribution, the average size of just-informed firms is constant over time, and the growth rate simply falls as $JI(k)$ declines over time.

Here the slow propagation of a positive technology shock is entirely due to the slow transmission of information across firms. Greater information heterogeneity (larger $d_t$) de-synchronizes the co-movement between firms of different sizes. As a result, the peak of the $\{g_{t+k}\}$ curve comes later and is lower when $d_t$ is larger.

The variance of $\{g_{t+k}\}$ over $k = 0$ to $30$ also decreases with information heterogeneity $d_t$, even though the size of the technology shock is the same. This model may provide an additional explanation for the more than halved aggregate volatility, or Great Moderation, since the 1980s. Policy changes towards stronger intellectual property protection since the 1980s have induced slower knowledge diffusion across firms and the formation of highly clustered firm networks, which cause greater information heterogeneity among firms.
In consequence, macroeconomic fluctuations become less volatile when firms move at de-synchronized paces.

Lead and Lag between Top Firms' Growth and GDP Growth

Gabaix (27) shows that the growth rate of the top 100 non-oil and non-energy public U.S. firms predicts the GDP growth rate. He lists several potential explanations: autocorrelation at the firm level; imitation dynamics, where a successful technology is imitated by other firms; time aggregation; and the propagation of shocks along supply and demand chains, as in Long and Plosser (50). This model proposes another explanation: the propagation of shocks through firm networks. Top firms learn of the shock earlier than other firms, and therefore their reaction leads the rest of the economy. This is similar to imitation dynamics and to the propagation of shocks along supply and demand chains, both of which emphasize the difference in reaction time across firms.

Analytically, the output of top firms whose productivity is higher than $A_x$, with $A_x \gg A_{\min}$, can be expressed as follows:

$$\begin{aligned}
Y^{A>A_x}_{t+k} &= \left(\frac{\tilde{A}_{t+k}\mid A_{t+k}>A_x}{\tilde{A}_{t+k}}\right)^{\sigma} Y_{t+k} = \frac{\sigma-1}{\sigma}\,\frac{\left[\int_{A_x}^{\infty} A_{i,t+k}^{\sigma-1}\,dF(A_{i,t})\right]^{\frac{\sigma}{\sigma-1}}}{\int_{A_{\min}}^{\infty} A_{i,t+k}^{\sigma-1}\,dF(A_{i,t})} \\
&= \frac{\left[\frac{(1+\Delta)^{\sigma-1}-1}{\mu-\sigma+1-d_t}\sum_{g=0}^{k}\sum_{j=0}^{g}\left(-\frac{d_t}{\mu-\sigma+1-d_t}\right)^{k-g} JI(g,A_x)+\frac{\mu}{\mu-\sigma+1}\right]^{\frac{\sigma}{\sigma-1}}}{\frac{(1+\Delta)^{\sigma-1}-1}{\mu-\sigma+1-d_t}\sum_{g=0}^{k}\sum_{j=0}^{g}\left(-\frac{d_t}{\mu-\sigma+1-d_t}\right)^{k-g} JI(g,A_{\min})+\frac{\mu}{\mu-\sigma+1}}\cdot\frac{(\sigma-1)A_x^{\sigma}}{\sigma A_{\min}^{\sigma-1}}.
\end{aligned}$$

Top firms' growth leads GDP growth, intuitively because top firms react to a productivity shock faster than other firms, and mathematically because the peak of $JI(g,A_x)$ leads that of $JI(g,A_{\min})$. In Figure 3.3, I set $\ln(A_x) = 18$, $\ln(A_m) = 0$, $\mu = 1.06$ (see footnote 10), $\sigma = 1.05$, $\Delta = 1\%$ and $d = 0, 0.1, 0.2, 0.4$ in four scenarios. At the first few positive lags, the cross-correlation between $g^{A>A_x}_{t+k}$ and $g_{t+k}$ is even negative when information heterogeneity $d_t$ is large enough, because top firms squeeze other firms' market share when they update technology so much earlier than others.
Not only does the positive cross-correlation come later for larger $d_t$, but the peak cross-correlation value is also lower. In contrast, when $d_t = 0$, top firms' growth and the growth of the rest of the economy coincide completely, and the cross-correlation between them is simply the autocorrelation of the GDP growth rate.

Footnote 10: Axtell (4).

Pro-cyclical Productivity Dispersion

Comin and Mulani (16), Comin and Gertler (15), and Davis, Haltiwanger, Jarmin, and Miranda (19) show that cross-sectional firm productivity dispersion is pro-cyclical over business cycles; see Figure 3.4 and Figure 3.5 (footnote 11).

Footnote 11: Figures 7 and 8 of Davis, Haltiwanger, Jarmin, and Miranda (19).

In this model, pro-cyclical productivity dispersion is related to large firms' advantage in updating information. Since large firms are more likely to improve their productivity earlier than smaller firms, the productivity gap between large and small firms widens temporarily until smaller firms learn the news at a later date. The new complementary cumulative distribution function of productivity becomes:

$$\Pr\left(A_{f,t+k} > x\right) = \Pr\left(A_{f,t} > x\right) + \int_{x/(1+\Delta_x)}^{x} I(k,A)\,dF(A) \approx \left(\frac{x}{A_m}\right)^{-\mu}\left[1 + \frac{x\,\Delta_x}{1+\Delta_x}\, I(k,x)\right].$$

It is difficult to derive an analytic solution for the commonly used measure of productivity heterogeneity, the standard deviation of $\ln(A_{f,t+k})$. Instead, I illustrate productivity dispersion in a simulation with 100,000 firms. In Figure 3.6, with information heterogeneity due to firm size ($d_t > 0$), the impulse response of productivity dispersion is also hump-shaped. The summit of the hump comes later as $d_t$ rises, because the productivity gap between early-informed large firms and later-informed small firms lingers longer. The summit is also higher as $d_t$ rises, because the early-informed firms are more often large firms, and therefore the productivity distribution is skewed further to the right.
In contrast, when $d_t = 0$, firm productivity dispersion is flat over time, because every firm has an equal chance to update its productivity at a given time; the shape of the productivity distribution remains as it was before the shock.

Note that productivity dispersion and the GDP growth rate have similar patterns and timing for a given $d_t$. This echoes the empirical evidence that cross-sectional firm-level productivity dispersion and aggregate growth co-move at almost the same pace.

3.5 Conclusion and Future Research

This paper provides a potential explanation of the Great Moderation of aggregate volatility. Greater information heterogeneity across firms de-synchronizes the co-movement across firms of different sizes, and hence reduces aggregate output volatility. The model with a size-dependent reaction time lag is well suited to explaining the timing relations of the business cycle; for example, pro-cyclical productivity dispersion and the lead-lag relation between top firms' growth rates and the GDP growth rate.

Using patent citation data, I provide empirical evidence for information heterogeneity across firms, measured by the patent citation time lag. Larger firms are faster to obtain and process knowledge than smaller firms. In detail, larger firms are able to cite patents owned by larger and better connected firms; among the citing firms that cite patents owned by the same cited firm, larger firms cite more recent patents than smaller firms; and larger firms are more likely to locate closer to R&D centers. These facts support the literature on rational inattention, in that smaller firms update information less frequently, perhaps simply because of their poor processing capability or insufficient benefit from newer knowledge.

I analytically solve a simple Melitz (57) type model with a Pareto distribution of firm size and a size-dependent reaction time lag to shocks.
The time lag with which a firm reacts to a technology shock follows a Poisson distribution, the mean of which is negatively related to log firm size. The model generates persistent hump-shaped aggregate fluctuations without aggregate TFP shocks. In simulations with various degrees of information heterogeneity, greater information heterogeneity causes a less volatile GDP growth rate, a longer lead of top firms' growth rate over GDP growth, and more pronounced pro-cyclical productivity dispersion.

In the future, I would like to incorporate more aspects of firm heterogeneity and explore the model's ability to match other stylized facts related to the business cycle. For example, larger firms use capital more intensively than smaller firms. The capital share difference can be as large as 25% within a three-digit industry (see footnote 12).

With this additional feature, the model would be able to illustrate why investment leads hiring over business cycles and why the labor share is counter-cyclical. To expand production by the same amount, large firms invest more capital, while small firms hire more hours of labor. When large, capital-intensive firms move before small, labor-intensive firms, aggregate investment leads hours over the business cycle. Meanwhile, when large, capital-intensive firms produce a larger share of output during the boom and a smaller share during the recession, then at the aggregate level labor receives a smaller share of value added during the boom and a larger share during the recession. The extension highlights that the model is suitable for capturing more timing relations over the business cycle.

Footnote 12: See Young (71), Ambler and Cardia (3), and Hansen and Prescott (32).
Table 3.1: Cox Proportional Hazards Model, 1985-2000; Group Variable: Patent Class
Dependent Variable: Hazard Rate of Successful Search

Year                             1985       1990       1995       2000
Log(Patent Stock, Citing) (a)    .011***    .018***    .028***    .033***
                                 (.002)     (.002)     (.001)     (.001)
Samecity (b)                     .030*      .054***    .109***    .149
                                 (.016)     (.011)     (.007)     (.005)
Samestate (c)                    .021       .072***    .124***    .139
                                 (.029)     (.022)     (.014)     (.010)
Cross-sector citation (d)        -.092***   -.079***   -.084***   -.100
                                 (.009)     (.007)     (.004)     (.003)
No. of Observations              53272      98024      212902     368603
Wald chi2(4)                     139.51     347.83     1684.72    4683.93

(a) Number of patents owned by the citing firm.
(b) 1 if the citing inventor and the cited inventor live in the same city, 0 otherwise.
(c) 1 if the citing inventor and the cited inventor live in the same state, 0 otherwise.
(d) 1 if the citing patent and the cited patent belong to the same sector, 0 otherwise.

Figure 3.1: The Changing Ratio of Citation Share to Patent Stock Share
Figure 3.2: Information Heterogeneity and Growth Rate IR to Technology Shock
Figure 3.3: Figure 7 in Davis, Haltiwanger, Jarmin, and Miranda (19)
Figure 3.4: Cross-Correlation between Top Firms' Growth and GDP Growth
Figure 3.5: Figure 8 in Davis, Haltiwanger, Jarmin, and Miranda (19)
Figure 3.6: Productivity Dispersion after Technology Shock

Bibliography

[1] A. B. Abel, J. C. Eberly, and S. Panageas. Optimal inattention to the stock market with information costs and transactions costs. NBER Working Paper 15010, National Bureau of Economic Research, Inc, 2009. → pages 86
[2] R. Albuquerque and H. A. Hopenhayn. Optimal lending contracts and firm dynamics. Review of Economic Studies, 71:285–315, 2004. → pages 7
[3] S. Ambler and E. Cardia. The cyclical behaviour of wages and profits under imperfect competition. Canadian Journal of Economics, 31(1):148–164, February 1998. → pages 99
[4] R. L. Axtell. Zipf distribution of U.S. firm sizes. Science, 293:1818–1820, 2001. → pages 6, 97
[5] P. Bak, K. Chen, J. Scheinkman, and M. Woodford.
Aggregate fluctuations from independent sectoral shocks: Self organized criticality in a model of production and inventory dynamics. Ricerche Economiche, 41:3–30, 1993. → pages 86
[6] L. Ball, N. G. Mankiw, and R. Reis. Monetary policy for inattentive economies. Journal of Monetary Economics, 52:703–725, 2005. → pages 86
[7] A. B. Bernard, S. Redding, and P. K. Schott. Multi-product firms and product switching. NBER Working Paper 12293, National Bureau of Economic Research, Inc, 2006. → pages 29
[8] C. Broda and D. E. Weinstein. Product creation and destruction: Evidence and price implications. NBER Working Papers 13041, National Bureau of Economic Research, Inc, 2007. → pages vii, 29, 37
[9] R. J. Caballero and A. B. Jaffe. How high are the giants' shoulders: An empirical assessment of knowledge spillovers and creative destruction in a model of economic growth. NBER Working Papers 4370, National Bureau of Economic Research, Inc, 1993. → pages 20
[10] L. M. B. Cabral and J. Mata. On the evolution of the firm size distribution: Facts and theory. American Economic Review, 93:1075–1090, 2003. → pages 7
[11] J. Cai and N. Li. Knowledge linkages and multi-product firm innovations. Working papers, University of New South Wales, 2010. → pages 6, 34
[12] C. D. Carroll. The epidemiology of macroeconomic expectations. NBER Working Paper 8695, National Bureau of Economic Research, Inc, 2001. → pages 86, 88
[13] A. Clauset, C. R. Shalizi, and M. E. J. Newman. Power-law distributions in empirical data. E-print, page arXiv:0706.1062v2, 2007. → pages 132
[14] G. L. Clementi and H. A. Hopenhayn. A theory of financing constraints and firm dynamics. Quarterly Journal of Economics, 121:229–265, 2006. → pages 7
[15] D. Comin and M. Gertler. Medium term business cycles. American Economic Review, 96:523–551, 2006. → pages 85, 97
[16] D. Comin and S. Mulani. Diverging trends in aggregate and firm volatility. Review of Economics and Statistics, 88:374–383, 2006. → pages 19, 20, 85, 86, 97
[17] T. Conley and B. Dupor. A spatial analysis of sectoral complementarity. Journal of Political Economy, 111:311–352, 2003. → pages 86
[18] T. F. Cooley and V. Quadrini. Financial markets and firm dynamics. American Economic Review, 91:1286–1310, 2001. → pages 7
[19] S. J. Davis, J. Haltiwanger, R. Jarmin, and J. Miranda. Volatility and dispersion in business growth rates: Publicly traded versus privately held firms. Technical report, National Bureau of Economic Research, Inc, 2006. → pages ix, 19, 85, 97, 103, 104
[20] A. K. Dixit and J. E. Stiglitz. Monopolistic competition and optimum product diversity. American Economic Review, 67:297–308, 1977. → pages 10
[21] E. Duguet and M. MacGarvie. How well do patent citations measure flows of technology? Evidence from French innovation surveys. Economics of Innovation and New Technology, 14:375–393, 2005. → pages 11, 23
[22] T. Dunne, M. J. Roberts, and L. Samuelson. The growth and failure of U.S. manufacturing plants. Quarterly Journal of Economics, 104:671–698, 1989. → pages 6
[23] W. Dupor. Aggregation and irrelevance in multi-sector models. Journal of Monetary Economics, 43:391–409, 1999. → pages 86
[24] S. Durlauf. Non-ergodic economic growth. Review of Economic Studies, 60:349–366, 1993. → pages 86
[25] R. Ericson and A. Pakes. Markov perfect industry dynamics: A framework for empirical analysis. Review of Economic Studies, 62:53–82, 1995. → pages 6
[26] D. S. Evans. The relationship between firm growth, size and age: Estimates for 100 manufacturing industries. Journal of Industrial Economics, 35:567–581, 1987. → pages 6
[27] X. Gabaix. Power laws in economics and finance. NBER Working Paper 14299, National Bureau of Economic Research, Inc, 2008. → pages 30, 86, 96
[28] X. Gabaix. The granular origins of aggregate fluctuations. NBER Working Paper, National Bureau of Economic Research, Inc, 2009. → pages 85, 86, 93
[29] C. M. Goldie. Implicit renewal theory and tails of solutions of random equations. The Annals of Applied Probability, 1(1):126–166, 1991. → pages 18
[30] B. H. Hall. The relationship between firm size and firm growth in the US manufacturing sector. Journal of Industrial Economics, 35:583–605, 1987. → pages 6
[31] B. H. Hall, A. B. Jaffe, and M. Trajtenberg. The NBER patent citation data file: Lessons, insights and methodological tools. NBER Working Paper 8498, National Bureau of Economic Research, Inc, 2001. → pages 22, 93
[32] G. D. Hansen and E. C. Prescott. Capacity constraints, asymmetries, and the business cycle. Review of Economic Dynamics, 8(4):850–865, October 2005. → pages 99
[33] E. Helpman, M. J. Melitz, and S. R. Yeaple. Export versus FDI with heterogeneous firms. American Economic Review, 94:300–316, 2004. → pages 6, 29, 111, 113
[34] P. Holme and B. J. Kim. Growing scale-free networks with tunable clustering. Physical Review E, 65:026107, 2002. → pages 57
[35] H. A. Hopenhayn. Entry, exit, and firm dynamics in long run equilibrium. Econometrica, 60:1127–1150, 1992. → pages 6
[36] M. Horvath. Cyclicality and sectoral linkages: Aggregate fluctuations from sectoral shocks. Review of Economic Dynamics, 1:781–808, 1998. → pages 86
[37] M. Horvath. Sectoral shocks and aggregate fluctuations. Journal of Monetary Economics, 45:69–106, 2000. → pages 86
[38] M. Jackson and B. W. Rogers. Meeting strangers and friends of friends: How random are socially generated networks? American Economic Review, 97:890–915, 2007. → pages 55, 57, 72, 90, 126, 127
[39] A. B. Jaffe, M. Trajtenberg, and M. S. Fogarty. Knowledge spillovers and patent citations: Evidence from a survey of inventors. American Economic Review, 90:215–218, 2000. → pages 23
[40] A. H. Jessen and T. Mikosch. Regularly varying functions. Publications de L'institut Mathematique, 94:171–192, 2006. → pages 5, 30
[41] R. Johari, S. Mannor, and J. N. Tsitsiklis. A contract-based model for directed network formation. Games and Economic Behavior, 56:201–224, 2006. → pages 56
[42] B. Jovanovic. Micro shocks and aggregate risk. Quarterly Journal of Economics, 102:395–409, 1987. → pages 86
[43] B. Jovanovic. The diversification of production. Brookings Papers on Economic Activity, Microeconomics, 1993(1):197–247, 1993. → pages 6, 29
[44] H. Kesten. Random difference equations and renewal theory for products of random matrices. Acta Mathematica, 131:207–248, 1973. → pages 3, 18, 32, 57, 66
[45] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of Association for Computing Machinery, 46(5):604–632, 1999. ISSN 0004-5411. → pages 4
[46] S. Klepper. Entry, exit, growth, and innovation over the product life cycle. American Economic Review, 86(3):562–83, 1996. → pages 6
[47] S. Klepper and P. Thompson. Submarkets and the evolution of market structure. Rand Journal of Economics, 34:862–888, 2007. → pages 6, 7
[48] T. J. Klette and S. Kortum. Innovating firms and aggregate innovation. Journal of Political Economy, 112:986–1018, 2004. → pages 1, 2, 3, 6, 7, 11
[49] T. J. Klette and A. Raknerud. How and why do firms differ? Discussion Papers 320, Research Department of Statistics Norway, 2002. → pages 6
[50] J. B. Long and C. I. Plosser. Real business cycles. Journal of Political Economy, 91:39–69, 1983. → pages 86, 96
[51] R. E. Lucas. On the size-distribution of business firms. Bell Journal of Economics, 9:508–523, 1978. → pages 6
[52] Y. Luo. Consumption dynamics under information processing constraints. Review of Economic Dynamics, 11:366–385, 2008. → pages 86
[53] E. G. J. Luttmer. Selection, growth, and the size distribution of firms. Quarterly Journal of Economics, 122:1103–1144, 2007. → pages 7, 8
[54] E. G. J. Luttmer. On the mechanics of firm growth. Working Paper 657, Federal Reserve Bank Minneapolis, 2008. → pages 6, 7, 8
[55] M. L. Goldstein, S. A. Morris, and G. G. Yen. Problems with fitting to the power-law distribution. The European Physical Journal B - Condensed Matter and Complex Systems, 41(2):255–258, 2004. → pages 132
[56] N. G. Mankiw and R. Reis. Sticky information in general equilibrium. Journal of the European Economic Association, 5(2-3):603–613, 04-05 2007. → pages 86
[57] M. J. Melitz. The impact of trade on aggregate industry productivity and intra-industry reallocations. Econometrica, 71:1695–1725, 2003. → pages 84, 90, 99
[58] G. Moscarini. Limited information capacity as a source of inertia. Journal of Economic Dynamics and Control, 28:2003–2035, 2004. → pages 86
[59] M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45:167–256, 2003. → pages 69, 88, 92, 127
[60] M. Nirei. Threshold behavior and aggregate fluctuation. Journal of Economic Theory, 127:309–322, 2006. → pages 86
[61] R. Reis. Inattentive producers. Review of Economic Studies, 73:793–821, 2006. → pages 86, 93
[62] C. Rosell and A. Agrawal. University patenting: Estimating the diminishing breadth of knowledge diffusion and consumption. NBER Working Papers 12640, National Bureau of Economic Research, Inc, 2006. → pages 20
[63] E. Rossi-Hansberg and M. L. J. Wright. Establishment size dynamics in the aggregate economy. American Economic Review, 97:1639–1666, 2007. → pages 6
[64] M. Seker. A structural model of establishment and industry evolution: Evidence from Chile. Policy Research Working Paper Series 4947, The World Bank, June 2009. → pages 6
[65] C. A. Sims. Implications of rational inattention. Journal of Monetary Economics, 50:665–690, 2003. → pages 86
[66] R. W. Sinnott. Virtues of the haversine. Sky and Telescope, 68:159, 1984. → pages 27
[67] M. L. Streitweiser. The extent and nature of establishment level diversification in sixteen US industries. Journal of Law and Economics, 34:503–534, 1991. → pages 29
[68] J. Sutton. Gibrat's legacy. Journal of Economic Literature, 35:40–59, 1997. → pages 6
[69] A. Vazquez.
Growing network with local rules: Preferential attachment, clustering hierarchy, and degree correlations. Physical Review E, 67(056104), 2003. → pages 57
[70] M. Woodford. Imperfect common knowledge and the effects of monetary policy. NBER Working Paper 8673, National Bureau of Economic Research, Inc, 2003. → pages 86
[71] A. Young. Labor's share fluctuations, biased technical change, and the business cycle. Review of Economic Dynamics, 7(4):916–931, October 2004. → pages 99

Appendix A
Robustness of Sectoral Firm Size Heterogeneity

This appendix introduces a method for estimating sectoral firm size heterogeneity and the data sets used for this purpose. It then shows that sectoral firm size heterogeneity varies little when firm size is measured with different proxies. It also shows that sectoral firm size heterogeneity is stable over time within a country and highly correlated across industrialized countries.

Helpman, Melitz, and Yeaple (33) show that there are two ways to estimate the commonly used measure of firm size heterogeneity, the dispersion of log firm size. For a Pareto distributed variable $X$ with shape parameter $\mu$, one method is to calculate the standard deviation of $\log(X)$, which is the reciprocal of $\mu$. The other method is to estimate $\mu$ by OLS from

$$\log(\Pr(X > x)) = -\mu \log(x) + c$$

and use $1/\mu$. Theoretically, these two methods give the same estimate. In this paper, I estimate the firm size heterogeneity measure with the second method.

The data sets used here include French manufacturing firm-level data for 1997-2005 provided by Amadeus, Bureau van Dijk; Chilean manufacturing firm-level data for 1979-1996 provided by Chile's Instituto Nacional de Estadistica; and 'Industry Statistics by Employment Size' provided by the U.S. Economic Census 1997 and 2002. The French data set has information for every firm; the Chilean data set includes only firms with more than ten employees (see footnote 1). Only the U.S.
data set gives the number of firms for ten employment size categories: 1 to 4 workers, 5 to 9 workers, 10 to 19 workers, 20 to 49 workers, 50 to 99 workers, 100 to 249 workers, 250 to 499 workers, 500 to 999 workers, 1000 to 2499 workers, and 2500 or more workers. From these numbers, I can determine the rank of the firms with 1, 5, 10, 20, 50, 100, 250, 500, 1000 and 2500 employees in their six-digit NAICS industry.

Footnote 1: If a firm size distribution measured by the number of employees follows a Pareto distribution with shape parameter $\mu$, this truncation does not affect the estimation of $\mu$, because a Pareto distribution has the special feature that when it is truncated from the left, the remaining right tail is still a Pareto distribution with the same shape parameter, except that the new distribution starts at a higher minimum level $x_m$.

To show the robustness of firm size heterogeneity to different firm size proxies, it is appropriate to use the data from France and Chile, because both have more than one proxy for firm size. Number of employees, operational turnover, sales, and value added are the alternative proxies for firm size. The U.S. data set only has the number of employees as a proxy for firm size. Sdlnl, sdlny, sdlns and sdlnva are abbreviations for the standard deviation of log(number of employees), log(operational turnover), log(sales) and log(value added), respectively. In the French data set (Table 1.9), these four measures for 81 four-digit NAICS sectors are highly correlated, with a correlation coefficient greater than 0.9 in all years and for all combinations of variables. In the Chilean data set (Table 1.10), the correlation coefficients between sdlns, sdlny and sdlnva are as high as those observed in the French data set, but those between sdlnl and the other three variables are lower and range between 0.6 and 0.8.
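The two heterogeneity estimators described at the start of this appendix, and the invariance to left truncation noted in footnote 1, can be illustrated on simulated data. The following is a minimal sketch under stated assumptions: the firm sizes are synthetic Pareto draws rather than any of the data sets used here, the shape parameter $\mu = 1.5$, the truncation point and the function names are all hypothetical, and the empirical survival probability of the $i$-th smallest firm is taken to be the share of firms at least as large.

```python
import math
import random
import statistics


def pareto_sample(n, mu, x_min=1.0, seed=0):
    """Draw n Pareto(mu) sizes by inverse-transform sampling."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / mu) for _ in range(n)]


def tail_exponent_ols(sizes):
    """Estimate mu as minus the OLS slope of log Pr(X >= x) on log x."""
    xs = sorted(sizes)
    n = len(xs)
    log_x = [math.log(x) for x in xs]
    # the i-th smallest firm (0-based) has n - i firms at least as large
    log_surv = [math.log((n - i) / n) for i in range(n)]
    mean_x = sum(log_x) / n
    mean_s = sum(log_surv) / n
    cov = sum((a - mean_x) * (b - mean_s) for a, b in zip(log_x, log_surv))
    var = sum((a - mean_x) ** 2 for a in log_x)
    return -cov / var


if __name__ == "__main__":
    firms = pareto_sample(50_000, mu=1.5)       # synthetic sizes, not real data
    sd_log = statistics.pstdev([math.log(x) for x in firms])
    truncated = [x for x in firms if x > 3.0]   # mimic a minimum-size cutoff
    print("sd of log size:", sd_log)
    print("OLS 1/mu, full sample:", 1.0 / tail_exponent_ols(firms))
    print("OLS 1/mu, truncated sample:", 1.0 / tail_exponent_ols(truncated))
```

On exactly Pareto data the three numbers should be close to one another, which is the sense in which the two methods agree and in which a minimum-size cutoff leaves the OLS estimate unaffected. Real firm data need not be exactly Pareto, so the same comparison on actual data is an empirical question.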
A possible reason for the lower correlations involving sdlnl is that the Chilean firm data are truncated, since only firms with more than ten employees are part of this data set.

The time persistence of firm size heterogeneity appears in the U.S., French and Chilean manufacturing sectors, though these data sets cover different time intervals. The proxy for firm size is the number of employees in all data sets. In the U.S. data (Figure 1.12), the estimates for four-digit NAICS manufacturing industries in 1997 and 2002 exhibit a tight one-to-one relationship. In France (Figure 1.13), the estimates for four-digit NAICS manufacturing industries in 1997 and 2005 also exhibit an almost perfect one-to-one pattern (see footnote 2). The Chilean data set (Figure 1.14) has the longest time range: 20 years. There also, the estimates for four-digit ISIC manufacturing industries in 1979 and 1996 roughly follow a one-to-one relationship. Note that Chile experienced some economic reforms during this period. The outliers typically have fewer than 100 establishments.

Footnote 2: The outlier 3122 represents the tobacco manufacturing sector. Its heterogeneity measure drops from 3.1 in 1997 to 0.9 in 2005. There were important policy changes in this sector during this eight-year period, which might have caused the significant change in firm size heterogeneity. In 2001, Brussels passed a law banning mass-media advertising of tobacco and requiring large warning labels on cigarette packages. To discourage potential new smokers, governments throughout Europe increased their cigarette taxes in 2003.

The cross-country robustness test of firm size heterogeneity is based on a comparison between French and U.S. manufacturing sectors, because both use the NAICS industry classification. There are 81 NAICS four-digit manufacturing sectors in total. The number of employees is the firm size proxy in both data sets. Figure 1.15 and Figure 1.16 show that the correlation coefficient between sdlnl, the standard deviation of log(number of employees), in the two countries is 0.74 in 1997 and 0.72 in 2002. This result corroborates a similar result in Helpman, Melitz, and Yeaple (33). They find that, although the U.S. and France have different economic policies and institutions, firm size distributions for the same industries are highly correlated across countries, with a correlation coefficient of more than 0.5.

Appendix B
Detailed One-Sector Model

The consumer's utility function becomes:

$$U = \max_{\{x^k_{i,t}\}} \int_0^{\infty} \rho^t \left[ \sum_{k=1}^{K} s_k \log\left(C^k_t\right) \right] dt, \qquad C^k_t = \left( \int_0^{I^k_t} \left(x^k_{i,t}\right)^{\frac{\sigma_k-1}{\sigma_k}} di \right)^{\frac{\sigma_k}{\sigma_k-1}}, \quad k = 1,2,\ldots,K.$$

$s_k$ is the consumer's preference for goods in sector $k$, or the share of income spent in sector $k$. $I^k_t$ is the total number of varieties in sector $k$. $\sigma_k$ is the elasticity of substitution between differentiated goods in sector $k$.

The firm's problem is:

$$\max_{\{N^{ij}_{f,t},\, M^{ij}_{f,t}\},\ i,j \in \{1,2,\ldots,K\}} V(z_{f,t}) = \sum_{i=1}^{K} \frac{s_i P_t C_t}{\sigma_i} \frac{z^i_{f,t}}{I^i_t} + \frac{\rho C_t P_t}{C_{t+1} P_{t+1}} E[V(z_{f,t+1})] - \sum_{i=1}^{K}\sum_{j=1}^{K} \frac{N^{ij}_{f,t} + M^{ij}_{f,t}}{\bar{Z}^i_t}$$

subject to

$$z_{f,t+1} = z_{f,t} + \Delta z^N_{f,t} + \Delta z^M_{f,t} \tag{B..1}$$

$$\frac{\Delta z^{N,i}_{f,t}}{z^i_{f,t}} = \sum_j \frac{A^{ij}_N \left(N^{ij}_{f,t}\right)^{\alpha} \left(z^j_{f,t}\right)^{1-\alpha}}{z^i_{f,t}} + \varepsilon^{N,ij}_{f,t}, \quad i,j \in \{1,2,\ldots,K\} \tag{B..2}$$

$$\frac{\Delta z^{M,i}_{f,t}}{\bar{Z}^i_t} = \frac{A^{ij}_M \left(M^{ij}_{f,t}\right)^{\alpha} \left(\bar{Z}^j_t\right)^{1-\alpha}}{\bar{Z}^i_t} + \varepsilon^{M,i}_{f,t}, \quad i \in \{1,2,\ldots,K\} \tag{B..3}$$

where $\Delta z^N_{f,t}$ and $\Delta z^M_{f,t}$ are $K$-dimensional vectors. The $i$th element of $\Delta z^N_{f,t}$ ($\Delta z^M_{f,t}$), $\Delta z^{N,i}_{f,t}$ ($\Delta z^{M,i}_{f,t}$), is the number of innovated (imitated) new goods in sector $i$ by firm $f$ at time $t$. $\{\varepsilon^{N,ij}_{f,t}, \varepsilon^{M,i}_{f,t}\}$ are i.i.d. across firms and time.

An educated guess for the firm value function is $V(z_{f,t}) = \sum_{i=1}^{K} v^i_t \frac{z^i_{f,t}}{I_{i,t}} + u_t$.
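The first order conditions and Bellman equation can be written as:

$$N^{ij}_{f,t} = \left(\frac{A^{ij}_N \alpha v^i_t \rho^{ij}_t}{M_F}\right)^{\frac{1}{1-\alpha}} z^j_{f,t}, \quad i,j \in \{1,2,\ldots,K\} \tag{B..4}$$

$$M^{ij}_{f,t} = \left(\frac{A^{ij}_M \beta v^i_t \rho^{ij}_t}{M_F}\right)^{\frac{1}{1-\alpha}} \bar{Z}^j_t, \quad i \in \{1,2,\ldots,K\} \tag{B..5}$$

$$v^j_t = \frac{s^j P_t C_t}{\sigma^j} - M_F \sum_{i=1}^{K} \left(\frac{A^{ij}_N \alpha v^i_{t+1} \rho^{ij}_t}{M_F}\right)^{\frac{1}{1-\alpha}} \tag{B..6}$$

$$\qquad + \sum_{i=1}^{K} \frac{I^j_t}{I^i_t} \rho^{ij}_t v^i_{t+1} A^{ij}_N \left(\frac{A^{ij}_N \alpha v^i_{t+1} \rho^{ij}_t}{M_F}\right)^{\frac{\alpha}{1-\alpha}} + v^j_{t+1}, \tag{B..7}$$

$$i,j \in \{1,2,\ldots,K\}, \tag{B..8}$$

where $\rho^{ij}_t = \frac{\rho I^j_t C_t P_t}{I^i_{t+1} C_{t+1} P_{t+1}}$.

In (B..4) and (B..5), the input in each type of R&D is proportional to the knowledge capital input. In (B..6), the marginal value of one product in sector $j$, $v^j$, depends on its current profit in sector $j$ plus its contribution to future innovation in all $K$ sectors. The firm size dynamic in sector $i$ is:

$$\begin{aligned}
z^i_{f,t+1} ={}& \frac{I^i_t}{I^i_{t+1}} \left[ 1 + \left(A^{ii}_N\right)^{\frac{1}{1-\alpha}} \left[\frac{\alpha v^i_t \rho^{ij}_t}{M_F}\right]^{\frac{\alpha}{1-\alpha}} + \varepsilon^{N,ii}_{f,t} \right] z^i_{f,t} \\
&+ \sum_{j \neq i} \frac{I^j_t}{I^i_{t+1}} \left[ \left(A^{ij}_N\right)^{\frac{1}{1-\alpha}} \left[\frac{\alpha v^i_t \rho^{ij}_t}{M_F}\right]^{\frac{\alpha}{1-\alpha}} + \varepsilon^{N,ij}_{f,t} \right] z^j_{f,t} \\
&+ \sum_{j=1}^{K} \frac{I^j_t}{I^i_{t+1}} \left(A^{ij}_M\right)^{\frac{1}{1-\beta}} \left[\frac{\alpha v^i_t \rho^{ij}_t}{M_F}\right]^{\frac{\beta}{1-\beta}} + \varepsilon^{M,i}_{f,t}.
\end{aligned}$$

The firm size dynamics in all $K$ sectors are summarized by

$$z_{f,t+1} = R_{f,t} z_{f,t} + L_{f,t}, \tag{B..9}$$

where

$$R^{ij}_{f,t+1} \equiv \frac{I^i_t}{I^i_{t+1}} \left[ \left(A^{ij}_N\right)^{\frac{1}{1-\alpha}} \left[\frac{\alpha v^i_t \rho^{ij}_t}{M_F}\right]^{\frac{\alpha}{1-\alpha}} + \varepsilon^{N,ij}_{f,t} \right], \quad j \neq i,\ i,j \in \{1,2,\ldots,K\},$$

$$R^{ii}_{f,t+1} \equiv \frac{I^i_t}{I^i_{t+1}} \left[ 1 + \left(A^{ii}_N\right)^{\frac{1}{1-\alpha}} \left[\frac{\alpha v^i_t \rho^{ij}_t}{M_F}\right]^{\frac{\alpha}{1-\alpha}} + \varepsilon^{N,ii}_{f,t} \right], \quad i \in \{1,2,\ldots,K\},$$

$$L^i_{f,t+1} \equiv \sum_{j=1}^{K} \frac{I^j_t}{I^i_{t+1}} \left(A^{ij}_M\right)^{\frac{1}{1-\alpha}} \left[\frac{\alpha v^i_t \rho^{ij}_t}{M_F}\right]^{\frac{\alpha}{1-\alpha}} + \varepsilon^{M,i}_{f,t}.$$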
B.1 General Equilibrium

In general equilibrium, the sectoral marginal firm value, $\{v^i\}$; the growth rate of the number of goods, g; the relative size between sectors, $\{I^j/I^i\}$, i, j = 1,2,...,K; the total consumption expenditure, PC; and the number of firms, $M_F$, are solved by the following equations:

\[
\left(1 - \frac{\rho}{1+g}\right) \frac{v^j}{M_F} = \frac{s_j PC}{\sigma_j M_F} + \frac{1-\alpha}{\alpha} \sum_{i=1}^{K} \frac{I^j}{I^i} \left( \frac{A^{ij}_N \rho \alpha v^i}{(1+g) M_F} \right)^{\frac{1}{1-\alpha}}, \quad i,j \in \{1,2,\dots,K\}; \tag{B.10}
\]
\[
g = g^i \equiv \sum_{j=1}^{K} \left[ \frac{\rho \alpha v^i}{(1+g) M_F} \right]^{\frac{\alpha}{1-\alpha}} \left[ \left(A^{ij}_N\right)^{\frac{1}{1-\alpha}} + \left(A^{ij}_M\right)^{\frac{1}{1-\alpha}} \right] \frac{I^j}{I^i}, \quad i,j \in \{1,2,\dots,K\}; \tag{B.11}
\]
\[
F = u = \frac{1-\alpha}{\alpha \left(1 - \frac{\rho}{1+g}\right)} \sum_{i=1}^{K} \sum_{j=1}^{K} \frac{I^j}{I^i} \left( \frac{A^{ij}_M \rho \alpha v^i}{(1+g) M_F} \right)^{\frac{1}{1-\alpha}}; \tag{B.12}
\]
\[
PC = L + \sum_{i=1}^{K} \frac{s_i PC}{\sigma_i} - \frac{\rho \alpha g}{1+g} \sum_{i=1}^{K} v^i. \tag{B.13}
\]

In the equations above, i represents a knowledge learner sector and j represents a knowledge giver sector. Note that the number of goods in every sector grows at the same speed, because inter-sector knowledge spillovers keep all sectors on the same growth track. If one sector i had been growing more slowly than other sectors for a lengthy period, its number of goods would be very small relative to other sectors. The cross-sector knowledge spillovers would then push $g^i$ up through a very large relative sector size, $I^j/I^i$, in (B.11), until $g^i$ equals the common growth rate.

Equation (B.10) shows that the marginal value of sector j knowledge, $v^j$, depends not only on the discounted future profit from its own sector but also on its contribution to knowledge production in all related sectors i such that $A^{ij} > 0$. If sector j gives intensive knowledge outflows to many other sectors, sector j attracts more R&D investment than others because of its higher marginal value of knowledge $v^j$.

Consider the special case when $A_M \equiv \gamma A_N \equiv \gamma A$. The asymmetry of the knowledge diffusion matrix A indirectly shapes R&D resource allocation across sectors, $\{I^j/I^i\}$, through the relative marginal value of sectoral knowledge, $\{v^j/v^i\}$.
Again, $v^j$ is the private return to knowledge accumulation in sector j; a stronger IPRP, or a lower γ, increases $v^j$ for every sector j. However, from the second part of (B.10), the increment is higher if sector j contributes more knowledge to other sectors, because the $\{v^i\}$ increments in all knowledge learner sectors i add up to $v^j$ at the rate of $(v^i)^{\frac{1}{1-\alpha}}$.

In the one-sector model, the optimal IPRP, $A_M/A_N \equiv \gamma$, that maximizes the economic growth rate depends on the trade-off between a higher imitation rate and a lower private return to knowledge v. In the multi-sector model, there is another factor to consider: the relative sector size $\{I^c/I^p\}$, where c represents a center sector that contributes intensive knowledge spillovers to other sectors and p represents a peripheral sector that barely generates any externality. Since a higher γ hurts the private knowledge return $v^c$ of center sectors more than that of peripheral sectors, $v^p$, firms allocate a relatively smaller share of R&D resources to the center sector, and the relative sector size in terms of number of goods, $\{I^c/I^p\}$, becomes smaller. As a result, the smaller size of a knowledge contributing sector relative to a knowledge learner sector hurts the economic growth rate, as illustrated in (B.11).

Appendix C Robustness Checks

This appendix provides robustness checks of the one-sector model’s implications with alternative citation data sets. The first robustness check uses randomly simulated citations. The second uses all G7 country citations, which cover more than 90% of all patents in the NBER Patent Database, while US patents account for only about 50% of all patents.

C.1 Random Citation Data

A concern about the one-sector model’s third implication is that cross-firm citations should be fewer in a sector with a more heterogeneous firm size distribution, even when cross-firm knowledge diffusion is equally complete and instant in each sector.
The reason is that there are more big firms in a heterogeneous sector, and a big firm is more likely to cite its own patents simply because it has a larger patent stock available to be cited. I simulate such random citation data sets to mimic an environment in which information is complete in every sector, and compare them with the real citation data. I then show that the real citation data set exhibits significantly larger cross-sector differences in knowledge diffusion than the random citation data sets.

In the random citation data set, the citing patent is kept the same as in the real citation data, but the cited patent is randomly assigned. Every existing patent in the same sector has an equal chance of being cited, regardless of the distance between the citing firm and the cited firm and of their other characteristics. I simulate 100 such random citation data sets. The values reported in Figure 1.17 are the medians of these 100 data sets’ results.

Figure 1.17 shows that cross-firm citations account for 95% to 99.9% of total random citations across sectors. Although a sector with a heterogeneous firm size distribution appears to have a smaller percentage of cross-firm citations than other sectors, the 5% cross-sector gap is trivial compared with the 40% gap in the real citation data set (see Figure 1.10).

I run the same regressions as those in Table 1.1 using the 100 random citation data sets and report the results in Table 1.11. The coefficients and robust standard errors reported are the median values of the 100 regression results. Compared with Table 1.1, the regression results using random citation data show that distance does not delay knowledge diffusion half as much as it does in the real citation data; bigger firms do not cite outside patents faster than smaller firms at all; and citation time lag is slightly positively correlated with sectoral firm size heterogeneity, but the regression coefficients are much smaller than those in Table 1.1.
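The random reassignment described in this section can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the procedure actually used for Figure 1.17 or Table 1.11: the data layout (a dictionary mapping patent id to firm, sector and year), the function names, and the reading of "existing patent" as "a patent in the citing patent's sector with an earlier year" are all hypothetical.

```python
import random
from statistics import median


def randomize_citations(citations, patents, rng):
    """Keep each citing patent and redraw the cited patent uniformly at random.

    citations: list of (citing_id, cited_id) pairs.
    patents:   dict patent_id -> (firm, sector, year).
    Assumption: the candidate pool is every patent in the citing patent's
    sector with an earlier year.
    """
    randomized = []
    for citing_id, _ in citations:
        _, sector, year = patents[citing_id]
        pool = sorted(pid for pid, (_, s, y) in patents.items()
                      if s == sector and y < year)
        if pool:
            randomized.append((citing_id, rng.choice(pool)))
    return randomized


def cross_firm_share(citations, patents):
    """Share of citations whose citing and cited patents belong to different firms."""
    if not citations:
        return None
    cross = sum(1 for a, b in citations if patents[a][0] != patents[b][0])
    return cross / len(citations)


def median_random_share(citations, patents, n_sets=100, seed=0):
    """Median cross-firm share over n_sets simulated random citation data sets."""
    rng = random.Random(seed)
    shares = []
    for _ in range(n_sets):
        share = cross_firm_share(randomize_citations(citations, patents, rng), patents)
        if share is not None:
            shares.append(share)
    return median(shares) if shares else None
```

Comparing `median_random_share` with `cross_firm_share` on the real citation list, sector by sector, gives the kind of real-versus-random contrast discussed above.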
In the random-citation regressions, two factors still affect citation time lag with a magnitude similar to that in Table 1.1. The citation time lag is smaller when the cited patent is owned by a bigger firm and when the sector has a larger patent stock.

Note that these ‘random citations’ are not purely random: the knowledge receiver, the citing firm, is still the same as in the real citation data, and only the knowledge giver, the cited firm, is random. This is why cross-sector differences in knowledge spillovers do not disappear completely in the simulated random citation data sets.

Overall, the cross-sector differences in knowledge diffusion that exist in the real citation data set are dramatically smaller in the random citation data sets, where knowledge spillovers are equally complete and instant in all sectors by construction.

C.2 G7 Country Citation Data

The G7 country citation data set (Canada, France, Germany, Italy, Japan, U.K. and U.S.) also supports the one-sector model’s implications. All estimation methods used are the same as those in section 3.

As in Figure 1.7, Figure 1.18 and Figure 1.19 show that the innovation rate is independent of firm size and the imitation rate declines with firm size in the G7 country citation data set. Moreover, the scale-independency of the imitation rate is negatively related to sectoral firm size heterogeneity.

In line with Figure 1.9, Figure 1.20 supports the finding that sectoral firm size heterogeneity is negatively related to the ratio between imitation’s contribution to the gross growth rate and the volatility of innovation risk in the larger data set with G7 country citation data.

Figure 1.21 shows the same negative relation between the abundance of cross-firm knowledge diffusion and sectoral firm size heterogeneity as Figure 1.10.

I run the same regressions as those in Table 1.1 using the G7 citation data and report the results in Table 1.14.
Sectoral firm size heterogeneity is positively and statistically significantly correlated with citation time lag in regression No. 3, which includes country pair-industry fixed effects, but the same coefficient is not statistically significant in regression No. 2 and is barely significant at the 10% level in regression No. 1. My explanation is that international citations may involve more country pair-industry specific unobserved variables that are correlated with both the sectoral firm size heterogeneity measure and the citation time lag; therefore the firm size heterogeneity measure is not significant in the first two regressions.

Appendix D New Imitation Production Function

In this section, I extend the basic one-sector model to allow firms to combine private and public knowledge in imitation, while using only private knowledge in innovation. Everything else remains the same, except that the production function for imitated new goods becomes

\[
E\left(\Delta z^M_{f,t}\right) = A_M M^{\beta}_{f,t} \left( \gamma z_{f,t} + (1-\gamma) \bar{Z}_t \right)^{(1-\beta)},
\]

where γ is the share of private knowledge in the combined knowledge pool. This imitation function implies that a firm’s past R&D experience helps it absorb current public knowledge. Put another way, positive sorting in a firm’s social network means that a larger firm can expect to learn more from the public knowledge pool, because firms of similar size are more likely to be connected. γ reflects the significance of positive sorting in the social network. These ideas are in line with the facts in patent citation data: when citing other firms’ patents, firms with a larger patent stock tend to cite newer existing patents, cite larger firms’ patents, and cite more diversified sources than smaller firms.
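To make the role of γ concrete, the short Python sketch below evaluates the expected imitation output A_M M^β (γ z + (1−γ) Z̄)^(1−β) for a small and a large firm. The function name and all parameter values are hypothetical and serve only as an illustration of the formula.

```python
def expected_imitation(A_M, M, beta, gamma, z, Z_bar):
    """Expected imitated goods: A_M * M**beta * (gamma*z + (1-gamma)*Z_bar)**(1-beta)."""
    knowledge_pool = gamma * z + (1.0 - gamma) * Z_bar
    return A_M * M ** beta * knowledge_pool ** (1.0 - beta)


# Hypothetical parameter values: same imitation input M, different firm sizes z.
small_firm = expected_imitation(A_M=1.0, M=4.0, beta=0.5, gamma=0.5, z=1.0, Z_bar=4.0)
large_firm = expected_imitation(A_M=1.0, M=4.0, beta=0.5, gamma=0.5, z=16.0, Z_bar=4.0)

# With gamma = 0 only public knowledge matters, so firm size plays no role.
public_only_small = expected_imitation(1.0, 4.0, 0.5, 0.0, 1.0, 4.0)
public_only_large = expected_imitation(1.0, 4.0, 0.5, 0.0, 16.0, 4.0)
```

For the same imitation input, the larger firm obtains more imitated goods whenever γ > 0, while with γ = 0 the function reduces to one that depends only on the public knowledge pool.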
The firm’s problem becomes

\[
\max_{N_{f,t},\, M_{f,t}} V(z_{f,t}) = \frac{P_t Y_t}{\sigma} \frac{z_{f,t}}{I_t} + \rho E\left[V(z_{f,t+1})\right] - \frac{N_{f,t} + M_{f,t}}{\bar{Z}_t}
\]

subject to

\[
z_{f,t+1} = z_{f,t} + \Delta z^N_{f,t} + \Delta z^M_{f,t},
\]
\[
\frac{\Delta z^N_{f,t}}{z_{f,t}} = \frac{A_N N^{\alpha}_{f,t} z^{1-\alpha}_{f,t}}{z_{f,t}} + \varepsilon^n_{f,t},
\]
\[
\frac{\Delta z^M_{f,t}}{\bar{Z}_t} = \frac{A_M M^{\beta}_{f,t} \left( \gamma z_{f,t} + (1-\gamma) \bar{Z}_t \right)^{(1-\beta)}}{\bar{Z}_t} + \varepsilon^m_{f,t}.
\]

The solution is

\[
N_{f,t} = \left( \frac{A_N \alpha v \rho I_t}{I_{t+1} M_F} \right)^{\frac{1}{1-\alpha}} z_{f,t},
\]
\[
M_{f,t} = \left( \frac{A_M \beta (1-\gamma) v \rho I_t}{I_{t+1} M_F} \right)^{\frac{1}{1-\beta}} \left( \gamma z_{f,t} + (1-\gamma) \bar{Z}_t \right),
\]
\[
v = \frac{P_t Y_t}{\sigma} + \frac{\rho v I_t}{I_{t+1}} \left[ 1 + (1-\alpha) A_N \left( \frac{A_N \alpha v \rho I_t}{I_{t+1} M_F} \right)^{\frac{\alpha}{1-\alpha}} + \gamma (1-\beta) A_M \left( \frac{A_M \beta (1-\gamma) v \rho I_t}{I_{t+1} M_F} \right)^{\frac{\beta}{1-\beta}} \right].
\]

A larger share of private knowledge in imitation, γ, raises the marginal firm value v, because the future return on imitation also depends on current firm size. In the social network environment, a larger size today wins the firm a better peer to imitate in the future.

A higher γ also induces larger firm size heterogeneity in the sector. When private knowledge is more important in imitation, or the social network is more positively sorted, sectoral firm size heterogeneity is larger for given productivities of innovation and imitation ($A_N$ and $A_M$). In other words, a higher γ has the same effect on firm size heterogeneity as a higher innovation productivity $A_N$ or a lower imitation productivity $A_M$.

Appendix E Related Questions

The following questions are closely related to the questions asked in this paper, and I would like to study them further in future work. The subsequent paragraphs provide preliminary answers.

Question 1: Why do larger firms give fewer citations to peers? First, generally speaking, larger firms own better technology than average firms in the sector; therefore they are picky when using others’ knowledge. The differences in citation patterns across firms confirm this hypothesis. Compared with smaller firms, larger firms cite newer patents and patents owned by other larger firms. Second, larger firms tend to enter frontier subclasses in the sector, where there are only a few other competitors, which also means there are fewer targets to learn from within the subclass.
Third, from a network perspective, all subclasses are connected by cross-subclass citations. The frontier subclasses are located at the periphery of the network and have fewer cross-subclass links than subclasses at the center of the network. As a result, large firms in these frontier subclasses also make fewer cross-subclass citations. In summary, larger firms give fewer citations to others not because they do not wish to learn from others, but because not many sufficiently good outside patents are available.

Question 2: Why do larger firms grow more slowly? Does the quality-adjusted firm growth rate also decline with firm size? It is easier to understand the firm growth rate by looking at the extensive margin and the intensive margin separately. $g^{ex}$ is the growth rate attributable to patent applications in new subclasses. $g^{in}$ is the growth rate generated by patent applications in existing subclasses. $\mathit{AppnoNew}_t$ is the number of patent applications in new subclasses at time t. $PS_{t-1}$ is the total patent stock by time t−1. $\mathit{Appno}_t$ is the total number of patent applications at time t.

\[
g^{ex} = \frac{\mathit{AppnoNew}_t}{PS_{t-1}}, \qquad
g^{in} = \frac{\mathit{Appno}_t - \mathit{AppnoNew}_t}{PS_{t-1}}.
\]

The firm growth rate drops with firm size on both margins, and it drops faster on the intensive margin than on the extensive margin.

There are two ways to look at the decreasing intensive growth margin. On the one hand, products of the same firm may be closer substitutes than products of different firms. Since a firm’s new product becomes a close competitor of its previous products in the same subclass, the return from one additional product in an existing subclass decreases as a firm accumulates more products in the same subclass. On the other hand, the gain from learning is smaller when a firm owns more knowledge in a small subclass.

The network structure among subclasses helps to analyze the declining intensive margin.
When a firm dynamically expands across subclasses, it normally starts from the center of the network and gradually enters other linked subclasses farther away from the center. Only top firms enter frontier subclasses at the periphery of the network, looking for a higher profit margin and weaker competition. The isolated network location of the periphery subclasses is a natural obstacle that blocks small firms without related knowledge from entering. Since the center subclasses are well connected with other subclasses, smaller firms, which usually start from the center, have access to many potential new subclasses. In contrast, when larger firms approach the edge of the network, the network becomes sparser. There are fewer linked new subclasses that they can potentially enter.

The quality-adjusted growth rate is measured by the citation-weighted number of patents, instead of a simple patent count. When adjusted by the number of inward citations, a larger firm’s growth rate drops even faster. The reason is that the number of inward citations per patent decreases with firm size on both the extensive margin (number of subclasses) and the intensive margin (number of patents within the subclass).

Appendix F Technical Details

To fit into Jackson and Rogers (38)’s non-directed network environment, I ignore the direction in the citation networks and treat them as non-directed networks. For sector s at time t, the adjacency matrix becomes $\hat{M}_{s,t}(i,j) = \max\left(M_{s,t}(i,j),\, M_{s,t}(j,i)\right)$. Firm i and firm j are called ‘old friends’ at time t if i and j are connected at both t and t−1 ($\hat{M}_{s,t-1}(i,j) = 1$ and $\hat{M}_{s,t}(i,j) = 1$). Firms i and j are called ‘new friends’ if firm i is not connected with firm j at time t−1, but i is connected with j at time t ($\hat{M}_{s,t-1}(i,j) = 0$ and $\hat{M}_{s,t}(i,j) = 1$).
Conditional on firms i and j being new friends, they are called ‘network-based new friends’ or ‘friend’s friends’ to each other if there exists at least one firm $k \neq i, j$ such that k is connected with both i and j at time t−1 ($\hat{M}_{s,t-1}(i,:) * \hat{M}_{s,t-1}(:,j) \geq 1$, where $\hat{M}_{s,t-1}(i,:)$ is the i-th row of matrix $\hat{M}_{s,t-1}$ and $\hat{M}_{s,t-1}(:,j)$ is the j-th column of matrix $\hat{M}_{s,t-1}$). Conditional on firms i and j being new friends, they are called ‘random new friends’ to each other if there is no firm k that is connected with both i and j at time t−1 ($\hat{M}_{s,t-1}(i,:) * \hat{M}_{s,t-1}(:,j) = 0$). Mathematically, $r_{s,t}$ is calculated as:

\[
r_{s,t} = \frac{\mathrm{nnz}\left[ \mathrm{not}\left( \hat{M}_{s,t-1} * \hat{M}_{s,t-1} \right) \mathbin{.*} \mathrm{not}\left( \hat{M}_{s,t-1} \right) \mathbin{.*} \hat{M}_{s,t} \right]}{\mathrm{nnz}\left[ \hat{M}_{s,t-1} * \hat{M}_{s,t-1} \mathbin{.*} \mathrm{not}\left( \hat{M}_{s,t-1} \right) \mathbin{.*} \hat{M}_{s,t} \right]}. \tag{F.1}
\]

Here $*$ denotes the matrix product and $.*$ the element-wise product. The numerator (denominator) in (F.1) is the number of new random (friend’s) friends made in sector s at time t. nnz counts the number of nonzero elements. The (i, j) element of $\hat{M}_{s,t-1} * \hat{M}_{s,t-1}$ is the number of common friends that firms i and j share at time t−1. not(M) replaces positive elements in M with zeros, and replaces zeros with ones. Therefore the (i, j) element of $\mathrm{not}(\hat{M}_{s,t-1} * \hat{M}_{s,t-1})$ is one if firm i and firm j have no common friend at time t−1, and zero otherwise. The (i, j) element of $\mathrm{not}(\hat{M}_{s,t-1}) \mathbin{.*} \hat{M}_{s,t}$ is one if firms i and j are new friends to each other, and zero otherwise. Altogether, the (i, j) element of $\mathrm{not}(\hat{M}_{s,t-1} * \hat{M}_{s,t-1}) \mathbin{.*} \mathrm{not}(\hat{M}_{s,t-1}) \mathbin{.*} \hat{M}_{s,t}$ is positive if firms i and j are ‘random new friends’, and zero otherwise. Similarly, the (i, j) element of $\hat{M}_{s,t-1} * \hat{M}_{s,t-1} \mathbin{.*} \mathrm{not}(\hat{M}_{s,t-1}) \mathbin{.*} \hat{M}_{s,t}$ is positive if firms i and j are ‘network-based new friends’, and zero otherwise.

Newman (59) gives several ways to measure network clustering. Jackson and Rogers (38) examine three commonly used clustering coefficients in the literature.
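Returning to the ratio r_{s,t} of random to network-based new friends defined above, a plain-Python sketch of the count could look like the following. It follows the matrix formula literally (so every ordered pair is counted, which doubles both counts for a symmetric matrix and leaves the ratio unchanged). The function names and the four-firm example are hypothetical, and the choice to return None when there are no network-based new friends is an assumption of this sketch.

```python
def symmetrize(M):
    """Non-directed adjacency matrix: M_hat(i, j) = max(M(i, j), M(j, i))."""
    n = len(M)
    return [[max(M[i][j], M[j][i]) for j in range(n)] for i in range(n)]


def count_new_friends(M_prev, M_curr):
    """Return (random new friends, network-based new friends) over ordered pairs (i, j)."""
    A_prev = symmetrize(M_prev)
    A_curr = symmetrize(M_curr)
    n = len(A_prev)
    n_random = 0
    n_network = 0
    for i in range(n):
        for j in range(n):
            # New friend: not connected at t-1, connected at t.
            if A_prev[i][j] != 0 or A_curr[i][j] == 0:
                continue
            # (i, j) element of A_prev * A_prev: number of common friends at t-1.
            common = sum(A_prev[i][k] * A_prev[k][j] for k in range(n))
            if common > 0:
                n_network += 1
            else:
                n_random += 1
    return n_random, n_network


def random_to_network_ratio(M_prev, M_curr):
    """Ratio of random to network-based new friends; None if the denominator is zero."""
    n_random, n_network = count_new_friends(M_prev, M_curr)
    return n_random / n_network if n_network else None


# Hypothetical directed citation matrices for four firms.
# At t-1: firm 0 cites firm 1, firm 1 cites firm 2; firm 3 is isolated.
M_prev = [[0, 1, 0, 0],
          [0, 0, 1, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
# At t: firm 0 also cites firm 2 (common friend: firm 1), and firm 3 cites firm 0.
M_curr = [[0, 1, 1, 0],
          [0, 0, 1, 0],
          [0, 0, 0, 0],
          [1, 0, 0, 0]]

counts = count_new_friends(M_prev, M_curr)
ratio = random_to_network_ratio(M_prev, M_curr)
```

In this example the link between firms 0 and 2 is network-based, because both were connected to firm 1 at t−1, while the link between firms 0 and 3 is random.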
The three clustering coefficients are:

\[
C^{TT}(M_{s,t}) = \frac{\sum_{i;\, j \neq i;\, k \neq i,j} M_{s,t}(i,j)\, M_{s,t}(j,k)\, M_{s,t}(k,i)}{\sum_{i;\, j \neq i;\, k \neq i,j} M_{s,t}(i,j)\, M_{s,t}(j,k)},
\]
\[
C(M_{s,t}) = \frac{\sum_{i;\, j \neq i;\, k \neq i,j} \hat{M}_{s,t}(i,j)\, \hat{M}_{s,t}(j,k)\, \hat{M}_{s,t}(k,i)}{\sum_{i;\, j \neq i;\, k \neq i,j} \hat{M}_{s,t}(i,j)\, \hat{M}_{s,t}(j,k)},
\]

and

\[
C^{Avg}(M_{s,t}) = \frac{1}{n} \sum_i \frac{\sum_{j \neq i;\, k \neq i,j} \hat{M}_{s,t}(i,j)\, \hat{M}_{s,t}(j,k)\, \hat{M}_{s,t}(k,i)}{\sum_{j \neq i;\, k \neq i,j} \hat{M}_{s,t}(i,j)\, \hat{M}_{s,t}(j,k)}.
\]

They all measure the likelihood that two nodes are connected, conditional on these two nodes being connected with a common node. The first two definitions are the same when the network is non-directed. The third definition gives an equal weight to every node, while the first two definitions give a bigger weight to nodes with more links.

To estimate $\mu^x_{s,t}$ in sector s at time t, I run an OLS regression of $\ln\left(1 - F_{s,t}(d^x_{f,t})\right)$ on $\ln(d^x_{f,t})$, where x = {in, out and total} and $F_{s,t}(d^x)$ is the c.d.f. of the x-degree distribution in sector s at time t. $\hat{\mu}^x_{s,t}$ is equal to the absolute value of the OLS coefficient on $\ln(d^x_{f,t})$. Note that the standard deviation of $\{\ln(d^x_{f,t})\}$ is equal to $1/\hat{\mu}^x_{s,t}$, which measures the heterogeneity of the x-degree distribution.

Appendix G Simulating Networks for Every Sector

In sector s’s firm citation network, I identify the realization of $F_{f,t}$ and $R_{f,t}$ for node f at time t with the following steps.

(1) Identify new friends. If $M_{s,t}(j,i) = 1$ and $M_{s,t-1}(j,i) = 0$, node j is node i’s new inward friend at time t. If $M_{s,t}(i,j) = 1$ and $M_{s,t-1}(i,j) = 0$, node j is node i’s new outward friend at time t. If $M_{s,t}(j,i) = 1$ and $M_{s,t-1}(j,i) = 1$, node j is node i’s old inward friend at time t.

(2) Identify the source of new friends. Suppose node j is node i’s new inward friend at time t. If there exists a node k such that k is an inward friend of node i and an inward or outward friend of node j at time t−1, then node j is node i’s new inward friend introduced by inward friend k. The total number of such nodes j introduced by inward friends is denoted $\Delta d^d_{f,t}$.
If there exists a node k such that k is an outward friend of node i, and k is an inward or outward friend of node j at time t−1 or j = k, then node j is node i’s new inward friend introduced by outward friend k. The total number of such nodes j is denoted $\Delta d^p_{f,t}$. If node i has both inward and outward common friends who are friends of j, then I attribute half to $\Delta d^p_{f,t}$ and half to $\Delta d^d_{f,t}$. If nodes j and i do not have any type of common friend, then node j is node i’s random new friend and belongs to $\Delta d^r_{f,t}$. Similarly, new outward friends introduced by inward friends, new outward friends introduced by outward friends, and random new outward friends, $\Delta p^d_{f,t}$, $\Delta p^p_{f,t}$, and $\Delta p^r_{f,t}$, are identified.

(3) Estimate δ, the probability of dropping an old link. If $M_{s,t}(j,i) = 0$ and $M_{s,t-1}(j,i) = 1$, then the old link between i and j is dropped at time t. Denote by $drop_{f,t}$ the number of inward links dropped by firm f at time t. In the entire network, the probability of dropping an old link is

\[
\delta = \frac{\sum_f drop_{f,t}}{\sum_f d_{f,t-1}}.
\]

(4) Infer the elements of the random matrices $F_{f,t}$ and $R_{f,t}$. N is the total number of nodes in the network.

\[
F^{11}_{f,t} = 1 - \delta + \frac{\Delta p^p_{f,t}}{p_{f,t}}, \quad
F^{12}_{f,t} = \frac{\Delta p^d_{f,t}}{d_{f,t}}, \quad
F^{21}_{f,t} = \frac{\Delta d^p_{f,t}}{p_{f,t}}, \quad
F^{22}_{f,t} = 1 - \delta + \frac{\Delta d^d_{f,t}}{d_{f,t}},
\]
\[
R^{1}_{f,t} = \frac{\Delta p^r_{f,t}}{N}, \quad
R^{2}_{f,t} = \frac{\Delta d^r_{f,t}}{N}.
\]

Appendix H Calculation Detail

The mass of just-informed firms is:

\begin{align*}
JI(k) &= \int_{A_{\min}}^{\infty} JI(k,A)\, dF(A) \\
&= \int_{A_{\min}}^{\infty} \frac{(c_t - d_t \ln A)^k A_{\min}^{\mu} \mu A^{d_t-\mu-1}}{k!\, e^{c_t}}\, dA \\
&= \left[ -\frac{A_{\min}^{\mu} \mu (c_t - d_t \ln A)^k A^{d_t-\mu}}{k!\, e^{c_t} (\mu - d_t)} \right]_{A_{\min}}^{\infty}
 - \int_{A_{\min}}^{\infty} \frac{d_t A_{\min}^{\mu} \mu (c_t - d_t \ln A)^{k-1} A^{d_t-\mu-1}}{(\mu - d_t)(k-1)!\, e^{c_t}}\, dA \\
&= \frac{\mu (c_t - d_t \ln A_{\min})^k A_{\min}^{d_t}}{(\mu - d_t)\, k!\, e^{c_t}} - \frac{d_t}{\mu - d_t} JI(k-1) \\
&= \frac{\mu}{\mu - d_t} JI(k, A_{\min}) - \frac{d_t}{\mu - d_t} JI(k-1) \\
&= \frac{\mu}{\mu - d_t} \sum_{g=0}^{k} \left( -\frac{d_t}{\mu - d_t} \right)^{k-g} JI(g, A_{\min}).
\end{align*}
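As a numerical sanity check on the closed form for the mass of just-informed firms, the Python sketch below compares the final sum with the one-step recursion and with a direct trapezoid-rule integration. It assumes μ > d_t so that the integral converges, takes JI(k, A) = (c_t − d_t ln A)^k A^{d_t} / (k! e^{c_t}) and a Pareto density μ A_min^μ A^{−μ−1} as in the derivation, and uses hypothetical parameter values and function names.

```python
import math


def ji_at(k, A, c, d):
    """JI(k, A) = (c - d ln A)^k * A^d / (k! e^c)."""
    return (c - d * math.log(A)) ** k * A ** d / (math.factorial(k) * math.exp(c))


def ji_mass_closed(k, A_min, mu, c, d):
    """Closed form: mu/(mu-d) * sum_g (-d/(mu-d))^(k-g) * JI(g, A_min)."""
    ratio = -d / (mu - d)
    total = sum(ratio ** (k - g) * ji_at(g, A_min, c, d) for g in range(k + 1))
    return mu / (mu - d) * total


def ji_mass_recursive(k, A_min, mu, c, d):
    """Recursion: JI(k) = mu/(mu-d) * JI(k, A_min) - d/(mu-d) * JI(k-1)."""
    mass = mu / (mu - d) * ji_at(0, A_min, c, d)
    for g in range(1, k + 1):
        mass = mu / (mu - d) * ji_at(g, A_min, c, d) - d / (mu - d) * mass
    return mass


def ji_mass_numeric(k, A_min, mu, c, d, upper=60.0, steps=200000):
    """Trapezoid rule in x = ln(A / A_min), where dF(A) = mu * exp(-mu * x) dx."""
    def integrand(x):
        return ji_at(k, A_min * math.exp(x), c, d) * mu * math.exp(-mu * x)

    h = upper / steps
    total = 0.5 * (integrand(0.0) + integrand(upper))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h


# Hypothetical parameter values with mu > d.
params = dict(A_min=1.0, mu=2.0, c=3.0, d=1.0)
closed_k1 = ji_mass_closed(1, **params)
recursive_k1 = ji_mass_recursive(1, **params)
numeric_k1 = ji_mass_numeric(1, **params)
```

The three values agree up to numerical error for these parameters, which supports the integration-by-parts steps above.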
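The total output is:

\begin{align*}
Y_{t+k} &= \frac{1}{P_{t+k}} = \frac{(\sigma-1) \tilde{A}_{t+k}}{\sigma} \\
&= \frac{\sigma-1}{\sigma} \left\{ \int_{A_{\min}}^{\infty} I(k,A) \left[ A(1+\Delta) \right]^{\sigma-1} + \left(1 - I(k,A)\right) A^{\sigma-1}\, dF(A) \right\}^{\frac{1}{\sigma-1}} \\
&= \frac{\sigma-1}{\sigma} \left\{ \int_{A_{\min}}^{\infty} I(k,A) A^{\sigma-1} \left[ (1+\Delta)^{\sigma-1} - 1 \right] + A^{\sigma-1}\, dF(A) \right\}^{\frac{1}{\sigma-1}} \\
&= \frac{\sigma-1}{\sigma} \left\{ \left[ (1+\Delta)^{\sigma-1} - 1 \right] \int_{A_{\min}}^{\infty} I(k,A) A^{\sigma-1}\, dF(A) + \frac{\mu}{\mu-\sigma+1} A_{\min}^{\sigma-1} \right\}^{\frac{1}{\sigma-1}} \\
&= \frac{\sigma-1}{\sigma} \left\{ \left[ (1+\Delta)^{\sigma-1} - 1 \right] \sum_{g=0}^{k} \int_{A_{\min}}^{\infty} JI(g,A) A^{\sigma-1}\, dF(A) + \frac{\mu}{\mu-\sigma+1} A_{\min}^{\sigma-1} \right\}^{\frac{1}{\sigma-1}} \\
&= \frac{\sigma-1}{\sigma} \left\{ \left[ (1+\Delta)^{\sigma-1} - 1 \right] \sum_{g=0}^{k} \int_{A_{\min}}^{\infty} \frac{A_{\min}^{\mu} \mu (c_t - d_t \ln A)^{g} A^{d_t+\sigma-\mu-2}}{g!\, e^{c_t}}\, dA + \frac{\mu}{\mu-\sigma+1} A_{\min}^{\sigma-1} \right\}^{\frac{1}{\sigma-1}} \\
&= \frac{\sigma-1}{\sigma} \left\{ \frac{(1+\Delta)^{\sigma-1} - 1}{\mu-\sigma+1-d_t} \sum_{g=0}^{k} \sum_{j=0}^{g} \left( -\frac{d_t}{\mu-\sigma+1-d_t} \right)^{k-g} JI(g, A_{\min}) + \frac{\mu}{\mu-\sigma+1} \right\}^{\frac{1}{\sigma-1}} A_{\min}.
\end{align*}

The total output of top firms whose sizes are larger than $A_x$ is:

\begin{align*}
Y^{A>A_x}_{t+k} &= \left( \frac{\tilde{A}_{t+k} \mid A_{t+k} > A_x}{\tilde{A}_{t+k}} \right)^{\sigma} Y_{t+k} \\
&= \frac{\sigma-1}{\sigma} \frac{\left[ \int_{A_x}^{\infty} A^{\sigma-1}_{i,t+k}\, dF(A_{i,t}) \right]^{\frac{\sigma}{\sigma-1}}}{\int_{A_{\min}}^{\infty} A^{\sigma-1}_{i,t+k}\, dF(A_{i,t})} \\
&= \frac{\sigma-1}{\sigma} \frac{\left[ \frac{(1+\Delta)^{\sigma-1} - 1}{\mu-\sigma+1-d_t} \sum_{g=0}^{k} \sum_{j=0}^{g} \left( -\frac{d_t}{\mu-\sigma+1-d_t} \right)^{k-g} JI(g, A_x) + \frac{\mu}{\mu-\sigma+1} \right]^{\frac{\sigma}{\sigma-1}}}{\frac{(1+\Delta)^{\sigma-1} - 1}{\mu-\sigma+1-d_t} \sum_{g=0}^{k} \sum_{j=0}^{g} \left( -\frac{d_t}{\mu-\sigma+1-d_t} \right)^{k-g} JI(g, A_{\min}) + \frac{\mu}{\mu-\sigma+1}} \frac{A_x^{\sigma}}{A_{\min}^{\sigma-1}}.
\end{align*}

Appendix I Goodness-of-Fit Test for Power-Law Distribution

Following the Kolmogorov-Smirnov (KS) type goodness-of-fit test in Clauset, Shalizi, and Newman (13) and M. L. Goldstein and Yen (55), I calculate a table of KS test critical values, assuming MLE estimation. I then compare the KS test statistics with the corresponding critical values, given the number of observations.

In the patent data, firm size is measured by the number of patents. Among the 42 3-digit SIC industries, the Power-law size distribution with cut-off cannot be rejected at the 10% level for all but 5 sectors (Industrial organic chemistry; Plastics materials and synthetic resins; Agricultural chemicals; Motorcycles, bicycles, and parts; and Miscellaneous transportation equipment).

In the patent data, in-degree (out-degree) is the number of inward (outward) citations, and total degree is the sum of in-degree and out-degree.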
Among the 42 3-digit SIC industries, the Power-law in-degree distribution with cut-off cannot be rejected at the 10% level for all sectors but Miscellaneous transportation equipment; the Power-law out-degree distribution with cut-off cannot be rejected at the 10% level for all sectors but Guided missiles and space vehicles and parts, and Stone, clay, glass and concrete products; the Power-law total-degree distribution with cut-off cannot be rejected at the 10% level for all sectors but Refrigeration and service industry machinery, Household appliances, and Railroad equipment. See columns 3-6 of Table 2.2.

In the French manufacturing firm data, firm size is measured by the number of employees. Among the 21 3-digit NAICS 2002 industries, the p-value of the KS statistic is greater than 10% for all sectors except the Animal Food Manufacturing sector.
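The KS statistic used in this appendix can be sketched in Python as follows. This is a simplified illustration, not the procedure behind Table 2.2: it assumes a continuous Pareto tail above a known lower bound x_min with no cut-off, estimates the tail index by maximum likelihood, and leaves out the simulated critical-value table. The function names and the synthetic sample are hypothetical.

```python
import math


def tail_index_mle(sizes, x_min):
    """MLE of mu for a Pareto tail P(X > x) = (x / x_min)^(-mu), x >= x_min."""
    tail = [x for x in sizes if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    return len(tail) / log_sum


def ks_statistic(sizes, x_min, mu):
    """Largest gap between the empirical c.d.f. and the fitted Pareto c.d.f."""
    tail = sorted(x for x in sizes if x >= x_min)
    n = len(tail)
    d_max = 0.0
    for rank, x in enumerate(tail):
        model_cdf = 1.0 - (x / x_min) ** (-mu)
        # The empirical c.d.f. jumps from rank/n to (rank+1)/n at x.
        d_max = max(d_max,
                    abs((rank + 1) / n - model_cdf),
                    abs(rank / n - model_cdf))
    return d_max


# Synthetic sample: evenly spaced quantiles of a Pareto distribution with mu = 1.5.
n = 1000
true_mu = 1.5
sample = [(1.0 - (i + 0.5) / n) ** (-1.0 / true_mu) for i in range(n)]

mu_hat = tail_index_mle(sample, x_min=1.0)
ks = ks_statistic(sample, x_min=1.0, mu=mu_hat)
```

In an application, `ks` would be compared with a critical value for the given number of observations, and the Power-law hypothesis would be rejected when the statistic exceeds it.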
Item Metadata
Title | Essays on knowledge spillovers |
Creator | Cai, Jie |
Publisher | University of British Columbia |
Date Issued | 2010 |
Genre | Thesis/Dissertation |
Type | Text |
Language | eng |
Date Available | 2011-01-25 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0071592 |
URI | http://hdl.handle.net/2429/30825 |
Degree | Doctor of Philosophy - PhD |
Program | Economics |
Affiliation | Arts, Faculty of; Vancouver School of Economics |
Degree Grantor | University of British Columbia |
GraduationDate | 2011-05 |
Campus | UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |