Essays on factor misallocation

by

José Pulido Pescador

B.A., Universidad Nacional de Colombia, 2008
M.A., Universidad Nacional de Colombia, 2011

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

The Faculty of Graduate and Postdoctoral Studies
(Economics)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)

August, 2018

© José Pulido Pescador 2018

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the dissertation entitled:

Essays on Factor Misallocation

submitted by José Pulido Pescador in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Economics

Examining Committee:

Matilde Bombardini, Economics
Supervisor

Tomasz Święcki, Economics
Supervisory Committee Member

Paul Beaudry, Economics
University Examiner

John Ries, Business Administration
University Examiner

Abstract

This thesis studies different implications of micro-level factor misallocation across heterogeneous agents. It consists of three chapters.

The first chapter examines the impact of firm-level factor misallocation on an open economy's comparative advantage. After providing empirical evidence on how metrics of firm-level misallocation in Colombia are related to measures of the country's revealed comparative advantage, I explore the general equilibrium effects of such misallocation and its impact on industries' export capabilities. I compute a counterfactual equilibrium in which the misallocation is removed in Colombia. The reallocation of factors leads to an important change in the country's industrial structure and a rise in the exports-to-GDP ratio of 18 percentage points. This industrial composition effect is absent in the workhorse models of firm-level factor misallocation under closed economies.

Based on a co-authored paper with Tomasz Święcki, the second chapter studies the origin of the income gaps between agricultural and non-agricultural workers in developing countries.
We use Indonesian data to document a robust premium for workers who move out of agriculture and a loss for those who move into agriculture, even if they do not migrate. We argue that to simultaneously generate these within-worker premia and the main moments of the joint sector-income distribution over time, self-selection needs to take place under barriers to sectoral mobility that misallocate workers across sectors. We find that removing such barriers prompts 30% of the workforce to reallocate and raises aggregate output by 17%.

The third chapter extends the standard model of firm-level factor misallocation in a closed economy in two dimensions. First, I introduce idiosyncratic demand shocks. This allows me to evaluate whether metrics of misallocation predict plants' survival, a test used to claim that misallocation metrics are empirically swamped by demand shocks. I argue that unconditional estimates in this test are biased in the presence of firms' selection, which would explain the puzzling empirical findings. Second, I compute the TFP gains of removing misallocation both within and across industries. I quantify the importance of inter-industry misallocation and explore its potential role in explaining TFP gaps across countries.

Lay Summary

This thesis studies different aspects of resource misallocation across heterogeneous agents. The first chapter examines how firm-level factor misallocation can affect the comparative advantage of an open economy. After documenting how standard metrics of factor misallocation are related to measures of comparative advantage, I explore the channels through which factor misallocation shapes industries' export capabilities, using a model of international trade. The second chapter studies the origin of the income gaps between agricultural and non-agricultural workers in developing countries. I evaluate whether those gaps are explained by barriers to labor mobility across sectors or by efficient sorting of workers.
The third chapter extends the standard framework of firm-level factor misallocation in a closed economy in two dimensions: to account for idiosyncratic demand shocks and for misallocation both within and across industries.

Preface

Chapter 2, “Barriers to Mobility or Sorting? Sources and Aggregate Implications of Income Gaps across Sectors and Locations in Indonesia”, is joint work with Professor Tomasz Święcki from the Vancouver School of Economics at the University of British Columbia. I participated in all stages of the research: collection and statistical analysis of the data; estimation and robustness checks of the reduced-form regressions; identification and estimation of the structural model; computation of the counterfactual exercises; and writing several sections of the manuscript.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements
Dedication
Introduction
1 Firm-level Factor Misallocation and Comparative Advantage
  1.1 Introduction
  1.2 Empirical motivation
  1.3 A model of firm-level misallocation in an open economy
  1.4 Empirical implementation
  1.5 Conclusions
  1.6 Tables and figures
2 Barriers to Mobility or Sorting? Sources and Aggregate Implications of Income Gaps across Sectors and Locations in Indonesia
  2.1 Introduction
  2.2 Data
  2.3 Income gaps across sectors and locations
  2.4 Model of sorting across sectors with barriers to sectoral mobility
  2.5 Structural estimation
  2.6 Results
  2.7 Conclusions
  2.8 Tables and figures
3 Demand Shocks and Inter-industry Distortions under Firm-level Factor Misallocation
  3.1 Introduction
  3.2 Demand shocks and plant survival
  3.3 Intra- and inter-industry misallocation
  3.4 Conclusions
  3.5 Tables and figures
Bibliography
Appendices
A Appendix to Chapter 1
  A.1 Description of the dataset
  A.2 Bils, Klenow and Ruane's (2017) method and results for Colombia
  A.3 Solution of the model
  A.4 Mathematical derivations
B Appendix to Chapter 2
  B.1 Additional tables
  B.2 Recall bias
  B.3 Estimation procedure
  B.4 Identification
  B.5 Proofs
C Appendix to Chapter 3
  C.1 Additional figures
  C.2 CES aggregator across sectors

List of Tables

1.1 Alternative explanations for dispersion in revenue productivity
1.2 RCA explained by misallocation measures
1.3 Equilibrium conditions and endogenous variables
1.4 Parameters used in simulations
1.5 Values of misallocation measures used in the counterfactuals
1.6 Counterfactuals
2.1 Descriptive statistics
2.2 Sectoral and urban income premia
2.3 Transitions across sectors and locations
2.4 Premia for switchers and stayers
2.5 Job top occupations and types
2.6 Premia for switchers and stayers by job type
2.7 Wage premia
2.8 Consumption premia
2.9 Premia with heterogeneity in Mincerian returns
2.10 Premia with additional jobs and home production
2.11 Premia with hours worked
2.12 Long run premia
2.13 Parameter estimates
2.14 Auxiliary models and selected coefficients
2.15 Coefficients of auxiliary regression models
2.16 Counterfactual: Aggregate income
2.17 Counterfactual: Sectoral allocation and productivity
2.18 Sectoral premia in counterfactuals
3.1 Probability of survival explained by determinants of profitability
3.2 Probability of being a exporter explained by determinants of profitability
A.1 Sectors in the sample
A.2 Countries in the sample
B.1 Premia with hours worked: Additional jobs
B.2 Hours worked
B.3 Premia over Time
B.4 Retrospective Recall

List of Figures

1.1 Revealed comparative advantage (RCA) measures for Colombia
1.2 Cutoff functions and selection effects of distortions
1.3 Effects of factor misallocation within industries on RCA
1.4 Effects of factor misallocation across industries on RCA
1.5 Allocative efficient RCA and observed RCA for Colombia
1.6 Colombian industries in the world distribution of RCA
1.7 Changes in Colombian RCA and their causes
1.8 Changes in determinants of Colombian RCA
1.9 Rankings of RCA for different values of κ and σ
1.10 Welfare gains and export growth from gradual reforms
2.1 Mean log income by employment history
3.1 TFPQ, TFPR and MRP distribution in Colombia
3.2 MRP distribution for selected industries
3.3 Removing intra- and inter-industry factor misallocation
3.4 TFP gains from factor reallocation in a closed economy
3.5 Sensitivity to production function specification and factor intensities
3.6 Sensitivity to elasticity of substitution across sectors
3.7 TFP gains from removing inter-industry misallocation and GDP per capita
C.1 Inter-sectoral gains and GDP per capita: Alternative specifications

Acknowledgements

I am especially grateful for the advice, guidance and support of my PhD supervisor, Matilde Bombardini. There are not enough words to thank her for all the time, effort and patience she has devoted to me during my doctoral studies. I think it is impossible to have a better supervisor. I am also indebted to Tomasz Święcki, co-author of the second chapter of this thesis and member of my thesis committee, from whom I have learnt a lot. His advice has always been timely and wise. I am also grateful to Amartya Lahiri and Vanessa Alviarez, members of my thesis committee, who have provided me with very important feedback whenever I have needed it.

I am also indebted to several members of the community of the Vancouver School of Economics (VSE) at the University of British Columbia, particularly to Jesse Perla, Yaniv Yedid-Levi, Victoria Hnatkovska and Giovanni Gallipoli for their helpful comments and collaboration; to my friends and classmates Alastair, Brad, Iain, João and Nouri for having worked as a team since the first day of our doctoral studies; and to Maureen Chin for her administrative support. I also thank several visitors at the VSE, especially Rodrigo Adão, Andrés Rodríguez-Clare, Svetlana Demidova, Chang-Tai Hsieh and Loukas Karabarbounis for their useful feedback, and the seminar participants at the 2017 CEA, Central Bank of Colombia, Universidad del Rosario, Universidad EAFIT and the Sauder-VSE trade group, for their comments. I am especially grateful for the financial support of the Faculty of Graduate Studies at the University of British Columbia, the Central Bank of Colombia and Colciencias during different stages of my doctoral studies.
I am also grateful to the Colombian Statistics Institute (DANE) for providing access to the manufacturing data.

Dedication

To Ximena, Eugenio, Myriam and Andrés.

Introduction

In recent years, a growing body of research has sought to understand how resource misallocation across heterogeneous agents can account for differences in aggregate outcomes across countries. This thesis studies some implications of micro-level resource misallocation across heterogeneous agents. For the empirical applications, the study uses data from different developing countries, where extensive evidence shows the problem is more relevant. Besides the introduction, this thesis comprises three main chapters.

The first chapter examines how firm-level factor misallocation can affect an open economy's comparative advantage. In an economy where firms are heterogeneous in their physical productivity, an efficient allocation of resources implies that more productive firms demand larger amounts of production factors, up to the point where each factor's marginal revenue product is the same across all firms. Once we take measurement error into account, dispersion in the factors' marginal revenue products across firms suggests the presence of firm-level factor misallocation. First, I present empirical evidence on how those factor misallocation metrics are related to the observed patterns of comparative advantage in Colombia, a country whose manufacturing survey allows one to separate out components of efficiency and demand from the usual physical productivity measures. As a comparative advantage measure, I use the estimates of an exporter-industry fixed effect derived from a gravity equation.
I find that the factor misallocation metrics have a quantitative importance similar to that of the “natural” sources of comparative advantage (Ricardian and Heckscher-Ohlin sources) in explaining comparative advantage. Next, I explore the general equilibrium effects of firm-level misallocation in an open economy and their role in shaping industry export capabilities. To do this, I introduce a general equilibrium model of international trade with endogenous selection of heterogeneous firms in which the factor allocation is inefficient. I compute a counterfactual in which factor misallocation is removed in Colombia. The factor reallocation allows Colombia to specialize in industries with “natural” comparative advantage and generates a substantial change in the country's industrial composition, which leads to a rise in the exports-to-GDP ratio of 18 percentage points. This industrial composition effect is absent in the workhorse models of firm-level factor misallocation under closed economies.

Based on a co-authored paper with Tomasz Święcki, in the second chapter we inquire into the source and aggregate implications of the large income gaps between agricultural and non-agricultural workers in developing countries. We use panel data from the Indonesia Family Life Survey to document that workers who move out of agriculture see an income gain of around 20%, while those who move into agriculture see a similar income loss, even if they stay in the same location. Without controlling for individual heterogeneity, the income premia are even larger, suggesting that sorting of workers occurs and is important. We explore whether those premia can be reconciled with an efficient sorting of workers based on comparative advantage alone. We conclude that, in principle, they can. The reason is that the industry premium on its own has little empirical content.
We show this by extending a standard self-selection model based on both permanent and transitory components of comparative advantage to include different types of barriers to sectoral mobility, in particular utility costs of switching sectors and frictions preventing individuals from working in their preferred sectors. We demonstrate that the same cross-sectional and within-worker non-agriculture premia can be rationalized by different combinations of comparative advantage shock processes and barriers to mobility, and hence the premia alone cannot tell us whether there is misallocation or not. However, we argue that the comparative advantage process and barriers to mobility can be separately identified once we impose some parametric structure and exploit a richer set of moments of the joint sector-income distribution over time. We use indirect inference for our model's structural estimation, where the selected auxiliary models are the main reduced-form regressions that characterize the moments we are interested in. Our findings suggest that, although both types of sectoral mobility barriers significantly improve the overall fit of the model compared to the frictionless specification, the model that recognizes that not all sectoral transitions are voluntary fits the data considerably better. We conduct a counterfactual in which frictions are removed entirely in this latter specification. Removing all intersectoral mobility barriers would prompt 30% of the workforce to reallocate across sectors. Since the initially misallocated workers reap large income gains from the reallocation (their income doubles on average), the adjustment has a sizable effect on aggregate output, raising it by 17%.

The third chapter extends, in two dimensions, the usual framework for computing the total factor productivity (TFP) gains from removing factor misallocation in a closed economy (Hsieh and Klenow (2009)). First, I account for idiosyncratic demand shocks.
This extension is useful to test the ability of the usual factor misallocation metrics to explain plants' survival, a test that has been used recently to claim that misallocation measures are empirically swamped by other profitability determinants, mainly demand shocks (Haltiwanger et al. (2018)). I obtain similar empirical findings using Colombian data, for which the availability of firm-level price indices allows me to recover measures of demand shocks. However, I argue that explaining plants' survival with only one profitability determinant produces biased estimates, and that including endogenous selection in the model can rationalize both the signs of the bias and the empirical findings, thereby addressing those objections. Second, I account for the possibility that production factors are misallocated both within and across industries. I provide closed-form solutions to evaluate the aggregate TFP gains from removing each type of misallocation, offering a simpler computation relative to the methods proposed in the literature. Using data from Colombia and China, I show that the contribution of inter-industry misallocation can be as high as 35% of the total gains from removing factor misallocation. Moreover, given the relevance of the inter-industry type for the total gains from removing misallocation, I use cross-country data to document that including this type of misallocation can amplify the usual TFP gaps attributed to factor misallocation based exclusively on intra-industry reforms.

Related literature

The literature on the sources and implications of resource misallocation across heterogeneous agents has grown exponentially in recent years. An extensive review of the topic is beyond the scope of this thesis; see Restuccia and Rogerson (2013) or Hopenhayn (2014a) for this purpose.
Instead, I focus my attention on the papers most relevant to the specific topics studied in each chapter of this thesis.

The first chapter is mainly related to the literature that evaluates the effects of trade in open economies with resource misallocation, particularly Ho (2012), Tombe (2015), Święcki (2017), Caliendo et al. (2017) and Costa-Scottini (2018). My focus differs from that of those papers. Instead of analyzing the effect of trade liberalization in a distorted economy, my objective is to evaluate the impact of the observed resource misallocation on the patterns of a country's comparative advantage obtained from bilateral trade flows. Ho (2012), who evaluates India's trade liberalization in a two-country multi-sector setting with firm-level wedges, and Costa-Scottini (2018), who studies the gains from trade and from removing intra-industry misallocation in a multi-country setting with size-dependent factor distortions, are the papers with the closest models to the one used in this chapter. Although my multi-country, multi-sector, and multi-factor framework shares some features with those models, it differs in several aspects.¹ The empirical implementation is also different, since it does not rely on the calibration of large sets of parameters to obtain counterfactuals.² In turn, Tombe (2015), Caliendo et al. (2017), and Święcki (2017) use multi-sector and multi-country models to study welfare and the gains from trade in the presence of sectoral frictions, and thus only inter-industry misallocation. Instead of an Eaton and Kortum (2002) type of model, my framework relies on a Melitz (2003) type.
It generates endogenous ex-post misallocation across industries as a consequence of differences in the moments of the distributions of factor distortions across sectors, which allows for interactions between intra- and inter-industry misallocation.

The model I use in the first chapter has the same interactions between country, industry, and firm characteristics in general equilibrium as the multi-factor models that exhibit factor reallocations both within and across industries in response to trade shocks, particularly Bernard et al. (2007) and Balistreri et al. (2011). The introduction of resource misallocation generates a new source of comparative advantage that alters the frictionless trade equilibrium. Instead of a full characterization of the properties of the inefficient equilibrium, my focus is mostly on the implications of allocative inefficiency for the patterns of industrial specialization. Therefore, my primary interest lies in the counterfactual exercise of removing the misallocation. Finally, the first chapter is also related to the trade literature that uses gravity equations to derive indirect measures of relative export capability, as in Costinot et al. (2012), Hanson et al. (2015), Levchenko and Zhang (2016), and French (2017). I use the same approach to obtain revealed comparative advantage measures, which are the main metric of interest in my counterfactual exercises.

¹ First, since my main focus is on comparative advantage, I let misallocation arise in any factor market. This can distort industries' advantages in unit costs based on the relative size of the countries' factor endowments (Heckscher-Ohlin forces). Second, I do not constrain factor distortions to be size-dependent. With size-dependent distortions, the model behaves exactly as a Melitz model with a unique physical productivity cut-off. Thus, the selection effects of distortions do not generate rank reversals, which are necessary to obtain the large TFP gaps attributed to factor misallocation (Hopenhayn, 2014a,b). Third, my framework accounts for both intra- and inter-industry misallocation.

² Instead, I use the “exact hat algebra” method proposed by Dekle et al. (2008), which is not demanding in terms of data requirements.

The second chapter is related to the literature that inquires into the origin of income gaps between agricultural and non-agricultural workers in developing countries. Those gaps have been documented for decades (see Lewis (1955) or Rostow (1960) and, for more recent evidence, Vollrath (2014) or Herrendorf and Schoellman (2018)). There are two main hypotheses in the literature. The first one is that the observed gaps are a manifestation of barriers to mobility, implying that labor is misallocated and hence that there are efficiency gains from reallocating workers. This is the spirit of Restuccia et al. (2008), Adamopoulos et al. (2017) or Bryan and Morten (2018). The alternative explanation is that residual income gaps are the result of sorting of workers across sectors or locations based on unobservable characteristics. This mechanism is the explanation for the urban premium proposed by Young (2013) and for the non-agriculture premium documented by Alvarez (2018), who build on Lagakos and Waugh's (2013) adaptation of the Roy (1951) model. In this view, residual gaps across sectors or locations exist despite the efficient allocation of labor.

We document that sorting does occur and is important. However, a frictionless allocation in a sorting model is not able to generate the magnitude of the within-individual sectoral premium and at the same time replicate the main moments of the joint sector-income distribution over time. This is why we have to combine both strands of literature. We evaluate the importance of utility costs related to switching sectors (Dixit and Rob (1994), Cameron et al. (2007), Artuç et al.
(2010), Dix-Carneiro (2014)) and frictions preventing individuals from working in their preferred sectors (akin to search costs as outlined in Taber and Vejlin (2016), which can be rationalized by on-the-job search frictions (Gautier et al. (2010), Gautier and Teulings (2015)), for example), to enable the sorting model to fit the main features of the data.

The third chapter is related to the recent literature on the implications of firm-level factor misallocation for aggregate productivity in more general settings than the one used in the pioneering works of Restuccia and Rogerson (2008) and Hsieh and Klenow (2009). I augment the closed economy model to account for idiosyncratic demand shocks, an extension that does not affect the model's main logic but allows me to test the ability of the misallocation metrics to explain plants' survival while controlling for the full set of profit determinants, as in Haltiwanger et al. (2018). I argue that including endogenous selection in the model, as in Bartelsman et al. (2013), Adamopoulos et al. (2017) or Yang (2017), can rationalize the apparent lack of empirical content of the misallocation measures in predicting plants' exit. I also incorporate both intra- and inter-industry misallocation, as in Oberfield (2013) or Brandt et al. (2013). I offer simpler closed-form formulas for the gains from removing each type of misallocation, and hence a more straightforward way to compute each type's contribution, plus empirical evidence on the importance of inter-industry misallocation and its potential role in explaining TFP gaps across countries.

Chapter 1

Firm-level Factor Misallocation and Comparative Advantage

1.1 Introduction

What are the implications of firm-level factor misallocation in open economies?
Most of the literature on the effects of resource misallocation on aggregate economic performance has focused on closed economies.³ In open economies, if the extent of factor misallocation varies not only across countries but also across industries, it could also shape comparative advantage.⁴ For example, consider the broad range of industrial policies that several East Asian countries introduced during the post-war period, intended to promote some strategic industries. Such policies could have generated not only a reallocation of factors towards targeted industries but also an increase in resource misallocation across firms within those sectors, given the distortionary nature of some of the instruments used: selective investment tax credits, public enterprises, depreciation allowances, etc.⁵ Thus, the likely improvement in the export capability of targeted sectors due to the reduction in the average cost of the factors, compared to untargeted industries, could have been countered by decreases in their sectoral TFP, due to their larger extent of within-industry factor misallocation. A relevant question here is then how to assess the role of those policies in shaping comparative advantage through their effect on the allocation of resources. Did those policies accentuate or distort the “frictionless” patterns of industrial specialization?

This chapter explores how firm-level factor misallocation can influence the core determinants of industries' export capabilities in an open economy, and hence the patterns of industrial specialization. I do this by addressing the following two questions. First, does resource misallocation explain observed industries' export capabilities once we control for the “frictionless” sources of comparative advantage?
Second, if so, what are the implications of removing such misallocation for the comparative advantage of a country and its industrial composition, taking into account general equilibrium effects?

[3] In the trade literature, most of the analysis has approached the issue from a different angle: the effect of trade on a metric of firm-level misallocation, such as mark-up dispersion (Epifani and Gancia (2011), Edmond et al. (2015)) or how much plant survival depends on productivity (Eslava et al. (2013)). Others have studied the welfare effects of trade liberalization in economies with factor misallocation; those papers are mentioned below.

[4] I use the term comparative advantage to describe the differences in the average unit cost of a good across industries relative to the same differences in a reference country. Hence, the sources of comparative advantage comprise all primitive variables that affect the three determinants of the unit costs in an industry: sectoral average productivities, factor prices and the number of varieties produced. Those sources include not only “natural” differences in technology distributions or factor endowments, but also, in a world with economies of scale, differences in the primitive determinants of industries’ scale (i.e. entry barriers) and, as I show below, the extent of factor misallocation within and across industries in allocatively inefficient economies.

[5] For details of East Asian industrial policies, see for example Rodrik (1995), Chang (2006) or Lane (2017).

To verify the role of firm-level factor misallocation as a determinant of comparative advantage, I first present empirical evidence on how standard metrics of firm-level misallocation are related to the observed patterns of export capability of Colombian industries, once we control for the “natural” determinants of comparative advantage. 
The choice of Colombia is due to the fact that its manufacturing firm-level data, considered among the richest in the world (De Loecker and Goldberg (2014)), offer a better understanding of the role of firms’ efficiency in aggregate productivity. A unique feature of the data is the possibility of obtaining direct measures of firms’ physical productivity (TFPQ) using plant-level deflators for firms’ inputs and outputs. Those measures of TFPQ allow me to decompose the contributions of efficiency, demand shocks and factor distortions to sectoral TFP. As my metric of export capability, I use the estimates of the exporter-industry fixed effect derived from a gravity equation, an approach that has gained popularity as a measure of “revealed” comparative advantage, RCA hereafter (Costinot et al. (2012), Levchenko and Zhang (2016), Hanson et al. (2015), French (2017)). I regress the Colombian RCA measure relative to the United States on indicators of both intra- and inter-industry misallocation, exploiting their variation over time. I control for the “natural” sources of comparative advantage using total endowments interacted with factor intensities and efficient sectoral productivities, which capture Heckscher-Ohlin and Ricardian forces respectively. I find that firm-level misallocation is quantitatively relevant for shaping Colombian RCA, with a magnitude similar to the one observed for the “natural” determinants.

Next, I examine the general equilibrium channels through which firm-level factor misallocation can shape industries’ relative unit costs and hence comparative advantage. This exploration, which is the main contribution of this chapter, takes into account several adjustments that are absent when removing factor misallocation in a closed economy. For example, consider first the impact of firm-level misallocation within industries only. As is well known, this type of misallocation generates losses in sectoral TFP. 
In a closed economy setting with a fixed mass of firms, as in Hsieh and Klenow (2009), HK hereafter, the gains in sectoral efficiency from removing intra-industry misallocation do not generate reallocation of factors across sectors under the standard two-tier (Cobb-Douglas-CES) demand system.[6] Instead, in an open economy, even with the same demand structure and a fixed mass of firms, sectoral revenue shares are endogenously determined and depend not only on how substitutable goods are across sectors, but also on the gains from industrial specialization. Removing intra-industry misallocation in a country leads to two types of adjustments in factor prices that are absent in a closed economy. First, it produces a change in relative factor prices across countries to restore trade balance, a result analogous to the introduction of a set of sector-specific productivity shocks in standard Ricardian models. Second, it changes relative real factor returns depending on the adjustment of the relative prices of goods, as in the standard Heckscher-Ohlin model.

[6] Constant revenue shares across sectors imply that the efficiency gained by each industry, translated into a lower aggregate price index, is automatically followed by an increase in demand, so there are no inter-industry factor reallocations and relative factor prices do not adjust. Under a more general demand system (two-tier CES) there is reallocation of factors across sectors but, abstracting from inter-industry misallocation, the effect on factor prices is marginal (see HK and Chapter 3).

Furthermore, when allowing for endogenous entry and selection across firms, as in the closed economy models of Bartelsman et al. (2013), Yang (2017) or Adamopoulos et al. (2017), TFP gains and their general equilibrium effects on factor prices are magnified by the adjustment in the extensive margin (the number of operating firms) after removing misallocation. 
This effect is sizable since it involves a drastic recomposition of incumbent firms: a withdrawal of low-efficiency firms that survived because of factor misallocation plus the addition of potential high-efficiency firms that were not able to operate under allocative inefficiency. In monopolistically competitive industries this recomposition of firms can affect the scale of the sectors, which is a third channel that impacts industries’ relative unit costs. Finally, the marginal returns of the factors might differ on average across sectors, suggesting the presence of inter-industry misallocation as well. Simultaneously removing this type of misallocation affects the direction of sectoral factor reallocations and the magnitude of the adjustments in relative factor prices, which produces further adjustments in average productivities through firms’ selection effects.

To consider all these general equilibrium channels, I use a tractable multi-country, multi-factor and multi-sector model of international trade à la Melitz (2003) in which the allocation of factors across heterogeneous firms is inefficient. I employ wedge analysis to characterize the observed dispersion in the marginal returns of the factors, abstracting from the underlying cause of misallocation, an approach introduced by Restuccia et al. (2008) and HK in this context and inspired by the business cycle literature.[7] Under this approach, each firm is represented by a draw of “true” efficiency (physical productivity or TFPQ) and a vector of wedges, whose elements represent the differences between the returns of each primary factor for the firm and the average returns in the economy. I derive a theoretically consistent gravity equation along the lines of Chaney (2008), Arkolakis et al. (2012) and Melitz and Redding (2014) that incorporates the impact of wedges on the determinants of bilateral exports, in particular on the exporter-industry fixed effect, my measure of RCA. 
I then investigate the effect of removing firm-level misallocation in a country on its bilateral exports and hence on its RCA.

[7] Wedge analysis was first developed as an accounting methodology in the business-cycle literature by Cole and Ohanian (2002), Mulligan (2005), Chari et al. (2007) and Lahiri and Yi (2009), among others. For recent uses in the literature on factor misallocation, see for example Adamopoulos et al. (2017), Brandt et al. (2013), Bartelsman et al. (2013), Gopinath et al. (2015), Hopenhayn (2014b), Oberfield (2013), Święcki (2017), Tombe (2015) and Yang (2017).

To this end, I obtain counterfactual equilibria by solving the model in relative changes, using the “exact hat algebra” method proposed by Dekle et al. (2008). Each counterfactual incorporates the whole set of general equilibrium effects of reallocating factors to their efficient allocation and is not demanding in terms of data requirements. I perform the exercises using a world composed of 47 countries and an aggregate rest of the world, three production factors and 25 tradable sectors, to evaluate the effect of Colombian firm-level factor misallocation on its comparative advantage schedule. I use Bils et al.’s (2017) method to estimate the dispersion in marginal products in the presence of additive measurement error in revenue and inputs. This methodology exploits the fact that in the absence of measurement error the elasticity of revenues with respect to inputs should not vary for plants with different average products. Hence, panel data can be used to back out the “true” marginal product dispersion by estimating how such elasticity changes for plants with different average products. 
Moreover, since overhead factors (necessary to account for endogenous selection) are analogous to an unobservable additive term in measured inputs, this methodology allows me to overcome the problem of measuring the variance of the marginal products of the factors directly from the dispersion of their average products in the presence of fixed costs, a key issue in models with self-selection of heterogeneous firms (Bartelsman et al. (2013)).

The results of the counterfactual exercise suggest that in Colombia resource misallocation plays a major role in shaping comparative advantage. In the case of an extreme reform in which factor misallocation is entirely removed within and across industries, the ratio of exports to manufacturing GDP rises by 18 p.p. and welfare, measured as real expenditure, grows by 75%.[8] The large boost in exports is due to the increase in the dispersion of the schedule of comparative advantage, which leads to higher degrees of industrial specialization in the frictionless equilibrium. For instance, the whole chemical sector (both industrial chemicals and other chemicals such as paints, medicines, soaps or cosmetics) climbs to the top of the national export capability ranking, and ends up in the first percentile of the counterfactual RCA world distribution. The opposite occurs in industries whose comparative advantage in the actual data seems to be due only to factor misallocation, particularly computer, electronic and optical products, transportation equipment, petroleum, and machinery and equipment. These four industries shrink and practically disappear, indicating a non-interior solution in the counterfactual equilibrium.[9]

The model also delivers a decomposition of the change in the RCA measure after removing factor misallocation into three terms, each of which corresponds to a single component of the relative unit cost across industries: the average TFP, factor prices, and the number of produced varieties. 
I find that the adjustment in the relative number of produced varieties (i.e., in the extensive margin), which is generated by the reallocation of factors across industries, contributes the most to the change in the RCA. This is because in the intensive margin the gains in average TFP relative to the rest of the world are offset in large part by the rise in relative factor prices, and the remaining effect does not vary much across industries.

[8] The growth in real expenditure is equivalent to the TFP gains in a closed economy model.

[9] The feasibility of non-interior solutions in multi-sector Pareto-Melitz type models is established by Kucheryavyy et al. (2017). Under a setup similar to the one used in this chapter, the general equilibrium is guaranteed to be unique, but it is not necessarily an interior solution.

The organization of this chapter is as follows. Section 1.2 presents the empirical motivation. I first introduce the empirical measure of RCA derived from a standard gravity equation, and next I propose a strategy to evaluate the impact of different metrics of Colombian factor misallocation on its comparative advantage. Section 1.3 introduces the theoretical model and derives the effect of firms’ wedges on the gravity equation, particularly on exporter-industry fixed effects, the measure of RCA. I also offer an overview of the general equilibrium channels that each type of misallocation can trigger, using model simulations under a simple parametrization. Section 1.4 presents the counterfactual exercise of removing firm-level misallocation in Colombia, to compute the effect of the two types of misallocation on its industries’ comparative advantage. I also evaluate some departures from the baseline model. Section 1.5 concludes.

1.2 Empirical motivation

In this section I present empirical evidence on how factor misallocation is related to the comparative advantage of a country. 
For this, I first introduce the empirical measure of RCA derived from a standard gravity equation and explain how this measure is linked to the relative producer price index. Next, I decompose the price index in terms of the “natural” sources of comparative advantage and metrics of factor misallocation. Finally, I propose a strategy to evaluate the relation between the metrics of factor misallocation and the measures of RCA, controlling for the “natural” sources of comparative advantage.

1.2.1 A measure of RCA

A wide range of the new trade models deliver a gravity equation, in which comparative advantage has an important role as a predictor of bilateral trade flows. In the generic formulation of the gravity equation, bilateral exports of country i to country j, denoted by $X_{ij}$, can be expressed as the combination of three forces: i) a factor that represents the “capabilities” of exporter i as a supplier to all destinations; ii) a factor that characterizes the demand for foreign goods of importer j; iii) a factor that captures the bilateral accessibility of destination j to exporter i, which combines trade costs and other bilateral frictions. The gravity equation can be estimated at the industry level, in order to reduce aggregation bias.[10] With cross-sectional data the standard procedure involves taking logs and estimating a regression with fixed effects:

\[ \ln X_{ijs} = \delta_{is} + \delta_{js} + \delta_{ij} + \varepsilon_{ijs} \tag{1.1} \]

where $\delta_{is}$, the exporter-industry fixed effect, characterizes factor i), the “capabilities” of exporter i in industry s; $\delta_{js}$, the importer-industry fixed effect, captures factor ii), the demand for foreign goods of importer j in industry s; and $\delta_{ij} + \varepsilon_{ijs}$ represents factor iii), the bilateral accessibility of j to i, a component that involves characteristics of the bilateral relation independent of the sector (distance, common language, etc.), absorbed by the exporter-importer fixed effect $\delta_{ij}$, plus sector-specific bilateral frictions and measurement error, represented by the term $\varepsilon_{ijs}$.

In this way, the 
estimate of the exporter-industry fixed effect characterizes a country’s relative productive potential in an industry and, given the structure of the gravity equation, it is “clean” of other determinants that affect bilateral trade flows. Since it is only identified up to a double normalization, that is, it has meaning only when it is compared to a reference country and industry, it can be interpreted as a measure of “revealed” comparative advantage (RCA), an approach that has increasingly gained relevance in the trade literature (Costinot et al. (2012), Hanson et al. (2015), and Levchenko and Zhang (2016)). In contrast to traditional measures of RCA, such as Balassa’s (1965) index, the fixed effect estimate is a valid measure of countries’ fundamental patterns of comparative advantage (French (2017)). Moreover, it has better statistical properties than Balassa’s index, especially lower ordinal ranking bias and higher time stationarity (Leromain and Orefice (2014)).

[10] For a detailed explanation of the necessary conditions for a trade model to yield a structural gravity equation, see Head and Mayer (2014). On the aggregation bias see Anderson and Yotov (2010, 2016).

Figure 1.1 displays for Colombia the RCA measures of the 25 manufacturing industries listed in Table A.1 of Appendix A.1. I rely on the CEPII trade and production database, developed for de Sousa et al. (2012). I use bilateral trade flows among 47 countries plus a rest of the world aggregate for 1995. The set of countries is listed in Table A.2 of Appendix A.1. Similar to Hanson et al. (2015), I use as the reference country and industry the mean over all countries and industries, so the RCA can be interpreted as a measure of Colombian industries’ capabilities relative to a “typical” country and a “typical” sector.[11] The logarithmic transformation in equation (1.1) poses two well-known econometric issues for estimation by OLS. 
First, zeros in bilateral exports are unlikely to be random in the data, and since OLS drops those observations, it introduces sample-selection bias. Second, the coefficients of log-linearized models estimated by OLS are biased in the presence of heteroskedasticity (Silva and Tenreyro (2006)). In Monte Carlo simulations, Head and Mayer (2014) find that the Tobit model proposed in Eaton and Kortum (2001) (EK-Tobit hereafter) and the Poisson pseudo-maximum-likelihood estimator (PPML hereafter) proposed in Silva and Tenreyro (2006) are the two estimation methods which, depending on the structure of the error of the underlying data generating process, produce unbiased coefficients for exogenous variables in a gravity formulation.[12] Thus, Figure 1.1 compares the estimates obtained by EK-Tobit (vertical axis) and PPML (horizontal axis). Notably, the ranking across sectors in the cross section is not strongly affected by the estimation method.

The determinants of the exporter-industry fixed effect vary according to the sources of comparative advantage in the theoretical model considered. 
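As an aside on the double normalization described above, the following is a minimal sketch (not the code used in the thesis) of how estimated exporter-industry fixed effects can be expressed relative to the mean over all countries and industries. The function name `log_rca` and the toy numbers are hypothetical. In logs, the normalization subtracts the exporter mean and the industry mean and adds back the grand mean.

```python
def log_rca(delta):
    """Double-normalize exporter-industry fixed effects.

    delta[i][s] is the estimated fixed effect of exporter i in industry s.
    Returns ln RCA[i][s] = delta[i][s] - (mean over s) - (mean over i) + grand mean.
    """
    n, s_count = len(delta), len(delta[0])
    row_mean = [sum(row) / s_count for row in delta]
    col_mean = [sum(delta[i][s] for i in range(n)) / n for s in range(s_count)]
    grand = sum(row_mean) / n
    return [[delta[i][s] - row_mean[i] - col_mean[s] + grand
             for s in range(s_count)] for i in range(n)]


# Hypothetical fixed effects for 3 exporters and 2 industries.
delta_hat = [[1.0, 2.0], [0.5, 0.0], [3.0, 1.0]]
rca = log_rca(delta_hat)

# Shifting every industry of one exporter, or every exporter of one industry,
# by a constant leaves the normalized measure unchanged. This is why only the
# double difference carries information.
shifted = [[d + 10.0 * (i == 0) - 4.0 * (s == 1) for s, d in enumerate(row)]
           for i, row in enumerate(delta_hat)]
rca_shifted = log_rca(shifted)
```

By construction the normalized values average to zero across industries for each exporter and across exporters for each industry, so a positive entry reads as above-typical capability.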
However, a common feature across all standard models is that such determinants are collapsed in the reduced form of the relative producer price index at the industry level compared to a reference country, $\frac{P_{is}P_{i's'}}{P_{is'}P_{i's}}$, as a measure of the relative unit cost of producing across industries (French (2017)).[13] For example, in Ricardian models, as in Eaton and Kortum (2002), this ratio depends only on sectoral fundamental efficiencies, the source of comparative advantage at the heart of the Ricardian theory.[14] In a Heckscher-Ohlin model, as in Deardorff (1998), the ratio depends on the factor prices weighted by sectoral factor intensities, reflecting the balance between the relative sizes of factor endowments and the technology requirements. In the Krugman (1980) model, it depends only on the relative number of varieties produced, reflecting the effect of gains from variety in the aggregate price. In the Pareto version of Melitz (2003), the ratio is analogous to that in Krugman (1980), adjusted by the Pareto lower bound of the productivity distribution. Multi-factor models with heterogeneous firms, as in Bernard et al. (2007) or in this chapter, combine all mentioned sources of comparative advantage in the reduced form of the relative price index.

[11] Therefore, letting $\hat{\delta}_{is}$ be an estimate of $\delta_{is}$ in regression (1.1), the RCA of country i in sector s is defined as:
\[ RCA_{is} = \frac{\exp(\hat{\delta}_{is}) \Big/ \exp\left(\frac{1}{S}\sum_{s=1}^{S}\hat{\delta}_{is}\right)}{\exp\left(\frac{1}{N}\sum_{i=1}^{N}\hat{\delta}_{is}\right) \Big/ \exp\left(\frac{1}{S N}\sum_{s=1}^{S}\sum_{i=1}^{N}\hat{\delta}_{is}\right)} \]

[12] Under heteroskedasticity in the form of a constant variance-to-mean ratio PPML performs better, whereas under homoskedastic log-normal errors the Tobit proposed by Eaton and Kortum (2001) is preferred.

[13] Strictly, French (2017) shows that country i has comparative advantage in sector s, compared to country i′ and industry s′, if the relative price of country i in sector s in autarky is smaller than the same price in country i′: $\frac{\bar{P}_{is}\bar{P}_{i's'}}{\bar{P}_{is'}\bar{P}_{i's}} < 1$, where $\bar{P}_{is}$ is the counterfactual price index in industry s of country i in autarky.

[14] The implicit assumption is that sectors share the same intra-industry heterogeneity in the distribution of varieties’ productivities. If the heterogeneity varies across sectors, the productivity dispersion can be an additional source of comparative advantage (Bombardini et al. (2012a)).

The model with resource misallocation in an open economy in the next section delivers an analytical expression of the exporter-industry fixed effect taking into account endogenous entry and selection of firms, features that will provide a rich theoretical grounding for the RCA measure. However, at this point we can use the insights from the most well-known misallocation framework, Hsieh and Klenow (2009) (HK hereafter), to decompose the producer price index into its different determinants and empirically test whether the components due to firm-level misallocation are related to the metrics of RCA, once we control for the remaining sources of export capability.

1.2.2 Decomposing the price index under factor misallocation

The starting point in the HK framework to evaluate the implications of firm-level factor misallocation relies on the distinction between physical productivity (TFPQ, defined as the ratio of physical output to inputs) and revenue productivity (TFPR, defined as the ratio of revenues to inputs), first proposed by Foster et al. (2008). Assume a standard monopolistic competition framework in which firms differ in terms of efficiency (i.e. in TFPQ or Hicks-neutral productivity), but use the same constant-returns-to-scale technology in each industry. Moreover, assume firms face a CES demand, with the same elasticity of substitution in all industries. In this simple economy, if factor markets are frictionless the following two implications emerge: i) TFPR is equalized across firms within industries;[15] and ii) the sectoral TFP can be computed as a power mean of firms’ TFPQ. Any dispersion in firms’ TFPR within a sector is a signal of within-industry factor misallocation, and leads to a loss in sectoral TFP.

[15] This is simply because TFPR is the product of the firm’s price and TFPQ. With constant mark-ups, prices vary across firms only due to marginal costs. In turn, with all firms facing the same factor prices and the described technologies, the only source of variation in marginal costs is TFPQ. Hence, differences in TFPQ are perfectly translated into (the inverse of) prices, leaving TFPR invariant.

Of course, the reliability of the dispersion of TFPR as a measure of intra-industry factor misallocation depends on the plausibility of the assumptions considered. Some recent papers have tried to quantify the contribution of other possible sources of variation in TFPR that do not imply that factors are misallocated. These include departures from the model specification (heterogeneity in inputs, variable markups, adjustment costs, etc.) and pure measurement error. In Table 1.1 I present a brief survey of those contributions, each one derived from an extended structural model that takes into account the corresponding cause. The main conclusion is that, apart from measurement error, the remaining causes make a relatively small contribution to the dispersion in TFPR. In the case of measurement error, Bils et al. (2017) propose a method to compute the true dispersion in TFPR in the presence of additive and orthogonal measurement error in revenues and inputs, using panel data. The methodology exploits the fact that in the absence of measurement error the elasticity of revenues with respect to inputs should not vary for plants with different average products; see section 1.4.2 for a detailed explanation. 
In what follows I use Bils et al.’s (2017) methodology to obtain measures of intra-industry misallocation that correct for measurement error but, for tractability, and given the evidence cited above, I abstract from other causes of dispersion in TFPR.

More formally, assume that the production technology is Cobb-Douglas (CD) such that q units of variety m in a manufacturing industry s in country i are produced using a set of L homogeneous factors $z_l$ and factor intensities $\alpha_{ls}$: $q_m = a_m \prod_{l=1}^{L} z_{lm}^{\alpha_{ls}}$ (I omit industry and country subscripts for firm-specific variables). Denote firms’ revenue by $r_m$ and the inverse of the constant mark-up by $\rho$. The sectoral production function is then $Q_{is} = A_{is} \prod_{l=1}^{L} Z_{ils}^{\alpha_{ls}}$ (capital letters denote aggregates), where the sectoral TFP $A_{is}$ depends on the distribution of physical productivities and the extent of intra-industry factor misallocation. In frictionless factor markets the efficient sectoral TFP is the power mean of firms’ TFPQ, $(A^{e}_{is})^{\sigma-1} = \sum_{m} a_m^{\sigma-1}$, where $\sigma$ is the elasticity of substitution, and all firms face the same price for their homogeneous inputs, say $w_l$ for factor $z_{lm}$, leading to TFPR equalization across firms within industries, with values equal to $\frac{1}{\rho}\prod_{l=1}^{L} w_{il}^{\alpha_{ls}}$. Since the sectoral price index can be expressed as the ratio between the sectoral TFPR and the industry TFP, it can in turn be decomposed in terms of “natural” sources of comparative advantage and measures of factor misallocation as:

\[ \ln P_{is} = \ln TFPR_{is} - \ln A_{is} = \sum_{l=1}^{L} \alpha_{ls}\left[\ln(1+\bar{\theta}_{ils}) + \ln w_{il}\right] - \ln A^{e}_{is} - \ln AEM_{is} \tag{1.2} \]

where $AEM_{is}$ corresponds to the ratio of the sectoral TFP to the efficient one, $AEM_{is} \equiv A_{is}/A^{e}_{is}$, and $(1+\bar{\theta}_{ils})$ is defined as the ratio between the observed marginal revenue product (MRP) of factor l at the sector level, $\frac{\alpha_{ls}R_{is}}{Z_{ils}}$, and its return in the efficient allocation, $\frac{w_l}{\rho}$, that is: $(1+\bar{\theta}_{ils}) \equiv \frac{\rho\,\alpha_{ls}R_{is}}{w_l Z_{ils}}$. Those two ratios quantify the extent of resource misallocation. 
In the first case, $AEM_{is}$ characterizes the amount of within-industry factor misallocation, with $0 \le AEM_{is} \le 1$ and values closer to 1 reflecting less misallocation. According to the implications of the model, this measure is inversely related to the within-industry variance of TFPR.[16] In the second case, the sectoral wedge $(1+\bar{\theta}_{ils})$ characterizes the magnitude of inter-industry misallocation in factor l, and thus $\prod_{l=1}^{L}(1+\bar{\theta}_{ils})^{\alpha_{ls}}$ is a factor-intensity weighted measure of inter-industry misallocation.[17]

[16] In the case of a log-normal distribution of factor distortions across firms, the correlation is perfect; see Chen and Irarrazabal (2015) for the proof.

[17] The sectoral wedge $(1+\bar{\theta}_{ils})$ can also be computed as the weighted harmonic average of similar wedges at the firm level, with weights given by firms’ shares in sectoral revenue. See Chapter 3 for more details about the importance of this type of factor misallocation relative to within-industry misallocation in a closed economy.

Therefore, the decomposition in equation (1.2) reveals the theoretical determinants of the RCA measure under resource misallocation: i) the efficient TFP, $A^{e}_{is}$, which depends exclusively on the distribution of physical productivities across firms; ii) the geometric average of factor prices, $\prod_{l=1}^{L} w_{il}^{\alpha_{ls}}$, which in general equilibrium can be recovered as the interaction between factor endowments and intensities;[18] iii) the geometric average of inter-industry wedges, $\prod_{l=1}^{L}(1+\bar{\theta}_{ils})^{\alpha_{ls}}$, a measure of inter-industry misallocation; and iv) the measure of intra-industry misallocation, $AEM_{is}$. Notice that, since the first component is related to technical efficiency and the second component to relative factor abundance, they represent the “Ricardian” and “Heckscher-Ohlin” sources of comparative advantage, respectively, whereas the latter two terms summarize inter- and intra-industry resource misallocation. 
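To illustrate the intra-industry measure, the sketch below computes the ratio of actual to efficient sectoral TFP for a toy sector. It is not the thesis code and rests on simplifying assumptions that are mine: a single factor (L = 1), sectoral output aggregated as a CES with exponent ρ = (σ−1)/σ, and firms that price at a constant mark-up over their perceived marginal cost, so that input demand is proportional to a^(σ−1)(1+θ)^(−σ). The function names and the numbers are hypothetical.

```python
def efficient_tfp(a, sigma):
    """Efficient sectoral TFP: (A^e)^(sigma-1) = sum_m a_m^(sigma-1)."""
    return sum(ai ** (sigma - 1.0) for ai in a) ** (1.0 / (sigma - 1.0))


def sector_tfp(a, theta, sigma):
    """Sectoral TFP (output per unit of input) in a one-factor sector.

    a     : firm physical productivities (TFPQ)
    theta : firm wedges; firm m perceives the input price (1 + theta_m) * w
    sigma : elasticity of substitution across varieties (sigma > 1)
    """
    rho = (sigma - 1.0) / sigma
    # Assumed allocation rule: constant mark-up pricing with CES demand gives
    # input demand proportional to a^(sigma-1) * (1 + theta)^(-sigma).
    z = [ai ** (sigma - 1.0) * (1.0 + ti) ** (-sigma) for ai, ti in zip(a, theta)]
    q = [ai * zi for ai, zi in zip(a, z)]              # q_m = a_m * z_m
    big_q = sum(qi ** rho for qi in q) ** (1.0 / rho)  # CES aggregate output
    return big_q / sum(z)


def aem(a, theta, sigma):
    """Ratio of actual to efficient sectoral TFP."""
    return sector_tfp(a, theta, sigma) / efficient_tfp(a, sigma)


a = [1.0, 2.0, 4.0]  # hypothetical TFPQ draws
sigma = 3.0
aem_none = aem(a, [0.0, 0.0, 0.0], sigma)        # no wedges
aem_common = aem(a, [0.2, 0.2, 0.2], sigma)      # identical wedge for all firms
aem_dispersed = aem(a, [0.5, 0.0, -0.3], sigma)  # wedges differ across firms
```

Under these assumptions a wedge common to all firms leaves the ratio at one, since it does not change the relative allocation of the input within the industry; only the dispersion of wedges across firms lowers it.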
I use these four components (in logs) as explanatory variables in a regression of the RCA measure derived from the fixed effects, to test our hypothesis.

1.2.3 Relation between RCA and misallocation measures

Ideally, the suggested regression would require measures of the four variables for a large set of countries and industries, and thus comparable firm-level data for several countries. Given the infeasibility of this approach, I propose a two-stage strategy that exploits the time variation in the measures of RCA for Colombia relative to the United States (US) using panel data. In the first stage, I estimate the panel-data version of equation (1.1), allowing the fixed effects in each cross section to vary over time. That is, with data for the same set of countries in the period 1991-1998, I run the regression:

\[ \ln X_{ijst} = \delta_{ist} + \delta_{ijt} + \delta_{jst} + \varepsilon_{ijst} \tag{1.3} \]

where the exporter-industry-year fixed effect $\delta_{ist}$ identifies the triple difference of bilateral flows across exporters i and i′, sectors s and s′ and years t and t′; that is, the variation of $RCA_{is}$ between time t and t′, denoted by $dRCA_{ist}$. To compute $dRCA_{ist}$, instead of global means, I take as the reference country i′ the US, as the reference year t′ the first year in the panel (1991), and as the reference industry s′ the sector with the median number of zero bilateral flows in the data (footwear).[19] In the second stage, I regress the estimates of $dRCA_{ist}$ for Colombian industries on the four theoretical determinants of comparative advantage, constructed using micro-level data. 
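The triple difference behind the first-stage measure can be sketched as follows. This is only an illustration under assumed inputs: the dictionary of fixed effects, its keys and its values are hypothetical, and the function returns the log of the measure, that is, the double difference (exporter versus reference country, industry versus reference industry) in year t minus the same double difference in the reference year.

```python
def d_log_rca(delta, i, s, t, ref_i, ref_s, ref_t):
    """Log change in RCA of exporter i in industry s between ref_t and t.

    delta maps (exporter, industry, year) to an estimated fixed effect.
    """
    def double_diff(year):
        own = delta[(i, s, year)] - delta[(i, ref_s, year)]
        ref = delta[(ref_i, s, year)] - delta[(ref_i, ref_s, year)]
        return own - ref

    return double_diff(t) - double_diff(ref_t)


# Hypothetical estimates for two exporters, two industries and two years.
delta_hat = {
    ("COL", "chemicals", 1991): 0.2, ("COL", "footwear", 1991): 0.5,
    ("USA", "chemicals", 1991): 1.0, ("USA", "footwear", 1991): 0.4,
    ("COL", "chemicals", 1995): 0.6, ("COL", "footwear", 1995): 0.5,
    ("USA", "chemicals", 1995): 1.1, ("USA", "footwear", 1995): 0.4,
}

change = d_log_rca(delta_hat, "COL", "chemicals", 1995, "USA", "footwear", 1991)
```

The measure is zero by construction in the reference year, for the reference industry and for the reference country, so it only carries information in the remaining cells.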
Each variable is transformed to be expressed as a double difference, first with respect to the reference industry and second with respect to the reference year, and is then normalized by the corresponding difference in the producer price index in the US (obtained from the NBER-CES manufacturing database), using the same reference industry and year.[20]

[18] In particular, if we set $w_l = \rho R \big/ \sum_{s} \frac{Z_{ls}}{\alpha_{ls}}$, where R is total revenue ($\sum_{s} R_s$), the values satisfy the solution for relative factor prices in general equilibrium for an allocatively efficient closed economy, given by $\frac{w_l}{w_k} = \frac{\bar{Z}_k \sum_{s}\alpha_{ls}\beta_s}{\bar{Z}_l \sum_{s}\alpha_{ks}\beta_s}$, where $\bar{Z}_l$ is the total endowment of factor l (see Chapter 3).

[19] Therefore, letting $\hat{\delta}_{ist}$ be an estimate of $\delta_{ist}$ in regression (1.3), $dRCA_{ist}$ of country i in sector s at time t is defined as:
\[ dRCA_{ist} = \frac{\dfrac{\exp(\hat{\delta}_{ist})}{\exp(\hat{\delta}_{is't})} \Bigg/ \dfrac{\exp(\hat{\delta}_{i'st})}{\exp(\hat{\delta}_{i's't})}}{\dfrac{\exp(\hat{\delta}_{is,91})}{\exp(\hat{\delta}_{is',91})} \Bigg/ \dfrac{\exp(\hat{\delta}_{i's,91})}{\exp(\hat{\delta}_{i's',91})}} \]
where i′ = US and s′ = Footwear (7). As I show below, the results are not very sensitive to the choice of s′.

The introduction of the time dimension poses an additional challenge for the fixed effects estimators. In particular, we must consider the incidental parameter problem (Neyman and Scott (1948)), which generates an asymptotic bias for the fixed effects estimators when the number of time periods is small. Fernández-Val and Weidner (2016) prove that under exogenous regressors this bias is zero in a Poisson model, which makes PPML preferable to EK-Tobit as the estimation method in the first stage. Thus, Table 1.2 displays the results for the standardized coefficients of the regression in the second stage, using PPML to obtain the exporter-industry-year fixed effects in the first stage. 
The estimation of the second stage is by weighted OLS, using the reciprocal of the error variance in the first stage as the weighting matrix.[21] In the first column I present the results for the measure of intra-industry misallocation $AEM_{is}$, based on the direct measures of firms’ TFPQ using plant-level deflators for firms’ inputs and outputs, which allows me to isolate the influence of demand shocks. In the second column, I use instead as the measure of intra-industry misallocation the within-industry variance of firms’ TFPR, corrected for measurement error following Bils et al.’s (2017) methodology.

In both specifications, the measures of intra- and inter-industry misallocation, once we control for the “natural” sources of export capability, are significantly correlated with our RCA measure and display the expected signs: positive for the intra-industry misallocation measure $AEM_{is}$ (negative in the case of the within-industry variance of TFPR) and negative for the inter-industry misallocation measure $\prod_{l=1}^{L}(1+\bar{\theta}_{ils})^{\alpha_{ls}}$. Moreover, the magnitude of the standardized coefficients suggests that both types of misallocation have a similar impact on Colombian RCA, and that they are no less important than the “Ricardian” and “Heckscher-Ohlin” determinants. These correlations are robust to the choice of the reference industry and the aggregation of countries. For instance, in column 3 I replicate the first specification using the sector with the lowest number of zeros as the reference industry (machinery exc. electrical), whereas in column 4 I aggregate the 48 countries into 20 regions. The results are qualitatively similar. Therefore, the empirical evidence suggests that resource misallocation can play a role in shaping the schedule of comparative advantage in Colombia. 
The model in the next section offers theoretical grounding to this insight.

1.3 A model of firm-level misallocation in an open economy

In this section, I introduce a model of international trade à la Melitz (2003) in which the allocation of factors within and across industries is inefficient. Next, I derive a theoretically consistent gravity equation following the lead of Arkolakis et al. (2012) and Melitz and Redding (2014), assuming certain restrictions on the ex-ante joint distribution of TFPQ and factor distortions. Finally, I study the effects of both intra- and inter-industry factor misallocation on the reduced-form expression of the exporter-industry fixed effect derived from the gravity equation, my measure of RCA, using model simulations under a simple parametrization.

$^{20}$ This transformation intends to reflect the fact that the variation in RCA should be related to the change in the relative producer price indices compared to the same change in the country of reference:

$$dRCA_{ist} = F\left(\left(\frac{P_{ist}}{P_{is't}}\Big/\frac{P_{is0}}{P_{is'0}}\right)\Bigg/\left(\frac{P_{i'st}}{P_{i's't}}\Big/\frac{P_{i's0}}{P_{i's'0}}\right)\right).$$

Notice that in this approach we compare the growth in relative prices (with respect to the reference year) across countries, so any difference in the measurement of relative prices across countries is absorbed by the difference over time.

$^{21}$ The use of weighted OLS seeks to alleviate the impossibility of bootstrapping standard errors. Given the high dimensionality of the set of fixed effects involved in the non-linear regression by PPML in the first stage, the estimation is infeasible in standard econometric software such as Stata, so I take advantage of the sparsity pattern of the problem and use a specialized solver that deals efficiently with sparse problems (SNOPT). However, the estimation is still highly time consuming.

1.3.1 Model setup

Denote by $m$ a single variety, $i$ the exporting country, $j$ the importing country, $s$ an industry and $l$ a homogenous production factor.
Assume there are $N$ possibly asymmetric countries, $S$ industries and $L$ homogenous primary factors. Hereafter capital letters denote aggregates, lower case letters firm-specific variables and, for simplicity, I again omit sector subscripts for firm-specific variables. Each country $i$ consumes according to a two-tier utility function, with an upper-level CD with expenditure shares $\beta_{is}$ across sectors and a lower-level CES with elasticity of substitution $\sigma$ across varieties; let $\rho = \frac{\sigma-1}{\sigma}$. Each firm produces a variety $m$ using $L$ homogenous primary factors (each one denoted by $z_{ilm}$) and a CD production technology with factor intensities $\alpha_{ls}$ (different factor intensities across industries, but equal for the same industry across countries). Firms are characterized by a Hicks-neutral physical productivity (TFPQ) $a_{im}$ and a vector of $L$ factor distortions, $\vec{\theta}_{im} = \{\theta_{i1m}, \theta_{i2m}, \ldots, \theta_{iLm}\}$, which are drawn from a joint ex-ante distribution $G_{is}(a, \vec{\theta})$. There is a fixed cost of production $f_{is}$ in terms of the composite input bundle, and each industry faces an exogenous probability of exit $\delta_{is}$.

There is a fixed cost $f^{x}_{ijs}$ to access market $j$ from country $i$ in sector $s$, defined in terms of the composite input bundle, and an iceberg-type transportation cost $\tau_{ijs} \geq 1$, with $\tau_{iis} = 1$. Let $w_{il}$ denote the price of factor $l$ in country $i$ in the absence of distortions, unobservable and common to all firms. Firms in country $i$ face an idiosyncratic distortion $\theta_{ilm}$ (given by the $l$-th element of $\vec{\theta}_{im}$) in the market of primary factor $l$, such that the input price perceived by the firm is $(1+\theta_{ilm})w_{il}$. Define $f_{ijs} = f^{x}_{ijs}$ if $j \neq i$; $f_{ijs} = f^{x}_{iis} + f_{is}$ otherwise (so the domestic market fixed cost incorporates both “market access” and fixed production costs, whereas the export cost includes only the market access cost).
The minimum “operational” cost to sell a variety $m$ of country $i$ in country $j$ is:

$$c_{ijm}(q_{ijm}) = \omega_{is}\Theta_{im}\left(\frac{\tau_{ijs}\,q_{ijm}}{a_{im}} + f_{ijs}\right) \tag{1.4}$$

where $\Theta_{im} = \prod_{l}^{L}(1+\theta_{ilm})^{\alpha_{ls}}$ is a factor-intensity weighted geometric average of firm wedges and $\omega_{is} = \prod_{l}^{L}(w_{il}/\alpha_{ls})^{\alpha_{ls}}$ is the prevalent factor price of the composite input bundle for the firms with zero draws of $\vec{\theta}_{im}$. Hereafter I refer to this cost as the total “operational” cost, which includes the variable cost of production and the fixed costs of production and delivery. Notice that this is a standard cost function in a multi-factor Melitz-type setting; the only difference here is that the composite input bundle's price perceived by the firm is a combination of both distortions and the underlying factor prices. Moreover, this cost function could be derived from a primal problem considering the following technology to produce and deliver one unit of variety $m$ of country $i$ in country $j$:

$$q_{ijm} = \frac{a_{im}}{\tau_{ijs}}\left(\prod_{l}^{L} z_{ijlm}^{\alpha_{ls}} - f_{ijs}\right) = \frac{a_{im}}{\tau_{ijs}}\left(z_{ijm} - f_{ijs}\right) \tag{1.5}$$

Here $z_{ijlm}$ represents the total amount of primary factor $l$ “embedded” in the production and delivery of variety $m$ from country $i$ in country $j$, and $z_{ijm}$ the corresponding composite input bundle. Notice that $z_{ijlm}$ includes the demand of primary factor $l$ to pay both variable and fixed costs.

Profit maximization implies a firm charges a price $p_{ijm}$ in each destination $j$ equal to a fixed mark-up ($\rho^{-1}$) over its marginal cost: $p_{ijm} = \tau_{ijs}\Theta_{im}\omega_{is}/(\rho\, a_{im})$. Quantities, revenues and profits of variety $m$ from country $i$ sold in country $j$ are (respectively):

$$q_{ijm} = p_{ijm}^{-\sigma}E_{js}\left(P^{d}_{js}\right)^{\sigma-1}; \qquad r_{ijm} = p_{ijm}^{1-\sigma}E_{js}\left(P^{d}_{js}\right)^{\sigma-1}; \qquad \breve{\pi}_{ijm} = \frac{1}{\sigma}r_{ijm} - \omega_{is}\Theta_{im}f_{ijs} \tag{1.6}$$

where $E_{js}$ is the total expenditure of country $j$ in varieties of industry $s$ and $P^{d}_{js}$ the corresponding consumer price index, variables that are defined below. It is straightforward to show the following relation between revenues from destination $j$ and the corresponding total “operational” cost: $c_{ijm} = \rho\, r_{ijm} + \omega_{is}\Theta_{im}f_{ijs}$. Revenue productivity (TFPR) of selling variety $m$ in destination $j$, denoted by $\psi_{ijm}$, is the ratio between revenue and the input used in production: $\psi_{ijm} \equiv r_{ijm}/(z_{ijm} - f_{ijs}) = p_{ijm}a_{im}/\tau_{ijs} = \Theta_{im}\omega_{is}/\rho$. Notice that although this destination-specific TFPR is not directly observable, since the allocation of factors to production for a given destination is unobservable, profit maximization implies that firms equate this value across all destinations, as the natural consequence of the absence of destination-specific frictions at the firm level. Hence, total TFPR must coincide with this value. In the absence of frictions in factor markets, there is TFPR equalization across firms within an industry (factor intensities make TFPR vary across sectors) for all destinations. Thus, in an efficient allocation, a firm's performance with respect to its competitors depends uniquely on relative TFPQ. In contrast, in the presence of factor misallocation, firms with higher TFPQ or lower TFPR (due to a low geometric average of firm wedges, $\Theta_{im}$), holding the rest constant, set lower prices and hence sell higher quantities, obtaining higher revenues and profits in all markets.

Denote by $\xi_{ijlm}$ the marginal revenue product (MRP) of factor $l$ “embedded” in the production of variety $m$ from country $i$ to country $j$. Once again this MRP is not directly observable, but it is a useful concept to illustrate the consequences of factor misallocation. After some manipulation, it is possible to obtain the following relation between $\xi_{ijlm}$ and the total “operational” cost: $\xi_{ijlm} = \alpha_{ls}c_{ijm}/(\rho\, z_{ijlm})$. Notice that because of the presence of fixed costs, the MRP is no longer directly proportional to the average revenue product, a result emphasized in Bartelsman et al. (2013).
From the FOC of the cost minimization problem of the firm, we know that $(1+\theta_{ilm})w_{il}z_{ijlm} = \alpha_{ls}c_{ijm}$, which yields $\xi_{ijlm} = (1+\theta_{ilm})w_{il}/\rho$. That is, an efficient allocation of factors in an open economy requires MRP equalization across firms over all industries for all destinations and TFPR equalization within industries for all destinations,$^{22}$ but because of fixed costs, average revenue products are not equalized.

Firms produce for a given destination only if they can make non-negative profits. Since profits in each market depend on both TFPQ and TFPR, this condition defines a cutoff frontier $a^{*}_{ijs}(\Theta)$ for each destination $j$, such that $\breve{\pi}_{ijs}\left(a^{*}_{ijs}(\Theta), \Theta\right) = 0 \;\; \forall\, i, j, s$. For a given combination of factor wedges $\Theta$ of firms in country $i$ industry $s$, i.e., a given value of TFPR, $a^{*}_{ijs}(\Theta)$ indicates the minimum TFPQ required to earn non-negative profits in destination $j$. Define $a^{*}_{ijs}$ as the TFPQ cutoff value for firms with TFPR equal to $\omega_{is}/\rho$ in destination $j$, i.e., firms with draws of distortions equal to zero: $a^{*}_{ijs} \equiv a^{*}_{ijs}(1)$. It is straightforward to derive the specific functional form of the cutoff functions in terms of $a^{*}_{ijs}$ and $\Theta$:

$$a^{*}_{ijs}(\Theta) = a^{*}_{ijs}\,\Theta^{\frac{1}{\rho}} \quad \text{with} \quad a^{*}_{ijs} \equiv a^{*}_{ijs}(1) = \frac{\tau_{ijs}}{\rho}\left(\frac{E_{js}\left(P^{d}_{js}\right)^{\sigma-1}}{\sigma f_{ijs}}\right)^{\frac{1}{1-\sigma}}\omega_{is}^{\frac{1}{\rho}} \quad \forall\, i, j, s. \tag{1.7}$$

The function $a^{*}_{ijs}(\Theta)$ is increasing in $\Theta$ (and thus in TFPR), reflecting the fact that larger wedges imply a higher marginal cost of the inputs, which makes it more difficult to sell to the corresponding market. The existence of these cutoff functions, instead of unique threshold values for physical productivity, implies that the introduction of factor misallocation triggers selection effects that are absent in the efficient allocation.
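As a quick consistency check of the cutoff function (1.7) against the profit expression in (1.6), the sketch below evaluates profits exactly at the cutoff for several values of $\Theta$. Parameter values are illustrative assumptions, and `demand` is a hypothetical shorthand for $E_{js}(P^{d}_{js})^{\sigma-1}$.

```python
# Illustrative parameter values; none of them come from the thesis.
SIGMA = 4.0
RHO = (SIGMA - 1.0) / SIGMA


def profit(a, big_theta, omega, tau, f, demand):
    """Profit in one destination, following (1.6)."""
    price = tau * big_theta * omega / (RHO * a)
    revenue = price ** (1.0 - SIGMA) * demand
    return revenue / SIGMA - omega * big_theta * f


def cutoff(big_theta, omega, tau, f, demand):
    """TFPQ cutoff a*(Theta) = a*(1) * Theta**(1/rho), following (1.7)."""
    base = (
        (tau / RHO)
        * (demand / (SIGMA * f)) ** (1.0 / (1.0 - SIGMA))
        * omega ** (1.0 / RHO)
    )
    return base * big_theta ** (1.0 / RHO)


params = dict(omega=1.2, tau=1.4, f=0.8, demand=25.0)
wedges = (0.8, 1.0, 1.3)

cutoffs = [cutoff(t, **params) for t in wedges]
residuals = [profit(cutoff(t, **params), t, **params) for t in wedges]

print(cutoffs)     # increasing in Theta: higher TFPR requires higher TFPQ
print(residuals)   # numerically zero: the marginal firm breaks even
```

The exponent $1/\rho$ on $\Theta$ arises because the wedge raises both the marginal cost (exponent 1) and the cost of the fixed input bundle (exponent $1/(\sigma-1)$), and $1 + 1/(\sigma-1) = 1/\rho$.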
For example, some firms productive enough to operate in an undistorted counterfactual can no longer keep producing, either because their distortion draws turn their profits negative or because, even with a small “good” draw, the possible strengthening of competition due to the presence of highly positive distorted firms does not make it profitable for them to stay in the respective market. The opposite could occur with some low-productivity firms, which will be able to survive in each market, leading to misallocation of resources.$^{23}$

To analyze the selection effects of resource misallocation, notice first that all cutoff functions across destinations share the same functional form. Particularly, cutoff values for exporting to destination $j$ are

$$\Lambda_{ijs} = \tau_{ijs}\left(\frac{E_{js}\left(P^{d}_{js}\right)^{\sigma-1} f_{iis}}{E_{is}\left(P^{d}_{is}\right)^{\sigma-1} f_{ijs}}\right)^{\frac{1}{1-\sigma}}$$

times larger than domestic cutoff values. Thus, a simple representation of the firms in an open economy can be done in the space $a \times \Theta$, illustrated in Figure 1.2. In this space, each firm in sector $s$, characterized by a pair of draws $(a, \Theta)$, is represented by a single point. Profits are an increasing function of TFPQ and a decreasing function of TFPR, so firms with draws closer to the upper-left corner are more profitable. For simplicity, consider in Panel A the destination $j$ different from $i$ with the lowest ratio $\Lambda_{ijs}$ for country $i$ in sector $s$. Only firms with draws $(a, \Theta)$ above $a^{*}_{ijs}(\Theta)$ export to destination $j$, those with draws below $a^{*}_{ijs}(\Theta)$ and above $a^{*}_{iis}(\Theta)$ produce only for the domestic market, and those with draws below $a^{*}_{iis}(\Theta)$ do not produce. Panel B represents the selection mechanism that distortions trigger.
Let $\tilde{a}^{*}_{M}$ represent the domestic productivity cutoff value in an allocative efficient economy (Melitz economy), and $\tilde{\Lambda}_{ijs}$ the corresponding value of $\Lambda_{ijs}$.$^{24}$ In such an economy, firms with productivity above $\tilde{\Lambda}_{ijs}\tilde{a}^{*}_{M}$ export to $j$, those with productivity between $\tilde{\Lambda}_{ijs}\tilde{a}^{*}_{M}$ and $\tilde{a}^{*}_{M}$ produce only for the domestic market, and those with productivity less than $\tilde{a}^{*}_{M}$ do not produce. Thus, each cutoff function in the allocative inefficient economy creates two effects in the set of firms that sell to each market, which can be represented by two sets of areas: the regions under the density function that show firms that, as a consequence of distortions, can no longer produce (light dotted area A) or export to $j$ (light dotted area B), and the regions that display firms that, because of distortions, operate in the domestic market (dark dashed area A) or in the exporting market (dark dashed area B). The difference between dotted and dashed areas represents the net impact of distortions on the set of firms of country $i$ and sector $s$ operating in the domestic and country-$j$ markets (differences in A and B respectively).

$^{22}$ Notice also that TFPR of variety $m$ sold in destination $j$ can be expressed as a factor-intensity weighted geometric average of the MRP: $\psi_{ijm} = \prod_{l}^{L}(\xi_{ijlm}/\alpha_{ls})^{\alpha_{ls}}$.

$^{23}$ These selection channels are also present in the closed economy models of Bartelsman et al. (2013) and Yang (2017).

The timing of information and decisions is as follows. Each period, there is an exogenous probability of exit given by $d_{is}$. A total of $H_{is}$ potential entrants in country $i$ industry $s$ decide whether to produce and export to each destination conditional on their draws of physical productivity and distortions from $G_{is}$. All potential entrants pay a fee $f^{e}_{is}$ to draw from $G_{is}$, which is paid in terms of the composite input bundle.
The number of potential entrants is pinned down by the condition in which the expected discounted value of an entry is equal to the cost of entry. As usual in this kind of setup, let us consider no discounting and only stationary equilibria. Hence, the free entry condition is:

$$\sum_{j}^{N}\sum_{m}^{M_{ijs}} \breve{\pi}_{ijm} = \omega_{is} f^{e}_{is} H_{is} \quad \forall\, i, s \tag{1.8}$$

where $M_{ijs}$ denotes the mass of operating firms in sector $s$ of country $i$ that sell to country $j$. Aggregate stability requires that in each destination the mass of effective entrants is equal to the mass of exiting firms:

$$d_{is}M_{ijs} = \left[1 - G_{is}\left(a^{*}_{ijs}(\Theta), \Theta\right)\right]H_{is} \quad \forall\, i, j, s \tag{1.9}$$

Given CES demand and firms' prices, the consumer price index $P^{d}_{is}$ in country $i$ sector $s$ satisfies $\left(P^{d}_{is}\right)^{1-\sigma} = \sum_{k}^{N} P_{kis}^{1-\sigma}$, with:

$$P_{ijs}^{1-\sigma} = \left(\frac{1}{\rho}\,\omega_{is}\tau_{ijs}\right)^{1-\sigma}\sum_{m}^{M_{ijs}}\left(\frac{a_{im}}{\Theta_{im}}\right)^{\sigma-1} \tag{1.10}$$

Total expenditure in country $i$ and sector $s$ is $E_{is} = P^{d}_{is}Q^{d}_{is}$. By the upper-level utility function, the overall consumer price index (equal to unit expenditure) is $P^{d}_{i} = \prod_{s}^{S}\left(P^{d}_{is}/\beta_{s}\right)^{\beta_{s}}$ and satisfies $E_{is} = \beta_{s}E_{i}$, with $E_{i} = \sum_{s}^{S}E_{is}$ total country-$i$ expenditure.

Now consider the aggregate variables. Let $X_{ijs} = \sum_{m}^{M_{ijs}} r_{ijm}$ be the value of total exports from country $i$ to destination $j$ in industry $s$. Analogously to the firm level, the total “operational” cost of exporting to country $j$ incurred by all firms of country $i$ in industry $s$ can be written as $C_{ijs} = \rho X_{ijs} + F_{ijs}$, where $F_{ijs} = \sum_{m}^{M_{ijs}}\omega_{is}\Theta_{im}f_{ijs}$ is the value of total expenditures in fixed costs. Similarly, denote by $R_{is}$, $F_{is}$, $C_{is}$ the same aggregations but at the industry level, with $R_{i} = \sum_{s}^{S}R_{is}$ representing total country $i$'s gross output. Denote the HWA of primary factor-$l$ wedges $(1+\theta_{l})$ within industry $s$ as $(1+\bar{\theta}_{ils})$, with weights given by the firm's participation in $C_{is}$.

$^{24}$ In general, $a^{*}_{iis}$ and $\Lambda_{ijs}$ are not related to $\tilde{a}^{*}_{M}$ and $\tilde{\Lambda}_{ijs}$ respectively. In Figure 1.2 it is arbitrarily assumed $a^{*}_{ijs} > \tilde{a}^{*}_{M}$.
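The cost-weighted averaging of wedges can be illustrated with a small numerical sketch. It assumes that HWA stands for a harmonic weighted average, and it uses the firm-level cost-minimization condition $(1+\theta_{ilm})w_{il}z_{ilm} = \alpha_{ls}c_{im}$ to build factor demands. The three firms, the factor share `ALPHA` and the factor price `W` are hypothetical values chosen only for illustration.

```python
def harmonic_weighted_average(gross_wedges, costs):
    """Cost-weighted harmonic mean of gross wedges (1 + theta).

    Assumption: this is what the text calls the HWA of wedges, with
    weights equal to each firm's share in total operational cost.
    """
    total_cost = sum(costs)
    return 1.0 / sum((c / total_cost) / g for g, c in zip(gross_wedges, costs))


ALPHA, W = 0.4, 2.0                 # hypothetical factor intensity and factor price
gross_wedges = [0.7, 1.0, 1.6]      # (1 + theta_ilm) for three hypothetical firms
costs = [5.0, 3.0, 2.0]             # operational costs c_im of the same firms

# Firm-level factor demand implied by (1 + theta) * w * z = alpha * c
factor_demand = [ALPHA * c / (g * W) for g, c in zip(gross_wedges, costs)]

hwa = harmonic_weighted_average(gross_wedges, costs)
# Industry-level wedge recovered from aggregates only: alpha * C / (w * Z^o)
implied = ALPHA * sum(costs) / (W * sum(factor_demand))

print(hwa, implied)   # the two numbers coincide
```

Under these assumptions the average wedge can be recovered from industry aggregates alone (total operational cost and total factor use), without observing firm-level wedges.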
It is possible to show that $(1+\bar{\theta}_{ils}) = (\rho R_{is} + F_{is})\alpha_{ls}/(w_{il}Z^{o}_{ils})$, where $Z^{o}_{ils}$ is the aggregate demand of factor $l$ for “operational” uses in country $i$ in sector $s$: $Z^{o}_{ils} \equiv \sum_{j}^{N}\sum_{m}^{M_{ijs}} z_{ijlm}$. Thus, this average wedge is the industry-level analogue of firm-level wedges and allows me to measure the degree of inter-industry misallocation, as in the closed-economy framework of the previous section. The total demand of primary factor $l$ for “operational” uses in country $i$ industry $s$ can be expressed as:

$$Z^{o}_{ils} = \frac{\alpha_{ls}C_{is}}{w_{il}(1+\bar{\theta}_{ils})} \tag{1.11}$$

Primary factors are used for “operational” (fixed and variable costs) and investment (entry) costs. The sectoral demand of the composite input bundle for entry costs is simply $f^{e}_{is}H_{is}$. Therefore, the amount of primary factor $l$ allocated to entry costs in country $i$ sector $s$ is $Z^{e}_{ils} = \alpha_{ls}\omega_{is}f^{e}_{is}H_{is}/w_{il}$, and the total allocation of the same factor, $Z_{ils}$, is given by:

$$Z_{ils} = Z^{o}_{ils} + Z^{e}_{ils} = \frac{\alpha_{ls}C_{is}}{w_{il}(1+\bar{\theta}_{ils})} + \frac{\alpha_{ls}\omega_{is}f^{e}_{is}H_{is}}{w_{il}} \tag{1.12}$$

Notice that the inter-industry wedge only appears in the input allocated for operational uses. This is a consequence of the timing of the model, in which firms first allocate real resources (the entry fixed cost) to draw from the joint distribution. Only after this moment is the draw of the vector of distortions known to the firm. The factor-$l$ market clearing condition in country $i$ is then:

$$\bar{Z}_{il} = \sum_{s}^{S} Z_{ils} \tag{1.13}$$

where $\bar{Z}_{il}$ is the total endowment of primary factor $l$ in country $i$, and $Z_{ils}$ is given by (1.12). Finally, the balanced trade condition requires equalization of the total revenues to total expenditures plus aggregate deficits:$^{25}$

$$R_{i} = E_{i} + D_{i} \tag{1.14}$$

where $D_{i}$ is the country's trade balance (a positive value means surplus), an exogenous value in the model. Global trade balance requires $\sum_{i}^{N} D_{i} = 0$. A summary of the whole system of equations and unknowns is given in Table 1.3. This table also offers the dimensionality of the problem.

$^{25}$ By construction, total revenues are the sum of factor payments and profits: $R_{i} = \sum_{s}^{S}\sum_{l}^{L}(1+\bar{\theta}_{ils})w_{il}Z^{o}_{ils} + \sum_{s}^{S}\omega_{is}f^{e}_{is}H_{is}$. This can be shown by decomposing sectoral revenues as:

$$R_{is} = \rho R_{is} + \frac{1}{\sigma}R_{is} = \sum_{l}^{L}(1+\bar{\theta}_{ils})w_{il}Z^{o}_{ils} - F_{is} + \sum_{j}^{N}\sum_{m}^{M_{ijs}}\left(\breve{\pi}_{ijm} + \omega_{is}\Theta_{im}f_{ijs}\right)$$

where the second equality is derived from (1.11) and the aggregation of firms' revenues.

1.3.2 Comparative advantage

Bilateral exports at the industry level can be expressed in terms of sectoral expenditures in the importer country ($E_{js}$) and trade shares of the importer country ($\pi_{ijs}$). The latter term can be re-written in terms of the bilateral price indices as:

$$X_{ijs} = \pi_{ijs}E_{js} = \left(\frac{P_{ijs}^{1-\sigma}}{\sum_{k}^{N}P_{kjs}^{1-\sigma}}\right)E_{js} \tag{1.15}$$

The trade share of country $i$ in country-$j$ expenditures in goods of industry $s$ only depends on the value of its bilateral price index $P_{ijs}$, relative to the same value for all competitors of country $i$ in that market. As I commented earlier, this is so because the price index $P_{ijs}$ is a measure of the unit price incurred by consumers of the destination country, and hence it is an indicator of country-$i$'s competitiveness. To derive the reduced form of the exporter-industry fixed effect, consider the double difference of bilateral flows across exporters $i$ and $i'$ and sectors $s$ and $s'$ for a given importer $j$, i.e., $\frac{X_{ijs}X_{i'js'}}{X_{ijs'}X_{i'js}}$. It is straightforward to see that this double difference is given by the difference in the relative price index, $\left(\frac{P_{ijs}P_{i'js'}}{P_{ijs'}P_{i'js}}\right)^{1-\sigma}$. From (1.10) it is possible to disentangle these bilateral price indices as follows:

$$P_{ijs} = \tau_{ijs}\,M_{ijs}^{\frac{1}{1-\sigma}}\,\frac{\bar{\psi}_{ijs}}{A_{ijs}} \tag{1.16}$$

where $A_{ijs}$ and $\bar{\psi}_{ijs}$ are the industry-destination analogues of sectoral TFP and sectoral revenue productivity respectively,$^{26}$ so $A_{ijs}$ represents the overall efficiency of exporting firms to destination $j$ and $\bar{\psi}_{ijs}$ depicts the average cost of the factors faced by the same set of exporters. Therefore, equation (1.16) disentangles the four determinants of exporters' competitiveness: i) their overall efficiency, which is a weighted average of exporters' physical productivity and factor market frictions; ii) the average cost of factors for exporters; iii) the mass of exported varieties; and iv) bilateral trade costs. Of these components, factor misallocation has a direct impact on the average TFP and an indirect impact (through general equilibrium channels) on the formation of factor prices and the determination of the number of exported varieties. Notice also that the unit price is a combination of both extensive and intensive margins of trade. Thus, the model is very rich about the determinants of competitiveness. It is able to combine the sources of relative export capability in Ricardian and Heckscher-Ohlin models (where comparative advantage is due to differences in efficiency across industries in the first case and the interaction between the sizes of factor endowments and factor intensities across industries that pins down relative factor prices, in the second case) with the motives for intra-industry trade in monopolistic competition models with Dixit-Stiglitz preferences (where the gains-from-variety effect induces reductions in unit costs) in an environment of micro-level resource misallocation, which in turn can also create “artificial” comparative advantage. In the next subsection, I perform numerical simulations to disentangle the effects of both intra- and inter-industry misallocation on each component of the relative unit prices.

$^{26}$ That is: $A_{ijs} = \bar{\Theta}_{ijs}\left(\frac{1}{M_{ijs}}\sum_{m}^{M_{ijs}}\left(\frac{a_{im}}{\Theta_{im}}\right)^{\sigma-1}\right)^{\frac{1}{\sigma-1}}$ and $\bar{\psi}_{ijs} = \frac{\omega_{is}\bar{\Theta}_{ijs}}{\rho}$, where $\bar{\Theta}_{ijs} = \prod_{l}^{L}(1+\bar{\theta}_{ijls})^{\alpha_{ls}}$. Here $(1+\bar{\theta}_{ijls})$ denotes the HWA of factor-$l$ wedges of firms exporting to destination $j$ in industry $s$, with weights given by each firm's participation in the total cost of factors $C_{ijs}$.

At this point I need to impose a functional form for the joint distribution $G_{is}$ to derive the reduced-form equation of the exporter-industry fixed effect from the double difference in unit price. Let $G^{a}_{is}(a)$ be the univariate margin of $G_{is}$ with respect to $a$, and $G^{\theta}_{is}(\vec{\theta})$ the multivariate margin of $G_{is}$ with respect to $\vec{\theta}$.$^{27}$ Consider the following assumptions:

A.1. (Pareto distribution) $\forall\, a > \bar{a}_{is}$, $G^{a}_{is}(a) = 1 - \left(\frac{\bar{a}_{is}}{a}\right)^{\kappa}$; $\kappa > \sigma - 1$;

A.2. (Ex-ante independence) $G_{is} = G_{is}(a, \vec{\theta}) = G^{a}_{is}(a)\,G^{\theta}_{is}(\vec{\theta})$

First, regarding Assumption A.1., the Pareto distribution is the common benchmark in the trade literature to model heterogeneity in physical productivity in the Melitz model. Not only does it perform well empirically in approximating the observed distribution of firm size,$^{28}$ but it also makes the model analytically tractable, allowing me to derive a particular expression for the gravity equation. Second, although Assumption A.2. seems problematic given the observed correlation between TFPQ and TFPR in the data, it is worth emphasizing that the assumed independence is only between the latent (ex-ante) marginal distribution of TFPQ and that of the vector of factor distortions. The observed (ex-post) distribution can exhibit any kind of correlation. In fact, given the functional forms of the cutoff functions, endogenous selection in the model implies the positive ex-post correlations between TFPQ and TFPR observed in the data. Furthermore, there is no restriction on the joint distribution of individual factor distortions $G^{\theta}_{is}$, so covariances across factor wedges are completely allowed. I keep Assumptions A.1. and A.2. hereafter unless otherwise indicated.

Under Assumptions A.1.
and A.2., the model exhibits an interesting set of features and offers a great simplification, which is developed in detail in Appendix A.4.1 and summarized by the system of equations (1.21)-(1.24) below. First, it is possible to show that the property of a constant aggregate profits/revenue ratio of the Pareto-Melitz model still holds under factor misallocation: $R_{is} = \frac{\kappa}{\rho}\Pi_{is} = \frac{\kappa}{\rho}\omega_{is}f^{e}_{is}H_{is}$ (see equation (A.5) in Appendix A.4.1). Thus, market clearing conditions can be re-stated as:

$$w_{il}Z_{ils} = \alpha_{ls}\left\{(1+\bar{\theta}_{ils})^{-1}\left(1 - \frac{\rho}{\kappa}\right) + \frac{\rho}{\kappa}\right\}R_{is} \tag{1.17}$$

Notice that the HWA wedge $(1+\bar{\theta}_{ils})$ affects only the fraction of the total revenue that is allocated to “operational” costs: $1 - \frac{\rho}{\kappa}$. Denote the term in curly brackets by $v_{ils}$. Here, $v_{ils}$ measures the effective extent of inter-industry misallocation for primary factor $l$, considering all its possible uses (operational and entry costs). Let $v_{is}$ denote the factor-intensity weighted geometric average of these measures: $v_{is} = \prod_{l}^{L}v_{ils}^{\alpha_{ls}}$. Further, aggregate the sectoral demands of primary factors into an industry-level composite input bundle $Z_{is} = \prod_{l}^{L}Z_{ils}^{\alpha_{ls}}$. Thus, we can state $v_{is}R_{is} = \omega_{is}Z_{is}$ and hence $H_{is} = \frac{\rho Z_{is}}{\kappa f^{e}_{is}v_{is}}$, a solution for the mass of entrants similar to that obtained in the multi-sector Pareto-Melitz case (in which the mass of entrants is related to the total allocation of inputs in the sector). The only difference here is the presence of the inter-industry allocative inefficiency measure $v_{is}$, which affects the total allocation of factors across sectors.

$^{27}$ That is, $G^{a}_{is}(a) = \lim_{\vec{\theta}\to\infty}G_{is}(a, \vec{\theta})$ and $G^{\theta}_{is}(\vec{\theta}) = \lim_{a\to\infty}G_{is}(a, \vec{\theta})$.

$^{28}$ See for example Cabral and Mata (2003) or Axtell (2001).

Second, it is possible to derive a relationship between the ex-post HWA wedge and the ex-ante joint distribution of distortions.
Appendix A.4.2 shows that the following relation holds:

$$(1+\bar{\theta}_{ils}) = \frac{\Gamma_{is}}{\Gamma_{ils}} \tag{1.18}$$

where $\Gamma_{is} = \int_{\theta_{i1}}\cdots\int_{\theta_{iL}}\Theta_{i}^{1-\frac{\kappa}{\rho}}\,dG^{\theta}_{is}(\vec{\theta})$ and $\Gamma_{ils} = \int_{\theta_{i1}}\cdots\int_{\theta_{iL}}\frac{\Theta_{i}^{1-\frac{\kappa}{\rho}}}{(1+\theta_{il})}\,dG^{\theta}_{is}(\vec{\theta})$, terms that only depend on the ex-ante joint distribution of firm-level distortions $G^{\theta}_{is}$. Equation (1.18) makes evident the interaction between both types of factor misallocation under our assumptions and, depending on the parametric assumptions on the joint distribution $G^{\theta}_{is}$, it allows me to recover some structural parameters from the values of observed HWA wedges.

Third, regarding the gravity equation, I show in Appendix A.4.3 that relative bilateral exports can be expressed as:

$$\ln\left(\frac{X_{ijs}X_{i'js'}}{X_{ijs'}X_{i'js}}\right) = \ln\left[\frac{\rho_{is}\rho_{i's'}}{\rho_{is'}\rho_{i's}}\,\frac{\Gamma_{is}\Gamma_{i's'}}{\Gamma_{is'}\Gamma_{i's}}\,\frac{R_{is}R_{i's'}}{R_{is'}R_{i's}}\left(\frac{\omega_{is}\omega_{i's'}}{\omega_{is'}\omega_{i's}}\right)^{-\frac{\kappa}{\rho}}\right] + B_{ijs} \tag{1.19}$$

where $B_{ijs}$ and $\rho_{is}$ are constants that do not vary when we remove misallocation. The first term of the RHS of equation (1.19) is what $\delta_{is}$ identifies in the regression with fixed effects in (1.1). I show in Appendix A.4.3 how it can be decomposed into elements that capture the influence of each source of export capability in the model. Moreover, notice that changes in the extent of allocative inefficiency have a direct effect on the double difference of the term $\Gamma_{is}$, and an indirect effect (through general equilibrium channels) on the product of the double differences of the terms $R_{is}$ and $\omega_{is}^{-\frac{\kappa}{\rho}}$. Thus, to figure out the total impact of factor misallocation on RCA, it is necessary to solve the full model in general equilibrium, which is done in Section 1.4.

1.3.3 Simulations

To illustrate the effects of both intra- and inter-industry misallocation on comparative advantage, I use numerical simulations under a simple parametrization of the model. Consider a world with two countries, two factors and two sectors, with symmetric factor intensities across sectors. Sector 1 is factor 1-intensive. Country 1 faces factor misallocation in sector 1 (I will simulate distortions on each factor, so the results are totally symmetric for factor misallocation in sector 2). Assume trade costs do not vary across sectors. Two objectives are pursued: first, to show how both types of factor misallocation of country 1 affect its comparative advantage, disentangling the total impact on its determinants; and second, to illustrate how sensitive these effects are to factor intensities and trade costs.

Both sectors in the two countries have the same Pareto TFPQ distribution. Country 1 is relatively abundant in factor 1 with respect to country 2, so in the allocative efficient scenario it has a comparative advantage in sector 1.$^{29}$ I am interested in the RCA of country 1 in sector 1 relative to country 2 in sector 2, which I compute using equation (1.19). Assume also a log-normal distribution for distortions, with location and shape parameters $\mu_{l1}$ and $\sigma^{2}_{l1}$ for factor $l$ respectively and, to simplify things, zero covariances. I show in Appendix A.4.4 that using equation (1.18) under log-normality it is possible to obtain the following relation between the ex-post HWA wedge and those parameters:

$$\ln(1+\bar{\theta}_{ils}) = \mu_{ils} + \left[\left(1 - \frac{\kappa}{\rho}\right)\alpha_{ls} - \frac{1}{2}\right]\sigma^{2}_{ils} \tag{1.20}$$

Equation (1.20) sheds light on the feedbacks between the two types of factor misallocation under endogenous selection of firms. For example, consider the case in which the location parameter is zero. Ex-ante, the average (log) distortion for the firms within the industry is zero. However, for a given value of the dispersion of these frictions (which generates intra-industry misallocation) we obtain $(1+\bar{\theta}_{ils}) < 1$; that is, ex-post inter-industry misallocation.
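The step from (1.18) to (1.20) can be verified with the moments of the log-normal distribution. The sketch below is not the derivation of Appendix A.4.4; it assumes that the integrand of $\Gamma_{ils}$ is $\Theta_{i}^{1-\kappa/\rho}/(1+\theta_{il})$, that $\ln(1+\theta_{il}) \sim N(\mu, \sigma^{2})$ and that wedges are independent across factors, so that the terms of all other factors cancel in the ratio $\Gamma_{is}/\Gamma_{ils}$. Parameter values are illustrative only.

```python
import math


def log_hwa_wedge(mu, sigma2, alpha, kappa, rho):
    """Closed form of equation (1.20) for ln(1 + theta_bar)."""
    return mu + ((1.0 - kappa / rho) * alpha - 0.5) * sigma2


def lognormal_moment(mu, sigma2, power):
    """E[X**power] when ln X ~ N(mu, sigma2)."""
    return math.exp(power * mu + 0.5 * power ** 2 * sigma2)


def log_hwa_from_gammas(mu, sigma2, alpha, kappa, rho):
    """ln(Gamma_is / Gamma_ils) using only the factor-l moments.

    Assumes independent wedges, so every other factor contributes the
    same term to numerator and denominator and drops out of the ratio.
    """
    c = 1.0 - kappa / rho
    gamma_num = lognormal_moment(mu, sigma2, c * alpha)        # E[(1+theta)^(c*alpha)]
    gamma_den = lognormal_moment(mu, sigma2, c * alpha - 1.0)  # E[(1+theta)^(c*alpha - 1)]
    return math.log(gamma_num / gamma_den)


SIGMA, KAPPA = 4.0, 4.5            # illustrative values with kappa > sigma - 1
RHO = (SIGMA - 1.0) / SIGMA

closed_form = log_hwa_wedge(mu=0.0, sigma2=0.25, alpha=0.6, kappa=KAPPA, rho=RHO)
via_gammas = log_hwa_from_gammas(mu=0.0, sigma2=0.25, alpha=0.6, kappa=KAPPA, rho=RHO)

print(closed_form, via_gammas)     # identical and negative when mu = 0
```

Because $\kappa > \sigma - 1$ implies $\kappa/\rho > 1$, the term in square brackets in (1.20) is negative for any positive factor intensity, which is why a zero location parameter combined with positive dispersion yields an ex-post HWA wedge below one.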
This result is due to endogenous selection, since firms with both low TFPQ and high distortions exit for sure, pushing the value of the ex-post average of the prevalent distortions below zero and generating inter-industry misallocation.

Only intra-industry misallocation

To represent the impact of only intra-industry misallocation on comparative advantage, I first consider the impact of an increase in the variance of wedges of each factor separately, simultaneously adjusting the location parameter to ensure there is no inter-industry misallocation. Figure 1.3 displays the results. The first four graphs correspond to the total impact on the comparative advantage of sector 1 (first graph) and the decomposition of the sources of export capability explained above (average efficiency, returns of factors, and the mass of exported varieties; second to fourth graphs), following equation (A.9) in Appendix A.4.3. Each of these graphs plots the difference between the value of the endogenous variable under the parameters assumed for the distribution of distortions, which are displayed in the last graph, and the corresponding values in the allocative efficient equilibrium, so they capture the net effect of the considered allocative inefficiency. The fifth graph illustrates the implicit HWA of the prevalent distortions, following equation (1.20), to verify the degree of inter-industry misallocation. Blue and red lines correspond to misallocation only in factors 1 and 2, respectively.

$^{29}$ Results do not change qualitatively in the case of the opposite relative factor endowments, or if the comparative advantage is countered or enhanced by Ricardian comparative advantage (through differences in the lower bound of the Pareto distribution). In those cases, there is a change in the initial RCA, but the effect of factor misallocation is qualitatively similar.
I consider two trade regimes: free trade, represented by dashed lines,$^{30}$ and costly trade, represented by continuous lines. The values for the whole set of parameters used in each simulation are displayed in Table 1.4.

Introducing only intra-industry misallocation of any factor used in sector 1 reduces its comparative advantage. The effect increases with the variance of the (log) wedges and, for the same value of the variance, is larger if the misallocation affects the factor used intensively by the industry. The total effect is also marginally larger under free trade for the range of variances considered in the graph. It is worth saying that for larger variances, there is a threshold beyond which, with free trade, the system falls into a regime of complete specialization, so the production of sector 1 shuts down. These results are consistent with the intuition that the larger the possibility to substitute goods across countries, the larger the impact of misallocation on industry revenue shares, boosting more reallocation of factors across sectors. Regarding the determinants of relative export capability, intra-industry misallocation creates well-known losses of TFP, as in a closed economy. However, to keep trade balanced, these losses are followed by an adjustment in relative factor prices, absent under autarky. Given endogenous selection, there is relative net exit of exporters in the distorted sector 1, which is a consequence of the reallocation of factors to the undistorted sector 2. The increase in the relative demand of the factor used intensively in sector 2 also reduces the relative price of the factor used intensively in sector 1. The combined effect on factor prices largely counters the effect of the loss in overall efficiency, but the sum of the two forces is still negative.
Thus, the total impact on export capability is largely due to the adjustment in the extensive margin of trade, whereas the contribution of the intensive margin is smaller, but not zero.$^{31}$

Only inter-industry misallocation

Now consider the impact of inter-industry misallocation. For this, I shift the location parameter, allowing it to take positive and negative values, keeping the shape parameter equal to zero. Then, there is no dispersion in wedges (and thus no intra-industry misallocation), but the ex-post HWA wedge varies with the location parameter, creating inter-industry misallocation. Figure 1.4 displays the results with the same graphs and conventions as in the previous exercise. The net impact on comparative advantage is inversely related to the sign of the location parameter. To understand this result, it is useful to think about positive values of the location parameter as an industry-level tax on the cost of the factor, which implies a HWA wedge greater than 1 (or a subsidy for negative values). For instance, consider the effects of introducing an industry-level factor tax. It becomes relatively more expensive to buy the corresponding input for all firms within the taxed industry, raising the average return of the composite input bundle. Some firms whose productivity draws prevent them from paying the new inputs' cost must exit. Here, there is no TFP loss due to within-industry misallocation, because all firms in the industry face the same factor prices, so average TFP depends only on the physical productivities of the incumbents.

$^{30}$ For free trade I will consider a scenario without iceberg transportation costs but with fixed costs of exporting, since I am interested in keeping endogenous selection on exporting markets.

$^{31}$ The prevalence of the extensive margin is probably linked to the Pareto assumption. On the consequences of the Pareto distribution for the two margins of trade, see Fernandes et al. (2015).
Instead, there is selection of the more productive firms, so average TFP rises. Both impacts are larger if the taxed factor is the one used intensively in the sector (since it has more weight in the composite bundle) and under free trade (since reallocation of factors is larger). The increase in average TFP fully compensates the loss in export capability due to the increase in the relative return of the factors, to the point that the net effect on comparative advantage through the intensive margin is positive, but small. Adding the negative effect on the extensive margin due to the exit of firms, which is not very affected by the trade regime or by the intensity in the use of the factors, the overall impact on export capability is negative.

In conclusion, each type of factor misallocation impacts industries' comparative advantage through different general equilibrium channels. The extent of each impact depends on the interaction between factor intensities and the variances of distortions, in the case of intra-industry misallocation, and primarily on whether the HWA wedges are less or greater than one, in the case of inter-industry misallocation. The effect of both types of factor misallocation on the industries' TFP is partially offset by changes in relative factor prices, so the intensive margin contributes less to the adjustment of relative unit prices than the extensive margin (the change in the mass of produced varieties due to the reallocation of factors across industries). Therefore, ignoring the general equilibrium effects caused by resource misallocation could lead to misguided conclusions.
The next section presents a methodology to solve the model in general equilibrium to produce a counterfactual series of bilateral exports after removing allocative inefficiency in a country, and hence to evaluate its frictionless RCA.

1.4 Empirical implementation

In this section, I perform the counterfactual exercise of removing both (and separately) the observed intra- and inter-industry misallocation in Colombia. I first show how to obtain the counterfactual equilibrium solving the model in relative changes. Next, I comment on the data employed, the method to measure the dispersion in the MRP of the factors under overhead costs, and the baseline results. Finally, I conduct some robustness checks and compare the baseline results with those obtained for the one-sector economy and the closed economy.

1.4.1 Counterfactual exercise

I show in Appendix A.4.1 that under assumptions A.1. and A.2. the entire system can be solved in terms of the following system of equations:

\[ w_{il} Z_{ils} = \alpha_{ls} v_{ils} R_{is} \tag{1.21} \]

\[ \bar{Z}_{il} = \sum_{s}^{S} Z_{ils} \tag{1.22} \]

\[ R_{is} = \sum_{j}^{N} \pi_{ijs} \beta_{js} \left( \sum_{s}^{S} R_{js} - D_{j} \right) \tag{1.23} \]

\[ \pi_{ijs} = \frac{\left( \prod_{l}^{L} w_{il}^{-\kappa \alpha_{ls}/\rho} \right) \Gamma_{is} \phi_{ijs} R_{is}}{\sum_{k}^{N} \left( \prod_{l}^{L} w_{kl}^{-\kappa \alpha_{ls}/\rho} \right) \Gamma_{ks} \phi_{kjs} R_{ks}} \tag{1.24} \]

where $\phi_{ijs} = \dfrac{f_{ijs}^{\frac{\sigma-1-\kappa}{\sigma-1}} \, \bar{a}_{is}^{\kappa}}{(\tau_{ijs})^{\kappa} f^{e}_{is} d_{is}}$ and $\pi_{ijs}$ is the share of country i in total expenditures of country j in sector s. Denote the share of factor l allocated to sector s in country i as $\tilde{Z}_{ils}$, that is: $\tilde{Z}_{ils} \equiv Z_{ils}/\bar{Z}_{il}$. Equations (1.21) and (1.22) can be re-stated as $w_{il} \tilde{Z}_{ils} \bar{Z}_{il} = \alpha_{ls} v_{ils} R_{is}$, with the condition $\sum_{s}^{S} \tilde{Z}_{ils} = 1 \;\forall\, i, l$.

Now I use the methodology of Dekle et al. (2008), adopted in other papers,[32] to obtain the counterfactual equilibrium in relative changes.
This approach, known as exact hat algebra, allows me to solve the model without assuming or estimating parameters that are hard to identify in the data, particularly all those which are embedded in the term $\phi_{ijs}$ (variable and fixed trade costs, entry costs, lower bounds for TFPQ, probabilities of exit), and the current measures of intra-industry and inter-industry misallocation for all industries and countries. All these values are included in the initial trade shares, and because they do not change in the counterfactual equilibrium, they do not appear in the system in relative changes.

For any variable $x$ in the initial equilibrium, denote by $x'$ its counterfactual value and by $\hat{x} \equiv x'/x$ the proportional change. Then, the system in the final equilibrium can be rewritten as:

\[ \hat{w}_{il} = \sum_{s}^{S} \tilde{Z}_{ils} \hat{R}_{is} \hat{v}_{ils} \tag{1.25} \]

\[ R_{is} \hat{R}_{is} = \sum_{j}^{N} \pi'_{ijs} \beta_{js} \left( \sum_{s}^{S} R_{js} \hat{R}_{js} - D_{j} \hat{D}_{j} \right) \tag{1.26} \]

\[ \pi'_{ijs} = \frac{\pi_{ijs} \left( \prod_{l}^{L} \hat{w}_{il}^{-\kappa \alpha_{ls}/\rho} \right) \hat{\Gamma}_{is} \hat{R}_{is}}{\sum_{k}^{N} \pi_{kjs} \left( \prod_{l}^{L} \hat{w}_{kl}^{-\kappa \alpha_{ls}/\rho} \right) \hat{\Gamma}_{ks} \hat{R}_{ks}} \tag{1.27} \]

[32] See for example Costinot and Rodríguez-Clare (2014), Caliendo and Parro (2015), Święcki (2017), among others.

The objective with this system is to analyze the impact of exogenous changes in both intra- and inter-industry misallocation (through the terms $\hat{v}_{ils}$ and $\hat{\Gamma}_{is}$) of a country on the equilibrium outcomes $\hat{R}_{is}$ and $\hat{w}_{il}$. For this, the system can be solved for $\hat{R}_{is}$ and $\hat{w}_{il}$ (after imposing the usual normalization $\sum_{i}^{N} R_{is} \hat{R}_{is} = 1$) given values of the observable variables $\pi_{ijs}$, $\tilde{Z}_{ils}$ and $R_{is}$, technological and preference parameters $\alpha_{ls}$ and $\beta_{is}$ respectively, and assumptions on parameters $\kappa$ and $\sigma$ and the variation of aggregate trade deficits $\hat{D}_{j}$. Since my interest is to remove factor misallocation only in one country, I set $\hat{v}_{ils} = \hat{\Gamma}_{is} = 1$ for all countries different from Colombia, so I only need values of $v_{ils}$ and $\Gamma_{is}$ for Colombia to derive the corresponding proportional changes.

Once $\hat{R}_{is}$ and $\hat{w}_{il}$ are obtained, it is straightforward to compute the relative changes in aggregate expenditure and trade shares, $\hat{E}_{i}$ and $\hat{\pi}_{ijs}$.
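As a minimal sketch of one building block of this system, the snippet below evaluates the trade-share update in equation (1.27) for given proportional changes. It is not the solver used in the chapter: the function name, the nested-list layout, the toy trade shares and the relation ρ = (σ − 1)/σ (the standard CES relation) are assumptions made only for illustration.

```python
import math

def counterfactual_trade_shares(pi, w_hat, gamma_hat, r_hat, alpha, kappa, sigma):
    """Evaluate equation (1.27) for given hat variables (illustrative helper).

    pi[i][j][s]     initial share of exporter i in importer j's spending on sector s
    w_hat[i][l]     proportional change in the price of factor l in country i
    gamma_hat[i][s] proportional change in the misallocation term Gamma_is
    r_hat[i][s]     proportional change in sectoral revenue R_is
    alpha[l][s]     factor intensities
    """
    rho = (sigma - 1.0) / sigma  # assumption: standard CES relation
    n, n_sec, n_fac = len(pi), len(pi[0][0]), len(alpha)
    # Exporter-sector shifter: (prod_l w_hat^(-kappa*alpha/rho)) * Gamma_hat * R_hat
    shifter = [
        [
            math.prod(w_hat[i][l] ** (-kappa * alpha[l][s] / rho) for l in range(n_fac))
            * gamma_hat[i][s] * r_hat[i][s]
            for s in range(n_sec)
        ]
        for i in range(n)
    ]
    new_pi = [[[0.0] * n_sec for _ in range(n)] for _ in range(n)]
    for j in range(n):
        for s in range(n_sec):
            denom = sum(pi[k][j][s] * shifter[k][s] for k in range(n))
            for i in range(n):
                new_pi[i][j][s] = pi[i][j][s] * shifter[i][s] / denom
    return new_pi

# Hypothetical two-country, two-sector, two-factor example.
pi0 = [[[0.8, 0.6], [0.3, 0.1]],
       [[0.2, 0.4], [0.7, 0.9]]]
alpha = [[0.7, 0.3], [0.3, 0.7]]  # alpha[l][s], the simulation values in Table 1.4
ones = [[1.0, 1.0], [1.0, 1.0]]

# With no shock the initial shares are returned unchanged.
no_shock = counterfactual_trade_shares(pi0, ones, ones, ones, alpha, kappa=4.56, sigma=3.5)

# Raising Gamma_hat for country 0 in sector 0, holding everything else fixed.
gamma_shock = [[1.5, 1.0], [1.0, 1.0]]
shocked = counterfactual_trade_shares(pi0, ones, gamma_shock, ones, alpha, kappa=4.56, sigma=3.5)
```

In a full solution these shares would be recomputed inside an outer loop that also updates the wage and revenue changes from (1.25) and (1.26); the sketch only shows that the update preserves the adding-up of shares across exporters.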
With these variables it is possible to quantify the cost of each type of misallocation in terms of welfare, measured as total real expenditure. In Appendix A.4.5 I show that the relative change in aggregate real expenditure can be derived from:

\[ \frac{\hat{E}_{i}}{\hat{P}^{d}_{i}} = \prod_{s}^{S} \left[ \hat{E}_{i}^{\frac{1}{\kappa} - \frac{1}{\rho}} \left( \frac{\hat{\pi}_{iis}}{\hat{R}_{is} \hat{\Gamma}_{is}} \right)^{\frac{1}{\kappa}} \prod_{l}^{L} \hat{w}_{il}^{\alpha_{ls}/\rho} \right]^{-\beta_{s}} \tag{1.28} \]

Notice that in the case of the undistorted economy with one factor of production, equation (1.28) collapses to the well-known formula of Arkolakis et al. (2012), $\prod_{s}^{S} \left[ \hat{\pi}_{iis} / \hat{Z}_{is} \right]^{-\beta_{s}/\kappa}$, to evaluate the increase in welfare in response to any exogenous shock.

1.4.2 Data and model solution

I collect information on bilateral trade shares, gross output and sectoral factor shares for the same set of countries and manufacturing sectors used in section 1.2. I use a gross output specification for the production function with capital, materials, skilled and unskilled labor as inputs. I set factor intensities for all countries equal to the US cost shares, under the assumption that US cost shares reflect actual differences in technology across sectors instead of inter-industry misallocation. The primary source of information is the OECD's Trade in Value Added (TiVA) database (2015 release) for the year 1995, but I also use auxiliary information from several other sources; for a detailed description see Appendix A.1. For the calibrated parameters, I use in the baseline results κ = 4.56 and σ = 3.5, values consistent with those used in the literature.[33] Section 1.4.4 verifies how sensitive the results are to changes in those values. Given the static nature of the framework, the model is silent about the adjustment of aggregate trade deficits. Thus, for the counterfactual exercises, I assume that for all countries different from the RoW, trade deficits as a proportion of gross output remain constant in the counterfactual.
The trade deficit of the RoW adjusts to ensure global trade balance.

To obtain the proportional changes in the measures of factor misallocation $\hat{v}_{ils}$ and $\hat{\Gamma}_{is}$ for Colombia, I assume that the joint distribution of factor distortions is log-normal. In Appendix A.4.4 I show how equation (1.18) can be used to obtain an identity that relates the ex-post HWA wedges to the vector of location parameters and the variance-covariance matrix of the ex-ante joint distribution of the distortions $V_{is}$ (see equation (A.10)). Therefore, I only need measures of the HWA of wedges, which can be inferred from sectoral data using (1.17), and estimates of $V_{is}$ to obtain the latent location parameters and, consequently, both $v_{ils}$ and $\Gamma_{is}$. The counterfactual exercises involve removing: i) both types of misallocation; ii) only intra-; and iii) only inter-industry misallocation for the homogeneous production factors: capital, skilled and unskilled labor.[34]

[33] These values are averages of the ones used by Melitz and Redding (2015) (κ = 4.25 and σ = 4) and the ones estimated by Eaton et al. (2011) (κ = 4.87 and σ = 2.98). Section 1.4.4 evaluates the sensitivity of the baseline results to changes in these values.

To estimate $V_{is}$, I use Bils et al.'s (2017) method to compute the dispersion in the factors' MRP in the presence of additive measurement error in revenue and inputs. Since overhead factors are analogous to an unobservable additive term in measured inputs, this approach also deals with the problem of inferring the variance of factors' MRP directly from the observed dispersion of the average revenue products in the presence of fixed costs. The main idea of Bils et al.'s (2017) approach is to estimate a “compression factor” $\hat{\lambda}$ to correct the observed dispersion in TFPR, $\hat{\sigma}^{2}_{TFPR}$, as a measure of the dispersion in the “true” TFPR, $\sigma^{2}_{TFPR}$ ($\hat{\lambda} = \sigma^{2}_{TFPR} / \hat{\sigma}^{2}_{TFPR}$), using panel data.
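As a purely arithmetic illustration of the compression factor just defined, the sketch below scales an observed variance-covariance matrix by λ and reports the implied share of observed dispersion attributed to measurement error, 1 − λ. The function names and the observed matrix are hypothetical; the GMM estimation of λ itself is not reproduced here.

```python
def correct_dispersion(observed_cov, lam):
    """Scale every entry of an observed variance-covariance matrix by lambda."""
    return [[lam * entry for entry in row] for row in observed_cov]

def measurement_error_share(lam):
    """Share of observed dispersion attributed to measurement error."""
    return 1.0 - lam

# Hypothetical observed variance-covariance matrix for (capital, skilled, unskilled).
observed = [[1.30, 0.20, 0.20],
            [0.20, 1.35, 1.05],
            [0.20, 1.05, 1.50]]
lam = 0.81  # illustrative value inside the range of point estimates reported in the chapter

corrected = correct_dispersion(observed, lam)
error_share = measurement_error_share(lam)
```

With a compression factor between 0.75 and 0.87, the implied measurement-error share lies between 13% and 25% of the observed dispersion.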
The methodology exploits the fact that in the absence of measurement error the elasticity of revenues with respect to inputs should not vary for plants with different average products. Hence, panel data can be used to back out the “true” marginal product dispersion by estimating how such elasticity changes for plants with different average products. I estimate $\hat{\lambda}$ by GMM sector by sector, using the panel data from 1991 to 1998. In Appendix A.2, I present details about the methodology and the results of the replication.[35] I correct the observed variance-covariance matrix of the average revenue products of factors by $\hat{\lambda}_{s}$ to obtain $\hat{V}_{is}$. Table 1.5 displays for each industry the employed values for the HWA wedges, the corresponding observed variances and covariances of factors' average revenue products and the obtained “compression factors” $\hat{\lambda}_{s}$, along with factor intensities.

The model consists of N × (S + L) = 1344 equations. The multiplicity of non-linearities in the model implies that common optimization routines find multiple local solutions. To obtain the global solution, I employ both an algorithm to choose a set of ideal initial conditions and a state-of-the-art solver for large-scale nonlinear systems. Appendix A.3 offers details about these two aspects.

1.4.3 Baseline results

First, I describe the results of “extreme” reforms that remove the total extent of intra- and inter-industry misallocation in Colombia. The results of gradual reforms are presented in the next section. I compute the RCA measures for each counterfactual equilibrium using PPML. Similar to Figure 1.1, instead of choosing a pair importer-sector, I normalize by global means. The resulting RCA measures are displayed in Figure 1.5.
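The normalization by global means can be illustrated with a short sketch. It assumes a matrix of positive export-capability measures by country and sector (hypothetical numbers here; in the chapter they come from exporter-industry fixed effects estimated by PPML) and applies the double normalization by geometric means that the chapter states in relative changes in footnote 39. The function name is illustrative.

```python
import math

def normalize_by_global_means(capability):
    """Return (log_rca, log_aa) from capability[i][s] > 0.

    log_rca: log capability net of country, sector and grand means (RCA).
    log_aa:  log capability net of the sector mean only ("absolute advantage").
    """
    n, n_sec = len(capability), len(capability[0])
    logs = [[math.log(x) for x in row] for row in capability]
    country_mean = [sum(row) / n_sec for row in logs]  # log geometric mean over sectors
    sector_mean = [sum(logs[i][s] for i in range(n)) / n for s in range(n_sec)]
    grand_mean = sum(country_mean) / n
    log_rca = [
        [logs[i][s] - country_mean[i] - sector_mean[s] + grand_mean for s in range(n_sec)]
        for i in range(n)
    ]
    log_aa = [[logs[i][s] - sector_mean[s] for s in range(n_sec)] for i in range(n)]
    return log_rca, log_aa

# Hypothetical capabilities: rows are countries, columns are sectors.
capability = [[2.0, 1.0],
              [1.0, 4.0]]
log_rca, log_aa = normalize_by_global_means(capability)
```

By construction the log RCA measure averages to zero across sectors within each country and across countries within each sector, so it captures only relative export capability.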
All panels plot the actual RCA measures on the horizontal axis and the counterfactuals on the vertical one. Panels A and B show the case of removing both types of misallocation. In Panel A the markers' sizes represent the actual industries' export shares and in Panel B the counterfactual ones.

[34] Given the infeasibility of decomposing intermediate consumption into homogeneous inputs, I assume that all observed dispersion in the MRP of materials is due to actual heterogeneity in the input, instead of factor misallocation. Thus, the counterfactual equilibrium preserves both the observed within-industry dispersion and the inter-industry differences in the MRP of intermediate consumption.

[35] The point estimates for $\hat{\lambda}_{s}$ vary in the range [0.75, 0.87], indicating that around 20% of the observable dispersion in TFPR is attributable to measurement error.

Once both types of misallocation are removed, the ratio of exports to manufacturing GDP rises from 0.15 to 0.33 and welfare grows by 75%. Although the impact of factor misallocation looks at first glance surprisingly large, these results are in line with the findings in much of the literature that assesses the gains of similar reforms.[36] Table 1.6 displays a decomposition of the aggregate results. The boost in exports is due to the increase in the dispersion of the Colombian schedule of comparative advantage. This is evident in Figure 1.6, which compares the location of the Colombian industries in the RCA world distribution for the initial and counterfactual equilibria, where each vertical line represents a single Colombian industry. This figure also shows that the counterfactual ranking is not related to the actual one.
Industrial chemicals, other chemicals, glass and tobacco are the industries with the largest increases with respect to their initial RCA, whereas petroleum, machinery and equipment, transport equipment and computer, electronic and optical products display the largest drops. The latter industries disappear when both types of misallocation are removed, indicating the presence of a non-interior solution in the counterfactual equilibrium,[37] which explains in part the longer left tail in the counterfactual world distribution.[38] The larger dispersion in the frictionless comparative advantage leads to higher degrees of industrial specialization in the frictionless equilibrium, which is evident comparing the export shares from Panel A to Panel B. For instance, the whole chemical sector (both industrial chemicals and other chemicals), an industry that ends up in the first percentile of the counterfactual RCA world distribution, concentrates 64% of the counterfactual Colombian exports, from 23% in the actual data.

[36] For example, HK find that without affecting firms' selection, an intra-industry reform “would boost aggregate manufacturing TFP by 86%–115% in China, 100%–128% in India, and 30%–43% in the United States” (Hsieh and Klenow, 2009, pg. 1420). For Indonesia, Yang (2017) computes TFP gains of 207% from removing manufacturing intra-industry misallocation taking into account firms' selection (97% in the case of a comparable reform to HK). All these large magnitudes are in part due to the extreme nature of the counterfactual, which implies a perfect allocation of factors across all firms, perhaps an unrealistic reform. This is the reason why some papers prefer experiments with gradual reforms (for our case see the next section), or with the reduction of misallocation to the levels observed in a reference country (e.g., the United States, as in HK).

[37] The feasibility of non-interior solutions in multi-sector Pareto-Melitz type of models is recently evaluated by Kucheryavyy et al. (2017). These authors show that under the standard formulation of the model in which the elasticities of substitution do not vary between domestic and foreign varieties, as is the case in this chapter, it is guaranteed that the general equilibrium is unique, but not necessarily an interior solution. Besides multiple factors and resource misallocation, the other feature that makes the model here different is the fact that fixed costs of exporting are paid in terms of factors of the source country.

[38] The counterfactual equilibrium also involves large contractions (between 40% and 70%) in some industries of some of the main Colombian trade partners: 4 in Ecuador, 2 in Brazil, 1 in Venezuela and 1 in Hong Kong.

The total impact on comparative advantage is a non-linear combination of the effects of removing both HWA wedges and the intra-industry dispersion in the returns of the factors. Panel C and Panel D of Figure 1.5 depict the RCA measures after removing only intra- and inter-industry misallocation respectively, with markers' sizes representing the counterfactual export shares. In each exercise, I compute the counterfactual values $v'_{ils}$ and $\Gamma'_{is}$ such that the other type of misallocation remains unchanged. Notice that in both cases the dispersion of comparative advantage is lower than in Panels A and B, but larger than the original one. Table 1.6 shows that although both types of factor misallocation contribute to the total growth in exports, intra-industry misallocation seems quantitatively more important. Removing only intra-industry misallocation leads to an increase of 13 p.p. in the exports-to-GDP ratio and a rise of 56% in welfare, whereas removing only inter-industry misallocation causes smaller increases (7 p.p. and 8% in each variable, respectively).

The directions and the magnitudes of the changes in the RCA due to each type of factor misallocation can be explained by the extent of its respective causes. The simulations performed in section 1.3.3 suggested that the magnitude of the effect of intra-industry misallocation depends on the interaction between factor intensities and the relative variances of distortions, whereas the impact of inter-industry misallocation depends on whether the HWA wedges are less or greater than 1. Figure 1.7 confirms this reasoning. Panel A plots the variation in the RCA when removing intra-industry misallocation against the intra-industry dispersion of the TFPR, equal to $\vec{\alpha}_{s}' \hat{V}_{is} \vec{\alpha}_{s}$ for sector s, where $\vec{\alpha}_{s}$ is an L-vector of factor intensities $\alpha_{ls}$. The positive correlation suggests that sectors in which firms' TFPR is relatively more dispersed have larger gains in comparative advantage. Analogously, Panel B plots the variation in the RCA when removing inter-industry misallocation against the revenue productivity at the industry level. The positive correlation implies that industries with HWA wedges greater than one gain export capability when inter-industry misallocation is removed; otherwise they lose.

A further exploration of the latter results sheds light on the directions and extents of the general equilibrium effects that are present in the model. Similar to section 1.3.3, I use the decomposition (A.9) in Appendix A.4.3 to disentangle the effect of each type of misallocation on comparative advantage into the three sources of export capability in the model: average TFP, the cost of inputs and the number of varieties produced in each sector. Panel A of Figure 1.8 displays the effect of removing all misallocation (in the top graph), only intra- (in the middle graph) and only inter-industry misallocation (in the bottom graph), on each sector's RCA.
Towards a better understanding of the results for the RCA, Panel B shows the same decomposition when the changes in the three sources of export capability are not compared across industries, but instead are relative only to the same industry in the reference country. Constructed in this way, the decomposition captures a measure that Hanson et al. (2015) denote the “absolute advantage” index.[39] The numbers displayed correspond to the log-differences between the counterfactual values and the initial values of both measures of export capability, and the lengths of the bars represent the strength of each element in the decomposition, so they add up exactly to the number shown.

[39] Since I choose to normalize by world means, from (1.19) the log-differences in the measures of export capability are exactly identified by:

\[ \log \widehat{RCA}_{is} = \log \left[ \frac{\hat{\Gamma}_{is} \hat{R}_{is} \hat{\omega}_{is}^{-\kappa/\rho} \Big/ \prod_{s}^{S} \left( \hat{\Gamma}_{is} \hat{R}_{is} \hat{\omega}_{is}^{-\kappa/\rho} \right)^{1/S}}{\prod_{i}^{N} \left( \hat{\Gamma}_{is} \hat{R}_{is} \hat{\omega}_{is}^{-\kappa/\rho} \right)^{1/N} \Big/ \prod_{i}^{N} \prod_{s}^{S} \left( \hat{\Gamma}_{is} \hat{R}_{is} \hat{\omega}_{is}^{-\kappa/\rho} \right)^{1/NS}} \right]; \qquad \log \widehat{AA}_{is} = \log \left[ \frac{\hat{\Gamma}_{is} \hat{R}_{is} \hat{\omega}_{is}^{-\kappa/\rho}}{\prod_{i}^{N} \left( \hat{\Gamma}_{is} \hat{R}_{is} \hat{\omega}_{is}^{-\kappa/\rho} \right)^{1/N}} \right] \]

where AA denotes the “absolute advantage” index.

First, regarding intra-industry misallocation, the gains in average TFP boost “absolute advantage” of all sectors, on average by 0.91 log points. However, these gains are countered by increases in relative factor prices, on average by 0.74 log points (a rise in relative factor prices is shown as a negative contribution). Thus, although the intensive margin plays a role in the total adjustment of the “absolute advantage” measure, the latter is in large part driven by the extent to which the number of varieties adjusts, i.e., the extensive margin. When we compute the same decomposition for RCA, its variation is almost entirely explained by the number of varieties. This is a result of the low dispersion in the adjustment of the intensive margin of the “absolute advantage” across sectors, contrary to what happens with the number of varieties.
Second, regarding inter-industry misallocation, industries facing on average low returns of the factors ($\bar{\Theta}_{is} < 1$, see Table 1.5) increase their inputs' cost, which improves average TFP through the selection of the more productive firms, compensating the adverse effect of factor prices in both RCA and “absolute advantage” measures, and vice versa. In this case, the magnitudes of the adjustments of average TFP and factor prices in the index of “absolute advantage” are lower than those obtained removing MRP dispersions within industries (for example, the median positive change due to average TFP is 0.25 log points). Nevertheless, despite their smaller magnitudes, those changes have a larger dispersion across sectors, enhancing the contribution of the intensive margin in the effect of inter-industry misallocation on the RCA measure.

1.4.4 Robustness checks and additional results

In this section, I first evaluate the robustness of the previous results to changes in the parameters κ and σ. Next, I present the results of gradually removing misallocation. Finally, I compare the baseline results with those obtained in the cases of taking the whole manufacturing sector as a single industry and in the closed economy.

Changes in κ and σ

Changes in κ or in σ do not importantly alter the ranking of RCA in the counterfactual equilibria and, if anything, have a small effect on its dispersion. Figure 1.9 displays, for the case of removing both types of misallocation, the ranking of Colombian RCA measures under different values of κ and σ. Changes in the ranking are negligible, and only small variations in the dispersion are noticeable (see column 5 in Table 1.6). However, for a given MRP distribution and RCA schedule, the extent of factor reallocations across sectors is increasing in κ and decreasing in σ. This is due to the fact that in each industry a fraction $\rho/\kappa$ of the sectoral demand of factors is not affected by firm-level misallocation, the fraction that is allocated to entry.
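A quick numerical illustration of this comparative-statics argument follows, assuming that the fraction allocated to entry equals ρ/κ and that ρ = (σ − 1)/σ (the standard CES relation, taken here as an assumption). The function name is hypothetical; the parameter values are the baseline and the alternatives considered in this section.

```python
def entry_share(kappa, sigma):
    """Fraction of sectoral factor demand allocated to entry, rho / kappa."""
    rho = (sigma - 1.0) / sigma  # assumption: standard CES relation
    return rho / kappa

baseline = entry_share(kappa=4.56, sigma=3.5)

# A higher kappa or a lower sigma shrinks the share that is insulated from
# firm-level misallocation, leaving more room for factor reallocation.
shares = {
    "kappa=4":   entry_share(4.0, 3.5),
    "kappa=5":   entry_share(5.0, 3.5),
    "sigma=3":   entry_share(4.56, 3.0),
    "sigma=4":   entry_share(4.56, 4.0),
    "baseline":  baseline,
}
for label, value in shares.items():
    print(f"{label:>9}: {value:.3f}")
```

Under these assumptions the insulated share is smaller for κ = 5 or σ = 3 than in the baseline, and larger for κ = 4 or σ = 4.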
As a result, Table 1.6 shows that the rise in total exports and in the ratio of exports to GDP is lower for κ = 4 or σ = 4 and larger for κ = 5 or σ = 3.

Gradual reforms

Figure 1.10 displays the effects of reforms that gradually remove, both jointly and separately, the two types of misallocation on the welfare gains (Panel A) and exports growth (Panel B). The lines' values at the extreme right (removing 100% of misallocation) coincide with the numbers in Table 1.6. Even the smallest reform, which reduces by 10% the extent of both types of misallocation, has a sizable impact on both welfare and exports (6.7% and 11% respectively).[40] Moreover, for any reduction in misallocation, the intra-industry type is quantitatively more important, although its contribution varies with the intensity of the reform.

[40] The exports-to-GDP ratio only begins to increase after removing 20% of misallocation, a threshold where the ranking of industries' RCA starts to show alterations.

One-sector vs. multiple sectors

To quantify the importance of industrial specialization in the exports of the frictionless economy, I perform the exercise of removing misallocation, taking the whole manufacturing sector as a single industry. By construction, there is now only intra-industry misallocation, and all industries face the same factor intensities. Thus, I recompute the corresponding US cost shares and the within-industry variances of firms' wedges, values displayed in the last row of Table 1.5. The increase in welfare is similar to the baseline case (70%), but the increase in nominal exports is only 43%, leading to a decrease in the ratio of exports to GDP of 5 p.p. (see the last row in Table 1.6).

Closed vs. open economy

Since in the closed economy revenue shares are constant and equal to the expenditure shares in the demand system, there is no change in the industrial composition under the Cobb-Douglas demand. However, it is possible to quantify the cost of the same measures of misallocation in terms of welfare. For this, notice that in the closed economy we have $\pi_{iis} = \hat{\pi}_{iis} = 1$ and $\hat{R}_{is} = \hat{E}_{is} = \hat{E}_{i}$, so we can express (1.28) as:

\[ \left[ \frac{\hat{E}_{i}}{\hat{P}^{d}_{i}} \right]_{closed} = \prod_{s} \left[ \hat{\Gamma}_{is}^{-\frac{1}{\kappa}} \prod_{l} \left( \sum_{s}^{S} \tilde{Z}_{ils} \hat{v}_{ils} \right)^{\alpha_{ls}/\rho} \right]^{-\beta_{s}} \tag{1.29} \]

Thus, the welfare cost of misallocation in a closed economy with endogenous selection of firms can be derived only with measures of misallocation and factor shares in autarky. The last column in Table 1.6 shows the increase in welfare in the case in which Colombia was a closed economy, under the assumption that the measures of misallocation and factor shares were the same. Apart from the case of removing only inter-industry misallocation, the welfare gains from removing allocative inefficiency are larger under a closed economy, suggesting that in the particular case of Colombia, international trade dampens the welfare cost of resource misallocation.[41]

[41] For the inter-industry case, the results are in line with Święcki (2017), who shows that simultaneously removing intersectoral wedges in labor in 61 countries and 16 industries leads to larger welfare gains in open economies relative to closed ones (for Colombia, the gains are 18% in the open economy case and 11% under autarky). The intuition for his result is that in the closed economy distorted sectors cannot expand beyond the domestic demand for the sector's output. However, adding firms' endogenous selection can make the effect of trade on the cost of misallocation dependent on the joint distribution of TFPQ and wedges. In particular, trade will have a larger impact on welfare in an economy where the exiting plants due to trade contribute relatively more to the total intra-industry misallocation (i.e., where their TFPR dispersion is higher). In that sense, trade could mitigate or exacerbate the cost of misallocation, particularly of the intra-industry type.

1.5 Conclusions

Resource misallocation at the firm level can alter the relative unit cost of producing a good across sectors, distorting the “natural” comparative advantage of a country. This chapter offers a framework to compute for a country the export capabilities of its industries under frictionless factor markets, considering the general equilibrium effects of factor reallocations both within and across sectors. I perform the exercise with a sample of 48 countries, three production factors, and 25 tradable sectors for the observed misallocation in Colombia, a country whose firm-level data provide us with reliable measures of physical productivity.

I find that the reallocation of factors allows Colombia to specialize in industries with “natural” comparative advantage, especially the whole chemical sector (both industrial chemicals and other chemicals). Reallocating factors generates a rise in the ratio of exports to manufacturing GDP of 18 p.p. and an increase in welfare of 75%, for the case of an extreme reform in which factor misallocation is entirely removed. The specialization channel due to comparative advantage, which substantially transforms the industrial composition when removing firm-level factor misallocation, is an omitted mechanism in the workhorse models of firm-level resource misallocation in closed economies.

The impact of allocative efficiency on comparative advantage depends importantly on the adjustment in the extensive margin. In the case of factor misallocation within industries, I find that removing distortions increases comparative advantage for those sectors in which the returns of the factors used intensively are relatively more dispersed. The gains in terms of unit costs are mainly the result of an increase in the relative number of varieties produced, because at the intensive margin the increases in average TFP are largely countered by the responses of relative factor prices, and there is not enough variation across industries in the residual effect. For inter-industry misallocation, industries in which firms on average face factor returns larger than the allocatively efficient values increase their comparative advantage when misallocation is removed. In this case, the gains in export capability derive from the reduction of average factor costs, which compensates the adverse selection of firms within the sector, plus an increase in the number of varieties produced. The overall effect of factor misallocation on comparative advantage is a combination of these two forces.

These results suggest that the design of mechanisms that smooth the dispersion of factor returns across firms is a desirable policy. It can boost total productivity and welfare, allowing for a more efficient pattern of specialization across industries, in which comparative advantage responds more to differences in efficiency across sectors and relative factor endowments, the “natural” sources of export capability. The growing literature exploring the causes of the dispersion in factors' returns is a fertile field of research to start exploring optimal policy instruments in an open economy.
Tables and figures1.6 Tables and figures1.6.1 TablesTable 1.1: Alternative explanations for dispersion in revenue productivitySource Variable Contribution* Countries PaperAdjustment costsσ2MRPK1% China, Colombia,Mexico David andVenkateswaran (2017)Uncertainty about TFP 7%Variable markups 5%ChinaHeterogeneity in technology 17%Heterogeneity in workers σ2MRPL 9% Denmark Bagger et al. (2014)abilityAdditive measurement error σ2T FPR 45% India Bils et al. (2017)in revenues and inputsNotes: σ2T FPR corresponds to the variance of the revenue productivity (TFPR), which is a function of the variances (andcovariances) of the marginal revenue products (MRP), σ2MRPz for factor z. The table displays the contribution of causesdifferent to misallocation to the corresponding variances of the MRP (for capital (K) and labor (L)) or directly to the TFPR.*Average contribution if the number of countries is greater than one.Table 1.2: RCA explained by misallocation measures and determinants of export capability(1) (2) (3) (4)dRCAist dRCAist dRCAist dRCAistIntra-ind. allocative efficiency 0.358*** 0.575*** 0.339***(0.082) (0.088) (0.084)Intra-ind. variance of TFPR -0.145**(0.060)Inter-industry wedges -0.351*** -0.241*** -0.202** -0.371***(0.081) (0.088) (0.063) (0.085)Efficient TFP 0.244** 0.234** 0.218** 0.272***(0.090) (0.098) (0.103) (0.088)Factor prices -0.318*** -0.197** -0.263*** -0.306***(0.066) (0.076) (0.077) (0.067)Observations 208 208 208 208R-square 0.327 0.266 0.551 0.23Notes: * p<0.10, ** p<0.05 and *** p<0.01. The results correspond to the second-stage of the econometric strategy, wherein the first stage the exporter-industry FE are estimated by PPML. The dependent variable is dRCAist , the change in theRCA measure with respect to the first period. All independent variables are transformed to be changes with respect to thefirst period relative to the reference industry, normalized by the corresponding changes in the US PPI. (1) and (2) are thebaseline results. 
(3) changes reference industry (to min. number of zeros), (4) changes set of countries (to 19). Standardized coefficients and heteroskedastic robust errors.

Table 1.3: Equilibrium conditions and endogenous variables

| Equilibrium condition | Equation | Dimension |
|---|---|---|
| Factor clearing | (1.13) | N×L |
| Industry factor demand | (1.12) | N×L×S |
| Zero profit | (1.7) | N×N×S |
| Aggregate stability | (1.9) | N×N×S |
| Free entry | (1.8) | N×S |
| Industry price | (1.10) | N×S |
| Industry demand | $Q^d_{is} = \left(\sum_{k}^{N} M_{kis} \sum_{m} q_{kim}^{\rho}\right)^{1/\rho}$ | N×S |
| Aggregate price | $P^d_{i} = \prod_{s}^{S} \left(P^d_{is}/\beta_s\right)^{\beta_s}$ | N |
| Trade balance | (1.14) | N |

| Endogenous variable | Notation | Dimension |
|---|---|---|
| Primary factor price | $w_{il}$ | N×L |
| Industry-level primary factor | $Z_{ils}$ | N×L×S |
| Cutoffs for undistorted firms by dest. | $a^*_{ijs}$ | N×N×S |
| Mass of firms by destination | $M_{ijs}$ | N×N×S |
| Mass of entrants | $H_{is}$ | N×S |
| Industry-level consumer price & demand | $P^d_{is}, Q^d_{is}$ | 2×N×S |
| Aggregate consumer price & demand | $P^d_{i}, Q^d_{i}$ | 2×N |

Table 1.4: Parameters used in simulations

| Parameter | Description | Value |
|---|---|---|
| $\alpha_{ls}$ | Factor intensities | $\begin{bmatrix} 0.7 & 0.3 \\ 0.3 & 0.7 \end{bmatrix}$ |
| $\beta_{is}$ | Expenditure shares | 0.5 ∀ i,s |
| $\sigma$ | Varieties' elasticity of substitution | 3.8 |
| $\kappa$ | Pareto's shape parameter | 4.58 |
| $\bar{Z}_{il}$ | Factor endowments | $\begin{bmatrix} 100 & 90 \\ 90 & 100 \end{bmatrix}$ |
| $\bar{a}_{is}$ | Pareto's location parameter | 1 ∀ i,s |
| $\delta_{is}$ | Exogenous probability of exit | 0.025 ∀ i,s |
| $f^e_{is}$ | Fixed entry cost | 2 ∀ i,s |
| $f_{ijs}$ | Fixed trade cost | 2 ∀ i,j,s |
| $\tau_{ijs}$ | Iceberg trade cost | Free trade: 1 ∀ i,j,s. Costly trade: 2 ∀ s ∧ i ≠ j; 1 ∀ s ∧ i = j |
| $\sigma_{l1}$ | Log-normal shape par. in sector 1 | For figure 1.3: [0, 0.5] ∀ l. For figure 1.4: 0 ∀ l |
| $\mu_{l1}$ | Log-normal location par. in sector 1 | For figure 1.3: $\left(\tfrac{1}{2} - \left(1 - \tfrac{\kappa}{\rho}\right)\alpha_{l1}\right)\sigma^2_{l1}$ ∀ l. For figure 1.4: [−0.5, 0.5] ∀ l |

Table 1.5: Factor intensities and misallocation measures used in counterfactuals

Column groups: number of firms (in 1995); factor intensities $\alpha$ (GO specification); inter-industry wedges $(1+\bar{\theta})$ and $\bar{\Theta}$ (HWA of firm-level wedges); corrected** intra-industry variances of log-wedges $\sigma^2$; corrected** intra-industry covariances of log-wedges; BKR's (2017) "compression" $\hat{\lambda}_s$* with its standard error.

| Sector | Firms | $\alpha_k$ | $\alpha_s$ | $\alpha_u$ | $(1+\bar{\theta}_k)$ | $(1+\bar{\theta}_s)$ | $(1+\bar{\theta}_u)$ | $\bar{\Theta}$ | $\sigma^2_k$ | $\sigma^2_s$ | $\sigma^2_u$ | $\sigma_{ks}$ | $\sigma_{ku}$ | $\sigma_{su}$ | $\hat{\lambda}_s$* | s.e. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Food | 1435 | 0.31 | 0.06 | 0.09 | 1.90 | 1.01 | 1.14 | 1.15 | 1.07 | 1.09 | 1.20 | 0.19 | 0.19 | 0.86 | 0.81a | 0.13 |
| Beverage | 142 | 0.36 | 0.06 | 0.06 | 1.05 | 0.98 | 1.14 | 1.33 | 0.90 | 0.76 | 0.75 | 0.00 | -0.07 | 0.49 | 0.79 | 1.74 |
| Tobacco | 9 | 0.73 | 0.02 | 0.04 | 1.67 | 1.64 | 0.39 | 1.28 | 0.53 | 1.24 | 1.62 | 0.28 | -0.34 | 0.94 | 0.76a | 0.02 |
| Textiles | 465 | 0.22 | 0.08 | 0.18 | 0.81 | 1.08 | 0.88 | 1.02 | 1.33 | 0.71 | 0.69 | -0.06 | 0.08 | 0.43 | 0.82 | 0.76 |
| Apparel | 944 | 0.23 | 0.10 | 0.17 | 1.25 | 0.40 | 0.26 | 0.72 | 1.27 | 0.65 | 0.61 | 0.11 | 0.16 | 0.29 | 0.87a | 0.04 |
| Leather | 118 | 0.32 | 0.12 | 0.16 | 1.38 | 1.00 | 0.47 | 0.73 | 0.89 | 0.73 | 0.46 | -0.01 | -0.06 | 0.46 | 0.84a | 0.09 |
| Footwear | 254 | 0.21 | 0.12 | 0.20 | 1.51 | 1.00 | 0.59 | 0.97 | 1.09 | 0.66 | 0.46 | 0.08 | 0.12 | 0.34 | 0.80 | 0.73 |
| Wood | 196 | 0.13 | 0.07 | 0.18 | 0.25 | 0.37 | 0.48 | 0.51 | 1.43 | 0.45 | 0.37 | 0.27 | 0.15 | 0.29 | 0.86a | 0.12 |
| Furniture | 270 | 0.18 | 0.11 | 0.25 | 0.70 | 0.27 | 0.32 | 0.50 | 1.45 | 0.40 | 0.40 | 0.12 | 0.01 | 0.20 | 0.85 | 0.58 |
| Paper | 170 | 0.21 | 0.09 | 0.18 | 0.64 | 2.40 | 2.62 | 1.17 | 0.94 | 0.80 | 1.10 | 0.05 | -0.03 | 0.68 | 0.79c | 0.44 |
| Printing | 434 | 0.23 | 0.15 | 0.26 | 1.02 | 0.83 | 1.62 | 1.02 | 0.74 | 0.50 | 0.50 | -0.05 | -0.09 | 0.20 | 0.85a | 0.03 |
| Chemicals | 177 | 0.37 | 0.07 | 0.08 | 1.23 | 1.96 | 1.77 | 1.08 | 1.43 | 0.78 | 0.76 | 0.11 | -0.06 | 0.54 | 0.83a | 0.06 |
| Other chemicals | 356 | 0.36 | 0.12 | 0.09 | 2.50 | 1.13 | 1.49 | 1.53 | 1.02 | 0.71 | 0.85 | -0.07 | -0.11 | 0.50 | 0.81 | 0.98 |
| Petroleum | 46 | 0.15 | 0.02 | 0.02 | 0.65 | 0.98 | 0.86 | 1.28 | 2.02 | 1.14 | 1.47 | 0.82 | 0.97 | 1.20 | 0.76a | 0.01 |
| Rubber | 93 | 0.20 | 0.12 | 0.22 | 0.63 | 2.01 | 1.64 | 1.05 | 0.68 | 0.61 | 0.48 | 0.20 | 0.20 | 0.33 | 0.83 | 1.24 |
| Plastic | 428 | 0.10 | 0.08 | 0.28 | 0.38 | 0.95 | 1.74 | 1.04 | 0.83 | 0.61 | 0.59 | -0.01 | -0.04 | 0.39 | 0.83a | 0.02 |
| Pottery | 13 | 0.27 | 0.13 | 0.30 | 1.16 | 1.19 | 1.38 | 1.11 | 0.18 | 0.46 | 0.73 | -0.06 | -0.08 | 0.56 | 0.80a | 0.01 |
| Glass | 82 | 0.26 | 0.29 | 0.12 | 0.91 | 4.59 | 0.70 | 1.38 | 0.97 | 0.53 | 0.49 | -0.15 | 0.02 | 0.33 | 0.80 | 2.72 |
| Other non-metallic | 365 | 0.21 | 0.07 | 0.14 | 0.46 | 1.36 | 1.11 | 1.05 | 1.28 | 0.72 | 0.91 | 0.02 | -0.01 | 0.64 | 0.80 | 2.59 |
| Iron and steel | 86 | 0.18 | 0.10 | 0.21 | 0.50 | 2.74 | 3.01 | 1.28 | 0.91 | 1.08 | 1.35 | -0.15 | -0.12 | 1.07 | 0.78a | 0.01 |
| Non-ferrous metal | 42 | 0.18 | 0.10 | 0.27 | 0.38 | 0.56 | 0.94 | 0.39 | 0.44 | 0.78 | 1.22 | -0.14 | -0.40 | 0.89 | 0.82a | 0.03 |
| Metal products | 664 | 0.21 | 0.12 | 0.17 | 1.09 | 1.20 | 0.72 | 0.99 | 1.27 | 0.58 | 0.55 | 0.09 | 0.08 | 0.39 | 0.84b | 0.35 |
| Mach. & equipment | 374 | 0.25 | 0.11 | 0.09 | 1.50 | 0.83 | 0.36 | 1.04 | 0.94 | 0.43 | 0.46 | 0.02 | 0.12 | 0.28 | 0.83a | 0.02 |
| Electric. / Profess. | 276 | 0.19 | 0.02 | 0.08 | 1.00 | 1.27 | 0.74 | 1.01 | 0.94 | 0.59 | 0.62 | 0.05 | 0.06 | 0.43 | 0.78 | 0.58 |
| Transport | 274 | 0.24 | 0.15 | 0.13 | 2.23 | 0.45 | 0.91 | 1.20 | 0.93 | 0.48 | 0.73 | 0.19 | 0.23 | 0.38 | 0.84a | 0.02 |
| One-sector | 7713 | 0.24 | 0.09 | 0.13 | 1.00 | 1.00 | 1.00 | 1.00 | 1.13 | 1.05 | 0.86 | 0.08 | 0.08 | 0.63 | 0.85a | 0.33 |

Notes: *Point estimates for $\lambda_s$ using Bils et al. (2017) (see Appendix A.2). Levels of significance: c p < 0.1, b p < 0.05, a p < 0.01. **"Corrected" values correspond to the product of the observed dispersion (after removing outliers and trimming 1% tails) and the corresponding value for $\lambda_s$. For non-significant values of $\lambda_s$, the value of the last row is used, a specification that controls for industry×years fixed effects.
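The correction rule described in the notes to Table 1.5 can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the "observed" dispersions are not reported in the table (so they are backed out here from the corrected values), and the significance flags simply follow the a/b/c markers shown in the $\hat{\lambda}_s$ column.

```python
def corrected_dispersion(observed, lam, significant, lam_pooled):
    """Scale an observed (co)variance of log-wedges by the compression factor.

    Rule from the table notes: use the sector's own lambda when it is
    statistically significant, otherwise fall back on the pooled
    (last-row, "One-sector") lambda.
    """
    factor = lam if significant else lam_pooled
    return observed * factor


def implied_observed(corrected, lam, significant, lam_pooled):
    """Invert the rule: back out the observed dispersion from a corrected one."""
    factor = lam if significant else lam_pooled
    return corrected / factor


if __name__ == "__main__":
    LAM_POOLED = 0.85  # lambda in the "One-sector" row of Table 1.5

    # (sector, corrected variance of capital log-wedges, lambda, significant?)
    rows = [
        ("Food", 1.07, 0.81, True),       # 0.81a: significant, own lambda used
        ("Beverage", 0.90, 0.79, False),  # 0.79 (s.e. 1.74): pooled lambda used
    ]
    for sector, corrected, lam, sig in rows:
        obs = implied_observed(corrected, lam, sig, LAM_POOLED)
        print(f"{sector}: implied observed variance = {obs:.3f}")
```

The fallback keeps imprecisely estimated sector-level compression factors from driving the corrected dispersion in sectors with few firms.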
Table 1.6: Counterfactuals

Change in each variable after removing factor misallocation in Colombia

| Counterfactual | Revenue ($\hat{R}_{Col}$) | Value added ($\widehat{GDP}_{Col}$) | Exports ($\hat{X}_{Col}$) | Exports/GDP* ($\Delta(X/GDP)_{Col}$) | RCA s.d.* ($\Delta\sigma^{RCA}_{Col}$) | Welfare ($\hat{E}_{Col}/\hat{P}_{Col}$) | Welfare, autarky ($[\hat{E}_{Col}/\hat{P}_{Col}]^{closed}$) |
|---|---|---|---|---|---|---|---|
| Baseline results | | | | | | | |
| Both types | 1.54 | 2.22 | 4.78 | 0.18 | 2.60 | 1.75 | 1.85 |
| Only intra-industry | 1.41 | 1.92 | 3.59 | 0.13 | 1.95 | 1.56 | 1.72 |
| Only inter-industry | 1.04 | 1.09 | 1.57 | 0.07 | 1.69 | 1.08 | 1.07 |
| Robustness: Both types | | | | | | | |
| Decreasing σ (to 3) | 1.59 | 2.35 | 5.22 | 0.19 | 2.68 | 1.90 | 1.99 |
| Increasing σ (to 4) | 1.50 | 2.14 | 4.51 | 0.17 | 2.69 | 1.67 | 1.76 |
| Decreasing κ (to 4) | 1.44 | 2.01 | 4.14 | 0.16 | 2.40 | 1.64 | 1.75 |
| Increasing κ (to 5) | 1.61 | 2.38 | 5.36 | 0.19 | 2.61 | 1.84 | 1.92 |
| One-sector | | | | | | | |
| Only intra-industry | 1.58 | 2.32 | 1.43 | -0.05 | - | 1.70 | 1.87 |

Note: Each cell shows the proportional change in each variable between the counterfactual equilibrium and the actual data. For variables marked by *, the simple difference in the measure is displayed.

1.6.2 Figures

Figure 1.1: Revealed comparative advantage (RCA) measures for Colombia
[Scatter plot with one marker per industry. Vertical axis: Exp-Ind FE by Poisson PML. Horizontal axis: Exp-Ind FE by EK-Tobit.]
Notes: Markers' sizes represent export shares, and the line the best linear fitting.

Figure 1.2: Cutoff functions and selection effects of distortions
[Two diagrams in the plane of $a$ (TFPQ) and $\Theta$ (TFPR).
Panel A: Cutoff functions for country i sector s*. Regions shown: exiting firms, producers only for domestic market i, and exporters to destination j. *For the domestic market and the destination j with lowest $\Lambda_{ijs}$.
Panel B: Selection effects of distortions. Regions shown: entry due to distortions and exit due to distortions, for firms producing to domestic market (A) and exporters (B).]

Figure 1.3: Effects of factor misallocation within industries on RCA and its determinants
[Six panels plotted against $\sigma_{l1}$: RCA: (1) + (2) + (3); (1) TFP; (2) Inverse of factor prices; (3) Mass of firms; Ex-post average wedge; Values of parameters. Series: wedges on fac. 1, costly trade; wedges on fac. 1, free trade; wedges on fac. 2, costly trade; wedges on fac. 2, free trade.]

Figure 1.4: Effects of factor misallocation across industries on RCA and its determinants
[Six panels plotted against $\mu_{l1}$: RCA: (1) + (2) + (3); (1) TFP; (2) Inverse of factor prices; (3) Mass of firms; Ex-post average wedge; Values of parameters. Series: wedges on fac. 1, costly trade; wedges on fac. 1, free trade; wedges on fac. 2, costly trade; wedges on fac. 2, free trade.]
Figure 1.5: Allocative efficient RCA and observed RCA for Colombia
[Four scatter plots with one marker per industry. Vertical axis: Counterfactual RCA. Horizontal axis: Observed RCA.
Panel A: Intra- and inter-industry allocative efficient RCA and observed RCA (observed export shares).
Panel B: Intra- and inter-industry allocative efficient RCA and observed RCA (counterfactual export shares).
Panel C: Only intra-industry allocative efficient RCA and observed RCA (counterfactual export shares).
Panel D: Only inter-industry allocative efficient RCA and observed RCA (counterfactual export shares).]
Notes: Each panel compares the RCA measures in the corresponding counterfactuals to the observed RCA measures. Markers' sizes represent the indicated export shares.
Figure 1.6: Colombian industries in the world distribution of RCA
[Two density plots. Vertical axis: Density. Horizontal axis: log(RCA).
Panel A: Distribution under observed data.
Panel B: Distribution under Colombia's efficient allocation.]
Note: Each vertical line represents the location of a Colombian industry in the RCA world distribution.

Figure 1.7: Changes in Colombian RCA and their causes
[Two scatter plots with one marker per industry.
Panel A: Change in RCA by removing intra-industry misallocation and within-industry variance of TFPR. Vertical axis: Change in RCA. Horizontal axis: Intra-industry variance of log TFPR.
Panel B: Change in RCA by removing inter-industry misallocation and sectoral TFPR for Colombia. Vertical axis: Change in RCA. Horizontal axis: Sectoral TFPR.]
Notes: Intra-industry variance of log TFPR in Panel A is constructed as the weighted average of the within-industry dispersion of the factors' MRP that face misallocation: capital, skilled and unskilled labor. Similarly, sectoral TFPR in Panel B is computed using only capital, skilled and unskilled labor as inputs.

Figure 1.8: Changes in determinants of Colombian RCA
[Horizontal bar charts by industry, in log points, decomposed into number of varieties, factor prices and average TFP.
Panel A: Changes in comparative advantage determinants.
Panel B: Changes in absolute advantage determinants.
Rows: 1. Removing intra- and inter-industry misallocation; 2. Removing only intra-industry misallocation; 3. Removing only inter-industry misallocation.]

Figure 1.9: Rankings of RCA for different values of κ and σ
[Two charts ranking industries by RCA (vertical axis).
Panel A: Changes in σ. Assumptions for sigma: Baseline (3.5), Sigma=3, Sigma=4.
Panel B: Changes in κ. Assumptions for kappa: Baseline (4.6), Kappa=4, Kappa=5.]

Figure 1.10: Welfare gains and export growth from gradual reforms
[Two line charts. Horizontal axis: % of reduction in misallocation. Series: Both types, Only intra-industry, Only inter-industry.
Panel A: Welfare gains.
Panel B: Export growth.]

Chapter 2

Barriers to Mobility or Sorting? Sources and Aggregate Implications of Income Gaps across Sectors and Locations in Indonesia

2.1 Introduction

Large and persistent gaps in average incomes of agricultural and non-agricultural workers in developing countries have been well documented. What exactly accounts for these gaps is still debated, however. A common view is that the gaps exist because workers cannot arbitrage them away due to broadly understood barriers to mobility across sectors and locations. Such barriers would suggest that labor is inefficiently allocated. An opposing view, recently gaining influence, is that the gaps simply reflect an efficient sorting of heterogeneous workers based on their observable and unobservable characteristics.
The goal of this chapter is to evaluate the contribution of mobility barriers and sorting to the observed income gaps in Indonesia and to assess whether these gaps are a source of aggregate inefficiency.

The relative plight of agricultural workers compared to non-agricultural workers in developing countries is at first sight staggering. For example, wage workers outside of agriculture earn 80% more than workers in agriculture in the median country among the 13 studied by Herrendorf and Schoellman (2018). Similarly large gaps emerge when workers are split according to their place of residence rather than the sector of their occupation. For example, urban households in Vietnam in 1998 had real per capita consumption twice as high as rural households (Nguyen et al. (2007)). Such gaps could in principle merely reflect differences in the composition of the workforce. For example, to the extent that urban workers are typically better educated than rural workers, the differences in their average wages could simply be picking up the return to the additional human capital of urban workers. But studies for various developing countries find evidence that substantial rural-urban gaps remain after factoring out the effect of differences in schooling achievement and other observable individual characteristics (see, e.g., Hnatkovska and Lahiri (2016) for India and Qu and Zhao (2008) for China). These residual gaps have been documented for wages, broader measures of income, expenditure, and consumption. Similar gaps have also been identified in a parallel literature using value added data to compare productivity of workers across sectors. Gollin et al.
(2014) show, using a wide sample of countries, that workers in non-agriculture are twice as productive as workers in agriculture, after taking into consideration the differences in hours worked, schooling and quality of schooling between rural and urban areas.[42] Given the ubiquity of large residual gaps estimated using different countries, measures, and methodologies, they appear to be a real phenomenon rather than merely a measurement artifact.

It is therefore a puzzle why such gaps persist. Why do workers not switch to sectors and locations offering higher income to workers with their observable characteristics, eroding the premia? There are two main hypotheses in the literature. The first one is that the gaps are a manifestation of barriers to mobility. To the extent that these barriers are at least partially induced by policies, this view implies that labor is misallocated. Given the magnitude of the gaps, there are potentially large aggregate efficiency gains from mitigating the mobility frictions. In this spirit, Restuccia et al. (2008) calculate that distortions to the allocation of labor between agriculture and non-agriculture play an important role in explaining cross-country income differences.

An alternative explanation for the residual income gaps is that they are a result of sorting of workers across sectors or locations based on characteristics known to them but not observed by researchers. For example, a positive urban premium can be observed if workers choosing urban locations have on average more unobservable skills than rural workers, conditional on their educational attainment. This mechanism is the explanation of the urban premium recently proposed by Young (2013), who builds on an adaptation of the Roy (1951) model by Lagakos and Waugh (2013).
Importantly, in this view residual gaps across sectors or locations can exist despite the allocation of labor being efficient.

[42] Herrendorf and Schoellman (2015) caution that such gaps might overestimate true productivity differences due to potential measurement problems in agricultural value added.

Given the different implications of the two canonical explanations of income gaps for allocative efficiency, it is important to know which view is a better description of reality. A major shortcoming of the existing literature accounting for income gaps is that it relies on cross-sectional data. But as is well known following Heckman and Honoré (1990), the estimation of selection models using only cross-sectional data faces identification challenges. Existing studies therefore need to rely on functional form assumptions (Bryan and Morten (2018)) or on indirect ways of detecting sorting (Young (2013)). In this chapter, we argue that augmenting a standard model of sorting by including barriers to sectoral mobility requires longitudinal data to identify the parameters of interest, even when imposing functional form assumptions. We exploit the panel dimension of a dataset collected in Indonesia to provide more direct evidence of the extent of barriers to sectoral mobility in a context of self-selection.

The Indonesia Family Life Survey (IFLS, Strauss et al. (2016)) we use is uniquely well suited to our goals. First, it is a longitudinal survey spanning a relatively long period of time, with five waves of the survey conducted between 1993 and 2014. Second, a feature of the survey design and implementation is that it exerts particular effort to track households and individuals even if they migrate, a critical feature for a country undergoing a process of urbanization. Third, the IFLS records a rich set of socio-economic information on surveyed individuals.
Fourth, with about 20000 surveyed individuals, it is a large survey representative of more than 80% of the Indonesian population. Fifth, with a roughly 40%/60% split between the agricultural and non-agricultural workforce, Indonesia is a relevant setting to investigate the gaps across boundaries traditionally used for developing countries. Finally, being the fourth most populous country in the world, Indonesia is an important country to study in its own right.

We begin our analysis in the next section by documenting some robust features of the Indonesian data. Just like in other developing countries, the IFLS data show the existence of a large income gap across sectors in Indonesia. Controlling for observable worker characteristics, workers outside of agriculture earn 67% more than workers in agriculture. This is the non-agriculture premium we want to understand better.

Importantly, this premium already conditions on the rural vs. urban location. Much of the literature tends to associate rural employment with agriculture and urban employment with non-agriculture. This implicit isomorphism might lead to an intuition that a non-agricultural premium is to be expected even in the absence of sorting or frictions because it compensates workers for the real cost of rural-to-urban migration. We find that logic to be misguided. In Indonesia, 45% of the rural workforce has primary employment outside of agriculture and 11% of the urban workforce is employed primarily in agriculture, so we can meaningfully separate the non-agricultural and urban premia. In this chapter we emphasize the sectoral dimension more because most of the rural-urban residual income gap can be accounted for by differences in the sectoral composition of rural and urban areas combined with the large non-agriculture premium.
The direct urban premium, estimated at 26% in the cross-section of workers, while not trivial, is substantially smaller than the 67% non-agriculture premium.

Moving beyond these cross-sectional premia, we exploit the panel structure of our data by relying on within-worker variation in income across sectors and locations. This approach follows Katz and Summers (1989) and the subsequent long tradition of estimating inter-industry wage differentials in developed countries. In Indonesia, it reduces the residual gaps roughly by half, to 29% for non-agriculture and 9% for urban locations. Digging even deeper, we compare the income growth of workers moving out of agriculture relative to those staying in agriculture, and of workers moving out of non-agriculture relative to those who stay employed in non-agriculture. Because we also have detailed migration data, we can do this calculation even conditional on staying in the same very narrowly defined geographical areas (village level). Perhaps our most surprising finding is that the non-agriculture premium exists even within such local markets and that it is approximately symmetric for switches in both directions. Workers who move out of agriculture see an income gain of 19% while those who move into agriculture see a loss of 19%, even if they stay in the same village. The reported premia are robust to a host of concerns about sample selection, estimation method, and measurement issues.

The fact that half of the non-agriculture premium disappears after controlling for time-invariant unobserved heterogeneity informally suggests that sorting does indeed occur and is important. The question is whether the 19% average excess income gain received by a worker who switches from agriculture to non-agriculture can be reconciled with an efficient sorting based on comparative advantage alone. In principle, it can.
This is because the industry premia, even estimated using within-worker variation, have by themselves little empirical content. We show this by extending a standard model of self-selection based on both permanent and transitory components of comparative advantage to include different types of barriers to sectoral mobility. In particular, we consider utility costs of switching sectors (Dixit and Rob (1994); Cameron et al. (2007); Artuç et al. (2010); Dix-Carneiro (2014)) and frictions preventing individuals from working in their preferred sectors (akin to search costs as in Taber and Vejlin (2016)). We demonstrate that the same cross-sectional and within-worker non-agriculture premia can be rationalized by different combinations of comparative advantage shock processes and barriers to mobility. In particular, by picking the right covariance matrix for the transitory component of comparative advantage we can generate large within-worker premia in the absence of any barriers. Similarly, we can have large barriers to mobility despite observing zero non-agricultural premia. The premia alone cannot tell us whether there is any worker misallocation or not.

The comparative advantage process and barriers to mobility can be separately identified once we impose some parametric structure and exploit a richer set of moments of the joint sector-income distribution over time. We use indirect inference (Gourieroux et al., 1993) for the structural estimation of our model, where the selected auxiliary models are the main reduced-form regressions that characterize the data features we are interested in and that allow us to identify the full set of structural parameters, including the mobility barriers.

Our findings suggest that both types of barriers (utility switching costs and the inability to select the preferred sector) significantly improve the overall fit of the model compared to the frictionless specification.
Both are able to match qualitatively, and simultaneously, the sectoral premia and the patterns of the moments of the joint distribution of income. For the switching costs specification, we estimate opposite signs for the switching costs away from and towards agriculture. This pattern is observationally similar to receiving a positive compensating differential for working in agriculture. If we assume that the choice of a sector is always voluntary (but switching is costly), then the model uses utility compensation for moving to agriculture to rationalize why so many workers make the move despite taking an income cut.

A considerably better fit to the data, however, is offered by the model which recognizes that not all sectoral transitions are voluntary. In fact, our central estimate implies that half of the transitions between non-agriculture and agriculture we see happen for random reasons (these can be interpreted as life events forcing an individual to switch the sector of employment) rather than in response to shocks to the comparative advantage. Once a worker lands in her sub-optimal sector, moving to the preferred sector is difficult as it requires a lucky draw. Given its superior empirical performance, our preferred model relies on this type of mobility friction.

The barriers to mobility are quantitatively important. To make this point, we conduct a counterfactual in which the frictions are removed entirely from our baseline model. This thought experiment is standard in the misallocation literature, though of course extreme because we do not know how the frictions could be completely eliminated in practice. With that caveat in mind, removing all barriers to intersectoral mobility would result in a large reallocation of workers. Overall, 30% of the workforce would work in a different sector than in the baseline equilibrium.
Since the initially misallocated workers reap large income gains from the reallocation (their income doubles on average), the adjustment has a sizable effect on aggregate output, raising it by 17%. Agricultural employment contracts by nearly 6 p.p., but output and productivity increase by double digits in both sectors.

Among the large literature on the income gap between agriculture and non-agriculture, the most closely related work consists of a handful of papers that also exploit individual-level panel information for developing countries. Beegle et al. (2011) offer early evidence of large within-individual gains in Kenya, but their focus is on consumption gains from migration rather than the more puzzling income gains from sector switching conditional on not migrating. Perhaps the closest, in concurrent work Hicks et al. (2017) also use the IFLS and find a smaller within-individual non-agricultural premium. As we explain in section 2.11, our substantive differences stem from different data selection and from focusing on different measures of interest. More importantly, we argue that the non-agricultural premium by itself is not necessarily an informative statistic, and we estimate a structural model that allows us to quantitatively evaluate the importance of barriers to sectoral mobility.

While our results are non-experimental and the magnitudes we report depend on the structural assumptions we make, we believe our key finding of barriers to mobility is also broadly consistent with the limited existing experimental evidence. In a randomized small-scale setting, Bryan et al. (2014) find substantial gains from inducing workers in Bangladesh to work outside their village, though again the focus is on consumption gains from migration, making direct comparison difficult. More closely related, Sarvimaki et al.
(2018), using a natural experiment in Finland, find large income gains for workers who abandoned farming as a result of forced migration.

2.2 Data

In this section we describe the data, only highlighting the features of the dataset most relevant for our analysis. Comprehensive details about the design and implementation of the IFLS are reported in Strauss et al. (2016).

Our primary source of data is the Indonesia Family Life Survey. The first IFLS was conducted in 1993, with subsequent waves in 1997, 2000, 2007, and 2014. From the outset the IFLS was designed as a long-term panel survey, which allows us to compare life trajectories of individuals making different occupational and locational choices. Furthermore, the IFLS puts considerable effort into tracking individuals over time. This feature is rare among longitudinal household surveys in developing countries, which typically lose respondents who move out of an original survey area. As a measure of tracking success, Thomas et al. (2012) report that the 2007 IFLS managed to interview 87% of individuals who were eligible to be tracked. Tracking movers is crucial for drawing conclusions from a comparison of migrants and stayers when the decision to migrate is not random.

The IFLS is a large-scale survey, conducted in 13 of the 27 Indonesian provinces. Because the ones excluded are mostly outlying provinces, the sample is representative of 83% of the Indonesian population. The first wave interviewed 22019 individuals and the number of respondents grew to 58337 in the fifth wave. In our analysis we restrict attention to adults (15 years or older) who are employed and therefore answer the detailed work module of the survey.
The definition of employed is expansive and comprises all persons who answered affirmatively to any of the following categories: i) their primary activity during the past week was working, trying to work or helping to earn income; ii) they had worked for pay at least 1 hour during the past week; iii) they had a job or business, but were temporarily not working during the past week; iv) they had worked at a family-owned (farm or non-farm) business during the past week.

For those individuals, the dataset we construct records their annual income, the sector where they worked according to the job that consumed the most time, years of schooling, work experience by sector and standard demographic characteristics such as age and gender. In addition, we use information on the household location in each survey wave and the movements recorded in the migration module of the survey to construct individual location histories at various levels of administrative detail.

Our main outcome variable of interest is annual income. The annual income can be derived from wages, from net profits of a business (such as a farm), or from other sources such as government transfers. We believe that total income is the appropriate measure in a setting where work on a family farm is pervasive and where half of the workforce does not report any wage work.

Following a standard distinction for developing countries, we split locations according to whether they are rural or urban. The rural-urban status of each survey location is determined by the Indonesian Central Bureau of Statistics (BPS) based on multiple criteria. Along the sectoral dimension, we classify workers as employed either in agriculture or in non-agriculture, the latter comprising all other sectors.43

Table 2.1 reports descriptive statistics for the constructed dataset. Overall we have 85869 observations for 38112 individuals.
In our analysis below we focus on the 22829 individuals whom we observe in at least two waves of the survey, for a total of 70586 observations.44

43 This two-sector partition is common in the macro-development literature and is sufficient to illustrate the puzzle of low agricultural incomes. We have also divided non-agriculture further into manufacturing and services. The income gaps between manufacturing and services are small relative to the gaps between those two sectors and agriculture.
44 Depending on the specification, the effective sample size can be smaller, as we do not observe all variables for all individuals.

2.3 Income gaps across sectors and locations

2.3.1 Baseline results

In this section we present the key patterns of income gaps across sectors and locations in Indonesia. The gaps are estimated using Mincerian regressions with the following general form:

\[ \ln y_{islt} = X_{it}\beta + D_N + D_U + D_i + \varepsilon_{islt}, \tag{2.1} \]

where $y_{islt}$ denotes the income of an individual $i$ working in sector $s$ (agriculture or non-agriculture), living in location type $l$ (rural or urban) in year $t$. $X_{it}$ collects standard individual covariates such as sex, years of education, experience and experience squared, as well as year and province dummies. $D_N$ and $D_U$ capture the non-agriculture and urban premia of interest, while $D_i$ captures the time-invariant component of individual heterogeneity.

The baseline specification is a reduced-form relationship between income and certain observable and unobservable worker characteristics. If workers switch between sectors randomly, then the $D_N$ premium has a simple interpretation as the average gain that a worker can get by moving from agriculture to non-agriculture.
If, on the other hand, workers sort across sectors (and locations) based on their unobserved comparative advantage as in Roy (1951), then the premia estimated using equation (2.1) need not have a simple interpretation and a structural model is needed for an exhaustive analysis. While our argument in this chapter is that sorting is indeed important and we therefore estimate a structural model later, we begin by discussing the reduced-form OLS estimates, as they have a long tradition and will be used as auxiliary models in our structural estimation.

As a starting point we estimate equation (2.1) without any controls except for the sector dummies.45 This specification simply compares average incomes across sectors and, as can be seen in the first column of Table 2.2, these incomes vary greatly. Compared to agriculture, incomes in non-agriculture are on average 84 log points [lp] (or 131%) higher.46,47 The second column compares urban and rural incomes. The urban premium stands at a similarly dramatic 65 lp (or 91%). A natural question is whether the urban and sectoral premia capture the same variation in the data.

Many studies take a dichotomous view of economic activity in developing countries. A classical divide in the development literature goes along the rural vs. urban dimension. Macroeconomists tend to work with sectoral data and hence use the agriculture vs. non-agriculture split. But both literatures often implicitly treat the two partitions as interchangeable, for example by associating structural transformation (a decline in the agricultural employment share) with urbanization (an increase in the urban share). The joint distribution of workers across sectors and locations shown in Table 2.1 suggests that such interchangeability is too crude in Indonesia. In 2000 (around the middle of our sample period) the share of rural workers, at 59%, was considerably higher than the 37% share of agricultural workers.
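The log-point figures quoted above map into exact percentage differences through the exponential function. The snippet below is a minimal sketch of that conversion, assuming the reported coefficients are differences in log income multiplied by 100; the function name `lp_to_percent` is ours and purely illustrative.

```python
import math


def lp_to_percent(log_points: float) -> float:
    """Convert a gap in log points (100 x difference in logs) to an exact percentage difference."""
    # expm1(x) computes exp(x) - 1 accurately, including for small x
    return 100.0 * math.expm1(log_points / 100.0)


# Raw sectoral and urban gaps quoted in the text (in log points)
for lp in (84, 65):
    print(f"{lp} lp -> {lp_to_percent(lp):.1f}%")
```

Applied to the rounded coefficients of 84 lp and 65 lp, the conversion gives values close to the 131% and 91% reported in the text; small discrepancies arise because the text converts unrounded coefficients.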
Among rural workers, 45% had primary employment outside of agriculture, while 11% of the urban workforce was employed in agriculture.

So can the raw urban premium be explained by the different composition of sectors in rural and urban locations, or are urban workers paid more in the same sectors? Column 3 of Table 2.2 estimates the urban and sectoral premia jointly. Controlling for sectors reduces the urban premium almost by half, yet it is still high at 41 lp. Controlling for the type of location has a smaller impact on the sectoral premium, which remains at 69 lp. These numbers are the first indication that sector of employment might have a stronger direct effect on income than place of residence.

This point is further strengthened by controlling for individual worker characteristics in the Mincer regression. Column 4 shows an urban premium of 21 lp and a non-agriculture premium of 57 lp. Controlling for observables reduces the urban premium by half once again, while the sectoral premium again changes much less. These residual income gaps (controlling for observables) are also about as much as can be calculated with cross-sectional data. They therefore correspond most directly to the gaps calculated in other studies.

45 In all specifications we control for year and province fixed effects. Observations are weighted by their longitudinal survey weights and standard errors are clustered at the level of primary sampling units of the survey.
46 Because the coefficients of interest are often large in magnitude, we report them directly in log points and only occasionally translate them to exact percentage differences.
47 Reported coefficients are statistically significant at the 5% level or lower unless mentioned otherwise.

Using the panel structure of our data we are in a position to begin addressing the issue of sorting on unobservables. The specification in column 5 adds worker fixed effects to the set of controls.
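To see why worker fixed effects matter when workers sort on unobservables, consider the stylized simulation below. It is a sketch under strong assumptions and not our estimation code: there is a single regressor (a non-agriculture dummy), ability is fixed over time, abler workers are more likely to work in non-agriculture, and all parameter values and function names are hypothetical.

```python
import math
import random


def simulate_panel(n_workers=2000, n_waves=4, true_premium=0.33, seed=0):
    """Hypothetical panel: log income = premium * D + time-invariant ability + noise."""
    rng = random.Random(seed)
    panel = []  # one list of (D, log_income) pairs per worker
    for _ in range(n_workers):
        ability = rng.gauss(0.0, 1.0)
        # Sorting: abler workers are more likely to be in non-agriculture
        p_nonag = 1.0 / (1.0 + math.exp(-ability))
        history = []
        for _ in range(n_waves):
            d = 1.0 if rng.random() < p_nonag else 0.0
            log_income = true_premium * d + ability + rng.gauss(0.0, 0.5)
            history.append((d, log_income))
        panel.append(history)
    return panel


def slope(pairs):
    """OLS slope of y on x with an intercept."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def pooled_premium(panel):
    """Cross-sectional (pooled OLS) premium, contaminated by sorting on ability."""
    return slope([obs for history in panel for obs in history])


def within_premium(panel):
    """Fixed-effects premium: demean D and log income within each worker."""
    demeaned = []
    for history in panel:
        mean_d = sum(d for d, _ in history) / len(history)
        mean_y = sum(y for _, y in history) / len(history)
        demeaned.extend((d - mean_d, y - mean_y) for d, y in history)
    return slope(demeaned)


panel = simulate_panel()
print(f"pooled OLS premium:    {pooled_premium(panel):.2f}")
print(f"within-worker premium: {within_premium(panel):.2f}")
```

In this artificial setting the pooled estimate is inflated by sorting on ability, while the within-worker estimate is close to the premium used to generate the data. The sketch deliberately rules out time-varying comparative advantage, which is the case that complicates the interpretation of within-worker premia.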
Using only within-worker variation to identify the gaps reduces the urban premium by more than half, to 8 lp. While not trivial, a 9% additional income gain associated with moving from a rural to an urban location while keeping the same sector of employment is not shocking either. In contrast, the non-agriculture premium is still surprisingly large. The same worker switching from agriculture to non-agriculture without changing rural-urban status sees on average an additional income gain of 33 lp (or 39%). Column 6 paints a similar picture using a slightly more flexible specification with a full set of interactions between sector and urban dummies. Staying in a rural area and switching away from agriculture gives an income boost of 33 lp. Sectoral gaps of this magnitude are hard to explain without thinking about some barriers preventing workers from moving out of agriculture despite better opportunities in other sectors.

Because the premia estimated on switchers are the most novel and surprising, we now explore the mobility pattern in our data more carefully. The first panel of Table 2.3 presents the count of wave-to-wave transitions between sectors and the third panel shows the associated transition matrix.48 About 20% of workers in agriculture transition to non-agriculture between survey waves, and 12% of workers in non-agriculture switch to agriculture. Overall, 24% of workers change sector at least once while in our sample. The fact that there are almost as many cases of workers moving into agriculture as cases of workers moving out of agriculture is puzzling in light of the large negative premium associated with working in agriculture. The second and fourth panels of Table 2.3 record analogous transitions along the rural-urban dimension. There is less mobility between rural and urban areas, with a change in location status in 9% of cases. About 17% of workers move between rural and urban locations at least once while in our sample.
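Transition counts and a transition matrix of the kind just described can be built from individual histories as in the sketch below. The histories are made up for illustration, and the labels "A" (agriculture) and "N" (non-agriculture) are our shorthand rather than IFLS codes.

```python
from collections import Counter


def transition_counts(histories):
    """Count transitions between consecutive observations of each worker."""
    counts = Counter()
    for history in histories:
        for origin, destination in zip(history, history[1:]):
            counts[(origin, destination)] += 1
    return counts


def transition_matrix(counts, states=("A", "N")):
    """Row-normalize the counts: share of workers in each origin moving to each destination."""
    matrix = {}
    for origin in states:
        row_total = sum(counts[(origin, dest)] for dest in states)
        matrix[origin] = {
            dest: (counts[(origin, dest)] / row_total if row_total else 0.0)
            for dest in states
        }
    return matrix


def share_ever_switching(histories):
    """Share of workers observed in more than one state."""
    return sum(len(set(h)) > 1 for h in histories) / len(histories)


# Hypothetical sector histories, one list per worker, ordered by survey wave
histories = [
    ["A", "A", "N"],
    ["N", "N", "N"],
    ["N", "A", "A"],
    ["A", "N"],
    ["A", "A", "A", "A"],
]

counts = transition_counts(histories)
matrix = transition_matrix(counts)
print(counts)
print(matrix)
print(share_ever_switching(histories))
```

Note that the transition shares are computed over pairs of consecutive observations, while the share of workers who ever switch is computed over individuals, which is why the two kinds of statistics need not coincide.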
As expected in a developing country, there are more than twice as many transitions from rural to urban as in the opposite direction, resulting in net migration to urban areas.

48 These are not year-to-year transitions but transitions between two consecutive observations for each worker. The time between the waves of the survey varies from two to seven years.

The sectoral and urban premia reported so far show an average effect of moving in and out of the sector or location. We now reevaluate the income gaps while taking the direction of transitions into account. The estimating equation now takes the form

\[ \Delta \ln y_{islt} = \Delta X_{it}\beta + \Delta D_{ss'} + \Delta D_{ll'} + \Delta\varepsilon_{islt}, \tag{2.2} \]

where $\Delta D_{ss'}$ and $\Delta D_{ll'}$ capture the direction of sectoral and locational transition. Results are reported in the first column of Table 2.4. Along the locational dimension, workers who move from rural to urban areas see an income increase of 9 lp relative to those who stay in rural areas. Workers who move into rural areas have an income shortfall of 16 lp relative to those who stay in urban areas. Results for the non-agriculture premium are once again even stronger. Relative to workers who remain in agriculture, workers switching out of agriculture see an additional income growth of 22 lp. Workers who switch from non-agriculture to agriculture see an income loss of 33 lp relative to workers remaining in non-agriculture.

So far we have established the existence of a significant income premium for working in non-agriculture, controlling for movements between rural and urban locations. But it is still conceivable that movements within rural and urban locations, if correlated with sector switching and having an independent effect on income, might bias the estimates of the sectoral premium. Now we isolate geographic mobility completely using the detailed migration information provided by our dataset.
We interact the direction of sectoral transition variable $\Delta D_{ss'}$ with an indicator for whether a worker migrated across a village boundary (or a correspondingly fine location unit for cities). The second column of Table 2.4 displays the results, with workers staying in agriculture and staying within a village as the reference category. Workers who migrate and move out of agriculture have the largest income gains; workers who migrate and move into agriculture suffer the largest relative income losses. But perhaps the most striking results are for workers who do not migrate: those who switch out of agriculture gain an additional 20 lp in income relative to those who remain in agriculture. Those switching into agriculture see an income loss of 26 lp relative to non-movers who remain employed in non-agriculture.

That such a large non-agricultural premium can be identified from within-worker sector switches within very narrow geographical areas is truly surprising. Moreover, it is not easily reconciled with the workhorse models of labor markets in developing countries. If workers are sorting across sectors according to comparative advantage that is fixed over time and switching is costless, then we should not expect to see a large premium for switchers, and we should expect flows to be in one direction only. If switching is costly and occurs only if the income gain justifies incurring the mobility cost, then we should see a positive premium regardless of the direction of the voluntary switch. In contrast, we see workers switching to agriculture taking systematic cuts to their income of a magnitude similar to the gains of workers switching in the opposite direction. Thus it seems that there is a pure premium associated with working in non-agriculture. There are several possible rationalizations for this finding. First, it might suggest the existence of some additional friction that allows this premium to exist in equilibrium.
We explore both asymmetric switching costs and compensating differentials as possible rationalizations of the observed choices. In the case of compensating differentials, workers simply attach a higher non-monetary value to working on a farm than to other jobs. Our concern is that, given the harsh realities of farm work in developing countries, this explanation is not quite compelling. For that reason, we also consider an alternative friction that results in workers switching sectors involuntarily. Finally, as we argue later, the premium can arise even in a frictionless setting when switching happens because of idiosyncratic shocks to comparative advantage over time.

2.3.2 Robustness

In this subsection we illustrate that the existence of a non-agriculture premium is robust to a number of concerns about measurement, interpretation and estimation. Our baseline point of reference is the 57 lp cross-sectional premium and the 33 lp within-worker premium reported earlier in columns 4 and 5 of Table 2.2.

Job type

The first exercise incorporates information on the type of job workers engage in, as this helps to illuminate the nature of labor markets in Indonesia. Workers in the IFLS can be consistently classified into 4 categories: self-employed, private workers, government workers and unpaid family workers. As Table 2.5 reports, self-employment is the most common work status, accounting for almost half of employment. Private sector workers earning wages and salaries (a category that would usually be the focus in studies based on developed countries) constitute less than a third of the workforce. Almost 15% of workers, who typically help in household work or in a family business or farm, are classified as unpaid family workers. These workers can nevertheless report income and are included in the analysis, but our results are robust to dropping this category altogether. The second panel of Table 2.5 also reports the 10 most common occupations.
The point of this table is to show what non-agriculture typically means in Indonesia. It is more about being a self-employed street vendor than about having a formal factory job in manufacturing.

Controlling for the job type has a small impact on the non-agriculture premium, e.g., reducing it from 33 lp to 29 lp in the worker fixed effects regression. More interestingly, Table 2.6 reports the results of interacting job type with the direction of the switch. For the two main categories, self-employed and private workers, there is about a 25 lp premium for switching to non-agriculture relative to staying in agriculture. Workers switching away from non-agriculture suffer a loss of similar magnitude relative to workers remaining in non-agriculture. The similarity of results for self-employed and wage workers may come as a surprise. The non-agriculture premium for wage workers could in principle be rationalized along similar lines as the intersectoral or even inter-firm wage differentials documented for developing countries. There might be good non-agricultural jobs that pay more than bad agricultural jobs because employers in non-agriculture for some reason share rents with their employees. But such a rent-sharing explanation would be silent as to why we see a similar premium for self-employed workers switching sectors, since they are the residual claimants of their effort. The sectoral premium for the self-employed is thus perhaps our most surprising finding.

Going back to the earlier discussion, the non-agriculture premium could reflect compensating differentials if, e.g., workers value the flexible schedule associated with farm work. But the fact that the premium exists for the self-employed in both sectors makes compensating differentials less compelling as an explanation. Going one step further, we can show that a premium of the same magnitude exists even for self-employed workers switching sectors while staying in the same narrow location.
We do not find it plausible that workers willingly give up 25% of their income because they prefer to run a farm rather than a non-farm business in the same village.

Wages and consumption

While our preferred outcome variable is annual income, there can be concerns about the quality of that self-reported measure. The problem could be particularly stark for the self-employed, who often have to allocate family business income to individuals. As a robustness check we now restrict attention to annual wage income, which is less likely to suffer from measurement problems. Doing so comes at the expense of restricting the sample by more than half, to individuals who work for wages in the private or government sector. Table 2.7 illustrates that the same pattern of premia can be observed using data for wages as for total income, though the magnitudes are a little smaller. Controlling for worker fixed effects, the non-agriculture premium is 23 lp, while the urban premium is 11 lp. Despite the significantly reduced sample size, the premia are still precisely estimated.

Since the IFLS records consumption expenditure, it offers an additional way of verifying that working in non-agriculture allows a higher standard of living. One drawback of consumption data in the present context is that it is recorded at the household level, whereas the focus of the chapter is on individual decisions. This requires some adjustments to make the results comparable. The first column of Table 2.8 reports results of a household-level cross-sectional regression of log per capita expenditure (Log PCE) on a continuous variable measuring the share of household income derived from non-agriculture and an urban dummy. Column 4 reports a corresponding calculation for per capita household income. Households that derive a higher share of income from non-agriculture have higher per capita consumption, though the elasticity is not as large as for income.
The rest of Table 2.8 reverts to individual-level regressions, but with dependent variables still at the household level. Column 3 results indicate that if a member of a household moves from agriculture to non-agriculture, then average consumption in the household increases by over 7 lp. This might appear to be a modest number compared to the baseline income premium, so two comments are in order. First, since a survey worker typically accounts for less than 60% of income in his household, the coefficient should be scaled by the inverse of that share to be interpretable as an increase in consumption associated with all household workers switching to non-agriculture. This transformation would increase the non-agriculture consumption premium to about 13 lp. To illustrate that this transformation is reasonable, column 6 performs it on the per capita income variable. The transformed coefficient of 35 lp is very close to the baseline non-agricultural premium. Second, in similar specifications the consumption premium is still only 1/3-1/2 as large as the income premium. In light of permanent income logic, perhaps it should not be surprising that an income shock associated with switching sectors has only partial pass-through to consumption.

Heterogeneity in Mincerian returns

The baseline regressions control for standard Mincerian determinants of income such as education and experience. The coefficients on these determinants do not vary between sectors and rural/urban locations, however. A recent paper by Herrendorf and Schoellman (2018) argues that this might lead to an overstatement of the residual income gaps if, e.g., non-agriculture offers higher returns to education and experience. To address this concern, we now allow the Mincerian returns to vary by sector and location.
Table 2.9 reports the associated premia, calculated as the average marginal effects of switching for the population.49 While some underlying returns do indeed differ by sector, this has no significant effect on the estimated premia of interest.

Additional jobs and home production

Workers are assigned to a sector according to whether their main job is in agriculture or non-agriculture. Correspondingly, the annual income is constructed using the income from the main job. Some workers, however, have more than one job. If having a secondary job is more common for agricultural and rural workers, then we might overestimate the non-agriculture and urban premia. Columns 3 and 4 of Table 2.10 show the premia estimated when we instead take into account income from both the worker’s primary and secondary jobs. This adjustment reduces the premia by about a fifth.50

Another concern is that by focusing on income we are not taking into account home production, which is not trivial in developing countries. If agricultural households do not include food produced and consumed in-house in their income, then this could lead to an overstatement of the non-agriculture premium. The IFLS data allow us to assess how important this consideration is because the survey asks households to report the value of goods and services produced for own consumption. The average share of self-produced consumption is about 10%, but is predictably higher in rural areas (13%) than in urban areas (7%). As a robustness check we therefore scale up individual incomes (from both main and secondary job) by the inverse of the share of self-produced consumption in the household the individual belongs to. This effectively increases incomes of workers in rural and predominantly agricultural households. As columns 5 and 6 report, this has little effect on the estimated premia. Columns 7 and 8 consider an adjustment even more favorable for agriculture: scaling incomes by the inverse of the share of home-produced food in total food consumption.
This again does not affect the estimated non-agriculture premium much, though the urban premium becomes insignificant.

49 Results are similar if we calculate the average marginal effects for switchers instead.
50 Using a continuous measure of the share of income a worker derives from non-agriculture instead of a dummy for the primary job leads to similar results, with a cross-sectional premium of 47 lp and a premium of 26 lp in the specification with worker fixed effects.

Hours worked

All the results so far show that workers in agriculture have lower annual income than workers in non-agriculture. One natural question is to what degree this income difference is driven by systematic differences in labor supply across sectors. To investigate this issue, Table 2.11 adds hours worked per year to the set of individual controls. Controlling for hours worked reduces the non-agriculture premium by about a fifth. In particular, a comparison of columns 2 and 4 shows that the premium identified from switchers falls from the baseline level of 33 lp to 27 lp. This reflects the fact that workers in non-agriculture work more hours, as illustrated in Table B.2 in the Appendix. Column 2 of that table shows that the same workers supply on average 15% more hours when they switch to non-agriculture.

Whether one should actually condition on hours worked in calculating the sectoral premia can be debated. The answer depends on the interpretation one wants to give to the premia and on the reason hours differ across sectors. In this chapter, the non-agricultural premium is meant to capture the increase in annual income that can be expected by a worker switching away from agriculture. To the extent that the switch is associated with higher labor supply, this increase in hours should be included as part of the benefit of switching. Our baseline measure therefore does not control for hours.
In our view, the premium calculated in this way is a more interesting object than a premium netting out the effect of hours. The reason is that the sector of employment and the supply of hours are best seen as a package. Our conjecture is that the lower hours worked in agriculture observed for the same individuals are an indication that these individuals are frequently underutilized in agriculture, perhaps because of the intrinsic seasonality of farm work.51 If workers are forced to be idle for stretches of time in agriculture, then their low average utilization should be considered part of the productivity gap between agriculture and non-agriculture.

Another interesting feature seen in columns 3 and 4 of Table 2.11 is that the elasticity of annual income with respect to annual hours worked is only about one half. This means that income per hour is declining in hours worked, consistent with diminishing returns to labor. Combining this observation with higher hours in non-agriculture explains why the non-agricultural premium in terms of income per hour (columns 5 and 6) is smaller than the premium controlling for hours (columns 3 and 4).52 However, even when identified off switching workers, income per hour is still significantly (19 lp) higher in non-agriculture. We report these numbers mainly because some of the literature interprets measures of income per hour as “wages” and uses them to calculate sectoral wage premia. In particular, in a concurrent paper also using the IFLS data, Hicks et al. (2017) argue that the non-agricultural premium in Indonesia largely disappears when they use their preferred regression of income per hour with worker fixed effects. There are two main reasons why our substantive findings are different. First, in our implementation we rely only on information on income and hours reported contemporaneously by the survey respondents. In contrast, Hicks et al. (2017) also rely on recall information for several years prior to the survey.
As discussed in more detail in Appendix B.2, the recall information is likely subject to non-classical measurement error, which can bias the estimated non-agricultural premium downwards. Second, even though our results are robust to controlling for hours and looking at hourly income, as argued earlier our conceptually preferred specification does not take hours into account. Comparing income per hour could indeed be preferable in a setting in which workers are offered constant hourly wages and freely choose the sector to which to allocate their marginal hour of work. But if hours are largely dictated by the nature of work in a sector, then sector is the relevant “marginal” choice. Since we find the second case to be more plausible in the context of Indonesian labor markets, we do not adjust our preferred non-agricultural premia for differences in hours.

51 Table B.1 and columns 3-4 of Table B.2 show that the results are robust to including the secondary job. This alleviates a concern that lower hours in the main job for agricultural workers are offset by having a second job.
52 If we control for hours worked in the income per hour specifications in columns 5 and 6, then the premia would be identical to those in columns 3 and 4.

Long-run income growth

One of our most surprising findings is that workers who switch from non-agriculture to agriculture suffer an income loss of around 30%. To be more precise, an interpretation of the coefficients in the first column of Table 2.4 is that a worker who switches away from non-agriculture between two survey waves has income growth over that period 33 lp lower than what he would be expected to get if he remained in non-agriculture. Taking a large income cut could nevertheless be a rational decision for a worker maximizing his lifetime discounted income if he expects that the current loss of income will be compensated by higher future income growth in agriculture.
This argument potentially has some merit because over our sample period average income growth was indeed higher in agriculture. To illustrate these differential trends, Table B.3 in the appendix shows the evolution of the non-agricultural premium over time. While it is strong and statistically significant throughout, it does decline over our sample period, especially in the cross-section, consistent with agricultural incomes partially converging to those in non-agriculture.

However, if switching workers could accurately predict the future income path then we would expect that over a long period of time those who took a cut switching to agriculture are not worse off than workers who remained in non-agriculture. As a first test of this hypothesis we look at income growth over the entire 21-year period spanned by IFLS 1-5. Column 1 of Table 2.12 shows that workers who started in non-agriculture in 1993 but switched to agriculture by 2014 had income growth over that period lower by 37 lp compared to those who began and finished in non-agriculture. This result suggests that switchers to agriculture do not make up their initial loss even after a prolonged time.

By using a single long time difference, the previous exercise identifies an average effect of switching among workers with diverse interim sectoral employment histories. Our second exercise exploits this interim information. For this purpose, we consider employment histories spanned by three observations at equal 7-year intervals (i.e. those individuals with data for 1993, 2000, 2007 or 2000, 2007, 2014). We are interested in comparing the change in income over the 14-year span for workers who made different sectoral decisions during that period. Figure 2.1 shows the mean log wages for a few key histories. In particular, compare income of NAA-history workers (i.e.
those who switched from non-agriculture to agriculture during the first 7-year period and stayed in agriculture during the second 7-year period) to the income of NNN-history workers (who remained in non-agriculture throughout). Before the switch, NAA-workers had on average lower incomes, consistent with the idea that those who switch are negatively selected from non-agricultural workers. More importantly, after the switch their incomes decline relative to NNN-workers. This is another reflection of the loss from switching emphasized in this chapter. But the gap between NAA- and NNN-workers does not significantly narrow over the subsequent 7-year period. So crucially, over the entire 14-year period incomes of non-agricultural workers who permanently switched in the first half of the period fall back relative to those who stayed in non-agriculture.

Column 2 of Table 2.12 casts this analysis into a regression framework with the usual controls. We find that workers who switched from non-agriculture to agriculture during the first 7-year period and were still in agriculture at the end of the second 7-year period had cumulative growth over 14 years lower by 19 lp (significant at the 0.05 level) than if they had remained in non-agriculture over this period. Similarly, workers who switched into non-agriculture in the first period and remained there had long-run income higher by 15 lp than if they had remained in agriculture, though that effect is less precisely estimated (significant at the 0.10 level).53 Overall we take these results as evidence that workers who chose agriculture have lower incomes even in the long run.

2.4 Model of sorting across sectors with barriers to sectoral mobility

In this section we introduce a simple discrete-time model of labor supply in which heterogeneous workers self-select into sectors in each period based on the value of their human capital.
Workers switch across sectors due to exogenous variation in the prices of human capital over time and due to the presence of an idiosyncratic time-varying component in their sector-specific human capital that resembles transitory productivity shocks. As we argue in the next section, the latter component is able to generate by itself a within-individual sectoral premium, depending on the magnitude of its relative dispersion across sectors. However, our structural estimation suggests that in order to simultaneously fit the magnitudes of the premia, the allocation and transition of workers across sectors over time and the moments of the joint income distribution, some frictions to sectoral mobility are needed in the model.

For that reason, we evaluate different types of barriers to sectoral mobility that misallocate workers across sectors. We first consider switching costs across sectors (Dixit and Rob (1994); Cameron et al. (2007); Artuç et al. (2010); Dix-Carneiro (2014)) and, relatedly, compensating differentials (Rosen (1986); Taber and Vejlin (2016)). Switching costs act as utility burdens that constrain voluntary switches, inducing misallocation across sectors. Since we estimate opposite signs for the switching costs away from and towards agriculture, the model with switching costs performs similarly to a specification in which workers receive a positive compensating differential for working in agriculture. Next, we consider a specification with imperfect self-selection, where we allow for frictions that prevent individuals from working in their preferred sector. These frictions could be rationalized by on-the-job search frictions (Gautier et al.
(2010); Gautier and Teulings (2015)), for example. In contrast to the case of utility costs, where mobility barriers bind only for workers with relatively small differences in comparative advantage, in the alternative specification even workers with a strong comparative advantage in one sector can be affected by the frictions. Because frictions affect the inframarginal workers, the induced allocative inefficiency in this case generates a larger impact on aggregate income.

[53] In principle we could construct even longer histories, which would allow us to control for pre- and post-trends of various groups. Unfortunately, between the number of possible histories increasing and the number of individuals with required data decreasing with history lengths, these longer histories would have limited statistical power.

2.4.1 Frictionless economy

Suppose agents choose their sector at each time t to maximize contemporaneous utility.[54] Let $\Omega_{it}$ be a vector of state variables for an individual i at time t. The income an individual receives in sector s is a product of the exogenous price of human capital in sector s at time t, $R^s_t$, and the amount of human capital the worker can supply to that sector:

$$y^s_t(\Omega_{it}) = R^s_t \, h^s(\Omega_{it}).$$

The supply of human capital depends on both observable and unobservable components.
The former are gathered in a vector of covariates $X_{it}$, which in our estimation includes gender, urban-rural location, years of schooling, years of working experience and the square of working experience. Notice that since we emphasize in this chapter the sectoral dimension of the residual wage premia, we abstract from the choice of location, and treat the urban-rural choice just as another covariate. Since the sectoral premia are robust to heterogeneous Mincerian returns across sectors, we assume for simplicity homogeneous returns on covariates, and hence we focus our attention on self-selection based on the unobservable components. The set of unobservables includes a time-invariant component $\theta^s_i$, representing the permanent comparative advantage of worker i in sector s, and an idiosyncratic time-varying term $\varepsilon^s_{it}$, resembling a transitory productivity shock that affects the comparative advantage of the same worker i in sector s at time t:

$$h^s(\Omega_{it}) = \exp\left(X_{it}'\beta + \theta^s_i + \varepsilon^s_{it}\right).$$

As in standard selection models, the functional form assumptions on the distribution of the components of comparative advantage are key for identification. We assume that the permanent component $\theta^s_i$ is i.i.d. across individuals, drawn from a normal distribution $N(\mu_\theta, \Sigma_\theta)$. Productivity shocks are also normal i.i.d. across individuals and time, $\varepsilon^s_{it} \sim N(\mu_\varepsilon, \Sigma_\varepsilon)$, and, for identification purposes, orthogonal across sectors. We impose the normalization $\mu_\theta = \mu_\varepsilon = 0$ in order to identify the evolution of prices of human capital over time.

Let us now describe the worker's problem. The worker chooses at the beginning of period t where to work. At the time of the decision she knows the value of the comparative advantage components and the human capital prices.
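As a rough illustration of the human capital specification above, the following sketch draws the permanent and transitory components for a set of workers and computes their potential log incomes in both sectors. The function name, the array layout and every numerical value (sample size, variances, prices, a covariate index held fixed over time) are hypothetical placeholders, not estimates from this chapter. The last line applies the frictionless rule of picking the sector with the higher log income.

```python
import numpy as np

def potential_log_incomes(n_workers, n_periods, log_prices,
                          sigma_theta, sigma_eps, x_beta, seed=0):
    """Potential log incomes ln y^s_t = ln R^s_t + X'beta + theta^s_i + eps^s_it.

    Sector index 0 is agriculture (A), index 1 is non-agriculture (N).
    All inputs are illustrative assumptions, not estimates from the chapter.
    """
    rng = np.random.default_rng(seed)
    # Permanent comparative advantage: bivariate normal with mean zero.
    theta = rng.multivariate_normal(np.zeros(2), sigma_theta, size=n_workers)
    # Transitory shocks: mean zero, independent across sectors, workers and time.
    eps = rng.normal(0.0, np.sqrt(np.diag(sigma_eps)),
                     size=(n_periods, n_workers, 2))
    # Shapes: prices (T, 1, 2) + covariates (1, n, 1) + theta (1, n, 2) + eps (T, n, 2).
    return (log_prices[:, None, :] + x_beta[None, :, None]
            + theta[None, :, :] + eps)

# Hypothetical inputs: 5 periods, 1000 workers, log prices normalized to zero.
log_prices = np.zeros((5, 2))
sigma_theta = np.array([[0.30, 0.10], [0.10, 0.25]])
sigma_eps = np.diag([0.20, 0.10])
x_beta = np.zeros(1000)

ln_y = potential_log_incomes(1000, 5, log_prices, sigma_theta, sigma_eps, x_beta)
sector = ln_y.argmax(axis=2)  # frictionless choice: 0 = A, 1 = N
```

The array `ln_y` has one entry per period, worker and sector, so both the chosen and the counterfactual income of each worker are available, which is what a simulation-based estimator needs but the data never show.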
Her problem is:

$$V(\Omega_{it}) = \max_s \left\{ V^s(\Omega_{it}) \right\}, \tag{2.3}$$

where the value of the human capital in sector s in the frictionless case is simply the log of the income, $V^s(\Omega_{it}) = V^s_{ef}(\Omega_{it}) = \ln y^s_t(\Omega_{it})$.

[54] We abstract from a model with inter-temporal optimization because our empirical findings do not support the hypothesis of maximization of lifetime discounted income (see section 2.3.2).

Finally, we assume the researcher observes individual income $\hat{y}^s_{it}$ subject to a purely idiosyncratic measurement error $\nu_{it}$:

$$\ln \hat{y}^s_{it} = \ln y^s_t(\Omega_{it}) + \nu_{it}.$$

We assume measurement errors have mean zero and are normal i.i.d. across individuals and time, $\nu_{it} \sim N(0, \sigma^2_\nu)$. Notice that an alternative interpretation of these errors is as ex-post productivity shocks that affect the worker's observable income, but not her sectoral choice. Since in this case ex-post shocks do not affect workers' self-selection into sectors, the model delivers the same predictions under both specifications.

Our model abstracts from the possibility that workers drop out of the labor market, to focus attention on the role of sorting and the barriers to sectoral mobility introduced in the next section in explaining non-agriculture premia among active workers. For this reason, in the structural estimation we use the balanced panel of workers with income recorded in the five available waves of IFLS. Denote by Θ the set of all structural parameters. The elements of Θ are listed and described in the first column of Table 2.13.

2.4.2 Economies with barriers to sectoral mobility

The first type of barrier to sectoral mobility that we consider is a utility cost of switching across sectors. This cost could reflect tangible expenditures such as training or transportation costs, or intangibles such as social adjustment costs. Denote by $\phi^{ss'}$ the utility cost of switching from sector s (which was chosen in t−1) to sector s′, common to all individuals.
The value of the human capital supplied to sector s is then:

$$V^s_{sc}(\Omega_{it}) = \ln y^s_t(\Omega_{it}) - \ln C^{s_{t-1} s_t}(\Omega_{it}),$$

where

$$C^{s_{t-1} s_t}(\Omega_{it}) = C^{ss'} = \begin{cases} \phi^{ss'} & \text{if } s \neq s' \\ 1 & \text{if } s = s'. \end{cases}$$

The problem of the worker is the same as in (2.3); the only difference here is the definition of the value of the human capital, $V^s(\Omega_{it}) = V^s_{sc}(\Omega_{it})$. We only constrain the magnitude of $\phi^{ss'}$ to be positive, so in principle switching costs could also measure a utility compensation for $\phi^{ss'} < 1$.

As we show in the next section, we estimate opposite signs for $\ln\phi^{AN}$ and $\ln\phi^{NA}$ (where A and N denote agriculture and non-agriculture, respectively), a pattern that is observationally similar to receiving a positive compensating differential for working in agriculture. In the case of a compensating differential, the value of the human capital can be defined as:

$$V^s_{cd}(\Omega_{it}) = \ln y^s_t(\Omega_{it}) + \ln C^s,$$

where

$$C^s = \begin{cases} cd & \text{if } s = A \\ 1 & \text{if } s = N \end{cases}$$

and thus the differential cd measures the additional utility that a worker obtains by working in agriculture (relative to working in non-agriculture). This differential can be related to any attribute of agricultural work that is valued by individuals: less exposure to pollution, crime, or crowding, more flexible work schedules, etc. Note that both switching costs and compensating differentials act as if proportionally scaling income down or up, so they can be interpreted in terms of annual earnings. Further, notice that the specification with switching costs adds two parameters ($\phi^{AN}, \phi^{NA}$) to Θ, whereas the model with a compensating differential only adds one (cd).

Finally, we also consider a different kind of barrier to the allocation of workers across sectors. We want to capture the idea that workers do not always get to work in the sector they would like, even if they have a strong comparative advantage in that sector.
These frictions can be interpreted as life events forcing an individual to switch the sector of employment, and are meant to capture in a simple way the underlying search frictions. Specifically, we assume that at the beginning of each period an individual gets a random draw such that she will be able to choose the sector she desires with probability $1 - p(\Omega_{it})$ and she will be forced to work in the other sector with probability $p(\Omega_{it})$. The probability p of being forced to accept a job in a sector other than the desired one can depend on the worker's state. In particular, we want to allow for the possibility that it might be more difficult to switch sectors than to keep working in the same sector, by letting the probability differ between those who desire to switch and those who desire to stay:

$$p^{s_{t-1} s_t}(\Omega_{it}) = p^{ss'} = \begin{cases} p^T & \text{if } s \neq s' \\ p^S & \text{if } s = s'. \end{cases}$$

As with the switching costs, this specification adds two parameters ($p^T, p^S$) to Θ.

2.5 Structural estimation

In this section, we describe the estimation procedure and the identification of the parameters of the structural model. The estimation method is Indirect Inference (Gourieroux et al. (1993)). We rely on the functional form assumptions to deliver a proof for identification in a simplified version of the model.

2.5.1 Estimation procedure

The first step in the estimation procedure is to choose a set of auxiliary regression models that summarize the main features of the data we want to capture: the sectoral premia, the moments of the joint distribution of income and the workers' sectoral decisions over time. Those auxiliary models are used to compute the Indirect Inference loss function to be minimized, and hence they must be simple to estimate multiple times. As we explain below, for identification this method does not require that those auxiliary regressions are well specified (i.e. models which are exact reduced forms of the structural model, in which case Indirect Inference is equal to MLE).
However, we do need the selected models to provide enough information about the moments in the data to identify the set of structural parameters Θ.

First of all, given the assumption of homogeneity in the Mincerian returns, the role of observables in self-selecting workers across sectors is innocuous. Hence, we can map the parameters in β to the estimated coefficients on observables in a linear regression of log income on observables, controlling for the interaction between sectoral choice and year, in order to estimate the structural model using only residual income. An identical estimation could be performed including β and using log income to compute all auxiliary regressions, controlling for observables. This is why we can safely drop the effects of observables from our identification proof in Appendix B.4.

We select the following seven auxiliary models: i) a log-residual income linear regression on the sector choice, controlling for time fixed effects; ii) a log-residual income linear regression on the sector choice, controlling for time and individual fixed effects; iii) a log-residual income linear regression on the direction of sector switching between waves, controlling for time fixed effects; iv) a log-residual income linear regression in first differences on the direction of sector switching between waves, controlling for the first differences in years of the waves; v) a log-residual income linear regression on the interaction between sectoral choice and year; vi) a sectoral choice linear probability model on time dummy variables; and vii) a sectoral choice linear probability model on the previous sectoral choice. The role of each of these models in identifying the structural parameters is explained in the next subsection.

For efficiency reasons, we only use the coefficients of interest of the selected auxiliary regressions in the Indirect Inference loss function.
Hence, we use the following 29 coefficients of the seven auxiliary models: 1-2) the non-agriculture premia in models i) and ii); 3-6) the sector-specific premia for switching workers from models iii) and iv); 7-23) the full set of estimated coefficients from models v) to vii); 24-25) the sector-specific residual variances in model v); and 26-29) the sector-specific residual variances for non-switching and switching workers in model iv). Table 2.14 summarizes the auxiliary models as well as the selected coefficients.

Arrange the values of the selected coefficients estimated in the actual data in the vector $\hat{\delta}$. The elements of vector $\hat{\delta}$ are displayed in the third column of Table 2.15 and remain fixed during the estimation procedure. The Indirect Inference loss function is computed as the weighted sum of the squared differences between the values in $\hat{\delta}$ and the values for the same set of coefficients obtained from simulations of the structural model. For weights, we use factors that represent the importance of the estimated coefficient in the identification of the structural parameters of the model, assigned after extensive experimentation. Appendix B.3 describes their magnitudes and presents technical aspects of the estimation procedure in more detail.

Finally, in the models with switching costs or involuntary choices there is an issue of endogeneity of observing workers' initial sector allocation in the panel. We address it by introducing a pre-sample period zero with sectoral choice free of switching costs in the first case, and with a probability of being forced to work in the undesired sector independent of the worker's state[55] in the second case. We use pre-sample information on covariates when available to construct the distribution of the initial conditions.
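To make the pre-sample device concrete for the model with involuntary choices, the sketch below simulates realized sectors when workers may be forced away from the sector they desire. Period zero uses one forcing probability for everyone (set to the stay probability, as in footnote 55), and only the later periods are returned for use in the auxiliary regressions. The function name, the array layout and any values passed to it are hypothetical; the array `desired` would come from the worker's problem and is taken as given here.

```python
import numpy as np

def realized_sectors(desired, p_stay, p_switch, seed=0):
    """Realized sectors under involuntary allocation (illustrative sketch).

    desired  : int array (T + 1, n) of desired sectors (0 = A, 1 = N);
               row 0 is the pre-sample period.
    p_stay   : probability of being forced out when the worker wants to stay.
    p_switch : probability of being blocked when the worker wants to switch.
    Returns the (T, n) realized sectors for the sample periods only.
    """
    rng = np.random.default_rng(seed)
    n_periods, n_workers = desired.shape
    actual = np.empty_like(desired)
    # Period zero: the forcing probability does not depend on the worker's state.
    forced = rng.random(n_workers) < p_stay
    actual[0] = np.where(forced, 1 - desired[0], desired[0])
    for t in range(1, n_periods):
        wants_switch = desired[t] != actual[t - 1]
        p = np.where(wants_switch, p_switch, p_stay)
        forced = rng.random(n_workers) < p
        # A forced worker ends up in the sector she did not want.
        actual[t] = np.where(forced, 1 - desired[t], desired[t])
    return actual[1:]  # drop the pre-sample period
```

Two limiting cases give a quick sanity check: with both probabilities at zero the realized sectors coincide with the desired ones, and with `p_stay = 0` and `p_switch = 1` every worker remains in her period-zero sector.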
This way, although the auxiliary regressions are computed only for the five years in the sample, the data generating process of the model produces draws also for period zero.

2.5.2 Identification

In this section we discuss how from the selected coefficients of the auxiliary regressions we obtain the set of moments that allows us to identify the parameters in Θ. Those moments are enumerated in Appendix B.4, where we demonstrate how Θ is identified in a simplified version of the model with two periods. In the proof, we take advantage of the functional form assumptions to extend the standard cross-sectional moments by including moments of the income distribution of the switching workers across waves, available only thanks to the panel dimension of the data, with the aim to set up a system of equations to solve for all parameters in Θ. We fully expect our reasoning to generalize to the same setting with a larger number of years. To verify this hypothesis, we generate multiple samples from the model with simulated covariates over the number of years observed in the data, using different sets of parameter values. We find that the chosen auxiliary regressions allow the estimation procedure to recover the parameter values used to generate each sample.

Let us first comment on the main insights from the demonstration in Appendix B.4. In the frictionless economy sectoral decisions do not depend on workers' histories, so the model behaves in each period t as the standard log-normal Roy model with comparative advantage $u^s_{it} = \theta^s_i + \varepsilon^s_{it}$. In this case, we can use the standard arguments of Heckman and Honoré (1990) to identify from repeated cross-sectional moments the prices of human capital (which, given our normalizations, act as the means of the distribution of comparative advantage) and the variance matrix of $u^s_{it}$ in each period augmented by the variance of measurement error.
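To see why the cross-section alone delivers only an augmented variance, it may help to write out the second moments of the latent (pre-selection) log-income residuals under the distributional assumptions of Section 2.4.1. This is a sketch that additionally assumes the permanent component, the transitory shocks and the measurement error are mutually independent; the symbols $\sigma^2_{\theta^s}$ and $\sigma^2_{\varepsilon^s}$ for the diagonal entries of $\Sigma_\theta$ and $\Sigma_\varepsilon$ are our labels. The actual proof in Appendix B.4 works with truncated moments, which this simplification ignores.

$$
\begin{aligned}
\operatorname{Var}\left(u^s_{it} + \nu_{it}\right) &= \sigma^2_{\theta^s} + \sigma^2_{\varepsilon^s} + \sigma^2_{\nu}, \\
\operatorname{Cov}\left(u^s_{it} + \nu_{it},\; u^s_{it'} + \nu_{it'}\right) &= \sigma^2_{\theta^s} \quad \text{for } t \neq t'.
\end{aligned}
$$

The first line involves only the sum of the three variances, so a single period cannot tell them apart. The second line uses two observations of the same worker and isolates the permanent part, which is the kind of information the panel dimension adds.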
Only with panel data can we separately identify the variances of the permanent and transitory components of comparative advantage and the variance of measurement error, inferred from the moments of the growth in income of switchers. The intuition is that the amount of additional information that switchers provide about the joint distribution of income in response to changes in relative human capital prices is similar to the information obtained from exclusion restrictions and support conditions in the process of non-parametric identification of cross-sectional non-normal Roy models.[56]

We are able to find the analytical expressions for the moments of the income distribution of the employment transition groups across waves by exploiting the property that draws of $u^s_{it}$ in different periods are jointly normally distributed, since each one is the sum of two normally distributed random variables. In this way, we can express the transition probabilities across waves and the observed moments of the growth in income for switchers using upper truncated multivariate normal distributions, where the prices of human capital in the two periods affect the truncation values. We verify that by adding this information from the switchers to the standard cross-sectional moments we can set up a system of equations with a unique solution for all parameters in Θ.

[55] We make this probability equal to $p^S$.

[56] See French and Taber (2011) for a detailed discussion about parametric identification of selection models through distributional assumptions and nonparametric identification using exclusion restrictions and support conditions.

For the case of switching costs and frictions, sectoral choices depend on workers' histories and with only repeated cross-sectional data we can no longer identify either the prices of human capital or the variance matrix of $u^s_{it}$ augmented by the variance of measurement error.
That is, we obtain the non-identification result that even with the log-normality assumptions, the standard Roy model is not identified in the presence of barriers to sectoral mobility. We can generate two combinations of Θ, with different values for at least one parameter other than the corresponding barriers, that produce exactly the same set of cross-sectional moments. This is due to the fact that the cross-sectional moments depend on the distribution of the previous sectoral choices. Therefore, in order to identify the full set of parameters in Θ we need panel data even under log-normality assumptions.

In a similar way as with the moments of the employment transition groups in the frictionless economy, for the models with barriers to sectoral mobility we can derive the closed-form solutions of the cross-sectional moments, the transition probabilities and the moments of the growth in income for switchers, all of them expressed in terms of moments of upper truncated multivariate normal distributions, but with a dimensionality that grows with the number of time periods in the panel. Switching costs affect the truncation values of the distributions, similar to the human capital prices, whereas the probabilities of forced switches shift the entire distribution. We verify again that by adding the moments of the growth in income for switchers and the transition probabilities to the standard cross-sectional moments we obtain a system of equations with a unique solution for all parameters in Θ, including switching costs.

Now we discuss in detail how the selected coefficients of the auxiliary regressions in our Indirect Inference loss function capture the set of required moments for identification. First, linear probability models vi) and vii) describe the distribution of sectoral choice in each cross-section and the average transition probabilities between waves, respectively.
Combined, these models characterize the evolution of the joint distribution of sectoral choice over time, and hence they deliver the probabilities of sectoral transition across all waves. Second, for the moments of income growth, we use for the first moments both the within-individual premium from model ii) and the premia for switching workers relative to stayers in the model in first differences iv). The difference between the two is that the latter model takes into account the direction of transition, so it can actually inform the estimation procedure with the observed gains of switchers to non-agriculture and the losses of workers switching to agriculture separately, unlike the fixed-effects premium. For the second moments we use the residual variances for workers switching to each sector from model iv).

For the cross-sectional moments, model v) informs us about the conditional expected incomes in each sector-year combination, since it includes a full set of interactions for sector and year. Those coefficients, taken together with the cross-sectional premium in model i), characterize the first cross-sectional moments. We collect the residual variances for the pool of workers in each sector from model v) and the residual variances for non-switching workers in each sector from model iv), to account for the second cross-sectional moments. That is, since we have an identical distribution of the productivity shocks across years, we only need those four variances to characterize the second cross-sectional moments, instead of a full set of variances from each combination of sector and year.

Our set of cross-sectional moments in Appendix B.4 also includes third central moments. The intuition for including those moments is that self-selection produces right skewness in a sector's earnings if the variance of the comparative advantage draws in that sector is larger than the covariance (Heckman and Honoré, 1990).
Hence, those moments help inform us about the relative magnitudes of the elements of the variance-covariance matrix of comparative advantage. However, in practice, estimating those third moments can be problematic since they can be very sensitive to outliers. Likely for that reason, and to the best of our knowledge, no estimation of selection models by Indirect Inference resorts to their use. We substitute the information in these moments with model iii), which compares the average performance of workers switching to each sector with that of their peer group after the switch. This comparison, which is only possible with panel data, provides us with evidence regarding the nature of sorting in the data, in particular, whether there is positive or negative hierarchical sorting into each sector. Hence, as in standard selection models, those premia are informative about the relation between the covariance and the ratio of variances of the comparative advantage draws. Thus model iii) offers the same amount of information that the third central moments provide in our identification strategy.

It is worth emphasizing that Indirect Inference does not require that all auxiliary models are well specified for identification and consistent estimation of the parameters. As Sauer and Taber (2017) argue, the only requirement is that each structural parameter has an independent effect on at least one coefficient of the auxiliary models. Thus, a necessary condition for identification is that each parameter monotonically affects at least one coefficient of the auxiliary regressions, and that it produces a unique combination of responses on all the selected coefficients.
We verify that this requirement holds by visualizing the effects of changes in each structural parameter on the selected coefficients, keeping the remaining parameters constant, over the domain where those parameters are expected to lie.[57]

[57] The grids (one per structural parameter) of 29 plots (one per selected coefficient of the auxiliary regressions) are available upon request.

2.6 Results

In this section we present the structural estimation results and use them to quantify the importance of barriers to mobility and of self-selection. We begin with a frictionless model and show that it fails to explain some salient features of the data. Models featuring frictions provide a much better fit to the data and imply a large extent of misallocation in Indonesia. Finally, we discuss the empirical content of the reduced form non-agriculture premia when viewed through the lens of the model.

2.6.1 Estimation results

Column (1) of Table 2.13 shows the values of the Indirect Inference point estimates for the 16 structural parameters in the frictionless economy.[58] Given those estimated parameters, the model generates the values for the 29 coefficients of the auxiliary regressions displayed in column (4) of Table 2.15. The last row of this table shows the value of the loss function, indicating the overall fit of the model (with smaller values indicating a better fit).

Perhaps surprisingly, the model without any frictions is not only able to replicate the cross-sectional non-agriculture premium, but also to generate a sizable within-individual premium. In the estimated frictionless economy, workers who switch from non-agriculture to agriculture see their incomes decline by 24 lp on average. This striking result can be explained by a selection effect generated by the transitory productivity shocks. As a result of this mechanism the fixed effect premium is shaped largely by the variance of transitory shocks across sectors.
We formally state this result for a simplified version of the model in the following proposition.

Proposition 1. Consider the frictionless model with two periods and human capital prices equal across sectors and over time. Then the average growth of log income of workers switching from agriculture to non-agriculture is positive if and only if $\sigma^2_{\varepsilon^N} > \sigma^2_{\varepsilon^A}$. Furthermore, the average growth of log income of workers switching from non-agriculture to agriculture has the same magnitude but is of the opposite sign.

Proof. See Appendix B.5.

Since with two periods the fixed effects premium is simply equal to the average growth of log income of switchers (taken with appropriate signs), we immediately have the following implication.

Corollary 2. Under the same conditions as in Proposition 1, the non-agriculture premium identified from a regression with worker fixed effects is positive if and only if $\sigma^2_{\varepsilon^N} > \sigma^2_{\varepsilon^A}$.

To understand these results, observe that after workers sort themselves into sectors in the first period, the only reason a worker would switch to a different sector next period is a change in the balance of productivity shocks, $\varepsilon^N_{it} - \varepsilon^A_{it}$. With equal variances of shocks across sectors, the average growth in income is the same for switchers in both directions, so the within-individual premium is null. But in the case of asymmetric variances, the shocks with a larger dispersion have a higher chance to take extreme values, resulting in a larger average increase in income of workers shifting to the sector with the larger variance. Thus, the sign of the non-agriculture premium after controlling for worker fixed effects depends only on the relative size of the variance of the productivity shocks: it is positive when the variance is larger in non-agriculture, and negative otherwise. This reasoning carries over quantitatively to the estimated general model with multiple periods and evolving human capital prices.

[58] We are currently computing standard errors from 200 bootstraps.
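The mechanism behind Proposition 1 can be checked numerically. The following Monte Carlo sketch simulates the two-period frictionless model with log prices normalized to zero and reports the mean log-income growth of switchers in each direction. It is an illustration under assumed values: the covariance of the permanent components, the shock standard deviations, the sample size and the function name are all hypothetical and unrelated to the estimates in Table 2.13.

```python
import numpy as np

def switcher_growth(sd_eps_a, sd_eps_n, n_workers=200_000, seed=0):
    """Mean log-income growth of A->N and N->A switchers in a two-period
    frictionless model with equal (zero) log prices.

    The permanent-component covariance below is an arbitrary illustrative choice.
    """
    rng = np.random.default_rng(seed)
    sigma_theta = np.array([[0.25, 0.075], [0.075, 0.25]])
    theta = rng.multivariate_normal(np.zeros(2), sigma_theta, size=n_workers)
    # Transitory shocks: independent across periods, workers and sectors.
    eps = rng.normal(0.0, [sd_eps_a, sd_eps_n], size=(2, n_workers, 2))
    ln_y = theta[None, :, :] + eps            # (period, worker, sector)
    sector = ln_y.argmax(axis=2)              # 0 = A, 1 = N
    # Log income actually earned in the chosen sector in each period.
    earned = np.take_along_axis(ln_y, sector[:, :, None], axis=2)[:, :, 0]
    growth = earned[1] - earned[0]
    a_to_n = (sector[0] == 0) & (sector[1] == 1)
    n_to_a = (sector[0] == 1) & (sector[1] == 0)
    return growth[a_to_n].mean(), growth[n_to_a].mean()

g_an, g_na = switcher_growth(sd_eps_a=0.3, sd_eps_n=0.6)
print(f"larger variance in N: A->N {g_an:+.3f}, N->A {g_na:+.3f}")
g_an_eq, g_na_eq = switcher_growth(sd_eps_a=0.4, sd_eps_n=0.4)
print(f"equal variances:      A->N {g_an_eq:+.3f}, N->A {g_na_eq:+.3f}")
```

With the larger shock variance in non-agriculture, the simulated growth should be positive for switchers into non-agriculture and of similar size but negative for switchers into agriculture. With equal variances both averages should be close to zero, up to simulation noise.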
The main message from this discussion is that finding a large non-agricultural income premium after controlling for worker fixed effects, as we find for Indonesia, by itself does not indicate that workers face any frictions in choosing their sector of employment. In principle, the premium can be explained simply by a larger dispersion of productivity shocks faced by non-agricultural workers. But the pattern of variances has observable implications for moments other than sectoral premia. In particular, the frictionless model struggles to simultaneously account for non-agricultural premia and the pattern of the residual variances of workers' earnings in the data (the variance is larger in the agriculture sector). To generate the cross-sectional and fixed effects non-agriculture premia, the frictionless model forces the relative magnitudes of the variances for both the permanent and transitory components of comparative advantage to be opposite to the pattern observed in the residual variances. This enables it to display a relatively good fit for the premia (56 lp and 23 lp in the model versus 57 lp and 40 lp in the data for the cross-sectional and the fixed-effects premia, respectively), but at the expense of generating residual variances that are completely reversed relative to the data (compare coefficients $\delta_{24}, \delta_{25}$ and $\delta_{26}, \delta_{27}$ in columns (2) and (4) of Table 2.15). To explain jointly the premia and the patterns of the residual variances, we need to introduce some frictions to the sectoral allocation in the model.

The first type of friction is represented by utility costs of switching sectors. When we restrict the switching costs to be positive, which is a standard and perhaps natural case, we find that they have effectively no impact on the estimates.
The reason is that the estimated costs are small in magnitude, and in particular, the zero bound for the cost of switching from non-agriculture to agriculture is binding. This result might seem surprising, given that the literature estimating utility costs of switching sectors typically finds them to be large, often equivalent to multiples of a worker's annual income (e.g. Artuç et al. (2015)). But the magnitudes might not be easily comparable across studies, as they depend on what other mechanisms of sector determination are built into the respective models. In our case, when we allow for self-selection according to comparative advantage, positive switching costs do not have much additional explanatory power. In particular, if switching to agriculture were costly then it would be even more puzzling why so many workers make the move.

The situation is different if we remove the restriction on the sign of the switching costs. Column (2) of Table 2.13 shows the estimates for the model with unrestricted switching costs, and column (5) of Table 2.15 the corresponding coefficients of the auxiliary models. The switching costs are of opposite signs, approximately symmetric, and large in magnitude. A worker switching from agriculture to non-agriculture faces a cost of 71 lp of annual income equivalent (i.e. roughly equivalent to her annual income). That is, a worker who actually moves from agriculture to non-agriculture must have a value of her human capital in non-agriculture at least twice as large as in agriculture. For smaller differences, the worker remains in agriculture.
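The threshold of roughly two follows from converting the estimated cost from log points into a ratio. As a short worked step, using the switching-cost value function of Section 2.4.2 for a worker who was in agriculture in the previous period, and the point estimate of 71 lp quoted above (the rounding is ours):

$$
\ln y^N_t - \ln \phi^{AN} \ge \ln y^A_t
\quad\Longleftrightarrow\quad
\frac{y^N_t}{y^A_t} \ge \phi^{AN} = e^{0.71} \approx 2.03 .
$$

More generally, a magnitude of x lp corresponds to a proportional factor of $e^{x/100}$.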
A worker switching towards agriculture receives a utility compensation equivalent to almost doubling her new agricultural income. That is, a worker who actually switches from non-agriculture to agriculture could have a value of her human capital in agriculture as much as 47% smaller than in non-agriculture.

Because of this implied compensation, the model now has an easier time justifying why workers switch to agriculture. It can rationalize the income cuts of workers switching to agriculture in terms of a negative switching cost, so it does not need to rely on the counterfactual pattern of residual income variances. It can therefore generate both a within-individual premium that is close to the one observed in the data (35 lp in the model versus 40 lp in the data) and deliver the correct qualitative patterns for the residual variances (larger variances in agriculture, see coefficients $\delta_{24}$ to $\delta_{27}$ of Table 2.15). In summary, the overall fit of the model with switching costs is substantially better (last row of Table 2.15).

Since the estimated switching costs are nearly symmetric (i.e. $\phi^{AN}\phi^{NA}$ is close to 1), the model with switching costs is similar to a specification with a single positive compensating differential for working in agriculture.[59] Columns (3) in Table 2.13 and (6) in Table 2.15 show, respectively, the estimated parameters and the obtained auxiliary coefficients for the latter model. In this case, individuals are willing to be paid less to work in agriculture simply because it is a sector they enjoy more. This estimated preference is strong, as it is equivalent to increasing a worker's agricultural income by 61 lp (or 89%).
Comparing columns (5) and (6) in Table 2.15 shows that the compensating differential model fits the data nearly as well as the more flexible model with switching costs.

These estimates demonstrate that in order to be consistent with the salient features of worker-level panel data on sectoral employment and income, a model built on revealed preferences (i.e. voluntary choices) needs to make switching to agriculture attractive in some non-pecuniary terms. While estimating compensating differentials has a long history, we recognize that in our context they are not a particularly satisfying explanation. Ultimately, such utility-based compensation is a residual force that allows the model to rationalize choices otherwise difficult to explain. We therefore explore an alternative conceptual approach to thinking about barriers to sectoral mobility. Instead of treating all observed sectoral transitions as the result of voluntary choices, the alternative is to recognize that sometimes workers switch sectors for reasons independent of their productivity.

First, we consider a specification with a single probability p of a worker being forced into a different sector than she would desire. This probability is estimated to be 0.05 (see column (4) in Table 2.13), which might not seem large, but in fact implies that most of the observed switches are of this random nature. This parsimonious explanation fits the data noticeably better (see column (7) in Table 2.15) than the models with utility switching costs.

Next, we increase the model’s flexibility by allowing the probability of the involuntary sector allocation to depend on whether the worker wants to switch or to remain in the same sector as in the previous period. This specification captures the notion that switching sectors might be more difficult than staying put.
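To make the mechanism concrete, the following sketch simulates the involuntary-allocation rule for a single worker. It is only an illustration under simplifying assumptions: the function and variable names are hypothetical, the worker's desired sector is held fixed rather than driven by productivity shocks as in the estimated model, and the probabilities are the ones reported in column (5) of Table 2.13 (pS = 0.09 for a worker who wants to stay, pT = 0.77 for a worker who wants to switch).

```python
import random


def realized_sector(previous: str, desired: str, p_s: float, p_t: float,
                    rng: random.Random) -> str:
    """Return the sector a worker ends up in this period.

    previous, desired: "A" (agriculture) or "N" (non-agriculture).
    p_s: probability that a worker who wants to stay is forced to switch.
    p_t: probability that a worker who wants to switch is forced to stay.
    """
    other = "N" if previous == "A" else "A"
    if desired == previous:
        # Wants to stay: pushed into the other sector with probability p_s.
        return other if rng.random() < p_s else previous
    # Wants to switch: blocked (stays put) with probability p_t.
    return previous if rng.random() < p_t else desired


def expected_attempts_until_switch(p_t: float) -> float:
    """Mean number of periods until a desired switch is realized, assuming the
    wish to switch persists (mean of a geometric distribution)."""
    return 1.0 / (1.0 - p_t)


rng = random.Random(0)
P_S, P_T = 0.09, 0.77  # column (5) of Table 2.13

# Share of draws in which a worker who wants to leave agriculture is blocked.
draws = [realized_sector("A", "N", P_S, P_T, rng) for _ in range(100_000)]
blocked_share = draws.count("A") / len(draws)

print(f"blocked share: {blocked_share:.2f}")
print(f"expected attempts until switch: {expected_attempts_until_switch(P_T):.1f}")
```

With a blocking probability this high, a worker whose desired sector does not change needs several periods on average before the switch goes through, which is one way to read the idea that switching is harder than staying put.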
This is indeed the case: as reported in column (5) in Table 2.13, the probability that a worker who wants to remain in a sector has to switch anyway is pS = 0.09, whereas a worker wanting to switch most likely will not get the chance to do so (pT = 0.77). These numbers imply that 57% of the observed transitions from non-agriculture to agriculture are driven by chance rather than occurring in response to productivity shocks. The effect is not symmetric, in that only 25% of switches to non-agriculture are forced by randomness.

59 The model with a compensating differential cd is observationally equivalent to a model with switching costs φAN = cd, φNA = 1/cd.

The explanation offered by this model for the prevalence of income-reducing transitions to agriculture is thus that these transitions are largely random events. Furthermore, once a worker finds herself in a non-desired sector she can be “trapped” there for a while, because it is difficult to transition to the other sector. The model with these features provides a considerably better fit to the data than all the alternatives presented above, as can be seen from column (8) in Table 2.15.60 In particular, it can match closely not only the qualitative pattern of non-agriculture premia and residual variances but also their magnitudes. It is also the only specification that can replicate the asymmetry in the magnitude of income growth of switchers to agriculture and switchers to non-agriculture (coefficients δ5 and δ6) that is observed in the estimation sample. Since this model offers superior empirical performance and what we believe is a compelling underlying mechanism, it is our preferred specification and the basis for further analysis.61

2.6.2 Counterfactual exercises

We now proceed to quantify the importance of mobility barriers across sectors by computing the counterfactual equilibrium in which the barriers are removed.
While this counterfactual is intended to illustrate the response of labor supply to the removal of such frictions, it is worth pointing out that the exercise lacks general equilibrium adjustments of factor prices. Such adjustments can dampen the reallocation of workers, so our results should be regarded as an upper bound on the full impact.

We simulate counterfactual data setting pS = pT = 0 while keeping the remaining elements of $\hat{\Theta}$ and the values of covariates as in our baseline model. We first discuss the implications of eliminating the frictions for aggregate income and then present sectoral outcomes. Denoting by N the total number of individuals in the panel, we compute the number of individuals reallocated after the barriers are removed, equal to M, and the fraction of the population that is reallocated, $m = M/N$. To decompose the impact of workers’ misallocation on total income Y into its different margins, denote by $Y_m$ the sum of earnings of the misallocated individuals. Further, denote by $\psi_m$ the ratio of the average income of the misallocated individuals to the average income in the population, $\psi_m \equiv \frac{N Y_m}{M Y}$. Thus, the percentage growth rate of total income after removing mobility frictions can be expressed as the product of three terms:62

$$\Delta\% Y = m \, \psi_m \, \Delta\% Y_m. \qquad (2.4)$$

The first term represents the fraction of the population that is reallocated, the second term how important on average is the income of those individuals relative to the whole population in the data, and the third term the growth rate in the total income of all misallocated individuals.

60 This is a fair comparison since the model has the same number of free parameters as the model with switching costs.

61 Extending the model to also include compensating differentials or positive switching costs has very little effect on the estimated probabilities of involuntary switches and the model fit.

62 To prove this, for any variable x in the observed data let $x'$ denote its counterfactual value in the frictionless economy, and $\hat{x} \equiv x'/x$ the proportional change. Since individuals who remain in the same sector do not experience any adjustment in their income, we can express: $\hat{Y} = \hat{Y}_m \frac{Y_m}{Y} + \frac{Y - Y_m}{Y}$. After some manipulation we can rewrite the latter expression as $\hat{Y} = m \psi_m (\hat{Y}_m - 1) + 1$, and hence the percentage growth rate of total income after removing switching costs is $\Delta\% Y = 100(\hat{Y} - 1)$, as in equation (2.4).

Table 2.16 presents the results of the calculation. The main finding is that removing workers’ mobility barriers across sectors leads to a significant reallocation of workers towards non-agriculture (30% of the total labor force) and to a large increase in the income of misallocated workers (which doubles on average). As a result, it produces a sizable impact in aggregate terms: an increase of around 17% in total income (pooled across all years). It is worth noting that the effect would have been even larger if the misallocated workers were average earners. However, in our estimated model, the representative misallocated worker earns 54% of what the average worker earns in the whole panel (largely because the misallocated workers cannot realize their full earning potential when they are in the wrong sector). This fact moderates the effect of the reallocation of those workers on the adjustment in aggregate income. It is also worth noting that in our baseline specification income is the only determinant of utility, so increases in income result in identical increases in welfare.63

Table 2.17 breaks down the results further by sector. Removing barriers to mobility would result in agricultural employment shrinking by 5.8 p.p. as a share of the total workforce. While this net change is not small, it is significantly smaller than the 30 p.p. gross flows of workers between sectors. Gross flows exceed net flows because there are workers wrongly allocated in both sectors.
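The decomposition in equation (2.4) is easy to verify numerically. The sketch below plugs in the rounded entries of Table 2.16; the function name `income_growth_pct` is hypothetical, and because the inputs are rounded the product only approximately reproduces the 16.9% reported in the table.

```python
def income_growth_pct(m: float, psi_m: float, growth_m_pct: float) -> float:
    """Equation (2.4): growth of total income (%) as the product of the share
    reallocated, their relative average income, and the growth (%) of their
    total income."""
    return m * psi_m * growth_m_pct


# Rounded values from Table 2.16.
M_SHARE = 0.30    # fraction of the population reallocated (m)
PSI_M = 0.54      # relative average income of reallocated workers (psi_m)
GROWTH_M = 105.6  # growth rate (%) in total income of reallocated workers

total_growth = income_growth_pct(M_SHARE, PSI_M, GROWTH_M)

# Hypothetical benchmark in which reallocated workers are average earners (psi_m = 1).
average_earner_growth = income_growth_pct(M_SHARE, 1.0, GROWTH_M)

print(f"implied growth in total income: {total_growth:.1f}%")
print(f"if reallocated workers were average earners: {average_earner_growth:.1f}%")
```

The second figure illustrates the remark that the aggregate effect would have been larger if the misallocated workers were average earners: setting the relative-income term to one removes the factor that dampens the aggregate gain.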
Furthermore, because the misallocated workers have on average lower productivity than the average worker in their sector, removing the misallocation increases (labor) productivity in both sectors, by 7.9% in non-agriculture and a whopping 39.1% in agriculture.64 Consequently, output increases in both sectors. In particular, it increases by 15.7% in agriculture despite the sector contracting in terms of employment.

In summary, our results indicate that labor is misallocated to a significant degree in Indonesia because of barriers to mobility across sectors. Eliminating such barriers would potentially lead to large aggregate productivity gains. Our work does not offer a practical guide to how the barriers can be eliminated, but it highlights that policies easing the frictions workers face in making sectoral choices could have a large positive impact on the economy.

2.6.3 Industry premia revisited

With the structural model at our disposal, we now use it to shed more light on the empirical content of the reduced-form sectoral premia of the kind we estimated in section 2.3.

There is a strand in the literature (e.g., Hicks et al. (2017), Herrendorf and Schoellman (2018)) arguing that if a substantial cross-sectional non-agriculture premium largely disappears after controlling for worker fixed effects, then the data can be explained by an efficient sorting of workers. In section 2.6.1 we explained that frictionless sorting does not imply that there should be zero premium identified from within-worker variation. The flipside of this argument is that once we allow for barriers to sectoral mobility, the absence of the within-worker premium does not imply that the allocation is efficient. There can be many combinations of processes for permanent and transitory components of comparative advantage draws and barriers to mobility that result in the same cross-sectional and (possibly zero) within-worker premia. To separately identify the role of frictions and of sorting we have to look beyond industry premia at a rich set of moments observable in a panel of workers.

63 In contrast, in a model with barriers to mobility modeled as utility costs of switching, income and welfare would diverge, with the average growth in utility smaller than in income, but positive.

64 The estimated processes of permanent and transitory components of comparative advantage draws imply that both sectors are “standard” in the Roy model terminology of Heckman and Honoré (1990).

To illustrate this discussion, column (2) in Table 2.18 reports the cross-sectional and within-worker non-agriculture premia obtained from data simulated in a counterfactual removing frictions in our baseline model (discussed in the previous subsection). Even though the allocation is perfectly efficient in this case, the non-agriculture premium from a regression with worker fixed effects is not zero, but in fact strongly negative at -35 lp. The negative premium is a natural consequence of the larger variance of productivity shocks faced by workers in agriculture.

The level of the fixed-effect premium by itself therefore does not have clear implications for the strength of barriers to mobility if sorting is also present. But the difference between the fixed-effect and cross-sectional premia does indeed indicate the presence of sorting. To illustrate this point, we consider an alternative counterfactual scenario in which self-selection is eliminated. Specifically, we set $\sigma^2_{\theta_A}$, $\sigma^2_{\theta_N}$, $\sigma^2_{\varepsilon_A}$, $\sigma^2_{\varepsilon_N}$ all to zero. In this case all workers are identical and would prefer non-agriculture as it offers higher prices for human capital. There is no sorting, and both sectors employ workers because of the frictions restricting workers from selecting their preferred sector.
As column (3) in Table 2.18 confirms, when transitions between sectors are purely random the fixed-effect premium takes the same value as the cross-sectional premium.

To summarize, comparing the cross-sectional and within-worker sector premia can be a useful diagnostic for detecting self-selection. But detecting barriers to sectoral mobility in observational data requires imposing sufficient structure and using data beyond the sectoral premia.

2.7 Conclusions

We present extensive reduced-form evidence of a substantial premium for working outside of agriculture in Indonesia. The same individual switching to work in non-agriculture gains about 25-30% in income, while an individual switching in the opposite direction faces an income loss of a similar magnitude. We argue that in order to generate simultaneously those premia and the main moments of the joint distribution of income, we need to extend the models that attribute income gaps across sectors only to sorting of workers by including barriers to sectoral mobility that misallocate workers across sectors.

Our preferred way of thinking about barriers to mobility is that they restrict the ability of workers to work in their desired sectors. Such frictions misallocate a large fraction of workers across sectors (30% in our baseline specification), and imply large income gains (of around 100%) for the misallocated workers when they reallocate. As a result, output in Indonesia could increase by as much as 17% if barriers to mobility across sectors were removed.

In this chapter we are agnostic about the root causes of the barriers to sectoral mobility. Investigating what constitutes such barriers, why they persist, and what policies can be used as a remedy would be a fruitful avenue for future research.

Tables and figures2.8 Tables and figures2.8.1 TablesTable 2.1: Descriptive statisticsIFLS 1: 1993 IFLS 2: 1997 IFLS 3: 2000 IFLS 4: 2007 IFLS 5: 2014Share of male 0.60 0.62 0.59 0.58 0.57Mean age 41.4 38.1 39.0 40.7 41.2Mean years of schooling 5.4 6.1 7.1 7.8 8.7Joint distribution over sectors and locationsTotal Agriculture 0.45 0.35 0.36 0.36 0.29Rural Agriculture 0.42 0.31 0.32 0.31 0.24Urban Agriculture 0.03 0.03 0.04 0.05 0.05Total Non-Agriculture 0.55 0.65 0.64 0.64 0.71Rural Non-Agriculture 0.27 0.30 0.27 0.25 0.27Urban Non-Agriculture 0.28 0.35 0.37 0.39 0.44Total Rural 0.69 0.62 0.59 0.56 0.50Total Urban 0.31 0.38 0.41 0.44 0.50No. observations 9714 12875 17931 20874 24475Main sample: panel of workers with 2+ observationsNo. observations 70586No. individuals 22829732.8. Tables and figuresTable 2.2: Sectoral and urban income premia(1) (2) (3) (4) (5) (6)Log Income Log Income Log Income Log Income Log Income Log IncomeNon-Agriculture 0.839*** 0.686*** 0.574*** 0.332***(0.041) (0.040) (0.036) (0.033)Urban 0.647*** 0.405*** 0.207*** 0.084**(0.045) (0.042) (0.036) (0.032)Agr.×Urban 0.062(0.055)Non-Agr.×Urban 0.416***(0.046)Non-Agr.×Rural 0.326***(0.039)Year FE Yes Yes Yes Yes Yes YesProvince FE Yes Yes Yes Yes Yes YesIndiv. cont. Yes Yes YesIndividual FE Yes YesObservations 48299 48308 48299 44494 44497 44497R2 0.412 0.394 0.424 0.503 0.518 0.518Notes: Individual controls: education, experience, experience sq., and sex. Observations weighted by longitudinal surveyweights. Standard errors clustered by enumeration areas (primary sampling units of the survey) in parentheses. Significancelevels: * p<0.10, ** p<0.05, *** p<0.01.Table 2.3: Transitions across sectors and locationsSector transitions No. of cases Share of totalAA 13214 27.68AN 3886 8.14NA 3546 7.43NN 27098 56.76Total 47744 100.00Indiv. who switch at least once 23.89Location transitions No. of cases Share of totalRR 23299 48.79RU 3171 6.64UR 1166 2.44UU 20121 42.13Total 47757 100.00Indiv. 
who switch at least once 16.91Sector in T+1Agricult. Non-Agr.Sector in TAgricult. 0.78 0.22Non-Agr. 0.12 0.88Location in T+1Rural UrbanLocation in TRural 0.90 0.10Urban 0.05 0.95Notes: XY indicates a transition from sector (or location type) X to Y between two consecutive observations for an individ-ual. A - Agriculture, N - Non-agriculture, R - Rural, U - Urban.742.8. Tables and figuresTable 2.4: Premia for switchers and stayers(1) (2)∆ Log Income ∆ Log IncomeSector transitionsAN 0.220***(0.050)NA -0.392***(0.049)NN -0.066***(0.023)Location transitionsRU 0.091*(0.047)UR -0.199***(0.058)UU -0.040*(0.023)Sector trans. ×MigrationAA ×Migrate -0.108(0.092)AN × Stay 0.196***(0.053)AN ×Migrate 0.275**(0.108)NA × Stay -0.379***(0.054)NA ×Migrate -0.472***(0.110)NN × Stay -0.117***(0.021)NN ×Migrate -0.008(0.039)∆ Year FE Yes Yes∆ Province FE Yes Yes∆ Indiv. cont. Yes YesObservations 27697 24858R2 0.075 0.075Notes: XY indicates a transition from sector (or location type) X to Y between two consecutive observations for an indi-vidual. A - Agriculture, N - Non-Agriculture, R - Rural, U - Urban. Migrate indicates movement outside of the villageboundary. Omitted categories: staying in agriculture (AA) and staying in rural area (RR) in column 1; staying in agricul-ture within the same village (AA×Stay) in column 2. Individual controls: education, experience, experience sq., and sex.Observations weighted by longitudinal survey weights. Standard errors clustered by enumeration areas (primary samplingunits of the survey) in parentheses. Significance levels: * p<0.10, ** p<0.05, *** p<0.01.752.8. Tables and figuresTable 2.5: Job top occupations and typesTop 10 Occupations Empl. 
shareAgricultural and animal husbandry workers 0.352Salesmen, shop assistants and related workers 0.136Bricklayers, carpenters and other construction workers 0.038Maids and related housekeeping service workers NEC 0.038Working proprietors (catering and lodging services) 0.034Transport equipment operators 0.032Teachers 0.031Food and beverage processors 0.027Working proprietors (wholesale and retail trade) 0.026Service workers NEC 0.025Cumulative 0.739Job Type Empl. shareSelf-employed 0.471Private worker 0.318Government worker 0.068Unpaid family worker 0.142Notes: Employment shares reported for IFLS 4 (2007).Table 2.6: Premia for switchers and stayers by job type(1) (2) (3) (4)Self-employed Private Worker Government Unpaid FamilyAN-AA 0.259*** 0.245*** 0.111 0.33518.31 11.98 0.43 1.21NA-NN -0.309*** -0.274*** -0.225 -0.871*33.61 17.89 1.02 3.79Notes: Table presents tests based on results of a first-difference regression (2.2) (c.f. column 1 in Table 2.4) with direction ofsectoral switch interacted with job type. Reported are the difference in coefficients of interest and the value of an F(1,296)test that the difference is zero. Significance levels: * p<0.10, ** p<0.05, *** p<0.01.Table 2.7: Wage premia(1) (2) (3) (4)Log Income Log Income Log Wage Log WageNon-Agriculture 0.574*** 0.332*** 0.490*** 0.231***(0.036) (0.033) (0.051) (0.050)Urban 0.207*** 0.084** 0.193*** 0.119***(0.036) (0.032) (0.042) (0.035)Year FE Yes Yes Yes YesProvince FE Yes Yes Yes YesIndiv. cont. Yes Yes Yes YesIndividual FE Yes YesObservations 44494 44497 23139 23140R2 0.503 0.518 0.556 0.601Notes: Individual controls: education, experience, experience sq., and sex. Observations weighted by longitudinal surveyweights. Standard errors clustered by enumeration areas (primary sampling units of the survey) in parentheses. Significancelevels: * p<0.10, ** p<0.05, *** p<0.01.762.8. 
Tables and figuresTable 2.8: Consumption premia(1) (2) (3) (4) (5) (6)Log PCE Log PCE Log PCE Log PCI Log PCI Log PCINA sh. in HH income 0.305*** 0.702***(0.017) (0.040)Non-Agr. 0.214*** 0.075*** 0.492*** 0.197***(0.014) (0.013) (0.030) (0.024)Urban 0.315*** 0.161*** 0.095*** 0.416*** 0.225*** 0.063*(0.029) (0.024) (0.026) (0.043) (0.034) (0.037)Non-Agr./Yih/Yh 0.382 0.134 0.884 0.352Year FE Yes Yes Yes Yes Yes YesProvince FE Yes Yes Yes Yes Yes YesIndiv. cont. Yes Yes Yes YesIndividual FE Yes YesObservations 40168 53546 53550 38365 51690 51693R2 0.707 0.742 0.784 0.504 0.520 0.541Notes: Specifications (1) and (4) estimated at a household level with observations weighted by longitudinal householdsurvey weights. (1) also includes the number of household members (level and squared) as controls. NA sh. in HHIncome is a continuous variable measuring the share of non-agriculture in household’s income. Specifications (2)-(3) and(5)-(6) estimated at an individual level. Individual controls: education, experience, experience sq., and sex. Observationsweighted by longitudinal survey weights. Standard errors clustered by enumeration areas (primary sampling units of thesurvey) in parentheses. Significance levels: * p<0.10, ** p<0.05, *** p<0.01.Table 2.9: Premia with heterogeneity in Mincerian returns(1) (2) (3) (4)Log Income Log Income Log Income Log IncomeNon-Agriculture 0.574*** 0.332*** 0.625*** 0.314***(0.036) (0.033) (0.039) (0.034)Urban 0.207*** 0.084** 0.200*** 0.074**(0.036) (0.032) (0.034) (0.032)Year FE Yes Yes Yes YesProvince FE Yes Yes Yes YesIndiv. controls Yes Yes Yes YesIndividual FE Yes YesHet. in Mincer Yes YesObservations 44494 44497 44494 44497R2 0.503 0.518 0.506 0.520Notes: Columns (3) and (4) allow for differences in Mincerian returns across sectors and locations. Average marginal effectfor the population reported. Average effects for switchers are similar. Individual Mincerian controls: education, experience,experience sq., and sex. 
Observations weighted by longitudinal survey weights. Standard errors clustered by enumerationareas (primary sampling units of the survey) in parentheses. Significance levels: * p<0.10, ** p<0.05, *** p<0.01.772.8.TablesandfiguresTable 2.10: Premia with additional jobs and home productionBase Base Add. Job Add. Job Add+HH TC Add+HH TC Add+HH FC Add+HH FC(1) (2) (3) (4) (5) (6) (7) (8)Log Income Log Income Log Income Log Income Log Income Log Income Log Income Log IncomeNon-Agr. 0.574*** 0.332*** 0.501*** 0.264*** 0.462*** 0.251*** 0.447*** 0.245***(0.036) (0.033) (0.034) (0.032) (0.033) (0.032) (0.032) (0.032)Urban 0.207*** 0.084** 0.171*** 0.063* 0.141*** 0.057* 0.124*** 0.051(0.036) (0.032) (0.034) (0.034) (0.033) (0.034) (0.033) (0.034)Year FE Yes Yes Yes Yes Yes Yes Yes YesProvince FE Yes Yes Yes Yes Yes Yes Yes YesIndiv. cont. Yes Yes Yes Yes Yes Yes Yes YesIndividual FE Yes Yes Yes YesObservations 44494 44497 44489 44492 44489 44492 44489 44492R2 0.503 0.518 0.514 0.538 0.513 0.540 0.515 0.545Notes: Base is the baseline specification involving primary job only. Add. Job also includes secondary job. HH TC scales income by the inverse of the share of self-producedconsumption in household’s overall consumption. HH FC scales income by the inverse of the share of self-produced food in household’s food consumption. Individual controls:education, experience, experience sq., and sex. Observations weighted by longitudinal survey weights. Standard errors clustered by enumeration areas (primary sampling units ofthe survey) in parentheses. Significance levels: * p<0.10, ** p<0.05, *** p<0.01.782.8. 
Table 2.11: Premia with hours worked

                   (1)          (2)          (3)          (4)          (5)             (6)
                   Log Income   Log Income   Log Income   Log Income   Log Inc./Hour   Log Inc./Hour
Non-Agriculture    0.574***     0.332***     0.441***     0.271***     0.297***        0.185***
                   (0.036)      (0.033)      (0.034)      (0.032)      (0.036)         (0.036)
Urban              0.207***     0.084**      0.160***     0.084***     0.109***        0.076***
                   (0.036)      (0.032)      (0.031)      (0.026)      (0.029)         (0.028)
Log Hours/Year                               0.496***     0.432***
                                             (0.011)      (0.011)
Year FE            Yes          Yes          Yes          Yes          Yes             Yes
Province FE        Yes          Yes          Yes          Yes          Yes             Yes
Indiv. cont.       Yes          Yes          Yes          Yes          Yes             Yes
Individual FE                   Yes                       Yes                          Yes
Observations       44494        44497        43841        43843        43841           43843
R2                 0.503        0.518        0.592        0.595        0.478           0.493

Notes: Individual controls: education, experience, experience sq., and sex. Observations weighted by longitudinal survey weights. Standard errors clustered by enumeration areas (primary sampling units of the survey) in parentheses. Significance levels: * p<0.10, ** p<0.05, *** p<0.01.

Table 2.12: Long run premia

                   1993-2014       93-07/00-14
                   (1)             (2)
                   ∆ Log Income    ∆ Log Income
AA-AN              0.172
                   1.38
NA-NN              -0.369***
                   9.10
ANN-AAA                            0.147*
                                   2.79
NAA-NNN                            -0.186**
                                   4.62
Observations       2567            7857
R2                 0.105           0.098

Notes: Column 1 presents tests based on results of a first-difference regression (2.2), where the difference is over the period 1993-2014. Reported are the difference in coefficients of interest and the value of an F(1,288) test that the difference is zero. Column 2 presents tests based on a first-difference specification over 14 years (1993-2007 or 2000-2014) controlling for direction of switch during the first and second 7-year period. Reported are the difference in coefficients of interest and the value of an F(1,292) test that the difference is zero. Other controls and weights are as in column 1 in Table 2.4. Significance levels: * p<0.10, ** p<0.05, *** p<0.01.

Table 2.13: Parameter estimates

Columns: (1) Frictionless; (2) Unrestricted switching costs; (3) Compensating differential; (4) Single probability of involuntary choices; (5) Heterogeneous probabilities of involuntary choices.

Parameter    (1)     (2)     (3)     (4)     (5)

Variance of permanent comparative advantage in sector s (σ2θs) and covariance (σθAN)
σ2θA         0.27    0.56    0.52    0.32    0.39
σ2θN         0.50    0.22    0.31    0.53    0.45
σθAN         0.55    0.45    0.46    0.68    0.63

Variance of transitory productivity shocks in sector s (σ2εs)
σ2εA         0.00    0.17    0.12    0.01    0.23
σ2εN         0.05    0.00    0.00    0.00    0.00

Variance of measurement error (σ2ν)
σ2ν          0.74    0.70    0.71    0.65    0.51

Price of human capital in sector s at time t (Rst)
RA1          0.75    0.41    0.44    0.76    0.73
RA2          1.21    0.61    0.64    1.18    1.09
RA3          1.04    0.52    0.59    1.07    1.07
RA4          1.41    0.73    0.81    1.30    1.46
RA5          1.61    0.94    1.00    1.60    1.83
RN1          1.15    1.32    1.37    1.23    1.47
RN2          1.86    2.00    2.11    1.88    2.31
RN3          1.50    1.79    1.74    1.64    1.68
RN4          2.02    2.03    2.22    1.94    2.10
RN5          2.44    2.50    2.58    2.62    2.52

Switching cost of moving from sector s to sector s′ (φss′)
lnφAN        –       0.71    –       –       –
lnφNA        –       -0.63   –       –       –

Compensating differential
lncd         –       –       0.61    –       –

Probabilities of involuntary choices
p            –       –       –       0.05    –
pS           –       –       –       –       0.09
pT           –       –       –       –       0.77

Notes: We are currently computing standard errors from 200 bootstraps.

Tables and figuresTable 2.14: Auxiliary models and selected coefficientsAuxiliary model Selected coefficients Coefficient descriptioni) Log-residual income linear regression on the sectorchoice:δ1 Non-agriculture premium(cross-sectional)ln y˜its = c+1{dit = N}δ1+Dt + εistii) Log-residual income linear regression on the sectorchoice:δ2 Non-agriculture premium(within-individual)ln y˜its = c+1{dit = N}δ2+Dt +Di+ εistiii) Log-residual income linear regression on the directionof sector switching:δ3 = γNAδ4 = γAN− γNNPremia for switchers to eachsector relative toln y˜its = c+1{dit−1 = s,dit = s′}γss′+Dt + εist their peers post-switchiv) Log-residual income linear regression in firstdifferences on the direction of sector switching:δ5 = δAN δ6 = δNA−δNN Premia for switchers to eachsector relative to∆ ln y˜its = 1{dit−1 = s,dit = s′}γss′+∆Dt + εist non-switching workersv) Log-residual income linear regression on the interactionbetween sector choice and year:δ7δ8 = γA×2 . . .δ16 = γN×5ConstantInteractions sector and yearln y˜its = δ7+{1{dit = N}×1{dit = t}}γs×t + εistvi) LPM of sector choice on time dummy variables: δ17 Constant1{dit = N}=δ22+1{dit = t}γt + εist δ18 = γ2 . . .δ21 = γ5 Year dummiesvii) LPM of sector choice on previous sector choice: δ22,δ23 Constant and lagged1{dit = N}=δ27+1{dit−1 = N}δ28+ εist sector choicevii) Residual variances: δ24,δ25 For workers in each sectorfrom model v)δ26,δ27 For non-switching workersin each sector from modeliv)δ28,δ29 For switching workers toeach sector from model iv)Notes: LPM stands for linear probability model. 
y˜its is the residual income of individual i in time t working in sector s, thatsatisfies ln y˜its = lnyits−X ′it βˆ , where yits is the observed income, X ′it is the set of observables that includes gender, urban-rural location, years of schooling, years of working experience and square of years of working experience, and βˆ is thevector of estimated coefficients on observables in the log-income linear regression on the interaction between sector choiceand year conditional on observables: lnyits = δ+X ′itβ+{1{dit = N}×1{dit = t}}γs×t +εist . Dt corresponds to year fixed-effects and Di to individual fixed-effects. ∆x is the first difference of variable x. 1{dit = N} is a dummy indicating whetherindividual i works in non-agriculture in period t, 1{dit−1 = s,dit = s′} is a set of dummies indicating whether individual iin period t−1 worked in sector s and in period tworked in sector s′, and 1{dit = t} is a set of dummies indicating whetherthe observation of worker i corresponds to period t. The omitted category in models iii) and iv) is AA, in model v) is A×1and in model vi) is t = 1.812.8. Tables and figuresTable 2.15: Coefficients of auxiliary regression models(1) (2) (3) (4) (5) (6) (7) (8)Coefficientsδi (weight Ωi)Data (δˆi)Standarderror in thedataFrictionlessUnrestrictedswitchingcostsCompensatingdifferentialSingleprobability ofinvoluntarychoicesHeterogenousprobabilitiesof involuntarychoicesNon-agriculture premia: cross-sectional (δ1) and within-individual (δ2)δ1 (1) 0.57 (0.03) 0.56 0.62 0.62 0.60 0.51δ2 (1) 0.40 (0.05) 0.24 0.35 0.34 0.34 0.40Premia for switchers to agriculture (δ3,δ6) and to non-agriculture. (δ4,δ5). 
The first element in (a,b) is relative to peers post-switch;the second to non-switching workersδ3 (5) -0.05 (0.06) -0.07 -0.08 -0.08 -0.11 -0.06δ4 (5) -0.31 (0.05) -0.42 -0.39 -0.39 -0.41 -0.28δ5 (5) 0.15 (0.07) 0.23 0.31 0.29 0.31 0.26δ6 (5) -0.42 (0.06) -0.24 -0.35 -0.34 -0.35 -0.40Constant (δ7) and coefficients on interaction sector and year (δ8 : A×2, δ9 : A×3, . . .δ16 : N×5)δ7 (5) -0.17 (0.10) -0.17 -0.19 -0.19 -0.19 -0.17δ8 (1) 0.38 (0.07) 0.50 0.41 0.42 0.46 0.39δ9 (1) 0.34 (0.07) 0.34 0.29 0.30 0.35 0.38δ10 (1) 0.63 (0.07) 0.62 0.50 0.56 0.54 0.65δ11 (1) 0.85 (0.08) 0.77 0.75 0.77 0.77 0.90δ12 (5) 0.76 (0.06) 0.58 0.64 0.66 0.63 0.69δ13 (1) 1.10 (0.06) 1.05 1.04 1.07 1.05 1.12δ14 (1) 0.89 (0.06) 0.88 0.95 0.92 0.94 0.81δ15 (1) 1.05 (0.06) 1.17 1.09 1.17 1.12 1.04δ16 (1) 1.27 (0.07) 1.34 1.33 1.34 1.39 1.22Constant (δ17) and coefficients on year dummies (δ18 : t = 2, δ19 : t = 3...)δ17 (10) 0.70 (0.01) 0.73 0.71 0.72 0.73 0.70δ18 (10) 0.01 (0.02) 0.00 0.00 0.02 -0.01 -0.02δ19 (10) -0.02 (0.02) -0.04 0.01 -0.02 -0.02 -0.05δ20 (10) -0.03 (0.02) -0.04 -0.06 -0.05 -0.03 -0.07δ21 (10) -0.04 (0.02) -0.01 -0.08 -0.08 0.00 -0.09Constant (δ22) and lagged sector choice (δ23)δ22 (10) 0.21 (0.01) 0.22 0.22 0.24 0.23 0.17δ23 (10) 0.68 (0.01) 0.69 0.65 0.64 0.68 0.71Residual variance of workers in agriculture (δ24) and non-agriculture (δ25)δ24 (3) 1.24 (0.04) 0.98 1.14 1.13 0.96 1.13δ25 (3) 0.95 (0.03) 1.16 1.08 1.10 1.17 1.06Residual variance of non-switching/switching workers in/to agriculture (δ26,δ29) and in/to non-agriculture (δ27,δ28)δ26 (3) 1.43 (0.06) 1.44 1.63 1.59 1.30 1.48δ27 (3) 1.08 (0.04) 1.57 1.41 1.43 1.31 1.01δ28 (3) 1.73 (0.14) 1.55 1.51 1.52 1.74 1.79δ29 (3) 1.86 (0.14) 1.57 1.56 1.54 1.77 1.82Overall fit 1.914 1.306 1.312 1.005 0.315Notes: A description of the auxiliary regressions is done in Table 2.14. Ωi refers to the i−th element of the diagonal of thematrix Ω.822.8. 
Table 2.16: Counterfactual: Aggregate income

Variable                                                                    Notation   Counterfactual
Growth rate (%) in total income: (1)*(2)*(3)                                ∆%Y        16.9
(1) Fraction of the population reallocated                                  m          0.30
(2) Ratio of average income of reallocated workers to average income        ψm         0.54
(3) Growth rate (%) in total income of reallocated workers                  ∆%Ym       105.6

Notes: Results correspond to the counterfactual exercise of eliminating involuntary switches.

Table 2.17: Counterfactual: Sectoral allocation and productivity

Variable                                   Agriculture   Non-Agriculture
Baseline employment share                  0.34          0.66
Counterfactual employment share            0.29          0.71
Counterfactual employment growth (%)       -16.8         8.8
Counterfactual output growth (%)           15.7          17.3
Counterfactual productivity growth (%)     39.1          7.9

Notes: Results correspond to the counterfactual exercise of eliminating involuntary switches.

Table 2.18: Sectoral premia in counterfactuals

         (1)              (2)            (3)
Coef.    Baseline model   No frictions   No sorting

Non-agriculture premia: cross-sectional (δ1) and within-individual (δ2)
δ1       0.51             0.20           0.50
δ2       0.40             -0.35          0.50

Notes: Baseline model is from column (8) of Table 2.15. No frictions imposes pT = pS = 0. No sorting imposes σ2θA, σ2θN, σ2εA, σ2εN all equal to zero.

2.8.2 Figures

Figure 2.1: Mean log income by employment history

[Figure: mean log income (vertical axis) against time in years (horizontal axis: 0, 7, 14), with one line for each employment history: ANN, NAA, AAA, NNN.]

Notes: Figure plots mean log income (after controlling for year and province fixed effects) by employment history spanned by three observations at 7-year intervals. XYZ indicates that worker was in sector X during the first observation (in 1993 or 2000), in sector Y during the second observation 7 years later (in 2000 or 2007), and in sector Z during the third observation 14 years later (in 2007 or 2014). A - Agriculture, N - Non-Agriculture.
For clarity, only histories of switchers who stick to their new sector and of always-stayers are reported.

Chapter 3
Demand Shocks and Inter-industry Distortions under Firm-level Factor Misallocation

3.1 Introduction

In recent years, a growing body of research has strived to understand how factor misallocation across heterogeneous firms can account for differences in aggregate TFP across countries.⁶⁵ The main insight from this literature is that, given a fixed endowment of production factors in the economy and a certain distribution of physical productivity across firms, the inefficient allocation of inputs across production units within industries generates sizable losses in aggregate TFP. Under standard assumptions on the demand and production structure, and regardless of the underlying cause of the inefficient use of resources (regulations, financial constraints, information asymmetries, crony capitalism, etc.), the amount of misallocation can be measured by the extent to which the marginal returns to factors vary within countries. Some evidence suggests a broader dispersion of those returns in developing economies (Banerjee and Duflo (2005), Hsieh and Klenow (2009), Bartelsman et al. (2013)), implying larger productivity losses for those countries. In this way, factor misallocation has become one of the explanations of the observed TFP gaps across countries.

In this chapter I extend the standard model of firm-level factor misallocation in a closed economy (Hsieh and Klenow (2009), HK hereafter) in two dimensions. First, I incorporate idiosyncratic demand shocks. Introducing firm-specific demand shifts does not affect the main predictions of the model. In particular, demand shocks do not alter the main result of revenue productivity equalization across firms in the frictionless allocation. Hence, the TFP gains from removing misallocation remain unchanged.
This is the main reason why measuring physical productivity as in HK, which may reflect variation not only in efficiency but also in demand shocks, does not bias the estimated TFP gains. However, this extension is useful to test the ability of the usual metrics of factor misallocation to explain plants’ survival, since demand shocks are a key determinant of profitability. This test has recently been used to argue that misallocation measures based on the dispersion of revenue productivity suffer from an apparent lack of empirical content (Haltiwanger et al. (2018)). One of the findings supporting this claim is that efficiency, demand shocks and revenue productivity (a measure of factor distortions in the misallocation model) are all unconditionally positively associated with survival, but once we control for efficiency and demand shocks, the coefficient on revenue productivity flips sign. Using Colombian data, with which I can recover measures of demand shocks thanks to the availability of firm-level price indices, I obtain similar empirical findings. However, I argue that explaining plants’ survival with only one determinant of profitability at a time (that is, unconditionally) produces biased estimates, and that including endogenous selection in the model, as has been done by Bartelsman et al. (2013), Yang (2017), Adamopoulos et al. (2017) or in the first chapter of this thesis, can rationalize the signs of the bias and the data findings, addressing Haltiwanger et al.’s (2018) objections.

⁶⁵ For an extensive review, see Restuccia and Rogerson (2013) or Hopenhayn (2014a).

Second, I account for the possibility that production factors are misallocated both within and across industries. I provide closed-form solutions to evaluate the gains in aggregate TFP from removing each type of misallocation, leading to a more straightforward computation relative to the methods proposed in the literature (Oberfield (2013), Brandt et al. (2013)).
Naturally, the magnitude of the gains from removing each type depends on the industry aggregation considered. Using data from China and Colombia, I show that under the most commonly used industry classifications (3- and 4-digit ISIC) the contribution of the inter-industry type can be as high as 35% of the total gains from removing factor distortions. Given the relevance of inter-industry misallocation, it is worth knowing whether the TFP loss it induces is larger in less developed economies, as is the case with within-industry misallocation. I use cross-country data to show that this is the case, suggesting that the TFP gaps attributed to factor misallocation can be larger than those computed using only intra-industry reforms.

3.2 Demand shocks and plant survival

In this section, I extend HK’s model of firm-level factor misallocation in a closed economy to account for idiosyncratic demand shifters. I first show that the TFP gains from removing misallocation are not affected by the introduction of demand shocks in the model. Next, I discuss the ability of the standard metrics of misallocation to explain plants’ survival when demand shocks are taken into account. In particular, I argue that the apparent contradiction in the signs of the unconditional estimates of the determinants of profitability in the misallocation model, instead of suggesting a lack of empirical content of the misallocation measures, can be the natural result of a process of selection of firms in the economy.

3.2.1 Demand shocks and TFP gains from removing misallocation

HK assume a standard monopolistic competition model where each variety $m$ in a manufacturing industry $s$ is produced using a set of $L$ homogeneous factors⁶⁶ $z_{lm}$ ($l$ denotes the factor of production; I omit industry subscripts for firm-specific variables). Industry demand $Q_s$ is a CES aggregate of $M_s$ varieties with elasticity of substitution $\sigma$ (I denote sectoral aggregates with capital letters and let $\rho = \frac{\sigma-1}{\sigma}$ denote the inverse of the mark-up). Firms use a Cobb-Douglas (CD) technology with constant returns to scale and factor intensities $\alpha_{ls}$, common to all firms within the industry. Firms differ in terms of efficiency, i.e. in Hicks-neutral physical productivity, or TFPQ, defined as the ratio between output $q_m$ and input use, given by the composite bundle $\prod_{l}^{L} z_{lm}^{\alpha_{ls}}$. Define revenue productivity, or TFPR, as the ratio between revenue, $r_m = p_m q_m$, and the same input bundle. Since profit maximization entails that firms’ prices $p_m$ are a constant mark-up over their marginal cost, the cost function automatically implies that if all firms are price-takers in factor markets there is TFPR equalization within industries.⁶⁷ That is, a standard monopolistic framework with heterogeneous firms and frictionless factor markets allows firms to vary in terms of TFPQ, but imposes an equal TFPR for all firms within industries. Therefore, under this simple setting dispersion in TFPR is a signal of allocative inefficiency.⁶⁸

⁶⁶ HK assume two production factors (capital and labor), but the model can be easily extended to account for more factors, as I do in this section.

Denote the TFPQ and TFPR of the firm producing variety $m$ as $a_m$ and $\psi_m$, respectively. Standard aggregation under monopolistic competition leads to an industry production function of the form $Q_s = A_s M_s^{\frac{1}{\sigma-1}} \prod_{l}^{L} Z_{ls}^{\alpha_{ls}}$, where sectoral TFP $A_s$ can be derived from firm-level data from:

$$A_s^{\sigma-1} = \frac{1}{M_s}\sum_{m}^{M_s}\left(a_m \frac{\bar{\psi}_s}{\psi_m}\right)^{\sigma-1} \tag{3.1}$$

where $\bar{\psi}_s$ is the sectoral revenue productivity. If a reform equalizes TFPR across firms, the sectoral (efficient) TFP is simply the power mean of physical productivities: $\tilde{A}_s^{\sigma-1} = \tilde{M}_s^{-1}\sum_{m}^{\tilde{M}_s} a_m^{\sigma-1}$. With the assumption of no self-selection of firms, $\tilde{M}_s = M_s$ and the percentage gains in sectoral TFP due to TFPR equalization are:

$$Gains_s^{intra} = 100\left(\frac{\tilde{A}_s}{A_s}-1\right) = 100\left(\left[\frac{1}{M_s}\sum_{m}^{M_s}\left(\frac{a_m \bar{\psi}_s}{\tilde{A}_s \psi_m}\right)^{\sigma-1}\right]^{\frac{1}{1-\sigma}}-1\right) \tag{3.2}$$

Equation (3.2) is the cornerstone of HK’s counterfactual exercise, and the description provided up to this point summarizes the main features of HK’s model.
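As a rough numerical illustration of equations (3.1) and (3.2), the following sketch computes the distorted and efficient sectoral TFP for a toy industry. The function names are hypothetical, and the helper that builds $\bar{\psi}_s$ assumes a single composite input and no demand shocks (so that revenue shares are proportional to $(a_m/\psi_m)^{\sigma-1}$); that simplification is made here only for illustration and is not part of the text.

```python
import numpy as np

def sectoral_tfp(a, psi, psi_bar, sigma):
    """Distorted sectoral TFP A_s implied by equation (3.1)."""
    a, psi = np.asarray(a, float), np.asarray(psi, float)
    return np.mean((a * psi_bar / psi) ** (sigma - 1.0)) ** (1.0 / (sigma - 1.0))

def efficient_tfp(a, sigma):
    """Efficient sectoral TFP: power mean of TFPQ with exponent sigma - 1."""
    a = np.asarray(a, float)
    return np.mean(a ** (sigma - 1.0)) ** (1.0 / (sigma - 1.0))

def psi_bar_single_input(a, psi, sigma):
    """ASSUMPTION (illustration only): one composite input and no demand shocks,
    so sectoral TFPR is the revenue-weighted harmonic mean of firm TFPR, with
    revenue shares proportional to (a_m / psi_m)^(sigma - 1)."""
    a, psi = np.asarray(a, float), np.asarray(psi, float)
    share = (a / psi) ** (sigma - 1.0)
    share /= share.sum()
    return 1.0 / np.sum(share / psi)

def intra_gains(a, psi, sigma):
    """Percentage gain in sectoral TFP from equalizing TFPR, equation (3.2)."""
    psi_bar = psi_bar_single_input(a, psi, sigma)
    return 100.0 * (efficient_tfp(a, sigma) / sectoral_tfp(a, psi, psi_bar, sigma) - 1.0)

# Two equally efficient firms: gains are zero without TFPR dispersion
# and positive once one firm faces a higher TFPR.
gains_equal = intra_gains([1.0, 1.0], [1.0, 1.0], sigma=3.0)
gains_disp = intra_gains([1.0, 1.0], [1.0, 2.0], sigma=3.0)
```

With identical TFPR the two TFP measures coincide, which is the sense in which TFPR dispersion alone drives the gains in (3.2).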
Notice that HK assume a CES demand for variety $m$, which yields a demand equation of the form $q_m = p_m^{-\sigma} P_s^{\sigma} Q_s$, that is, an isoelastic demand function, which is linear in logs. Now, let me introduce a firm-specific demand shifter $\gamma_m$ for variety $m$, such that the demand equation is:

$$q_m = \gamma_m p_m^{-\sigma} P_s^{\sigma} Q_s \tag{3.3}$$

⁶⁷ In this case the marginal cost is simply $MC_m = \frac{1}{TFPQ_m}\prod_{l}^{L}\left(\frac{w_l}{\alpha_{ls}}\right)^{\alpha_{ls}}$, where $w_l$ is the price of factor $l$. Profit maximization implies that the firm’s output price is a constant mark-up over its marginal cost, $p_m = \frac{1}{\rho} MC_m$. Hence, given that marginal and unit costs are equal under constant returns to scale, revenues are $1/\rho > 1$ times the total cost. Revenue productivity for the firm is $TFPR_m \equiv p_m q_m / \prod_{l}^{L} z_{lm}^{\alpha_{ls}} = p_m TFPQ_m = \frac{1}{\rho}\prod_{l}^{L}\left(\frac{w_l}{\alpha_{ls}}\right)^{\alpha_{ls}}$, i.e. the return of the composite bundle $\prod_{l}^{L} z_{lm}^{\alpha_{ls}}$, which does not depend on $m$. Thus, TFPR should be equal for all firms within an industry, and the only differences across industries are due to factor intensities.

⁶⁸ See Table 1.1 in the first chapter for a review of the literature regarding the contribution of other possible sources of variation in TFPR that do not imply that factors are misallocated.

The demand shifter $\gamma_m$ could represent not only differences in idiosyncratic demand, but also differences in quality. The sectoral demand function that rationalizes the demand shifter is $Q_s^{\rho} = \sum_{m}^{M_s} \gamma_m^{\frac{1}{\sigma}} q_m^{\rho}$ and the corresponding price index is $P_s^{1-\sigma} = \sum_{m}^{M_s} \gamma_m p_m^{1-\sigma}$. Thus, sectoral TFP $A_{s,d}$ can be obtained in this case through:

$$A_{s,d}^{\sigma-1} = \frac{1}{M_s}\sum_{m}^{M_s}\gamma_m\left(a_m \frac{\bar{\psi}_s}{\psi_m}\right)^{\sigma-1} \tag{3.4}$$

while the sectoral efficient TFP $\tilde{A}_{s,d}$ is now $\tilde{A}_{s,d}^{\sigma-1} = \tilde{M}_s^{-1}\sum_{m}^{\tilde{M}_s}\gamma_m a_m^{\sigma-1}$.

The choice between equations (3.1) and (3.4) to compute the sectoral TFP (and the corresponding efficient TFP and gains from removing misallocation) depends on the availability of measures of $\gamma_m$. These shocks can in turn be recovered as residuals from (3.3) with information on prices or quantities at the firm level.
One possibility is to employ econometric techniques to back out prices or quantities, an option that necessarily requires additional (and strong) assumptions. Another possibility is to take advantage of datasets with direct measures of firms’ prices or quantities, an approach followed by Haltiwanger et al. (2018) with U.S. data and in this chapter thanks to the availability of firm-level prices in the Colombian data.⁶⁹

The availability of information on prices or quantities at the firm level allows us to recover not only a demand shifter but also a direct measure of TFPQ that captures only technical efficiency. In this way, we do not need to rely on the usual method to compute TFPQ in the misallocation literature, which assumes that firms’ prices satisfy the CES demand equation, so that measured TFPQ reflects variation not only in efficiency but also in the quality of the products or in demand shocks. In particular, to obtain TFPQ, HK use:

$$a_m^{HK} = \kappa_s \frac{(p_m q_m)^{\frac{1}{\rho}}}{\prod_{l}^{L} z_{lm}^{\alpha_{ls}}} \tag{3.5}$$

where $\kappa_s$ collects sectoral values and thus can be omitted when computing the TFP gains from removing intra-industry misallocation. Notice that $a_m^{HK}$ is proportional to $\gamma_m^{\frac{1}{\sigma-1}} a_m$, and thus the TFP gains computed using $a_m^{HK}$ in equation (3.1) (which requires TFPQ to the power of $\sigma-1$, that is, $\gamma_m a_m^{\sigma-1}$) are the same as those obtained using individual measures for $\gamma_m$ and $a_m$ in equation (3.4) (which requires $\gamma_m a_m^{\sigma-1}$).

This equivalence result is a consequence of the fact that in the CES case demand shocks do not alter the main implication of the misallocation model, namely the equalization of revenue productivity across firms in the frictionless allocation, since prices for variety $m$ are not affected by $\gamma_m$. Demand shocks impact firms’ profits, but not the allocation of factors, since this allocation depends exclusively on the firm’s cost minimization problem. Hence, the dispersion in TFPR is still a valid measure of firm-level factor misallocation under idiosyncratic demand shocks.

⁶⁹ I use the firm-level prices constructed by Eslava et al. (2004) for the period 1984-1998. See the data appendix in chapter 1 for details about the dataset and the cleaning procedure used to reduce the influence of measurement error and outliers.

3.2.2 Misallocation measures and survival of firms

A recent paper by Haltiwanger et al. (2018) raises concerns about the empirical content of the measures of misallocation based on the dispersion in TFPR. One of their arguments is based on the fact that once we have access to direct measures of efficiency and demand shifters as determinants of firms’ profitability, we can analyze the ability of misallocation measures to predict plants’ survival. In particular, they show with U.S. data that efficiency, demand shocks and TFPR are all unconditionally positively associated with survival, but once we control for efficiency and demand shocks, the coefficient on TFPR flips sign. They conclude: “measured distortions do include information about something that is a true distortion, but this component of the measure is empirically swamped by other sources of variation that are instead associated with (positive) fundamentals about producer profitability [efficiency and demand shocks]” (Haltiwanger et al. (2018, pg. 31)).

In more detail, notice first that selection of firms is on profits: more profitable firms are more likely to remain in the market whereas unprofitable firms tend to exit. In the above framework, profits from variety $m$ are proportional to $p_m^{1-\sigma}\gamma_m$. Then, profits are an increasing function of the firm’s TFPQ (as an indicator of efficiency) and of the demand shifter $\gamma_m$, and a decreasing function of its TFPR (as an indicator of frictions in all factor markets). A regression of the probability of survival on TFPQ, $\gamma_m$ and TFPR, controlling for relevant observables, should display positive signs in the first two cases and a negative sign in the third.

Table 3.1 presents the results for the linear probability models of firms’ survival.
Columns (1)-(3) display the results of the regressions on each of the three determinants of profitability. I control for year and 4-digit industry fixed effects. The only determinant that shows the opposite sign to the one predicted is TFPR, a result similar to that obtained by Haltiwanger et al. (2018). However, for the regression including the three determinants in column (4), the coefficient on TFPR flips sign, suggesting that TFPR, conditional on TFPQ and demand shocks, is inversely related to profits, as suggested by the misallocation model, while the signs on TFPQ and $\gamma_m$ are still the expected ones. These findings are robust to the inclusion of firm observables (size, age and lagged capital; results displayed in column (7)) and geographic fixed effects (results in column (8)). That is, we obtain the same type of “anomaly” as the one documented by Haltiwanger et al. (2018) with U.S. data, which leads them to conclude that the distortionary component of TFPR is empirically swamped by the other determinants of profitability.

My argument here, following a similar reasoning as in Yang (2017), is that the results in Table 3.1 are perfectly consistent with the misallocation framework if we augment the model to account for firm selection (which is very likely to occur in the data), as is done in closed economy settings by Bartelsman et al. (2013), Yang (2017) or Adamopoulos et al. (2017), or for an open economy in the first chapter of this thesis. Regressions in columns (1)-(3) suffer from omitted variable bias since the remaining determinants of profitability are excluded. In a model with selection of firms, TFPR is positively correlated with both demand shocks and TFPQ, since firms with bad draws of TFPQ or demand shocks and high TFPR are not active (for these firms profits are the lowest). Since the “true” signs of the omitted determinants in column (2) (TFPQ and $\gamma_m$) are both positive, the bias in the coefficient of TFPR is positive, and hence the estimated coefficient on TFPR in (2) is greater than the “true” conditional value. In practice, this bias can reverse the “true” sign in an unconditional regression as in column (2).⁷⁰ The regressions in columns (1) and (3) suffer from a similar bias, but in those cases it is not possible to know the sign of the bias since the omitted determinants (TFPR and $\gamma_m$ in column (1), and TFPQ and TFPR in column (3)) have opposite “true” signs. Nevertheless, to explore this bias and confirm the intuition, I can use the approach of HK to measure TFPQ as in (3.5), which mixes true efficiency and demand shocks in a single measure, $a_m^{HK}$, to replicate the exercise. Column (4) shows the results for the unconditional estimate of $a_m^{HK}$, whereas column (5) controls for TFPR. The conclusion for the bias on TFPR is the same. However, since we now know that the “true” sign of the omitted variable in (4) (TFPR) is negative, we should obtain a negative bias in the coefficient of $a_m^{HK}$ in (4), and thus an estimated value smaller than its “true” value in (5), exactly as shown in Table 3.1.

In an open economy setting as in Chapter 1, factor misallocation should also affect the selection of exporters. Since only firms with enough profits to pay the costs of international trade become exporters, the decision to export is also influenced by TFPQ, TFPR and demand shocks. Table 3.2 presents the results of the regressions of the probability of being an exporter on the same variables as in Table 3.1, using a shorter panel due to the availability of firm-level exports in the Colombian data.⁷¹ The signs on TFPQ, TFPR and demand shocks remain the same in all specifications. The only exception is TFPQ in column (1), which does not show a significant coefficient, confirming the bias of the unconditional estimates.
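The direction of the omitted-variable bias described above can be illustrated with a small simulation. This is only a sketch under assumptions chosen for convenience, not the numerical test reported in the thesis: log TFPQ, the log demand shifter and log TFPR are drawn as independent standard normals, $\sigma = 3$, firms are active when log profits are positive, and survival adds a hypothetical normal shock to log profits. All names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 200_000, 3.0

# ASSUMPTION: independent standard-normal draws for log TFPQ, log gamma, log TFPR.
log_a, log_gamma, log_tfpr = rng.standard_normal((3, n))

# Log profits (up to a constant): log(gamma) + (1 - sigma) * log(p), with p = TFPR / TFPQ.
log_profit = log_gamma + (sigma - 1.0) * (log_a - log_tfpr)

# Selection: only firms with positive log profits are observed as active.
active = log_profit > 0.0
a, g, psi = log_a[active], log_gamma[active], log_tfpr[active]

# ASSUMPTION: survival of active firms depends on profits plus an unobserved shock.
shock = 3.0 * rng.standard_normal(n)
survive = (log_profit + shock > 0.0)[active].astype(float)

def lpm(y, *regressors):
    """Linear probability model by OLS; returns slopes (constant dropped)."""
    X = np.column_stack([np.ones_like(y), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0][1:]

corr_a_tfpr = np.corrcoef(a, psi)[0, 1]   # induced by selection on profits
b_uncond = lpm(survive, psi)[0]           # TFPR alone
b_cond = lpm(survive, a, g, psi)          # TFPQ, gamma and TFPR jointly
```

Among active firms, TFPR becomes positively correlated with TFPQ even though the underlying draws are independent, and the coefficient on TFPR in the regression that omits TFPQ and the demand shifter lies above the conditional one. Whether the bias is large enough to flip the sign depends on the distributional and selection assumptions, which this sketch does not attempt to calibrate.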
Therefore, including firm selection in the model can rationalize the signs of the bias and the empirical findings, addressing the objections of Haltiwanger et al. (2018) about the empirical content of TFPR as a measure of factor distortions.

⁷⁰ I have numerically tested this proposition assuming a functional form for the joint distribution of TFPQ and $\gamma_m$ (jointly normal), selecting firms according to a cutoff function as in Chapter 1, and running the regressions with the selected data. For a broad range of parameters of the joint distribution, the sign reversal is feasible.

⁷¹ Eslava et al.’s (2004) dataset does not include information on exports, so I match my original dataset with the panel employed by Bombardini et al. (2012b) for 1978-1991, which has been used extensively in the literature, to obtain exports. See details in Appendix B1.

3.3 Intra- and inter-industry misallocation

In this section I extend the model to account for misallocation both within and across industries. I first present the closed-form formulas to compute the TFP gains from removing each type of factor misallocation under the standard two-tier (Cobb-Douglas-CES) demand system. Next, I explore how robust the results are to the production function specification and to the elasticity of substitution across sectors. Finally, I compare the gains from removing inter-industry misallocation using cross-country data, to show that the TFP gaps attributed to factor misallocation might be larger than those obtained using only intra-industry reforms.

3.3.1 Accounting for inter-industry factor misallocation

Panel A of Figure 3.1 shows the distributions of TFPQ and TFPR for the Colombian manufacturing sector, controlling for 4-digit International Standard Industrial Classification (ISIC) industries and year fixed effects. I use a gross-output specification for the production function with four inputs: capital, skilled labor, unskilled labor and materials.
Although Figure 3.1 shows a larger variance in TFPQ, the dispersion of TFPR suggests allocative inefficiencies. This dispersion is a result of the misallocation of all inputs. With a CD technology, TFPR can be expressed as the weighted geometric average of the marginal revenue products (MRP) of the factors, using factor intensities as weights. In frictionless factor markets, there should be MRP equalization for all firms in the economy. Constant returns to scale imply that the MRP are directly proportional to the average revenue products of factors, which are observable measures.⁷² Panel B of Figure 3.1 displays the distributions of MRP for the homogeneous inputs (capital, skilled labor and unskilled labor) used in the construction of TFPR above, exploiting the proportionality between marginal and average returns. The observed dispersions suggest that although the factor with the most extensive misallocation is capital, all factors seem to contribute to some degree to the variation in TFPR. Moreover, the extent of misallocation varies across sectors. Figure 3.2 compares the same distributions for three different industries: food, chemicals and transport equipment. Not only the dispersion but also the expected values vary across industries, suggesting the presence of inter-industry factor misallocation, which I aim to quantify in this section.

To characterize the observed dispersion in the factors’ MRP we can use wedge analysis. In an efficient allocation, all firms should face the same price for inputs, say $w_l$ for factor $z_{lm}$. To replicate the dispersion in the factors’ MRP, I assume that firms face an idiosyncratic distortion $\theta_{lm}$ in the market of factor $z_{lm}$ such that the observed return of the factor is $(1+\theta_{lm})\frac{w_l}{\rho}$.
Thus, the wedge $(1+\theta_{lm})$ for the firm producing variety $m$ represents the gap, expressed as a ratio, between the observed MRP of factor $l$, $\frac{\alpha_{ls} r_m}{z_{lm}}$, and its return in the efficient allocation, $\frac{w_l}{\rho}$:

$$(1+\theta_{lm}) = \frac{\rho\,\alpha_{ls} r_m}{w_l z_{lm}} \tag{3.6}$$

Since the interest here is to recreate the dispersion in the factors’ MRP while being agnostic about the underlying cause of the misallocation, factor wedges are taken as primitives in the model. Restuccia and Rogerson (2013) call this strategy the “indirect approach” to quantitatively assessing the implications of resource misallocation. Denote by $(1+\bar{\theta}_{ls})$ the harmonic weighted average (HWA) of all factor-$l$ wedges $(1+\theta_{lm})$ in sector $s$, with weights given by firms’ shares in total industry revenue ($R_s$), that is:

$$(1+\bar{\theta}_{ls}) = \left(\sum_{m}^{M_s}\frac{1}{(1+\theta_{lm})}\frac{p_m q_m}{P_s Q_s}\right)^{-1} = \frac{\rho\,\alpha_{ls} R_s}{w_l Z_{ls}} \tag{3.7}$$

where $Z_{ls}$ is the total demand of factor $l$ and $M_s$ is the number of firms in sector $s$. The second equality in equation (3.7) shows that this average wedge is the industry analogue of a firm-level wedge for each production factor. Thus, this average wedge, which only needs industry-level information to be computed, can be used to quantify the amount of factor misallocation across industries. Revenue productivity at the industry level, computed as the ratio between sectoral revenue and total input use, can be expressed as the geometric average of $(1+\bar{\theta}_{ls})\frac{w_l}{\rho}$ over all factors, with weights given by each input intensity. In this way, sectoral revenue productivity is a measure of the returns to factors that firms face on average in the industry.

⁷² As pointed out by Bartelsman et al. (2013), the proportionality is not valid when the production function includes overhead factors (fixed costs), since the production function is no longer homogeneous.
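The wedge definitions in (3.6) and (3.7) can be sketched in a few lines. The array layout (firms in rows, factors in columns), the function names and the toy numbers are assumptions made only for this illustration.

```python
import numpy as np

def firm_wedges(revenue, inputs, alpha, w, rho):
    """Equation (3.6): 1 + theta_lm = rho * alpha_ls * r_m / (w_l * z_lm).

    revenue: (M,) firm revenues r_m; inputs: (M, L) factor use z_lm;
    alpha: (L,) factor intensities of the sector; w: (L,) factor prices.
    """
    return rho * alpha[None, :] * revenue[:, None] / (w[None, :] * inputs)

def sector_wedges(wedges, revenue):
    """First equality of (3.7): revenue-weighted harmonic average of firm wedges."""
    shares = revenue / revenue.sum()
    return 1.0 / (shares[:, None] / wedges).sum(axis=0)

# Hypothetical sector with three firms and two factors.
revenue = np.array([10.0, 20.0, 30.0])
inputs = np.array([[2.0, 1.0], [3.0, 4.0], [9.0, 3.0]])
alpha = np.array([0.4, 0.6])
w = np.array([1.0, 2.0])
rho = 2.0 / 3.0

wedges = firm_wedges(revenue, inputs, alpha, w, rho)
theta_bar = sector_wedges(wedges, revenue)

# Second equality of (3.7): the same object from industry totals only.
theta_bar_from_totals = rho * alpha * revenue.sum() / (w * inputs.sum(axis=0))
```

The two ways of computing the sectoral wedge coincide, which is why industry-level data suffice to measure it.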
In the inter-industry efficient allocation, sectoral revenue productivities should differ only by factor intensities.

To visualize the problem of both intra- and inter-industry factor misallocation, Panel A of Figure 3.3 represents all firms of two industries (food and vehicles) in the (TFPQ, TFPR) space. When removing only intra-industry factor misallocation, which is the exercise proposed by HK, all firm-level wedges $(1+\theta_{lm})$ collapse to their industry’s HWA $(1+\bar{\theta}_{ls})$. Thus, the new values of firms’ TFPR coincide exactly with the corresponding industry’s revenue productivities, which are represented by the dashed lines in the graph. However, the revenue productivities at the industry level are not necessarily allocatively efficient. Frictionless factor markets require that sectoral revenue productivities differ only by factor intensities, so that all firms face the same prices for primary factors. Assuming values of $w_l$ such that the HWA of the sectoral $(1+\bar{\theta}_{ls})$ is equal to one,⁷³ the inter-industry allocatively efficient sectoral revenue productivities are given by the weighted geometric average of $\frac{w_l}{\rho}$ over all factors, with weights given by the factor intensities. The values of the inter-industry efficient allocation are represented by the solid lines in the graph. Panel B of Figure 3.3 shows both the intra-industry and the intra- and inter-industry efficient allocations for all firms of the two industries studied.

To quantify the importance of each type of allocative inefficiency in the data, it is useful to compute the contribution of each one to the total TFP loss due to factor misallocation. Denote by $\xi_{lm}$ the MRP of input $l$. Let $\bar{\xi}_{ls}$ denote the HWA of $\xi_{lm}$, with weights given by the shares of firms’ revenues in total industry revenue. Note that $\bar{\xi}_{ls} = (1+\bar{\theta}_{ls})\frac{w_l}{\rho}$. Assume that the production of the final good combines the output of $S$ industries using a Cobb-Douglas (CD) technology with revenue shares $\beta_s$.
Using the cost minimization condition of the CD aggregator across sectors, total demand of factor $l$ in industry $s$ can be expressed as:

$$Z_{ls} = \frac{\alpha_{ls}\beta_s/\bar{\xi}_{ls}}{\sum_{s'}^{S}\alpha_{ls'}\beta_{s'}/\bar{\xi}_{ls'}}\,\bar{Z}_l \tag{3.8}$$

where $\bar{Z}_l \equiv \sum_{s}^{S} Z_{ls}$ corresponds to the fixed endowment of factor $l$ in the economy.

⁷³ I use $w_l = \rho R / \sum_s \frac{Z_{ls}}{\alpha_{ls}}$, where $R$ is total revenue, $\sum_s R_s$. These values satisfy the solution for relative factor prices in general equilibrium for an allocatively efficient closed economy, which is given by $\frac{w_l}{w_k} = \frac{\bar{Z}_k \sum_s \alpha_{ls}\beta_s}{\bar{Z}_l \sum_s \alpha_{ks}\beta_s}$, where $\bar{Z}_l$ is the total endowment of factor $l$ (see Appendix D). Further, these factor prices allow me to interpret all wedges as deviations with respect to one. Firms with wedges greater than one employ a smaller amount of the factor than in the efficient allocation, and vice versa.

The gains from removing intra-industry misallocation in (3.2) are the same whether the reform equalizes firms’ TFPR to $\bar{\psi}_s$, so that the factors’ MRP are equal to their HWA in the industry, or to its value in the inter-industry efficient allocation, in which case the factors’ MRP are equated to $\frac{w_l}{\rho}$. However, only in the first case is it ensured that there are no factor reallocations across sectors (which is evident from equation (3.8)), so that the sectoral TFP gains in equation (3.2) are identical to the gains in industry output, $100\left(\frac{\tilde{Q}_s}{Q_s}-1\right)$. In this specific case, total output gains in the economy can be computed simply by aggregating sectoral productivities using the CD aggregator across industries:

$$Gains^{intra} = 100\left(\prod_{s}^{S}\left(\frac{\tilde{A}_s}{A_s}\right)^{\beta_s}-1\right) \tag{3.9}$$

Clearly, total gains in (3.9) are only due to resource reallocation within industries: by assumption, there are no factor reallocations across sectors. In this case, there is MRP equalization within industries, but not necessarily across them. In the more general case in which I impose MRP equalization not only within but also across industries (i.e. removing all wedges), sectoral TFP gains are the same as in (3.2), but output gains in each industry are no longer equal to the corresponding TFP gains, due to factor reallocation across sectors. From (3.8), the allocatively efficient demand of factors at the industry level is given by $\tilde{Z}_{ls} = \alpha_{ls}\beta_s \bar{Z}_l / \sum_{s'}^{S}\alpha_{ls'}\beta_{s'}$.⁷⁴ Industry output in frictionless factor markets is given by $\tilde{Q}_s = \tilde{A}_s \tilde{M}_s^{\frac{1}{\sigma-1}}\prod_{l}^{L}\tilde{Z}_{ls}^{\alpha_{ls}}$. Thus, the variation in sectoral output due to a reform that removes all wedges is a consequence of both a rise in TFP and a variation in the use of factors in the whole sector, which depends exclusively on the sign of $\bar{\theta}_{ls}$ (the extent of inter-industry misallocation). At the aggregate level, factor endowments are kept constant between the distorted economy and the allocatively efficient counterfactual. So any change in aggregate output $Q$ is attributable to variations in aggregate TFP, and is due to resource reallocation, both within and between industries. Gains in aggregate TFP can be caused by increases in sectoral TFP, the term denoted $Gains^{intra}$ above, or by reallocation of factors between industries, given by:

$$Gains^{inter} = 100\left(\prod_{s}^{S}\prod_{l}^{L}\left(\frac{\tilde{Z}_{ls}}{Z_{ls}}\right)^{\alpha_{ls}\beta_s}-1\right) = 100\left(\prod_{s}^{S}\prod_{l}^{L}\left[\frac{\sum_{s'}^{S}\alpha_{ls'}\beta_{s'}/\bar{\xi}_{ls'}}{\left(\sum_{s'}^{S}\alpha_{ls'}\beta_{s'}\right)/\bar{\xi}_{ls}}\right]^{\alpha_{ls}\beta_s}-1\right) \tag{3.10}$$

where I use equation (3.8) and the expression for $\tilde{Z}_{ls}$ to obtain the explicit closed-form solution. Thus, inter-industry gains only depend on the industry average MRP interacted with technological parameters, a plain consequence of the sectoral demand of factors in equation (3.8). These gains can be computed with industry-level data only, a fact that allows me to make cross-country comparisons to evaluate whether this component also explains the TFP gaps observed across countries, an exercise that is performed below.
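A minimal sketch of how (3.8) and (3.10) could be evaluated from industry-level inputs follows. The array layout (sectors in rows, factors in columns), the function names and the toy numbers are hypothetical and serve only to show that the closed form agrees with a direct comparison of efficient and distorted factor demands.

```python
import numpy as np

def factor_demand(alpha, beta, xi_bar, endowment):
    """Equation (3.8): distorted demand Z_ls of each factor in each sector.

    alpha: (S, L) factor intensities; beta: (S,) revenue shares;
    xi_bar: (S, L) sectoral average MRP; endowment: (L,) factor endowments.
    """
    weights = alpha * beta[:, None] / xi_bar
    return weights / weights.sum(axis=0) * endowment

def inter_gains(alpha, beta, xi_bar):
    """Equation (3.10), closed form: gains from reallocating factors across sectors."""
    ab = alpha * beta[:, None]
    # Ratio of efficient to distorted demand, Z~_ls / Z_ls, for every (s, l).
    ratio = (ab / xi_bar).sum(axis=0) / (ab.sum(axis=0) / xi_bar)
    return 100.0 * (np.prod(ratio ** ab) - 1.0)

# Hypothetical economy: three sectors, two factors.
alpha = np.array([[0.3, 0.7], [0.6, 0.4], [0.5, 0.5]])
beta = np.array([0.5, 0.3, 0.2])
xi_bar = np.array([[1.0, 1.0], [1.5, 0.8], [0.7, 1.2]])
endowment = np.array([100.0, 200.0])

gains_inter = inter_gains(alpha, beta, xi_bar)

# Cross-check against the first expression in (3.10), built from factor demands.
ab = alpha * beta[:, None]
z_distorted = factor_demand(alpha, beta, xi_bar, endowment)
z_efficient = ab / ab.sum(axis=0) * endowment
gains_direct = 100.0 * (np.prod((z_efficient / z_distorted) ** ab) - 1.0)

# With equal average MRP across sectors there is nothing to reallocate.
gains_no_dispersion = inter_gains(alpha, beta, np.ones_like(xi_bar))
```

Note that the endowments drop out of the closed form, so only the sectoral average MRP and the technology parameters are needed.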
Finally, total gains in the economy, given by the variation in total output (or aggregate TFP), are a combination of both sources of gains:

$$Gains = 100\left(\frac{\tilde{Y}}{Y}-1\right) = 100\left[\left(\frac{Gains^{inter}}{100}+1\right)\left(\frac{Gains^{intra}}{100}+1\right)-1\right] \tag{3.11}$$

⁷⁴ That is, in the case that all sectors have the same revenue shares, the efficient allocation of factors across sectors implies that industries that use a factor more intensively should have a larger proportion of that factor. Similarly, in the case that all sectors have the same factor intensities, the factors should be allocated in proportion to sectoral revenue shares only. The efficient factor allocation across industries is the combination of these two forces.

The importance of each type of misallocation depends, of course, on the industry aggregation considered. For example, in the extreme case in which the whole manufacturing sector is represented as a single industry, the entire TFP loss due to allocative inefficiency stems from the intra-industry type, whereas in the opposite extreme, in which each firm is its own industry, the whole loss stems from the inter-sectoral type. Using a 4-digit ISIC industry classification,⁷⁵ a value-added specification for the production function, and average US cost shares at the corresponding aggregation level from the NBER-CES Manufacturing Industry Database during the same period (the same set of specifications as in HK’s baseline), I find that the inter-sectoral component contributes on average up to 35% of the total reallocation gains of a comprehensive reform that removes all factor misallocation in Colombia for the period 1982-1998. As a robustness check, I replicate the exercise with firm-level data from China, a country that offers external validation using the calculations provided by HK.⁷⁶ In Figure 3.4 I report with solid lines the total gains (blue) and the intra-sectoral gains (red) from removing distortions for both countries, when the 4-digit ISIC industry aggregation is used.
The difference between both linesis due to the gains from inter-sectoral reallocation. For China I find similar TFP gains as in HK inthe case of removing only intra-industry misallocation, and an average contribution of 30% of theinter-sectoral component for the complete reform.In general, gains from removing distortions are larger for China, although the time periods are notcomparable. The graph shows that over time in both countries there are not significant improvementsin allocative efficiency in the considered periods; indeed, there is a slight worsening at the end of eachone. When I move to the 3-digit ISIC classification, the predictions from the decomposition seem tohold. The dashed lines in Figure 3.4 report once again the total gains (blue) and the intra-sectoral gains(red) from removing distortions, but now at the 3-digit ISIC classification. Both total gains fluctuatearound a similar range. However, the intra-industry gains rise in a larger proportion than the totalgains, so their average contribution is now 68% and 73% for Colombia and China, respectively. Thisconfirms that as the level of disaggregation increases, the intra-industry gains are lower.3.3.2 Robustness checksThe source of inter-industry gains is neither related to the use of US cost shares instead of domesticfactor intensities in the sectoral production function nor to the use of a value-added specification. Forexample, Figure 3.5 displays for the Colombian case that using a gross-output specification (Panel75For the 4-digit classification in the Colombian case, due to small number of observations, 14 industries were reclassifiedto its closest 4-digit industry or to the 4-digit sector within the same 3-digit industry that merges the products not elsewhereclassified.76For China, I use the panel from the Annual Survey of Industrial Production collected by the Chinese government’sNational Bureau of Statistics, for the period 1999-2007943.3. 
A) or changing the production function coefficients to Colombian cost shares (Panel B) does not substantially alter the key insights. In the latter case, factor intensities are now equal to the observed cost shares, but they still differ from the optimal cost shares under monopolistic competition (where total cost is ρ times revenue), which is what matters for the efficient allocation. However, the use of Colombian cost shares reduces the relative importance of inter-sectoral reallocation: its average contribution shrinks to 23%.

Further, the total gains and the contribution of the inter-sectoral component increase with a higher elasticity of substitution across sectors. This is completely in line with the HK prediction that when sectoral outputs are better substitutes, inputs are reallocated toward sectors with bigger productivity gains, so there are larger TFP gains. We can show this with a CES demand across sectors. In this case, there is no closed-form solution for each component, but it is possible to implement a numerical procedure to obtain both gains. Appendix C.2 offers details about its implementation. Figure 3.6 shows that for different values of the elasticity of substitution across sectors (φ), the components of the gains behave as predicted. The numerical procedure replicates the results of the closed-form solutions for the CD aggregator for both components in the case φ = 1, whereas total gains and the contribution of the inter-sectoral component increase when φ = 2 (the contribution rises from 43% to 50%) and decrease when φ = 0.5 (the contribution falls to 36%). In those exercises the change in the intra-sectoral gains is negligible.

3.3.3 Inter-industry misallocation and development

Another important question about the relevance of inter-industry misallocation is whether its associated TFP loss is larger in less developed economies, as is the case with intra-industry misallocation, the core result of HK’s paper.
If the inter-sectoral gains vary systematically across countries, omitting the inter-sectoral component implies an underestimation of the TFP gap attributed to factor misallocation when the latter is computed only with intra-industry reforms, as in HK. In the case of the CD aggregator across sectors, the closed-form solution for the TFP gains from removing inter-industry misallocation only requires information at the industry level. Thus, to compute those gains I use information from the Socio-Economic Accounts of the World Input-Output Database (WIOD; Timmer et al. (2015)), which contains industry-level data for 40 countries and 35 industries, mostly at the 2-digit ISIC level, covering the overall economy.

Figure 3.7 presents how the gains from inter-sectoral reallocation vary with GDP per capita by country.[77] For this calculation, I use a gross-output specification for the sectoral production function with 3 inputs (hours worked, capital and materials) and US cost shares. The linear correlation between both variables in this baseline is -0.75 (Figure 3.7 also shows the best linear fit). The negative correlation is robust to the use of a value-added specification or of each country’s own cost shares in the production function; to restricting the set of sectors to manufacturing industries only; and to measuring labor with the wage bill and materials in nominal values, which control for heterogeneity in labor and for differences in the quality of intermediate inputs, respectively (see the graphs in Figure C.1 in the Appendix).

[77] Each dot corresponds to the average value between 1995 and 2007 of the inter-sectoral gains calculated using (3.10) for each country and the average GDP per capita in constant 2005 US dollars obtained from the World Bank. The results are very similar if median values are used. Two small countries with many zeros in sectoral data (Luxembourg and Malta) were dropped from the WIOD sample. Likewise, Taiwan was dropped to make WIOD and World Bank data comparable.
Therefore, there is evidence that less developed economies tend to have greater inter-sectoral gains from removing distortions. This is consistent with the insights of multi-country studies such as Tombe (2015) or Święcki (2017), which focus on inter-sectoral misallocation and find larger inter-sectoral distortions in poor countries. Thus, omitting the inter-sectoral component of the total gains from removing distortions understates the usual TFP gaps attributed to firm-level misallocation.

3.4 Conclusions

In this chapter the standard model of firm-level misallocation in a closed economy (Hsieh and Klenow (2009)) is augmented in two dimensions. First, idiosyncratic demand shocks are introduced to test the ability of the usual metrics of factor misallocation to explain plants’ survival, a test that has recently been used to argue that misallocation measures are empirically swamped by other determinants of profitability, mainly demand shocks (Haltiwanger et al. (2018)). I obtain similar empirical findings using Colombian data, from which I can recover demand shock measures thanks to the availability of firm-level price indices. However, I argue that explaining plants’ survival with unconditional determinants of profitability produces biased estimates, and that including firm selection in the model can rationalize the signs of the bias and the data findings, addressing the objections of Haltiwanger et al. (2018).

Second, the model is extended to account for the possibility that production factors are misallocated both within and across industries. I document that in Colombia and China the contribution of inter-industry misallocation can be as high as 35% of the total gains from removing misallocation. Given the relevance of inter-industry misallocation in these two case studies, I use cross-country data to show that including this type of misallocation can amplify the usual TFP gaps attributed to factor misallocation based exclusively on intra-industry reforms.
Hence, from a macro perspective, the simultaneous study of both intra- and inter-industry misallocation, as is done for example in the first chapter of this thesis, enriches the understanding of the total impact of factor misallocation in an economy.

3.5 Tables and figures

3.5.1 Tables

Table 3.1: Probability of survival explained by determinants of profitability
(1) (2) (3) (4) (5) (6) (7) (8)
TFPQ 0.012*** 0.061*** 0.068*** 0.067***
(0.002) (0.002) (0.003) (0.003)
TFPR 0.026*** -0.047*** -0.042*** -0.057*** -0.057***
(0.003) (0.004) (0.003) (0.004) (0.004)
Demand shock 0.018*** 0.028*** 0.032*** 0.031***
(0.001) (0.001) (0.001) (0.001)
TFPQ as in HK 0.044*** 0.055***
(0.001) (0.001)
Year FE Yes Yes Yes Yes Yes Yes Yes Yes
Sector FE Yes Yes Yes Yes Yes Yes Yes Yes
Firm controls Yes Yes
Location FE Yes
N 71880 71880 71880 71880 71880 71880 62619 60394
R2 0.016 0.017 0.033 0.044 0.040 0.044 0.046 0.046
* p<0.10, ** p<0.05 and *** p<0.01. Dependent variable: probability of survival. All independent variables are in deviations from industry means. Firm controls include age, size and lagged capital. Standard errors clustered by plant. Source: EAM Colombia, 1982-1998.

Table 3.2: Probability of being an exporter explained by determinants of profitability
(1) (2) (3) (4) (5) (6) (7) (8)
TFPQ 0.003 0.177*** 0.148*** 0.150***
(0.006) (0.009) (0.011) (0.011)
TFPR 0.043*** -0.178*** -0.188*** -0.139*** -0.141***
(0.008) (0.011) (0.009) (0.014) (0.014)
Demand shock 0.070*** 0.093*** 0.080*** 0.080***
(0.002) (0.003) (0.004) (0.004)
TFPQ as in HK 0.139*** 0.187***
(0.002) (0.006)
Year FE Yes Yes Yes Yes Yes Yes Yes Yes
Sector FE Yes Yes Yes Yes Yes Yes Yes Yes
Firm controls Yes Yes
Location FE Yes
N 47692 47692 47692 47692 47692 47692 39969 39904
R2 0.058 0.058 0.185 0.219 0.219 0.219 0.233 0.235
* p<0.10, ** p<0.05 and *** p<0.01. Dependent variable: probability of being an exporter. All independent variables are in deviations from industry means.
Firm controls include age, size and lagged capital. Standard errors clustered by plant. Source: EAM Colombia, 1982-1991.

3.5.2 Figures

Figure 3.1: TFPQ, TFPR and MRP distribution in Colombia
Panel A: Observed TFPQ and TFPR and efficient TFPR. Density plotted against log(TFPQ), log(TFPR); series: TFPQ, TFPR, allocative efficient TFPR. Note: TFPQ: physical productivity; TFPR: revenue productivity. CD-GO specification, controlling for year and 4-digit industry FE. Source: Colombian AMS.
Panel B: Observed MRP. Density plotted against log(MRP); series: capital, unskilled labor, skilled labor. Note: MRP: marginal revenue product. CD-GO specification, controlling for year FE. Source: Colombian AMS.

Figure 3.2: MRP distribution for selected industries
Panels: Food, Chemicals, Motor vehicles. Density plotted against log(MRP); series: capital, unskilled labor, skilled labor. Note: MRP: marginal revenue product. CD-GO specification, controlling for year FE. Source: Colombian AMS.

Figure 3.3: Removing intra- and inter-industry factor misallocation
Panel A: TFPQ and TFPR in two sectors, Footwear (F) and Motor Vehicles (M). Axes: log(TFPQ) and log(TFPR); series: TFPR s=F, Eff. TFPR s=F, TFPR s=M, Eff. TFPR s=M.
Panel B: Intra- and inter-industry efficient allocation. Axes: log(TFPQ) and log(TFPR); series: Footwear intra-industry E.A., Motor Vehicles intra-industry E.A., Footwear intra- and inter-industry E.A.,
Motor Vehicles intra- and inter-industry E.A. Note: E.A.: Efficient Allocation.

Figure 3.4: TFP gains from factor reallocation in a closed economy
Panel A: China (1998-2007). Panel B: Colombia (1982-1998). Axes: Year and Gains (%); series in each panel: Total gains, 4-dig; Total gains, 3-dig; Only intra-industry gains, 4-dig; Only intra-industry gains, 3-dig. Note: In Panel A, × corresponds to the values found by HK (115.1, 95.8 and 86.6).

Figure 3.5: Sensitivity to production function specification and factor intensities
Panel A: TFP gains using gross output specification (Colombia, US cost shares). Axes: Year (1982-1998) and Gains (%); series: Total gains, 4-dig; Total gains, 3-dig; Only intra-industry gains, 4-dig; Only intra-industry gains, 3-dig.
Panel B: TFP gains by set of cost shares (Colombia, 4-dig, gross output specification). Axes: Year (1982-1998) and Gains (%); series: Total gains, US shares; Only intra-industry gains, US shares; Total gains, Col shares; Only intra-industry gains, Col shares.

Figure 3.6: Sensitivity to elasticity of substitution across sectors
Axes: Year (1982-1998) and Gains (%); series: Total gains and Only intra-industry gains for φ = 1, φ = 2 and φ = 0.5. Note: φ corresponds to the elasticity of substitution across sectors.

Figure 3.7: TFP gains from removing inter-industry misallocation and GDP per capita
Axes: Gains from removing distortions across sectors (%) against Log GDP per capita (constant 2005 US$). Countries plotted: AUS, AUT, BEL, BGR, BRA, CAN, CHN, CYP, CZE, DEU, DNK, ESP, EST, FIN, FRA, GBR, GRC, HUN, IDN, IND, IRL, ITA, JPN, KOR, LTU, LVA, MEX, NLD, POL, PRT, ROU, RUS, SVK, SVN, SWE, TUR, USA. Note: Averages 1994-2007.
Data source: WIOD (Timmer et al., 2015), World Bank Development Indicators. (All WIOD sectors, GO specification with US cost shares, homogeneous inputs.) In-figure title: Inter-sectoral gains from factor reallocation and GDP per capita.
Note: Each dot corresponds to the average gains from removing inter-industry misallocation and the corresponding average GDP per capita in the period 1991-2007. The source of the data is WIOD and the World Bank Development Indicators.

Bibliography

Adamopoulos, T., Brandt, L., Leight, J., and Restuccia, D. (2017). Misallocation, selection and productivity: A quantitative analysis with panel data from China. Working Paper 574, University of Toronto.
Alvarez, J. (2018). The agricultural wage gap: Evidence from Brazilian micro-data. Unpublished.
Anderson, J. E. and Yotov, Y. V. (2010). The changing incidence of geography. The American Economic Review, 100(5):2157–2186.
Anderson, J. E. and Yotov, Y. V. (2016). Terms of trade and global efficiency effects of free trade agreements, 1990–2002. Journal of International Economics, 99:279–298.
Arkolakis, C., Costinot, A., and Rodríguez-Clare, A. (2012). New trade models, same old gains? The American Economic Review, 102(1):94–130.
Artuç, E., Chaudhuri, S., and McLaren, J. (2010). Trade shocks and labor adjustment: A structural empirical approach. The American Economic Review, 100(3):1008–1045.
Artuc, E., Lederman, D., and Porto, G. (2015). A mapping of labor mobility costs in the developing world. Journal of International Economics, 95(1):28–41.
Axtell, R. L. (2001). Zipf distribution of U.S. firm sizes. Science, 293(5536):1818–1820.
Balassa, B. (1965). Trade liberalisation and “revealed” comparative advantage. The Manchester School of Economic and Social Studies, 33(2):99–123.
Balistreri, E. J., Hillberry, R. H., and Rutherford, T. F. (2011). Structural estimation and solution of international trade models with heterogeneous firms. Journal of International Economics, 83(2):95–108.
Banerjee, A. V. and Duflo, E. (2005).
Growth theory through the lens of development economics. In Handbook of Economic Growth, volume 1, Part A, pages 473–552. Elsevier.
Bartelsman, E., Haltiwanger, J., and Scarpetta, S. (2013). Cross-country differences in productivity: The role of allocation and selection. The American Economic Review, 103(1):305–34.
Beegle, K., De Weerdt, J., and Dercon, S. (2011). Migration and economic mobility in Tanzania: Evidence from a tracking survey. Review of Economics and Statistics, 93(3):1010–1033.
Berlemann, M. and Wesselhöft, J.-E. (2014). Estimating aggregate capital stocks using the perpetual inventory method - A survey of previous implementations and new empirical evidence for 103 countries. Review of Economics, 65(1):1–34.
Bernard, A. B., Redding, S. J., and Schott, P. K. (2007). Comparative advantage and heterogeneous firms. The Review of Economic Studies, 74(1):31–66.
Bils, M., Klenow, P. J., and Ruane, C. (2017). Misallocation or mismeasurement. Unpublished.
Bombardini, M., Gallipoli, G., and Pupato, G. (2012a). Skill dispersion and trade flows. The American Economic Review, 102(5):2327–2348.
Bombardini, M., Kurz, C. J., and Morrow, P. M. (2012b). Ricardian trade and the impact of domestic competition on export performance. Canadian Journal of Economics/Revue canadienne d’économique, 45(2):585–612.
Bound, J., Brown, C., and Mathiowetz, N. (2001). Measurement error in survey data. In Handbook of Econometrics, volume 5, pages 3705–3843. Elsevier.
Brandt, L., Tombe, T., and Zhu, X. (2013). Factor market distortions across time, space and sectors in China. Review of Economic Dynamics, 16(1):39–58.
Bruins, M., Duffy, J. A., Keane, M. P., and Smith Jr., A. A. (2018). Generalized indirect inference for discrete choice models. Unpublished (forthcoming, Journal of Econometrics).
Bryan, G., Chowdhury, S., and Mobarak, A. M. (2014). Underinvestment in a profitable technology: The case of seasonal migration in Bangladesh. Econometrica, 82(5):1671–1748.
Bryan, G. and Morten, M.
(2018). The aggregate productivity effects of internal migration: Evidence from Indonesia. Unpublished.
Cabral, L. M. B. and Mata, J. (2003). On the evolution of the firm size distribution: Facts and theory. American Economic Review, 93(4):1075–1090.
Caliendo, L. and Parro, F. (2015). Estimates of the trade and welfare effects of NAFTA. The Review of Economic Studies, 82(1):1–44.
Caliendo, L., Parro, F., and Tsyvinski, A. (2017). Distortions and the structure of the world economy. Working Paper 23332, National Bureau of Economic Research.
Cameron, S., Chaudhuri, S., and McLaren, J. (2007). Trade shocks and labor adjustment: Theory. Working Paper 13463, National Bureau of Economic Research.
Chaney, T. (2008). Distorted gravity: The intensive and extensive margins of international trade. The American Economic Review, 98(4):1707–1721.
Chang, H.-J. (2006). The East Asian development experience: The miracle, the crisis and the future. Palgrave Macmillan.
Chari, V. V., Kehoe, P. J., and McGrattan, E. R. (2007). Business cycle accounting. Econometrica, 75(3):781–836.
Chen, K. and Irarrazabal, A. (2015). The role of allocative efficiency in a decade of recovery. Review of Economic Dynamics, 18(3):523–550.
Cole, H. L. and Ohanian, L. E. (2002). The U.S. and U.K. great depressions through the lens of neoclassical growth theory. The American Economic Review, 92(2):28–32.
Costa-Scottini, L. (2018). Firm-level distortions, trade, and international productivity differences. Unpublished.
Costinot, A., Donaldson, D., and Komunjer, I. (2012). What goods do countries trade? A quantitative exploration of Ricardo’s ideas. The Review of Economic Studies, 79(2):581–608.
Costinot, A. and Rodríguez-Clare, A. (2014). Trade theory with numbers: Quantifying the consequences of globalization. In Gopinath, G., Helpman, E., and Rogoff, K., editors, Handbook of International Economics, volume 4, pages 197–261. Elsevier.
De Loecker, J. and Goldberg, P. K. (2014).
Firm performance in a global market. Annual Review of Economics, 6(1):201–227.
de Nicola, F. and Giné, X. (2014). How accurate are recall data? Evidence from coastal India. Journal of Development Economics, 106:52–65.
de Sousa, J., Mayer, T., and Zignago, S. (2012). Market access in global and regional trade. Regional Science and Urban Economics, 42(6):1037–1052.
Deardorff, A. V. (1998). Determinants of bilateral trade: Does gravity work in a neoclassical world? In Frankel, J. A., editor, The Regionalization of the World Economy, chapter 1, pages 7–32. University of Chicago Press.
Dekle, R., Eaton, J., and Kortum, S. (2008). Global rebalancing with gravity: Measuring the burden of adjustment. Working Paper 13846, National Bureau of Economic Research.
Dix-Carneiro, R. (2014). Trade liberalization and labor market dynamics. Econometrica, 82(3):825–885.
Dixit, A. and Rob, R. (1994). Switching costs and sectoral adjustments in general equilibrium with uninsured risk. Journal of Economic Theory, 62(1):48–69.
Duncan, G. J. and Hill, D. H. (1985). An investigation of the extent and consequences of measurement error in labor-economic survey data. Journal of Labor Economics, 3(4):508–532.
Eaton, J. and Kortum, S. (2001). Trade in capital goods. European Economic Review, 45(7):1195–1235.
Eaton, J. and Kortum, S. (2002). Technology, geography, and trade. Econometrica, 70(5):1741–1779.
Eaton, J., Kortum, S., and Kramarz, F. (2011). An anatomy of international trade: Evidence from French firms. Econometrica, 79(5):1453–1498.
Edmond, C., Midrigan, V., and Xu, D. Y. (2015). Competition, markups, and the gains from international trade. The American Economic Review, 105(10):3183–3221.
Epifani, P. and Gancia, G. (2011). Trade, markup heterogeneity and misallocations. Journal of International Economics, 83(1):1–13.
Eslava, M., Haltiwanger, J., Kugler, A., and Kugler, M. (2004).
The effects of structural reforms on productivity and profitability enhancing reallocation: Evidence from Colombia. Journal of Development Economics, 75(2):333–371.
Eslava, M., Haltiwanger, J., Kugler, A., and Kugler, M. (2013). Trade and market selection: Evidence from manufacturing plants in Colombia. Review of Economic Dynamics, 16(1):135–158.
Fernandes, A. M., Klenow, P. J., Meleshchuk, S., Pierola, M. D., and Rodríguez-Clare, A. (2015). The intensive margin in trade: Moving beyond Pareto. Unpublished.
Fernández-Val, I. and Weidner, M. (2016). Individual and time effects in nonlinear panel models with large N, T. Journal of Econometrics, 192(1):291–312.
Foster, L., Haltiwanger, J., and Syverson, C. (2008). Reallocation, firm turnover, and efficiency: Selection on productivity or profitability? The American Economic Review, 98(1):394.
French, E. and Taber, C. (2011). Identification of models of the labor market. In Handbook of Labor Economics, volume 4, pages 537–617. Elsevier.
French, S. (2017). Revealed comparative advantage: What is it good for? Journal of International Economics, 106:83–103.
Gautier, P. A. and Teulings, C. N. (2015). Sorting and the output loss due to search frictions. Journal of the European Economic Association, 13(6):1136–1166.
Gautier, P. A., Teulings, C. N., and Van Vuuren, A. (2010). On-the-job search, mismatch and efficiency. The Review of Economic Studies, 77(1):245–272.
Gibson, J. and Kim, B. (2010). Non-classical measurement error in long-term retrospective recall surveys. Oxford Bulletin of Economics and Statistics, 72(5):687–695.
Godlonton, S., Hernandez, M. A., and Murphy, M. (2016). Anchoring bias in recall data: Evidence from Central America. IFPRI discussion papers 1534, International Food Policy Research Institute (IFPRI).
Gollin, D., Lagakos, D., and Waugh, M. E. (2014). The agricultural productivity gap.
The Quarterly Journal of Economics, 129(2):939–993.
Gopinath, G., Kalemli-Ozcan, S., Karabarbounis, L., and Villegas-Sanchez, C. (2015). Capital allocation and productivity in South Europe. Working Paper 21453, National Bureau of Economic Research.
Gourieroux, C., Monfort, A., and Renault, E. (1993). Indirect inference. Journal of Applied Econometrics, 8:S85–S118.
Haltiwanger, J., Kulick, R., and Syverson, C. (2018). Misallocation measures: The distortion that ate the residual. Working Paper 24199, National Bureau of Economic Research.
Hanson, G. H., Lind, N., and Muendler, M.-A. (2015). The dynamics of comparative advantage. Working Paper 21753, National Bureau of Economic Research.
Head, K. and Mayer, T. (2014). Gravity equations: Workhorse, toolkit, and cookbook. In Gopinath, G., Helpman, E., and Rogoff, K., editors, Handbook of International Economics, volume 4, pages 131–195. Elsevier.
Heckman, J. J. and Honoré, B. E. (1990). The empirical content of the Roy model. Econometrica, 58(5):1121–1149.
Herrendorf, B. and Schoellman, T. (2015). Why is measured productivity so low in agriculture? Review of Economic Dynamics, 18(4):1003–1022.
Herrendorf, B. and Schoellman, T. (2018). Wages, human capital and barriers to structural transformation. American Economic Journal: Macroeconomics, 10(2):1–23.
Hicks, J. H., Kleemans, M., Li, N. Y., and Miguel, E. (2017). Reevaluating agricultural productivity gaps with longitudinal microdata. Working Paper 23253, National Bureau of Economic Research.
Hnatkovska, V. and Lahiri, A. (2016). Urbanization, structural transformation and rural-urban disparities.
Ho, G. T. (2012). Trade liberalization with size-dependant distortions: Theory and evidence from India. Unpublished, IMF.
Hopenhayn, H. A. (2014a). Firms, misallocation, and aggregate productivity: A review. Annual Review of Economics, 6(1):735–770.
Hopenhayn, H. A. (2014b). On the measure of distortions.
Working Paper 20404, National Bureau of Economic Research.
Hsieh, C.-T. and Klenow, P. J. (2009). Misallocation and manufacturing TFP in China and India. The Quarterly Journal of Economics, 124(4):1403–1448.
Kan, R. and Robotti, C. (2017). On moments of folded and truncated multivariate normal distributions. Journal of Computational and Graphical Statistics, 26(4):930–934.
Katz, L. F. and Summers, L. H. (1989). Industry rents: Evidence and implications. Brookings Papers on Economic Activity, page 209.
Krugman, P. (1980). Scale economies, product differentiation, and the pattern of trade. The American Economic Review, 70(5):950–959.
Kucheryavyy, K., Lyn, G., and Rodríguez-Clare, A. (2017). Grounded by gravity: A well-behaved trade model with external economies. Unpublished.
Kugler, M. and Verhoogen, E. (2012). Prices, plant size, and product quality. The Review of Economic Studies, 79(1):307–339.
Lagakos, D. and Waugh, M. E. (2013). Selection, agriculture, and cross-country productivity differences. The American Economic Review, 103(2):948–980.
Lahiri, A. and Yi, K.-M. (2009). A tale of two states: Maharashtra and West Bengal. Review of Economic Dynamics, 12(3):523–542.
Lane, N. (2017). Manufacturing revolutions. Unpublished.
Leromain, E. and Orefice, G. (2014). New revealed comparative advantage index: Dataset and empirical distribution. International Economics, 139:48–70.
Levchenko, A. A. and Zhang, J. (2016). The evolution of comparative advantage: Measurement and welfare implications. Journal of Monetary Economics, 78:96–111.
Lewis, W. A. (1955). Theory of economic growth. Routledge.
Liu, J., van Leeuwen, N., Vo, T. T., Tyers, R., and Hertel, T. (1998). Disaggregating labor payments by skill level in GTAP. GTAP Technical Paper 11, Purdue University.
Melitz, M. J. (2003). The impact of trade on intra-industry reallocations and aggregate industry productivity.
Econometrica, 71(6):1695–1725.
Melitz, M. J. and Redding, S. J. (2014). Heterogeneous firms and trade. In Gopinath, G., Helpman, E., and Rogoff, K., editors, Handbook of International Economics, volume 4, pages 1–54. Elsevier.
Melitz, M. J. and Redding, S. J. (2015). New trade models, new welfare implications. The American Economic Review, 105(3):1105–1146.
Mulligan, C. B. (2005). Public policies as specification errors. Review of Economic Dynamics, 8(4):902–926.
Neyman, J. and Scott, E. L. (1948). Consistent estimates based on partially consistent observations. Econometrica, 16(1):1–32.
Nguyen, B. T., Albrecht, J. W., Vroman, S. B., and Westbrook, M. D. (2007). A quantile regression decomposition of urban–rural inequality in Vietnam. Journal of Development Economics, 83(2):466–490.
Oberfield, E. (2013). Productivity and misallocation during a crisis: Evidence from the Chilean crisis of 1982. Review of Economic Dynamics, 16(1):100–119.
Qu, Z. F. and Zhao, Z. (2008). Urban-rural consumption inequality in China from 1988 to 2002: Evidence from quantile regression decomposition. IZA Discussion Papers 3659, Institute for the Study of Labor (IZA).
Restuccia, D. and Rogerson, R. (2008). Policy distortions and aggregate productivity with heterogeneous establishments. Review of Economic Dynamics, 11(4):707–720.
Restuccia, D. and Rogerson, R. (2013). Misallocation and productivity. Review of Economic Dynamics, 16(1):1–10.
Restuccia, D., Yang, D. T., and Zhu, X. (2008). Agriculture and aggregate productivity: A quantitative cross-country analysis. Journal of Monetary Economics, 55(2):234–250.
Rodrik, D. (1995). Getting interventions right: How South Korea and Taiwan grew rich. Economic Policy, 10(1):55–107.
Rosen, S. (1986). The theory of equalizing differences. In Handbook of Labor Economics, volume 1, pages 641–692. Elsevier.
Rostow, W. W. (1960). The stages of economic growth: A non-communist manifesto.
Cambridge University Press.
Roy, A. D. (1951). Some thoughts on the distribution of earnings. Oxford Economic Papers, 3(2):135–146.
Sarvimaki, M., Uusitalo, R., and Jantti, M. (2018). Habit formation and the misallocation of labor: Evidence from forced migrations. Mimeo.
Sauer, R. M. and Taber, C. R. (2017). Indirect inference with importance sampling: An application to women’s wage growth. IZA Discussion Papers 11004.
Silva, J. M. C. S. and Tenreyro, S. (2006). The log of gravity. The Review of Economics and Statistics, 88(4):641–658.
Strauss, J., Witoelar, F., and Sikoki, B. (2016). The fifth wave of the Indonesia Family Life Survey (IFLS5): Overview and field report. Technical Report WR-1143/1-NIA/NICHD, RAND working papers.
Święcki, T. (2017). Intersectoral distortions and the welfare gains from trade. Journal of International Economics, 104:138–156.
Taber, C. and Vejlin, R. (2016). Estimation of a Roy/search/compensating differential model of the labor market. Working Paper 22439, National Bureau of Economic Research.
Tallis, G. M. (1961). The moment generating function of the truncated multi-normal distribution. Journal of the Royal Statistical Society. Series B (Methodological), 23(1):223–229.
Thomas, D., Witoelar, F., Frankenberg, E., Sikoki, B., Strauss, J., Sumantri, C., and Suriastini, W. (2012). Cutting the costs of attrition: Results from the Indonesia Family Life Survey. Journal of Development Economics, 98(1):108–123. Symposium on Measurement and Survey Design.
Timmer, M. P., Dietzenbacher, E., Los, B., Stehrer, R., and de Vries, G. J. (2015). An illustrated user guide to the world input-output database: The case of global automotive production. Review of International Economics, 23(3):575–605.
Tombe, T. (2015). The missing food problem: Trade, agriculture, and international productivity differences. American Economic Journal: Macroeconomics, 7(3):226–58.
Vollrath, D. (2014). The efficiency of human capital allocations in developing countries.
Journal of Development Economics, 108:106–118.
Yang, M.-J. (2017). Micro-level misallocation and selection: Estimation and aggregate implications. Unpublished.
Young, A. (2013). Inequality, the urban-rural gap, and migration. The Quarterly Journal of Economics, 128(4):1727–1785.

Appendix A

Appendix to Chapter 1

A.1 Description of the dataset

This chapter uses two types of data: a “macro” dataset with information at the country-sector level, and a “micro” dataset with information at the firm level for Colombia.

The “macro” dataset collects sectoral information on gross output, bilateral trade flows, intermediate consumption and shares of employment and capital for a sample of 48 countries and 25 manufacturing industries (3-digit ISIC rev. 2 level), for the year 1995. Table A.1 and Table A.2 at the end of this section display the industries and countries considered, respectively.

Data for sectoral gross output, bilateral trade flows and intermediate consumption come from the OECD’s Trade in Value Added (TiVA) database (2015 release). This dataset contains a range of indicators derived from the OECD’s Inter-Country Input-Output (ICIO) database. The latter is constructed by the OECD from various national and international data sources, all drawn together and balanced under constraints based on official National Accounts (SNA93).[78] Information on gross output and trade flows was collected for all available manufacturing sectors in TiVA (16), and an imputation scheme was implemented to obtain output and bilateral flows for the remaining sectors and for two countries not available in TiVA (Venezuela and Ecuador, which were included given their relevance as Colombia’s trade partners), based on production and trade shares computed from the CEPII database (de Sousa et al., 2012).

I derive imports from home from the difference between gross output and total exports.
As is known in the literature, this procedure can generate negative values for some country-industry pairs (for instance, if the country-sector has a high amount of re-exports). To solve this issue, I follow Costinot and Rodríguez-Clare (2014) and Święcki (2017) and adjust those negative flows by rescaling exports to all destinations until the ratio of total exports to gross output equals that of the sector with the highest ratio still below one in that country. This adjustment was needed for six country-industry observations.

Factor shares were constructed using information from several sources. For materials, I compute the shares using the series of intermediate consumption from TiVA. Data for the remaining industries and for Venezuela and Ecuador were imputed using shares from UNIDO’s INDSTAT2 database (2015 release), which contains information at the 2-digit ISIC rev. 3 level for manufacturing industries only. The information was gathered by adjusting each country’s available aggregation to the one used here. For labor, the ICIO database contains information on employment (measured as the number of persons engaged) for 42 of the 48 countries considered here. For the remaining sectors and countries, data were collected from UNIDO’s INDSTAT2 database. Skilled and unskilled labor shares were allocated using the GTAP-5 database, which draws on labor force surveys and national censuses where they are available, or on the statistical model proposed by Liu et al. (1998) otherwise.

[78] The underlying sources used are notably: i) national supply and use tables; ii) national and harmonized input-output tables; iii) bilateral trade in goods by industry and end-use category; and iv) bilateral trade in services. For more information, see www.oecd.org/trade/valueadded

For capital, shares were constructed as follows. First, the Socio-Economic Accounts of the World Input-Output Database (WIOD, see Timmer et al.
(2015)) contain calculations of the stocks of capital at the two-digit ISIC rev. 3 level, or groups thereof, for 36 of the 48 countries considered here (in the 2013 release). For the remaining countries, I apply the steady-state approach to the calculation of the initial stock of capital in the perpetual inventory method,[79] using information on gross fixed capital formation (GCFC) from the INDSTAT2 database. For country i and industry s, the share of capital $\gamma_{iks}$ was imputed as:

\[ \gamma_{iks} = \frac{GCFC_{is}/(g_{is}+\delta^{r}_{is})}{\sum_{s}^{S} GCFC_{is}/(g_{is}+\delta^{r}_{is})} \]

where $GCFC_{is}$ is the average GCFC over the five-year window centered on the reference year, $g_{is}$ is the growth rate of the GDP of the sector in the same period, and $\delta^{r}_{is}$ is an exogenous depreciation rate, computed using the NBER-CES Manufacturing Industry Database for the US.[80] I compute capital shares using this methodology even for the countries with available information from WIOD, to assess the fit of the imputation procedure. I evaluate the imputation results in terms of cross-correlations and mean absolute errors using three approximations: i) setting $g_{is} = \delta^{r}_{is} = 0\ \forall i,s$ (so that only information on GCFC is used); ii) setting $g_{is} = 0\ \forall i,s$ (so that information on GCFC and US depreciation rates is used); iii) using the full set of information. The second approach delivers the best fit. Therefore, capital shares for the remaining countries were imputed using only the series of GCFC and US depreciation rates.

For the “micro” dataset I use the panel of manufacturing plants created by Eslava et al. (2004) (hereafter EHKK) for the period 1984-1998 from the Colombian Annual Manufacturing Survey (AMS), collected by the Departamento Administrativo Nacional de Estadística (DANE), the Colombian national statistical agency.[81] The AMS is a census of plants with 10 or more workers or annual sales above a certain limit, which is adjusted over time.[82] 
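The steady-state capital-share imputation formula given earlier in this section can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `impute_capital_shares` and the three-sector numbers are hypothetical and are not taken from the thesis data or code; the example corresponds to approach ii) (zero growth, US depreciation rates).

```python
import numpy as np

def impute_capital_shares(gcfc, growth, depreciation):
    """Sectoral capital shares for one country (hypothetical helper).

    Implements K_s = GCFC_s / (g_s + delta_s) and returns K_s / sum_s K_s.
    All inputs are 1-D arrays indexed by sector s.
    """
    gcfc = np.asarray(gcfc, dtype=float)
    denom = np.asarray(growth, dtype=float) + np.asarray(depreciation, dtype=float)
    if np.any(denom <= 0):
        # With g = delta = 0 (approach i) the formula degenerates; in that
        # case the shares are simply gcfc / gcfc.sum().
        raise ValueError("g + delta must be positive in every sector")
    stock = gcfc / denom          # steady-state capital stock by sector
    return stock / stock.sum()    # normalize to shares

# Made-up inputs for three sectors (illustrative only).
gcfc = np.array([120.0, 80.0, 50.0])        # average GCFC over the window
delta_us = np.array([0.08, 0.10, 0.05])     # assumed US depreciation rates
shares = impute_capital_shares(gcfc, np.zeros(3), delta_us)  # approach ii: g = 0
```

Because only shares are needed, any factor that scales all sectors of a country equally drops out in the normalization, which is the point made in footnote 80.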
A unique feature of the AMS is that, in conjunction with the main variables of standard surveys (output and sales values, overall cost, energy consumption, payroll, number of workers and book values of equipment and structures), the DANE collects information at the product level (with a disaggregation comparable to the 6-digit HS) on the value and physical quantities of outputs and inputs (valued at factory-gate prices). This allows EHKK to obtain prices as unit values for each output and input produced and used by every plant, and hence to construct firm-specific prices of total output and materials using Tornqvist indices (see the EHKK Appendix for details).

[79] For reference, see for example Berlemann and Wesselhöft (2014).
[80] I use five-year windows to prevent short-run volatility in the GCFC from biasing the imputation results. Notice that since I only need sectoral factor shares, a temporary shock that affects the whole economy homogeneously does not affect the imputation results.
[81] The dataset was made available for research by the DANE.
[82] For 1998, the last year of the panel, the limit was around US$35,000. This criterion was introduced in the AMS in 1992 to increase coverage.

I perform the detailed cleaning procedure of Kugler and Verhoogen (2012) to reduce the influence of measurement error and outliers (see their data Appendix). Next, I follow HK and remove the 1% tails of the distributions of $\log(\psi_m/\bar{\psi}_s)$ and $\log(M_s^{\frac{1}{\sigma-1}} a_m/\tilde{A}_s)$ to drop remaining influential observations.[83] Following the misallocation literature, to obtain TFP measures I use as factor intensities the average U.S. cost shares at the corresponding aggregation levels from the NBER-CES Manufacturing Industry Database during the same period of time. Since for the selected years the AMS uses ISIC rev. 2 adapted for Colombia, I match the NAICS97 US code with ISIC rev. 3, and afterwards with the Colombian one. 
The purpose of using US cost shares is to employ factor intensities that reflect true technological differences across industries instead of frictions in factor markets, since domestic cost shares can be affected by the extent of inter-industry factor misallocation.

The final panel contains around 4700 plants in a typical year. On average, around 390 firms enter each year while 450 exit, which corresponds to entry and exit rates of 8 and 9 percent, respectively. For the computation of the misallocation measures in the counterfactual exercise, I use information only for the reference year (1995). Despite its coverage, EHKK’s dataset does not include exports. Thus, I use the panel employed by Bombardini et al. (2012b) for 1978-1991, which has been used extensively in the literature, to obtain exports. I merge both panels using variables in quantities (year, 4-digit ISIC, production and non-production workers and energy consumption). For the overlapping period, plants representing between 2% and 3% of the original nominal production were unmatched, and therefore dropped from the sample. I also keep only plants with positive and non-missing values for production and inputs. Up to 1991, on average around 13 of every 100 firms were exporters, while the total value exported represents on average 8% of the industry’s gross revenue, with large variation across sectors.

To ensure consistency between the macro and the micro datasets, two procedures were executed. First, since the calculation of factor shares in the macro dataset is independent of the series of gross output and bilateral trade flows, factor shares for Colombia were taken directly from the AMS. It is worth noting that the factor shares computed from both sources are very similar; minor differences arise from the exclusion of outliers in the micro dataset. 
Second, revenues of all firms within each industry were re-scaled to ensure that the revenue shares in the TiVA database coincide with the corresponding shares in the AMS. Once again, revenue shares from the two sources are very similar, and the small discrepancies also arise from the exclusion of outliers.

[83] For the definitions of $\bar{\psi}_s$, $M_s$ and $\tilde{A}_s$ see Appendix D.

Table A.1: Sectors in the sample

No. | Sector | Sector Description | ISIC Rev. 2
1 | Food | Food manufacturing | 311-312
2 | Beverage | Beverage industries | 313
3 | Tobacco | Tobacco manufactures | 314
4 | Textiles | Manufacture of textiles | 321
5 | Apparel | Wearing apparel, except footwear | 322
6 | Leather | Leather and products of leather and footwear | 323
7 | Footwear | Footwear, except vulcanized or moulded rubber or plastic footwear | 324
8 | Wood | Wood and products of wood and cork, except furniture | 331
9 | Furniture | Furniture and fixtures, except primarily of metal | 332
10 | Paper | Paper and paper products | 341
11 | Printing | Printing, publishing and allied industries | 342
12 | Chemicals | Industrial chemicals | 351
13 | Other chemicals | Other chemicals (paints, medicines, soaps, cosmetics) | 352
14 | Petroleum | Petroleum refineries, products of petroleum and coal | 353-354
15 | Rubber | Rubber products | 355
16 | Plastic | Plastic products | 356
17 | Pottery | Pottery, china and earthenware | 361
18 | Glass | Glass and glass products | 362
19 | Other non-metallic | Other non-metallic mineral products (clay, cement) | 369
20 | Iron and steel | Iron and steel basic industries | 371
21 | Non-ferrous metal | Non-ferrous metal basic industries | 372
22 | Metal products | Fabricated metal products, except machinery and equipment | 381
23 | Mach. & equipment | Machinery and equipment except electrical | 382
24 | Electric. / Profess. | Electrical machinery apparatus, appliances and supplies & professional and scientific, measuring and controlling equipment | 383-385
25 | Transport | Transport equipment | 384

Table A.2: Countries in the sample

OECD Country (I) | Code | OECD Country (II) | Code | Non-OECD Country | Code
Australia | AUS | Korea | KOR | Argentina | ARG
Austria | AUT | Mexico | MEX | Brazil | BRA
Belgium | BEL | Netherlands | NLD | China | CHN
Canada | CAN | New Zealand | NZL | Colombia | COL
Chile | CHL | Norway | NOR | Ecuador | ECU
Denmark | DNK | Poland | POL | Hong Kong | HKG
Finland | FIN | Portugal | PRT | India | IND
France | FRA | Czech Republic | CZE | Indonesia | IDN
Germany | DEU | Spain | ESP | Malaysia | MYS
Greece | GRC | Sweden | SWE | Philippines | PHL
Hungary | HUN | Switzerland | CHE | Rest of the World | ROW
Ireland | IRL | Turkey | TUR | Romania | ROU
Israel | ISR | United Kingdom | GBR | Russia | RUS
Italy | ITA | United States | USA | Saudi Arabia | SAU
Japan | JPN | | | Singapore | SGP
 | | | | South Africa | ZAF
 | | | | Thailand | THA
 | | | | Taiwan | TWN
 | | | | Venezuela | VEN

A.2 Bils, Klenow and Ruane’s (2017) method and results for Colombia

Here I succinctly introduce Bils et al.’s (2017) method to estimate the dispersion in the factors’ MRP in the presence of additive measurement error in revenue and inputs, which in the latter case can also be interpreted as overhead factors. Define measured revenues and inputs for the firm producing variety m as the sum of the “real” values plus an idiosyncratic measurement error: $\hat{R}_m = R_m + f_m$ and $\hat{I}_m = I_m + g_m$. Denote by $\Delta$ the log-difference and by $N$ the absolute difference. Bils et al. (2017) find, under some reasonable assumptions, that the elasticity of $\Delta\hat{R}$ with respect to $\Delta\hat{I}$, $\hat{\beta} = \sigma_{\Delta\hat{R},\Delta\hat{I}}/\sigma^{2}_{\Delta\hat{I}}$, satisfies:

\[ E\{\hat{\beta} \mid \ln(TFPR_m)\} = \left[\Psi + \Lambda(\ln(TFPR_m))^{2}\right]\left[1 - (1-\lambda)\ln(TFPR_m)\right] \]

with $\lambda = \sigma^{2}_{\ln\Theta}/\sigma^{2}_{TFPR}$, the ratio between the dispersion of the factor’s MRP and the dispersion of the observed TFPR, our measure of interest; $\Psi = 1 + \Omega_{\Theta} - \Omega_{f'}$, where $\Omega_{\Theta} = \sigma_{\Delta\Theta,\Delta I}/\sigma^{2}_{\Delta I}$, $\Omega_{f'} = \sigma_{N f',\Delta\hat{I}}/\sigma^{2}_{\Delta\hat{I}}$ and $N f' = N f_m/\hat{I}_m$; and $\Lambda$ a constant that depends on the stochastic process of $\Theta$, which is assumed to be stationary. 
In the absence of measurement error ($\lambda = 1$) the elasticity of revenues with respect to inputs should be the same ($\Psi$) for plants with different average products. The quadratic term $\Lambda(\ln(TFPR_m))^{2}$ is included to reflect the possibility of mean reversion in the stochastic process of $\Theta$, given the stationarity assumption. Therefore, $\lambda$ can be estimated by GMM through the non-linear regression:

\begin{align*}
\Delta\hat{R}_m ={}& \phi\ln(TFPR_m) + \Psi\Delta\hat{I}_m - \Psi(1-\lambda)\ln(TFPR_m)\Delta\hat{I}_m \tag{A.1}\\
&+ \Gamma(\ln(TFPR_m))^{2} + \Lambda(1-\lambda)(\ln(TFPR_m))^{2}\Delta\hat{I}_m\\
&+ \Upsilon(\ln(TFPR_m))^{3} + \Lambda(1-\lambda)(\ln(TFPR_m))^{3}\Delta\hat{I}_m + \varepsilon_m
\end{align*}

With Colombian data, I closely follow Bils et al. (2017) for the construction of the variables. I estimate equation (A.1) by GMM sector by sector, controlling for year fixed effects, in the panel from 1991 to 1998. Standard errors are clustered at the firm level. The last two columns in Table 1.5 show the point estimates of $\hat{\lambda}_s$ and their standard errors. For sectors in which the method does not deliver statistically significant estimates, probably due to the influence of remaining outliers, I use the results from estimating (A.1) for the whole manufacturing sector controlling for a full set of sector-year fixed effects (as in Bils et al. (2017)); these values are displayed in the last row.

I use the estimated values of $\hat{\lambda}_s$ to compress the observed dispersions in the average revenue products of the factors, in order to obtain variances and covariances of the MRP and hence to derive $\hat{V}_{is}$.

A.3 Solution of the model

To obtain the global solution of the system of equations, I employ both an algorithm to choose ideal initial conditions and a state-of-the-art solver for large-scale nonlinear systems. The proposed algorithm consists of the following three steps:

1. Step 1: I start by solving the model for a two-country world composed of Colombia and an aggregate of all the remaining countries (the number of equations is N × (S+L) = 56). The purpose of this step is to find ideal initial conditions for Colombia and the rest of the world in step 2. 
To solve this two-country model I first perform a global search using particle swarm optimization a sufficiently large number of times (500), to remove the influence of randomness in the initial position of the particles. Next, I use a local solver initialized at each of the 50 best solutions of the global search. For the local solver, I use auto-differentiation to obtain the gradient and the Hessian of the objective function, and Knitro, a solver that implements both interior-point and active-set methods for solving large-scale nonlinear optimization problems.[84] The final solution is the best of those 50 local solutions. It is worth noting that the solution obtained behaves according to the predictions of a small open economy model, in which the small country cannot influence foreign factor prices.

2. Step 2: Next, I solve the model N−1 times, in each case for a small-scale version of the world with the following three countries: Colombia, one other country in the dataset, and an aggregate of the remaining countries (the model is solved for N × (S+L) = 84 equations 47 times). The objective of this step is to find ideal initial points for every country to solve the full model in step 3. In each of the N−1 runs I initialize the local solver using, for Colombia, the solution found in step 1 and, for the remaining two countries, the solution for the rest of the world in step 1. I use the same local solver and auto-differentiation as in step 1.

3. Step 3: Finally, I collect the solution for each country in step 2 to initialize the local solver for the model with the full set of countries, while for Colombia I initialize with the median of its N−1 solutions found in step 2 (such solutions have low dispersion). I use the same local solver and auto-differentiation as in steps 1 and 2. 
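The number of equations in this case is N × (S+L) = 1344.

[84] I use auto-differentiation and the Knitro solver through the Tomlab optimization environment in Matlab.

A.4 Mathematical derivations

A.4.1 Model solution under assumptions A.1 and A.2

Under assumptions A.1 and A.2, it is possible to express:

\[ \sum_{m}^{M_{ijs}}\left(\frac{a_{im}}{\Theta_{im}}\right)^{\sigma-1} = \frac{H_{is}}{d_{is}}\int_{\theta_{i1}}\cdots\int_{\theta_{iL}}\int_{a^{*}_{ijs}(\Theta)}^{\infty}\left(\frac{a_{im}}{\Theta_{im}}\right)^{\sigma-1} dG_{is} = \frac{H_{is}\kappa\bar{a}_{is}^{\kappa}}{d_{is}}\int_{\theta_{i1}}\cdots\int_{\theta_{iL}}\int_{a^{*}_{ijs}(\Theta)}^{\infty} a_{im}^{\sigma-\kappa-2}\,\Theta_{im}^{1-\sigma}\, da_{im}\, dG^{\theta}_{is} \]

Using the formula of the cutoff function in (1.7), the last expression can be simplified as:

\[ \sum_{m}^{M_{ijs}}\left(\frac{a_{im}}{\Theta_{im}}\right)^{\sigma-1} = \frac{H_{is}}{d_{is}}\frac{\kappa}{1+\kappa-\sigma}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa} a^{*\,\sigma-1}_{ijs}\,\Gamma_{is} \tag{A.2} \]

with $\Gamma_{is}$ defined as in the text. Applying the formulas for firm-level profits and revenues, the free entry condition can be restated as:

\[ \sum_{j}^{N}\sum_{m}^{M_{ijs}}\frac{1}{\sigma}\left(\frac{\tau_{ijs}\Theta_{im}}{\rho a_{im}}\right)^{1-\sigma}\omega_{is}^{-\sigma}E_{js}P_{js}^{\sigma-1} - \sum_{j}^{N}\sum_{m}^{M_{ijs}}\Theta_{im}f_{ijs} = f^{e}_{is}H_{is} \]

Notice that $\sum_{m}^{M_{ijs}}\Theta_{im} = \frac{H_{is}}{d_{is}}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa}\Gamma_{is}$. Combining with equation (A.2), it is possible to obtain:

\[ \sum_{j}^{N}\frac{1}{\sigma}\left(\frac{\tau_{ijs}}{\rho}\right)^{1-\sigma}\omega_{is}^{-\sigma}E_{js}P_{js}^{\sigma-1}\frac{1}{d_{is}}\frac{\kappa}{1+\kappa-\sigma}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa}a^{*\,\sigma-1}_{ijs}\Gamma_{is} - \sum_{j}^{N}\frac{f_{ijs}}{d_{is}}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa}\Gamma_{is} = f^{e}_{is} \]

Using the definition of the productivity cutoff value for the undistorted firms in (1.7) to substitute for $a^{*\,\sigma-1}_{ijs}$, the expression can be simplified to:

\[ \sum_{j}^{N}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa}f_{ijs} = \frac{d_{is}f^{e}_{is}(1+\kappa-\sigma)}{\Gamma_{is}(\sigma-1)} \tag{A.3} \]

On the other hand, applying again (A.2) and the definition of the productivity cutoff value, bilateral exports $X_{ijs} = \sum_{m}^{M_{ijs}} r_{ijm}$ are given by:

\[ X_{ijs} = \sum_{m}^{M_{ijs}}\left(\frac{\tau_{ijs}\Theta_{im}\omega_{is}}{\rho a_{im}}\right)^{1-\sigma}E_{js}P_{js}^{\sigma-1} = \omega_{is}\frac{H_{is}}{d_{is}}\frac{\sigma\kappa}{1+\kappa-\sigma}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa}\Gamma_{is}f_{ijs} \tag{A.4} \]

Hence, from (A.3), sectoral revenues $R_{is} = \sum_{j}^{N}X_{ijs}$ are given by:

\[ R_{is} = \frac{\kappa}{\rho}\,\omega_{is}f^{e}_{is}H_{is} \tag{A.5} \]

Free entry requires that the aggregate sectoral profits, $\Pi_{is}$, are equal to the expenditures on entry, $\omega_{is}f^{e}_{is}H_{is}$. This means that the Pareto property of a constant profits/revenue ratio is not affected by distortions: $R_{is} = \frac{\kappa}{\rho}\Pi_{is}$. 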
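For readers who want the intermediate step behind (A.5), a short sketch follows. It assumes the standard CES relation $\rho = (\sigma-1)/\sigma$, which is implied by the markup in the revenue formula but is not restated in this appendix.

```latex
\begin{align*}
R_{is} = \sum_{j}^{N} X_{ijs}
 &= \omega_{is}\frac{H_{is}}{d_{is}}\frac{\sigma\kappa}{1+\kappa-\sigma}\,\Gamma_{is}
    \sum_{j}^{N}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa} f_{ijs}
    && \text{by (A.4)} \\
 &= \omega_{is}\frac{H_{is}}{d_{is}}\frac{\sigma\kappa}{1+\kappa-\sigma}\,\Gamma_{is}
    \cdot \frac{d_{is} f^{e}_{is}(1+\kappa-\sigma)}{\Gamma_{is}(\sigma-1)}
    && \text{by (A.3)} \\
 &= \frac{\sigma}{\sigma-1}\,\kappa\,\omega_{is} f^{e}_{is} H_{is}
  = \frac{\kappa}{\rho}\,\omega_{is} f^{e}_{is} H_{is}.
\end{align*}
```

The misallocation term $\Gamma_{is}$ and the exit rate $d_{is}$ cancel between (A.3) and (A.4), which is why the revenue-to-entry-cost ratio is the same as in the undistorted model.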
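From equations (1.11) and (1.12), the sectoral demand of primary factor l for both operational (fixed and variable costs) and entry uses is given by:

\[ Z_{ils} = Z^{o}_{ils} + Z^{e}_{ils} = \frac{\rho\alpha_{ls}R_{is}}{w_{il}(1+\bar{\theta}_{ils})} + \frac{\alpha_{ls}F_{is}}{w_{il}(1+\bar{\theta}_{ils})} + \frac{\alpha_{ls}\omega_{is}f^{e}_{is}H_{is}}{w_{il}} \]

Substituting the expression for $\sum_{m}^{M_{ijs}}\Theta_{im}$ from above in the definition of $F_{is}$ and using again equation (A.5), it is straightforward to obtain equation (1.17), the total demand of primary factor l in terms of sector revenue, underlying factor prices and the HWA wedges. With the definition of $v_{ils}$ as in the text, equation (1.21) is evident.

Finally, combining (A.4) with the gravity equation, I obtain:

\[ X_{ijs} = \frac{X_{ijs}}{\sum_{k}^{N}X_{kjs}}E_{js} = \frac{\omega_{is}\frac{H_{is}}{d_{is}}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa}\Gamma_{is}f_{ijs}}{\sum_{k}^{N}\omega_{ks}\frac{H_{ks}}{d_{ks}}\left(\frac{\bar{a}_{ks}}{a^{*}_{kjs}}\right)^{\kappa}\Gamma_{ks}f_{kjs}}E_{js} \]

By definition of the cutoff function in (1.7), it is possible to show the following relation between the cutoffs for the undistorted firms of country i and country i′ for the same destination j:

\[ \frac{a^{*}_{ijs}}{a^{*}_{i'js}} = \left(\frac{\tau_{ijs}}{\tau_{i'js}}\right)\left(\frac{\omega_{is}}{\omega_{i's}}\right)^{\frac{1}{\rho}}\left(\frac{f_{ijs}}{f_{i'js}}\right)^{\frac{1}{\sigma-1}} \tag{A.6} \]

Using the formula in (A.6) in the denominator of bilateral exports, I obtain:

\[ X_{ijs} = \frac{\frac{1}{d_{is}}\omega_{is}^{1-\frac{\kappa}{\rho}}H_{is}\bar{a}_{is}^{\kappa}\left(\frac{1}{\tau_{ijs}}\right)^{\kappa}(f_{ijs})^{1-\frac{\kappa}{\sigma-1}}\Gamma_{is}}{\sum_{k}^{N}\frac{1}{d_{ks}}\omega_{ks}^{1-\frac{\kappa}{\rho}}H_{ks}\bar{a}_{ks}^{\kappa}\left(\frac{1}{\tau_{kjs}}\right)^{\kappa}(f_{kjs})^{1-\frac{\kappa}{\sigma-1}}\Gamma_{ks}}E_{js} \]

Using (A.5) to substitute for the mass of entrants in terms of sectoral revenue, it simplifies to:

\[ X_{ijs} = \frac{\omega_{is}^{-\frac{\kappa}{\rho}}R_{is}\phi_{ijs}\Gamma_{is}}{\sum_{k}^{N}\omega_{ks}^{-\frac{\kappa}{\rho}}R_{ks}\phi_{kjs}\Gamma_{ks}}E_{js} \tag{A.7} \]

where $\phi_{ijs}$ is as in the text. Hence, trade shares are given by (1.24). The model is closed by combining (A.7) with the definitions of sectoral and aggregate revenues ($R_{is} = \sum_{j}^{N}X_{ijs}$ and $R_{i} = \sum_{s}^{S}R_{is}$), the Cobb-Douglas solution for sectoral expenditures, $E_{js} = \beta_{js}E_{j}$, and the trade balance condition, $E_{j} = \sum_{s}^{S}R_{js} - D_{j}$, which results in equation (1.23).

The system can be solved for the values of $R_{is}$ for a given set of values of factor intensities $\alpha_{ls}$, factor endowments $\bar{Z}_{il}$, expenditure shares $\beta_{js}$, aggregate trade deficits $D_{j}$, deep parameters $\phi_{ijs}$, $\kappa$ and $\rho$, and misallocation measures $\Gamma_{is}$ and $v_{ils}$. 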
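Once the solution for $R_{is}$ is computed, the values of all remaining variables can be found in the following sequence: i) factor prices and sectoral factor allocations from (1.21) and (1.22); ii) expenditures from the trade balance condition; iii) bilateral exports from (A.7); iv) mass of entrants from (A.5); v) bilateral cutoff values for the undistorted firms from (A.4); vi) mass of operating firms from (1.9).

A.4.2 Demonstration of equation (1.18)

Here I deduce the formula for the ex-post HWA wedge in equation (1.18).

Proof. Starting from the definition of the HWA wedge:

\[ (1+\bar{\theta}_{ils}) \equiv \left(\sum_{j}^{N}\sum_{m}^{M_{ijs}}\frac{1}{(1+\theta_{ilm})}\frac{c_{ijm}}{C_{is}}\right)^{-1} = \left(\sum_{j}^{N}\sum_{m}^{M_{ijs}}\frac{1}{(1+\theta_{ilm})}\frac{\rho r_{ijm}+\omega_{is}\Theta_{im}f_{ijs}}{\rho R_{is}+F_{is}}\right)^{-1} \]

Substituting firm-level exports from i to j and after a few algebraic manipulations we can write:

\[ \frac{(1+\bar{\theta}_{ils})}{\rho R_{is}+F_{is}} = \left(\sum_{j}^{N}\sum_{m}^{M_{ijs}}\left[\frac{\rho}{(1+\theta_{ilm})}\left(\frac{\tau_{ijs}\Theta_{im}\omega_{is}}{\rho a_{im}}\right)^{1-\sigma}E_{js}P_{js}^{\sigma-1} + \frac{\omega_{is}\Theta_{im}f_{ijs}}{(1+\theta_{ilm})}\right]\right)^{-1} \]

\[ \frac{(1+\bar{\theta}_{ils})}{\rho R_{is}+F_{is}} = \left(\rho\left(\frac{\omega_{is}}{\rho}\right)^{1-\sigma}\sum_{j}^{N}\tau_{ijs}^{1-\sigma}E_{js}P_{js}^{\sigma-1}\sum_{m}^{M_{ijs}}\frac{1}{(1+\theta_{ilm})}\left(\frac{\Theta_{im}}{a_{im}}\right)^{1-\sigma} + \omega_{is}\sum_{j}^{N}f_{ijs}\sum_{m}^{M_{ijs}}\frac{\Theta_{im}}{(1+\theta_{ilm})}\right)^{-1} \]

Similarly to the preceding section, it is possible to show that $\sum_{m}^{M_{ijs}}\frac{1}{(1+\theta_{ilm})}\left(\frac{\Theta_{im}}{a_{im}}\right)^{1-\sigma} = \frac{M^{e}_{is}}{d_{is}}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa}a^{*\,\sigma-1}_{ijs}\frac{\kappa\Gamma_{ils}}{1+\kappa-\sigma}$ and $\sum_{m}^{M_{ijs}}\frac{\Theta_{im}}{(1+\theta_{ilm})} = \frac{M^{e}_{is}\Gamma_{ils}}{d_{is}}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa}$, with $\Gamma_{ils}$ as in the text. Thus:

\[ \frac{(1+\bar{\theta}_{ils})}{\rho R_{is}+F_{is}} = \left(\rho\left(\frac{\omega_{is}}{\rho}\right)^{1-\sigma}\frac{M^{e}_{is}}{d_{is}}\sum_{j}^{N}\tau_{ijs}^{1-\sigma}E_{js}P_{js}^{\sigma-1}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa}a^{*\,\sigma-1}_{ijs}\frac{\kappa\Gamma_{ils}}{1+\kappa-\sigma} + \omega_{is}\frac{M^{e}_{is}}{d_{is}}\sum_{j}^{N}f_{ijs}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa}\Gamma_{ils}\right)^{-1} \]

Substituting the definition of the productivity cutoff value for the undistorted firms in (1.7) for $a^{*\,\sigma-1}_{ijs}$, I obtain:

\[ \frac{(1+\bar{\theta}_{ils})}{\rho R_{is}+F_{is}} = \left(\omega_{is}\frac{M^{e}_{is}}{d_{is}}\frac{(\sigma-1)\kappa\Gamma_{ils}}{1+\kappa-\sigma}\sum_{j}^{N}f_{ijs}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa} + \omega_{is}\frac{M^{e}_{is}\Gamma_{ils}}{d_{is}}\sum_{j}^{N}f_{ijs}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa}\right)^{-1} \]

\[ \frac{(1+\bar{\theta}_{ils})}{\rho R_{is}+F_{is}} = \left(\omega_{is}\frac{M^{e}_{is}}{d_{is}}\Gamma_{ils}\frac{\sigma\kappa+1-\sigma}{(1+\kappa-\sigma)}\sum_{j}^{N}f_{ijs}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa}\right)^{-1} \]

Using the free entry condition in (A.3):

\[ \frac{(1+\bar{\theta}_{ils})}{\rho R_{is}+F_{is}} = \left(\omega_{is}M^{e}_{is}f^{e}_{is}\frac{\Gamma_{ils}}{\Gamma_{is}}\frac{\sigma\kappa+1-\sigma}{(\sigma-1)}\right)^{-1} \]

Substituting the expression for $\sum_{m}^{M_{ijs}}\Theta_{im}$ given in Appendix C.1 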
in the definition of $F_{is}$ and using again equation (A.5), it is possible to show that $\rho R_{is}+F_{is} = \omega_{is}M^{e}_{is}f^{e}_{is}\frac{\sigma\kappa+1-\sigma}{(\sigma-1)}$ and hence:

\[ (1+\bar{\theta}_{ils}) = \frac{\Gamma_{is}}{\Gamma_{ils}} \]

It is possible to repeat the proof to derive an expression for the HWA wedge of the firms able to sell in each market j. Doing so, it follows that $(1+\bar{\theta}_{ijls}) = (1+\bar{\theta}_{ils})$; that is, the HWA wedge does not vary across destinations. This result looks counterintuitive at first glance, since the average is not computed over the same set of firms: for example, $(1+\bar{\theta}_{iils})$ includes the firms that only sell in the domestic market, which must have, conditional on TFPQ, higher wedges than the firms exporting to j. The equalization is possible because in the HWA the inverse of the wedge is weighted by the cost share, and firms that only sell in the domestic market have higher cost shares.

A.4.3 Decomposition of industry-exporter fixed effect

From the definition of the bilateral price index in equation (1.16), the double difference across sectors and exporters of the unit prices in each destination can be rewritten in terms of the relative bilateral iceberg costs, number of exporters, average TFP and factor returns as:

\[ \left(\frac{P_{ijs}P_{i'js'}}{P_{ijs'}P_{i'js}}\right)^{1-\sigma} = \left(\frac{\tau_{ijs}\tau_{i'js'}}{\tau_{ijs'}\tau_{i'js}}\right)^{1-\sigma}\left(\frac{M_{ijs}M_{i'js'}}{M_{ijs'}M_{i'js}}\right)\left(\frac{\bar{\psi}_{ijs}\bar{\psi}_{i'js'}}{\bar{\psi}_{ijs'}\bar{\psi}_{i'js}}\right)^{1-\sigma}\left(\frac{A_{ijs}A_{i'js'}}{A_{ijs'}A_{i'js}}\right)^{\sigma-1} \tag{A.8} \]

My interest is twofold. First, I provide a proof of equation (1.19); second, I decompose the industry-exporter fixed effect into single components that come from each of the mentioned sources. For this reason, in the next lines I develop the RHS of (A.8) keeping each term separated in square brackets, without simplifying across terms. 
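Using the definitions of $\bar{\psi}_{ijs}$ and $A_{ijs}$ in the text, equation (A.8) can be written as:

\begin{align*}
\left(\frac{P_{ijs}P_{i'js'}}{P_{ijs'}P_{i'js}}\right)^{1-\sigma} ={}& \left[\frac{\tau_{ijs}\tau_{i'js'}}{\tau_{ijs'}\tau_{i'js}}\right]^{1-\sigma}\left[\frac{M_{ijs}M_{i'js'}}{M_{ijs'}M_{i'js}}\right]\left[\frac{\omega_{is}\omega_{i's'}\bar{\Theta}_{ijs}\bar{\Theta}_{i'js'}}{\omega_{is'}\omega_{i's}\bar{\Theta}_{ijs'}\bar{\Theta}_{i'js}}\right]^{1-\sigma}\\
&\times\left[\frac{\bar{\Theta}_{ijs}M_{ijs}^{\frac{1}{1-\sigma}}\left(\sum_{m}^{M_{ijs}}\left(\frac{a_{im}}{\Theta_{im}}\right)^{\sigma-1}\right)^{\frac{1}{\sigma-1}}\ \bar{\Theta}_{i'js'}M_{i'js'}^{\frac{1}{1-\sigma}}\left(\sum_{m}^{M_{i'js'}}\left(\frac{a_{i'm}}{\Theta_{i'm}}\right)^{\sigma-1}\right)^{\frac{1}{\sigma-1}}}{\bar{\Theta}_{ijs'}M_{ijs'}^{\frac{1}{1-\sigma}}\left(\sum_{m}^{M_{ijs'}}\left(\frac{a_{im}}{\Theta_{im}}\right)^{\sigma-1}\right)^{\frac{1}{\sigma-1}}\ \bar{\Theta}_{i'js}M_{i'js}^{\frac{1}{1-\sigma}}\left(\sum_{m}^{M_{i'js}}\left(\frac{a_{i'm}}{\Theta_{i'm}}\right)^{\sigma-1}\right)^{\frac{1}{\sigma-1}}}\right]^{\sigma-1}
\end{align*}

Using the expression for $\sum_{m}^{M_{ijs}}\left(\frac{a_{im}}{\Theta_{im}}\right)^{\sigma-1}$ in equation (A.2) and the fact that $\bar{\Theta}_{ijs} = \bar{\Theta}_{is}$, derived in Appendix C.2, this reduces to:

\begin{align*}
\left(\frac{P_{ijs}P_{i'js'}}{P_{ijs'}P_{i'js}}\right)^{1-\sigma} ={}& \left[\frac{\tau_{ijs}\tau_{i'js'}}{\tau_{ijs'}\tau_{i'js}}\right]^{1-\sigma}\left[\frac{M_{ijs}M_{i'js'}}{M_{ijs'}M_{i'js}}\right]\left[\frac{\omega_{is}\omega_{i's'}\bar{\Theta}_{is}\bar{\Theta}_{i's'}}{\omega_{is'}\omega_{i's}\bar{\Theta}_{is'}\bar{\Theta}_{i's}}\right]^{1-\sigma}\\
&\times\left[\left(\frac{\bar{\Theta}_{is}\bar{\Theta}_{i's'}a^{*}_{ijs}a^{*}_{i'js'}}{\bar{\Theta}_{is'}\bar{\Theta}_{i's}a^{*}_{ijs'}a^{*}_{i'js}}\right)^{\sigma-1}\left(\frac{M_{ijs'}M_{i'js}}{M_{ijs}M_{i'js'}}\right)\left(\frac{\Gamma_{is}\Gamma_{i's'}}{\Gamma_{is'}\Gamma_{i's}}\right)\frac{\frac{H_{is}}{d_{is}}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa}\frac{H_{i's'}}{d_{i's'}}\left(\frac{\bar{a}_{i's'}}{a^{*}_{i'js'}}\right)^{\kappa}}{\frac{H_{is'}}{d_{is'}}\left(\frac{\bar{a}_{is'}}{a^{*}_{ijs'}}\right)^{\kappa}\frac{H_{i's}}{d_{i's}}\left(\frac{\bar{a}_{i's}}{a^{*}_{i'js}}\right)^{\kappa}}\right]
\end{align*}

Under assumptions A.1 and A.2, the aggregate stability condition (1.9) can be solved to obtain $M_{ijs} = \frac{H_{is}\Upsilon_{is}}{d_{is}}\left(\frac{\bar{a}_{is}}{a^{*}_{ijs}}\right)^{\kappa}$ with $\Upsilon_{is} = \int_{\theta_{i1}}\cdots\int_{\theta_{iL}}\Theta_{i}^{-\frac{\kappa}{\rho}}\,dG^{\theta}_{is}(\vec{\theta})$, an expected value that depends only on the joint distribution of distortions. Substituting this expression into the terms that contain $M_{ijs}$, and using equation (A.6), I obtain for the RHS:

\begin{align*}
={}& \left[\frac{\tau_{ijs}\tau_{i'js'}}{\tau_{ijs'}\tau_{i'js}}\right]^{1-\sigma}\left[\frac{d_{is'}d_{i's}}{d_{is}d_{i's'}}\frac{H_{is}H_{i's'}}{H_{is'}H_{i's}}\frac{\Upsilon_{is}\Upsilon_{i's'}}{\Upsilon_{is'}\Upsilon_{i's}}\left(\frac{\bar{a}_{is}\bar{a}_{i's'}}{\bar{a}_{is'}\bar{a}_{i's}}\right)^{\kappa}\left(\frac{\tau_{ijs}\tau_{i'js'}}{\tau_{ijs'}\tau_{i'js}}\right)^{-\kappa}\left(\frac{\omega_{is}\omega_{i's'}}{\omega_{is'}\omega_{i's}}\right)^{-\frac{\kappa}{\rho}}\left(\frac{f_{ijs}f_{i'js'}}{f_{ijs'}f_{i'js}}\right)^{-\frac{\kappa}{\sigma-1}}\right]\\
&\times\left[\frac{\omega_{is}\omega_{i's'}\bar{\Theta}_{is}\bar{\Theta}_{i's'}}{\omega_{is'}\omega_{i's}\bar{\Theta}_{is'}\bar{\Theta}_{i's}}\right]^{1-\sigma}\left[\left(\frac{\bar{\Theta}_{is}\bar{\Theta}_{i's'}}{\bar{\Theta}_{is'}\bar{\Theta}_{i's}}\right)^{\sigma-1}\frac{\Gamma_{is}\Gamma_{i's'}}{\Gamma_{is'}\Gamma_{i's}}\frac{\Upsilon_{is'}\Upsilon_{i's}}{\Upsilon_{is}\Upsilon_{i's'}}\left(\frac{\tau_{ijs}\tau_{i'js'}}{\tau_{ijs'}\tau_{i'js}}\right)^{\sigma-1}\left(\frac{\omega_{is}\omega_{i's'}}{\omega_{is'}\omega_{i's}}\right)^{\sigma}\left(\frac{f_{ijs}f_{i'js'}}{f_{ijs'}f_{i'js}}\right)\right]
\end{align*}

Using $H_{is} = \frac{R_{is}}{\omega_{is}f^{e}_{is}}$, applying logs to separate the components that depend only on exporter-industry terms, and simplifying, I finally obtain for the RHS of (A.8):

\begin{align*}
={}& \log\left[\frac{\rho_{is}\rho_{i's'}}{\rho_{is'}\rho_{i's}}\frac{R_{is}R_{i's'}}{R_{is'}R_{i's}}\frac{\Upsilon_{is}\Upsilon_{i's'}}{\Upsilon_{is'}\Upsilon_{i's}}\left(\frac{\omega_{is}\omega_{i's'}}{\omega_{is'}\omega_{i's}}\right)^{-\frac{\kappa}{\rho}-1}\right] + \log\left[\frac{\omega_{is}\omega_{i's'}\bar{\Theta}_{is}\bar{\Theta}_{i's'}}{\omega_{is'}\omega_{i's}\bar{\Theta}_{is'}\bar{\Theta}_{i's}}\right]^{1-\sigma} \tag{A.9}\\
&+ \log\left[\left(\frac{\bar{\Theta}_{is}\bar{\Theta}_{i's'}}{\bar{\Theta}_{is'}\bar{\Theta}_{i's}}\right)^{\sigma-1}\left(\frac{\omega_{is}\omega_{i's'}}{\omega_{is'}\omega_{i's}}\right)^{\sigma}\frac{\Gamma_{is}\Gamma_{i's'}}{\Gamma_{is'}\Gamma_{i's}}\frac{\Upsilon_{is'}\Upsilon_{i's}}{\Upsilon_{is}\Upsilon_{i's'}}\right] + B_{ijs}
\end{align*}

where $B_{ijs} = \ln\left[\left(\frac{\tau_{ijs}\tau_{i'js'}}{\tau_{ijs'}\tau_{i'js}}\right)^{-\kappa}\left(\frac{f_{ijs}f_{i'js'}}{f_{ijs'}f_{i'js}}\right)^{1-\frac{\kappa}{\sigma-1}}\right]$ and $\rho_{is} = \frac{\bar{a}_{is}^{\kappa}}{d_{is}f^{e}_{is}}$. Canceling out the double differences of $\bar{\Theta}_{is}$ and $\Upsilon_{is}$ across terms and simplifying the double differences of $\omega_{is}$, it is straightforward to derive the gravity equation in (1.19). Furthermore, equation (A.9) offers a decomposition of the exporter-industry fixed effect into the three sources of interest: number of exporters (first log term), average factor returns (second log term) and TFP (third log term).

This decomposition is used in section 1.3.3 as follows. Denote by $\tilde{x}$ the value of x in the allocatively efficient equilibrium, and by $\check{x} \equiv x/\tilde{x}$ the proportional change when we introduce distortions. 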
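Thus figure 1.3 plots in each chart the following terms:

\begin{align*}
\log\left(\frac{\check{X}_{ijs}\check{X}_{i'js'}}{\check{X}_{ijs'}\check{X}_{i'js}}\right) ={}& \log\left[\frac{\check{R}_{is}\check{R}_{i's'}}{\check{R}_{is'}\check{R}_{i's}}\frac{\Upsilon_{is}\Upsilon_{i's'}}{\Upsilon_{is'}\Upsilon_{i's}}\left(\frac{\check{\omega}_{is}\check{\omega}_{i's'}}{\check{\omega}_{is'}\check{\omega}_{i's}}\right)^{-\frac{\kappa}{\rho}-1}\right] + \log\left(\frac{\check{\omega}_{is}\check{\omega}_{i's'}\bar{\Theta}_{is}\bar{\Theta}_{i's'}}{\check{\omega}_{is'}\check{\omega}_{i's}\bar{\Theta}_{is'}\bar{\Theta}_{i's}}\right)^{1-\sigma}\\
&+ \log\left[\left(\frac{\bar{\Theta}_{is}\bar{\Theta}_{i's'}}{\bar{\Theta}_{is'}\bar{\Theta}_{i's}}\right)^{\sigma-1}\left(\frac{\check{\omega}_{is}\check{\omega}_{i's'}}{\check{\omega}_{is'}\check{\omega}_{i's}}\right)^{\sigma}\frac{\Gamma_{is}\Gamma_{i's'}}{\Gamma_{is'}\Gamma_{i's}}\frac{\Upsilon_{is'}\Upsilon_{i's}}{\Upsilon_{is}\Upsilon_{i's'}}\right]
\end{align*}

with i = 1, i′ = 2, j = 2, s = 1, s′ = 2.

A.4.4 Solution for Γis under log-normal

By the definition of $\Gamma_{is}$ in the text:

\[ \Gamma_{is} = \int_{\theta_{i1}}\cdots\int_{\theta_{iL}}\Theta_{i}^{1-\frac{\kappa}{\rho}}\,dG^{\theta}_{is} = E\left(\prod_{l}^{L}(1+\theta_{il})^{(1-\frac{\kappa}{\rho})\alpha_{ls}}\right) \]

Assume $\vec{\theta}_{is} = \{\theta_{i1s},\theta_{i2s},\ldots,\theta_{iLs}\}$ has a multivariate log-normal distribution, such that the transformed vector $\vec{\theta}^{*}_{is} = \{\ln(\theta_{i1s}),\ln(\theta_{i2s}),\ldots,\ln(\theta_{iLs})\}$ has a multivariate normal distribution with expected value $\vec{\mu}_{is}$ (1 × L vector) and variance $V_{is}$ (L × L matrix). Let $\vec{\alpha}_{s}$ be a (column) vector with elements $\vec{\alpha}_{s} = \{(1-\frac{\kappa}{\rho})\alpha_{1s},(1-\frac{\kappa}{\rho})\alpha_{2s},\ldots,(1-\frac{\kappa}{\rho})\alpha_{Ls}\}'$. Then the product $\prod_{l}^{L}(1+\theta_{il})^{(1-\frac{\kappa}{\rho})\alpha_{ls}}$ is log-normally distributed with location parameter $(\vec{\alpha}_{s})'\vec{\mu}_{is}$ and shape parameter $(\vec{\alpha}_{s})'V_{is}\vec{\alpha}_{s}$. Under log-normality, the required expected value is then:

\[ \Gamma_{is} = \exp\left[(\vec{\alpha}_{s})'\vec{\mu}_{is} + \frac{1}{2}(\vec{\alpha}_{s})'V_{is}\vec{\alpha}_{s}\right] \]

On the other hand, by the definition of $\Gamma_{ils}$ in the text:

\[ \Gamma_{ils} = \int_{\theta_{i1}}\cdots\int_{\theta_{iL}}\frac{\Theta_{i}^{1-\frac{\kappa}{\rho}}}{(1+\theta_{ils})}\,dG^{\theta}_{is} = E\left[(1+\theta_{il})^{(1-\frac{\kappa}{\rho})\alpha_{ls}-1}\prod_{h\neq l}^{L}(1+\theta_{ih})^{(1-\frac{\kappa}{\rho})\alpha_{hs}}\right] \]

By the same token, let $\vec{\alpha}_{ls}$ be a (column) vector with elements $\vec{\alpha}_{ls} = \{(1-\frac{\kappa}{\rho})\alpha_{1s},\ldots,(1-\frac{\kappa}{\rho})\alpha_{ls}-1,\ldots,(1-\frac{\kappa}{\rho})\alpha_{Ls}\}'$. That is, $\vec{\alpha}_{ls}$ has the same elements as $\vec{\alpha}_{s}$ except for the element in position l, which is $(1-\frac{\kappa}{\rho})\alpha_{ls}-1$. Thus the product $(1+\theta_{il})^{(1-\frac{\kappa}{\rho})\alpha_{ls}-1}\prod_{h\neq l}^{L}(1+\theta_{ih})^{(1-\frac{\kappa}{\rho})\alpha_{hs}}$ is log-normally distributed with location parameter $(\vec{\alpha}_{ls})'\vec{\mu}_{is}$ and shape parameter $(\vec{\alpha}_{ls})'V_{is}\vec{\alpha}_{ls}$. 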
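The closed form for $\Gamma_{is}$ under log-normality can be checked numerically. The sketch below is illustrative only: it assumes that the vector of log wedges $\ln(1+\theta_{il})$ is jointly normal with mean `mu` and covariance `V`, and every parameter value (`kappa`, `rho`, `alpha`, `mu`, `V`) is hypothetical rather than taken from the thesis. It compares the formula $\exp[\vec{\alpha}_s'\vec{\mu}_{is} + \frac{1}{2}\vec{\alpha}_s'V_{is}\vec{\alpha}_s]$ with a Monte Carlo average.

```python
import numpy as np

def gamma_closed_form(alpha_tilde, mu, V):
    """exp(a'mu + 0.5 a'Va): mean of a log-normal with these parameters."""
    return float(np.exp(alpha_tilde @ mu + 0.5 * alpha_tilde @ V @ alpha_tilde))

def gamma_monte_carlo(alpha_tilde, mu, V, n_draws=200_000, seed=0):
    """Simulated E[prod_l (1+theta_l)^{a_l}] with ln(1+theta) ~ N(mu, V)."""
    rng = np.random.default_rng(seed)
    log_wedges = rng.multivariate_normal(mu, V, size=n_draws)
    return float(np.exp(log_wedges @ alpha_tilde).mean())

# Hypothetical parameter values, chosen only for illustration.
kappa, rho = 4.0, 0.75
alpha = np.array([0.3, 0.5, 0.2])            # factor intensities, L = 3
alpha_tilde = (1.0 - kappa / rho) * alpha    # elements of the vector alpha_s
mu = np.array([0.0, 0.05, -0.05])
V = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.02],
              [0.00, 0.02, 0.05]])

gamma_cf = gamma_closed_form(alpha_tilde, mu, V)
gamma_mc = gamma_monte_carlo(alpha_tilde, mu, V)
```

With enough draws the two numbers should agree up to simulation noise; replacing `alpha_tilde` with the vector that subtracts one from its l-th element gives the analogous check for the factor-specific term.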
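Accordingly, the expected value that defines $\Gamma_{ils}$ is:

\[ \Gamma_{ils} = \exp\left[(\vec{\alpha}_{ls})'\vec{\mu}_{is} + \frac{1}{2}(\vec{\alpha}_{ls})'V_{is}\vec{\alpha}_{ls}\right] \]

Now, using the formula for $(1+\bar{\theta}_{ils})$ in (1.18) we obtain:

\begin{align*}
\ln(1+\bar{\theta}_{ils}) &= (\vec{\alpha}_{s})'\vec{\mu}_{is} + \frac{1}{2}(\vec{\alpha}_{s})'V_{is}\vec{\alpha}_{s} - (\vec{\alpha}_{ls})'\vec{\mu}_{is} - \frac{1}{2}(\vec{\alpha}_{ls})'V_{is}\vec{\alpha}_{ls}\\
&= \mu_{ils} + \frac{1}{2}\left[(\vec{\alpha}_{s})'V_{is}\vec{\alpha}_{s} - (\vec{\alpha}_{ls})'V_{is}\vec{\alpha}_{ls}\right] \tag{A.10}
\end{align*}

A.4.5 Welfare

Combining the formula of the consumer price index in sector s and equation (A.2) we obtain:

\[ (P^{d}_{is})^{1-\sigma} = \sum_{k}^{N}P_{kis}^{1-\sigma} = \sum_{k}^{N}\left(\frac{\tau_{kis}}{\rho}\omega_{ks}\right)^{1-\sigma}\sum_{m}^{M_{kis}}\left(\frac{a_{km}}{\Theta_{km}}\right)^{\sigma-1} = \sum_{k}^{N}\left(\frac{\tau_{kis}}{\rho}\omega_{ks}\right)^{1-\sigma}\frac{H_{ks}}{d_{ks}}\frac{\kappa}{1+\kappa-\sigma}\left(\frac{\bar{a}_{ks}}{a^{*}_{kis}}\right)^{\kappa}a^{*\,\sigma-1}_{kis}\Gamma_{ks} \]

Inserting the definition of the productivity cutoff value for the undistorted firms in (1.7) in the term $a^{*\,\sigma-1-\kappa}_{kis}$, the price index can be written as:

\[ (P^{d}_{is})^{-\kappa} = E_{is}^{-\frac{\kappa}{1-\sigma}-1}\sum_{k}^{N}\left(\frac{\tau_{kis}}{\rho}\right)^{-\kappa}\omega_{ks}^{1-\frac{\kappa}{\rho}}\frac{H_{ks}}{d_{ks}}\frac{\kappa}{1+\kappa-\sigma}(\bar{a}_{ks})^{\kappa}(\sigma f_{kis})^{1-\frac{\kappa}{\sigma-1}}\Gamma_{ks} \]

Using country i’s share of expenditure on itself within sector s from equation (A.7), we obtain:

\[ (P^{d}_{is})^{-\kappa} = \varsigma_{ijs}E_{is}^{-\frac{\kappa}{1-\sigma}-1}\omega_{is}^{-\frac{\kappa}{\rho}}R_{is}\Gamma_{is}\left(\frac{1}{\pi_{iis}}\right) \]

where $\varsigma_{ijs} = \left(\frac{\rho\bar{a}_{is}}{\tau_{ijs}}\right)^{\kappa}\frac{1}{d_{is}f^{e}_{i}}\left(\frac{1}{f_{iis}}\right)^{1-\frac{\kappa}{\sigma-1}}\left(\frac{\kappa}{1+\kappa-\rho}\right)$ is a term that does not vary in the counterfactual exercise. Hence, the proportional change of the price index from the initial equilibrium to the counterfactual one can be written as:

\[ \hat{P}^{d}_{is} = \hat{E}_{is}^{\frac{1}{1-\sigma}+\frac{1}{\kappa}}\,\hat{\omega}_{is}^{\frac{1}{\rho}}\,\hat{R}_{is}^{-\frac{1}{\kappa}}\,\hat{\Gamma}_{is}^{-\frac{1}{\kappa}}\,(\hat{\pi}_{iis})^{\frac{1}{\kappa}} \]

Using the fact that $\hat{P}^{d}_{i} = \prod_{s}(\hat{P}^{d}_{is})^{\beta_{s}}$, $\hat{E}_{is} = \hat{E}_{i}$ and equation (1.25) to substitute for $\hat{\omega}_{is}$, the derivation of equation (1.28) is straightforward. Moreover, notice that in the case of the undistorted economy with one-factor production, $\hat{R}_{is} = \hat{\omega}_{is}\hat{Z}_{is}$ and $\hat{\omega}_{is} = \hat{w}_{i} = \hat{E}_{i}$, so the increase in the sectoral price index is $\hat{P}^{d}_{is} = \hat{w}_{i}\left(\frac{\hat{\pi}_{iis}}{\hat{Z}_{is}}\right)^{\frac{1}{\kappa}}$, which leads to the Arkolakis et al. 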
(2012) formula to compute the increase in welfare in response to any exogenous shock.

Appendix B
Appendix to Chapter 2

B.1 Additional tables

Table B.1: Premia with hours worked: Additional jobs

 | (1) | (2) | (3) | (4) | (5) | (6)
 | Log Income | Log Income | Log Income | Log Income | Log Inc./Hour | Log Inc./Hour
Non-Agriculture | 0.501*** | 0.264*** | 0.390*** | 0.216*** | 0.275*** | 0.150***
 | (0.034) | (0.032) | (0.032) | (0.031) | (0.034) | (0.035)
Urban | 0.171*** | 0.063* | 0.143*** | 0.063** | 0.112*** | 0.057**
 | (0.034) | (0.034) | (0.030) | (0.026) | (0.028) | (0.028)
Log Hours/Year | | | 0.509*** | 0.445*** | |
 | | | (0.011) | (0.011) | |
Year & province FE | Yes | Yes | Yes | Yes | Yes | Yes
Indiv. cont. | Yes | Yes | Yes | Yes | Yes | Yes
Individual FE | | Yes | | Yes | | Yes
Observations | 44489 | 44492 | 43819 | 43821 | 43819 | 43821
R2 | 0.514 | 0.538 | 0.603 | 0.615 | 0.495 | 0.514

Notes: Income and hours from both the main job and the secondary job. Individual controls: education, experience, experience sq., and sex. Observations weighted by longitudinal survey weights. Standard errors clustered by enumeration areas (primary sampling units of the survey) in parentheses. Significance levels: * p<0.10, ** p<0.05, *** p<0.01.

Table B.2: Hours worked

 | Base | Base | Add. Job | Add. Job
 | (1) | (2) | (3) | (4)
 | Log Hours | Log Hours | Log Hours | Log Hours
Non-Agriculture | 0.286*** | 0.152*** | 0.234*** | 0.119***
 | (0.024) | (0.029) | (0.022) | (0.027)
Urban | 0.101*** | 0.014 | 0.062*** | 0.012
 | (0.020) | (0.031) | (0.018) | (0.032)
Year & province FE | Yes | Yes | Yes | Yes
Indiv. cont. | Yes | Yes | Yes | Yes
Individual FE | | Yes | | Yes
Observations | 43841 | 43843 | 43819 | 43821
R2 | 0.053 | 0.023 | 0.052 | 0.026

Notes: Base is the baseline specification involving primary job only. Add. Job also includes secondary job. Individual controls: education, experience, experience sq., and sex. Observations weighted by longitudinal survey weights. Standard errors clustered by enumeration areas (primary sampling units of the survey) in parentheses. Significance levels: * p<0.10, ** p<0.05, *** p<0.01.

Table B.3: Premia over Time

(a) Cross-Sectional Premia

 | Pooled | 1993 | 1997 | 2000 | 2007 | 2014
 | (1) | (2) | (3) | (4) | (5) | (6)
 | Log Income | Log Income | Log Income | Log Income | Log Income | Log Income
Non-Agriculture | 0.574*** | 0.792*** | 0.721*** | 0.547*** | 0.461*** | 0.449***
 | (0.036) | (0.070) | (0.052) | (0.051) | (0.048) | (0.058)
Urban | 0.207*** | 0.388*** | 0.271*** | 0.227*** | 0.204*** | 0.097
 | (0.036) | (0.057) | (0.051) | (0.051) | (0.049) | (0.062)
Year FE | Yes | | | | |
Province FE | Yes | Yes | Yes | Yes | Yes | Yes
Indiv. cont. | Yes | Yes | Yes | Yes | Yes | Yes
Individual FE | | | | | |
Observations | 44494 | 5296 | 8548 | 10293 | 10619 | 9738
R2 | 0.503 | 0.382 | 0.333 | 0.244 | 0.267 | 0.249

(b) Premia with Worker Fixed Effect

 | Pooled | 1993-97 | 1997-00 | 2000-07 | 2007-14
 | (1) | (2) | (3) | (4) | (5)
 | Log Income | Log Income | Log Income | Log Income | Log Income
Non-Agriculture | 0.332*** | 0.339*** | 0.292*** | 0.303*** | 0.217***
 | (0.033) | (0.071) | (0.052) | (0.056) | (0.059)
Urban | 0.084** | 0.210*** | 0.097 | 0.156*** | 0.144**
 | (0.032) | (0.068) | (0.087) | (0.058) | (0.058)
Year FE | Yes | Yes | Yes | Yes | Yes
Province FE | Yes | Yes | Yes | Yes | Yes
Indiv. cont. | Yes | Yes | Yes | Yes | Yes
Individual FE | Yes | Yes | Yes | Yes | Yes
Observations | 44497 | 13844 | 18841 | 20912 | 20360
R2 | 0.518 | 0.242 | 0.205 | 0.396 | 0.282

Notes: Pooled is the baseline sample with observations from IFLS 1-5. Panel A: cross-sectional regressions run separately for each survey wave. Panel B: panel regressions run separately for each two consecutive survey waves. Individual controls: education, experience, experience sq., and sex. Observations weighted by longitudinal survey weights. Standard errors clustered by enumeration areas (primary sampling units of the survey) in parentheses. Significance levels: * p<0.10, ** p<0.05, *** p<0.01.

B.2 Recall bias

Each wave of the IFLS asks respondents about the income they earned over the past year. Throughout the chapter we use this contemporaneously recorded income as our main dependent variable. In addition, the survey asks respondents to retrospectively recall employment information for several years prior to the survey. 
While this recall information can in principle be used to supplement the contemporaneous data and increase the sample size, retrospective survey data are known to raise serious quality concerns (cf. Bound et al. (2001)). For this reason we do not use retrospective income information in our analysis. In this appendix we explain this choice in more detail and argue that it can largely explain why our results differ from those concurrently obtained by Hicks et al. (2017).

The first three columns of the first panel of Table B.4 show the non-agricultural premia estimated on the contemporaneous data recorded by the IFLS. These numbers are similar to those reported in Table 2.11 (columns 3, 4, and 6) in the main text, but not identical, because the specifications and sample are modified to ease comparison with Hicks et al. (2017). In particular, we discard the information from the most recent wave of the IFLS, as it has not been incorporated by these authors. Columns 4-6 show the corresponding premia estimated on data from retrospective recall. Compared to the contemporaneous estimates, the cross-sectional premium (controlling for hours) drops from 71 log points (lp) to 53 lp, the premium with worker fixed effects (controlling for hours) drops from 25 lp to 11 lp, and the 19 lp premium in terms of income per hour (with worker FE) disappears entirely.

These patterns are not surprising in light of research on biases arising in recall surveys. One such well-documented bias is that past income reported by workers is biased towards their usual income.[85] For example, Gibson and Kim (2010) show the extent of this bias for US wage workers by comparing their self-reported retrospective earnings with administrative records. 
They also demonstrate that underreporting transitory income changes generates non-classical measurement error that biases the regression coefficients towards zero if the mismeasured variable is the dependent variable. This result is consistent with the reduced non-agricultural premia we find using recall data if workers cannot accurately recall how much higher their income was in the years in which they worked in non-agriculture. Furthermore, the problem is likely to be exacerbated when the identifying variation comes from changes in the income of individual workers over time. This would explain why the fall in the premium is proportionately much larger in the specification with worker fixed effects. Finally, the problems with measurement error are likely to be compounded when the dependent variable is constructed by dividing reported income by reported hours. That retrospectively recalled hours are unreliable is suggested by comparing the coefficients on hours in columns 5 and 2. The elasticity of income with respect to hours implied by column 5 is less than 0.15, only 1/3 of the 0.44 elasticity implied by the corresponding column 2 for contemporaneous data. The implausibly low elasticity for recalled hours indicates that their relationship to income should be treated with great caution in recall data. In a rare validation study observing both hours worked and earnings, Duncan and Hill (1985) find that “interview reports of average hourly earnings, obtained by dividing the interview reports of annual earnings by reports of annual work hours, appeared to be exceedingly unreliable” and caution against their use. The particularly low signal-to-noise ratio in income per hour derived from retrospective data can explain why the results become insignificant in column 6.

[85] Another bias with similar implications in this context is an anchoring bias, where respondents use an answer to a previously answered question as a mental anchor for subsequent answers. Godlonton et al. 
(2016) find strong evidenceof this behavior in a survey of Central American farmers: retrospectively recalled income correlates more highly withcurrent income (about which the respondents are asked first) than with income over the recall period that had been reportedcontemporaneously in the past. This type of cognitive bias is likely to be present in IFLS too, since IFLS also first asksabout contemporaneous income and then asks respondents to retrospectively recall past income.125B.3. Estimation procedureThe take-away message from this discussion is that using data from retrospective recall in ourapplication would introduces biases in our key results. These recall biases can be strong in IFLS sincethe respondents are asked retrospective questions about multiple years prior to the survey (up to amaximum of 10 years), and the quality of recall information deteriorates with time elapsed from thepertaining event (see, e.g. de Nicola and Giné (2014) in a developing country context). There are noobvious offsetting benefits to including the retrospective data. Statistical power, in particular, is not anissue, since the baseline sample of contemporaneous responses is large enough to allow us to estimatethe key non-agricultural premium precisely.We conclude this appendix by showing that the inclusion of retrospective data is likely the mainreason why the substantive results on the strength of the non-agricultural premium reported by Hickset al. (2017) are different than ours. In contrast to our results, they argue that the non-agricultural pre-mium in Indonesia mostly disappears once individual fixed effects are allowed for. To aid comparison,columns 1-3 in the second panel of Table B.4 repeat the same exercise as columns 1-3 and 4-6 in panelA, but now on a sample pooling the contemporaneous and retrospective responses. The estimates lieroughly half way between the two corresponding numbers reported in the first panel. 
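As a quick arithmetic check of the statement that the pooled estimates lie roughly halfway between the contemporaneous and retrospective ones, the following Python sketch compares the pooled non-agricultural premia with the simple midpoints of the two panel (a) estimates. The coefficients are copied from Table B.4; the variable names are ours, and the unweighted midpoint is only an illustrative benchmark, not part of the estimation.

```python
# Non-agricultural premia from Table B.4, columns (1)-(3) of each block.
contemporaneous = [0.707, 0.245, 0.192]   # panel (a), columns 1-3
retrospective = [0.525, 0.110, -0.038]    # panel (a), columns 4-6
pooled = [0.588, 0.173, 0.076]            # panel (b), columns 1-3

# Unweighted midpoint of the two panel (a) estimates (illustrative benchmark).
midpoints = [(c + r) / 2 for c, r in zip(contemporaneous, retrospective)]

# Distance of each pooled estimate from the corresponding midpoint.
gaps = [p - m for p, m in zip(pooled, midpoints)]

for label, m, p, g in zip(["Log Inc.", "Log Inc. (FE)", "Log Inc./Hr (FE)"],
                          midpoints, pooled, gaps):
    print(f"{label:18s} midpoint={m:.4f} pooled={p:.3f} gap={g:+.4f}")
```

In all three specifications the pooled coefficient falls strictly between the retrospective and the contemporaneous one and within 0.03 log points of their midpoint.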
This means that the pooled-sample estimates are significantly attenuated relative to those based on better-measured contemporaneous data that we favor. For comparison, columns 4-6 copy the corresponding estimates from Hicks et al. (2017) (columns 2, 6, and 7 of their Table 5A), who use pooled contemporaneous and retrospective data. While we cannot replicate their results exactly without detailed knowledge of their data processing protocol, the estimates in columns 1-3 come close. Based on this exercise, we expect that their results would have been much more in line with ours had they not used the retrospective data.86

86 Furthermore, their headline result depends on using income per hour as their preferred measure. We do not use hours data in our preferred specifications, both because of measurement issues for hours described in this appendix and conceptual issues discussed in section 2.3.2.

Table B.4: Retrospective Recall

(a) Contemporaneous vs. Recall Data

                     Contemporaneous                        Retrospective
                     (1)          (2)          (3)          (4)          (5)          (6)
                     Log Inc.     Log Inc.     Log Inc./Hr  Log Inc.     Log Inc.     Log Inc./Hr
Non-Agriculture      0.707***     0.245***     0.192***     0.525***     0.110***     -0.038
                     (0.013)      (0.022)      (0.024)      (0.020)      (0.039)      (0.052)
Log Hours            0.604***     0.462***                  0.140***     -0.012
                     (0.039)      (0.046)                   (0.051)      (0.045)
Log Hours Squared    0.000        -0.002                    0.018***     0.016***
                     (0.005)      (0.005)                   (0.006)      (0.005)
Age squared                       -0.000***    -0.000***                 -0.001***    -0.000***
                                  (0.000)      (0.000)                   (0.000)      (0.000)
Year FE              Yes          Yes          Yes          Yes          Yes          Yes
Individual FE                     Yes          Yes                       Yes          Yes
Observations         48626        48626        48626        63498        63498        63498
R-sq                 0.423        0.540        0.433        0.161        0.192        0.158

(b) Pooled Data vs. Hicks et al. (2017)

                     Pooled Data                            Hicks et al. (2017)
                     (1)          (2)          (3)          (4)          (5)          (6)
                     Log Inc.     Log Inc.     Log Inc./Hr  Log Inc.     Log Inc.     Log Inc./Hr
Non-Agriculture      0.588***     0.173***     0.076***     0.514***     0.171***     0.047
                     (0.015)      (0.019)      (0.021)      (0.016)      (0.025)      (0.031)
Log Hours            0.385***     0.206***                  0.531**      0.323***
                     (0.040)      (0.037)                   (0.025)      (0.034)
Log Hours Squared    0.006        0.009**                   -0.021***    -0.014**
                     (0.005)      (0.004)                   (0.005)      (0.006)
Age squared                       -0.000***    -0.000***                 -0.001***    -0.000***
                                  (0.000)      (0.000)                   (0.000)      (0.000)
Year FE              Yes          Yes          Yes          Yes          Yes          Yes
Individual FE                     Yes          Yes                       Yes          Yes
Observations         107933       107933       107933       115897       115897       115897
R-sq                 0.303        0.353        0.263

Notes: Contemporaneous measures based on values reported for last year. Retrospective measures obtained from the recall part of the survey. Pooled Data combines contemporaneous and retrospective observations. Sample restricted to IFLS 1-4. Sample includes all individuals with at least one observation of income and hours worked. Income is average monthly labor income from primary and secondary job. Contemporaneous income obtained by dividing annual income by 12. Hours are average monthly hours from primary and secondary job obtained as (weeks worked per year)*(normal hours per week)/12. Observations are not weighted. Standard errors clustered at the individual level in parentheses. Significance levels: * p<0.10, ** p<0.05, *** p<0.01. Columns 4-6 in Panel B are columns 2, 6, 7, respectively, from Table 5A in Hicks et al. (2017).

B.3 Estimation procedure

This Appendix presents some technical aspects of the estimation procedure. The vector of structural parameters in the frictionless economy, denoted by $\Theta$, consists of the following set of 21 elements: $\{R_{st}, \beta, \sigma^2_{\theta s}, \sigma_{\theta AN}, \sigma^2_{\varepsilon s}, \sigma^2_{\nu}\}$ for $t = 1, \ldots, 5$ and $s = A, N$ (denoting agriculture and non-agriculture, respectively). In this set, $\sigma^2_{\theta s}$ and $\sigma^2_{\varepsilon s}$ denote the variances in $\Sigma_\theta$ and $\Sigma_\varepsilon$ respectively, $\sigma_{\theta AN}$ the covariance in $\Sigma_\theta$, and $\beta$ comprises the Mincerian returns on the five mentioned covariates, denoted $\beta_{sex}$, $\beta_{loc}$, $\beta_{edu}$, $\beta_{exp}$ and $\beta_{exp2}$, respectively. For the model with switching costs, $\Theta$ is augmented by $\{\phi_{AN}, \phi_{NA}\}$, whereas for the model with compensating differentials $\Theta$ is augmented by $cd$. The Indirect Inference loss function, denoted by $Q(\Theta)$, is computed as the weighted sum of the squared differences between the values in $\hat{\delta}$ and those obtained from simulations of the structural model, that is:

$$Q(\Theta) = \left(\hat{\delta} - \hat{\delta}^s(\Theta)\right)' \Omega \left(\hat{\delta} - \hat{\delta}^s(\Theta)\right)$$

where $\hat{\delta}^s(\Theta)$ corresponds to the same vector of selected coefficients of the auxiliary models estimated with data simulated from the structural model with parameters $\Theta$, and $\Omega$ is a diagonal weighting matrix. For weights, we use factors that represent the importance of the estimated coefficient in the identification of the structural parameters of the model. The values of those factors, displayed in the second column of Table 2.15, were assigned after extensive experimentation with simulations of the model. We proceed next to comment on their magnitudes, particularly for those weights that differ from one.

As we argue in the main text, the within-individual variation is key for identification. Our priority in the estimation procedure is to make the structural model able to deliver the observed premia for switchers in the data. Obtaining the right number of switchers in each year is crucial because the precision of the obtained premia depends on the size of that group. For this reason, the coefficients of the linear probability models have the largest weights in the loss function, by a factor of 10. Moreover, we also impose larger weights (by a factor of 5) on the two sets of switchers' sectoral premia in which we are interested. First, on the premia in the model in differences iv), since this regression actually forces the structural model to deliver both the gains of switchers to non-agriculture and the cuts in income of workers switching to agriculture, allowing the estimation procedure to identify any possible asymmetry in the switching costs across sectors. Second, on the premia in model iv), which compares the average performance of a switching worker to each sector with their peer group after the switch, informing about the nature of sorting.
In addition, we know that the estimated coefficients of the interactions in model iv) help to identify the growth in relative human capital prices. Thus, the information about the full path of these prices can be recovered once the regression pins down the conditional expected income in each sector in the first year. For this reason, we force a greater accuracy in the coefficients of the constant and the interaction term of non-agriculture and the first year through larger weights, by a factor of 5. Finally, given the importance of the residual variances to identify the joint distribution of comparative advantage, they also have larger weights in the loss function, in this case by a factor of 3.

We minimize $Q(\Theta)$ using in each evaluation $H$ different simulated samples, each with size equal to the number of observations in the balanced panel ($N \times T = 8760$)87. For observables, we take in each simulated sample the same values we observe in the data. To choose $H$, we numerically explore how $Q(\Theta)$ varies in a fixed number of simulations only due to changes in the seed of the random numbers, as a function of the number of individuals. We found that the range of variation of $Q(\Theta)$ starts to stabilize after we include 80000 individuals. Thus, we choose $H = 40 \approx 80000/1752$.

Optimizing $Q(\Theta)$ is challenging since the discrete choices in the selection model create a non-smooth function that behaves as a step function in some regions of the parameter space. To deal with this problem, we use an algorithm with repeated iterations of an evolutionary method, specifically particle swarm optimization, to find the solution88.
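To fix ideas, the loss function with a diagonal weighting matrix can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name and the three auxiliary coefficients are hypothetical, and only the weights (10, 5 and 3) are the factors mentioned in the text. It is not the code used in the estimation.

```python
import numpy as np

def indirect_inference_loss(delta_hat, delta_sim, weights):
    """Q = (d - d_s)' Omega (d - d_s) with Omega = diag(weights).

    delta_hat : auxiliary coefficients estimated on the observed data
    delta_sim : the same coefficients estimated on simulated data
    weights   : diagonal of the weighting matrix Omega
    """
    gap = np.asarray(delta_hat, dtype=float) - np.asarray(delta_sim, dtype=float)
    # With a diagonal Omega the quadratic form is a weighted sum of squared gaps.
    return float(gap @ (np.asarray(weights, dtype=float) * gap))

# Hypothetical auxiliary coefficients (made up for illustration only):
# a switchers' premium, a linear-probability coefficient, a residual variance.
delta_hat = [0.25, 0.10, 1.50]
delta_sim = [0.20, 0.12, 1.40]
weights = [5.0, 10.0, 3.0]   # weighting factors of the kind described in the text

Q = indirect_inference_loss(delta_hat, delta_sim, weights)
print(Q)
```

Because the weighting matrix is diagonal, a coefficient with weight 10 contributes ten times as much to the loss as an equally large gap in a coefficient with weight one, which is how the weights steer the optimizer towards matching the prioritized moments.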
We start with 16 implementations of particle swarm optimization in a wide range of the feasible parameter space. In each of these 16 implementations we work with 96 particles, which initially are randomly and uniformly distributed. In a second stage, we perform 8 implementations of particle swarm optimization in a range of the parameter space bounded by the smallest and largest solution for each parameter in the first stage, plus a parameter-dependent margin of error. In this stage we use the same number and distribution for the initial particles. Finally, we perform an additional optimization in which we initialize 8 of the 96 particles in the solutions found in the second stage. The estimate $\hat{\Theta}$ is the solution that minimizes $Q(\Theta)$ in the 25 implementations of particle swarm optimization described. Using several numerical simulations we test that our algorithm ensures two-decimal accuracy in the solutions, as opposed to alternative optimization techniques.

87 We compute the solution on the sample generated by $H \times N \times T$ observations, a method that is equivalent to computing the average of the $H$ solutions obtained on each sample of $N \times T$ observations, although it is computationally more efficient.

88 Other possibilities recently developed in the literature include the use of a logistic kernel of simulated latent utilities instead of endogenous variables (Bruins et al., 2018) or Monte Carlo importance sampling (Sauer and Taber, 2017). However, the possibility to use those techniques is model-dependent. As we argue next, our algorithm does not face problems in finding an accurate solution.

B.4 Identification

In this Appendix we demonstrate how parameters in $\Theta$ are identified for the same model as in our baseline specification, but with only two periods and abstracting from the effect of observables in income. For illustrative purposes, we first show how identification is achieved in the frictionless economy, and next we proceed to the model with switching costs. We refer to the Agriculture sector as A and to the Non-Agriculture sector as N. We denote $r_t = r^N_t - r^A_t$, $u^s_{it} = \theta^s_i + \varepsilon^s_{it}$ for $s = A, N$ and $\tilde{\sigma}^2_{ks} = \sigma^2_{ks} - \sigma_{kAN}$ for $k = u, \theta$ and $s = A, N$. Notice that $\sigma^2_{us} = \sigma^2_{\theta s} + \sigma^2_{\varepsilon s}$ for $s = A, N, AN$. Further, we denote the standard deviation of $u_{it} \equiv (u^A_{it} - u^N_{it})$ as $\sigma^*_u = \sqrt{\tilde{\sigma}^2_{uA} + \tilde{\sigma}^2_{uN}}$ and the standard deviation of $\theta_i \equiv (\theta^A_i - \theta^N_i)$ as $\sigma^*_\theta = \sqrt{\tilde{\sigma}^2_{\theta A} + \tilde{\sigma}^2_{\theta N}}$.

Model for the frictionless economy

Without switching costs, sectoral decisions do not depend on workers' histories, so the model behaves in each period $t$ as the standard Roy model with comparative advantage $u^s_{it}$, where, excluding the variance of measurement error, we can identify the variance matrix $\Sigma_u$ and the prices of human capital $r^s_t$ from cross-sectional data. This is a consequence of the normality assumptions on the distribution of both $(\theta^A_i, \theta^N_i)$ and $(\varepsilon^A_{it}, \varepsilon^N_{it})$, which imply that in each period $(u^A_{it}, u^N_{it})$ is jointly normally distributed with variance $\Sigma_u$, and hence standard arguments of Heckman and Honoré (1990) for identification in the normal case can be applied. However, only with panel data can we decompose $\Sigma_u$ into $\Sigma_\theta$ and $\Sigma_\varepsilon$, the variances of the permanent and transitory components respectively, and identify $\sigma^2_\nu$, the variance of measurement error, using the information obtained from the switching workers in the panel.

Letting $\lambda(\cdot) = \phi(\cdot)/\hat{\Phi}(\cdot)$ with $\phi$ and $\hat{\Phi}$ the PDF and CDF of a standard normal, and using properties of normal random variables following Heckman and Honoré (1990), we can obtain the following derivations for the first three observed moments of the income distribution in each period $t$:

$$P(t = N) = \hat{\Phi}\left(\tfrac{r_t}{\sigma^*_u}\right) \quad \text{(B.1)}$$

$$E\left(y^N_{it} \mid t = N\right) = r^N_t + \frac{\tilde{\sigma}^2_{uN}}{\sigma^*_u}\lambda\left(\tfrac{r_t}{\sigma^*_u}\right) \quad \text{(B.2)}$$

$$E\left(y^A_{it} \mid t = A\right) = r^A_t + \frac{\tilde{\sigma}^2_{uA}}{\sigma^*_u}\lambda\left(-\tfrac{r_t}{\sigma^*_u}\right) \quad \text{(B.3)}$$

$$Var\left(y^N_{it} \mid t = N\right) = \sigma^2_{uN} + \left(\frac{\tilde{\sigma}^2_{uN}}{\sigma^*_u}\right)^2\left[-\lambda\left(\tfrac{r_t}{\sigma^*_u}\right)\tfrac{r_t}{\sigma^*_u} - \lambda^2\left(\tfrac{r_t}{\sigma^*_u}\right)\right] + \sigma^2_\nu \quad \text{(B.4)}$$

$$Var\left(y^A_{it} \mid t = A\right) = \sigma^2_{uA} + \left(\frac{\tilde{\sigma}^2_{uA}}{\sigma^*_u}\right)^2\left[\lambda\left(-\tfrac{r_t}{\sigma^*_u}\right)\tfrac{r_t}{\sigma^*_u} - \lambda^2\left(-\tfrac{r_t}{\sigma^*_u}\right)\right] + \sigma^2_\nu \quad \text{(B.5)}$$

$$E\left(\left[y^N_{it} - E\left(y^N_{it} \mid t = N\right)\right]^3 \mid t = N\right) = \left(\frac{\tilde{\sigma}^2_{uN}}{\sigma^*_u}\right)^3\lambda\left(\tfrac{r_t}{\sigma^*_u}\right)\left[2\lambda^2\left(\tfrac{r_t}{\sigma^*_u}\right) + 3\lambda\left(\tfrac{r_t}{\sigma^*_u}\right)\tfrac{r_t}{\sigma^*_u} + \left(\tfrac{r_t}{\sigma^*_u}\right)^2 - 1\right] \quad \text{(B.6)}$$

$$E\left(\left[y^A_{it} - E\left(y^A_{it} \mid t = A\right)\right]^3 \mid t = A\right) = \left(\frac{\tilde{\sigma}^2_{uA}}{\sigma^*_u}\right)^3\lambda\left(-\tfrac{r_t}{\sigma^*_u}\right)\left[2\lambda^2\left(-\tfrac{r_t}{\sigma^*_u}\right) - 3\lambda\left(-\tfrac{r_t}{\sigma^*_u}\right)\tfrac{r_t}{\sigma^*_u} + \left(\tfrac{r_t}{\sigma^*_u}\right)^2 - 1\right] \quad \text{(B.7)}$$

With information of $T = 2$ repeated cross sections to compute the LHS of this system of 14 equations, we can identify $r^N_t$, $r^A_t$ and the combination $\sigma^2_\nu I_2 + \Sigma_u$ (8 parameters). Let us now show the additional information we can obtain from panel data. We exploit the property that $(u_{it}, u_{it'})$ for $t' \neq t$ and $(u_{it}, u^s_{it})$ for $s = A, N$ are jointly normally distributed, since each element is the sum of two normally distributed random variables. Denoting the CDF of a bivariate normal distribution with mean $0'$ and variance $\Sigma$ evaluated at the vector $A$ as $\Phi(A, \Sigma)$, the probability of transition from N to N is given by:

$$P(2 = N, 1 = N) = P\left(\{r^A_2 + u^A_{i2} < r^N_2 + u^N_{i2}\}, \{r^A_1 + u^A_{i1} < r^N_1 + u^N_{i1}\}\right) = P\left(\{u_{i2} < r_2\}, \{u_{i1} < r_1\}\right) = \Phi\left(\vec{r}_{NN}, \Sigma_T\right) \quad \text{(B.8)}$$

with $\vec{r}_{NN} = [r_2, r_1]'$ and $\Sigma_T = \begin{bmatrix} \sigma^{*2}_u & \sigma^{*2}_\theta \\ \sigma^{*2}_\theta & \sigma^{*2}_u \end{bmatrix}$. The probability of transition from N to A is given by:

$$P(2 = A, 1 = N) = P\left(\{-u_{i2} < -r_2\}, \{u_{i1} < r_1\}\right) = \Phi\left(\vec{r}_{NA}, \Sigma_W\right) \quad \text{(B.9)}$$

with $\vec{r}_{NA} = [-r_2, r_1]'$ and $\Sigma_W = \begin{bmatrix} \sigma^{*2}_u & -\sigma^{*2}_\theta \\ -\sigma^{*2}_\theta & \sigma^{*2}_u \end{bmatrix}$. Similarly:

$$P(2 = N, 1 = A) = \Phi\left(\vec{r}_{AN}, \Sigma_W\right) \quad \text{(B.10)}$$

$$P(2 = A, 1 = A) = \Phi\left(\vec{r}_{AA}, \Sigma_T\right) \quad \text{(B.11)}$$

with $\vec{r}_{AN} = [r_2, -r_1]'$ and $\vec{r}_{AA} = [-r_2, -r_1]'$.

Now consider the values of the expected income in the second period for each transition group of workers. In the frictionless economy we do not directly need those expected values, but we illustrate here how to compute them to introduce some notation that we use hereafter.
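The income of stayers in N in period 2 is given by:

$$E\left(y^N_{i2} \mid 2 = N, 1 = N\right) = r^N_2 + E\left(u^N_{i2} \mid 2 = N, 1 = N\right) = r^N_2 + E\left(u^N_{i2} \mid \{u_{i2} < r_2\}, \{u_{i1} < r_1\}\right)$$

Notice that the expected value in the second term of the RHS can be expressed as:

$$E\left(X_1^{k_1} X_2^{k_2} X_3^{k_3} \mid -\infty < X_i < b_i,\ i = 1, 2, 3\right) \quad \text{(B.12)}$$

with $X_1 = u^N_{i2}$, $X_2 = u_{i2}$, $X_3 = u_{i1}$, $k_1 = 1$, $k_2 = k_3 = 0$, $b_1 = \infty$, $b_2 = r_2$ and $b_3 = r_1$. This expected value is the moment of the upper truncated multivariate normal distribution with mean 0 and variance:

$$\Sigma_{NN} = \begin{bmatrix} \sigma^2_{uN} & -\tilde{\sigma}^2_{uN} & -\tilde{\sigma}^2_{\theta N} \\ -\tilde{\sigma}^2_{uN} & \sigma^{*2}_u & \sigma^{*2}_\theta \\ -\tilde{\sigma}^2_{\theta N} & \sigma^{*2}_\theta & \sigma^{*2}_u \end{bmatrix} = \begin{bmatrix} \sigma^2_{uN} & \Lambda_{NN} \\ \Lambda'_{NN} & \Sigma_T \end{bmatrix}$$

where the vector $\Lambda_{NN}$ is defined as $\Lambda_{NN} = [-\tilde{\sigma}^2_{uN}, -\tilde{\sigma}^2_{\theta N}]$. In general terms, we can denote the expected value in (B.12) for the particular case $k_2 = k_3 = 0$ and $b_1 = \infty$ as the function $M_3(\cdot)$ of the variance matrix $\Sigma$, the elements $b_2$ and $b_3$ stacked up in a vector $B$ and the coefficient $k = k_1$, that is:

$$M_3(B, \Sigma, k) \equiv E\left(X_1^k \mid -\infty < X_1 < \infty,\ -\infty < X_2 < B_1,\ -\infty < X_3 < B_2\right)$$

with $\{X_1, X_2, X_3\} \sim N(0, \Sigma)$. Then we can rewrite:

$$E\left(y^N_{i2} \mid 2 = N, 1 = N\right) = r^N_2 + M_3\left(\vec{r}_{NN}, \Sigma_{NN}, 1\right)$$

To evaluate $M_3(\cdot)$ we can use for example the recurrence relations developed by Kan and Robotti (2017) to compute numerically the moment generating function of the truncated multivariate normal distribution (first obtained by Tallis (1961))89. Following similar arguments, we can show that the income of each remaining transition group in period 2 is given by:

$$E\left(y^A_{i2} \mid 2 = A, 1 = N\right) = r^A_2 + M_3\left(\vec{r}_{NA}, \Sigma_{NA}, 1\right)$$

$$E\left(y^N_{i2} \mid 2 = N, 1 = A\right) = r^N_2 + M_3\left(\vec{r}_{AN}, \Sigma_{AN}, 1\right)$$

$$E\left(y^A_{i2} \mid 2 = A, 1 = A\right) = r^A_2 + M_3\left(\vec{r}_{AA}, \Sigma_{AA}, 1\right)$$

with $\Sigma_{NA} = \begin{bmatrix} \sigma^2_{uA} & \Lambda_{NA} \\ \Lambda'_{NA} & \Sigma_W \end{bmatrix}$, $\Sigma_{AN} = \begin{bmatrix} \sigma^2_{uN} & \Lambda_{AN} \\ \Lambda'_{AN} & \Sigma_W \end{bmatrix}$ and $\Sigma_{AA} = \begin{bmatrix} \sigma^2_{uA} & \Lambda_{AA} \\ \Lambda'_{AA} & \Sigma_T \end{bmatrix}$, where $\Lambda_{NA} = [-\tilde{\sigma}^2_{uA}, \tilde{\sigma}^2_{\theta A}]$, $\Lambda_{AN} = [-\tilde{\sigma}^2_{uN}, \tilde{\sigma}^2_{\theta N}]$ and $\Lambda_{AA} = [-\tilde{\sigma}^2_{uA}, -\tilde{\sigma}^2_{\theta A}]$.

89 Particularly, we can use the function multivutmom developed by Kan and Robotti (2017) in the Matlab package ftnorm. The instruction to compute $M_3(B, \Sigma, k)$ is simply multivutmom([k 0 0], [inf B1 B2], [0 0 0], Σ).

Notice that the second and third moments of each transition group can be computed as functions of $M_3(\cdot, \cdot, 2)$ and $M_3(\cdot, \cdot, 3)$ respectively.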
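A simple way to cross-check a numerical evaluation of $M_3(B, \Sigma, k)$ is by simulation. The Python sketch below is an illustration under our own assumptions, not the Kan and Robotti (2017) recurrence: the function name, the number of draws and the example covariance matrix are hypothetical. It draws from $N(0, \Sigma)$, keeps the draws satisfying the two upper truncations, and averages $X_1^k$. The example uses a matrix in which $X_3$ is independent of the other two variables, so that the first moment has the familiar univariate closed form $-\rho\,\phi(b)/\hat{\Phi}(b)$.

```python
import math
import numpy as np

def m3_monte_carlo(B, Sigma, k, n_draws=400_000, seed=0):
    """Simulation estimate of M3(B, Sigma, k) = E[X1**k | X2 < B[0], X3 < B[1]]
    for (X1, X2, X3) ~ N(0, Sigma). Illustrative helper, not the ftnorm routine."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(np.zeros(3), np.asarray(Sigma, dtype=float),
                                    size=n_draws)
    keep = (draws[:, 1] < B[0]) & (draws[:, 2] < B[1])   # upper truncation on X2, X3
    return float(np.mean(draws[keep, 0] ** k))

# Hypothetical example: X1 and X2 have correlation rho, X3 is independent.
rho, b = 0.5, 0.3
Sigma = [[1.0, rho, 0.0],
         [rho, 1.0, 0.0],
         [0.0, 0.0, 1.0]]

estimate = m3_monte_carlo([b, 0.0], Sigma, k=1)

# Closed form for this special case: E[X1 | X2 < b] = -rho * pdf(b) / cdf(b).
pdf = math.exp(-0.5 * b * b) / math.sqrt(2.0 * math.pi)
cdf = 0.5 * (1.0 + math.erf(b / math.sqrt(2.0)))
closed_form = -rho * pdf / cdf

print(estimate, closed_form)
```

Simulation is much slower and noisier than the recurrence relations, so it is useful mainly as a sanity check on the sign and rough magnitude of the truncated moments.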
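We do not need those expressions here, so we deduce those moments only for the model with switching costs.

Now let us compute the moments of the growth in income for switchers. For switching workers from A to N, the first moment is:

$$E\left(y^N_{i2} - y^A_{i1} \mid 2 = N, 1 = A\right) = r^N_2 - r^A_1 + E\left(u^N_{i2} - u^A_{i1} \mid 2 = N, 1 = A\right) = r^N_2 - r^A_1 + E\left(u^N_{i2} - u^A_{i1} \mid \{u_{i2} < r_2\}, \{-u_{i1} < -r_1\}\right) = r^N_2 - r^A_1 + M_3\left(\vec{r}_{AN}, \tilde{\Sigma}_{AN}, 1\right) \quad \text{(B.13)}$$

with $\tilde{\Sigma}_{AN} = \begin{bmatrix} \sigma^2_{uA} + \sigma^2_{uN} - 2\sigma_{\theta AN} & \tilde{\Lambda}_{AN} \\ \tilde{\Lambda}'_{AN} & \Sigma_W \end{bmatrix}$ and $\tilde{\Lambda}_{AN} = [-\tilde{\sigma}^2_{uN} - \tilde{\sigma}^2_{\theta A},\ \tilde{\sigma}^2_{uA} + \tilde{\sigma}^2_{\theta N}]$. Similarly, the expected value of the growth in income for switchers from N to A is:

$$E\left(y^A_{i2} - y^N_{i1} \mid 2 = A, 1 = N\right) = r^A_2 - r^N_1 + M_3\left(\vec{r}_{NA}, \tilde{\Sigma}_{NA}, 1\right) \quad \text{(B.14)}$$

where $\tilde{\Sigma}_{NA} = \begin{bmatrix} \sigma^2_{uA} + \sigma^2_{uN} - 2\sigma_{\theta AN} & \tilde{\Lambda}_{NA} \\ \tilde{\Lambda}'_{NA} & \Sigma_W \end{bmatrix}$ and $\tilde{\Lambda}_{NA} = [-\tilde{\sigma}^2_{uA} - \tilde{\sigma}^2_{\theta N},\ \tilde{\sigma}^2_{uN} + \tilde{\sigma}^2_{\theta A}]$. The variances of the growth in income for switchers are defined as:

$$Var\left(y^N_{i2} - y^A_{i1} \mid 2 = N, 1 = A\right) = E\left(\left(u^N_{i2} - u^A_{i1}\right)^2 \mid \{u_{i2} < r_2\}, \{-u_{i1} < -r_1\}\right) - E\left(u^N_{i2} - u^A_{i1} \mid 2 = N, 1 = A\right)^2 + 2\sigma^2_\nu = M_3\left(\vec{r}_{AN}, \tilde{\Sigma}_{AN}, 2\right) - \left(M_3\left(\vec{r}_{AN}, \tilde{\Sigma}_{AN}, 1\right)\right)^2 + 2\sigma^2_\nu \quad \text{(B.15)}$$

And similarly:

$$Var\left(y^A_{i2} - y^N_{i1} \mid 2 = A, 1 = N\right) = M_3\left(\vec{r}_{NA}, \tilde{\Sigma}_{NA}, 2\right) - \left(M_3\left(\vec{r}_{NA}, \tilde{\Sigma}_{NA}, 1\right)\right)^2 + 2\sigma^2_\nu \quad \text{(B.16)}$$

The system of 22 equations (B.1)-(B.11) and (B.13)-(B.16) has a unique solution for the 10 elements of $\Theta$. We verified this after extensive experimentation using global solvers over a broad range of feasible values for $\Theta$. This shows that the cross-sectional moments, the transition probabilities across waves for each group of workers and the first two moments of the income growth for switchers are enough moments to identify the full set of parameters.

Model with switching costs across sectors

In the model with switching costs, we will require exactly the same set of 22 moments computed above to identify the 12 elements of $\Theta$. The difficulty in obtaining expressions for those moments lies in the fact that sectoral decisions now depend on workers' histories, and hence all moments, including the cross-sectional ones, depend on the income distributions of the previous periods. We present here the general rules for deriving expressions for those moments.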
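Denote the CDF of a multivariate normal distribution with mean $0'$ and variance $\Sigma$ evaluated at the vector $A$ as $\Phi(A, \Sigma)$. To compute the cross-sectional moments in period 1, we first need the distribution of sectoral choices, which depends on the frictionless decisions in period zero. The probability of choosing non-agriculture in period 1 is:

$$\begin{aligned} P(1 = N) &= P(1 = N, 0 = A) + P(1 = N, 0 = N) \\ &= P\left(\{r^A_1 + u^A_{i1} < r^N_1 + u^N_{i1} - \ln\phi_{AN}\}, \{r^N_0 + u^N_{i0} < r^A_0 + u^A_{i0}\}\right) + P\left(\{r^A_1 + u^A_{i1} - \ln\phi_{NA} < r^N_1 + u^N_{i1}\}, \{r^A_0 + u^A_{i0} < r^N_0 + u^N_{i0}\}\right) \\ &= P\left(\{u_{i1} < r_1 - \ln\phi_{AN}\}, \{-u_{i0} < -r_0\}\right) + P\left(\{u_{i1} < r_1 + \ln\phi_{NA}\}, \{u_{i0} < r_0\}\right) \\ &= \Phi\left(\vec{r}_{AN}, \Sigma_W\right) + \Phi\left(\vec{r}_{NN}, \Sigma_T\right) \end{aligned}$$

where now $\vec{r}_{AN} = [r_1 - \ln\phi_{AN}, -r_0]'$, $\vec{r}_{NN} = [r_1 + \ln\phi_{NA}, r_0]'$ and $\Sigma_W$ and $\Sigma_T$ are as in the model without switching costs. The expected income in N in the first period is:

$$\begin{aligned} E\left(y^N_{i1} \mid 1 = N\right) &= r^N_1 + E\left(u^N_{i1} \mid 1 = N\right) \\ &= r^N_1 + \frac{E\left(u^N_{i1} \mid 1 = N, 0 = A\right)P(1 = N, 0 = A) + E\left(u^N_{i1} \mid 1 = N, 0 = N\right)P(1 = N, 0 = N)}{P(1 = N)} \\ &= r^N_1 + \frac{M_3\left(\vec{r}_{AN}, \Sigma_{AN}, 1\right)\Phi\left(\vec{r}_{AN}, \Sigma_W\right) + M_3\left(\vec{r}_{NN}, \Sigma_{NN}, 1\right)\Phi\left(\vec{r}_{NN}, \Sigma_T\right)}{\Phi\left(\vec{r}_{AN}, \Sigma_W\right) + \Phi\left(\vec{r}_{NN}, \Sigma_T\right)} \end{aligned}$$

with $M_3(\cdot)$, $\Sigma_{AN}$, $\Sigma_{NN}$ as in the model without switching costs. Similarly:

$$E\left(y^A_{i1} \mid 1 = A\right) = r^A_1 + \frac{M_3\left(\vec{r}_{AA}, \Sigma_{AA}, 1\right)\Phi\left(\vec{r}_{AA}, \Sigma_T\right) + M_3\left(\vec{r}_{NA}, \Sigma_{NA}, 1\right)\Phi\left(\vec{r}_{NA}, \Sigma_W\right)}{\Phi\left(\vec{r}_{AA}, \Sigma_T\right) + \Phi\left(\vec{r}_{NA}, \Sigma_W\right)}$$

where $\vec{r}_{AA} = [-r_1 + \ln\phi_{AN}, -r_0]'$, $\vec{r}_{NA} = [-r_1 - \ln\phi_{NA}, r_0]'$ and $\Sigma_{AA}$, $\Sigma_{NA}$ are as in the model without switching costs.

The variances can be computed simply by:

$$Var\left(y^N_{i1} \mid 1 = N\right) = \frac{M_3\left(\vec{r}_{AN}, \Sigma_{AN}, 2\right)\Phi\left(\vec{r}_{AN}, \Sigma_W\right) + M_3\left(\vec{r}_{NN}, \Sigma_{NN}, 2\right)\Phi\left(\vec{r}_{NN}, \Sigma_T\right)}{\Phi\left(\vec{r}_{AN}, \Sigma_W\right) + \Phi\left(\vec{r}_{NN}, \Sigma_T\right)} - E\left(u^N_{i1} \mid 1 = N\right)^2 + \sigma^2_\nu$$

$$Var\left(y^A_{i1} \mid 1 = A\right) = \frac{M_3\left(\vec{r}_{AA}, \Sigma_{AA}, 2\right)\Phi\left(\vec{r}_{AA}, \Sigma_T\right) + M_3\left(\vec{r}_{NA}, \Sigma_{NA}, 2\right)\Phi\left(\vec{r}_{NA}, \Sigma_W\right)}{\Phi\left(\vec{r}_{AA}, \Sigma_T\right) + \Phi\left(\vec{r}_{NA}, \Sigma_W\right)} - E\left(u^A_{i1} \mid 1 = A\right)^2 + \sigma^2_\nu$$

The third central moments are computed as:

$$E\left(\left[y^N_{i1} - E\left(y^N_{i1} \mid 1 = N\right)\right]^3 \mid 1 = N\right) = \frac{M_3\left(\vec{r}_{AN}, \Sigma_{AN}, 3\right)\Phi\left(\vec{r}_{AN}, \Sigma_W\right) + M_3\left(\vec{r}_{NN}, \Sigma_{NN}, 3\right)\Phi\left(\vec{r}_{NN}, \Sigma_T\right)}{\Phi\left(\vec{r}_{AN}, \Sigma_W\right) + \Phi\left(\vec{r}_{NN}, \Sigma_T\right)} - 3E\left(u^N_{i1} \mid 1 = N\right)\left[Var\left(y^N_{i1} \mid 1 = N\right) - \sigma^2_\nu\right] - \left[E\left(u^N_{i1} \mid 1 = N\right)\right]^3$$

$$E\left(\left[y^A_{i1} - E\left(y^A_{i1} \mid 1 = A\right)\right]^3 \mid 1 = A\right) = \frac{M_3\left(\vec{r}_{AA}, \Sigma_{AA}, 3\right)\Phi\left(\vec{r}_{AA}, \Sigma_T\right) + M_3\left(\vec{r}_{NA}, \Sigma_{NA}, 3\right)\Phi\left(\vec{r}_{NA}, \Sigma_W\right)}{\Phi\left(\vec{r}_{AA}, \Sigma_T\right) + \Phi\left(\vec{r}_{NA}, \Sigma_W\right)} - 3E\left(u^A_{i1} \mid 1 = A\right)\left[Var\left(y^A_{i1} \mid 1 = A\right) - \sigma^2_\nu\right] - \left[E\left(u^A_{i1} \mid 1 = A\right)\right]^3$$

Now let us examine the cross-sectional moments for period 2. To compute the probability of being in a sector, we need to know the probability of occurrence of all possible paths that an individual can exhibit before choosing a given sector90. That is, $P(2 = N) = P(2 = N, 1 = N) + P(2 = N, 1 = A)$, where in turn the probability of transition from N to N is given by:

$$\begin{aligned} P(2 = N, 1 = N) &= P(2 = N, 1 = N, 0 = A) + P(2 = N, 1 = N, 0 = N) \\ &= P\left(\{u_{i2} < r_2 + \ln\phi_{NA}\}, \{u_{i1} < r_1 - \ln\phi_{AN}\}, \{-u_{i0} < -r_0\}\right) + P\left(\{u_{i2} < r_2 + \ln\phi_{NA}\}, \{u_{i1} < r_1 + \ln\phi_{NA}\}, \{u_{i0} < r_0\}\right) \\ &= \Phi\left(\vec{r}_{ANN}, \Sigma_{WT}\right) + \Phi\left(\vec{r}_{NNN}, \Sigma_{TT}\right) \end{aligned}$$

with $\vec{r}_{ANN} = [r_2 + \ln\phi_{NA}, \vec{r}\,'_{AN}]'$, $\vec{r}_{NNN} = [r_2 + \ln\phi_{NA}, \vec{r}\,'_{NN}]'$ and:

$$\Sigma_{WT} = \begin{bmatrix} \sigma^{*2}_u & \sigma^{*2}_\theta & -\sigma^{*2}_\theta \\ \sigma^{*2}_\theta & \sigma^{*2}_u & -\sigma^{*2}_\theta \\ -\sigma^{*2}_\theta & -\sigma^{*2}_\theta & \sigma^{*2}_u \end{bmatrix}, \quad \Sigma_{TT} = \begin{bmatrix} \sigma^{*2}_u & \sigma^{*2}_\theta & \sigma^{*2}_\theta \\ \sigma^{*2}_\theta & \sigma^{*2}_u & \sigma^{*2}_\theta \\ \sigma^{*2}_\theta & \sigma^{*2}_\theta & \sigma^{*2}_u \end{bmatrix}$$

Following similar arguments, we can show that the remaining probabilities of transition can be expressed as:

$$P(2 = A, 1 = N) = \Phi\left(\vec{r}_{ANA}, \Sigma_{WW}\right) + \Phi\left(\vec{r}_{NNA}, \Sigma_{TW}\right)$$

$$P(2 = A, 1 = A) = \Phi\left(\vec{r}_{AAA}, \Sigma_{TT}\right) + \Phi\left(\vec{r}_{NAA}, \Sigma_{WT}\right)$$

$$P(2 = N, 1 = A) = \Phi\left(\vec{r}_{AAN}, \Sigma_{TW}\right) + \Phi\left(\vec{r}_{NAN}, \Sigma_{WW}\right)$$

with $\vec{r}_{ANA} = [-r_2 - \ln\phi_{NA}, \vec{r}\,'_{AN}]'$, $\vec{r}_{NNA} = [-r_2 - \ln\phi_{NA}, \vec{r}\,'_{NN}]'$, $\vec{r}_{AAA} = [-r_2 + \ln\phi_{AN}, \vec{r}\,'_{AA}]'$, $\vec{r}_{NAA} = [-r_2 + \ln\phi_{AN}, \vec{r}\,'_{NA}]'$, $\vec{r}_{AAN} = [r_2 - \ln\phi_{AN}, \vec{r}\,'_{AA}]'$, $\vec{r}_{NAN} = [r_2 - \ln\phi_{AN}, \vec{r}\,'_{NA}]'$ and:

$$\Sigma_{WW} = \begin{bmatrix} \sigma^{*2}_u & -\sigma^{*2}_\theta & \sigma^{*2}_\theta \\ -\sigma^{*2}_\theta & \sigma^{*2}_u & -\sigma^{*2}_\theta \\ \sigma^{*2}_\theta & -\sigma^{*2}_\theta & \sigma^{*2}_u \end{bmatrix}, \quad \Sigma_{TW} = \begin{bmatrix} \sigma^{*2}_u & -\sigma^{*2}_\theta & -\sigma^{*2}_\theta \\ -\sigma^{*2}_\theta & \sigma^{*2}_u & \sigma^{*2}_\theta \\ -\sigma^{*2}_\theta & \sigma^{*2}_\theta & \sigma^{*2}_u \end{bmatrix}$$

90 Unfortunately, we cannot use Bayes' rule to derive the expressions of the joint probability from the marginals, since for the latter ones there is no closed form solution.

Now consider the values of the expected income in the second period.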
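Again, we need to know the expected income for each transition group. The income of stayers in N in period 2 is given by:

$$E\left(y^N_{i2} \mid 2 = N, 1 = N\right) = r^N_2 + E\left(u^N_{i2} \mid 2 = N, 1 = N\right) = r^N_2 + \frac{E\left(u^N_{i2} \mid 2 = N, 1 = N, 0 = A\right)P(2 = N, 1 = N, 0 = A) + E\left(u^N_{i2} \mid 2 = N, 1 = N, 0 = N\right)P(2 = N, 1 = N, 0 = N)}{P(2 = N, 1 = N)}$$

Similarly as above, consider the moment $E\left(X_1^{k_1} X_2^{k_2} X_3^{k_3} X_4^{k_4} \mid -\infty < X_i < b_i,\ i = 1, 2, 3, 4\right)$ of the upper truncated multivariate normal distribution $N(0, \Sigma)$ with $k_2 = k_3 = k_4 = 0$ and $b_1 = \infty$ as a function $M_4(B, \Sigma, k)$ of the variance $\Sigma$, the elements $b_2$, $b_3$ and $b_4$ stacked up in a vector $B$ and the coefficient $k = k_1$, that is:

$$M_4(B, \Sigma, k) \equiv E\left(X_1^k \mid -\infty < X_1 < \infty,\ -\infty < X_2 < B_1,\ -\infty < X_3 < B_2,\ -\infty < X_4 < B_3\right)$$

So we can express:

$$E\left(y^N_{i2} \mid 2 = N, 1 = N\right) = r^N_2 + \frac{M_4\left(\vec{r}_{ANN}, \Sigma_{ANN}, 1\right)\Phi\left(\vec{r}_{ANN}, \Sigma_{WT}\right) + M_4\left(\vec{r}_{NNN}, \Sigma_{NNN}, 1\right)\Phi\left(\vec{r}_{NNN}, \Sigma_{TT}\right)}{\Phi\left(\vec{r}_{ANN}, \Sigma_{WT}\right) + \Phi\left(\vec{r}_{NNN}, \Sigma_{TT}\right)}$$

with $\Sigma_{NNN} = \begin{bmatrix} \sigma^2_{uN} & \Lambda_{NNN} \\ \Lambda'_{NNN} & \Sigma_{TT} \end{bmatrix}$ and $\Sigma_{ANN} = \begin{bmatrix} \sigma^2_{uN} & \Lambda_{ANN} \\ \Lambda'_{ANN} & \Sigma_{WT} \end{bmatrix}$, where $\Lambda_{NNN} = [-\tilde{\sigma}^2_{uN}, -\tilde{\sigma}^2_{\theta N}, -\tilde{\sigma}^2_{\theta N}]$ and $\Lambda_{ANN} = [-\tilde{\sigma}^2_{uN}, -\tilde{\sigma}^2_{\theta N}, \tilde{\sigma}^2_{\theta N}]$. Similarly, the expected incomes in period 2 for the remaining groups are:

$$E\left(y^A_{i2} \mid 2 = A, 1 = N\right) = r^A_2 + \frac{M_4\left(\vec{r}_{ANA}, \Sigma_{ANA}, 1\right)\Phi\left(\vec{r}_{ANA}, \Sigma_{WW}\right) + M_4\left(\vec{r}_{NNA}, \Sigma_{NNA}, 1\right)\Phi\left(\vec{r}_{NNA}, \Sigma_{TW}\right)}{\Phi\left(\vec{r}_{ANA}, \Sigma_{WW}\right) + \Phi\left(\vec{r}_{NNA}, \Sigma_{TW}\right)}$$

$$E\left(y^A_{i2} \mid 2 = A, 1 = A\right) = r^A_2 + \frac{M_4\left(\vec{r}_{AAA}, \Sigma_{AAA}, 1\right)\Phi\left(\vec{r}_{AAA}, \Sigma_{TT}\right) + M_4\left(\vec{r}_{NAA}, \Sigma_{NAA}, 1\right)\Phi\left(\vec{r}_{NAA}, \Sigma_{WT}\right)}{\Phi\left(\vec{r}_{AAA}, \Sigma_{TT}\right) + \Phi\left(\vec{r}_{NAA}, \Sigma_{WT}\right)}$$

$$E\left(y^N_{i2} \mid 2 = N, 1 = A\right) = r^N_2 + \frac{M_4\left(\vec{r}_{AAN}, \Sigma_{AAN}, 1\right)\Phi\left(\vec{r}_{AAN}, \Sigma_{TW}\right) + M_4\left(\vec{r}_{NAN}, \Sigma_{NAN}, 1\right)\Phi\left(\vec{r}_{NAN}, \Sigma_{WW}\right)}{\Phi\left(\vec{r}_{AAN}, \Sigma_{TW}\right) + \Phi\left(\vec{r}_{NAN}, \Sigma_{WW}\right)}$$

with:

$$\Sigma_{ANA} = \begin{bmatrix} \sigma^2_{uA} & \Lambda_{ANA} \\ \Lambda'_{ANA} & \Sigma_{WW} \end{bmatrix}, \quad \Sigma_{NNA} = \begin{bmatrix} \sigma^2_{uA} & \Lambda_{NNA} \\ \Lambda'_{NNA} & \Sigma_{TW} \end{bmatrix}, \quad \Sigma_{AAA} = \begin{bmatrix} \sigma^2_{uA} & \Lambda_{AAA} \\ \Lambda'_{AAA} & \Sigma_{TT} \end{bmatrix}$$

$$\Sigma_{NAA} = \begin{bmatrix} \sigma^2_{uA} & \Lambda_{NAA} \\ \Lambda'_{NAA} & \Sigma_{WT} \end{bmatrix}, \quad \Sigma_{AAN} = \begin{bmatrix} \sigma^2_{uN} & \Lambda_{AAN} \\ \Lambda'_{AAN} & \Sigma_{TW} \end{bmatrix}, \quad \Sigma_{NAN} = \begin{bmatrix} \sigma^2_{uN} & \Lambda_{NAN} \\ \Lambda'_{NAN} & \Sigma_{WW} \end{bmatrix}$$

where $\Lambda_{ANA} = [-\tilde{\sigma}^2_{uA}, \tilde{\sigma}^2_{\theta A}, -\tilde{\sigma}^2_{\theta A}]$, $\Lambda_{NNA} = [-\tilde{\sigma}^2_{uA}, \tilde{\sigma}^2_{\theta A}, \tilde{\sigma}^2_{\theta A}]$, $\Lambda_{AAA} = [-\tilde{\sigma}^2_{uA}, -\tilde{\sigma}^2_{\theta A}, -\tilde{\sigma}^2_{\theta A}]$, $\Lambda_{NAA} = [-\tilde{\sigma}^2_{uA}, -\tilde{\sigma}^2_{\theta A}, \tilde{\sigma}^2_{\theta A}]$, $\Lambda_{AAN} = [-\tilde{\sigma}^2_{uN}, \tilde{\sigma}^2_{\theta N}, \tilde{\sigma}^2_{\theta N}]$, $\Lambda_{NAN} = [-\tilde{\sigma}^2_{uN}, \tilde{\sigma}^2_{\theta N}, -\tilde{\sigma}^2_{\theta N}]$. Combining the last three equations with the probabilities of transition into each sector, it is straightforward to derive the first moments of the cross-sectional distribution of earnings in period 2.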
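The second and third moments can be derived as in period 1, as functions of $M_4(\cdot, \cdot, 2)$ and $M_4(\cdot, \cdot, 3)$ respectively.

Finally consider the growth in income for switchers from A to N, that is:

$$E\left(y^N_{i2} - y^A_{i1} \mid 2 = N, 1 = A\right) = r^N_2 - r^A_1 + E\left(u^N_{i2} - u^A_{i1} \mid 2 = N, 1 = A\right) = r^N_2 - r^A_1 + \frac{E\left(u^N_{i2} - u^A_{i1} \mid 2 = N, 1 = A, 0 = A\right)P(2 = N, 1 = A, 0 = A) + E\left(u^N_{i2} - u^A_{i1} \mid 2 = N, 1 = A, 0 = N\right)P(2 = N, 1 = A, 0 = N)}{P(2 = N, 1 = A)}$$

The unknown terms are those that involve expected values, which can be obtained from $M_4\left(\vec{r}_{AAN}, \tilde{\Sigma}_{AAN}, 1\right)$ and $M_4\left(\vec{r}_{NAN}, \tilde{\Sigma}_{NAN}, 1\right)$ respectively, with:

$$\tilde{\Sigma}_{AAN} = \begin{bmatrix} \sigma^2_{uA} + \sigma^2_{uN} - 2\sigma_{\theta AN} & \tilde{\Lambda}_{AAN} \\ \tilde{\Lambda}'_{AAN} & \Sigma_{TW} \end{bmatrix}, \quad \tilde{\Sigma}_{NAN} = \begin{bmatrix} \sigma^2_{uA} + \sigma^2_{uN} - 2\sigma_{\theta AN} & \tilde{\Lambda}_{NAN} \\ \tilde{\Lambda}'_{NAN} & \Sigma_{WW} \end{bmatrix}$$

where:

$$\tilde{\Lambda}_{AAN} = \left[-\tilde{\sigma}^2_{uN} - \tilde{\sigma}^2_{\theta A},\ \tilde{\sigma}^2_{uA} + \tilde{\sigma}^2_{\theta N},\ \tilde{\sigma}^2_{uA} + \tilde{\sigma}^2_{\theta N}\right], \quad \tilde{\Lambda}_{NAN} = \left[-\tilde{\sigma}^2_{uN} - \tilde{\sigma}^2_{\theta A},\ \tilde{\sigma}^2_{uA} + \tilde{\sigma}^2_{\theta N},\ -\tilde{\sigma}^2_{uA} - \tilde{\sigma}^2_{\theta N}\right]$$

The variance can be expressed in terms of $M_4\left(\vec{r}_{AAN}, \tilde{\Sigma}_{AAN}, 2\right)$ and $M_4\left(\vec{r}_{NAN}, \tilde{\Sigma}_{NAN}, 2\right)$, as in the frictionless case.

Similarly, for the growth in income for switchers from N to A we need expressions for:

$$E\left(u^A_{i2} - u^N_{i1} \mid 2 = A, 1 = N, 0 = A\right), \quad E\left(u^A_{i2} - u^N_{i1} \mid 2 = A, 1 = N, 0 = N\right)$$

which can be obtained from $M_4\left(\vec{r}_{ANA}, \tilde{\Sigma}_{ANA}, 1\right)$ and $M_4\left(\vec{r}_{NNA}, \tilde{\Sigma}_{NNA}, 1\right)$ respectively, with:

$$\tilde{\Sigma}_{ANA} = \begin{bmatrix} \sigma^2_{uA} + \sigma^2_{uN} - 2\sigma_{\theta AN} & \tilde{\Lambda}_{ANA} \\ \tilde{\Lambda}'_{ANA} & \Sigma_{WW} \end{bmatrix}, \quad \tilde{\Sigma}_{NNA} = \begin{bmatrix} \sigma^2_{uA} + \sigma^2_{uN} - 2\sigma_{\theta AN} & \tilde{\Lambda}_{NNA} \\ \tilde{\Lambda}'_{NNA} & \Sigma_{TW} \end{bmatrix}$$

where:

$$\tilde{\Lambda}_{ANA} = \left[-\tilde{\sigma}^2_{uA} - \tilde{\sigma}^2_{\theta N},\ \tilde{\sigma}^2_{uN} + \tilde{\sigma}^2_{\theta A},\ -\tilde{\sigma}^2_{uN} - \tilde{\sigma}^2_{\theta A}\right]; \quad \tilde{\Lambda}_{NNA} = \left[-\tilde{\sigma}^2_{uA} - \tilde{\sigma}^2_{\theta N},\ \tilde{\sigma}^2_{uN} + \tilde{\sigma}^2_{\theta A},\ \tilde{\sigma}^2_{uN} + \tilde{\sigma}^2_{\theta A}\right]$$

The variance can be expressed in terms of $M_4\left(\vec{r}_{ANA}, \tilde{\Sigma}_{ANA}, 2\right)$ and $M_4\left(\vec{r}_{NNA}, \tilde{\Sigma}_{NNA}, 2\right)$.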
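As in the frictionless case, we verified that the system of 22 moments has a unique solution for the 12 elements of $\Theta$.

B.5 Proofs

Proof of Proposition 1

Under the assumptions of Proposition 1, the expression for expected log income growth of switchers from agriculture to non-agriculture given in (B.13) simplifies to:

$$E\left(y^N_{i2} - y^A_{i1} \mid 2 = N, 1 = A\right) = M_3\left(\vec{0}, \tilde{\Sigma}_{AN}, 1\right),$$

where $\tilde{\Sigma}_{AN}$ can be simplified to

$$\tilde{\Sigma}_{AN} = \begin{bmatrix} \sigma^{*2}_\theta + \sigma^2_{\varepsilon A} + \sigma^2_{\varepsilon N} & -\left(\sigma^{*2}_\theta + \sigma^2_{\varepsilon N}\right) & \sigma^{*2}_\theta + \sigma^2_{\varepsilon A} \\ -\left(\sigma^{*2}_\theta + \sigma^2_{\varepsilon N}\right) & \sigma^{*2}_\theta + \sigma^2_{\varepsilon A} + \sigma^2_{\varepsilon N} & -\sigma^{*2}_\theta \\ \sigma^{*2}_\theta + \sigma^2_{\varepsilon A} & -\sigma^{*2}_\theta & \sigma^{*2}_\theta + \sigma^2_{\varepsilon A} + \sigma^2_{\varepsilon N} \end{bmatrix}.$$

Re-write $\tilde{\Sigma}_{AN}$ in terms of the correlation matrix $C_{AN}$:

$$C_{AN} = \begin{bmatrix} 1 & \dfrac{-\left(\sigma^{*2}_\theta + \sigma^2_{\varepsilon N}\right)}{\sigma^{*2}_\theta + \sigma^2_{\varepsilon A} + \sigma^2_{\varepsilon N}} & \dfrac{\sigma^{*2}_\theta + \sigma^2_{\varepsilon A}}{\sigma^{*2}_\theta + \sigma^2_{\varepsilon A} + \sigma^2_{\varepsilon N}} \\ \dfrac{-\left(\sigma^{*2}_\theta + \sigma^2_{\varepsilon N}\right)}{\sigma^{*2}_\theta + \sigma^2_{\varepsilon A} + \sigma^2_{\varepsilon N}} & 1 & \dfrac{-\sigma^{*2}_\theta}{\sigma^{*2}_\theta + \sigma^2_{\varepsilon A} + \sigma^2_{\varepsilon N}} \\ \dfrac{\sigma^{*2}_\theta + \sigma^2_{\varepsilon A}}{\sigma^{*2}_\theta + \sigma^2_{\varepsilon A} + \sigma^2_{\varepsilon N}} & \dfrac{-\sigma^{*2}_\theta}{\sigma^{*2}_\theta + \sigma^2_{\varepsilon A} + \sigma^2_{\varepsilon N}} & 1 \end{bmatrix}$$

and denote $\rho^{AN}_{ij}$ the $(i, j)$ element of $C_{AN}$. Using our definition of $M_3(\cdot)$ and explicit formulas for the moments of the upper-truncated multivariate normal distribution in the trivariate case (derived from recurrence relations) from Kan and Robotti (2017), $M_3\left(\vec{0}, \tilde{\Sigma}_{AN}, 1\right)$ can be re-written as:

$$M_3\left(\vec{0}, \tilde{\Sigma}_{AN}, 1\right) = -\sqrt{\sigma^{*2}_\theta + \sigma^2_{\varepsilon A} + \sigma^2_{\varepsilon N}}\ \phi(0)\left[\rho^{AN}_{12}\frac{\Phi_2\left([\infty, 0]; \rho^{AN}_{13\cdot 2}\right)}{\Phi_3\left([\infty, 0, 0]; C_{AN}\right)} + \rho^{AN}_{13}\frac{\Phi_2\left([\infty, 0]; \rho^{AN}_{12\cdot 3}\right)}{\Phi_3\left([\infty, 0, 0]; C_{AN}\right)}\right]$$

with $\rho^{AN}_{ij\cdot k} = \dfrac{\rho^{AN}_{ij} - \rho^{AN}_{ik}\rho^{AN}_{jk}}{\sqrt{\left(1 - (\rho^{AN}_{ik})^2\right)\left(1 - (\rho^{AN}_{jk})^2\right)}}$. Noticing that in our case $\Phi_2\left([\infty, 0]; \rho^{AN}_{ij\cdot k}\right) = \tfrac{1}{2}\ \forall\ i, j, k$, we have

$$M_3\left(\vec{0}, \tilde{\Sigma}_{AN}, 1\right) = \frac{\phi(0)\left[\sigma^2_{\varepsilon N} - \sigma^2_{\varepsilon A}\right]}{2\sqrt{\sigma^{*2}_\theta + \sigma^2_{\varepsilon A} + \sigma^2_{\varepsilon N}}\ \Phi_3\left([\infty, 0, 0]; C_{AN}\right)},$$

which is positive if and only if $\sigma^2_{\varepsilon N} > \sigma^2_{\varepsilon A}$. Following the same steps we find the expected log income growth of switchers from non-agriculture to agriculture as:

$$M_3\left(\vec{0}, \tilde{\Sigma}_{NA}, 1\right) = -\frac{\phi(0)\left[\sigma^2_{\varepsilon N} - \sigma^2_{\varepsilon A}\right]}{2\sqrt{\sigma^{*2}_\theta + \sigma^2_{\varepsilon A} + \sigma^2_{\varepsilon N}}\ \Phi_3\left([\infty, 0, 0]; C_{NA}\right)}.$$

Furthermore, it can be verified that $\Phi_3\left([\infty, 0, 0]; C_{NA}\right) = \Phi_3\left([\infty, 0, 0]; C_{AN}\right)$, which implies that $M_3\left(\vec{0}, \tilde{\Sigma}_{AN}, 1\right)$ and $M_3\left(\vec{0}, \tilde{\Sigma}_{NA}, 1\right)$ have the opposite sign but the same magnitude.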
QED.

Appendix C

Appendix to Chapter 3

C.1 Additional figures

Figure C.1: Inter-sectoral gains and GDP per capita: Alternative specifications

[Figure: six scatter plots of inter-sectoral gains (Gains (%), vertical axis) against Log GDP per capita (horizontal axis), with one labeled point per country. Panels: (1) All WIOD sectors, GO spec., own country's cost shares, homogenous inputs; (2) All WIOD sectors, VA spec., US cost shares, homogenous inputs; (3) All WIOD sectors, GO spec., US cost shares, heterogenous inputs; (4) All WIOD sectors, GO spec., own country's cost shares, heterogenous inputs; (5) Only manufacturing, GO spec., US cost shares, homogenous inputs; (6) Only manufacturing, GO spec., own country's cost shares, homogenous inputs.]

Note: Averages 1994-2007. Data source: WIOD (Timmer et al., 2015), World Bank Development Indicators.

C.2 CES aggregator across sectors

With a CES aggregator of the form $Y^\varphi = \sum_s^S \beta_s Y_s^\varphi$, where $\varphi = \frac{\phi - 1}{\phi}$ and $\phi$ is the elasticity of substitution across sectors, the sectoral factor demand is now:

$$Z_{ls} = \frac{\alpha_{ls}\beta_s^\phi P_s^{1-\phi}/\bar{\xi}_{ls}}{\sum_{s'}^S \alpha_{ls'}\beta_{s'}^\phi P_{s'}^{1-\phi}/\bar{\xi}_{ls'}}\bar{Z}_l \quad \text{(C.1)}$$

Thus, in the efficient inter-industry allocation, not only factor intensities and revenue shares play a role, but also the efficient sectoral price indexes as indicators of productivity. The direction and strength of their influence depends on the magnitude of $\phi$. For $\phi > 1$ ($\phi < 1$), if factor intensities and shares of sectoral revenue are constant across sectors, factors should be allocated to more (less) productive sectors. The interaction of these three sectoral forces (factor intensities, revenue shares and aggregate productivities) is what determines the efficient inter-sectoral allocation. Notice that to find $\tilde{Z}_{ls}$ it is necessary to solve for $\tilde{P}_s$, which requires finding firms' output prices in the efficient allocation. These prices can be obtained by solving, through numerical algorithms, the non-linear system that includes all firm-level prices. Once the $\tilde{Z}_{ls}$ are obtained, it is simple to calculate both components, using the counterfactual aggregate output generated by $\tilde{A}_s$ and $Z_{ls}$. The variation between current output and this counterfactual represents the intra-sectoral gains, whereas the difference between this counterfactual and the allocatively efficient aggregate output represents the inter-sectoral gains:

$$Gains_{intra} = 100\left(\frac{\left(\sum_s^S \beta_s\left(\tilde{A}_s\prod_l^L Z_{ls}^{\alpha_{ls}}\right)^\varphi\right)^{\frac{1}{\varphi}}}{\left(\sum_s^S \beta_s\left(A_s\prod_l^L Z_{ls}^{\alpha_{ls}}\right)^\varphi\right)^{\frac{1}{\varphi}}} - 1\right); \quad Gains_{inter} = 100\left(\frac{\left(\sum_s^S \beta_s\left(\tilde{A}_s\prod_l^L \tilde{Z}_{ls}^{\alpha_{ls}}\right)^\varphi\right)^{\frac{1}{\varphi}}}{\left(\sum_s^S \beta_s\left(\tilde{A}_s\prod_l^L Z_{ls}^{\alpha_{ls}}\right)^\varphi\right)^{\frac{1}{\varphi}}} - 1\right)$$

Total gains can be calculated in the same way as in (3.11).
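The decomposition into intra- and inter-sectoral gains can be illustrated with a short Python sketch. Everything in it is a hypothetical toy example: the function names, the two-sector, two-factor dimensions and all numerical inputs are assumptions made for illustration, and the "efficient" allocation passed in is simply an alternative allocation with the same factor totals, not the solution of (C.1).

```python
import numpy as np

def ces_output(A, Z, alpha, beta, phi):
    """Aggregate output Y = (sum_s beta_s * Y_s**varphi)**(1/varphi), where
    Y_s = A_s * prod_l Z_ls**alpha_ls and varphi = (phi - 1)/phi (requires phi != 1).
    Z and alpha are L x S arrays (factors in rows, sectors in columns)."""
    varphi = (phi - 1.0) / phi
    sector_output = A * np.prod(Z ** alpha, axis=0)
    return float(np.sum(beta * sector_output ** varphi) ** (1.0 / varphi))

def gains_decomposition(A, A_eff, Z, Z_eff, alpha, beta, phi):
    """Intra- and inter-sectoral gains (in percent) as defined in the text."""
    y_actual = ces_output(A, Z, alpha, beta, phi)             # current output
    y_intra = ces_output(A_eff, Z, alpha, beta, phi)          # efficient TFP, current factors
    y_efficient = ces_output(A_eff, Z_eff, alpha, beta, phi)  # efficient TFP and factors
    gains_intra = 100.0 * (y_intra / y_actual - 1.0)
    gains_inter = 100.0 * (y_efficient / y_intra - 1.0)
    return gains_intra, gains_inter

# Hypothetical inputs (two factors, two sectors), for illustration only.
alpha = np.array([[0.3, 0.5], [0.7, 0.5]])   # factor intensities, columns sum to one
beta = np.array([0.5, 0.5])
A = np.array([1.0, 2.0])                     # observed sectoral TFP
A_eff = np.array([1.2, 2.2])                 # sectoral TFP without intra-sector misallocation
Z = np.array([[1.0, 1.0], [2.0, 1.0]])       # observed factor allocation
Z_eff = np.array([[0.8, 1.2], [1.5, 1.5]])   # alternative allocation, same factor totals
phi = 2.0                                    # elasticity of substitution across sectors

gains_intra, gains_inter = gains_decomposition(A, A_eff, Z, Z_eff, alpha, beta, phi)
print(gains_intra, gains_inter)
```

By construction the two components chain multiplicatively: the ratio of fully efficient to current output equals $(1 + Gains_{intra}/100)(1 + Gains_{inter}/100)$.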
Essays on factor misallocation Pulido Pescador, José 2018
Item Metadata
Title | Essays on factor misallocation |
Creator |
Pulido Pescador, José |
Publisher | University of British Columbia |
Date Issued | 2018 |
Description | This thesis studies different implications of micro-level factor misallocation across heterogeneous agents. It consists of three chapters. The first chapter examines the impact of firm-level factor misallocation on an open economy's comparative advantage. After providing empirical evidence on how Colombian metrics of firm-level misallocation are related to measures of its revealed comparative advantage, I explore the general equilibrium effects of such misallocation and its impact on industries' export capabilities. I compute a counterfactual equilibrium in which the misallocation is removed in Colombia. The reallocation of factors leads to an important change in the country's industrial structure and a rise in the exports-to-GDP ratio of 18 p.p. This industrial composition effect is absent in the workhorse models of firm-level factor misallocation under closed economies. Based on a co-authored paper with Tomasz Święcki, the second chapter studies the origin of the income gaps between agricultural and non-agricultural workers in developing countries. We use Indonesian data to document a robust premium for workers who move out of agriculture and a loss for those who move into agriculture, even if they do not migrate. We argue that, to simultaneously generate these within-worker premia and the main moments of the joint sector-income distribution over time, self-selection needs to take place under barriers to sectoral mobility that misallocate workers across sectors. We find that removing such barriers prompts 30% of the workforce to reallocate and aggregate output to increase by 17%. The third chapter extends the standard model of firm-level factor misallocation in a closed economy in two dimensions. First, I introduce idiosyncratic demand shocks. This allows me to evaluate whether metrics of misallocation predict plants' survival, a test used to claim that misallocation metrics are empirically swamped by demand shocks. I argue that unconditional estimates in this test are biased in the presence of firms' selection, which would explain the puzzling empirical findings. Second, I compute the TFP gains of removing misallocation both within and across industries. I quantify the importance of inter-industry misallocation and explore its potential role in explaining TFP gaps across countries. |
Genre | Thesis/Dissertation |
Type | Text |
Language | eng |
Date Available | 2018-08-10 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0370967 |
URI | http://hdl.handle.net/2429/66739 |
Degree | Doctor of Philosophy - PhD |
Program | Economics |
Affiliation | Arts, Faculty of; Vancouver School of Economics |
Degree Grantor | University of British Columbia |
GraduationDate | 2018-09 |
Campus | UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |