UBC Theses and Dissertations


On interaction and efficiency : prematch investments with hidden characteristics Bidner, Christopher 2008


On Interaction and Efficiency: Prematch Investments with Hidden Characteristics

by Christopher Bidner

B.Ec. (Hons), The University of New South Wales, 2001
M.A., The University of British Columbia, 2003

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF
Doctor of Philosophy
in
The Faculty of Graduate Studies (Economics)

The University Of British Columbia (Vancouver)
January 23, 2009
© Christopher Bidner 2008

Abstract

I develop three models that are designed to aid in the analysis of environments in which agents i) benefit from interacting with others, and ii) optimally choose their characteristics mindful of the fact that such choices will influence the quality of interaction that they can expect. Of central interest are the ways in which a concern for interaction affects the efficiency with which agents choose their characteristics. The first two models contrast with previous work in that each agent’s relevant characteristics are both unobserved and endogenously determined. The first model provides an explanation for credentialism in the labour market, and demonstrates how a concern for interaction can lead to over-investment in the relevant characteristic. The second model is motivated by human capital development in the presence of peer effects, and demonstrates how a concern for interaction can exacerbate an inherent under-investment problem. The third model retains the feature of unobserved characteristics, and contrasts with previous work by embedding frictions in the process by which agents compete for partners. The model is set in a labour market and demonstrates that outcomes of interest (equilibrium matching patterns, income, inequality and welfare) are generally not monotonic in the level of frictions.

Table of Contents

Abstract
Table of Contents
List of Figures
Acknowledgements
Dedication
1 Introduction
  1.1 Motivation
  1.2 The Context
  1.3 An Overview of the Chapters
  1.4 Comments and Overview
2 A Spillover-Based Theory of Credentialism
  2.1 Introduction
  2.2 Model
    2.2.1 Fundamentals
    2.2.2 The Matching Market
    2.2.3 Equilibrium Conditions
    2.2.4 A Geometric Approach
    2.2.5 Separating Equilibria
    2.2.6 Multiple Separating Equilibria
  2.3 Analysis
    2.3.1 A Generalization of the Productivity Function
    2.3.2 Efficiency
    2.3.3 Entry of Lower Types
    2.3.4 The Effect of Spillovers
    2.3.5 A Comparison with Global Spillovers
  2.4 A Simple Dynamic Extension
    2.4.1 Non-Transferable Utility
    2.4.2 Transferable Utility: Bargaining
  2.5 Conclusions
3 Peer Effects and the Promise of Social Mobility: A Model of Human Capital Investment
  3.1 Motivation and Introduction
  3.2 Model
    3.2.1 Fundamentals
    3.2.2 Structure
    3.2.3 Equilibrium
  3.3 Analysis
    3.3.1 Efficiency
    3.3.2 Total Investment
    3.3.3 An Alternative Benchmark: Random Matching
  3.4 Simple Illustration
    3.4.1 Efficient Investments
    3.4.2 Pooling Equilibrium
    3.4.3 Separating Equilibrium
    3.4.4 Results
  3.5 An Extended Illustration
  3.6 Imperfectly Observed Investments
  3.7 Conclusions
4 A Model of Frictional Pre-Match Investment: Implications for Income, Inequality and Welfare
  4.1 Introduction
  4.2 A Model
    4.2.1 Spillovers and General Results
    4.2.2 A Simple Model of Imperfect Matching
  4.3 Analysis
    4.3.1 Existence
    4.3.2 Equilibrium Segregation
    4.3.3 The Impact of Frictions
    4.3.4 Welfare
  4.4 Discussion
    4.4.1 Differential Efficacy of Investment Decisions
    4.4.2 Interpreting Trends in Inequality
  4.5 Conclusions
5 Conclusions
Bibliography
Appendices
A Appendix to Chapter 2
  A.1 Proofs
    A.1.1 Proof of Lemma 1
    A.1.2 Proof of Proposition 1
    A.1.3 Proof of Proposition 2
    A.1.4 Proof of Proposition 3
    A.1.5 Proof of Proposition 4
  A.2 A Further Rationale for Selecting The Pareto Dominant Equilibrium
  A.3 Derivation of Closed-Form Solution
  A.4 An Illustration
  A.5 Optimal Policy
  A.6 A Model With Classes
  A.7 Signaling With Patience
  A.8 Pooling Equilibria
B Appendix to Chapter 3
  B.1 Complementarities: Separating Equilibrium vs Random Matching
    B.1.1 Comparison with Separating Equilibrium
  B.2 Proofs
    B.2.1 Proof of Proposition 5
    B.2.2 Proof of Result 3
    B.2.3 Proof of Proposition 6
    B.2.4 Proof of Proposition 8
  B.3 Derivation: Investments with Noise
C Appendix to Chapter 4
  C.1 A Generalization of the Productivity Function
  C.2 Proofs
    C.2.1 Proof of Results 11 and 12
    C.2.2 Proof of Result 13
    C.2.3 Proof of Result 14
    C.2.4 Proof of Proposition 11
    C.2.5 Proof of Proposition 12
    C.2.6 Proof of Proposition 13

List of Figures

2.1 Preferences and Single Crossing
2.2 Optimality for a type θ worker given µ(x)
2.3 The Rational Expectations Condition in a Separating Equilibrium
2.4 Direction Field
2.5 Multiple Separating Equilibria
3.1 Properties of µ: A Sketch of the Direction Field associated with Γ(x, µ)
3.2 Derivation of Equilibrium Investments
3.3 Geometric Proof of Proposition 7
3.4 Welfare: The Effect of Altruism and Spillovers
4.1 A Geometric Proof of Proposition 15
A.1 Possible Pooling Equilibrium
B.1 The Functions S(φ) and S*(φ)

Acknowledgements

The research contained in the following chapters has greatly benefited from discussions, detailed comments, and general support from many people.
Ken Jackson has been particularly helpful in this regard: his willingness to listen to, and to provide uncensored comments on, numerous ideas in their infancy has been invaluable, and his broad intelligence and creative capacity continue to inspire and challenge me. Patrick Francois has been a tremendous supervisor, managing to provide valuable guidance while still leaving my sense of autonomy intact. He has an incredible knack for delivering reassurance at the exact point that it is needed, and consistently encourages me through his demonstration of the fact that a successful academic career need not conflict with being a ‘normal’ person. Mike Peters has influenced me, and my work, in many ways. Contemplating his comments rarely fails to take me to the very limits of my understanding (which are always closer than I had originally assumed). Such excursions, while disheartening at times, always provide a glimpse of uncharted territory and thereby present a rare opportunity for real intellectual growth. I also would like to thank Mukesh Eswaran and Gorkem Celik for their detailed, insightful, and always honest comments and thoughts. I am extremely grateful for extensive comments and suggestions from Li Hao. I owe a great debt to my grand-supervisor, Ashok Kotwal, for the enthusiasm that he has shown in my work, and for his generous efforts in promoting it.

Research is much less the chore when surrounded by peers that encourage you, inspire you, and occasionally drink with you. In their own individual ways, Ben Sand, Ken Jackson, Kelly Foley, Andy Chan, Chris Barrington-Leigh, Michael Vlassopolous and Leo Basso have all immeasurably enriched my post-graduate experience, and have together surrounded me with an energy that draws the best out of me.

I wish to acknowledge the contribution of those who do not necessarily know, and surely do not care, about the finer details of my work.
My family has always encouraged me, and has never wavered in their support of the path that I have chosen. I am extremely grateful for the many sacrifices that they have made in the pursuit of my happiness. If I appear less-than-suitably homesick, then it is because the (extended) Laine family has made me feel so welcome, and has displayed a warm, genuine concern for my well-being. I am extremely thankful for all the kindness this family has shown.

Most of all, I want to thank Angela Laine. She has endured many weekends in which my work took precedence over anything fun, never complaining but seeing it as the investment that it is. She has sailed me through some very rough waters and, in many ways, is responsible for much of the calmness that I do experience. Everyone must address the persistent question of “why bother?”, and she makes it so much the easier for me to do so.

For Mr. Jenkins

Chapter 1

Introduction

1.1 Motivation

Many of our important objectives, economic or otherwise, are realized through interaction with others. For instance, the quality of our social lives is influenced by the type of friends we keep, the nature of our family lives is influenced by the spouse we choose, our productive potential at work is influenced by the types of coworkers we engage with, and a child’s development is influenced by their peer group. Given this general importance, it is not surprising that we speak of ‘positions’ in society, and envision a strong connection between one’s position and the quality of interaction that one can expect. In most modern societies, meritocratic forces are central in the determination of position, and are therefore instrumental in shaping who we end up interacting with. Economists, back to Adam Smith, see such ‘competitive’ forces as a key ingredient in providing an environment amenable to economic growth.
The reverse direction is stressed in a recent book by Benjamin Friedman, where he argues that such freedoms are natural consequences of economic growth (Friedman (2005)), and Amartya Sen sees the extension of freedoms such as these as defining development (Sen (1999)). The presence of meritocratic forces - a capacity to undertake actions that influence who we interact with - is a key element of the models developed in the following chapters.

If the characteristics of those that we interact with have implications for our welfare, then a complete analysis requires a description of how these characteristics are formed. Just like the set of people that we interact with, our characteristics - and the characteristics of those we interact with - are rarely fixed. For example, we may learn a new technique or insight from a skilled coworker, but whether or not the coworker is skilled depends on actions taken by the coworker in the past (e.g. the type and amount of formal education that they had engaged in). This feature - the endogeneity of relevant characteristics - is another central element of the research presented here.

Far from being separate choice problems, these two elements are often inter-related in the sense that the decision to choose one set of characteristics over another has implications for the quality of interaction that can be expected. This inter-relatedness is present in the models that follow. For example, in Chapter 2 I develop a model in which workers build their skill level by augmenting their natural ability with an investment in education. Apart from increasing a worker’s skill, the education investment also shapes the type of coworkers that the worker will benefit from. In Chapter 3, parents invest in the human capital of their children, and generate wealth mindful of the relationship between wealth and the quality of peer group that the child will be exposed to.
It is not until this inter-relatedness is acknowledged that extended efficiency issues are raised. For instance, interaction between agents, by itself, introduces scope for inefficiency in the choice of relevant characteristics since agents do not take into account that their choices have consequences for others (those that they interact with). This question is enriched when such choices also have implications for who the agent interacts with.[1] In short, we can ask: ‘How does a concern for interaction affect the efficiency with which agents choose their relevant characteristics?’

The answer to questions like these will depend on the details of the mechanisms that determine with whom agents interact. To fix ideas, suppose that agents interact within pairs.[2] The environment is then not too dissimilar to one of bilateral trade in which agents use their characteristics to compete for the right to trade with ‘desirable’ partners. The market mechanism at work here is not the determination of the terms of trade within fixed partnerships, but, rather, the determination of the partnerships themselves.

Although such a formulation has the potential to deliver desirable outcomes, the feasibility of such a mechanism is jeopardized by the nature of relevant characteristics. In particular, although relevant characteristics may be inferred from observed characteristics, they may not be observed directly. In terms of the previous examples, i) skill may not be observed directly, but it may be inferred from observed education, and ii) parental investment may not be observed directly, but can be inferred from wealth. In such settings, competition for partners must unfold on the basis of characteristics that themselves are not necessarily the relevant ones. This feature is a further defining element of the research presented here.

[1] Although not stressed in the models, another efficiency issue is present. For fixed characteristics, efficiency generally depends on the way in which competitive forces shape matching patterns. This issue is central in a wide variety of papers, but will not be the focus in the research presented here.
[2] As will hopefully become clear, there is nothing particularly special about this assumption, but it will be very helpful in i) describing the environment, and ii) relating the models to others found in the literature.

1.2 The Context

The feature that agents choose who they interact with is prominent in the applied theoretical literatures on marriage (Becker (1973)), worker-job assignment (Sattinger (1993)), and models of search and matching (Burdett and Coles (1997), Shimer and Smith (2000), and Smith (2006)). It also underlies much of co-operative game theory, especially matching theory (see Roth and Sotomayor (1992)). These types of models are typically interested in describing equilibrium matching patterns. As such, they treat agents as ‘passive’ in the sense that they have no capacity to alter their attractiveness as a potential partner. In contrast, such a capacity is an important ingredient in the analysis presented here.

The implications of agents choosing characteristics in the presence of a concern for matching are developed in the relatively small ‘pre-match investment’ literature. For instance, Peters and Siow (2002) and Peters (2004b) assume non-transferable utility in a static economy, Cole, Mailath, and Postlewaite (2001) assume transferable utility in a static economy, and Burdett and Coles (2001) assume non-transferable utility in a dynamic search economy. In contrast to the research presented in the following chapters, all these papers model environments in which all relevant characteristics are observable. In many settings of economic relevance, including those discussed above, such an idealized view is neither particularly accurate nor innocuous.
A small number of recent papers model economies in which i) relevant characteristics are unobserved, and ii) agents can actively influence their matching prospects through their choice of observable characteristics. Such papers include Hoppe, Moldovanu, and Sela (2005), Damiano and Li (2007), and Rege (2007). In contrast to the analysis presented in this thesis, these papers all take the relevant characteristics as being exogenous. Doing so rules out asking how competition for partners shapes agents’ incentives to choose their relevant characteristics.

In addition to the literature mentioned above, the research engages a number of others, whether it be due to a shared motivation, application, or implication. Since the details are quite context-specific, I leave an elaboration to the relevant sections of the following chapters.

1.3 An Overview of the Chapters

The first paper, A Spillover-Based Theory of Credentialism, studies a large economy in which workers’ productivities are influenced by the skills of their coworkers. Each worker’s skill is the result of an ability-augmenting investment that is made prior to matching with a coworker. These skills are ‘soft’ (unquantifiable) characteristics of workers. In contrast, investments are ‘hard’ characteristics. This feature gives investments a credential quality, since a worker can use their investment to attract desirable coworkers. The main result of the model is that, despite the positive externality, there is overinvestment in equilibrium. Apart from providing a rationalization for credentialism that overcomes some recent criticisms of signaling models, the analysis provides insights into the relationship between spillovers and productivity, welfare, and inequality. The model offers a different interpretation of returns to education, and demonstrates how modeling spillovers in this way produces conclusions that are dramatically different from standard treatments.
I extend the model to a simple dynamic setting and show that the qualitative conclusions of the static model are unaffected when utility is non-transferable, and that efficiency is restored as workers become patient when utility is transferable.

The second paper, Peer Effects and the Promise of Social Mobility: A Model of Human Capital Investment, develops a model of human capital development in the presence of peer effects. Parents invest in their child, and this investment conveys a positive externality upon the child’s peers. Parents also acquire wealth, which not only finances consumption, but also determines a child’s peer group in the matching market. I show how the freedom to compete for desirable peers (the ‘promise of social mobility’) exacerbates the natural underinvestment problem associated with the positive externality. The analysis thereby produces a general equilibrium framework in which the inefficiencies displayed in a ‘rat-race’ interact with those stressed in the ‘multi-tasking’ literature. Following some general results, I provide two illustrations of the model. I also consider an extension in which partners are assigned on the basis of noisy signals of both wealth and parental investment.

The third paper, A Model of Frictional Pre-Match Investment: Implications for Income, Inequality and Welfare, examines an environment in which agents are motivated to make unproductive investments with the sole aim of improving their matching opportunities. In contrast to existing work, I add frictions by allowing the investment to be imperfectly observed. The analysis allows for a deeper understanding of the tradeoff inherent in related models: investments waste resources but facilitate more efficient matching patterns. I show that greater frictions i) do not always lead to inferior matching patterns, and ii) can force the economy into a Pareto-preferred equilibrium.
The model offers an alternative perspective on recent trends in residual wage inequality among skilled workers.

1.4 Comments and Overview

To avoid repetition in the text, let me point out some conventions used throughout. First, all proofs that are not contained in the text appear in the Appendix associated with the relevant Chapter. Second, I adopt the convention that f_x refers to the first partial derivative of a function, f, with respect to the argument x. Similarly, f_xx refers to the second derivative, f_xz refers to the cross derivative, and so on.

The remainder of the thesis is organized as follows. The three models discussed here are introduced in Chapters 2, 3, and 4. Conclusions are drawn in Chapter 5. The Appendix contains proofs and also various other extensions and derivations.

Chapter 2

A Spillover-Based Theory of Credentialism

2.1 Introduction

A society’s prosperity depends in large part upon how productive its members are and therefore on its capacity to provide individuals with suitable incentives to enhance their skills. This paper studies such incentives when ‘interactive’ aspects of production are central: agents’ productivities depend not only on their skills, but also on the skills of those who they work with. As I argue below, such an interactive setting is becoming an increasingly relevant feature of the modern workplace.

The focus of the paper is the relationship between skill spillovers in the workplace and the phenomenon of credentialism. I define credentialism as the tendency for individuals to be motivated, at least in part, to engage in an activity (e.g. higher education) because of the credential it offers (e.g. a degree), as opposed to the intrinsic benefit associated with the activity (e.g.
learning).[3] Despite the prominent perception that this phenomenon exists, the standard explanation for it, at least among economists, suffers from a number of problems.[4] This paper offers a rationalization of credentialism that avoids such problems, and thereby restores a theoretical underpinning for the phenomenon. A solid understanding of the phenomenon is of great policy relevance since the social benefit of education is weakened in the presence of credential-motivated attainment.

[3] For instance, there is credentialism inherent in signaling models because workers are paid on the basis of their observed education credential, independent of their true productivity. There is no credentialism inherent in human capital models because workers are paid according to their productivity, regardless of any investment credentials.
[4] That there is a widely-held view that credentialism exists is evidenced by the large empirical literature devoted to evaluating the magnitude of signaling relative to the human capital explanation of the link between education and wages (see Weiss (1995), Ferrer and Riddell (2002), and Lange and Topel (2006) for surveys). The perception also exists outside Economics. Sociologists have discussed the phenomenon at length (e.g. see Labaree (1997), Brown (1995), Collins (1979) and Berg (1970)) and it is also prominent in the popular press (e.g. Jacobs (2004)).

Economists typically comprehend credentialism through models of signaling and/or screening, where the essential point is that education acts as a credential because it changes employer beliefs about a worker’s unobserved productivity.[5] A potent criticism of such models has emerged, especially among labor economists, based on a claim that employers determine the productivity of their employees relatively quickly. As Gary Becker puts it:[6]

    The signaling interpretation of the benefits of going to college originated in the 1970’s and had a run of a couple of decades, but is seldom mentioned any longer. I believe it declined because economists began to realize that companies rather quickly discover the productivity of employees who went to college, whether a Harvard or a University of Phoenix. Before long, their pay adjusts to their productivity rather than to their education credentials.

[5] Seminal work of course includes Spence (1973), Spence (1974), Rothschild and Stiglitz (1976) and Arrow (1973).
[6] Taken from a January 29, 2006, entry in the Becker-Posner Blog, which can be found at: http://www.becker-posner-blog.com/archives/2006/01/on_forprofit_co.html.

Much work in the empirical labor literature (Lange (2007), Lange and Topel (2006), Farber and Gibbons (1996), and Altonji and Pierret (2001)) uses competitive arguments to suggest that workers are paid their expected marginal product, and quick learning implies that signaling can only boost wages for a short time. While intuitive, this line of reasoning requires a significant extension to the original claim. In particular, it requires that the market - not just the single employer - quickly uncovers a worker’s true productivity. This extended version of the claim is far less obvious, and, as I argue below, even less compelling in the environment considered in this paper.

Instead, I argue that the original claim reduces the plausibility of signaling models because it makes the feasibility of productivity or performance contingent wage contracts more compelling. There is no role for signaling (in the standard sense) when such contracts are available since wages depend on performance, independent of any credentials. One reason such contracts are infeasible in signaling models is the assumption that a firm’s total output is observed, but the individual contributions toward this total are not observed.[7] Early formulations, such as Akerlof (1976), motivate this assumption by considering “faceless and nameless” workers on an assembly line. I would suggest that these are not the types of jobs that spring to mind when one thinks of credentialism. Post-secondary degrees typically lead to jobs in which individual performance is quite visible: lawyers, consultants, engineers, accountants, etc. The point is that if firms can evaluate their workers’ performance, then it is reasonable to suppose that firms can condition wages on such assessments - leaving little, if any, motivation to signal.

[7] There are other justifications for the assumption. For instance, the nature of a worker’s ‘output’ may be non-verifiable, or otherwise non-contractable (the provision of consulting services, for example). But the relevance of this assumption is also placed in doubt when firms are able to accurately assess the productivity of their workers (no matter how abstract such an assessment is), since the firm can fire any worker that they perceive as not being worth the agreed-upon wage. As long as it is costly enough for ‘masquerading’ workers to keep switching jobs, the use of such implicit contracts ensures that a worker’s (abstract) productivity is accurately rewarded.

If the possibility of performance-based wages renders signaling unconvincing, then there are two clear alternatives. The first alternative is that credentialism is an illusion. Indeed, in light of criticisms of this nature, the recent dramatic increase in educational attainment experienced in many countries has been widely interpreted as a natural response to the changing demands for skill in the modern workplace. The second alternative is that we need a different approach to understanding credentialism. The model developed here offers such an alternative. Far from ignoring the ‘changing demands for skill’, the basic ingredients of the model are drawn from salient features of the modern workplace.
The first ingredient is intimately tied to the above criticism of signaling models: individual worker productivity is relatively easy to identify and reward. Lemieux, MacLeod, and Parent (2007) document the growing incidence of explicit performance-based pay in the U.S. and argue that technological change, especially in regards to monitoring and reporting technology, has made this feasible. The practice is even more widespread than suggested since performance may be implicitly rewarded by continued employment (as in Academia). Even when objective measures of performance are unavailable, subjective performance evaluations often offer a suitable substitute.8 To make this feature as stark as possible, I assume that workers are simply paid their output. This assumption is made primarily to address the above criticism of signaling models. The central feature of the model is that a worker’s productivity is influenced by who they work with. Modern work practices have greatly increased the scope for such spillovers to arise. From a long-term perspective, the fact that workers have moved off the farm and out of the workshop and into the factory and office building means that physical proximity to other workers has increased. Trends in Human Resource Management, such as those described in Ichniowski and Shaw (2003), stress the importance of “pay-for-performance plans like gain-sharing or profit-sharing, problem-solving teams, broadly defined jobs, cross-training for multiple jobs, employment security policies and labor-management communication procedures”. Many of these practices enhance the interconnectedness of employees.9 Finally, the introduction of computers in the workplace has changed the nature of tasks being performed. 
Autor, Levy, and Murnane (2003) and Levy and Murnane (2004) argue that computers can not easily perform non-routine tasks, and document the growing importance of tasks related to expert thinking (“solving problems for which there are no rule-based solutions”), and complex communication (“interacting with humans to acquire information, to explain it, or to persuade others of its implications for action”). A worker’s ability to perform such tasks depends on the skills of their coworkers - expert thinking is often a collaborative process and the efficacy of complex communication relies on the skills of both sender and receiver.

8 For discussion and a theoretical treatment, see Baker, Gibbons and Murphy (1994), and for an empirical demonstration of the relevance of subjective performance evaluations, see Lazear (1999).

9 For example, Gant, Ichniowski, and Shaw (2002) provide evidence that worker productivity is improved following the introduction of innovative work practices because of stronger social capital developed between workers. Drago and Garvey (1998) find that ‘helping effort’ is more readily expended by workers engaged in a large variety of tasks.

I assume, in the human capital tradition, that a worker’s skill is the result of a costly investment that augments their natural ability. In order to introduce spillovers, a worker’s productivity depends upon their skill as well as upon the skill of a coworker. If we define a firm to simply be a pair of workers, then we can say that spillovers are ‘local’ in the sense that they occur within the boundaries of a firm. It is important to note however that there is no sense in which the ‘firm’ internalizes the skill externality. This is because of the assumption that workers are paid according to their productivity (and not their skill, as in Kremer (1993) for example).
In compiling a list of ‘new basic skills’ - skills that are valued in the modern workplace - Murnane and Levy (1996) write:

    A surprise in the list of New Basic Skills is the importance of soft skills. These skills are called “soft” because they are not easily measured on standardized tests. In reality, there is nothing soft about them. Today more than ever, good firms expect employees to raise performance continually by learning from each other through written and oral communication and by group problem solving.

To capture this feature, skills are modeled as unquantifiable qualities such as creativity, capacity for working in teams, self-reliance, etc., for which there is no natural metric. The significance of this is that it is infeasible for workers to meet each other on the basis of skills.10 In contrast, I assume that investment is quantifiable (e.g. years of schooling, courses taken, grades, institution attended, etc.). This means that workers are able to use their investment in the ‘competition’ for desirable coworkers in the matching market, and thereby provides the rationale behind credential-minded educational attainment.

The model is clearly related to papers that model skill spillovers. Unlike Lucas (1988) and Moretti (2004), I do not model spillovers as a ‘global’ phenomenon in which some measure of the average skill in an area improves the productivity of everyone in that area. Instead, I model spillovers arising as a result of ‘local’ interactions between workers.11 I do not believe that this approach does any injustice to the motivation underlying those papers, yet I show that this approach leads to dramatically different conclusions.

10 There are no explicit firms in the model, but we can think of ‘workers meeting each other’ as the reduced-form of a process in which firms post vacancy notices requiring applicants possess particular quantifiable characteristics. By their ‘soft’ nature, firms could not meaningfully use skills for this purpose.
11 The distinction between ‘global’ and ‘local’ does not arise in Lucas (1988) because he studies an economy with a representative agent. This assumption allows for a sharper focus on the growth dynamics.

For example, policy prescriptions and most comparative statics are completely reversed. The literature on learning in cities, including Jovanovic and Roy (1989), Glaeser (1999), and many others, takes this local aspect of spillovers more seriously. These papers typically model local interaction by assuming that agents meet in a random fashion and then enter into some kind of exchange. In contrast, this paper is interested in the consequences of the fact that individuals are able to take actions that influence who they interact with. It therefore differs from models of search and matching, such as Shimer and Smith (2000), Smith (2006), and Burdett and Coles (1997), which also assume random matching. Here, spillovers alter incentives to invest precisely because interaction is not random.

The feature that agents take actions mindful of the fact that it will affect their matching prospects is prominent in the literature on premarital investment, e.g. Peters (2004b), Peters and Siow (2002), Cole, Mailath, and Postlewaite (2001). In contrast to these papers, I allow the desirability of an agent as a match partner to depend on more than just their investment: in particular, an agent’s desirability will also depend on their type (ability). This is also the primary difference between this model and Kremer’s O-Ring Theory (Kremer (1993)).

If we think of a ‘firm’ as being the shell that houses groups of coworkers, then firms are ex-ante homogeneous in the model. However workers are not indifferent to which firm they work in because firms are differentiated by the skill of the coworkers they house. Hopkins (2005) analyzes a model in which firms house a single worker but firms themselves are heterogeneous.
The main qualitative difference to the present model is that the heterogeneity of firms is exogenous in that model. In contrast, firms are endogenously differentiated in the present model since the skills of their workers are endogenously determined by the workers’ optimal investments.

I assume that workers interact in pairs, making the analysis necessarily two-sided.12 The group of models that are most similar to the model presented here are those of two-sided signaling; including Hoppe, Moldovanu, and Sela (2005), Rege (2007), and Damiano and Li (2007). These papers share the general feature that agents are matched in a positive assortative manner on the basis of some unproductive investment made prior to matching. The model introduced in this paper assumes that this investment is productive. Although this distinction is largely trivial in standard signaling models based on Spence (1973), the distinction is not so trivial in the two-sided case considered here.

The first reason for this is that some key features of the environment, including the type of equilibrium employed, need to be altered in order to produce separating equilibria. I take a competitive approach in that agents optimize taking as given a market return function that specifies the expected skill of coworker that will arise from the matching process given a particular investment.

12 A model in which heterogeneity is generated due to ‘investment’ behavior is presented in Cole, Mailath, and Postlewaite (1995). In that model the ‘investment’ is purely a waste and therefore more investment actually detracts from one’s quality, all else equal. In addition, that model is one-sided.

The second important distinction is that when investment is a pure waste, there is clearly over-investment in any equilibrium in which any agent makes a positive investment. The issue is not so clear once investments are productive, since an agent’s investment confers a positive externality upon their partner.
If partners were fixed, then the conclusion from this observation is that there is too little investment. But partners are not fixed, and in equilibrium more investment allows an agent to obtain a more desirable partner. I show that this latter aspect dominates and consequently that there can never be under-investment in equilibrium. In fact the lowest type invests efficiently, whereas all other types invest too much. Thus, not only does the model rationalize why individuals are concerned with the credentials offered by an education, it also justifies the conclusion that such a concern encourages excessive attainment. Interestingly, the existence of positive skill spillovers does not, on its own, justify the use of subsidies.

The third significant difference is that models with unproductive investment require that interaction be complementary (at least when the cost of investment is independent of type13).14 This generates a trade-off: investment-based matching involves wasted resources, but improves the efficiency of the matching pattern. Whilst this trade-off is interesting, it is a direct consequence of complementary interaction. By allowing for productive investment, I am able to relax the assumption of complementary interaction and thereby remove the necessity of the trade-off. However, a new trade-off emerges: investment-based matching generates too much investment in equilibrium whereas random matching generates too little investment.

This paper differs from all of the above studies in that it is concerned with the consequences of greater spillovers. Such an exercise is of little interest in models of two-sided unproductive investment because aggregate outcomes, such as productivity and inequality, are fixed by the exogenous distribution of types and are therefore insensitive to the degree of spillovers.
By considering productive investments, I identify a novel channel through which technological change increases productivity: new technologies that embody spillovers (e.g. communication technologies, including computers) raise productivity because they provide incentives for workers to increase their skill-enhancing investments, not because the technologies enhance worker skills per se.

13 This feature is desirable as it overcomes another common criticism of signaling - that costs are lower for higher types. This is not at all obvious if costs reflect opportunity costs since higher types would reasonably be forgoing better alternatives (Weiss (1995)).

14 In this context, ‘interaction is complementary’ just means that marginal product of one’s own characteristic is increasing in the level of one’s partner’s characteristic (supermodularity). Equilibria will exist when interaction is only weakly complementary (the marginal product is independent of one’s partner), but these are ‘knife-edge’ equilibria in which all agents are indifferent to all equilibrium investments.

Spillovers between workers are argued to be a crucial force underlying credentialist behaviour. This is because investments are efficient in the absence of spillovers and the degree of over-investment monotonically increases in the degree of spillovers. The fact that credentialism relies on spillovers demonstrates how ‘education externalities’ and ‘credentialism’ are structurally related, and need not represent separate phenomena.

A second crucial element of the environment is that spillovers occur locally (within worker-coworker pairs) and that investment influences workers’ matching prospects. I show how more standard approaches to modeling spillovers produce dramatically different conclusions to those produced here. For instance, it matters that there is heterogeneity of types: the results produced are necessarily absent in models that use a representative agent.
Further, it matters that the spillovers that a worker is exposed to are sensitive to their investment: the central comparative static results of the model are in direct contrast to those of models in which spillovers accrue to the population at large, or in which interaction is random.

The model suggests that care needs to be taken in interpreting observed returns to education. Unlike both signaling and human capital models, education raises a worker’s productivity even when education has no capacity to raise skills.15 The reason is that a higher education enhances matching prospects and raises productivity via spillovers. This implies that even if the researcher had individual-level data on ability, productivity and education, the resulting implied private effect of education on productivity would over-state the true social return (even though education has a positive external effect on the coworker). The fact that spillovers are local means that the true social return will generally not be uncovered by including controls for region-level aggregate education as is typically done in the empirical literature.

Finally, I examine the interpretation that pre-match investments represent ‘signaling’ by considering some simple dynamic extensions in which workers can reject a proposed match after immediately observing their partner’s skill. The main conclusion from these exercises is that the analysis is qualitatively unaltered when utility is nontransferable. When utility is transferable, I show that changes in workers’ discount factors allow us to endogenize the degree of spillovers, and, in particular, equilibrium investment becomes efficient as the discount factor approaches unity.

The model is presented in Section 2.2. After laying out the fundamentals, the equilibrium concept is introduced. A graphical approach is taken to illustrate equilibria. The qualitative features of the unique separating equilibrium are then analyzed in Section 2.3.
A simple dynamic extension to the base model is introduced in Section 2.4 before conclusions are drawn in Section 2.5. The Appendix contains proofs as well as further discussion and results. For example, I examine optimal taxation policy and produce a number of further results within the context of a parameterized illustration.

15 Unproductive education raises a worker’s wage in signaling models, but not their actual productivity.

2.2 Model

2.2.1 Fundamentals

The economy is populated with a unit measure of workers indexed by $i \in [0, 1]$. Each worker is endowed with an ability, $\theta_i \in \Theta \equiv [\underline{\theta}, \bar{\theta}]$. The distribution of abilities is common knowledge and is given by $F$, with an associated density of $f$, where it is assumed that $0 < f(\theta) < \infty$ for $\theta \in \Theta$.

Agents have the opportunity to make an investment, $x \geq 0$, at a cost of $c(x)$, where $c : \mathbb{R}_+ \to \mathbb{R}_+$ is differentiable, convex and strictly increasing. This cost is assumed to be independent of ability in order to emphasize the importance of the feature that investment complements natural ability.16 In particular, if a worker of type $\theta$ makes an investment of $x$, then they have a skill of $s = s(x, \theta)$, where $s_x(x, \theta) > 0$, $s_{xx}(x, \theta) \leq 0$, $s_\theta(x, \theta) > 0$ and $s_{x\theta}(x, \theta) > 0$ for all $(x, \theta) \in \mathbb{R}_+ \times \Theta$.

Complementarity between investment and ability is a natural assumption to make when $\theta$ is interpreted as a characteristic such as ‘aptitude’ or ‘competence’, in the sense that higher types absorb more from any given level of exposure to stimuli. Empirical support for the assumption of investment-ability complementarity (i.e. a demonstration that returns to education are increasing in ability) is surveyed in Weiss (1995).
To ensure the social planner’s problem is well-behaved, I assume that $\max_x \; s(x, \theta) - c(x)$ has a unique, strictly positive (finite) solution for each $\theta \in \Theta$.17 Finally, I make the technical assumptions i) that $s_\theta(x, \theta)$ is bounded above on $\mathbb{R}_{++} \times \Theta$ by some positive finite number, and ii) that both $s_{x\theta}(x, \theta)$ and $s_{\theta\theta}(x, \theta)$ are continuous on $\mathbb{R}_{++} \times \Theta$.18 If these assumptions do not hold, then it is sometimes possible to redefine types in such a way that the assumptions do hold.19 One example of a class of functions that satisfies these assumptions is $s(x, \theta) = g(x) \cdot \theta$, where $g$ is some differentiable, increasing, concave, and bounded function with $\lim_{x \to 0} g'(x) = \infty$.

16 That is, the ‘single-crossing’ property - the feature that the net marginal return to investment is higher for higher types - is supported by the assumption that the marginal benefit is increasing in ability, as opposed to the marginal cost being decreasing in ability. In other words, the single-crossing property remains intact if I were instead to assume that marginal investment costs decrease with ability. Adding this assumption not only unnecessarily complicates the analysis, but also obscures a key difference between this analysis and the signaling framework, since signaling models rely on type-dependent investment costs.

17 Given the global concavity of the objective function, this solution must satisfy the first-order condition: $s_x(x, \theta) = c_x(x)$.

18 These technical assumptions aid in the presentation of the geometric approach, as they guarantee that the integral curves plotted in a direction field do not cross. This is tightly connected to the establishment of existence and uniqueness of certain equilibria.

A worker’s skill is relevant because it influences their productivity. However, in addition a worker’s productivity is also influenced by the skill of their coworker.
If a worker has a skill of $s$ and their coworker has a skill of $s'$, then the worker’s productivity is given by:

$$y = y(s, s') = (1 - \phi) \cdot s + \phi \cdot s',$$

where $\phi \in (0, 1]$ measures the degree of skill spillovers.20

Apart from being simple, this linear specification has a number of advantages. First, it allows for an easier comparison with similar papers (e.g. Peters and Siow (2002)). Second, the specification is neutral in the sense that the total output produced by any worker-coworker pair is independent of $\phi$.21 That is, spillovers are introduced in such a way that they have no mechanical impact on output - implying that any such impact is driven solely by altered incentives to invest. Third, additive separability allows me to sharpen the focus on the question of whether investments are efficient by simplifying the question of whether the matching pattern is efficient.22 This simplification can not be made in models of unproductive investment, since complementary interaction (a positive cross-partial derivative of $y$) is required to induce higher types to invest more (at least when costs are independent of type). The linearity assumption is made for simplicity of exposition, and does not imply that less simple functions (e.g. functions displaying complementary interaction) can not be accommodated. Indeed, once it becomes convenient to do so below, I generalize the function to the class of CES functions and show that nothing changes.

Once workers have made their investment, they enter a frictionless matching market in which participants are matched on the basis of their investment. If a worker of skill $s$ leaves the matching market with a partner of skill $s'$, then the worker is able to produce $y(s, s')$ units of output per unit of labour input. Assuming that workers each inelastically supply a unit of labour, $y(s, s')$ is also the worker’s total output. I assume that the agent then simply consumes their output. Accordingly, the worker’s total payoff is their consumption net of their investment cost:

$$u(x, \theta, s') \equiv y\big(s(x, \theta), s'\big) - c(x).$$

19 For example, consider $s(x, \theta) = g(x) \cdot (\theta - \underline{\theta})^{\eta}$ for $\eta > 1$. Since $s_{\theta\theta}$ approaches infinity as $\theta$ approaches $\underline{\theta}$, the function does not satisfy the assumption that $s_{\theta\theta}(x, \theta)$ is continuous on $\mathbb{R}_{++} \times \Theta$. However, we can redefine ability to be $\tilde{\theta} = (\theta - \underline{\theta})^{\eta}$, so that $s_{\tilde{\theta}\tilde{\theta}} = 0$ (which is continuous).

20 It may be natural to restrict $\phi \leq 1/2$ so that a worker’s skill is no less influential than that of their coworker, but this is not necessary for the results.

21 Furthermore, if it happened to be the case that there was perfect segregation on skill ($s = s'$), then changes in $\phi$ have no effect on individual-level output.

22 With additive separability, average output is unaffected by who ends up matching with who. This is not true when skills are complements, for example. See Becker (1973).

2.2.2 The Matching Market

A matching market equilibrium is characterized by an assignment of workers to other workers, where the assignment is made on the basis of workers’ investments.23 The matching market assignment is characterized by a function, $m(x)$, which says that workers that have an investment of $x$ are to be matched with workers that have an investment of $m(x)$. The assignment rule is required to be feasible. For all $A \subseteq \mathbb{R}$, define24:

$$M(A) \equiv \{ x \mid m(x) = a \text{ for some } a \in A \}.$$

An assignment rule is feasible if the measure of agents for which $x_i \in M(A)$ is no less than the measure of agents for which $x_i \in A$. In words, if we take some group of workers - all those that make investments in $A$ - and consider the group of workers that they are supposed to be matched with - workers that invest in $M(A)$ - then it must be that there are at least as many workers in the latter group as in the former group. The assignment rule is also required to be stable.
A matching rule is stable if we can not find two non-paired workers such that both would prefer to be matched together (at least one strictly) to remaining with their assigned partner. To describe these preferences, note that workers do not care about their partner’s investment per se, but rather their skill. Worker $i$’s evaluation of worker $j$’s skill will of course in general depend on worker $j$’s investment. Given this, let $\tilde{\mu}(x')$ represent the expected skill of a worker with investment $x'$. Given these expectations, the relative attractiveness of a potential partner with investment of $x'$ is described by the utility function, $u(x, \theta, \tilde{\mu}(x'))$.

23 Matching does not occur on the basis of skill, because I interpret the ‘soft’ nature of skills (as stressed in the Introduction) as meaning that there does not exist a natural metric upon which such skills can be meaningfully and/or feasibly described. It is infeasible for workers to arrange meetings on the basis of skill when they do not have a capacity to describe (or verify) skill prior to meeting the potential partner.

24 Since $m(x) = a$ if and only if $m(a) = x$ (both describe a situation in which agents that invest $x$ are to be matched with agents that invest $a$), an alternative definition of $M(A)$ replaces “$m(x) = a$” with “$m(a) = x$”.

2.2.3 Equilibrium Conditions

Beliefs are required to be consistent with equilibrium behaviour. In particular, if workers invest according to some measurable function, $x(\theta)$, where $x : \Theta \to \mathbb{R}_+$, then beliefs are consistent (or satisfy ‘rational expectations’) if:

$$\tilde{\mu}(x) = E\big[ s(x(\theta'), \theta') \mid x(\theta') = x \big], \quad \text{for all } \theta \in \Theta.$$

As is usual, this condition only restricts beliefs when evaluated at investment levels that arise in equilibrium.

Given beliefs and the matching rule, a worker’s total payoff from investing $x$ is therefore given by $u(x, \theta, \mu(x))$, where $\mu(x) \equiv \tilde{\mu}(m(x))$.
The $\mu$ function acts as a return function in the sense that it describes the coworker skill that can be expected given an investment of $x$ and can be thought of as a Hedonic return (Peters and Siow (2002)). Investments are optimal when agents maximize utility taking $\mu$ as given. That is, an agent of type $\theta$ is investing optimally if $x(\theta) \in \arg\max_x u(x, \theta, \mu(x))$.

Given the above conditions, an equilibrium can be defined as follows.

Definition 1 (Equilibrium). An equilibrium is an investment function, $x(\theta)$, a matching rule, $m(x)$, and beliefs, $\tilde{\mu}(x)$, such that:
1. Given $x(\theta)$ and $\tilde{\mu}(x)$, the matching rule is stable and feasible.
2. Given $x(\theta)$, beliefs are consistent.
3. Given $\mu(x)$, all workers are investing optimally.

The following Lemma allows us to simplify the definition of equilibrium.

Lemma 1. There is positive assortative matching (perfect segregation) in equilibrium. That is, $m(x) = x$.

That is, in equilibrium, workers are randomly matched with another worker that has made the same investment.25 Importantly, the positive assortative matching is an equilibrium outcome and not an inbuilt feature of the matching game (as it is in Peters (2004b) and Hoppe, Moldovanu, and Sela (2005), for example). Given perfect segregation, the following simpler definition of equilibrium will suffice.

Definition 2 (Equilibrium). An equilibrium is an investment function, $x(\theta)$ and a return function, $\mu(x)$, such that:
1. Consistency: $\mu(x) = E[ s(x(\theta'), \theta') \mid x(\theta') = x ]$ for all $\theta \in \Theta$.
2. Optimality: $x(\theta) = \arg\max_x u(x, \theta, \mu(x))$ for all $\theta \in \Theta$.

25 This can be generalized to a model in which workers are exogenously designated a ‘class’ (e.g. males and females) ex-ante and matching occurs across classes (as in a standard marriage model). See the Appendix for details.

In order to study equilibria, it is useful to take a geometric approach. This process reveals that there is a continuum of equilibria, even when attention is restricted to separating equilibria.
I make arguments as to why it is non-restrictive to focus on one of these in the analysis that follows.

2.2.4 A Geometric Approach

In order to depict equilibria, it is convenient to consider various relationships in investment/skill space. To begin, we can plot the indifference curves of the utility function $u(x, \theta, s')$ for each type $\theta \in \Theta$, where the vertical axis measures units of coworker skill, $s'$. This is shown in Figure 2.1. The indifference curves are convex functions that are upward-sloping for sufficiently high investment levels (the curves may be everywhere increasing, and not U-shaped, as depicted). This is because the marginal return to investment becomes increasingly negative at higher investment levels, requiring increasingly large increases in coworker skill to maintain indifference.

[Figure 2.1: Preferences and Single Crossing. Indifference curves $I_\theta$ and $I_{\theta'}$; horizontal axis: investment, $x$; vertical axis: coworker skill, $s'$.]

Furthermore, the fact that investment acts to augment ability in a complementary manner implies a single-crossing property: for any proposed increase in investment, higher types need fewer units of coworker skill in order to remain indifferent.26

In the same space we can plot the return function $\mu(x)$, where workers conjecture that an investment of $x$ allows them to match with a coworker of skill $\mu(x)$. This is depicted in Figure 2.2. Given an arbitrary return function, the optimality condition requires that each worker chooses the value of $x$ that places them on the highest indifference curve, subject to the return function. Note that this condition, along with the single-crossing property, implies that investment must be a non-decreasing function of type in equilibrium (see the proof of Lemma 1).

[Figure 2.2: Optimality for a type $\theta$ worker given $\mu(x)$. Horizontal axis: investment, $x$; vertical axis: coworker skill, $s'$.]

Although we know that investment must be a non-decreasing function of ability, there will still be many possible equilibria.
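The convexity and single-crossing properties described above can be checked numerically. The following is a minimal sketch only: it assumes the additively separable payoff $u = (1 - \phi) s(x, \theta) + \phi s' - c(x)$ together with the hypothetical functional forms $s(x, \theta) = \theta \cdot x$ and $c(x) = x^2 / 2$, which are chosen for illustration and are not the thesis's parameterization. Rearranging the payoff for $s'$ at a fixed utility level gives the indifference curve, and differentiating with respect to $x$ gives its slope, $[c_x(x) - (1 - \phi) s_x(x, \theta)] / \phi$.

```python
# Illustrative assumptions (not from the text): s(x, theta) = theta * x and
# c(x) = x**2 / 2, combined with the linear production function y.

def skill(x, theta):
    """Hypothetical skill function s(x, theta) = theta * x."""
    return theta * x


def cost(x):
    """Hypothetical convex investment cost c(x) = x**2 / 2."""
    return 0.5 * x ** 2


def payoff(x, theta, s_co, phi):
    """u(x, theta, s') = (1 - phi) * s(x, theta) + phi * s' - c(x)."""
    return (1 - phi) * skill(x, theta) + phi * s_co - cost(x)


def indifference_curve(x, theta, u_bar, phi):
    """Coworker skill s' that keeps a type-theta worker investing x at payoff u_bar."""
    return (u_bar + cost(x) - (1 - phi) * skill(x, theta)) / phi


def indifference_slope(x, theta, phi):
    """Slope of the indifference curve: [c_x - (1 - phi) * s_x] / phi."""
    c_x = x          # derivative of x**2 / 2
    s_x = theta      # derivative of theta * x with respect to x
    return (c_x - (1 - phi) * s_x) / phi
```

Under these assumed forms, `indifference_slope(1.0, 1.0, 0.5)` is 1.0 while `indifference_slope(1.0, 2.0, 0.5)` is 0.0: at the same investment the higher type's curve is flatter, which is the single-crossing property. The slope is also increasing in $x$, so each curve is convex.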
I will focus on separating equilibria - equilibria in which each type invests a unique amount. The primary reason for this focus is that it makes comparisons with the literature more transparent. I do not pursue refinements that are designed to rule out pooling equilibria (as in Cho and Kreps (1987)), instead leaving a discussion of pooling equilibria to the Appendix.

26 Both the convexity and the single-crossing properties can be demonstrated by direct inspection of the indifference curve expression. If we define $y^{-1}(s, z)$ as the inverse of $y$ with respect to coworker skill (so that $y(s, y^{-1}(s, z)) = z$), then the equation of indifference curves is given by $I(x, \theta, s'; u) = y^{-1}(s(x, \theta), u + c(x))$. The slope of the indifference curve is $I_x(\cdot) = [c_x(x) - y_s(s(x, \theta), s') \cdot s_x(x, \theta)] / y_{s'}(s(x, \theta), s')$. Using the additively separable form gives $I(\cdot) = (1/\phi) \cdot [u + c(x) - (1 - \phi) \cdot s(x, \theta)]$ and $I_x(\cdot) = [c_x(x) - (1 - \phi) \cdot s_x(x, \theta)] / \phi$. The curve is convex (since $c(x)$ is convex and the fact that the planner’s problem has a well-behaved solution implies that $s(x, \theta) - c(x)$ is strictly concave in $x$), and its slope is decreasing in type (since $s_{x\theta} > 0$). The indifference curve is upward sloping for all $x$ sufficiently large that $c_x(x) > (1 - \phi) \cdot s_x(x, \theta)$. Such values are ensured to exist by the assumption that the planner’s problem is well-behaved (i.e. that $\max_x \; s(x, \theta) - c(x)$ has a unique, strictly positive (finite) solution).

[Figure 2.3: The Rational Expectations Condition in a Separating Equilibrium. Skill functions $s(x, \theta)$ and $s(x, \theta')$ and the return function $\mu(x)$, which passes through the points $(x(\theta), s(x(\theta), \theta))$ and $(x(\theta'), s(x(\theta'), \theta'))$; horizontal axis: investment, $x$; vertical axis: skill, $s, s'$.]

2.2.5 Separating Equilibria

A separating equilibrium is an equilibrium in which each type of worker invests a different amount. In deriving a separating equilibrium, note that the consistency condition becomes:

$$\mu(x(\theta)) = s(x(\theta), \theta), \quad \forall \, \theta \in \Theta.$$
That is, workers must believe that if they invest an amount that is an equilibrium investment for some type, say $x(\theta)$, then they will be matched with a coworker that is of skill $s(x(\theta), \theta)$. This condition is depicted in Figure 2.3. An equivalent, and much more convenient way to express the consistency condition is to work with the inverse investment function, as follows. Let $X$ represent the set of investments that arise in equilibrium, and for all $x \in X$, let $\xi(x)$ be the type of worker that finds it optimal to invest $x$. The consistency condition then becomes:

$$\mu(x) = s(x, \xi(x)), \quad \forall \, x \in X. \tag{2.1}$$

This expression can be substituted into the objective function so that optimality requires:

$$x(\theta) = \arg\max_x \big\{ y\big(s(x, \theta), s(x, \xi(x))\big) - c(x) \big\}.$$

Restricting attention to differentiable investment functions allows us to use a first-order condition approach, since $\mu$ is also differentiable (from (2.1), using the fact that $\xi$ is differentiable by virtue of its inverse being differentiable and strictly monotone). The first-order condition, once re-arranged, gives us a differential equation that $\xi(x)$ must satisfy:

$$\xi'(x) = \frac{c_x(x) - \big[ y_s(s(x, \xi), s(x, \xi)) + y_{s'}(s(x, \xi), s(x, \xi)) \big] \cdot s_x(x, \xi)}{y_{s'}(s(x, \xi), s(x, \xi)) \cdot s_\theta(x, \xi)} \equiv \Gamma(x, \xi), \tag{2.2}$$

where I have used the fact that $\theta = \xi(x(\theta))$. The function $\Gamma$ is expressed here for an arbitrary function, $y(\cdot, \cdot)$, but notice that whenever $y(\cdot, \cdot)$ has the property that $y_s(z, z) = (1 - \phi)$ and $y_{s'}(z, z) = \phi$ for all $z \geq 0$ - as in the additively separable case under consideration - we have:

$$\Gamma(x, \xi) = \frac{c_x(x) - s_x(x, \xi)}{\phi \cdot s_\theta(x, \xi)}. \tag{2.3}$$

This differential equation, once combined with an initial condition, $\{x_0, \xi_0\}$, defines an initial values problem. One cost of the simplicity of the environment studied so far is that it has provided little guidance as to the appropriate initial condition.
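To get a feel for the initial values problem, equation (2.3) can be integrated numerically. The sketch below is illustrative only and rests on assumptions that are not taken from the text: the hypothetical forms $s(x, \theta) = \theta \cdot x$ and $c(x) = x^2 / 2$ (for which $s_\theta = x$ is bounded only on a bounded range of investments), and an initial condition placed on the locus $c_x = s_x$, which is one candidate rather than a derived result. Under these forms (2.3) reduces to the linear equation $\xi' = (x - \xi) / (\phi x)$, whose exact solution is $\xi(x) = x / (1 + \phi) + C x^{-1/\phi}$, so the numerical path can be checked against it.

```python
# Illustrative assumptions (not from the text): s(x, theta) = theta * x and
# c(x) = x**2 / 2, so that c_x = x, s_x = theta and s_theta = x.

def gamma(x, xi, phi):
    """Right-hand side of (2.3): (c_x - s_x) / (phi * s_theta) at the point (x, xi)."""
    c_x = x        # marginal cost of investment
    s_x = xi       # marginal effect of investment on skill for type xi
    s_theta = x    # marginal effect of ability on skill
    return (c_x - s_x) / (phi * s_theta)


def solve_xi(x0, xi0, x_end, phi, steps=1000):
    """Integrate xi'(x) = gamma(x, xi) from (x0, xi0) to x_end with classical RK4."""
    h = (x_end - x0) / steps
    x, xi = x0, xi0
    for _ in range(steps):
        k1 = gamma(x, xi, phi)
        k2 = gamma(x + h / 2, xi + h * k1 / 2, phi)
        k3 = gamma(x + h / 2, xi + h * k2 / 2, phi)
        k4 = gamma(x + h, xi + h * k3, phi)
        xi += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return xi


def closed_form_xi(x, x0, xi0, phi):
    """Exact solution of xi' = (x - xi) / (phi * x) through the point (x0, xi0)."""
    const = (xi0 - x0 / (1 + phi)) * x0 ** (1 / phi)
    return x / (1 + phi) + const * x ** (-1 / phi)
```

For example, with $\phi = 0.5$ and the assumed start $(x_0, \xi_0) = (1, 1)$, the closed form gives $\xi(2) = 17/12 \approx 1.417$, and `solve_xi(1.0, 1.0, 2.0, 0.5)` agrees to numerical precision. In this example the type investing $x = 2$ has ability of about 1.417, whereas the planner's first-order condition $s_x = c_x$ would have that type invest $x = \theta \approx 1.417$.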
That is, although we know that $\xi_0 = \underline{\theta}$ (the lowest type), we have not as yet placed any restriction on the equilibrium investment of the lowest type workers.27 To make progress in this regard, it is instructive to examine the conditions under which the solution to a particular initial values problem will generate a separating equilibrium. A careful examination of these conditions allows us to place restrictions on possible investments made by workers of the lowest type. There are three main conditions, as follows.

27 In standard signaling models the investment of the lowest type workers is pinned down by the fact that they can only improve the firm's belief about their type by deviating from their equilibrium investment. As a result, they invest as if their type were publicly observed. This type of reasoning does not work as smoothly here because a worker's payoff depends on more than just what others infer from any given investment. In particular, it matters how these inferences interact with the matching institution to determine the connection between a given investment and who, if anyone, they are matched with. In other words, it is not possible to pin down $x_0$ in this environment without introducing some institutional detail over and above what is required in standard signaling models. In Peters and Siow (2002), which is a competitive model of premarital investment, the initial condition is pinned down by the fact that agents belong to classes (males and females) and that the measure of agents differs across classes. The lowest type in the more populous class that succeeds in matching must be indifferent between investing and being matched with the lowest type on the other side of the market, and not entering the matching market at all. The absence of explicit classes means that such an approach is not directly applicable here.

First, the solution must be a strictly increasing function for $x \geq x_0$. This ensures that the inverse of the solution (the investment function) is defined.
Although a strictly decreasing solution would also achieve this, such a solution would violate the fact that equilibrium investments must be increasing in type (see proof of Lemma 1).

Second, the solution must be unbounded in the sense that for any $z \geq \underline{\theta}$ there exists an $x$ such that $\xi(x) = z$. This ensures that the equilibrium investment function is defined for all possible types.

Third, we must verify that the implied investment function in fact maximizes the payoff of each type. One component of this exercise is to propose off-equilibrium values of $\mu(x)$ in such a way that no type has an incentive to deviate. In addition, we must also check that no type has an incentive to deviate to an investment made by some other type.

To begin using these properties, it is useful to describe the solution to the initial values problem by analyzing the function's direction field, as depicted in Figure 2.4. At each point in $(x, \xi)$ space, we know that the slope of $\xi$ is $\Gamma$. By selecting points in the space, and drawing small line segments with a slope of $\Gamma$ that run through the points, the properties of the solution emerge. To begin, we can plot the locus of points for which the slope of $\xi$ is zero (the zero isocline). This is simply given by $\{(x, \xi) \mid c_x(x) = s_x(x, \xi)\}$. This locus is represented by an increasing function in $(x, \xi)$ space since $s_{x\theta} > 0$. Note that $\Gamma(x, \xi) > 0$ for points ‘south-east’ of this locus and $\Gamma(x, \xi) < 0$ for points ‘north-west’ of this locus. Potential solutions to the initial values problem can be depicted by constructing curves that pass through the direction field in such a way that they are tangent to the line segments at all points chosen. An important implication of the technical assumptions made on $s(x, \theta)$ is that such curves do not cross on $\mathbb{R}_{++} \times \Theta$.28

The fact that we need $\xi$ to be strictly increasing for $x \geq x_0$ allows us to make the following remark.

Remark 1.
If $x_0$ is the investment made by the lowest type in a separating equilibrium, then $x_0 \geq x_0^*$, where $x_0^*$ is the value of $x$ that satisfies $s_x(x, \underline{\theta}) = c_x(x)$.

28 The assumptions are sufficient for $\Gamma$ and $\Gamma_\xi$ to be continuous on $\mathbb{R}_{++} \times \Theta$, which in turn is sufficient for any solution to an initial values problem to be unique when the initial values lie in $\mathbb{R}_{++} \times \Theta$. The uniqueness of the solution implies that it is impossible for two different solutions to emanate from the same point, thereby demonstrating that curves cannot cross.

[Figure 2.4: Direction Field. Horizontal axis: $x$ (with $x_0^*$ and $x_0$ marked); vertical axis: $\xi$ (with $\underline{\theta}$ marked). Shown: the zero isocline $\{(x, \xi) : \Gamma(x, \xi) = 0\}$ and two solution curves.]

There is also a maximum value that $x_0$ can take in a separating equilibrium. This is derived from the third property since all types must prefer their equilibrium payoff to the payoff that they would receive had they ignored the matching market altogether and instead invested under the assumption that they would have no partner. I assume that remaining unmatched is equivalent to being matched with an agent of zero skill.29 Intuitively, agents cannot be investing so much in equilibrium that the investment costs overwhelm the benefits to such an extent that the net payoff is driven below the agents' next best alternative. To formalize this, let $\tilde{x}(\theta)$ maximize $y(s(x, \theta), 0) - c(x)$, and let $\tilde{u}(\theta) \equiv y(s(\tilde{x}(\theta), \theta), 0) - c(\tilde{x}(\theta))$.

Remark 2. If $x_0$ is the investment made by the lowest type in a separating equilibrium, then $x_0 \leq x_0^{**}$, where $x_0^{**}$ is the value of $x$ that satisfies $s(x, \underline{\theta}) - c(x) = \tilde{u}(\underline{\theta})$.

The value of $x_0^{**}$ is depicted geometrically in Figure 2.5. It corresponds to the investment made by the lowest type such that the indifference curve running through the resulting skill level is tangent to the horizontal axis. This indifference curve marks the highest possible payoff that this type can obtain if they were to ignore the matching market.
Similarly, the Figure also depicts the value of $x_0^*$ as the investment level such that the indifference curve passing through the resulting skill level is tangent to the skill function.30 The figure demonstrates that $x_0^* < x_0^{**}$ (since the indifference curve associated with the latter is always lower than that associated with the former). In addition, the Figure provides some guidance as to why we need only worry about the lowest type investing too much in equilibrium: if higher types invested $x_0$, then (by single-crossing) their indifference curve would not touch the horizontal axis. Therefore, in equilibrium, higher types would always prefer to invest $x_0$ than to remain unmatched (of course, they prefer their equilibrium investment to investing $x_0^{**}$).

29 We can think of agents also having access to a technology that does not depend on spillovers, and provides a type-independent payoff normalized to zero.

30 Equating the slope of the indifference curve to the slope of the skill function gives $[c_x(x) - y_s(s, s) \cdot s_x(x, \theta)] / y_{s'}(s, s) = s_x(x, \theta)$. Multiplying both sides by $y_{s'}(s, s)$, and using the fact that $y_s(s, s) + y_{s'}(s, s) = 1$, implies that this condition is simply $c_x(x) = s_x(x, \theta)$. The value of $x$ that satisfies this is, by definition, $x_0^*$.

The above two remarks together restrict the set of initial investments that could possibly generate separating equilibria. If the solution to their associated initial values problem satisfies all three conditions outlined above, then we can conclude that any such solution generates a separating equilibrium. The following verifies that such conditions are indeed satisfied.

Proposition 1. For each $x_0 \in [x_0^*, x_0^{**}]$, a solution to the initial values problem defined by $\{\xi'(x) = \Gamma(x, \xi),\ \xi(x_0) = \underline{\theta}\}$ generates a separating equilibrium in which $x(\cdot) = \xi^{-1}(\cdot)$ and $\mu(x) = s(x, \xi(x))$ for all $x$ that arise in equilibrium.

Since the satisfaction of the differential condition is a necessary condition of a separating equilibrium, the existence and uniqueness of equilibrium depends on the existence and uniqueness of a solution to the initial values problem. The technical assumptions made on $s(x, \theta)$ are sufficient for the Lipschitz condition to hold, implying that each such initial values problem has a unique solution. Although we may be sure of existence, we will in general not be able to derive an explicit solution. However, in special cases we can: e.g., $\Gamma$ is linear in $\xi$ if $s(x, \theta)$ takes the form $g(x) \cdot \theta$ for some function $g(\cdot)$. In this case a closed-form expression is available (see Appendix for derivation):

\[ \xi(x) = \frac{c_x(x_0)}{g_x(x_0)} \cdot \left[ \frac{g(x_0)}{g(x)} \right]^{\frac{1}{\phi}} + \frac{1}{\phi} \int_{x_0}^{x} \frac{c_x(z)}{g(z)} \cdot \left[ \frac{g(z)}{g(x)} \right]^{\frac{1}{\phi}} dz. \tag{2.4} \]

When an explicit solution is not available, the essential properties of the solution are easily seen by an inspection of the direction field.

2.2.6  Multiple Separating Equilibria

Even if the separating equilibrium is unique for a given $x_0$, the fact that the possible values of $x_0$ lie on a continuum means that there are multiple equilibria. In order to justify focusing on one of these equilibria, we need to examine the properties of the continuum.

The main point to make is that the equilibria, as indexed by $x_0$, do not differ in any qualitative way apart from the fact that those with higher values of $x_0$ involve i) greater investment, and ii) lower equilibrium payoffs. These results do not just hold on average, but hold type-by-type, in the following sense.

Remark 3. Let $x(\theta; x_0)$ be the investment made by a type $\theta$ worker in a separating equilibrium in which the lowest type invests $x_0$, and let $u(\theta; x_0)$ be the associated payoff. Then, $x_0' > x_0$ implies i) that $x(\theta; x_0') > x(\theta; x_0)$, and ii) that $u(\theta; x_0') < u(\theta; x_0)$ for all $\theta \in \Theta$.
The proof of the investment component comes from a direct inspection of Figure 2.4; the inverse investment function associated with $x_0'$ lies everywhere below the inverse investment function associated with $x_0 < x_0'$ (recall that the technical assumptions ensure that the two curves cannot cross in the direction field). The proof of the payoff component is seen most easily with the aid of Figure 2.5. The figure is set in the investment-skill space and shows three possible equilibrium return functions, each associated with a different $x_0 \in [x_0^*, x_0^{**}]$. The point $x_0^*$ occurs when the lowest type's indifference curve is tangential to their skill production function, and $x_0^{**}$ occurs when their highest indifference curve that passes through the horizontal axis intersects their skill production function.

[Figure 2.5: Multiple Separating Equilibria. Horizontal axis: $x$ (with $x_0^*$, $x_0$ and $x_0^{**}$ marked); vertical axis: skill. Shown: the skill production function $s(x, \underline{\theta})$ and three equilibrium return functions $\mu(x)$.]

The payoff received by the lowest type is clearly decreasing in $x_0$, since higher values place this type on lower indifference curves. The same logic applies for all workers, since their equilibrium investment will lie at a point at which their indifference curve crosses their skill production function from below. Since we have just argued that equilibria with higher $x_0$ have higher investment, it follows that this higher investment places the worker on a lower indifference curve. Thus, the separating equilibrium in which $x_0 = x_0^*$ Pareto dominates all other separating equilibria.

To sharpen the focus of the remaining analysis, I select a single separating equilibrium to analyze: the Pareto dominant one. Choosing this equilibrium is conservative if one is making a point about inefficiency. In the Appendix I provide other arguments for why this is a reasonable equilibrium to focus on.
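The ordering of separating equilibria by $x_0$ can also be checked numerically. The following is a minimal sketch rather than part of the original analysis. It assumes the illustrative forms $s(x, \theta) = x \cdot \theta$ (so $g(x) = x$) and $c(x) = x^2/2$, a lowest type of 1 and $\phi = 0.5$; none of these values come from the text, and whether they satisfy every technical assumption placed on $s$ is not checked here. Under these assumptions $x_0^* = 1$ and the larger root of the condition defining $x_0^{**}$ is $1 + \sqrt{\phi(2 - \phi)}$. The function and variable names are hypothetical. The analytic benchmark uses the lowest type as the initial value, which coincides with the leading coefficient of the closed-form solution when $x_0 = x_0^*$.

```python
import math

PHI = 0.5          # assumed spillover parameter (illustrative only)
THETA_LOW = 1.0    # assumed lowest type (illustrative only)


def gamma(x, xi, phi=PHI):
    """Slope of the inverse investment function, eq. (2.3), under the
    assumed forms s(x, theta) = x * theta and c(x) = x**2 / 2."""
    c_x = x        # marginal cost of investment
    s_x = xi       # marginal skill from investment for type xi
    s_theta = x    # marginal skill from a higher type
    return (c_x - s_x) / (phi * s_theta)


def solve_inverse_investment(x0, x_end, phi=PHI, theta_low=THETA_LOW, steps=2000):
    """Integrate xi'(x) = Gamma(x, xi) from (x0, theta_low) to x_end
    with the classical fourth-order Runge-Kutta method."""
    h = (x_end - x0) / steps
    x, xi = x0, theta_low
    for _ in range(steps):
        k1 = gamma(x, xi, phi)
        k2 = gamma(x + h / 2, xi + h * k1 / 2, phi)
        k3 = gamma(x + h / 2, xi + h * k2 / 2, phi)
        k4 = gamma(x + h, xi + h * k3, phi)
        xi += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return xi


def closed_form(x, x0, phi=PHI, theta_low=THETA_LOW):
    """Analytic solution of the same linear initial values problem
    (specific to the assumed forms g(x) = x and c_x(x) = x)."""
    a = 1.0 / phi
    homogeneous = theta_low * (x0 / x) ** a
    particular = (x - x0 ** (a + 1) * x ** (-a)) / (1 + phi)
    return homogeneous + particular


def x0_bounds(phi=PHI, theta_low=THETA_LOW):
    """Return (x0*, x0**) for the assumed forms."""
    x_star = theta_low
    x_star_star = theta_low * (1 + math.sqrt(phi * (2 - phi)))
    return x_star, x_star_star


if __name__ == "__main__":
    low, high = x0_bounds()
    for x0 in (low, 1.5):
        numeric = solve_inverse_investment(x0, 3.0)
        print(f"x0={x0:.2f}  xi(3) numeric={numeric:.6f}  analytic={closed_form(3.0, x0):.6f}")
```

In this sketch the solution started from the higher initial investment lies below the one started from $x_0^*$ at a common $x$. A lower $\xi$ at a given $x$ means that a lower type makes that investment, which is the sense in which every type invests more when $x_0$ is higher.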
In order to develop an intuition for the following analysis of this equilibrium, and to clarify this paper's contribution to the theoretical literature, it is useful to be clear about the differences between the equilibrium concept employed here and the ‘non-cooperative’ approach taken in the literature (e.g. Hoppe, Moldovanu, and Sela (2005)).

Discussion of Equilibrium Concept

By ‘non-cooperative’ I simply mean a model in which agents make investments taking as given a particular matching game in the following stage. The most prominent, and perhaps most natural, of such games is the ‘premarital investment’ game (Peters (2004b)). In this game, participants in the matching market are simply matched in a positive assortative manner according to their investment.31 In contrast, the approach taken here is to have agents invest taking as given a market return function (as in the standard signaling model). As with any competitive analysis, I have ‘black-boxed’ the particular matching process and instead simply imposed some reasonable restrictions upon it (stability and feasibility). The benefit of this is that it allows us to focus on the essentials without getting caught up in institutional detail. But this is only a benefit if the institutional detail is unimportant. One way to verify that institutional details are unimportant is to consider a particular finite matching game, and check whether equilibrium behaviour resembles the competitive outcome as the number of players gets large. This is precisely the exercise performed in Peters (2004a) and Peters (2007). Using the premarital investment game, he demonstrates the interesting result that competitive behaviour does not emerge as the number of players gets large. The essence of the argument is that competitive equilibrium is supported by agents of the lowest type holding incorrect beliefs regarding the consequence of cutting their investment.
In particular, in anticipating positive assortative matching, agents of the lowest type understand that they can deviate from their equilibrium investment without being matched with a lower quality partner (since they are always already matched with the lowest quality partner in equilibrium). In contrast, in order to support a competitive equilibrium, agents of the lowest type must believe that there will in fact be match-related consequences from cutting their investment. These beliefs are incompatible with the premarital investment game, but are consistent because such deviations never arise in equilibrium. Thus, the institutional detail does in fact matter in this case.

31 That is, the agent with the highest investment is matched with the agent with the second-highest investment, and so on. In explicit two-sided models, e.g. marriage, the male with the highest investment is matched with the female with the highest investment, and so on.

In response, I would argue that although the competitive approach is inappropriate for studying large premarital investment games, it is not clear that the premarital investment game is the only interesting underlying game. For example, suppose we make a slight adjustment to the model and allow partnerships to form between two or three workers, and assume that $s'$ is taken to be the average skill of other workers in the partnership. There are no added costs to having three partners rather than two. Now it is no longer the case that there are no matching-related consequences stemming from a deviation for agents of the lowest type. In particular, such a deviation could reasonably leave the worker without a partner since their equilibrium partner(s) could be assimilated into a three-worker team.
Alternatively, a richer dynamic structure incorporating search may produce a situation in which deviations are punished because the deviator's equilibrium partner has the option of continued search (see Burdett and Coles (2001) for a more detailed description of such an environment). Exploring such alternative non-cooperative foundations is beyond the scope of this paper, but is certainly an interesting area for future research.

One way to see that the underlying matching institution of interest in the current model will not in general be a premarital investment game is to note that the bilateral Nash investment for agents of the lowest type, $x^N(\underline{\theta})$, is always less than or equal to $x_0^*$. If the underlying game were a premarital investment game, then $x_0 = x^N(\underline{\theta})$. However, using Remark 1, there is no separating equilibrium when $x_0 < x_0^*$. Therefore, the underlying game can be modeled as a premarital investment game only in the case in which $x^N(\underline{\theta}) = x_0^*$. This happens to be the case when investment is completely unproductive, since both terms equal zero. This is the reason why the separating equilibrium of the premarital investment game in Hoppe, Moldovanu, and Sela (2005) exists (with a continuum of agents). However, the equality never holds when investment is productive. This suggests that the equilibrium concept used in Hoppe, Moldovanu, and Sela (2005) does not allow for the study of separating equilibria once the model is extended to the case in which investment is productive.32 In contrast, the equilibrium concept used here always produces a separating equilibrium. It should be stressed, however, that the separating equilibria supported in this model rely somewhat heavily on the freedom to assign conjectured payoffs to investments that do not arise in equilibrium. As I have stressed in this section, the extent to which this is a problem depends on the matching game that one has in mind. A deeper exploration of suitable matching games is left for future work.33

32 It should be pointed out that although Hoppe, Moldovanu, and Sela (2005) analyze the case of a continuum of agents, their primary focus is on finite economies. The argument presented in the text only applies to the continuum case and therefore is in no way intended to suggest a major shortcoming of their work. Indeed, using a competitive model to describe small matching markets has the clear potential to overlook interesting and relevant phenomena.

33 Another approach would be to abandon the focus on separating equilibria, and instead study more complex equilibria such as the Truncated Hedonic Equilibrium introduced in Peters (2006). As far as I know, this type of equilibrium has not been studied in the presence of incomplete information.

2.3  Analysis

This section analyzes some of the properties of the separating equilibrium derived in the previous section.

2.3.1  A Generalization of the Productivity Function

One may be interested in understanding the connection between the function that maps skills to productivity, $y(s, s')$, and the nature of separating equilibria. In particular, the model would be of little interest if everything hinged on the assumption that $y$ is linear in skills. To examine this, consider the following CES generalization of the linear model:

\[ y(s, s') = \left[ (1 - \phi) \cdot s^{\rho} + \phi \cdot s'^{\rho} \right]^{\frac{1}{\rho}}, \]

for $\rho \in (-\infty, 1]$. This specification is quite flexible: it includes Leontief ($\rho \to -\infty$) and Cobb-Douglas ($\rho \to 0$) technologies, as well as the original linear ($\rho = 1$) technology. The generalization allows us to capture complementarity because:

\[ y_{ss'}(s, s') = (1 - \phi) \cdot \phi \cdot (1 - \rho) \cdot (s s')^{\rho - 1} \left[ (1 - \phi) \cdot s^{\rho} + \phi \cdot s'^{\rho} \right]^{\frac{1}{\rho} - 2}, \]

which is non-negative (strictly positive for $\rho < 1$). Furthermore, notice that

\[ y_s(s, s')\big|_{s' = s} = (1 - \phi) \cdot s^{\rho - 1} \left[ (1 - \phi) \cdot s^{\rho} + \phi \cdot s'^{\rho} \right]^{\frac{1}{\rho} - 1} = (1 - \phi), \]

and

\[ y_{s'}(s, s')\big|_{s' = s} = \phi \cdot s'^{\rho - 1} \left[ (1 - \phi) \cdot s^{\rho} + \phi \cdot s'^{\rho} \right]^{\frac{1}{\rho} - 1} = \phi. \]

These expressions can be substituted into the general expression for $\Gamma(x, \xi)$ given in (2.2), and, irrespective of $\rho$, we end up with the expression given in (2.3). The only difference induced by this generalization is that the set of initial values that support a separating equilibrium will change; however, this set always contains the initial value associated with the Pareto dominant equilibrium.34 The point is that linearity is not a crucial assumption. In fact, the Pareto dominant equilibrium is completely unaltered by selecting any productivity function from this somewhat general class of functions. The assumption of linearity was convenient to make primarily because it allowed for a simpler discussion of beliefs and matching returns, owing to the fact that all workers care only about the expected skill of their coworkers. Focusing on a separating equilibrium makes this unnecessary since coworker skill is non-stochastic in equilibrium. The linearity assumption is also useful in highlighting the minimal role played by complementarity of skills.

2.3.2  Efficiency

One central issue of interest is whether investment is ‘efficient’. Before this is addressed, we need to be clear about what is meant by ‘efficient’. There are two aspects. First, given equilibrium investments, does the equilibrium matching arrangement maximize total welfare? Since investments are fixed, this requires that the matching arrangement maximizes total output. Second, given the equilibrium matching arrangement, do equilibrium investments maximize total welfare? Since the matching arrangement is fixed, this allows us to check whether welfare is maximized on a type-by-type basis.

One advantage of studying a production function that is additively separable in own and coworker skill is that the issue of whether the matching arrangement is efficient is easily addressed.
This is because all matching arrangements produce the same total output.35 This is not the case when there is complementary interaction (corresponding to $\rho < 1$ in the CES generalization above), where positive assortative matching is the unique output-maximising matching arrangement (see Becker (1973)). But this is not a major problem, since this is precisely the matching pattern that arises in a separating equilibrium. Thus, the structure allows us to focus attention on the efficiency of investments.

34 When $\rho < 1$ it is no longer the case that agents only care about the expected value of worker skill. As such, the proof of Lemma 1 does not apply and we cannot be sure that all equilibria display segregation on investment. However, all separating equilibria will indeed display segregation since i) investment is increasing in type (single-crossing), and ii) all agents prefer those of higher skill. That is, higher investment reveals higher skill, and all agents therefore still find those with higher investments more attractive.

35 The total output always equals $\int_{\Theta} s(x(\theta), \theta)\, dF(\theta)$ since each worker contributes $s(x(\theta), \theta)$ to total output regardless of who they are matched with. Different matching arrangements will influence the distribution of output, but each possible arrangement is Pareto efficient.

Consider a particular pair of coworkers, $i$ and $j$. The total welfare generated within this match is

\[ W(x_i, x_j; \theta_i, \theta_j) = y\big(s(x_i, \theta_i), s(x_j, \theta_j)\big) + y\big(s(x_j, \theta_j), s(x_i, \theta_i)\big) - c(x_i) - c(x_j). \]

The efficient investments are defined to be those that maximize $W(\cdot)$.36 Since i) there is perfect segregation in equilibrium ($\theta_i = \theta_j$), and ii) $[s(x, \cdot) - c(x)]$ is concave in $x$, there is a unique efficient investment associated with each type defined by:

\[ x^*(\theta) = \arg\max_{x} \left\{ 2 \cdot [s(x, \theta) - c(x)] \right\} = \arg\max_{x} \left\{ s(x, \theta) - c(x) \right\}. \tag{2.5} \]

There are good reasons to suspect that workers will invest too little in equilibrium due to the presence of positive externalities.
Since workers do not take into account that their investment benefits their coworker, we could reasonably suspect that workers invest too little. This turns out not to be the case, however, since workers are provided an added incentive to invest in the form of a better coworker. Peters and Siow (2002) and Cole, Mailath, and Postlewaite (2001) show how a concern for matching can induce agents to invest efficiently when they would otherwise under-invest. These papers make the assumption that all an agent cares about in their partner is their partner's investment. This is not the case here, however, since workers care about their partner's skill, which depends on their partner's ability in addition to their partner's investment.

To explore this, note that the first-order condition associated with (2.5) is $s_x(x, \theta) = c_x(x)$. If we let $\xi^*(x)$ be the inverse efficient investment function, then we have $s_x(x, \xi^*(x)) = c_x(x)$. But notice how this corresponds exactly to the locus of points for which $\Gamma(x, \xi) = 0$ in Figure 2.4. Since the inverse equilibrium investment function, $\xi$, has a slope of zero along this locus and a positive slope at all points ‘to the right’ of the locus, it follows that $\xi(x_0) = \xi^*(x_0)$ and $\xi(x) < \xi^*(x)$ for all $x > x_0$. That is, the lowest type invests efficiently but all other types invest too much, since the type that actually invests $x$ is lower than the type that would efficiently invest $x$. This is formalized in the following Proposition.

Proposition 2. Workers of the lowest type invest efficiently in the Pareto dominant separating equilibrium, but all other types over-invest.

36 One may argue that this is an inappropriate (i.e. excessively stringent) definition because it is based on a set-up in which utility is transferable across partners, whereas utility is non-transferable in the model. A more reasonable criterion would then be that the investments are Pareto efficient (within the match). Since there is perfect segregation in equilibrium, both members of the match invest the same amount. Therefore, in checking whether investments are Pareto efficient we need only compare them to the symmetric Pareto efficient investment level. This level coincides with the investment that maximizes $W(x, x; \theta, \theta)$. In this sense, nothing is lost by using this definition of efficiency in the analysis of equilibrium investments. The advantage of this definition is that it allows us to speak of the efficient investments for each member of the match without having to continually refer to the set of (Pareto) efficient investments.

How can we reconcile the positive externality with over-investment? To begin, suppose that a firm or some other institution were able to internalize the externality by rewarding skill rather than productivity. Workers invest efficiently when they are paid the full marginal product of their skill: i.e. when they solve

\[ \max_{x} \; y\big(s(x, \theta), s'\big) + y\big(s', s(x, \theta)\big) - c(x). \]

That is, workers invest efficiently when they perceive the net marginal benefit to be:

\[ y_s\big(s(x, \theta), s'\big) \cdot s_x(x, \theta) + y_{s'}\big(s', s(x, \theta)\big) \cdot s_x(x, \theta) - c_x(x). \tag{2.6} \]

Now consider the situation in which workers are paid according to their productivity. The nature of the matching market is such that higher investment levels allow workers to work with higher skilled partners. Thus, there is an added benefit to investing that is introduced by matching. The worker actually solves

\[ \max_{x} \; y\big(s(x, \theta), \mu(x)\big) - c(x). \]

That is, workers actually perceive a net marginal return of:

\[ y_s\big(s(x, \theta), \mu(x)\big) \cdot s_x(x, \theta) + y_{s'}\big(s(x, \theta), \mu(x)\big) \cdot \mu'(x) - c_x(x). \tag{2.7} \]

Workers will therefore invest efficiently in equilibrium when $\mu$ is such that (2.6) equals (2.7). When evaluated at the equilibrium investment level, (i) the rational expectations condition ensures that $s' = \mu(x)$, (ii) by definition, we have $\theta = \xi(x)$, and (iii) perfect segregation implies $s' = s(x, \theta)$.
Using these observations when equating (2.6) and (2.7) allows us to cancel expressions so that we are left with the requirement that:

\[ s_x(x, \xi(x)) = \mu'(x). \]

The same three observations also imply that $\mu(x) = s(x, \xi(x))$ (the alternative expression of the rational expectations condition). Differentiating this with respect to $x$, we have that $\mu'(x) = s_x(x, \xi(x)) + s_\theta(x, \xi(x)) \cdot \xi'(x)$. Therefore, $\xi$ must be such that:

\[ s_x(x, \xi(x)) = s_x(x, \xi(x)) + s_\theta(x, \xi(x)) \cdot \xi'(x), \]

which can only happen when $\xi'(x) = 0$. This means that workers would invest efficiently if and only if they expected that their investment would influence the investment made by their match, but not the ability of their match. Of course, this does not work here because, in equilibrium, matching with a higher-investing coworker necessarily means matching with a higher-ability coworker.37 Thus, the incentives to invest are too great. The fact that workers invest efficiently in Peters and Siow (2002) can be seen as a consequence of the fact that the agents in their model do not care about the ability of their partner, only their investment.

37 Note that the above argument is general to the extent that I did not rely on any specific functional form for either $y(s, s')$ or $s(x, \theta)$.

This proposition has three main implications. The first is that spillovers are central to the over-investment result in the sense that there is over-investment if and only if spillovers exist. In this light, credentialism and education externalities are not two separate phenomena that need to be weighed against each other in uncovering the social benefits of education. Rather, the analysis here suggests that the phenomena are structurally related: credentialism is a consequence of spillovers. The second is from a policy perspective: despite the existence of positive spillovers, an investment subsidy could never be optimal (an analysis of optimal policy is contained in the Appendix).
Thus, not only does the model provide an explanation for credentialism, it also confirms that credentialism implies excessive educational attainment.38 The third implication from this result is that heterogeneity matters. If agents were assumed to be homogeneous, then there would be no over-investment (since all workers are of the lowest type). Apart from this, heterogeneity forces us to think more carefully about modeling skill externalities because any ‘macro’ approach that uses a representative agent will necessarily overlook the mechanism offered here.

38 Whilst credentialism is casually associated with over-investment, it need not be the case. For example, Hopkins (2005) considers an economy in which firms are (exogenously and observably) differentiated by some quality measure and workers invest in education in order to compete for the ‘good jobs’. Under my definition, this economy exhibits credentialism since workers are assigned to firms on the basis of their investment. He shows that when worker investments are productive, low ability workers will tend to invest too little. Intuitively, these workers do not take into account the fact that their investment also benefits the firm. Such possibilities do not arise in the present model, however, since investment is two-sided.

2.3.3  Entry of Lower Types

Although I have exogenously fixed the lowest type, it is not difficult to imagine variants in which this is endogenous. For example, if workers had some alternative to working that delivers a utility of $b$, then the value of $\underline{\theta}$ is determined by the type of worker that is indifferent between working and not working. As $b$ falls, so too does $\underline{\theta}$. Clearly this affects the workers that are induced to participate, but the effects are felt by all participating workers. In particular,

Proposition 3. A decrease in $\underline{\theta}$ raises the equilibrium investment, and therefore also the productivity and income, of all types.

This is most easily seen by inspecting the direction field: a lower value of $\underline{\theta}$ causes the solution to the new initial values problem to lie everywhere below the old solution (implying that all agents that were participating previously are now investing more). The fact that productivity and income are raised does not imply that welfare is raised. In fact, welfare must fall because greater investment only heightens the original over-investment problem.

2.3.4  The Effect of Spillovers

If technological progress tends to increase the scope for spillovers (see discussion in the Introduction) then it is of interest to examine the consequences of rising spillovers. Recall that the definition of the spillover parameter is $\phi_2 / (\phi_1 + \phi_2)$, where $\phi_1 + \phi_2$ was normalized to unity. Thus ‘an increase in spillovers’ means a simultaneous increase in $\phi_2$ and decrease in $\phi_1$ (so that the sum remains at unity). With this construction, changes in $\phi$ are ‘neutral’ in the sense that individual productivities would not be affected in equilibrium if behaviour were not affected. Thus, changes in equilibrium variables arising from changing spillovers occur purely because incentives to invest are changed.

Proposition 4. Workers of the lowest type are unaffected by spillovers; however, the equilibrium investment of all other workers is increasing in spillovers.

This result indicates that the spillover dimension of new technologies may be an important source of productivity growth in the development process. To be sure, it is not that spillovers enhance the productivity of existing worker skills, but rather that spillovers provide incentives for workers to improve their skills. This theory is difficult to test against the theory that new technology simply makes existing skill more productive.39 Both theories imply that new technologies raise productivity and involve a rising investment level (e.g. educational attainment). One possibility is examining the OLS return to education.
This should increase if technology raises the productivity of existing skills, but will decrease if the productivity gain is induced purely by greater spillovers.

Although greater spillovers raise productivity and incomes, the effect on welfare is not favorable.

Corollary 1. The welfare of workers of the lowest type is unaffected by spillovers; however, the welfare of all other workers is decreasing in spillovers.

This follows from the fact that i) there is over-investment and ii) spillovers increase investment. This result implies that if all differences in income across economies were attributable to differences in spillovers, then average income and welfare at the economy level would be negatively correlated in the cross section.

39  It is important to emphasize that there is nothing in the model that says that new technology cannot also make existing skills more productive. That is, new technology will likely raise both $\phi_1$ and $\phi_2$. The point is that productivity will increase even if $\phi_1 + \phi_2$ does not change. In this light, it is unsurprising that the two theories are difficult to differentiate.

Returns to Education

In this section, I briefly explore some of the ways in which a naive interpretation of observed returns to education will be misleading in the presence of spillovers. To begin, consider the true private returns from investment. In equilibrium, a worker of type $\theta_i$ evaluates the relationship between their investment and output as being given by:

$$y(x_i, \theta_i) = y\big(s(x_i, \theta_i), \mu(x_i)\big) = y\big(s(x_i, \theta_i), s(x_i, \xi(x_i))\big),$$

at least for investments that arise in equilibrium, $x_i \in X$.
The marginal return to education perceived by such a worker, when evaluated at their equilibrium investment, is:

$$\frac{\partial}{\partial x} y(x, \theta_i) = (1 - \phi) \cdot s_x(x_i, \theta_i) + \phi \cdot \big[s_x(x_i, \theta_i) + s_\theta(x_i, \theta_i) \cdot \xi_x(x_i)\big] \tag{2.8}$$

$$= s_x(x_i, \theta_i) + \phi \cdot s_\theta(x_i, \theta_i) \cdot \xi_x(x_i), \tag{2.9}$$

where I have used the fact that $\xi(x_i) = \theta_i$ in equilibrium, as well as the fact that $y_s(z, z) = 1 - \phi$ and $y_{s'}(z, z) = \phi$ for all $z > 0$.

The true private return will be overstated if the researcher were simply to fit a relationship between output and education, due to what looks like a standard ‘ability bias’ problem. To be sure, investment and output are related in equilibrium by:

$$\hat{y}(x_i) = y\big(s(x_i, \xi(x_i)), s(x_i, \xi(x_i))\big),$$

implying a marginal return of

$$\frac{\partial}{\partial x} \hat{y}(x_i) = s_x(x_i, \theta_i) + s_\theta(x_i, \theta_i) \cdot \xi_x(x_i), \tag{2.10}$$

where I have used the same arguments as above. This implied return is never less than the true private return. Notice however that the presence of spillovers adds a subtlety: although the returns given by (2.10) are strictly greater than the true returns given by (2.9) for $\phi \in [0, 1)$, the difference converges to zero as spillovers approach unity. This could never happen in a standard ‘ability bias’ problem, since investment is a strictly increasing function of ability. Spillovers are therefore introducing new scope for misinterpretation of observed returns.

One feature that is perhaps puzzling at first glance is that the ability bias problem is actually reduced when spillovers - the source of imperfection in the model - are more pronounced. Indeed, the ability bias is completely overcome when spillovers are at their most severe. To explore this further, it is useful to imagine the economy lying on a continuum that represents the extent to which a worker’s investment affects their income because the investment augments their skills. The degree of spillovers determines the economy’s location on this continuum.
At one end of the continuum, spillovers are zero. This corresponds to a pure ‘human capital’ economy in which investment raises skills, and these skills are rewarded directly. At this end, the entire benefit of investment arises because skills are enhanced. As spillovers are increased and we move along the continuum, the effect of investment on skills becomes less important relative to the effect of investment on attracting higher skilled coworkers. At the other end of the continuum, when spillovers are equal to one, none of the benefit of investment is directly due to skills being enhanced; rather, it is entirely due to the fact that investment allows one to work with better coworkers. The significance of this is that an ability bias only arises when the effect of ability on income is mistakenly attributed to investment. But the effect of ability is relatively small when the effect of own skill is relatively small: that is, when spillovers are relatively high. Thus, a declining ability bias is consistent with rising spillovers. If the ability bias is ignored, then the resulting implied returns to education are exaggerated - but the extent of this exaggeration is reduced when spillovers are greater.

Suppose now that the ability bias problem is perfectly resolved (by assuming the researcher observes ability, or has some perfect instrument), and the researcher re-estimates the relationship between wages and investment while dealing with the ability bias. Such an exercise will reveal the true private returns to education (assuming the researcher uses the correct functional forms, etc). Such estimates are useful for individuals attempting to decide upon their optimal investment, but are not necessarily useful to a government that is evaluating the social benefit of education. The social return from a worker’s investment is given by:

$$y^*(x_i) = y\big(s(x_i, \theta_i), s_j\big) + y\big(s_j, s(x_i, \theta_i)\big),$$

where $s_j$ is the skill of worker $i$’s coworker.
The marginal social return in equilibrium is

$$\frac{\partial}{\partial x_i} y^*(x_i) = s_x(x_i, \theta_i),$$

where I have used the fact that $s_j = s(x_i, \theta_i)$ in equilibrium. It is important to note that evidence of a positive causal relationship between education and wages offers little, if any, insight into this social return. To see this most clearly, note that there is a positive causal relationship between education and wages in both a pure signaling model and a pure human capital model. This is a central reason why the debate between the human capital model and the signaling model has persisted for such a long time.

If we are preoccupied with distinguishing between the ‘human capital’ and ‘signaling’ models, then it may be of some value to possess data on worker-level output (as opposed to just wages). For various reasons, this is largely infeasible, but in principle it would shed light on the issue. The human capital explanation would posit that the effect of education on output is roughly the same as the effect on wages (since wages are tied to output directly). The signaling explanation would predict that education has less of an effect on output (possibly zero) than it does on wages. So, imagine that the researcher determined that the effect on output is roughly the same as the effect on wages. This provides evidence against signaling, and would likely produce the conclusion that there is no over-investment in education, given the evidence in support of the human capital model. This conclusion is misleading in the presence of spillovers. The reason is that wages are tied to output in this model also. In other words, such an exercise may be able to distinguish between signaling and the model presented here (where the human capital model is a special case), but it will not be able to tell us anything about the degree of spillovers (and therefore the social value of education).
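The relationship between the three marginal returns discussed in this section can be illustrated numerically. The following is only a minimal sketch and not part of the original analysis: the function name `marginal_returns` and the input values are illustrative assumptions, and the derivatives $s_x$, $s_\theta$ and $\xi_x$ are supplied by hand rather than derived from an equilibrium.

```python
def marginal_returns(s_x, s_theta, xi_x, phi):
    """Marginal returns to investment, evaluated at an equilibrium point.

    s_x, s_theta : partial derivatives of the skill function s(x, theta)
    xi_x         : slope of the inverse investment function xi(x)
    phi          : spillover parameter in [0, 1]
    All inputs are hypothetical numbers chosen by the user.
    """
    private = s_x + phi * s_theta * xi_x  # true private return, as in (2.9)
    naive = s_x + s_theta * xi_x          # return implied by fitting output on education, as in (2.10)
    social = s_x                          # marginal social return
    return private, naive, social


if __name__ == "__main__":
    for phi in (0.0, 0.5, 1.0):
        private, naive, social = marginal_returns(1.0, 2.0, 0.5, phi)
        # naive - private is the 'ability bias' gap;
        # private - social is the wedge behind over-investment.
        print(phi, naive - private, private - social)
```

Under these assumed inputs, the gap between the naive and private returns shrinks to zero as the spillover parameter approaches one, while the gap between the private and social returns grows, which is the pattern described in the text.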
The above observation leads to the following paradox: education raises a worker’s productivity, even if education has no capacity to raise skills. The reason is that more education allows workers to work with higher skilled coworkers, which raises their productivity via spillovers. In short, finding evidence in favor of a human capital model does not rule out over-investment in the presence of spillovers.

How then can we distinguish between a pure human capital model ($\phi = 0$) and models with credentialism ($\phi \in (0, 1]$)? One way is to note that when spillovers are positive, the mechanisms at play share the main qualitative features of signaling models - that education has a value as a credential. Therefore, the most compelling evidence in favor of signaling models (e.g. Tyler, Murnane, and Willett (2000), Bedard (2001), and Lang and Kropp (1986)) is also consistent with positive spillovers.40

40  In fact, the model here perhaps offers a better interpretation of the results of Tyler, Murnane, and Willett (2000) because they find that the GED credential has a positive effect on wages, but this effect does not manifest itself for five years after attainment. This seems implausible in a signaling world, because the worker’s true productivity would almost surely be revealed during that time.

The ideal way in which to determine the extent of spillovers is to include measures of coworker skills. There are a number of practical problems in being able to achieve this, perhaps explaining why there are relatively few papers that attempt such an exercise. A more standard way to estimate spillovers is to include a measure of region-wide education in a wage regression. For example, Acemoglu and Angrist (1999) include state-level education aggregates in the wage equation, whilst Sand (2007) and Moretti (2004) use city-level education aggregates.
Such an approach will likely have little power in detecting the type of spillovers introduced here for the simple reason that spillovers occur on a much more local level. Approaches like these will produce spurious estimates if regions differ in their spillover parameter, or if there is sorting across regions and a mis-specification in the wage equation. These issues are left to future research.

2.3.5  A Comparison with Global Spillovers

How would matters be different if we were to model skill spillovers in a more standard way? In the spirit of Lucas (1989) and Moretti (2004), suppose that each worker benefits from some aggregate of skill in the economy.41 One parsimonious way to address this issue is to suppose that a worker’s productivity is given by:

$$y(s, \bar{s}) = (1 - \phi) \cdot s + \phi \cdot \bar{s},$$

where $\bar{s}$ is the average skill in the economy. As before, $s = s(x, \theta) = \theta \cdot g(x)$. The additive separability is convenient because we know that investment with global spillovers, $x_G(\theta)$, satisfies:

$$(1 - \phi) \cdot s_x(x_G(\theta), \theta) = c_x(x_G(\theta)).$$

The key difference between this set-up and the model above is that a worker’s investment does not influence the skill spillover that they are exposed to. As such, the standard under-investment problem arises since no worker takes into account their positive impact on other workers. In fact, many of the central conclusions from the analysis are reversed. For instance, an increase in the spillover parameter lowers investment. Intuitively, inequality will tend to be relatively low under global spillovers since all workers are exposed to the same spillover level. As spillovers increase, this common component becomes relatively more important, which implies that spillovers tend to lower inequality when spillovers are global. The opposite is true in the model (see the following Section).

One way to view global spillovers is to note its qualitative equivalence to a setting in which matching on the basis of investment was infeasible (e.g.
if investments were hidden). That is, the average skill $\bar{s}$ can be interpreted as the expected skill that one will obtain from a coworker, given that matching is random. In this light, the comparison between equilibrium welfare and welfare with global spillovers is equivalent to the comparison of welfare between fully observable investments and fully hidden investments. The trade-off here is different to that studied in the literature because interaction is not complementary.42 Here, visible investments lead to over-investment whereas hidden investments lead to under-investment. Welfare is the same across the cases in the absence of spillovers ($\phi = 0$), since investment is efficient in both cases. When there are complete spillovers ($\phi = 1$), investment is zero in the hidden case. Average welfare is therefore also zero. In contrast, workers still separate in the visible case but each earns a utility equal to that obtained by the lowest type when they invest efficiently. The reason is that indifference curves are the same for all workers when spillovers are complete. Visible investment therefore tends to produce higher welfare when spillovers are very high. Intermediate cases are ambiguous, and depend on the extent to which investment augments ability. Along this dimension, hidden investments tend to produce greater welfare when investment is not very productive (e.g. hidden investments deliver the efficient investment level - zero - when investment is unproductive). The nature of this trade-off is analyzed in further detail in Section A.4 of the Appendix, where I use functional forms to explicitly derive equilibrium quantities. This exercise also shows how spillovers increase inequality in the model but decrease inequality when spillovers are global.

41  Models in which spillovers enter in this way are qualitatively the same (for our purposes) as models in which interaction is local but meetings are random, as in Glaeser (2001).
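The under-investment result under global spillovers can be illustrated with a small numerical sketch. The functional forms below are assumptions chosen purely for illustration and are not necessarily those used in Section A.4 of the Appendix: skill is taken to be $s(x, \theta) = \theta \cdot x$ (so $s_x = \theta$) and cost is taken to be $c(x) = x^2/2$ (so $c_x = x$). The helper names `solve_foc`, `global_investment` and `efficient_investment` are likewise hypothetical.

```python
def solve_foc(marginal_benefit, marginal_cost, hi=1e6, tol=1e-10):
    """Find x >= 0 with marginal_benefit(x) == marginal_cost(x) by bisection.

    Assumes marginal_benefit(x) - marginal_cost(x) is decreasing in x and
    negative at x = hi; returns 0.0 when the gap is non-positive at x = 0.
    """
    def gap(x):
        return marginal_benefit(x) - marginal_cost(x)

    lo = 0.0
    if gap(lo) <= 0.0:
        return 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def global_investment(theta, phi):
    # First-order condition under global spillovers: (1 - phi) * s_x = c_x,
    # with the assumed forms s_x = theta and c_x = x.
    return solve_foc(lambda x: (1.0 - phi) * theta, lambda x: x)


def efficient_investment(theta):
    # Benchmark without spillovers (phi = 0): s_x = c_x.
    return solve_foc(lambda x: theta, lambda x: x)
```

With these assumed forms, investment under global spillovers falls as the spillover parameter rises and reaches zero when spillovers are complete, in line with the discussion of the hidden-investment case.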
2.4  A Simple Dynamic Extension

The criticism of signaling models identified in the Introduction starts from the claim that firms learn the productivity of their workers relatively quickly. I have interpreted the central implication of this claim as being that firms have a reasonable capacity to condition wage contracts on worker performance, thus removing any significant motivation to signal. The more standard approach, especially in the empirical literature, is based on competitive labour market arguments whereby workers are always paid their (estimated) marginal product. But, for this to be convincing one must extend the original claim by further positing that the market observes a worker’s productivity relatively quickly. This extended claim is less convincing - and indeed is placed in further doubt in the presence of spillovers (since the map between workers’ characteristics/history and their underlying skill is confounded by coworker influences).

42  The trade-off considered in the literature (e.g. Rege (2007) and Hoppe, Moldovanu, and Sela (2005)) is that signaling is wasteful but, thanks to complementary interaction, is socially productive because it facilitates superior matching patterns. A trade-off of this nature is noted in Arrow (1973) and Stiglitz (1975). This trade-off cannot be at work in the present model, however, since interaction is not strictly complementary. As noted previously, matching patterns are irrelevant for efficiency in this model.

If the market learns a worker’s skill quickly, then the over-investment problem identified in the model is reduced by the same type of competitive arguments. The analogous story is that matching will quickly become assortative on skill (as opposed to investment). To provide a simple illustration, suppose that skill is revealed to the market after exactly $T$ periods. Then, from period $T$ onwards, stability requires that each worker is paired with a worker of equal skill.
As before, each worker is paired with a worker of equal investment for periods prior to $T$. Given a discount factor of $\beta$, the objective function of a worker is

$$\frac{1 - \beta^T}{1 - \beta} \cdot y\big(s(x, \theta), s(x, \xi(x))\big) + \frac{\beta^T}{1 - \beta} \cdot s(x, \theta) - c(x).$$

The value of $T$ can be seen as determining the benefit of masquerading as a higher type, since this benefit only lasts for $T$ periods. Indeed, if skill is learned by the market immediately ($T = 0$) then there is no over-investment problem because workers are being matched on the ‘correct’ attribute.

As mentioned above, it is less clear that the market is able to learn a worker’s skill relatively quickly, especially in the presence of spillovers. Even if a worker’s employment history (including information on the identity and history of their coworkers) was readily available for the market to observe, it is not obvious how matches can (effectively) be co-ordinated on the basis of skill since a worker’s inferred skill is likely to be a complicated function of their investment, their previous output levels, as well as their previous coworkers’ investment(s) and output(s).

However, even in the extreme case in which the market can never infer a worker’s skill from any variable other than their investment, there are still reasons to suspect that the over-investment problem is over-stated by studying a static setting. In particular, what if workers were able to learn the skill of their assigned coworkers relatively quickly and had the freedom to leave their assigned partner in favour of continued search? Intuitively, workers understand that there is no point in masquerading too much because their intended coworker will reject them. Furthermore, as workers become more patient they become more discriminating in who they are willing to accept as a coworker. Thus, intuition seems to suggest that higher levels of patience among workers will reduce the extent of the over-investment problem because there are fewer incentives to masquerade.
The purpose of this section is to understand the extent to which this intuition is correct. In order to do this, I make two extreme assumptions about learning intended to highlight the relevant mechanisms. First, I assume that the market only ever observes a worker’s investment. The consequences of relaxing this are hopefully clear given the above discussion. Second, I assume that workers are able to observe their assigned coworker’s skill immediately - in fact, prior to agreeing to engage in an employment relationship.

I consider both the case of non-transferable and transferable utility. I show that it is indeed the case that greater patience (a higher discount factor) makes workers more discriminating in terms of the set of coworkers that they will reject. However, I show that this does not translate into qualitatively different equilibrium behaviour when utility is non-transferable. In this sense, the conclusions of the static model are robust to considerations of coworker learning. I then go on to show that equilibrium behaviour is qualitatively affected when utility is transferable, but not for the reason identified in the intuition. It is not so much that workers are discouraged from masquerading out of a concern for being rejected, but rather because their bargaining position deteriorates if they masquerade too much. I show that over-investment declines as workers become more patient, and this suggests a sense in which patience can be thought of as one way to endogenize the degree of spillovers in the static setting without reference to any ‘technological’ features of production.

In summary, I show that the over-investment result remains even in the case of immediate coworker learning. This limiting degree of over-investment is unaffected by patience when utility is non-transferable and is diminishing with patience when utility is transferable.
2.4.1  Non-Transferable Utility

Consider a generalization of the economy described above which operates in discrete time. All workers have a discount factor of $\beta \in [0, 1)$ so that a sequence of outputs $\{y_t\}_{t=0}^{\infty}$ produces a (gross) payoff of $Y = \sum_{t=0}^{\infty} \beta^t y_t$. In general $Y$ will depend on the worker’s type, $\theta$, and investment, $x$, so write this as $Y(x, \theta)$. The worker’s net payoff incorporates the cost associated with this investment, $C(x)$:

$$U(x, \theta) = Y(x, \theta) - C(x).$$

A new generation of workers is born each period. In the first period of existence workers choose their investment and are assigned a coworker in the labour market. If both workers within a match agree to the match, then the workers produce together for all future periods. If a worker of skill $s$ forms such a partnership with a coworker of skill $s'$ then the worker produces and consumes $y(s, s') = (1 - \phi) \cdot s + \phi \cdot s'$ each period and therefore obtains a gross payoff of $(1 - \beta)^{-1} \cdot y(s, s')$. If at least one of the workers does not agree to the match, then both workers re-enter the labour market the following period and are assigned new coworkers according to the equilibrium matching function. I assume that the worker’s only verifiable characteristic in the labour market is their investment (i.e. the output history is non-verifiable, as is the skill of their previous partners, etc). This assumption is clearly extreme, but it allows me to retain the feature that matches must be formed on the basis of investment.

Since we are primarily interested in the qualitative impact of changes in $\beta$ upon equilibrium investment, it is useful to scale up investment costs by the factor $(1 - \beta)^{-1}$. The reason is that an increase in patience will mechanically increase incentives to invest simply because workers care more about the future payoffs that the investment allows for. Scaling up costs in this way purges the overall impact of patience of such mechanical effects. Thus, I assume that $C(x) = (1 - \beta)^{-1} \cdot c(x)$.
Given the analysis of the static model, I will look for a separating equilibrium. As in the static case, there is positive assortative matching on investment, and a worker that invests $x$ is matched with a coworker of skill $s(x, \xi(x))$, where $\xi(x)$ is again the inverse investment function. Now we can write a worker’s total payoff as

$$U(x, \theta) = (1 - \beta)^{-1} \cdot u\big(x, \theta, s(x, \xi(x))\big).$$

Even though the objective function is equivalent to the objective function in the static case, note that the constraints on the optimization problem are different. In the static case, we only required that $x \geq 0$. This is because matches would never be refused since there is only one period. This is not true in the dynamic case. If a worker of skill $s$ is matched with a coworker of skill $s'$, then the partnership is worth $(1 - \beta)^{-1} \cdot y(s, s')$ to the worker. In equilibrium the worker is supposed to match with a coworker of skill $s$, and can therefore guarantee a partnership worth $(1 - \beta)^{-1} \cdot y(s, s)$ in the following period if they were to reject the coworker of skill $s'$. Therefore, a worker of skill $s$ agrees to a match with a worker of skill $s'$ if and only if $y(s, s') \geq \beta \cdot y(s, s)$. If we let $A(s, \beta)$ be such that $y(s, A(s, \beta)) = \beta \cdot y(s, s)$, then $A(s, \beta)$ is the minimum skill that a worker of skill $s$ will accept in equilibrium given a patience of $\beta$. Notice that $A(s, \beta)$ has the following properties.

1. Workers always accept partners of equal skill: $A(s, \beta) < s$.
2. All workers are accepted for low enough patience: $\exists\, \beta \in (0, 1)$ such that $A(s, \beta) = 0$.
3. Workers become more selective as patience increases: $A_\beta(s, \beta) > 0$ and $\lim_{\beta \to 1} A(s, \beta) = s$.

The first property intuitively reflects the fact that workers prefer to accept their equilibrium match sooner rather than later. The second property tells us that the conclusions drawn from the static case (which is equivalent to $\beta = 0$) remain unaltered for sufficiently small levels of patience.
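The acceptance rule can be made concrete for the linear output function used in this section, $y(s, s') = (1 - \phi) \cdot s + \phi \cdot s'$. Solving $y(s, A) = \beta \cdot y(s, s)$ for $A$ gives $A = s \cdot (\beta - (1 - \phi))/\phi$ whenever this is positive. The sketch below is illustrative only: the function name `acceptance_threshold` and the flooring of the threshold at zero (on the assumption that skills are non-negative) are not taken from the text.

```python
def acceptance_threshold(s, beta, phi):
    """Minimum partner skill A(s, beta) accepted by a worker of skill s.

    Solves y(s, A) = beta * y(s, s) for y(s, s') = (1 - phi) * s + phi * s',
    flooring the answer at zero (skills are assumed to be non-negative).
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    if phi == 0.0:
        return 0.0  # without spillovers the partner's skill is irrelevant
    return max(0.0, s * (beta - (1.0 - phi)) / phi)
```

Under this assumed closed form, the threshold is zero whenever $\beta \leq 1 - \phi$, rises linearly in $\beta$ thereafter, and approaches $s$ as $\beta$ approaches one, matching the properties listed in the text.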
The third property is key since it opens the possibility that patience can have a qualitative effect on equilibrium investments because the value of masquerading as a higher type is reduced as patience increases. One intuition underlying over-investment in the static model is that workers are investing in order to distinguish themselves from lower types. Greater patience may reduce the pressure to do so because some lower types know that they will never be accepted by some higher types and, as a consequence, will never even attempt to masquerade as one. The question is whether this reduced ‘pressure’ translates into more efficient investment.

The minimum acceptance rule implies the following constraint on a worker’s optimization problem:

$$s(x, \theta) \geq A\big(s(x, \xi(x)), \beta\big). \tag{2.11}$$

That is, if a worker invests $x$ expecting to match with a partner of skill $s(x, \xi(x))$ then it must be the case that the worker’s skill is at least as great as the minimum required by their intended partner. In equilibrium, the constraint places an upper limit on how much a type $\theta$ worker can invest.43 The problem facing each worker is therefore

$$\max_x \; (1 - \beta)^{-1} \cdot u\big(x, \theta, \xi(x)\big)$$

subject to (2.11). Although the constraint has real implications for off-equilibrium behaviour - it places an upper limit on how much a worker would masquerade - it has no qualitative effect whatsoever on equilibrium investment behaviour. To see this, suppose that workers ignored the constraint. This produces an equilibrium investment identical to the one derived in the static case. Is the constraint satisfied? By the first property of $A(s, \beta)$, the answer is ‘yes’ for all possible values of $\beta$. In other words, if the constraint were binding for some type then this type must be matched with a worker of a strictly higher skill. But this then implies that the equilibrium is not separating since two workers of different abilities invest the same amount.

Result 1.
When utility is non-transferable, equilibrium investment behaviour is not qualitatively affected by workers’ degree of patience, $\beta$.

The conclusion drawn from this exercise is that the over-investment that arises in the static case has nothing to do with the fact that workers are ‘locked in’ to their partnership. The same qualitative behaviour arises once we allow workers to reject partners. In contrast to the previous intuition, the ‘pressure’ leading a worker to over-invest is very much a local phenomenon. In other words, if a change in patience does not induce workers of type $\theta$ to change their investment, then it will not induce workers of slightly higher types to change their investment regardless of what all types lower than $\theta$ are doing. One interpretation of this is that ‘signaling’ is not quite the right way to view the over-investment result. This claim is made in light of the facts that i) the hidden information - workers’ skills - is revealed once workers meet one another, and ii) the degree of patience can be thought of as the speed of learning this hidden information: the single period in which one’s partner’s skill is hidden becomes increasingly irrelevant as perfect patience is approached. Thus, the relevant feature is that matches are co-ordinated on the basis of investment and not skill. The ‘speed of learning’, once matched, is irrelevant.44

43  To see this, apply the function $y(s(x, \xi(x)), \cdot)$ to each side. This leads to $y(s(x, \xi(x)), s(x, \theta)) \geq \beta \cdot y(s(x, \xi(x)), s(x, \xi(x)))$. Since $\tilde{y}(x, \theta) \equiv y(s(x, \xi(x)), s(x, \theta))$ cuts $\hat{y}(x) \equiv y(s(x, \xi(x)), s(x, \xi(x)))$ at the equilibrium investment ‘from above’ (since $\theta > \xi(x)$ for $x < x(\theta)$), it also cuts $\beta \cdot \hat{y}(x)$ from above. Therefore, the constraint requires that a type $\theta$ worker invests no more than $x_c(\theta, \beta)$, where this value satisfies $\tilde{y}(x_c(\theta, \beta), \theta) = \beta \cdot \hat{y}(x_c(\theta, \beta))$. Furthermore, notice that $\beta < 1$ implies that $x_c(\theta, \beta) > x(\theta)$.

This result is special in at least two ways.
First, it matters that types are not discrete. The efficient investments can always be supported when types are discrete for sufficiently high patience levels. Thus, a weaker version of the result holds: when utility is non-transferable and types are discrete, equilibrium investment behaviour is qualitatively unaffected when patience is sufficiently low. In the static model, if the efficient investments cannot be supported it is because a type $\theta_k$ worker profits from raising their investment to the level that is supposed to be made by a type $\theta_{k+1}$ worker. However, if patience is high enough, the type $\theta_{k+1}$ worker will find it optimal to reject any type $\theta_k$ workers that deviate by making the higher investment. This discourages the deviation and the efficient investments can be supported. Second, it matters that utility is non-transferable across partners. This case is now analyzed.

44  This is a key difference between this model and standard signaling models. In those models the speed of learning is important. See the Appendix for a demonstration of how efficient investments can be supported in the analogous signaling environment.

2.4.2  Transferable Utility: Bargaining

In the previous section the wage accruing to each worker was determined purely by the spillover parameter, and there was an implicit assumption that workers do not engage in side payments. An alternative approach is to assume that each worker produces an output equal to their skill and that workers within a match bargain over the total match output (the sum of the two individual outputs). This section studies this type of setting with Nash bargaining. Apart from wage determination, the economy is the same as just described.

Suppose that a worker $i$ is paired with a worker $j$ so that the pair bargain over the total match output, $s_i + s_j$. If $i$ gets a payment of $w_i$, then $j$ gets a payment equal to the remaining output, $s_i + s_j - w_i$. A wage of $w$ produces a present value of $(1 - \beta)^{-1} \cdot w$. If $k \in \{i, j\}$ has an outside option of $q_k$ (to be determined), the wage to $i$ is determined by (symmetric) Nash bargaining:

$$\frac{w_i}{1 - \beta} - q_i = \frac{s_i + s_j - w_i}{1 - \beta} - q_j. \tag{2.12}$$

In equilibrium, worker $j$ expects to be matched with a worker that also has a skill of $s_j$. Given the symmetry, $j$ gets paid a wage equal to $s_j$ (since the match output, $2 s_j$, is split evenly). Thus, the present value of future consumption for $j$, once they meet their equilibrium partner, is $(1 - \beta)^{-1} \cdot s_j$. Worker $j$’s outside option, $q_j$, is the value associated with waiting one period in order to meet their equilibrium partner. That is, $q_j = \beta \cdot (1 - \beta)^{-1} \cdot s_j$.

In equilibrium, the fact that worker $i$ is optimizing means that if worker $i$ is able to negotiate a wage of $w_i$ this period, then this will also be the best that he can do in the following period. Thus, $i$’s outside option is the value associated with waiting one period in order to receive a constant wage stream of $w_i$. That is, $q_i = \beta \cdot (1 - \beta)^{-1} \cdot w_i$.

Once these outside options are substituted into (2.12), the resulting expression can be re-arranged to get:

$$w_i = \frac{1}{2 - \beta} \cdot s_i + \frac{1 - \beta}{2 - \beta} \cdot s_j. \tag{2.13}$$

Thus, if the pair agree to form a partnership, then the total output is divided according to $\{w_i, s_i + s_j - w_i\}$. Since I have already incorporated the fact that workers are behaving optimally, the pair will always agree to form a partnership (i.e. if the partnership were rejected, then the rejected party is acting sub-optimally in bothering to show up in the first place). Again, positive assortative matching in equilibrium means that $s_i = s_j$ in equilibrium - implying that $w_i = w_j = s_i = s_j$. If we let $\tilde{\phi} = (1 - \beta)/(2 - \beta)$, then equation (2.13) indicates that the equilibrium will share the qualitative features of the equilibrium with non-transferable utility in which spillovers equal $\tilde{\phi}$. Thus, the transferable utility case can be thought of as one way to endogenize the spillover parameter.
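The algebra linking the bargaining condition (2.12) to the wage rule (2.13) can be checked with exact rational arithmetic. This is a minimal verification sketch rather than part of the original analysis; the function names `nash_wage`, `bargaining_gap` and `implied_spillover` are hypothetical, and the skill and patience values used in any check are arbitrary.

```python
from fractions import Fraction


def nash_wage(s_i, s_j, beta):
    """Wage to worker i given by equation (2.13)."""
    return (s_i + (1 - beta) * s_j) / (2 - beta)


def bargaining_gap(w_i, s_i, s_j, beta):
    """Left-hand side minus right-hand side of equation (2.12)."""
    q_i = beta * w_i / (1 - beta)  # i waits one period, then earns w_i forever
    q_j = beta * s_j / (1 - beta)  # j waits one period, then earns s_j forever
    lhs = w_i / (1 - beta) - q_i
    rhs = (s_i + s_j - w_i) / (1 - beta) - q_j
    return lhs - rhs


def implied_spillover(beta):
    """The implied spillover parameter (1 - beta) / (2 - beta)."""
    return (1 - beta) / (2 - beta)


if __name__ == "__main__":
    beta = Fraction(1, 2)
    w = nash_wage(Fraction(3), Fraction(5), beta)
    print(w, bargaining_gap(w, Fraction(3), Fraction(5), beta))
```

Note that (2.13) can be read as placing a weight of $(1 - \beta)/(2 - \beta)$ on the partner's skill and the complementary weight $1/(2 - \beta)$ on own skill, which is why the implied spillover parameter falls from one half towards zero as patience rises.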
Note that $\tilde{\phi} = (1 - \beta)/(2 - \beta) \in (0, 1/2]$, where higher patience lowers $\tilde{\phi}$. Unlike the non-transferable case, changes in patience have a qualitative effect on equilibrium investments since patience determines relative bargaining power. Whilst there is always over-investment, investments approach the efficient investment as patience approaches perfect patience ($\beta \to 1$). This is because worker $i$ perceives that he will be able to appropriate the full marginal return on his investment.

Result 2. When utility is transferable, equilibrium investment behaviour is qualitatively affected by workers’ degree of patience, $\beta$. Equilibrium outcomes in the dynamic model share the qualitative features of a static model in which $\phi = (1 - \beta)/(2 - \beta)$. In particular, investment becomes efficient as patience becomes perfect ($\beta \to 1$).

2.5  Conclusions

The model developed here reconciles two common, seemingly opposing, views of education: first, the externality view that education entails positive benefits to others, and second, the credentialist view that education is, at least in part, wasteful because educational attainment is motivated more by the fact that it offers one a credential than by the fact that it makes one more productive. The analysis shows how the two views are not necessarily separate phenomena that need to be weighed against each other, but rather that credentialism actually relies on the existence of spillovers, implying that there is a more structural connection between the viewpoints than has previously been recognized.

The model allows us to draw out some implications of a greater degree of interaction in the workplace. Despite being parameterized in a ‘neutral’ manner, higher spillovers are shown to raise productivity and inequality, but lower welfare.
This is suggestive of a novel way in which to interpret the impact of spillover-conducive technologies: productivity and incomes are raised because there are greater incentives for individuals to invest in their skill level, not because the technology enhances existing skills per se. Far from being welfare-improving, these rising productivity and income levels are associated with lower utility levels since greater spillovers exacerbate a type of ‘rat-race’, especially among the relatively high ability workers.

The conclusions reached in this paper must be taken within the context of the model’s limitations. Specifically, I make no attempt at arguing that spillovers of a more global nature are unimportant, and effective education policy clearly needs to take these types of spillovers into account. My goal is simply to point out that making a seemingly small, but realistic and relevant, departure in the modeling of spillovers forces us to dramatically re-evaluate the role of education in a modern economy.

Chapter 3

Peer Effects and the Promise of Social Mobility: A Model of Human Capital Investment

3.1  Motivation and Introduction

The determinants of human capital formation are, for various reasons, important to understand. Economists have recognized that such determinants include not only the level of resources that are devoted to the process, but also the social context in which this investment takes place. For example, there is now a large literature on ‘peer effects’ that analyzes and attempts to quantify the notion that an individual’s outcomes are sensitive to the particular individuals that they interact with (see Durlauf (2004) for a comprehensive survey). This paper presents a model of human capital development in which parents allocate resources mindful of the existence of peer effects. A central concern is the efficiency of such resource allocations.
Peer effects are modeled in a simple, direct manner: parents make investments in their child’s human capital and this spills over to the child’s peers. If peers were fixed, then there is a natural underinvestment problem since no parent takes into account the fact that their investment benefits others. However, peers are not fixed: parents face ‘the promise of social mobility’ in the sense that they have the capacity to undertake costly actions that place their child among desirable peers. In particular, a family’s (endogenous) wealth determines the type of peer group that their child will interact with. This is modeled as a marriage problem with observable wealth, but can be thought of as representing a competitive market process whereby wealth determines which families can afford to live in which neighborhoods (or attend which schools). Finally, the model captures the feature that acquiring wealth places a demand on family resources, thereby raising the cost of parental investments.

Most existing studies that model human capital development in the presence of peer effects are concerned with the efficiency of equilibrium segregation across neighborhoods. Prominent examples include Benabou (1996a), Durlauf (1996), and de Bartolome (1990). Although these studies are concerned with the efficiency of parental location choices, they generally trivialize the parental investment choice.[45] In contrast, the efficiency of such investment choices is the central issue here.

A related literature abstracts from explicit peer effects and instead focuses on ‘fiscal spillovers’: the effect of neighborhood composition on local public finance decisions, such as educational expenditures (e.g. Benabou (1996b) and Fernandez and Rogerson (1996)). There are two relevant points to be made here.
First, the standard public goods problem (under-provision) is readily overcome when contributions are monetary, since it is relatively simple for local governments to establish and enforce the suitable contracts, e.g. imposing suitable tax rates. Indeed, this is the essence of Tiebout competition. Such mechanisms are not available when the contributions take the form of parental investment, implying that the under-investment problem remains. Second, for any given tax rate, families prefer to be surrounded by wealthier families since they generate greater tax revenue. Families also wish to be surrounded by wealthy families in the model presented here, but for a very different reason: wealth signals that such families have also made high parental investments. This perspective may help reconcile the puzzling coexistence of i) the fact that parents have a concern about which school their child attends, and ii) a general disagreement in the empirical literature as to whether school resources have a significant impact on outcomes.

The model developed here extends the literature on competitive matching with pre-match investments by analyzing multiple pre-match investments. This is an important extension since the literature stresses two distinct roles of pre-match investments. First, an investment has a surplus-generating role when it serves to increase an agent’s value as a potential partner, i.e. to generate surplus within a match. Second, an investment has a matching role when it serves as a means through which more desirable partners can be attracted.[46] There is a class of models that have a single investment that plays both roles simultaneously (e.g. Bidner (2008a), Peters (2004a), Peters (2007), Peters and Siow (2002) and Cole, Mailath, and Postlewaite (2001)).[47] A second class of models employs a single investment in the matching role only (e.g. Bidner (2008b), Hoppe, Moldovanu, and Sela (2005), Damiano and Li (2007), and Rege (2007)). Finally, a large and varied class of models uses a single investment to focus on the surplus-generating role only (e.g. non-cooperative models of public good contribution, hold-up, etc., as well as competitive models such as the Kremer (1993) O-Ring Theory). Conceptually, these roles are distinct. As such, interesting insights may be overlooked if we either ignore one of the roles, or try to impose both roles on a single investment. Section 3.6 below sheds more light on the issue, as it considers an extension of the model in which both investments are observed with noise.

[45] Although a parent’s human capital acts as an input in the production of their child’s human capital, it is not a choice variable in Benabou (1996a) (it is an exogenous type). Human capital is produced with neighborhood-wide educational inputs (per capita) in de Bartolome (1990), and, in a similar vein, Durlauf (1996) assumes that all individuals in a neighborhood receive the same investment.

[46] The only other paper that I am aware of that incorporates multiple pre-match investments is Han (2005). In that paper, firms choose both a workplace characteristic and a wage payment (workers choose a single productive characteristic). Although the firm makes multiple investments, both of the investments are observed, and therefore both simultaneously play the surplus-generating and matching role. In contrast, the present paper uses multiple investments to distinguish the roles.

[47] These models can be further classified according to the significance of agents’ types. In Peters (2004a), Peters (2007), Peters and Siow (2002) and Cole, Mailath, and Postlewaite (2001), types determine the cost of investment (much like in signaling models) but do not affect an agent’s value as a partner, which only depends on their investment (making signaling uninteresting). In Bidner (2008a), types determine the productivity of investment and therefore influence an agent’s value as a partner (as in signaling), but do not affect investment costs (making signaling infeasible).

The central message delivered by the model is that competition for peers exacerbates the inherent inefficiency associated with parental investment externalities. There are essentially two components to this. First is the fact that families devote too many resources to acquiring wealth in order to compete for better peers. Modern treatments of this phenomenon explicitly incorporate matching (e.g. Hoppe, Moldovanu, and Sela (2005) and Rege (2007)), but the general ‘rat-race’ phenomenon has long been recognized (e.g. Akerlof (1976), and Frank (1985)). The model presented here places this phenomenon within a ‘general equilibrium’ setting because the object that agents are competing for (the parental investment embodied in peers) is itself endogenous. This leads to the second aspect: competition for peers consumes parental resources, which makes parental investments themselves more costly. This mechanism is reminiscent of the adverse effects of high-powered incentives stressed in the literature on multi-tasking (e.g. Holmstrom and Milgrom (1991)). Again, the model places this mechanism within a ‘general equilibrium’ context because incentives are only high-powered because of the possibility of interaction with others.

The conclusion that competition for peers is detrimental is in direct contrast to the positive conclusions drawn from models in which a single investment plays both a surplus-generating role and a matching role. In such models, the desire to attract better partners provides an added impetus to invest (as it does in this model), but the fact that there is only one investment automatically implies that the under-investment problem is, at least in part, resolved.[48] Despite the dramatic difference in the conclusions reached, the models produce many observationally-equivalent outcomes. For instance, positive assortative matching (on wealth, parental investment, and type) is predicted by both models. However, the models are empirically distinguishable, at least in principle, because they differ on the variables that cause positive matching. For instance, models with a single investment predict that it is the child’s human capital that allows for a better match, whereas the model here predicts that it is wealth.

[48] In Bidner (2008a), the added impetus actually leads to the reverse problem: over-investment.

The basics of the model are laid out in Section 3.2. Essentially, families observe their type, acquire wealth and make parental investments, then compete for a desirable partner (peer) in the matching market. The equilibrium concept is defined following a description of the matching market. Various general results, including existence, uniqueness, and some welfare properties, are presented in Section 3.3. Following this, two illustrations are presented. The first, in Section 3.4, is quite simple and is designed to demonstrate i) how to calculate equilibria, and ii) some strong welfare dominance properties. The second illustration, in Section 3.5, is more detailed, and demonstrates how to derive equilibria when simple closed-form solutions are not available. Furthermore, the illustration provides the background for an extension in Section 3.6. The extension examines a situation in which both wealth and parental investment are observed (with noise) in the matching market. I demonstrate that many of the results are robust in this dimension, and that additional insight is obtained as i) outcomes depend on the distribution of types, and ii) special cases are obtained as the different noise levels are manipulated.

Although the model is motivated by peer/neighborhood effects, I believe the mechanisms highlighted by the model are applicable to a variety of situations that share the essential features, e.g. the analysis of labour and marriage markets.
3.2  Model

3.2.1  Fundamentals

A family consists of one adult and one child, and is indexed by $i \in [0, 1]$. Each adult is endowed with an ability, $\theta_i$, where $\theta_i$ is continuously distributed on $\Theta \equiv [\underline{\theta}, \bar{\theta}]$ according to $\Psi$, which is assumed to have a positive and bounded density $\psi \in (0, \infty)$ on $\Theta$.

Adults have preferences defined over two outcomes: consumption and the human capital of their child. Consumption is financed by wealth, and wealth is determined by an investment that the adult makes in their productivity. If an adult makes $x$ units of investment in their productivity, then this allows them to consume an amount that produces utility according to $f(x)$, where $f$ is a twice differentiable function with $f_x > 0$, $f_{xx} < 0$, and $\lim_{x \to 0} f_x(x) = \infty$.

The process of earning income and consuming does not involve any form of interaction with other families. In contrast, the development of a child’s human capital is, in part, a social phenomenon. In particular, child i not only benefits from parental investments made by adult i, but also benefits from parental investments made by the parents of their peers. To focus ideas, suppose that children socialize in pairs.[49] If adult i makes $y$ units of parental investment and their child socializes with a child whose parents make $y'$ units of parental investment, then the human capital of child i is given by $h(y, y')$. The function $h$ is twice differentiable with $h_y, h_{y'} > 0$ and $h_{yy} \le 0$. I assume that interaction is weakly complementary in the sense that $h_{yy'} \ge 0$. That is, the marginal product of parental investment is non-decreasing in the level of parental investment made by their partner. This property of $h$ is important in matching problems because it determines the efficient matching pattern (Becker (1973)). Finally, I make the regularity assumption that $h_{yy}(y, y) + h_{yy'}(y, y) \le 0$. This says that the marginal product of parental investment is non-increasing when evaluated at a point in which both members of the match make the same investment. The assumption ensures that a well-defined social optimum exists.

[49] This assumption is made for simplicity. Allowing for any finite number of agents per group is not problematic if we assume that all an agent cares about is the average of the investments made by others in the group.

To fix ideas, it will be convenient to assume that $h$ belongs to the class of generalized CES functions:

$$h(y, y') = \left[ (1-\phi) \cdot q(y)^{\rho} + \phi \cdot q(y')^{\rho} \right]^{1/\rho},$$

where $\rho \in (-\infty, 1]$, $\phi \in [0, 1]$, and $q$ is an increasing, twice continuously differentiable, concave function.[50]

[50] Note that $q$ being $C^2$ implies that both $q'$ and $q''$ are continuous functions, and the monotonicity of $q$ implies that there is no $\hat{y} < \infty$ such that $\lim_{y \to \hat{y}} q'(y) = 0$. Together, these imply that $-q''(y)/[q'(y)]^2$ is continuous on $(0, \infty)$.

Even when $q$ is chosen to be the identity function, $q(z) = z$, the CES specification is flexible enough to capture linear ($\rho = 1$), Cobb-Douglas ($\rho \to 0$), and Leontief ($\rho \to -\infty$) specifications. The parameter $\phi$ captures the degree to which a child’s human capital is sensitive to the parental investment embodied in their peers.

An adult’s ability, $\theta$, determines the costs incurred in making both types of investment. In particular, if an adult of ability $\theta$ chooses the investment bundle $(x, y)$, then their total investment is $T = x + y$, which has an associated cost of $c(T, \theta)$, where $c$ is a twice continuously differentiable function for which i) the marginal cost is positive, except at zero: $c_T(0, \theta) = 0$ and $c_T(T, \theta) > 0$ for $T > 0$, ii) the marginal cost is strictly increasing: $c_{TT} > 0$, and most importantly iii) the marginal cost is decreasing in ability: $c_{T\theta}(\cdot) < 0$.

The assumption that investment costs depend on the total investment, and not the composition, can be motivated by interpreting $x$ as expenditures on consumption, and $y$ as expenditures on a child’s human capital. These expenditures are financed from wealth, which is acquired from labor income. A parent’s labor income depends on their labor supply, as well as their wage. The wage equals their marginal productivity, which is their ability. Adults are endowed with a unit of time that is divided between work and leisure. By allocating $t$ units of time to working, the adult obtains a leisure payoff of $\ell(1 - t)$, and can allocate $T = t\theta$ units to the two types of expenditures. Thus, if we assume that $\ell(1) = 0$, we can interpret the cost of investment as the opportunity cost of leisure: $c(T, \theta) = -\ell(1 - (T/\theta))$. Despite this interpretation, the functional forms adopted in the illustrations will be chosen for their analytical simplicity.

Finally, in order to analyze the effect of altruism, I assume that the adults’ objective function incorporates a weighted sum of consumption and child human capital. In particular, if an adult chooses the investment bundle $(x, y)$ and is matched with a family that chooses an investment bundle $(\cdot, y')$, then the adult’s total payoff is:

$$V(x, y, y', \theta) \equiv (1-\alpha) \cdot f(x) + \alpha \cdot h(y, y') - c(x + y, \theta), \tag{3.1}$$

where $\alpha \in (0, 1]$ parameterizes altruism. To ensure that parents choose a positive amount to invest in their child’s human capital (that is, to make the model interesting) I assume that $\lim_{y \to 0} V_y(x, y, y', \theta) > 0$ for all values of $(x, y')$. This is automatically satisfied when either $\rho < 1$ or $\alpha = 1$. In all other cases, the assumption is satisfied if $\lim_{y \to 0} q'(y) = \infty$.

3.2.2  Structure

Each adult clearly has an interest in who their child socializes with. The mechanism through which a child is assigned a peer is intended to capture the feature that parents are somewhat able to influence the quality of their match through the choice of investments. In particular, the model will unfold in two stages as follows:

1. Adults observe their ability, $\theta$, and choose their investment bundle $(x, y)$.
2. Families enter a matching market in which they match with another family.

The matching market is modeled in a somewhat reduced form manner, and is intended to capture the salient features of a frictionless matching game with a large, even, number of agents. The key element is that only agents’ wealth, $x$, is observed in the matching market (this is relaxed in Section 3.6 below). It is convenient to assume that agents are able to ‘hide’ any amount of their wealth at some arbitrarily small, but positive, cost. We think of each agent, having observed the wealth of all agents, proposing to, and receiving propositions from, other agents. Any agent that goes unmatched gets a payoff equivalent to being matched with a family that makes zero investment in their child’s human capital. Thus, all agents prefer being matched to remaining unmatched. The matching market is said to be in equilibrium once all further mutually agreeable propositions are exhausted. I impose certain conditions on the matching market that are intended to capture the essential properties of this equilibrium allocation as the number of agents becomes infinitely large. This is discussed in more detail in the following section.

3.2.3  Equilibrium

The economy is in equilibrium when i) agents make their investments optimally, given a conjecture about the matching market, and ii) equilibrium in the matching market, given the pattern of investments that are made, does not contradict the agents’ initial conjecture. I consider two classes of equilibria: pooling and separating. Rather than introducing an all-encompassing definition of equilibrium at this point, I offer specialized definitions for each of these classes in their respective sections.

Pooling Equilibrium

In a pooling equilibrium, all families have the same observed wealth, say $x^P$. Since all families appear identical in the matching market, matching must effectively be random.
Furthermore, any unilateral deviation from a wealth of $x^P$ will not change a family’s matching prospects since they will still be matched with some family (and all families appear identical). This implies that no family can be hiding wealth in a pooling equilibrium, since hiding wealth entails a small cost. Given that other families invest according to $(x^P, y^P(\theta))$, each agent faces the following optimization problem:

$$\max_{x, y} \left\{ (1-\alpha) \cdot f(x) + \alpha \cdot H(y) - c(x + y, \theta) \right\}, \tag{3.2}$$

where

$$H(y) \equiv \int_{\Theta} h(y, y^P(z)) \, d\Psi(z).$$

The profile $(x^P, y^P(\theta))$ is a pooling equilibrium if the solution to (3.2) is $\{x^P, y^P(\theta)\}$, for all $\theta \in \Theta$. Importantly, the optimal value of $x$ must be $x^P$ for all families. For this reason, pooling equilibria will often fail to exist.

Proposition 5. A pooling equilibrium does not exist if there is imperfect altruism ($\alpha < 1$).

In light of this, one must take care when commenting on the trade-off emphasized in the literature: that signaling is wasteful but facilitates efficient matching patterns. In Hoppe, Moldovanu, and Sela (2005), this trade-off is motivated by making a comparison across equilibria. However, once the investment yields a private benefit in addition to any signaling aspect, the equilibrium without investment fails to exist. Of course, one could always motivate the trade-off by comparing equilibrium outcomes to a benchmark in which the ‘signal’ is also hidden. I analyze a model in which both types of investment are observed with noise in Section 3.6 below. One last point to note is that, since the marginal benefit to parental investment is non-decreasing in the investment of others, pooling equilibria will generally not be unique (the parental investments, not wealth, will differ).

Separating Equilibrium

Separating equilibria have the property that the parental investment made by each family is perfectly revealed by their observed wealth level.
For this to occur, families must have no incentive to ‘hide’ part of their wealth from the matching market. This in turn requires that families of different types must optimally choose different wealth levels. I look for equilibria in which wealth is a differentiable and strictly monotone function of type.

The pair of functions, $\{x(\theta), y(\theta)\}$, are candidate equilibrium investment functions if $x(\cdot)$ is a differentiable, strictly monotone function. If agents invested according to these candidate functions, then observing a family with a wealth of $z$ in the matching market reveals that the family is of type $x^{-1}(z)$, and therefore has made a parental investment of $\mu(z) \equiv y(x^{-1}(z))$. Since all families prefer to match with those that have higher values of $\mu(z)$, it follows that the only stable matching is positive assortative on $\mu(z)$.

Let $X$ be the set of investments that arise in equilibrium (the image of $x(\theta)$). Agents recognize that if they enter the matching market with a wealth of $z \in X$, then they will be matched with a family that has made a parental investment of $\mu(z)$. On the other hand, if they enter with a wealth of $z \notin X$ then, since they match with some family, they get matched with a family that has made some parental investment in $\{\mu(z') \mid z' \in X\}$. In light of this, we can interpret $\mu$ as a matching market return function.

Separating behaviour is only consistent with $\mu$ being non-decreasing (at least on $X$). Suppose to the contrary that for some pair of observed equilibrium wealth levels, $(x, x')$, where $x < x'$, we had $\mu(x) > \mu(x')$. Those with an observed wealth of $x'$ would be better off hiding part of their wealth and displaying a wealth of $x$ to the matching market (since the associated cost is arbitrarily small). This would then contradict the fact that $x'$ is a wealth level observed in equilibrium.
To summarize, we say that the function $\mu$ is consistent with the candidate investment functions $\{x(\theta), y(\theta)\}$ if i) $\mu$ is non-decreasing on $X$, and ii)

$$\mu(z) \begin{cases} = y(x^{-1}(z)) & \text{if } z \in X \\ \in \{ y(x^{-1}(z')) \mid z' \in X \} & \text{otherwise.} \end{cases} \tag{3.3}$$

The first of these cases can also be expressed as:

$$\mu(x(\theta)) = y(\theta), \quad \text{for all } \theta \in \Theta. \tag{3.4}$$

Taking the non-decreasing return function as given, families invest optimally. In particular, given $\mu$, investments are optimal if

$$\{x(\theta), y(\theta)\} \in \arg\max_{x, y} \{ V(x, y, \mu(x), \theta) \} \tag{3.5}$$

for all $\theta \in \Theta$.

Putting this all together, a separating equilibrium is defined as follows.

Definition 3. A separating equilibrium is a pair of candidate investment functions, $\{x(\cdot), y(\cdot)\}$, and a matching market return function, $\mu(\cdot)$, such that:

1. $\{x(\cdot), y(\cdot)\}$ are optimal given $\mu(\cdot)$, and
2. $\mu(\cdot)$ is consistent with $\{x(\cdot), y(\cdot)\}$.

One central property of a separating equilibrium is that families are perfectly segregated along the parental investment dimension. This, by itself, does not imply that families are also perfectly segregated along the wealth dimension. That is, if parental investment were non-monotonic in type, then two different types would make the same parental investment (and therefore could be matched together), yet have different wealth levels. This possibility is ruled out by the following.

Result 3. Parental investment is a weakly increasing, differentiable function of type in a separating equilibrium.

This comes from the observations that i) $\mu$ being non-decreasing implies that if wealth is increasing (decreasing) in type then parental investment is weakly increasing (decreasing) in type, and ii) total investment is increasing in type (see Appendix). Differentiability follows from the assumed differentiability of $x(\cdot)$ and an inspection of the equation that implicitly defines optimal parental investments.

Since both $x(\theta)$ and $y(\theta)$ are differentiable in a separating equilibrium, condition (3.4) implies that so too is $\mu$.
This fact is used in deriving the equilibrium return function, but before doing so, it is useful at this point to establish a set of benchmark investment levels since they will also feature in the derivation.

Some Benchmarks

To begin, suppose that each family were exogenously matched with a family of the same type. The optimal investments in this setting are called the Nash investments (since families take their partner as given). Some insight into the role played by ‘the promise of social mobility’ can be obtained by comparing the Nash investments to the equilibrium investments. The Nash investments, $x^N(\theta)$ and $y^N(\theta)$, satisfy the following:

$$\{x^N(\theta), y^N(\theta)\} \in \arg\max_{x, y} V(x, y, y^N(\theta); \theta). \tag{3.6}$$

When investing in this way, families do not take into account that their parental investment benefits their partner. To formalize this, we can define the Efficient investments, $x^*(\theta)$ and $y^*(\theta)$, as those that satisfy:

$$\{x^*(\theta), y^*(\theta)\} \in \arg\max_{x, y} V(x, y, y; \theta). \tag{3.7}$$

Given the positive externality, the following result is not particularly surprising.

Result 4. Nash investments are not efficient. In particular, $y^N(\theta) < y^*(\theta)$ and $x^N(\theta) \ge x^*(\theta)$.

Nash wealth is (weakly) greater than the efficient wealth level since the Nash parental investment is lower, which lowers the marginal cost of making the wealth-enhancing investment. The reason that the inequality is weak is that both wealth levels may be zero (when $\alpha = 1$).

Result 5. Both the Nash and Efficient investments are independent of $\rho$.

This follows from the observation that for any $z > 0$, we have $h(z, z) = q(z)$, $h_y(z, z) = (1-\phi) \cdot q'(z)$, and $h_{y'}(z, z) = \phi \cdot q'(z)$ for all values of $\rho$. Although this is a special feature of the generalized CES form imposed on $h$, it will be useful below.

Deriving the Equilibrium

The equilibrium is derived in two steps. First, the matching market return function is derived.
Once we verify i) that this function is increasing on $X$, and ii) that appropriate values for off-equilibrium investments can be found, the second step involves using the first-order conditions to derive the optimal parental investment associated with a given wealth level. These two curves are plotted in the same space, and their intersection characterizes equilibrium investments.

To begin, consider the problem of deriving the matching market return function. Parents choose their investment bundle, $(x, y)$, taking $\mu$ as given. Since wealth is a strictly increasing function of type, almost all families will optimally make interior wealth investments. This, together with the observation that $\mu$ is differentiable in equilibrium, implies that optimal investments are characterized by the first-order conditions:

$$(1-\alpha) \cdot f_x(x(\theta)) + \alpha \cdot h_{y'}(y(\theta), y(\theta)) \cdot \mu_x(x(\theta)) = c_T(x(\theta) + y(\theta), \theta) \tag{3.8}$$

$$\alpha \cdot h_y(y(\theta), y(\theta)) = c_T(x(\theta) + y(\theta), \theta) \tag{3.9}$$

Equating the left-hand sides of these, and using (3.4), we get the following differential equation:

$$\mu_x(x) = \frac{\alpha \cdot h_y(\mu, \mu) - (1-\alpha) \cdot f_x(x)}{\alpha \cdot h_{y'}(\mu, \mu)} \equiv \Gamma(x, \mu) \tag{3.10}$$

In order to pin down $\mu$, we need an initial condition. To obtain this, we turn our attention to the fact that we need suitable off-equilibrium values for $\mu$. All we need is that for all $z \notin X$, $\mu(z) = y(\theta)$ for some $\theta$. Given that $y$ is non-decreasing in $\theta$ we can safely set $\mu(z) = y(\underline{\theta})$ for all $z \notin X$ (if the objective is to ensure that no family has an incentive to deviate to an off-equilibrium investment). Now consider the problem faced by families of the lowest type. In equilibrium they are always matched with a family that invests $y(\underline{\theta})$ and, by the above argument, their match can be no worse if they deviate to any other wealth level. Therefore, their equilibrium investments coincide with those that would be made had they taken their partner as fixed. This gives us the initial condition that the lowest types make their Nash investments: $\{\mu_0, x_0\} = \{y^N(\underline{\theta}), x^N(\underline{\theta})\}$.
This, combined with (3.10), defines an initial values problem.

The solution to the initial values problem represents a candidate equilibrium return function, which we need to verify is strictly increasing on $X$. To do this, we sketch out the direction field associated with $\Gamma$. That is, for any given point in $(\mu, x)$ space, we know that the slope of $\mu(x)$ equals $\Gamma(x, \mu)$. The essential features of this process are illustrated in Figure 3.1.

To begin, consider the case in which $\alpha < 1$, and consider the set of points such that $\Gamma = 0$. Such points are described by the implicit function, $N(x)$, which satisfies:

$$(1-\alpha) \cdot f_x(x) = \alpha \cdot h_y(N(x), N(x)). \tag{3.11}$$

It is straightforward to verify that $N(x)$ is a strictly increasing function, as depicted. At points to the left of $N(x)$, we have $\Gamma < 0$ and at points to the right of $N(x)$, we have $\Gamma > 0$. Thus, any solution to the initial values problem will be downward-sloping to the left of $N(x)$, flat at $N(x)$, and upward sloping to the right of $N(x)$, as depicted. The particular solution depends on the initial condition, but note that the initial condition lies on $N(x)$ since (3.11) is a consequence of the first-order conditions associated with the Nash investments. The blue line depicts a solution to the initial values problem, and it is straightforward to see that it must be strictly increasing on $x \ge x_0$.[51] Values of $x$ below $x_0$ do not arise in equilibrium, so $\mu$ need not be governed by $\Gamma$ in this region. The dashed blue line is one possibility for what $\mu$ looks like below $x_0$: it captures a situation in which agents realize that cutting their investment below $x_0$ means that they must match with the least desirable family (who makes a parental investment of $\mu_0$).

When $\alpha < 1$, note that the initial condition will be strictly interior. Since $\Gamma(x, \mu)$ is continuously differentiable at all $x > 0$ and $\mu > 0$, the standard existence and uniqueness theorem for ordinary differential equations can be applied to demonstrate that a solution exists, and is unique.
If $\alpha = 1$, then $\Gamma > 0$ on the entire space so that we can be sure any solution is strictly increasing, and the initial condition lies on the $\mu$ axis. Existence and uniqueness of the solution to the initial values problem is immediate when $\alpha = 1$, since we can simply integrate to get: $\mu(x) = [(1-\phi) \cdot \phi^{-1}] \cdot x + y^N(\underline{\theta})$.

[Figure 3.1: Properties of µ: A Sketch of the Direction Field associated with Γ(x, µ). Horizontal axis: Wealth (x); vertical axis: Returns (µ). The curve N(x) separates the region in which Γ(x, µ) < 0 from the region in which Γ(x, µ) > 0; the initial condition is the point (x0, µ0) on N(x).]

Result 6. The initial values problem has a solution in which $\mu$ is non-decreasing on $X$, and this solution is unique.

[51] Both $\Gamma(x, \mu)$ and $\Gamma_\mu(x, \mu)$ are continuous on $\mathbb{R}^2_{++}$, which is sufficient to demonstrate that the Lipschitz condition on $\Gamma$ is satisfied. As such, each interior point in the direction field ($\mathbb{R}^2_{++}$) will have at most one solution passing through it. This observation rules out the possibility that a solution that passes below the $N(x)$ curve can at some later point re-cross the $N(x)$ curve (since curves crossing $N(x)$ must do so ‘from above’ owing to the fact that the slope of such curves is negative to the left of $N(x)$ and positive to the right of $N(x)$).

Notice that $\mu$ is completely independent of i) the distribution of types, and ii) any cost parameters. Furthermore, note that once the generalized CES form is applied, we have:

$$\Gamma(x, \mu) = \frac{1-\phi}{\phi} - \frac{1}{\phi} \cdot \frac{1-\alpha}{\alpha} \cdot \frac{f'(x)}{q'(\mu)}, \tag{3.12}$$

which is independent of $\rho$. Since the Nash investments are also independent of $\rho$, we have the following.

Result 7. The solution to the initial values problem is independent of $\rho$.

The result relies on the CES form; however, it indicates that ‘complementarity’ (as captured by $\rho$) is not of first-order significance.[52] What is important is the degree of spillovers, $\phi$.

Once we have the return function, the equilibrium investment functions can be derived from another implication of the first-order conditions.
In particular, if the equilibrium wealth for a family of type $\theta$ is $x$, then their optimal parental investment is given by $\hat{y}(x, \theta)$, where $\hat{y}(x, \theta)$ satisfies:

\[ \alpha \cdot h_y(\hat{y}(x, \theta), \hat{y}(x, \theta)) = c_T(x + \hat{y}(x, \theta), \theta). \]

The function $\hat{y}(x, \theta)$ is decreasing in $x$ and increasing in $\theta$. Figure 3.2 depicts $\hat{y}(x, \theta)$ in $(y, x)$ space for three different values of $\theta$. The equilibrium return function derived above is also superimposed on this space, since consistency of beliefs, condition (3.4), requires that $\mu(x(\theta)) = y(\theta)$. The equilibrium investments are therefore given by the points of intersection of $\hat{y}(x, \theta)$ and $\mu(x)$, as depicted. The fact that $\hat{y}(x, \theta)$ never starts below the Nash parental investment (and starts strictly above it when $\alpha < 1$), and equals zero for some finite wealth level, implies that the curves cross exactly once.

[Figure 3.2: Derivation of Equilibrium Investments. Horizontal axis: Wealth ($x$); vertical axis: Returns ($\mu$). The figure shows the curves $\hat{y}(x, \theta)$, $\hat{y}(x, \theta')$ and $\hat{y}(x, \theta'')$, the return function $\mu(x)$, and the resulting equilibrium points $(x(\theta), y(\theta))$, $(x(\theta'), y(\theta'))$ and $(x(\theta''), y(\theta''))$.]

The final step in establishing the existence and uniqueness of a separating equilibrium is to show that the first-order necessary conditions are also sufficient. To do this, I show that the objective function is globally concave when evaluated using a candidate return function (see the Appendix). To conclude this section, we therefore have the following.

Proposition 6. A separating equilibrium exists, and it is (essentially) unique. Moreover, the separating equilibrium is independent of $\rho$.

The qualification ‘essentially’ reflects the fact that one could specify alternative off-equilibrium values for $\mu$ that would not disrupt investment behaviour.

Footnote 52: The general condition required of $h$ is as follows. Suppose that $h$ depends on parameters, $\xi$, so that we can write $h = h(y, y'; \xi)$. For the separating equilibrium to be independent of $\xi$, we need $h_y(y, y; \xi)$ and $h_{y'}(y, y; \xi)$ to both be independent of $\xi$ for all $y$.
3.3 Analysis

3.3.1 Efficiency

A theme common to papers that study this kind of environment in full-information settings (e.g. Peters and Siow (2002) and Cole, Mailath, and Postlewaite (2001)) is that the competition for partners helps remedy the inefficiency surrounding the externality associated with parental investment. This is not the case here.

Proposition 7. Investments are inefficient in the separating equilibrium. In particular, wealth is weakly greater and parental investments are weakly lower than the corresponding Nash investments.

A geometric proof is offered in Figure 3.3. In $(x, y)$ space, the figure shows the locus of Nash investments, $N(x)$, and equilibrium investments, $y(x)$, in the case of $\alpha < 1$. The locus of Nash investments is implicitly defined by $(1 - \alpha) \cdot f_x(x) = \alpha \cdot h_y(N(x), N(x))$, as described above. The equilibrium investments are given by $y(x) = \mu(x)$, as derived above. In addition to these, the first-order condition that describes the optimal choice of $y$ given any particular $x$ is plotted. This is given by $\alpha \cdot h_y(\hat{y}(x, \theta), \hat{y}(x, \theta)) = c_T(x + \hat{y}(x, \theta), \theta)$. The figure clearly shows that the equilibrium investments (point B) are ‘southeast’ of the Nash investments (point A). When $\alpha = 1$, the locus of Nash investments coincides with the $y$ axis. The $\hat{y}(x, \theta)$ curve is still well-defined, and the conclusion remains.

[Figure 3.3: Geometric Proof of Proposition 7. Horizontal axis: Wealth ($x$); vertical axis: Parental Investment ($y$). The figure shows the curves $N(x)$, $y(x)$ and $\hat{y}(x, \theta)$, with the Nash investments at point A and the equilibrium investments at point B.]

3.3.2 Total Investment

If, in a separating equilibrium, incentives to invest in wealth are too great, and incentives to make parental investments too small, then what about total investment? Let $T^E(\theta) \equiv x(\theta) + y(\theta)$ denote the total investment made by a type $\theta$ family in a separating equilibrium. Let $T^*(\theta)$ and $T^N(\theta)$ represent the total investment in the efficient and Nash cases, respectively.

Proposition 8. For all $\theta \in \Theta$, $T^N(\theta) \leq \min\{T^E(\theta), T^*(\theta)\}$. If $q$ is linear, then $T^N(\theta) = T^E(\theta) < T^*(\theta)$.
The second illustration described below assumes a form in which $q$ is strictly concave, and derives an equilibrium in which $T^N < T^E = T^*$. Thus, we can be sure that the pattern of inequalities identified in the result does not hold for all $q$.

3.3.3 An Alternative Benchmark: Random Matching

Equilibrium outcomes are compared to the Nash outcomes, which can be interpreted as the investments that would arise if agents matched on type (rather than wealth). Another reasonable benchmark may be the investments that arise if agents were randomly matched. This corresponds to a setting in which all of the agents’ characteristics are hidden. In certain cases, the two benchmarks display the same (aggregate) qualities.

Proposition 9. If $\rho = 1$, then investments in the ‘random matching’ benchmark are identical to the investments in the Nash benchmark. Average welfare is the same across the benchmarks, although the lower types prefer random matching, and higher types prefer Nash.

This follows simply from the observation that $h$ is additively separable if $\rho = 1$, which in turn implies that the Nash investment is independent of the type of partner that a family is matched with (and therefore of any mixture, including random matching). The latter part of the proposition simply reflects the fact that lower types get a better quality partner on average under random matching.

To make some progress with more general results, assume $\alpha = 1$ for clarity. Equilibrium with random matching requires that each type equalizes the marginal cost of parental investment with the expected marginal return [Footnote 53]. Suppose that all families made the Nash investments. For lower types, the expected marginal return is higher than the marginal return with perfect segregation (since they are matched with higher types on average), whereas the opposite is true for higher types. Thus, lower types have incentives to invest more than their Nash level, and higher types have incentives to invest less.
To supplement this intuition, one can show that parental investment cannot be decreasing in $\rho$ for all types [Footnote 54]. Furthermore, optimal investments with random matching will be sensitive to the distribution of types. Explicit solutions are difficult to obtain when $\rho < 1$; however, a Cobb-Douglas ($\rho \to 0$) example is solved in the Appendix.

Footnote 53: For $\rho < 1$, the marginal return to parental investment will depend on who the family happens to be matched with.

Footnote 54: This is shown by treating investments as a function of $\rho$, and totally differentiating the first-order condition.

3.4 Simple Illustration

The purpose of this section is to provide a simple demonstration of how to calculate and analyze equilibria. I assume perfect altruism, which implies that both pooling and separating equilibria exist. Comparing welfare across these equilibria will be a central concern.

Assume perfect altruism ($\alpha = 1$), and let the human capital and cost functions take the following forms:

\[ h(y, y') = (1 - \phi) \cdot y + \phi \cdot y', \]
\[ c(x + y, \theta) = \frac{1}{2\theta} (x + y)^2. \]

Under this specification, parents derive no intrinsic value from wealth, and parental investments are substitutes, where $\phi > 0$ captures the degree to which there are spillovers in parental investments. In terms of the generalized CES formulation, $q$ is the identity function and $\rho = 1$.

3.4.1 Efficient Investments

Since $h$ is separable, the definition of the efficient investments is independent of the actual matching pattern. The efficient parental investment (wealth is efficiently zero) satisfies:

\[ y^*(\theta) = \arg\max_y \; y - c(y, \theta), \]

implying that $y^*(\theta) = \theta$. If a family matches with a family of type $\theta'$, then the family’s welfare is:

\[ W^*(\theta, \theta') = \frac{1}{2} \cdot \theta + \phi \cdot (\theta' - \theta). \]

Thus, average welfare is:

\[ W^* \equiv \int_\Theta W^*(\theta) \, d\Psi(\theta) = \frac{1}{2} \cdot E[\theta], \]

since $E[E[\theta' \mid \theta]] = E[\theta]$ for any matching pattern.

3.4.2 Pooling Equilibrium

There is a pooling equilibrium under this specification, since having no wealth is a best response to all other families having no wealth.
In this equilibrium matching is random because all families appear equally attractive. An implication of this is that each family recognizes that increasing their wealth above zero will not affect the expected parental investment made by their match. The objective facing families is:

\[ \max_{x, y} \; (1 - \phi) \cdot y + \phi \cdot E[y'] - c(x + y, \theta). \]

There is clearly no incentive to acquire wealth, so that $x^P(\theta) = 0$. The optimal level of parental investment is $y^P(\theta) = (1 - \phi) \cdot \theta$. If a family of type $\theta$ ends up being matched with a family of type $\theta'$, then the family’s payoff is:

\[ W^P(\theta, \theta') = \frac{(1 - \phi)^2}{2} \cdot \theta + \phi \cdot (1 - \phi) \cdot \theta', \]

and the expected welfare of a family of type $\theta$ in the pooling equilibrium is therefore:

\[ W^P(\theta) = \frac{(1 - \phi)^2}{2} \cdot \theta + \phi \cdot (1 - \phi) \cdot E[\theta], \]

since once again $E[E[\theta' \mid \theta]] = E[\theta]$. Average welfare is therefore:

\[ W^P \equiv \int_\Theta W^P(\theta) \, d\Psi(\theta) = (1 - \phi^2) \cdot \frac{1}{2} \cdot E[\theta]. \]

This is less than $W^*$ as expected, and is monotonically declining in $\phi$.

3.4.3 Separating Equilibrium

Given that all agents believe that a wealth of $x$ will lead to finding a match that has $\mu(x)$ units of parental investment, agent $i$’s objective function is:

\[ V(x, y, \mu(x), \theta) = (1 - \phi) \cdot y + \phi \cdot \mu(x) - c(x + y, \theta). \]

The first-order conditions are:

\[ \phi \cdot \mu_x(x(\theta)) = (1 - \phi) = \frac{y(\theta) + x(\theta)}{\theta}. \]

The first equality (i.e. equating the investments’ marginal returns) gives us a particularly simple differential equation:

\[ \mu_x(x) = \frac{1 - \phi}{\phi}. \]

Integrating both sides gives us the equilibrium return function:

\[ \mu(x) = \frac{1 - \phi}{\phi} \cdot x + y_0. \]

Since $x^N(\underline{\theta}) = x^P(\underline{\theta}) = 0$, the value of $y_0$ is given by $y^N(\underline{\theta}) = y^P(\underline{\theta}) = (1 - \phi) \cdot \underline{\theta}$, so that we have:

\[ \mu(x) = \frac{1 - \phi}{\phi} \cdot x + (1 - \phi) \cdot \underline{\theta}. \tag{3.13} \]

The second equation we need, $\hat{y}(x, \theta)$, is found by using the second equality in the first-order conditions:

\[ \hat{y}(x, \theta) + x = (1 - \phi) \cdot \theta. \tag{3.14} \]

Since consistency requires $\mu(x(\theta)) = y(\theta)$, we can determine the equilibrium investments by using (3.13) and (3.14):

\[ y^S(\theta) = (1 - \phi)^2 \cdot \theta + \phi \cdot (1 - \phi) \cdot \underline{\theta}, \]
\[ x^S(\theta) = \phi \cdot (1 - \phi) \cdot [\theta - \underline{\theta}]. \]
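The derivation above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the value of phi and the grid of types are illustrative assumptions, and the function names are hypothetical. It evaluates payoffs directly from the primitives $h$ and $c$, confirms that the separating investments lie on both (3.13) and (3.14), and compares each type's separating payoff with the payoff it would receive in the pooling equilibrium when matched with the lowest type.

```python
# Minimal sketch of the simple illustration (alpha = 1, rho = 1, q the identity).
# phi and the type grid are illustrative assumptions, not values from the text.

def cost(total, theta):
    """Cost of total investment T = x + y: c(T, theta) = T^2 / (2 * theta)."""
    return total ** 2 / (2.0 * theta)

def payoff(x, y, y_partner, theta, phi):
    """Payoff (1 - phi) * y + phi * y' - c(x + y, theta)."""
    return (1.0 - phi) * y + phi * y_partner - cost(x + y, theta)

def mu(x, theta_low, phi):
    """Equilibrium return function, equation (3.13)."""
    return (1.0 - phi) / phi * x + (1.0 - phi) * theta_low

def separating(theta, theta_low, phi):
    """Separating investments (x, y): the intersection of (3.13) and (3.14)."""
    x = phi * (1.0 - phi) * (theta - theta_low)
    y = (1.0 - phi) ** 2 * theta + phi * (1.0 - phi) * theta_low
    return x, y

def pooling(theta, phi):
    """Pooling investments (x, y): no wealth, y = (1 - phi) * theta."""
    return 0.0, (1.0 - phi) * theta

phi = 0.3
types = [1.0, 1.5, 2.0, 3.0]
theta_low = min(types)

rows = []
for theta in types:
    x_s, y_s = separating(theta, theta_low, phi)
    x_p, y_p = pooling(theta, phi)
    # Positive assortative matching: the partner makes the same investment.
    w_sep = payoff(x_s, y_s, y_s, theta, phi)
    # Least favourable pooling match: a partner of the lowest type.
    y_worst = pooling(theta_low, phi)[1]
    w_pool_worst = payoff(x_p, y_p, y_worst, theta, phi)
    rows.append((theta, x_s, y_s, w_sep, w_pool_worst))
    print(f"theta={theta:.2f}  x_S={x_s:.4f}  y_S={y_s:.4f}  "
          f"W_S={w_sep:.4f}  worst-case W_P={w_pool_worst:.4f}")
```

Under these assumed parameters, total investment is the same in both equilibria for every type, so any payoff difference comes from how that total is split between wealth and parental investment.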
The equilibrium welfare for a type $\theta$ family is:

\[ W^S(\theta, \theta') = W^S(\theta) = \frac{(1 - \phi)^2}{2} \cdot \theta + \phi \cdot (1 - \phi) \cdot \underline{\theta}, \]

since $\theta' = \theta$ in equilibrium (due to positive assortative matching). Average welfare is:

\[ W^S \equiv \int_\Theta W^S(\theta) \, d\Psi(\theta) = \frac{(1 - \phi)^2}{2} \cdot E[\theta] + \phi \cdot (1 - \phi) \cdot \underline{\theta}. \]

3.4.4 Results

A comparison of the expressions for average welfare reveals that $W^S \leq W^P \leq W^*$, where the inequalities hold with equality if and only if there are no spillovers ($\phi = 0$). The difference between welfare in either equilibrium and the first-best welfare grows monotonically in spillovers. These results could be anticipated given the general results discussed in the previous section; however, there are stronger welfare results generated in this setting. First, all agents find that the expected payoff is greater in the pooling equilibrium than in the separating equilibrium.

Result 8. The pooling equilibrium ‘strictly ex-ante Pareto dominates’ the separating equilibrium: $W^S(\theta) < W^P(\theta)$ for all $\theta \in \Theta$.

A brief inspection of the relevant expressions reveals that $W^P(\theta) - W^S(\theta) = \phi (1 - \phi) \cdot [E(\theta) - \underline{\theta}]$, which is clearly positive. There is an even stronger result than this. All agents get a greater payoff in the pooling equilibrium than in the separating equilibrium, regardless of which family they end up being matched with in the pooling equilibrium.

Result 9. The pooling equilibrium ‘ex-post Pareto dominates’ the separating equilibrium: $W^S(\theta) \leq \min_{\theta'} \{W^P(\theta, \theta')\}$ for all $\theta \in \Theta$.

One may conjecture that this result reflects the fact that although some agents get lower-quality matches, they have lower investment costs. This is the logic behind the analogous result in standard signaling models. This is not the case here, because it is straightforward to verify that total investment for a family of type $\theta$ equals $(1 - \phi) \cdot \theta$ in both the pooling and separating equilibria [Footnote 55]. Since investment costs are the same across equilibria, the following must apply.

Result 10. The human capital level that any given child realizes in the pooling equilibrium is never less than their human capital level in a separating equilibrium.

To verify this, note that if a child with parents of ability $\theta$ is matched with a family of ability $\theta'$ in a pooling equilibrium, then their human capital level is $(1 - \phi) \cdot [(1 - \phi) \cdot \theta + \phi \cdot \theta']$, whereas their human capital level in the separating equilibrium is $(1 - \phi) \cdot [(1 - \phi) \cdot \theta + \phi \cdot \underline{\theta}]$. The difference between these, $(1 - \phi) \phi \cdot [\theta' - \underline{\theta}]$, is clearly never negative.

To summarize, the pooling equilibrium welfare-dominates the separating equilibrium in a very strong sense (under the specification considered here). To paraphrase Result 9, the very lowest payoff that a family can receive in the pooling equilibrium is never lower than the best payoff that they can receive in the separating equilibrium. Unlike standard signaling models, this has nothing to do with different investment costs across the equilibria, but rather arises from the fact that the pooling equilibrium does not encourage a diversion of family resources away from parental investment.

Given this strong welfare dominance, it seems quite reasonable to conjecture that the vast majority of societies will be characterized by norms consistent with the pooling equilibrium. For instance, there would be no impetus to acquire wealth in order to secure a desirable environment for a child’s development, and there would be little segregation, at least along wealth lines. This is hardly an accurate description of most modern societies. These observations can be reconciled by the fact that the existence of pooling equilibria is much less robust than is the existence of separating equilibria. Indeed, if consumption is valued at all then the pooling equilibrium will not exist.
Intuitively, all families would ‘naturally’ have different wealth levels, reflecting differences in endowed abilities. Further, those with high wealth levels would also be the ones with high investments in human capital. Thus, it is relatively easy to identify the desirable families, but each family recognizes that in order to convince a desirable family to match with them, they must ‘masquerade’ as a high-wealth family. Any pooling behaviour therefore unravels. The following section analyzes a model with imperfect altruism, and therefore with no pooling equilibrium.

Footnote 55: This is largely a consequence of $y$ entering in a linear manner, since the first-order condition for parental investment (in both equilibria) is $c_T(T, \theta) = (1 - \phi)$, which automatically pins down $T$ for each $\theta$. With imperfect altruism ($\alpha < 1$), such linear specifications must be abandoned if we are to ensure that optimal parental investments are interior (for all possible types).

3.5 An Extended Illustration

The point of this section is to derive equilibrium variables for a more detailed economy. The functional forms are chosen so that the extension considered in the next section is more readily analyzed. I assume imperfect altruism, which implies that pooling equilibria fail to exist. This allows me to focus on the separating equilibrium, which in turn allows me to analyze the effect of spillovers, altruism, and the distribution of types. The assumed functional forms are as follows:

\[ f(x) = \ln x, \]
\[ h(y, y') = (1 - \phi) \cdot \ln y + \phi \cdot \ln y', \]
\[ c(x + y, \theta) = \frac{1}{\theta} \cdot \frac{(x + y)^{1 + \eta}}{1 + \eta}, \]

where $\alpha \in (0, 1)$ is the altruism parameter, $\phi \in [0, 1]$ is the spillover parameter, and $\eta \geq 0$ describes the degree to which there are ‘crowding out’ effects. The functional form for $h$ is obtained from the generalized form by letting $q$ be the natural log and setting $\rho = 1$. To begin, the symmetric Nash investments are given by:

\[ x^N(\theta) = \frac{1 - \alpha}{[1 - \alpha\phi]^{\frac{\eta}{1 + \eta}}} \cdot \theta^{\frac{1}{1 + \eta}}, \]
\[ y^N(\theta) = \frac{\alpha (1 - \phi)}{[1 - \alpha\phi]^{\frac{\eta}{1 + \eta}}} \cdot \theta^{\frac{1}{1 + \eta}}. \]
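As a quick check on these closed forms, the sketch below evaluates the Nash first-order conditions implied by the stated functional forms: the marginal return to wealth, $(1 - \alpha)/x$, and the private marginal return to parental investment, $\alpha(1 - \phi)/y$, should both equal the marginal cost $(x + y)^\eta / \theta$. The parameter values and function names are illustrative assumptions.

```python
# Sketch: verify the symmetric Nash investments against their first-order
# conditions. The parameter values are illustrative assumptions.

def nash_investments(theta, alpha, phi, eta):
    """Closed-form symmetric Nash investments (x_N, y_N)."""
    scale = (1.0 - alpha * phi) ** (eta / (1.0 + eta))
    root = theta ** (1.0 / (1.0 + eta))
    x_n = (1.0 - alpha) / scale * root
    y_n = alpha * (1.0 - phi) / scale * root
    return x_n, y_n

def marginal_cost(total, theta, eta):
    """Derivative of c(T, theta) = T^(1 + eta) / (theta * (1 + eta)) in T."""
    return total ** eta / theta

alpha, phi, eta, theta = 0.6, 0.3, 1.0, 2.0
x_n, y_n = nash_investments(theta, alpha, phi, eta)
mc = marginal_cost(x_n + y_n, theta, eta)

foc_wealth_gap = (1.0 - alpha) / x_n - mc             # should be ~0
foc_parental_gap = alpha * (1.0 - phi) / y_n - mc     # should be ~0
total_gap = (x_n + y_n) - ((1.0 - alpha * phi) * theta) ** (1.0 / (1.0 + eta))

print(f"x_N={x_n:.4f}  y_N={y_n:.4f}  y_N/x_N={y_n / x_n:.4f}")
print(f"FOC gaps: {foc_wealth_gap:.2e}, {foc_parental_gap:.2e}; "
      f"total gap: {total_gap:.2e}")
```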
Note that i) total investment equals $[(1 - \alpha\phi)\theta]^{1/(1 + \eta)}$, which is decreasing in both altruism and spillovers; however, ii) the relative amount allocated to parental investment, $y^N(\theta) / x^N(\theta)$, is equal to $\alpha(1 - \phi)/(1 - \alpha)$, which is increasing in altruism and decreasing in spillovers.

Since the symmetric Nash investments coincide with the investments that would be made if matching were random, the fact that $x^N(\theta)$ is strictly increasing in $\theta$ implies that there cannot be a pooling equilibrium. Out of interest, the efficient investments are given by:

\[ x^*(\theta) = (1 - \alpha) \cdot \theta^{\frac{1}{1 + \eta}}, \]
\[ y^*(\theta) = \alpha \cdot \theta^{\frac{1}{1 + \eta}}. \]

In order to calculate the separating equilibrium, we begin by deriving the initial values problem:

\[ \mu'(x) = \frac{1}{\phi} \left[ (1 - \phi) - \frac{1 - \alpha}{\alpha} \cdot \frac{\mu(x)}{x} \right], \]
\[ \mu\left(x^N(\underline{\theta})\right) = y^N(\underline{\theta}). \]

This is a linear ordinary differential equation, which has the following solution:

\[ \mu(x) = Z \cdot x^{-\frac{1 - \alpha}{\alpha\phi}} + \delta \cdot x, \tag{3.15} \]

where

\[ \delta \equiv \frac{\alpha (1 - \phi)}{1 - \alpha (1 - \phi)} \]

and $Z$ adjusts so that the initial condition is satisfied. In other words, $Z$ satisfies:

\[ y^N(\underline{\theta}) = Z \cdot x^N(\underline{\theta})^{-\frac{1 - \alpha}{\alpha\phi}} + \delta \cdot x^N(\underline{\theta}). \]

Re-arranging to get $Z$, then substituting into (3.15), we get:

\[ \mu(x) = \left[ \frac{x^N(\underline{\theta})}{x} \right]^{\frac{1 - \alpha}{\alpha\phi}} \left[ y^N(\underline{\theta}) - \delta \cdot x^N(\underline{\theta}) \right] + \delta \cdot x. \tag{3.16} \]

This gives us the equilibrium matching return function. The other equation we need, $\hat{y}(x, \theta)$, is given by the first-order condition for the optimal choice of $y$:

\[ \frac{\alpha \cdot (1 - \phi)}{\hat{y}(x, \theta)} = \frac{1}{\theta} \left[ x + \hat{y}(x, \theta) \right]^{\eta}. \tag{3.17} \]

Using the fact that $y(\theta) = \mu(x(\theta))$, equation (3.16) describes parental investment as an increasing function of $x$ (for $x \geq x^N(\underline{\theta})$), whereas (3.17) describes $y$ as a decreasing function of $x$. When plotted in $(x, y)$ space, the two curves intersect exactly once, in a manner similar to that depicted in Figure 3.2 above. The effect of altruism and spillovers on investment can be determined by manipulating the two curves.
However, explicit solutions are readily computed for the case in which $\underline{\theta} = 0$:

\[ \mu(x) = \frac{\alpha (1 - \phi)}{1 - \alpha (1 - \phi)} \cdot x, \]
\[ x(\theta) = [1 - \alpha (1 - \phi)] \cdot \theta^{\frac{1}{1 + \eta}}, \]
\[ y(\theta) = \alpha (1 - \phi) \cdot \theta^{\frac{1}{1 + \eta}}. \]

Note that total equilibrium investment in this case is $\theta^{1/(1 + \eta)}$, which is the same as the total efficient investment. This is not general: it only holds when $\underline{\theta} = 0$ (otherwise total efficient investment is greater). The existence of spillovers, however, distorts the composition of this investment. Welfare for a type $\theta$ family is:

\[ W(\theta) = \frac{1}{1 + \eta} \cdot [\ln \theta - 1] + (1 - \alpha) \cdot \ln(1 - \alpha (1 - \phi)) + \alpha \cdot \ln(\alpha (1 - \phi)). \]

The first term is independent of both spillovers and altruism, and the remaining terms are independent of type and cost externalities. The difference between equilibrium welfare and efficient welfare therefore depends only on the latter terms. These are plotted as a function of $\alpha$ for various values of $\phi$ in Figure 3.4. The case in which $\phi = 0$ corresponds to the efficient welfare. The figure shows how equilibrium welfare departs from efficient welfare as altruism increases, and this occurs to a greater extent when spillovers are greater.

A striking (general) feature of the above analysis is that the only relevant aspect of the distribution of types is the level of the lowest ability. In particular, the equilibrium investments of a type $\theta$ agent approach the Nash investments as $\underline{\theta}$ approaches $\theta$. That is, all agents obtain a higher payoff as the lowest ability is raised. The fact that the equilibrium is insensitive to other qualities of the distribution of types, such as mean and variance, does not seem plausible. The following section demonstrates that this is a consequence of the assumption that wealth is perfectly observed, whereas parental investment is imperfectly observed (but not of the assumption that parental investment is unobserved).
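The closed-form return function (3.16) can be cross-checked against a direct numerical integration of the initial values problem. The sketch below is only an illustration under assumed parameter values: it uses a hand-written classical fourth-order Runge-Kutta scheme (a standard method chosen here for self-containment, not something the text prescribes), and the function names are hypothetical.

```python
# Sketch: integrate mu'(x) = Gamma(x, mu) from the Nash investments of the
# lowest type and compare with the closed form (3.16).
# All parameter values are illustrative assumptions.

def gamma_field(x, mu, alpha, phi):
    """Slope of the return function under the log specification."""
    return ((1.0 - phi) - (1.0 - alpha) / alpha * mu / x) / phi

def mu_closed_form(x, x0, y0, alpha, phi):
    """Return function (3.16), with (x0, y0) the lowest type's Nash investments."""
    a = (1.0 - alpha) / (alpha * phi)
    delta = alpha * (1.0 - phi) / (1.0 - alpha * (1.0 - phi))
    return (x0 / x) ** a * (y0 - delta * x0) + delta * x

def integrate_rk4(x0, y0, x_end, steps, alpha, phi):
    """Classical RK4 integration of the initial values problem up to x_end."""
    h = (x_end - x0) / steps
    x, mu = x0, y0
    for _ in range(steps):
        k1 = gamma_field(x, mu, alpha, phi)
        k2 = gamma_field(x + h / 2, mu + h * k1 / 2, alpha, phi)
        k3 = gamma_field(x + h / 2, mu + h * k2 / 2, alpha, phi)
        k4 = gamma_field(x + h, mu + h * k3, alpha, phi)
        mu += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return mu

alpha, phi, eta, theta_low = 0.6, 0.3, 1.0, 1.0

# Nash investments of the lowest type give the initial condition.
scale = (1.0 - alpha * phi) ** (eta / (1.0 + eta))
x0 = (1.0 - alpha) / scale * theta_low ** (1.0 / (1.0 + eta))
y0 = alpha * (1.0 - phi) / scale * theta_low ** (1.0 / (1.0 + eta))

x_end = 3.0 * x0
numeric = integrate_rk4(x0, y0, x_end, 2000, alpha, phi)
closed = mu_closed_form(x_end, x0, y0, alpha, phi)
slope_at_start = gamma_field(x0, y0, alpha, phi)  # initial point lies on N(x)

grid = [x0 + k * (x_end - x0) / 50 for k in range(51)]
values = [mu_closed_form(x, x0, y0, alpha, phi) for x in grid]

print(f"numeric={numeric:.8f}  closed form={closed:.8f}")
print(f"slope at the initial condition: {slope_at_start:.2e}")
```

The zero slope at the starting point reflects the observation that the initial condition lies on the $\Gamma = 0$ locus; to the right of it the computed return function rises.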
3.6 Imperfectly Observed Investments

The model so far has worked with the seemingly extreme assumption that wealth is perfectly observed, whereas parental investments are unobserved. This section makes an attempt at relaxing this assumption by supposing that both wealth and parental investments are observed with some noise. In this setting the distribution of abilities will become relevant, since this information will be incorporated into the process of Bayesian updating.

[Figure 3.4: Welfare: The Effect of Altruism and Spillovers. Horizontal axis: altruism ($\alpha$); one curve for each of three values of $\phi$.]

To make some progress, fairly particular functional forms are imposed. First, assume that the distribution of types is log-normal:

\[ \ln \theta_i = \ln \theta + \varepsilon_{\theta i}, \qquad \varepsilon_{\theta i} \sim N(0, \sigma_\theta^2). \tag{3.18} \]

Assume also that investments are observed with noise as follows:

\[ \ln \tilde{x}_i = \ln x_i + \varepsilon_{x i}, \qquad \varepsilon_{x i} \sim N(0, \sigma_x^2), \tag{3.19} \]
\[ \ln \tilde{y}_i = \ln y_i + \varepsilon_{y i}, \qquad \varepsilon_{y i} \sim N(0, \sigma_y^2). \tag{3.20} \]

The log structure ensures that all random variables are positive. If the investment functions happen to be of the form $y(\theta) = \beta_y \theta^\gamma$ and $x(\theta) = \beta_x \theta^\gamma$, then it turns out (see the Appendix for a derivation) that:

\[ \mu(\tilde{x}_i, \tilde{y}_i) \equiv E[\ln y_i \mid \tilde{x}_i, \tilde{y}_i] = \lambda_x \cdot \ln \tilde{x}_i + \lambda_y \cdot \ln \tilde{y}_i + \text{constants}, \tag{3.21} \]

where

\[ \lambda_x \equiv \frac{\sigma_y^2 \gamma^2 \sigma_\theta^2}{(\sigma_y^2 + \sigma_x^2) \gamma^2 \sigma_\theta^2 + \sigma_y^2 \sigma_x^2}, \tag{3.22} \]
\[ \lambda_y \equiv \frac{\sigma_x^2 \gamma^2 \sigma_\theta^2}{(\sigma_y^2 + \sigma_x^2) \gamma^2 \sigma_\theta^2 + \sigma_y^2 \sigma_x^2}. \tag{3.23} \]

These expressions appear messy, but they are quite intuitive. For a fixed $\gamma^2 \sigma_\theta^2$, $\lambda_x$ is increasing in $\sigma_y^2$ and decreasing in $\sigma_x^2$: the noisier the signal of $y$ relative to that of $x$, the more weight one should put on the signal of $x$. The same intuition applies for $\lambda_y$. For a fixed $\sigma_x^2$ and $\sigma_y^2$, both $\lambda_x$ and $\lambda_y$ are increasing in $\gamma^2 \sigma_\theta^2$: as the distribution of types becomes more dispersed, more weight should be placed on the signals, and less on one’s prior. Since $\mu$ is increasing in both arguments, all families find those with high values of $\mu$ more attractive, and as such, families will match assortatively on $\mu$.
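The comparative statics of the signal weights are easy to confirm numerically. The sketch below assumes the reading of (3.22) and (3.23) in which the common denominator is $(\sigma_y^2 + \sigma_x^2)\gamma^2\sigma_\theta^2 + \sigma_y^2\sigma_x^2$; this grouping is an assumption about the intended formula, chosen because it yields $\lambda_x = 1$ and $\lambda_y = 0$ when wealth is observed without noise. The function name and parameter values are hypothetical.

```python
# Sketch: signal weights lambda_x and lambda_y under an assumed reading of
# (3.22)-(3.23). Variances and gamma below are illustrative assumptions.

def signal_weights(var_x, var_y, gamma, var_theta):
    """Return (lambda_x, lambda_y) for noise variances var_x, var_y.

    Assumes at least one of var_x, var_y is strictly positive and that
    gamma**2 * var_theta > 0, so the denominator is non-zero.
    """
    tau = gamma ** 2 * var_theta          # prior variance of gamma * ln(theta)
    denom = (var_y + var_x) * tau + var_y * var_x
    return var_y * tau / denom, var_x * tau / denom

gamma, var_theta = 0.5, 1.0

# No noise on wealth, some noise on parental investment.
lam_x_clean, lam_y_clean = signal_weights(0.0, 0.5, gamma, var_theta)

# A noisier signal of y shifts weight towards the signal of x.
lam_x_low, _ = signal_weights(0.2, 0.5, gamma, var_theta)
lam_x_high, _ = signal_weights(0.2, 2.0, gamma, var_theta)

# A more dispersed type distribution raises both weights.
narrow = signal_weights(0.2, 0.5, gamma, 0.1)
wide = signal_weights(0.2, 0.5, gamma, 10.0)

print(f"var_x = 0: lambda_x={lam_x_clean:.3f}, lambda_y={lam_y_clean:.3f}")
print(f"lambda_x rises with var_y: {lam_x_low:.3f} -> {lam_x_high:.3f}")
print(f"narrow prior {narrow}, wide prior {wide}")
```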
Note that when there is no noise on wealth (i.e. $\sigma_x^2 = 0$) but an arbitrarily small amount of noise on parenting investment (i.e. $\sigma_y^2 > 0$), families match assortatively on wealth (i.e. $\lambda_x = 1$ and $\lambda_y = 0$).

This equilibrium matching pattern implies that if family $i$ is matched with family $j$ in equilibrium, then $\mu(\tilde{x}_j, \tilde{y}_j) = \mu(\tilde{x}_i, \tilde{y}_i)$, so that once the noise on the signals is realized, agent $i$’s expected utility is:

\[ v(x_i, y_i; \theta) = (1 - \alpha) \cdot \ln x_i + \alpha (1 - \phi) \cdot \ln y_i + \alpha\phi \cdot \mu(\tilde{x}_i, \tilde{y}_i) - c(x_i + y_i; \theta). \tag{3.24} \]

It is straightforward to show that this implies the ex-ante expected utility can be written as:

\[ V(x_i, y_i; \theta) = \xi \cdot \ln x_i + \zeta \cdot \ln y_i - c(x_i + y_i; \theta) + \text{constants}, \tag{3.25} \]

where

\[ \xi \equiv (1 - \alpha) + \alpha\phi\lambda_x \quad \text{and} \quad \zeta \equiv \alpha (1 - \phi) + \alpha\phi\lambda_y. \tag{3.26} \]

Maximizing this with respect to $x$ and $y$ produces the equilibrium investment functions, which are indeed of the form conjectured, where:

\[ \beta_y = \frac{\zeta}{[\zeta + \xi]^{\frac{\eta}{1 + \eta}}} = \frac{\alpha [1 - \phi (1 - \lambda_y)]}{[1 - \alpha\phi [1 - \lambda_x - \lambda_y]]^{\frac{\eta}{1 + \eta}}}, \tag{3.27} \]
\[ \beta_x = \frac{\xi}{[\zeta + \xi]^{\frac{\eta}{1 + \eta}}} = \frac{1 - \alpha (1 - \phi\lambda_x)}{[1 - \alpha\phi [1 - \lambda_x - \lambda_y]]^{\frac{\eta}{1 + \eta}}}, \tag{3.28} \]
\[ \gamma = \frac{1}{1 + \eta}. \tag{3.29} \]

To verify this result, consider what happens as $\sigma_y^2 \to \infty$. In this case $\lambda_x \to 1$ and $\lambda_y \to 0$, which means that $\beta_y \to [\alpha (1 - \phi)]$ and $\beta_x \to [1 - \alpha (1 - \phi)]$, as derived above for the case in which $\underline{\theta} = 0$. This is true for any finite $\sigma_x^2$, which implies that the above analysis does not at all rely on wealth being perfectly observed (as long as parenting effort is not observed). On the other hand, if wealth is perfectly observed and parenting investments are imperfectly observed (i.e. $\sigma_x^2 = 0$ and $\sigma_y^2 > 0$), then we end up with the same results as those derived in the case in which parenting effort is not observed at all (since $\lambda_x = 1$ and $\lambda_y = 0$ in this case). Note that this holds for an arbitrarily small amount of noise contained in the signal of parenting effort.
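The mapping from the weights $(\lambda_x, \lambda_y)$ to the investment coefficients can be sketched as follows. The code treats the weights as given inputs and evaluates (3.26) to (3.29); it then checks the first-order conditions of (3.25) and three benchmark weightings taken from the surrounding discussion. Parameter values and function names are illustrative assumptions.

```python
# Sketch: investment coefficients (beta_x, beta_y, gamma) as functions of the
# signal weights, following (3.26)-(3.29). Parameter values are assumptions.

def coefficients(alpha, phi, eta, lam_x, lam_y):
    """Return (beta_x, beta_y, gamma) for given signal weights."""
    xi = (1.0 - alpha) + alpha * phi * lam_x
    zeta = alpha * (1.0 - phi) + alpha * phi * lam_y
    scale = (xi + zeta) ** (eta / (1.0 + eta))
    return xi / scale, zeta / scale, 1.0 / (1.0 + eta)

alpha, phi, eta, theta = 0.6, 0.3, 1.0, 2.0

# First-order conditions of (3.25): xi / x = zeta / y = (x + y)^eta / theta.
lam_x, lam_y = 0.4, 0.2
beta_x, beta_y, gamma = coefficients(alpha, phi, eta, lam_x, lam_y)
x, y = beta_x * theta ** gamma, beta_y * theta ** gamma
marginal_cost = (x + y) ** eta / theta
xi = (1.0 - alpha) + alpha * phi * lam_x
zeta = alpha * (1.0 - phi) + alpha * phi * lam_y
foc_x_gap = xi / x - marginal_cost
foc_y_gap = zeta / y - marginal_cost

# Benchmarks: all weight on wealth, all weight on parental investment, none.
wealth_only = coefficients(alpha, phi, eta, 1.0, 0.0)
parental_only = coefficients(alpha, phi, eta, 0.0, 1.0)
no_weight = coefficients(alpha, phi, eta, 0.0, 0.0)

print(f"FOC gaps: {foc_x_gap:.2e}, {foc_y_gap:.2e}")
print(f"weight on wealth only:   beta_x={wealth_only[0]:.4f}, beta_y={wealth_only[1]:.4f}")
print(f"weight on parental only: beta_x={parental_only[0]:.4f}, beta_y={parental_only[1]:.4f}")
print(f"no weight (Nash):        beta_x={no_weight[0]:.4f}, beta_y={no_weight[1]:.4f}")
```

With all weight on wealth the coefficients match the explicit solution of the base model, with all weight on parental investment they match the efficient investments, and with no weight on either signal they match the Nash investments.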
In summary, the qualitative results from the above analysis (in which wealth is perfectly observed and parental investment is unobserved) carry through to cases in which i) wealth is imperfectly observed and parenting effort is unobserved, and ii) wealth is perfectly observed and parenting effort is (arbitrarily) imperfectly observed. The opposite case, in which parental investments are better observed than wealth, is, in the limit, reminiscent of Peters and Siow (2002). Investments do indeed approach the efficient investments in the limit (since $\lambda_y \to 1$ and $\lambda_x \to 0$, implying $\beta_y \to \alpha$ and $\beta_x \to (1 - \alpha)$).

It may also be of interest to note that if both investments are imperfectly observed, then the equilibrium investments depend upon the distribution of abilities. In particular, investments approach the Nash investments as the distribution of types becomes degenerate around the mean (i.e. both $\lambda_x$ and $\lambda_y$ go to zero as $\sigma_\theta^2$ goes to zero). Intuitively, equilibrium behaviour approaches Nash behaviour as the population becomes more homogeneous.

3.7 Conclusions

The paper has aimed to illuminate the consequences of peer effects in a model of human capital development in the presence of the promise of social mobility. The key features of the model are that i) parents care about the human capital of their child, as well as consumption, ii) human capital depends on parental investment and peer group spillovers, iii) peer groups are endogenously determined on the basis of wealth, and iv) parental investment and wealth accumulation place competing demands on parents’ resources. The central message is that competition for desirable peer groups can induce detrimental incentives to undertake (socially productive) parental investment. I have shown that pooling equilibria generally have superior welfare properties to separating equilibria, but do not exist in general.
A unique separating equilibrium always exists, and delivers an average welfare lower than that associated with Nash outcomes (which are themselves inefficient). One surprising result is that the separating equilibrium is invariant to the CES substitution parameter. Various properties of equilibrium are illustrated through the employment of functional forms, and an extension is provided in which both investments are observed with noise. Although specialized, the extension demonstrates how the distribution of types matters and provides a connection between the base model and related models in the literature.

I hope the model proves useful as a starting point for research on pre-match investment with multiple investments. For instance, I see value in generalizing the extension presented in Section 3.6, and in embedding the model within a dynamic search framework.

Chapter 4

A Model of Frictional Pre-Match Investment: Implications for Income, Inequality and Welfare

4.1 Introduction

Our productivity is shaped, in large part, by our particular work environment. In many instances, especially in ‘high skill’ jobs, the relevant difference between work environments lies in differences in the quality of coworkers that one is exposed to [Footnote 56]. While many interesting insights and important implications of such coworker inter-dependence have been derived, the analysis has typically unfolded in an ‘idealized’ competitive setting in which the price of skills directs workers to firms (e.g. Kremer (1993) and Kremer and Maskin (1996)). Recognizing that the feasibility of this allocation mechanism is greatly diminished when the relevant skills are unobservable, some recent work proposes instead that workers take costly actions in order to ‘compete’ for the more desirable coworkers (e.g. Hoppe et al. (2005), Rege (2007), and Bidner (2008a)).
These papers share the feature that ex-ante heterogeneous workers make a costly investment, then enter a matching market in which they are assigned a partner on the basis of the investment. When the investment is unproductive, the essential welfare trade-off lies in the fact that investing consumes resources, but facilitates a more efficient matching pattern [Footnote 57]. Basically, the fact that workers are complements means that output is maximized when matching is positive assortative on type (Becker (1973)). There is a ‘no-investment’ equilibrium in which no resources are wasted, but matching is inefficient (random). There is also an ‘investment equilibrium’ that achieves the efficient matching pattern, but comes at a resource cost, since higher types make higher investments in order to differentiate themselves in the matching market.

Footnote 56: This is as opposed to differences in capital, a feature stressed in standard assignment models (e.g. Sattinger (1993)). Hopkins (2005) studies a model that incorporates many of the features highlighted here, but assumes that firms are (exogenously) differentiated by their capital.

Footnote 57: The investment is unproductive in Hoppe et al. (2005) and Rege (2007). A welfare trade-off of a different nature arises when investment is productive (see Bidner (2008a)).

The present paper seeks to extend this type of analysis by developing a model in which the relationship between investment and matching opportunities is hindered by frictions. For example, suppose a group of students is faced with a pass-fail test. The test is designed to determine which students exerted effort during the course, but is only imperfectly able to do so. In particular, sometimes students that did not exert effort are awarded a pass, and sometimes students that did exert effort are not awarded a pass. Frictions arise here because of imperfections in the testing technology. Outside of education, imperfections of this nature also operate in the workplace (e.g. when managers promote workers to particular positions on the basis of imperfectly perceived performance).

Such frictions are surely present in reality, and so this exercise brings the proposed mechanism a step closer to relevant empirical scrutiny. For instance, equilibria in the frictionless setting display either perfectly random matching or perfect segregation (of types), yet, in reality, segregation likely falls between these two extremes. One cannot analyze intermediate degrees of segregation by simply appending ‘noise’ to a frictionless equilibrium, since such noise is not orthogonal: the noise itself changes the nature of equilibrium as it affects the relative return to making investments. That is, randomness must be embedded within the structure of the economy, not simply added as an afterthought.

I embed frictions by modeling a situation in which each agent is characterized by an informative (but noisy) signal of their investment, and matches are then formed on the basis of the signal. I adopt a specification that, although stylized, is both tractable and transparent. In particular, I assume that there are two types, two investment levels, and two signals. A single parameter determines the probability with which the signal correctly identifies whether the investment was made. One is then able to analyze how equilibrium segregation (and therefore output and inequality) is affected by this parameter. In addition, we can examine how welfare is affected both within and across equilibria.

The first set of results is quite general, in that it takes a general class of functions that describe coworker spillovers and i) derives the relevant measure of segregation, and ii) describes how this measure affects output and inequality. The relevant measure is shown to be a correlation coefficient, and greater segregation raises both average output and inequality.
From here I impose a particular function from this class and adopt the structure described above in order to analyze i) the existence of ‘investment’ and ‘no-investment’ equilibria, and ii) how frictions affect the segregation measure within each class of equilibria. I show that a no-investment equilibrium always exists, and that an investment equilibrium exists when frictions and investment costs are sufficiently low. Frictions never affect segregation in the no-investment equilibrium, and do not always lower segregation in the investment equilibrium. This implies that output and inequality are not always monotonically decreasing in frictions in the investment equilibrium.

I then analyze how frictions affect welfare. I show that welfare is not always monotonically decreasing in frictions in the investment equilibrium, and show that a marginal increase in frictions past the point at which an investment equilibrium fails to exist always generates a Pareto improvement.

The paper is related to models of informational frictions in the labor market. The seminal papers were concerned with wage contracting when workers’ (education) investments are perfectly observed, but their productivity is unobserved (e.g. Arrow (1973) and Spence (1973)), and this has been extended to settings in which productivity is imperfectly observed (statistical discrimination: Altonji and Pierret (2001), Farber and Gibbons (1996), Lang and Manove (2006) and Blankenau and Camera (2008)). A related set of papers focuses instead on the consequences of agents having imperfect information about their own type (e.g. Eckwert and Zilcha (2007) and (2008)). Interaction with coworkers, and therefore the issue of matching, does not play any role in this strand of the literature.

The quality of information may be relevant in a matching set-up because it influences whether a matched pair produces with the appropriate technology.
Blankenau and Camera (2006) explore this channel by positing two technologies: autarkic production, and joint production which allows the unskilled to free-ride on the high skilled. The possibility of free-riding leads to under-investment in skill, but higher quality information reduces the feasibility of free-riding. The primary difference from the current paper is that agents are motivated to invest for completely different reasons: there because it raises their productivity and here because it leads to better matches. Presumably for tractability, agents are assumed to be randomly re-matched each period in their model, making it uninteresting to study equilibrium segregation (since it is determined exogenously).

Costly search is another common departure from an idealized setting. Equilibrium matching patterns are of central interest both in models where workers match with firms (e.g. Acemoglu (1999), and Albrecht and Vroman (2002)) and in models where workers match with other workers (e.g. Shimer and Smith (2000), Burdett and Coles (1997), Smith (2006)). Apart from the fact that types are observable in these models (although, see Chade (2006)), the important difference between these models and the present model is that, in these models, agents do not undertake costly actions in order to overcome the inherent search problem (the matching function is unaffected by agents’ actions), whereas this is a central element of the welfare trade-off stressed in this model.[58]

The remainder of the paper is organized as follows. The model is laid out in Section 4.2, and analyzed in Section 4.3. Section 4.4 contains a discussion of the results, including a generalization of the information structure, and conclusions are drawn in Section 4.5. All proofs are contained in the Appendix.

4.2 A Model

4.2.1 Spillovers and General Results

The results in this section are ‘general’ in the sense that they apply to any distribution of types.
Although the results are expressed in terms of a specific (Cobb-Douglas) productivity function, this is done for clarity. The results further generalize, quite easily, to a wider class of functions (e.g. all CES functions are in this class). The details of this generalization are contained in the Appendix.

Consider some worker i, of type $\theta_i$, that works with a worker of type $\theta_{-i}$. There are spillovers in the sense that worker i’s productivity, $y_i$, is influenced by their partner according to:

\[ y_i = \theta_i^{1-\phi}\,\theta_{-i}^{\phi}, \tag{4.1} \]

where φ ∈ (0, 0.5) is a parameter that measures spillovers. Thus, we have that:

\[ \ln y_i = (1-\phi)\cdot\ln\theta_i + \phi\cdot\ln\theta_{-i}. \tag{4.2} \]

Spillovers are neutral in the sense that average log-output, $\mu_{\ln y} \equiv E[\ln y_i]$, is independent of spillovers. Furthermore, average log-output is unaffected by the particular matching pattern. In particular:

Result 11. Average log-output is given by

\[ \mu_{\ln y} = \mu_{\ln\theta}, \tag{4.3} \]

and therefore depends only on the distribution of types.

The variance of log-output, $\sigma^2_{\ln y} \equiv E[(\ln y_i - E[\ln y_i])^2]$, does in general depend on both spillovers and the matching pattern. To describe the latter, define ρ as the correlation coefficient between the random variables $(\ln\theta_i, \ln\theta_{-i})$:

\[ \rho \equiv \mathrm{corr}[\ln\theta_i, \ln\theta_{-i}]. \tag{4.4} \]

Positive (negative) values of ρ imply positive (negative) assortative matching, and the magnitude of the absolute value of ρ provides a measure of the strength of ‘assortativeness’. A value of -1 corresponds to perfect negative assortative matching, a value of 0 corresponds to random matching, and a value of 1 corresponds to perfect positive assortative matching.[59] In particular, we have the following.

Result 12. Let $\sigma^2_{\ln\theta}$ be the variance of log-types. The variance of log-output is given by:

\[ \sigma^2_{\ln y} = [1 - 2\phi(1-\phi)\cdot(1-\rho)]\cdot\sigma^2_{\ln\theta}, \tag{4.5} \]

and therefore depends on both spillovers and the matching pattern (as captured by ρ).

Let $M \equiv [1 - 2\phi(1-\phi)\cdot(1-\rho)]$, and note that M is decreasing in φ and increasing in ρ. An increase in M raises inequality in the sense that it increases the variance of log-output. However, an increase in M also raises (approximate) average output.

Result 13. The expected value of output, as approximated by the expected value of a second-order Taylor series expansion, satisfies:

\[ E[y] \approx \exp(\mu_{\ln\theta})\cdot\left[1 + (M/2)\,\sigma^2_{\ln\theta}\right], \tag{4.6} \]

and is therefore increasing in M.

Although the approximation is reasonable only if ln y does not vary too much around its mean (so that the Taylor series expansion approximates the actual function), the exact value of E[y] can sometimes be calculated with certain distributions of types.[60] Despite not always being able to derive an exact expression for expected output, the following Result shows that we can be sure about the effect of M.

Result 14. The expected value of output, E[y], is increasing in M.

To the extent that one is interested in i) the variance of log output, and ii) the level of mean output, the results presented here indicate that M is a convenient summary variable to analyze when studying models of imperfect matching. One such model is now developed.

Footnote 58: Burdett and Coles (2001) present a search and matching model in which agents invest to make themselves more attractive as marriage partners. Although the investment improves marriage prospects in the sense that it expands the set of agents that are willing to agree to marriage, the model does not have the feature that the rate at which any given type is encountered is influenced by the investment decision. There are models in which this feature is central (e.g. Peters and Siow (2002) and Cole, Mailath, and Postlewaite (2001)), but there is no relevant private information: agents’ preferences are defined over the (observed) investment made by their partner, and not their partner’s type per se.

Footnote 59: In (perhaps) more familiar terms, $\rho^2$ coincides with the $R^2$ that would arise if one were to regress $\ln\theta_{-i}$ on $\ln\theta_i$.
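Results 11 and 12 can be checked by simulation. The following is a minimal sketch, not part of the original analysis: it assumes log-types are bivariate normal with a common mean and variance (the log-normal case is only one example distribution), and the function names and parameter values (φ = 0.3, ρ = 0.4, a standard deviation of 0.5) are illustrative assumptions. It compares the simulated mean and variance of log-output with (4.3) and (4.5), and simulated mean output with the exact log-normal expression.

```python
import math
import random
import statistics


def summary_M(phi, rho):
    """Summary variable M = 1 - 2*phi*(1 - phi)*(1 - rho) from Result 12."""
    return 1.0 - 2.0 * phi * (1.0 - phi) * (1.0 - rho)


def simulate_log_output(phi, rho, mu=0.0, sd=0.5, n=200_000, seed=0):
    """Simulate ln y for matched pairs.

    Assumption (not from the text): log-types of the two partners are
    bivariate normal with mean mu, standard deviation sd and correlation rho.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        ln_own = mu + sd * z1
        ln_partner = mu + sd * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        # Equation (4.2): ln y_i = (1 - phi) ln theta_i + phi ln theta_{-i}
        draws.append((1.0 - phi) * ln_own + phi * ln_partner)
    return draws


# Illustrative parameter values only.
phi, rho, mu, sd = 0.3, 0.4, 0.0, 0.5
log_y = simulate_log_output(phi, rho, mu, sd)
M = summary_M(phi, rho)

sim_mean = statistics.fmean(log_y)
sim_var = statistics.pvariance(log_y)
sim_output = statistics.fmean(math.exp(v) for v in log_y)

print("mean of ln y:", sim_mean, "vs (4.3):", mu)
print("var of ln y: ", sim_var, "vs (4.5):", M * sd ** 2)
print("E[y]:        ", sim_output, "vs log-normal value:", math.exp(mu + 0.5 * M * sd ** 2))
```

Raising ρ in this sketch raises M, and with it both the variance of log-output and mean output, which is the sense in which M summarizes the matching pattern.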
Footnote 60: For instance, if θ were log-normally distributed with mean and variance parameters $(\mu, \sigma^2)$, then $E[y] = \exp(\mu + (1/2)M\cdot\sigma^2)$.

4.2.2 A Simple Model of Imperfect Matching

There is a continuum of agents, each of whom has a type $\theta \in \{\theta_L, \theta_H\}$. A proportion ψ of agents have a type of $\theta_H$. An agent’s consumption equals their output, which in turn depends on who they are matched with. As discussed above, an agent of type $\theta_i$ that is matched with an agent of type $\theta_{-i}$ produces and consumes:

\[ y_i = \theta_i^{1-\phi}\cdot\theta_{-i}^{\phi}, \tag{4.7} \]

where φ ∈ [0, 1/2] is a parameterization of the degree to which a partner’s type matters.

Prior to finding a partner, each agent has the option to make an investment, x ∈ {0, 1}. There is no cost associated with x = 0, but x = 1 costs c. Once the investment decision is made, each agent is allocated a noisy signal that partially reveals whether or not the investment was made. In particular, each agent has a signal s ∈ {0, 1}. The signal “0” is indicative of having not made the investment and the signal “1” is indicative of having made the investment in the sense that:

\[ \Pr[s_i = 0 \mid x_i = 0] = \Pr[s_i = 1 \mid x_i = 1] = \lambda \geq 1/2. \tag{4.8} \]

The assumption that a single parameter captures the ‘efficacy’ of both actions is made for simplicity. The analysis is largely unchanged when this assumption is relaxed, as demonstrated in section 4.4.1 below.

Once investments are made and signals realized, agents are divided into two pools: those with s = 0 and those with s = 1. Agents are then randomly matched within these pools.

A strategy for an agent of type θ, σ(θ), is a probability with which they choose x = 1. Such strategies are required to be optimal relative to beliefs, $\xi_s$, where $\xi_s$ is the probability that a worker with signal s ∈ {0, 1} is of type $\theta_H$.
In particular, a strategy is optimal for a type θ agent if:

\[ \sigma(\theta) = \arg\max_{\sigma\in[0,1]} \{U(\sigma, \theta)\} = \arg\max_{\sigma\in[0,1]} \{\theta^{1-\phi}\cdot V(\sigma) - \sigma\cdot c\}, \tag{4.9} \]

where V(σ) is the expected quality of the match conditional on investing with probability σ:

\[ V(\sigma) \equiv (1-\pi(\sigma))\cdot V_0 + \pi(\sigma)\cdot V_1, \tag{4.10} \]

where $\pi(\sigma) \equiv \sigma\cdot\lambda + (1-\sigma)\cdot(1-\lambda)$ is the probability that an agent obtains the s = 1 signal, conditional on σ, and $V_s$ is the expected quality of the match conditional on obtaining a signal of s:

\[ V_s \equiv (1-\xi_s)\cdot\theta_L^{\phi} + \xi_s\cdot\theta_H^{\phi}, \tag{4.11} \]

for s ∈ {0, 1}.

In addition, beliefs are required to be consistent with investment strategies, in the sense that beliefs are derived from Bayes’ Rule. That is, if agents of type T invest with probability $\sigma_T$, then

\[ \xi_0 \equiv \frac{[1-\pi(\sigma_H)]\cdot\psi}{[1-\pi(\sigma_H)]\cdot\psi + [1-\pi(\sigma_L)]\cdot(1-\psi)}, \tag{4.12} \]

and

\[ \xi_1 \equiv \frac{\pi(\sigma_H)\cdot\psi}{\pi(\sigma_H)\cdot\psi + \pi(\sigma_L)\cdot(1-\psi)}. \tag{4.13} \]

An equilibrium (Bayesian-Nash) is an investment function, σ(θ), and beliefs, $\{\xi_0, \xi_1\}$, such that investments are optimal given beliefs, and beliefs are consistent with investments.

Some initial observations will help narrow the search for equilibria. First, notice that V(σ) is linear in σ - implying that $V'(\sigma)$ is a constant. If some type finds it optimal to invest with positive probability, then $V'(\sigma) > 0$ (otherwise it would never be worthwhile incurring the positive marginal cost). Furthermore, if $V'(\sigma) > 0$, then the net marginal return to investment, $U_\sigma(\sigma, \theta) = \theta^{1-\phi}\cdot V'(\sigma) - c$, is strictly increasing in type.[61] Furthermore, it can never be an equilibrium for both types to invest with probability one since the signal would become uninformative - and therefore non-valuable - in this case (implying a profitable deviation to invest with probability zero). Together, these observations ensure that the probability of investment is strictly increasing in type in any equilibrium in which some type invests with positive probability.
Second, if type θ finds it optimal to use a completely mixed strategy, then we know from above that $U_\sigma(\sigma, \theta)$ is strictly increasing in type. Furthermore, since type θ is using a completely mixed strategy, the indifference condition $U_\sigma(\sigma', \theta) = 0$ for all $\sigma' \in [0, 1]$ must hold. The fact that $U_\sigma(\sigma, \theta)$ is strictly increasing in type means that this indifference condition can hold for at most one type. Thus, it follows that at most one type can be using a completely mixed strategy in equilibrium.

Third, if high types are using a mixed strategy (and therefore low types are not investing), then such a situation is ‘unstable’ in the sense that if high types invested with a slightly greater probability, then all high types would find it optimal to invest with probability one. Equilibria of this nature will sometimes exist, but are disregarded because of the stability issue.

Together, these observations allow us to focus on two classes of equilibria. First, a no-investment equilibrium in which no agent invests. Second, an investment equilibrium in which high types invest with probability one, and low types invest with some probability strictly less than one (possibly zero).

Footnote 61: This is due to the complementary nature of interaction (i.e. $y_{\theta_i \theta_{-i}} > 0$), and not due to any assumption regarding type-dependent investment costs (as in the standard signaling framework).

4.3 Analysis

In this section I first analyze the existence of both no-investment equilibria and investment equilibria. Then, I derive an expression connecting the model’s parameters to the equilibrium degree of segregation. I then analyze how frictions affect equilibrium segregation. Finally, I make welfare comparisons both within and across equilibria.

4.3.1 Existence

No-Investment Equilibrium

Proposition 10. A non-investment equilibrium always exists.

Since the signal does not provide any information about an agent’s type, beliefs in such an equilibrium are $\xi_0 = \xi_1 = \psi$.
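The consistency requirement in (4.12)-(4.13) and the logic behind Proposition 10 can be illustrated numerically. The sketch below is an illustration only: the function names and the parameter values (λ = 0.8, ψ = 0.3, types 1 and 2, φ = 0.3) are assumptions introduced here, and it presumes that both signals occur with positive probability so that Bayes’ Rule applies. It computes beliefs from a pair of investment probabilities and the constant marginal match value, which equals (2λ − 1)·(V_1 − V_0) because V is linear in σ.

```python
def prob_high_signal(sigma, lam):
    """pi(sigma): probability of drawing s = 1 when investing with probability sigma."""
    return sigma * lam + (1.0 - sigma) * (1.0 - lam)


def consistent_beliefs(sigma_H, sigma_L, lam, psi):
    """Bayes-consistent beliefs (xi_0, xi_1), following (4.12)-(4.13).

    Assumes both signals are observed with positive probability
    (for example lam < 1), so that neither denominator is zero.
    """
    pi_H = prob_high_signal(sigma_H, lam)
    pi_L = prob_high_signal(sigma_L, lam)
    xi0 = (1.0 - pi_H) * psi / ((1.0 - pi_H) * psi + (1.0 - pi_L) * (1.0 - psi))
    xi1 = pi_H * psi / (pi_H * psi + pi_L * (1.0 - psi))
    return xi0, xi1


def marginal_match_value(sigma_H, sigma_L, lam, psi, theta_L, theta_H, phi):
    """V'(sigma) = (2*lam - 1) * (V_1 - V_0), using (4.10) and (4.11)."""
    xi0, xi1 = consistent_beliefs(sigma_H, sigma_L, lam, psi)
    v0 = (1.0 - xi0) * theta_L ** phi + xi0 * theta_H ** phi
    v1 = (1.0 - xi1) * theta_L ** phi + xi1 * theta_H ** phi
    return (2.0 * lam - 1.0) * (v1 - v0)


# Illustrative parameter values only.
lam, psi, theta_L, theta_H, phi = 0.8, 0.3, 1.0, 2.0, 0.3

# Nobody invests: the signal carries no information about type.
no_invest = consistent_beliefs(0.0, 0.0, lam, psi)
no_invest_return = marginal_match_value(0.0, 0.0, lam, psi, theta_L, theta_H, phi)

# Only high types invest: the signal becomes informative.
separating = consistent_beliefs(1.0, 0.0, lam, psi)

print("beliefs when nobody invests:     ", no_invest)
print("marginal match value in that case:", no_invest_return)
print("beliefs when only high types invest:", separating)
```

When nobody invests, both beliefs coincide with ψ and the marginal match value is zero, so no type gains from deviating to investment at any positive cost.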
Since the signal is effectively noise, there is no benefit to investing as this does not change the distribution of types that one will match with. Although there are no resources wasted on investment, matching in this equilibrium is completely random (and therefore average output is lower than it would be under some degree of positive assortative matching).

Investment Equilibrium

An investment equilibrium arises when all high types invest, and some proportion, σ ∈ [0, 1), of low types invest. Consistency of beliefs gives us:

\[ \xi_0(\lambda, \sigma) = \frac{(1-\lambda)\cdot\psi}{(1-\lambda)\cdot[\psi + \sigma(1-\psi)] + \lambda\cdot(1-\sigma)(1-\psi)}, \tag{4.14} \]

\[ \xi_1(\lambda, \sigma) = \frac{\lambda\cdot\psi}{\lambda\cdot[\psi + \sigma(1-\psi)] + (1-\lambda)\cdot(1-\sigma)(1-\psi)}. \tag{4.15} \]

Note that $\xi_0$ is decreasing in λ and increasing in σ, while $\xi_1$ is increasing in λ and decreasing in σ. In an investment equilibrium, making the investment raises the probability with which one is matched with a high type (relative to the probability without investing). Define the additional probability as G(λ, σ). This variable can be written as:

\[ G(\lambda, \sigma) \equiv (2\lambda - 1)\cdot[\xi_1(\lambda, \sigma) - \xi_0(\lambda, \sigma)]. \tag{4.16} \]

The above properties imply that G is increasing in λ and decreasing in σ. Also note that G(λ, 1) = 0.

Proposition 11. A (unique) investment equilibrium exists if

\[ \frac{c}{\theta_H^{1-\phi}\cdot[\theta_H^{\phi} - \theta_L^{\phi}]} \leq G(\lambda, 0). \tag{4.17} \]

Furthermore, the proportion of low types investing in this equilibrium, $\sigma^*$, is zero if:

\[ \frac{c}{\theta_H^{1-\phi}\cdot[\theta_H^{\phi} - \theta_L^{\phi}]} \leq G(\lambda, 0) \leq \frac{c}{\theta_L^{1-\phi}\cdot[\theta_H^{\phi} - \theta_L^{\phi}]}, \tag{4.18} \]

and satisfies

\[ G(\lambda, \sigma^*) = \frac{c}{\theta_L^{1-\phi}\cdot[\theta_H^{\phi} - \theta_L^{\phi}]} \tag{4.19} \]

otherwise.

The first condition is an incentive compatibility condition that ensures that high types prefer to invest if all other high types - but no low types - invest. The final condition imposes that low types are indifferent between investing and not investing. The expressions show that an investment equilibrium exists when frictions, costs and spillovers are sufficiently small, or when $\theta_H - \theta_L$ is sufficiently large. Furthermore, the
Furthermore, the  proportion of low types investing in such an equilibrium can never decrease as frictions decline. The proportion of low types that invest is zero when frictions are high, but may become increasingly large as frictions go to zero.  4.3.2  Equilibrium Segregation  As with the frictionless models, there is a trade-off between the two equilibrium classes: a no-investment equilibrium wastes fewer resources on investment, but average output is greater in an investment equilibrium due to the higher degree of positive assortative matching. As discussed in the previous section, the key variable in this type of analysis is M , which in turn requires a calculation of ρ. A general statement characterizing ρ in this model is now given.  79  Proposition 12. If the equilibrium probability that a high type is matched with another high type is Z, then ρ is given by: ρ≡  Z −ψ . 1−ψ  (4.20)  This result is useful because the value of Z is readily calculated. In the no-investment equilibrium Z = ψ and the correlation coefficient is zero (i.e. random matching). In the investment equilibrium Z = λ · ξ1 + (1 − λ) · ξ0 ∈ [ψ, 1], implying that matching is positive  assortative.62 In fact, Z also equals the probability that an investor is matched with a high type.63  4.3.3  The Impact of Frictions  Having characterized ρ, I now analyze how this variable is affected by frictions. Proposition 13. A reduction in frictions (i.e. an increase in λ): • Has no effect on ρ in the non-investment equilibrium, • Raises ρ in the investment equilibrium if σ ∗ = 0, and • Lowers ρ in the investment equilibrium if σ ∗ > 0. Consider the investment equilibrium when frictions are great enough such that no low types are investing. Lowering frictions further increases equilibrium segregation since investors are more likely to match with other investors, and there is no change in the composition of the pools. However, there may come a point at which some low types find it optimal to invest. 
If this point is reached, then reducing frictions further makes non-investors worse off. This encourages some more non-investors (low types) to invest. Although lower frictions mean that investors are more likely to match with other investors, the composition of the pool of investors is also changed due to the influx of low types. The latter effect dominates the former effect, and investors actually become less likely to match with a high type.

Corollary 2. If $\dfrac{c}{\theta_L^{1-\phi}\cdot[\theta_H^{\phi} - \theta_L^{\phi}]} < G(1, 0)$, then ρ is inverse-U shaped in λ.

The condition given in the Corollary is simply the condition that ensures that a positive proportion of low types invest if frictions are sufficiently small.

Footnote 62: The fact that Z > ψ in the investment equilibrium follows from the observation that $\xi_1 > \xi_0$ (otherwise there would be no incentive to invest) and $\psi = q\xi_1 + (1-q)\xi_0$, where q ∈ (0, 1) is the proportion of investors in the population (since both sides of the equation represent the probability that a random agent is of a high type).

Footnote 63: This claim follows immediately in the investment equilibrium since i) the probability of matching with a high type depends only on the investment decision, and ii) all high types are investors. In a non-investment equilibrium the probability that an investor is matched with a high type equals the probability that a non-investor is matched with a high type. Since high types are non-investors, this also equals the probability with which high types are matched with high types (Z).

4.3.4 Welfare

Although there are many plausible alternatives[64], define welfare to be the average payoff:

\[ W \equiv \int_I u_i \, di = \mu_y - c\cdot\int_I \sigma_i \, di. \tag{4.21} \]

I now discuss how welfare changes both within and across equilibria.

Within Equilibria

We know from the above discussion that frictions do not affect welfare in the no-investment equilibrium, since neither segregation nor the propensity to invest is affected by frictions.
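The equilibrium conditions in Proposition 11 and the segregation measure in Proposition 12 lend themselves to a small numerical sketch. The code below is only an illustration under assumed parameter values (types 1 and 2, φ = 0.3, ψ = 0.5, c = 0.1, none of which come from the text); the function names are hypothetical, and bisection is just one convenient way to solve (4.19), relying on G being decreasing in σ. It restricts attention to λ < 1.

```python
def beliefs(lam, sigma, psi):
    """Beliefs (xi_0, xi_1) in an investment equilibrium, equations (4.14)-(4.15)."""
    investors = psi + sigma * (1.0 - psi)          # all high types plus a share sigma of low types
    non_investors = (1.0 - sigma) * (1.0 - psi)    # remaining low types
    xi0 = (1.0 - lam) * psi / ((1.0 - lam) * investors + lam * non_investors)
    xi1 = lam * psi / (lam * investors + (1.0 - lam) * non_investors)
    return xi0, xi1


def gain(lam, sigma, psi):
    """G(lam, sigma) from (4.16): extra probability of meeting a high type when investing."""
    xi0, xi1 = beliefs(lam, sigma, psi)
    return (2.0 * lam - 1.0) * (xi1 - xi0)


def low_type_share(lam, psi, c, theta_L, theta_H, phi, tol=1e-12):
    """sigma* from Proposition 11, or None when condition (4.17) fails."""
    spread = theta_H ** phi - theta_L ** phi
    k_high = c / (theta_H ** (1.0 - phi) * spread)
    k_low = c / (theta_L ** (1.0 - phi) * spread)
    g0 = gain(lam, 0.0, psi)
    if g0 < k_high:
        return None            # high types would not invest: no investment equilibrium
    if g0 <= k_low:
        return 0.0             # condition (4.18): low types stay out
    lo, hi = 0.0, 1.0          # G is decreasing in sigma, so bisect on (4.19)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gain(lam, mid, psi) > k_low:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def segregation(lam, psi, c, theta_L, theta_H, phi):
    """rho from (4.20); zero when only the no-investment equilibrium exists."""
    sigma = low_type_share(lam, psi, c, theta_L, theta_H, phi)
    if sigma is None:
        return 0.0
    xi0, xi1 = beliefs(lam, sigma, psi)
    z = lam * xi1 + (1.0 - lam) * xi0
    return (z - psi) / (1.0 - psi)


# Illustrative parameter values only.
params = dict(psi=0.5, c=0.1, theta_L=1.0, theta_H=2.0, phi=0.3)
for lam in (0.70, 0.77, 0.80, 0.85, 0.95):
    print(lam, low_type_share(lam, **params), segregation(lam, **params))
```

Under these assumed values, ρ rises with λ while no low types invest and falls once low types start to invest, which is the inverse-U pattern described in Corollary 2.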
The remainder of the section focuses on an investment equilibrium. The fact that output can be inverse U-shaped in frictions leads to the following Corollary.

Corollary 3. If $\dfrac{c}{\theta_L^{1-\phi}\cdot[\theta_H^{\phi} - \theta_L^{\phi}]} < G(1, 0)$, then welfare in the investment equilibrium is inverse-U shaped in λ.

When it is only the high types that invest, lower frictions raise welfare because matching displays greater segregation, raising average output (without any change in investment costs). However, when some low types are investing, welfare is reduced on two fronts: i) matching becomes less segregated, lowering average output, and ii) more agents are incurring investment costs.

Notice that if a reduction in frictions happens to lower welfare, then we know that the reduction must also have lowered segregation (ρ). But, by the characterization of ρ, this implies that the probability that a high type matches with a high type also declines, which in turn implies that a high type’s payoff declines. Since lower frictions always lower a low type’s payoff, it follows that if a decline in frictions lowers welfare, then both high and low types are worse off. Conversely, if an increase in frictions raises welfare, then both high and low types are better off. Thus, the following (stronger) welfare result is apparent.

Footnote 64: For instance, one may be interested in the average value of augmented individual payoffs, $g(u_i)$, where g is some increasing function. If g is concave, then this tends to lower the welfare gains from segregation since such gains involve a greater dispersion in payoffs (higher types increase their payoffs by more than lower types decrease theirs). The perverse effects of frictions are robust to this generalization; see Proposition 14.

Proposition 14. If a decrease (increase) in frictions leads to a decrease (increase) in welfare, then the change in frictions also induces a Pareto deterioration (improvement).
In other words, whenever frictions have a ‘perverse’ welfare effect, the effect is robust in the sense that all agents’ payoffs change in the same direction: therefore the particular choice of welfare weights is immaterial. This is not the case when changes have ‘non-perverse’ effects: e.g. when lower frictions raise welfare, it is because segregation is increased, and although this raises the payoffs of investors (high types), it lowers the payoff of non-investors (low types).

Across Equilibria

Let $\underline{\lambda}$ be the smallest value of λ such that an investment equilibrium exists. To make this section relevant, I assume throughout that $\underline{\lambda} < 1$. Welfare in the no-investment equilibrium, $W^N$, is unaffected by frictions. This forms a natural benchmark against which to compare welfare in the investment equilibrium. Let welfare in the investment equilibrium be denoted $W^I(\lambda)$.

To begin, note that the investment equilibrium can never Pareto dominate the no-investment equilibrium (since the low types are always worse off in the investment equilibrium). The reverse is not true.

Proposition 15. There always exists a region of frictions, $[\underline{\lambda}, \lambda']$, where $\underline{\lambda} < \lambda'$, for which the no-investment equilibrium Pareto dominates the investment equilibrium for all $\lambda \in [\underline{\lambda}, \lambda']$.

Figure 4.1 provides a geometric proof of this proposition. The figure plots the payoffs obtained by a high type as a function of frictions, in various scenarios. The horizontal line is the payoff in the no-investment equilibrium, and is simply the expected output produced by a high type in that equilibrium. The two dark lines represent the expected output of a high type given that all other high types, and no low types, are investing. The upper line corresponds to the expected output obtained if the high type invests, and the lower line corresponds to the expected output if the high type does not invest. The two lines meet when λ = 1/2 since the signal is uninformative in this case.
The lower of these two lines also represents the payoff of a high type when they do not invest, since investment costs are zero. To get the payoff from investing, the upper line needs to be shifted down by the investment cost, as indicated. The value of $\underline{\lambda}$ is obtained by finding the point at which the payoff lines cross: for values of λ below this point, the payoff to not investing is higher. The figure shows that the payoff to a high type when frictions are at $\underline{\lambda}$ is less than the payoff associated with the no-investment equilibrium (as indicated by the leftmost set of arrows). By continuity, the welfare of high types remains lower in the investment equilibrium for values of λ marginally above $\underline{\lambda}$. For low types, it is always the case that the investment equilibrium delivers a lower payoff than the no-investment equilibrium. Thus, there is a range of λ, just above $\underline{\lambda}$, for which all agents get a lower payoff in the investment equilibrium than in the no-investment equilibrium.

[Figure 4.1: A Geometric Proof of Proposition 15. Vertical axis: payoffs; horizontal axis: λ, with 1/2, $\underline{\lambda}$ and $\tilde{\lambda}$ marked. Labelled curves and levels: $Y_H^I$, $Y_H^I - c$, $W_H^N$, $W_H^I(\lambda)$ and $Y_H^N$.]

As frictions decrease (i.e. as λ increases from $\underline{\lambda}$), the payoff to high types increases because we know that $\sigma^* = 0$ in the investment equilibrium at $\underline{\lambda}$ (since high types are indifferent, and low types get a lower payoff from investment). If low types have not started to invest by the time λ reaches the point indicated by $\tilde{\lambda}$, then $\lambda' = \tilde{\lambda}$. This is because this is the point at which high types get the same payoff across equilibria. If, on the other hand, low types do start to invest prior to $\tilde{\lambda}$, then $\lambda' = 1$. This is because the payoff to high types falls with λ once low types start to invest.

The case of $\lambda' = 1$ is not very interesting since it implies that the investment equilibrium is Pareto dominated whenever one exists. The following result, which discusses the weaker notion of welfare dominance[65], applies to the more interesting case of $\lambda' < 1$.

Proposition 16.
If $\lambda' < 1$, then there exists a $\lambda''$, where $\lambda' < \lambda''$, such that the no-investment equilibrium welfare dominates the investment equilibrium for all $\lambda \in [\underline{\lambda}, \lambda'']$.

Footnote 65: By this I mean that welfare dominance is weaker than Pareto dominance (since the former is implied by, but does not imply, the latter).

The proof of this follows from noting that the no-investment equilibrium welfare dominates the investment equilibrium at $\lambda = \lambda'$. The logic is that the high types get the same payoff across the equilibria whereas the low types are strictly worse off in the investment equilibrium. Thus, $W^N > W^I(\lambda')$, and continuity ensures that $W^N > W^I(\lambda' + \varepsilon)$ for small enough ε > 0.

It may very well be the case that the no-investment equilibrium welfare dominates the investment equilibrium for all levels of frictions (i.e. $\lambda'' = 1$). Thus, if the investment equilibrium is to welfare dominate the no-investment equilibrium, then it must be that $\lambda'' < 1$. But even then, the fact that welfare in the investment equilibrium ($W^I(\lambda)$) can be inverse U-shaped means that there is no guarantee that this welfare dominance does not reverse itself at even lower frictions (higher λ).

4.4 Discussion

4.4.1 Differential Efficacy of Investment Decisions

The model presented above restricted the two possible signals to be equally effective in the sense that the probability of s = 1 given investment is the same as the probability of s = 0 given non-investment. This section explores the consequences of allowing for the signals to differ along this dimension. To this end, let

\[ \Pr[s_i = 0 \mid x_i = 0] = \lambda_0 \in [0.5, 1], \qquad \Pr[s_i = 1 \mid x_i = 1] = \lambda_1 \in [0.5, 1]. \]

That is, $\lambda_1$ reflects the effectiveness of the investment in generating the ‘high’ signal and $\lambda_0$ reflects the effectiveness of not-investing in generating the ‘low’ signal.
This generalization is easily incorporated into the model since the only major change is that the probability of obtaining the high signal, conditional on investing with probability σ, is now:

\[ \pi(\sigma) = \sigma\cdot\lambda_1 + (1-\sigma)\cdot(1-\lambda_0). \]

The no-investment equilibrium is completely unaffected by this generalization. The investment equilibrium is affected to the extent that now the difference in the probability of matching with a high type associated with making the investment (relative to not investing) is:

\[ G((\lambda_0, \lambda_1), \sigma) = (\lambda_0 + \lambda_1 - 1)\cdot[\xi_1((\lambda_0, \lambda_1), \sigma) - \xi_0((\lambda_0, \lambda_1), \sigma)], \tag{4.22} \]

where

\[ \xi_0((\lambda_0, \lambda_1), \sigma) = \frac{(1-\lambda_1)\cdot\psi}{(1-\lambda_1)\cdot[\psi + \sigma(1-\psi)] + \lambda_0\cdot(1-\sigma)(1-\psi)}, \]

\[ \xi_1((\lambda_0, \lambda_1), \sigma) = \frac{\lambda_1\cdot\psi}{\lambda_1\cdot[\psi + \sigma(1-\psi)] + (1-\lambda_0)\cdot(1-\sigma)(1-\psi)}. \]

Two observations emerge from (4.22). First, for fixed beliefs, $(\xi_0, \xi_1)$, all that matters for determining G (and therefore incentives to invest) is the average value of $\lambda_0$ and $\lambda_1$. This is not to say that generalizing the information structure in this way has no real effects on equilibrium, since the second observation is that for fixed investment behaviour (σ), equilibrium beliefs are sensitive to $\lambda_0$ and $\lambda_1$ independent of their average.

To see that equilibrium beliefs are affected, consider the case in which ψ = 0.5. The difference in beliefs then becomes:

\[ \xi_1 - \xi_0 = \frac{\lambda_1}{1 + (\lambda_1 - \lambda_0) + \sigma\cdot(\lambda_1 + \lambda_0 - 1)} - \frac{1 - \lambda_1}{1 - (\lambda_1 - \lambda_0) + \sigma\cdot(\lambda_1 + \lambda_0 - 1)}. \]

To analyze the importance of dispersion in the λ terms, fix some λ ∈ (0.5, 1) and let $\lambda_0 = \lambda - \nu$ and $\lambda_1 = \lambda + \nu$, where |ν| is small enough that $(\lambda - \nu, \lambda + \nu) \in (0.5, 1) \times (0.5, 1)$. The difference in beliefs then reduces to:

\[ \xi_1 - \xi_0 = \frac{\lambda + \nu}{1 + 2\nu + \sigma\cdot(2\lambda - 1)} - \frac{1 - \lambda - \nu}{1 - 2\nu + \sigma\cdot(2\lambda - 1)} \equiv L(\nu, \lambda, \sigma). \]

Taking the derivative with respect to ν gives:

\[ \frac{d}{d\nu} L(\nu, \lambda, \sigma) = \frac{(2\lambda - 1)(1 + \sigma)}{((2\lambda - 1)\sigma - 2\nu + 1)^2} - \frac{(2\lambda - 1)(1 - \sigma)}{((2\lambda - 1)\sigma + 2\nu + 1)^2}, \]

which is positive whenever ν is positive.
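The role of dispersion can also be explored numerically. The sketch below is illustrative only: it evaluates G in (4.22) directly from the belief expressions that follow it, rather than from the simplified expression L, and looks only at the case in which no low types invest (σ = 0). The function names and the values λ = 0.75, ν = ±0.1 and ψ = 0.5 are assumptions made for the example.

```python
def beliefs_general(lam0, lam1, sigma, psi):
    """Beliefs (xi_0, xi_1) under signal accuracies lam0 and lam1, as given after (4.22)."""
    investors = psi + sigma * (1.0 - psi)
    non_investors = (1.0 - sigma) * (1.0 - psi)
    xi0 = (1.0 - lam1) * psi / ((1.0 - lam1) * investors + lam0 * non_investors)
    xi1 = lam1 * psi / (lam1 * investors + (1.0 - lam0) * non_investors)
    return xi0, xi1


def gain_general(lam0, lam1, sigma, psi):
    """G((lam0, lam1), sigma) from (4.22)."""
    xi0, xi1 = beliefs_general(lam0, lam1, sigma, psi)
    return (lam0 + lam1 - 1.0) * (xi1 - xi0)


def gain_with_spread(lam, nu, sigma, psi=0.5):
    """Mean-preserving spread: lam0 = lam - nu and lam1 = lam + nu."""
    return gain_general(lam - nu, lam + nu, sigma, psi)


# Illustrative values only; sigma = 0 means that no low types invest.
lam, sigma = 0.75, 0.0
no_spread = gain_with_spread(lam, 0.0, sigma)
spread_up = gain_with_spread(lam, 0.1, sigma)      # lam1 above lam0
spread_down = gain_with_spread(lam, -0.1, sigma)   # lam0 above lam1

print("G with nu = 0:   ", no_spread)
print("G with nu = 0.1: ", spread_up)
print("G with nu = -0.1:", spread_down)
```

In this example the return to investment is higher with either spread than with equal accuracies, in line with the statement that dispersion in either direction can raise the return when σ = 0.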
This implies that if we enacted a mean-preserving increase in dispersion by raising $\lambda_1$ (and therefore lowering $\lambda_0$), the net return to investment is increased. Furthermore, notice that the derivative is negative at some ν < 0 as long as σ is not too large. In other words, if ν is sufficiently negative (the cut-off depending on σ) the net return to investment is increased by making ν even more negative. Thus, increasing dispersion in the λ’s by instead raising $\lambda_0$ above $\lambda_1$ can also raise the return to investment. In particular, this is always the case when σ = 0 since the derivative is positive if and only if ν is positive.

In terms of policy, the desirability of raising the return to investment depends on whether doing so induces more low types to invest. If it does not, then welfare can be increased by increasing the dispersion of the λ’s (in either direction) for any given mean value. On the other hand, if some low types are investing, then welfare can be increased by lowering the return to investment by reducing the dispersion if $\lambda_1 > \lambda_0$. The minimum return that can be generated by adjusting the λ’s when some low types are investing will have the property that $\lambda_0 > \lambda_1$.[66] Since such considerations are not central to the paper, exploring the finer details of, and developing a clear intuition for, these results is left to future research.

4.4.2 Interpreting Trends in Inequality

A great deal of research, both empirical and theoretical, has been devoted to understanding the recent dramatic rise in wage inequality, especially in the United States (see Acemoglu (2002) for a survey). Two central themes arising from the empirical research are i) that a great deal of inequality is unexplained (i.e. not attributed to changes in characteristics or their prices), and ii) that inequality is more prominent among the high skilled (e.g. Lemieux (2006)).
Within a standard human capital framework, these observations suggest technological changes in the production technology in which ‘skill’, whether observed or unobserved, has become more productive (since such skill commands a higher price). However, if these observations are interpreted within the context of the above model, no such conclusion need be drawn.

To see this, interpret the model in the following way. Since production entails interpersonal spillovers, and does not stress physical capital, the agents can be understood to be high skilled workers (that is, both types in the model are highly productive relative to the population as a whole, but the high types are relatively more productive than the low types). To fix ideas, interpret the agents in the model to be those individuals that attend college. The investment is interpreted as the effort expended in college to earn a high grade, and frictions correspond to the probability that a high effort is rewarded with a high grade. Grades are observed in the labor market, and matches are formed on this basis; however, the researcher only observes whether the worker attended college.

The model can be used to understand the two observations when interpreted in this way, since i) all inequality is unexplained, and ii) it applies only to high skill workers. Changes in inequality, however, have nothing to do with the production technology, but rather are the result of changing frictions. Of course, frictions are not observed directly, so other evidence is needed.

Footnote 66: This is because the first-order condition for the minimization problem is to set ν at the value that satisfies

\[ \frac{(1 - \sigma)}{(1 + \sigma)} = \frac{((2\lambda - 1)\sigma + 2\nu + 1)^2}{((2\lambda - 1)\sigma - 2\nu + 1)^2}, \]

which, for σ > 0, involves ν < 0 (i.e. $\lambda_1 < \lambda_0$).

One avenue is to note that the model predicts that inequality should largely be a between-firm phenomenon since it stresses inequality arising from matching patterns.
Further, inequality should not be as pronounced in manufacturing, where the types of spillovers analyzed are less likely to operate across workers. Although the lack of matched worker-firm data makes these predictions difficult to test, existing evidence is supportive of these conclusions (Dunne et al. (2004) and Faggio et al. (2007)). A richer model is needed to make predictions about cross-industry or cross-occupation trends, since selection across industries (which are perhaps differentiated by spillovers) will be a major econometric concern.

Another avenue is to exploit the fact that although increased segregation raises wage inequality, it has no necessary effect on average wages. This is not generally true of the human capital (‘price of skill’) explanation for rising inequality. This is essentially because the unobserved ability appears in both the constant and the error term in that conceptualization. Thus, a higher price of unobserved skill increases wage inequality, but will generally have level effects also. Exploring the finer details of this argument in order to develop a discriminating empirical test is left to future research.

4.5 Conclusions

I have developed a model of matching with hidden types in which the capacity to secure a better match via wasteful investment is diminished by the existence of frictions. This extension adds an element of reality to this class of models, which I see as being important for two reasons. First, it forms a richer base from which to formulate empirical tests of the general mechanism. Second, it aids us in understanding the central welfare trade-off inherent in such models. A central result is that equilibrium segregation (and therefore output and inequality) and welfare are generally not monotonic in the degree of frictions.
Although lower frictions allow investors to be more readily matched with each other, they also raise the return to investing, which can change the composition of the pool of investors and actually reduce the degree of segregation.

Chapter 5

Conclusions

The research contained in the preceding chapters provides a framework with which to understand and analyze economic environments in which i) agents realize their payoffs through interaction with others, ii) agents have a capacity to influence who they match with, and iii) agents’ relevant characteristics are unobservable in the matching market. The first two models have the additional feature that the relevant characteristics are endogenously determined. I abstract from this feature in the third model in order to provide a clearer focus on the role of frictions. A general point to be taken from the models is that competition for partners in such settings is not efficient, and in fact can be welfare-reducing relative to some reasonable benchmarks. The first model shows that, despite the positive skill spillover, there is over-investment in education when education is used to attract highly skilled coworkers. I argue that this mechanism provides a plausible explanation of credentialism in the labour market that avoids certain criticisms of standard signaling models. The over-investment result is shown to arise because of the local nature of the spillover, and it is this feature that produces policy implications that are contrary to the received view based on global spillovers. Observed returns to education need to be interpreted with care in the presence of such localized spillovers. I extend the model to a simple dynamic setting, and show that many of the conclusions drawn in the static setting remain intact in the more realistic dynamic version.
In contrast to the over-investment result, the second model demonstrates how the inherent under-investment problem is exacerbated in the presence of multiple pre-match investments. In a setting in which a child’s human capital is subject to peer effects, the model describes how altruistic parents accumulate wealth in order to place their child among desirable peers. Doing so represents a drain on parental resources, and parental investment is made more costly as a result. Besides providing illustrations of equilibrium, I extend the model by allowing both wealth and parental investment to be observed with noise. Continuing this exercise of adding imperfections to the base model, the third model imposes frictions by breaking the deterministic link between actual and perceived investment. The notion of efficiency is slightly different in this chapter, since I assume that the relevant characteristic is exogenously fixed. This is helpful in sharpening the focus on the inefficiency surrounding imperfectly assortative equilibrium matching. I show how greater frictions can actually improve matching efficiency by discouraging some low types from masquerading as high types. The results contained in the preceding chapters must be taken within the context of the models’ limitations. One prominent limitation is that much of the analysis was performed in a static setting (although a simple dynamic setting is analyzed in the first model). A more careful modeling of a dynamic environment, including the introduction of decentralized search, is planned for future research. I also see value in further exploring the implications of various imperfections (costly search is but one). The modeling of noise in the second model, and of frictions in the third model, was necessarily simple. A more general approach to modeling frictions quickly becomes intractable, and the intuition behind the mechanisms rapidly becomes opaque.
However, a more general treatment has obvious benefits and, in my opinion, represents a useful topic for future research. Apart from those avenues already mentioned, the technical details of the models presented here are quite simple and leave much scope for extensions. For instance, it may be insightful to explore the case of finite numbers of agents, or a case in which types are multi-dimensional. Perhaps the most pressing direction for future research is an empirical one. The empirical literatures on returns to education and on peer effects are huge. There is also a growing interest in social returns to education. I hope the research presented here provides a useful theoretical lens through which such empirical work can be interpreted.

Bibliography

Acemoglu, D. (1999): “Changes in Unemployment and Wage Inequality: An Alternative Theory and Some Evidence,” American Economic Review, 89(5), 1259–1278.

Acemoglu, D. (2002): “Technical Change, Inequality, and the Labor Market,” Journal of Economic Literature, 40(1), 7–72.

Acemoglu, D., and J. Angrist (1999): “How Large are the Social Returns to Education? Evidence from Compulsory Schooling Laws,” NBER Working Papers 7444, National Bureau of Economic Research, Inc.

Akerlof, G. A. (1976): “The Economics of Caste and of the Rat Race and Other Woeful Tales,” The Quarterly Journal of Economics, 90(4), 599–617.

Albrecht, J., and S. Vroman (2002): “A Matching Model with Endogenous Skill Requirements,” International Economic Review, 43(1), 283–305.

Altonji, J. G., and C. R. Pierret (2001): “Employer Learning And Statistical Discrimination,” The Quarterly Journal of Economics, 116(1), 313–350.

Arrow, K. J. (1973): “Higher education as a filter,” Journal of Public Economics, 2(3), 193–216.

Autor, D. H., F. Levy, and R. J. Murnane (2003): “The Skill Content Of Recent Technological Change: An Empirical Exploration,” The Quarterly Journal of Economics, 118(4), 1279–1333.

Baker, G., R. Gibbons, and K. J.
Murphy (1994): “Subjective Performance Measures in Optimal Incentive Contracts,” The Quarterly Journal of Economics, 109(4), 1125–56.

Becker, G. S. (1973): “A Theory of Marriage: Part I,” Journal of Political Economy, 81(4), 813–46.

Bedard, K. (2001): “Human Capital versus Signaling Models: University Access and High School Dropouts,” Journal of Political Economy, 109(4), 749–775.

Benabou, R. (1996a): “Equity and Efficiency in Human Capital Investment: The Local Connection,” Review of Economic Studies, 63(2), 237–64.

Benabou, R. (1996b): “Heterogeneity, Stratification, and Growth: Macroeconomic Implications of Community Structure and School Finance,” American Economic Review, 86(3), 584–609.

Berg, I. (1970): Education and Jobs: The Great Training Robbery. Praeger Publishers, New York.

Bidner, C. (2008a): “A Spillover-Based Theory of Credentialism,” Working paper, University of British Columbia.

Bidner, C. (2008b): “Two-Sided Noisy Signaling,” Working paper, University of British Columbia.

Blankenau, W., and G. Camera (2006): “A Simple Economic Theory of Skill Accumulation and Schooling Decisions,” Review of Economic Dynamics, 9(1), 93–115.

Blankenau, W., and G. Camera (2008): “Public Spending on Education and the Incentives for Student Achievement,” Economica, Forthcoming.

Brown, D. K. (1995): Degrees of Control: A Sociology of Educational Expansion and Occupational Credentialism. Teachers College Press, New York and London.

Burdett, K., and M. G. Coles (1997): “Marriage and Class,” The Quarterly Journal of Economics, 112(1), 141–68.

Burdett, K., and M. G. Coles (2001): “Transplants and Implants: The Economics of Self-Improvement,” International Economic Review, 42(3), 597–616.

Chade, H. (2006): “Matching with noise and the acceptance curse,” Journal of Economic Theory, 127(1), 81–113.

Cho, I.-K., and D. M. Kreps (1987): “Signaling Games and Stable Equilibria,” The Quarterly Journal of Economics, 102(2), 179–221.

Cole, H. L., G. J. Mailath, and A.
Postlewaite (1995): “Incorporating concern for relative wealth into economic models,” Quarterly Review, (Sum), 12–21.

Cole, H. L., G. J. Mailath, and A. Postlewaite (2001): “Efficient Non-Contractible Investments in Large Economies,” Journal of Economic Theory, 101(2), 333–373.

Collins, R. (1979): The Credential Society. Academic Press, Boston.

Damiano, E., and H. Li (2007): “Price discrimination and efficient matching,” Economic Theory, 30(2), 243–263.

de Bartolome, C. A. M. (1990): “Equilibrium and Inefficiency in a Community Model with Peer Group Effects,” Journal of Political Economy, 98(1), 110–33.

Drago, R., and G. T. Garvey (1998): “Incentives for Helping on the Job: Theory and Evidence,” Journal of Labor Economics, 16(1), 1–25.

Dunne, T., L. Foster, J. Haltiwanger, and K. R. Troske (2004): “Wage and Productivity Dispersion in United States Manufacturing: The Role of Computer Investment,” Journal of Labor Economics, 22(2), 397–430.

Durlauf, S. N. (1996): “A Theory of Persistent Income Inequality,” Journal of Economic Growth, 1(1), 75–93.

Durlauf, S. N. (2004): “Neighborhood Effects,” in Handbook of Regional and Urban Economics, ed. by J. V. Henderson, and J. F. Thisse, vol. 4, pp. 2173–2242. Elsevier.

Eckwert, B., and I. Zilcha (2007): “The Effect of Better Information on Income Inequality,” Economic Theory, 32(2), 287–307.

Eckwert, B., and I. Zilcha (2008): “Efficiency of Screening and Labor Income Inequality,” Journal of Public Economic Theory, 10(1), 77–98.

Faggio, G., K. Salvanes, and J. V. Reenen (2007): “The Evolution of Inequality in Productivity and Wages: Panel Data Evidence,” Working Paper 13351, National Bureau of Economic Research.

Farber, H. S., and R. Gibbons (1996): “Learning and Wage Dynamics,” The Quarterly Journal of Economics, 111(4), 1007–47.

Fernandez, R., and R. Rogerson (1996): “Income Distribution, Communities, and the Quality of Public Education,” The Quarterly Journal of Economics, 111(1), 135–64.

Ferrer, A. M., and W. C.
Riddell (2002): “The role of credentials in the Canadian labour market,” Canadian Journal of Economics, 35(4), 879–905.

Frank, R. H. (1985): “The Demand for Unobservable and Other Nonpositional Goods,” American Economic Review, 75(1), 101–16.

Friedman, B. (2005): The Moral Consequences of Economic Growth. Knopf, New York.

Gant, J., C. Ichniowski, and K. Shaw (2002): “Social Capital and Organizational Change in High-Involvement and Traditional Work Organizations,” Journal of Economics and Management Strategy, 11(2), 289–328.

Glaeser, E. L. (1999): “Learning in Cities,” Journal of Urban Economics, 46(2), 254–277.

Gottschalk, P., and T. M. Smeeding (1997): “Cross-National Comparisons of Earnings and Income Inequality,” Journal of Economic Literature, 35(2), 633–687.

Han, S. (2005): “Competitive Investments and Matching: Hedonic Pricing Problems,” Mimeo, McMaster University.

Holmstrom, B., and P. Milgrom (1991): “Multitask Principal-Agent Analyses: Incentive Contracts, Asset Ownership, and Job Design,” Journal of Law, Economics and Organization, 7(0), 24–52.

Hopkins, E. (2005): “Job Market Signalling of Relative Position, or Becker Married to Spence,” ESE Discussion Papers 134, Edinburgh School of Economics, University of Edinburgh.

Hoppe, H. C., B. Moldovanu, and A. Sela (2005): “The Theory of Assortative Matching Based on Costly Signals,” Discussion Papers 85, SFB/TR 15 Governance and the Efficiency of Economic Systems, Free University of Berlin, Humboldt University of Berlin, University of Bonn, University.

Ichniowski, C., and K. Shaw (2003): “Beyond Incentive Pay: Insiders’ Estimates of the Value of Complementary Human Resource Management Practices,” Journal of Economic Perspectives, 17(1), 155–180.

Jacobs, J. (2004): Dark Age Ahead. Random House, New York.

Jovanovic, B., and R. Rob (1989): “The Growth and Diffusion of Knowledge,” Review of Economic Studies, 56(4), 569–82.

Kremer, M.
(1993): “The O-Ring Theory of Economic Development,” The Quarterly Journal of Economics, 108(3), 551–75.

Kremer, M., and E. Maskin (1996): “Wage Inequality and Segregation,” Harvard Institute of Economic Research Working Papers 1777, Harvard - Institute of Economic Research.

Labaree, D. F. (1997): How to Succeed in School Without Really Learning: The Credentials Race in American Education. Yale University Press, New Haven and London.

Lang, K., and D. Kropp (1986): “Human Capital versus Sorting: The Effects of Compulsory Attendance Laws,” The Quarterly Journal of Economics, 101(3), 609–24.

Lang, K., and M. Manove (2006): “Education and Labor-Market Discrimination,” NBER Working Papers 12257, National Bureau of Economic Research, Inc.

Lange, F. (2007): “The Speed of Employer Learning,” Journal of Labor Economics, 25(1), 1–36.

Lange, F., and R. Topel (2006): “The Social Value of Education and Human Capital,” in Handbook of the Economics of Education, ed. by E. Hanushek, and F. Welch, chapter 8. North Holland, Boston.

Lazear, E. P. (1999): “Personnel Economics: Past Lessons and Future Directions: Presidential Address to the Society of Labor Economists, San Francisco, May 1, 1998,” Journal of Labor Economics, 17(2), 199–236.

Lemieux, T. (2006): “Postsecondary Education and Increasing Wage Inequality,” American Economic Review, 96(2), 195–199.

Lemieux, T., W. B. MacLeod, and D. Parent (2007): “Performance Pay and Wage Inequality,” Working Paper 13128, National Bureau of Economic Research.

Levy, F., and R. J. Murnane (2004): The New Division of Labor. Princeton University Press, Princeton, New Jersey.

Lucas, R. J. (1988): “On the mechanics of economic development,” Journal of Monetary Economics, 22(1), 3–42.

Moretti, E. (2004): “Workers’ Education, Spillovers, and Productivity: Evidence from Plant-Level Production Functions,” The American Economic Review, 94(3), 656–690.

Murnane, R. J., and F.
Levy (1996): Teaching the New Basic Skills. The Free Press, New York.

Peters, M. (2004a): “Non-Cooperative Foundations of Hedonic Equilibrium,” Micro Theory Working Papers peters-04-07-30-12-06-27, Microeconomics.ca Website.

Peters, M. (2004b): “The Pre-Marital Investment Game,” Micro Theory Working Papers peters-04-02-18-01-42-09, Microeconomics.ca Website.

Peters, M. (2006): “Truncated Hedonic Equilibrium,” Micro Theory Working Papers peters-06-04-11-02-42-39, Microeconomics.ca Website.

Peters, M. (2007): “A Non-Cooperative Approach to Hedonic Equilibrium,” mimeo, University of British Columbia.

Peters, M., and A. Siow (2002): “Competing Premarital Investments,” Journal of Political Economy, 110(3), 592–608.

Rege, M. (2007): “Why do people care about social status?,” Journal of Economic Behavior and Organization, Forthcoming.

Roth, A., and M. A. O. Sotomayor (1992): Two-Sided Matching: a Study in Game-Theoretic Modeling and Analysis. Cambridge University Press, Cambridge.

Rothschild, M., and J. E. Stiglitz (1976): “Equilibrium in Competitive Insurance Markets: An Essay on the Economics of Imperfect Information,” The Quarterly Journal of Economics, 90(4), 630–49.

Sand, B. M. (2007): “Has there been a Structural Change in the Labor Market? Evidence from U.S. Cities,” Working paper, University of British Columbia.

Sattinger, M. (1993): “Assignment Models of the Distribution of Earnings,” Journal of Economic Literature, 31(2), 831–80.

Sen, A. (1999): Development as Freedom. Oxford University Press, Oxford.

Shimer, R., and L. Smith (2000): “Assortative Matching and Search,” Econometrica, 68(2), 343–370.

Smith, L. (2006): “The Marriage Model with Search Frictions,” Journal of Political Economy, 114(6), 1124–1146.

Spence, A. M. (1973): “Job Market Signaling,” The Quarterly Journal of Economics, 87(3), 355–74.

Spence, A. M. (1974): Market Signaling: Informational Transfer in Hiring and Related Screening Processes.
Harvard University Press, Cambridge, Massachusetts.

Stiglitz, J. E. (1975): “The Theory of “Screening”, Education, and the Distribution of Income,” American Economic Review, 65(3), 283–300.

Tyler, J. H., R. J. Murnane, and J. B. Willett (2000): “Estimating The Labor Market Signaling Value Of The GED,” The Quarterly Journal of Economics, 115(2), 431–468.

Weiss, A. (1995): “Human Capital vs. Signalling Explanations of Wages,” Journal of Economic Perspectives, 9(4), 133–54.

Appendix A

Appendix to Chapter 2

A.1 Proofs

A.1.1 Proof of Lemma 1

Proof. For any given $\mu$, optimality implies that investment is a non-decreasing function of ability. To see this, consider two investments, $x$ and $x' > x$, and two types, $\theta$ and $\theta' > \theta$. If $\theta$ prefers $x'$ to $x$, then so too will $\theta'$. This is because if
$$(1 - \phi) \cdot [s(x', \theta) - s(x, \theta)] + \phi \cdot [\mu(x') - \mu(x)] - [c(x') - c(x)] \geq 0,$$
then
$$(1 - \phi) \cdot [s(x', \theta') - s(x, \theta')] + \phi \cdot [\mu(x') - \mu(x)] - [c(x') - c(x)] \geq 0,$$
since $s(x', \theta') - s(x, \theta') > s(x', \theta) - s(x, \theta)$ (by virtue of $x' > x$ and the property that $s_{x\theta} > 0$). Given this, rational expectations implies that beliefs are non-decreasing in investment. That is, all workers prefer to be matched with those that invest more. Suppose that matching were not positive assortative. Then there is some $x$ and $x' \neq x$ such that $m(x) = x'$. If $x < x'$ then workers that invest $x'$ prefer to match among themselves than according to $m$. By feasibility, at least some workers that invest $x'$ will match with workers that invest $x$. If $x > x'$, then workers that invest $x$ would prefer to match among themselves than according to $m$. Therefore, the only stable matching is positive assortative (which is also feasible).

A.1.2 Proof of Proposition 1

Proof. 1. All solutions are strictly increasing and ‘unbounded’. The direction field, along with the definition of $x_0^*$, ensures that all solutions are strictly increasing for $x \geq x_0$ whenever $x_0 \geq x_0^*$.
It is impossible for a proposed solution to cross the locus $\{(x, \xi) \mid \Gamma(x, \xi) = 0\}$, since this would imply that there are two distinct curves emanating from the crossing point. As mentioned in the text, this possibility is ruled out by technical assumptions made on $s(x, \theta)$.

For unboundedness, suppose to the contrary that there exists a $\bar{\xi}$ such that a solution satisfies $\xi(x) \leq \bar{\xi}$ for all $x$. Then since $s_x(x, \xi(x)) < s_x(x, \bar{\xi})$, we have that $c_x(x) - s_x(x, \xi(x)) > c_x(x) - s_x(x, \bar{\xi})$ for all $x$. Further, since $c_x(x) - s_x(x, \xi(x))$ is strictly increasing with a finite $\hat{x}$ such that $c_x(\hat{x}) - s_x(\hat{x}, \bar{\xi}) = 0$ (the social planner’s problem is well-defined), it follows that $c_x(x) - s_x(x, \xi(x)) > 0$ for all $x > \hat{x}$. In addition, the fact that $s_\theta(x, \theta)$ is bounded on $[0, \infty) \times \Theta$ implies that $s_\theta(x, \theta) \leq \bar{s}_\theta < \infty$ for all $x \geq \hat{x}$. By definition of $\xi'(x)$, we have that $\xi'(x) \geq [c_x(x) - s_x(x, \xi(x))]/(\phi \cdot \bar{s}_\theta)$, which is a strictly increasing function of $x$ that is strictly positive for $x > \hat{x}$. But this is a contradiction, since $\xi(x)$ would not be bounded if its slope were bounded strictly above zero as $x$ increases without bound. We conclude that any solution is unbounded.

2. Off-equilibrium values of $\mu(x)$ can be chosen such that workers would never deviate to them. Simply set $\mu(x) = 0$ for $x$ that do not arise in equilibrium. By definition of $x_0^{**}$, agents of the lowest type prefer their equilibrium payoff to the maximum payoff available when remaining unmatched. Single-crossing implies that all higher types strictly prefer to invest $x_0^{**}$ and be matched with workers of the lowest type, to being unmatched and making any other investment. As long as workers are choosing investments to maximize their payoff (verified next), it follows that all types prefer their equilibrium payoff to any payoff they could achieve by remaining unmatched. Investing $x$ and being unmatched is equivalent to $\mu(x) = 0$, which produces the result.
Such harsh beliefs are by no means necessary, but they are sufficient to ensure that all agents prefer their equilibrium outcome to investing any other off-equilibrium amount.

3. Agents are optimizing. Let $P(\theta)$ be the set of $(x, \mu)$ that is strictly preferred by a type $\theta$ worker to their equilibrium outcome. From the geometric representation of preferences, $P(\theta)$ was shown to be convex. Suppose that there exists a type, $\theta$, and an equilibrium investment, $x$, such that $(x, \mu(x)) \in P(\theta)$. Suppose first that $x > x(\theta)$. Then, by the mean value theorem, there is a $\hat{x} \in (x(\theta), x)$ such that
$$\mu'(\hat{x}) = \frac{\mu(x) - \mu(x(\theta))}{x - x(\theta)},$$
which itself is strictly greater than the slope of the type $\theta$ indifference curve at $x(\theta)$ (since the first-order condition holds for type $\theta$). This greater investment is made by a higher type, $\hat{\theta} > \theta$ (since $\xi$ is strictly increasing). The slope of the type $\hat{\theta}$ indifference curve must be less than the slope of $\mu(\hat{x})$ if their indifference curve is to cross that of the type $\theta$ worker. This contradicts the fact that the first-order condition holds also for $\hat{\theta}$. The analogous argument holds if $x < x(\theta)$.

An alternative approach is to show that equilibrium investments constitute a local maximum, then use a similar argument to show that indifference curves touch the return function only once. The curvature of the objective function at the equilibrium investment is:
$$\begin{aligned}
\frac{\partial^2}{\partial x^2} u(x, \theta, \mu(x)) &= \frac{\partial}{\partial x} \left\{ (1 - \phi) \cdot s_x(x, \theta) + \phi \cdot [s_x(x, \xi) + s_\theta(x, \xi) \cdot \xi_x] - c_x \right\} \\
&= \frac{\partial}{\partial x} \left\{ (1 - \phi) \cdot s_x(x, \theta) + \phi \cdot s_x(x, \xi) - s_x(x, \xi) \right\} \\
&= (1 - \phi) \cdot \frac{\partial}{\partial x} \left\{ s_x(x, \theta) - s_x(x, \xi) \right\} \\
&= (1 - \phi) \cdot [s_{xx}(x, \theta) - s_{xx}(x, \xi) - s_{x\theta}(x, \xi) \cdot \xi_x] \\
&= -(1 - \phi) \cdot s_{x\theta}(x, \theta) \cdot \xi_x,
\end{aligned}$$
which is strictly negative for $x > x_0$ since $s_{x\theta}(x, \theta) > 0$. The investment is therefore a local maximum for all types $\theta > \underline{\theta}$. Either way, I conclude that agents are optimizing given $\mu(x) = s(x, \xi(x))$, implying that $x(\theta) = \xi^{-1}(\theta)$ is an equilibrium investment function.

A.1.3 Proof of Proposition 2

Proof.
Notice that since $s_x(x, \xi^*(x)) = c_x(x)$, we can write:
$$\xi'(x) = \frac{s_x(x, \xi^*(x)) - s_x(x, \xi(x))}{\phi \cdot s_\theta(x, \xi(x))}.$$
Thus, there is over-investment ($\xi^*(x) > \xi(x)$) if and only if $\xi'(x) > 0$. Since $\xi'(x) > 0$ in equilibrium for all $x > x_0$, it follows that all types higher than the lowest type over-invest. Since the Pareto dominant separating equilibrium selects $x_0$ such that $\xi'(x_0) = 0$, it follows that the lowest type is investing efficiently.

A.1.4 Proof of Proposition 3

Proof. If $\underline{\theta}' < \underline{\theta}$, then the initial condition associated with the Pareto dominant equilibrium has the property that $\xi_0' < \xi_0$ and $x_0' < x_0$. The solution to the initial values problem with initial condition $\{\xi_0', x_0'\}$ lies everywhere below the solution to the problem with initial condition $\{\xi_0, x_0\}$. If it did not, then the solutions would cross at some point $(x, \xi)$. The fact that they cross implies that their slopes are different at that point; however, the slope of the solution is determined by $\Gamma(x, \xi)$, which takes a single value. This is a contradiction.

A.1.5 Proof of Proposition 4

Proof. Let the solution to the initial values problem when spillovers are $\phi$ be given by $\xi(x; \phi)$. Consider the direction field and note that a higher value of spillovers, $\phi' > \phi$, lowers $\Gamma(x, \xi)$ (the slope of $\xi$) at every $(x, \xi)$ pair. Thus, $\xi(x; \phi')$ must cross $\xi(x; \phi)$ ‘from above’. Investment of the lowest type is unaffected since this is determined by the condition $c_x(x_0) = s_x(x_0, \underline{\theta})$, which is independent of $\phi$. Thus, we have $\xi(x_0; \phi') = \xi(x_0; \phi)$. This in turn implies that $\xi(x; \phi')$ must lie below $\xi(x; \phi)$ for all $x > x_0$.

A.2 A Further Rationale for Selecting The Pareto Dominant Equilibrium

If Pareto optimality, together with the associated argument that focusing on this equilibrium can only harm an argument for over-investment, is not considered sufficient grounds for selection, then I propose the following as a reasonable economic sense in which the Pareto dominant equilibrium is more robust than the others.
Suppose we introduced firms explicitly and allowed them to offer investment-contingent wage contracts, just as in standard signaling models. I shall consider an equilibrium unreasonable if there exists a contract that guarantees a positive profit regardless of which types it attracts. Entrepreneurs that are able to offer such contracts are not part of the model, but this is for simplicity and clarity, not because their nonexistence is a relevant feature of the environment. If such entrepreneurs were to be incorporated, then these contracts would surely be offered since they are profitable for any beliefs about the distribution of types that such entrepreneurs may hold.67

67 Provided that beliefs are restricted to placing zero weight on types that do not in fact exist.

To formalize this idea, consider a particular contract $(w, x)$. If the firm hires workers with a profile of types $(\theta, \theta')$, then the resulting profits are:
$$\pi(w, x; \theta, \theta') = y(s(x, \theta), s(x, \theta')) + y(s(x, \theta'), s(x, \theta)) - 2 \cdot w.$$
The payoff accruing to a worker from accepting the contract is:
$$\tilde{v}(w, x) = w - c(x).$$
For some particular equilibrium, let $v(\theta)$ represent the equilibrium payoff to a type $\theta$ agent. A contract is said to be strongly profitable relative to the equilibrium if there exists a non-empty set of types, $\Theta'$, such that:

1. Profitability: $\pi(w, x; \theta, \theta') > 0$ for all $\theta, \theta' \in \Theta$, and
2. Attractiveness: $\tilde{v}(w, x) > v(\theta)$ for all $\theta \in \Theta'$.

The first condition says that the contract makes a strictly positive profit, regardless of which workers are hired. The second condition says that we can find a set of workers that would strictly prefer the contract to their equilibrium outcome. An equilibrium is said to be contract robust if there does not exist an associated strongly profitable contract.

Lemma 2. An equilibrium is contract robust if and only if it is the Pareto dominant equilibrium (i.e. $x_0 = x^*(\underline{\theta})$).

Proof. Consider some Pareto dominated equilibrium ($x_0 > x^*(\underline{\theta})$).
I am going to construct a strongly profitable contract by setting $x = x_0 - \varepsilon$ for an arbitrarily small $\varepsilon > 0$, and setting $w = s(x, \underline{\theta}) - \nu$ for an arbitrarily small $\nu > 0$. The contract is strongly profitable since the smallest profit possible arises when the firm happens to hire two workers of type $\underline{\theta}$, and is $2 \cdot [s(x, \underline{\theta}) - w] = 2 \cdot \nu > 0$. The contract is attractive to workers of the lowest type, since for them $v(\underline{\theta}) = s(x_0, \underline{\theta}) - c(x_0)$, yet $\tilde{v}(w, x) = s(x_0 - \varepsilon, \underline{\theta}) - c(x_0 - \varepsilon) - \nu$. We can always find an $\varepsilon > 0$ and $\nu > 0$ such that the latter is greater. To see this, start at $\nu = 0$ and note that since $\tilde{v}(w, x) = v(\underline{\theta})$ at $\varepsilon = 0$ and:
$$\left. \frac{\partial \tilde{v}(w, x)}{\partial \varepsilon} \right|_{\varepsilon = 0} = -[s_x(x_0, \underline{\theta}) - c_x(x_0)] > 0,$$
(it is positive since the bracketed term is negative by virtue of $x_0 > x^*(\underline{\theta})$), we can always find an $\varepsilon > 0$ such that $\tilde{v}(w, x) > v(\underline{\theta})$. It is then a matter of setting $\nu$ small enough that this inequality is not overturned.

In the other direction, consider the Pareto dominant equilibrium and suppose to the contrary that a strongly profitable contract exists. For the contract to be profitable, it must attract workers. If it attracts any workers it will always attract workers of the lowest type. Then, the lowest type must prefer it:
$$w - c(x) = \tilde{v}(w, x) > s(x^*(\underline{\theta}), \underline{\theta}) - c(x^*(\underline{\theta})),$$
but note that
$$s(x^*(\underline{\theta}), \underline{\theta}) - c(x^*(\underline{\theta})) \geq s(x', \underline{\theta}) - c(x')$$
for all $x'$, since $x^*(\underline{\theta})$ by definition satisfies the first-order condition for the maximization of the right side with respect to $x'$ (and the objective is concave). Therefore,
$$w - c(x) > s(x', \underline{\theta}) - c(x')$$
for all $x'$, and in particular, by setting $x' = x$, we have:
$$w > s(x, \underline{\theta}).$$
But this contradicts the contract being profitable when the firm happens to hire two workers with types sufficiently close to (but greater than) $(\underline{\theta}, \underline{\theta})$.

From a different angle, it is also the case that the Pareto dominant equilibrium is the only equilibrium in which there are off-equilibrium values of $\mu(x)$ that satisfy the following: there exists a $\theta' \in \Theta$ such that $\mu(x) \geq s(x, \theta')$.
In particular, if we let $\mu(x) = s(x, \underline{\theta})$ for all off-equilibrium values of $x$, then no worker has any incentive to deviate to an off-equilibrium investment level. This argument can be seen most clearly from Figure 2.5.

A.3 Derivation of Closed-Form Solution

Let $s(x, \theta) = \theta \cdot g(x)$ and consider $x_0 > 0$. To see that $\Gamma$ is linear in $\xi$, let
$$a(x) \equiv \frac{-g_x(x)}{\phi g(x)} \quad \text{and} \quad b(x) \equiv \frac{c_x(x)}{\phi g(x)},$$
so that we can write:
$$\xi'(x) = a(x) \cdot \xi(x) + b(x).$$
The solution to this first-order linear differential equation is:
$$\xi(x) = \left[ K + \int^x b(z) \cdot \exp\left( -\int^z a(t)\,dt \right) dz \right] \cdot \exp\left( \int^x a(z)\,dz \right),$$
where $K$ is a constant that adjusts so that the initial condition is satisfied and the notation $\int^x f(z)\,dz$ represents the indefinite integral of $f(x)$. Note that $\int^x a(z)\,dz = -(1/\phi) \ln(g(x))$, so that $\exp(\int^x a(z)\,dz) = g(x)^{-1/\phi}$, which lets us write:
$$\begin{aligned}
\xi(x) &= \left[ K + \int^x b(z) \cdot g(z)^{\frac{1}{\phi}}\,dz \right] \cdot g(x)^{-\frac{1}{\phi}} \\
&= \left[ K + \frac{1}{\phi} \int^x c'(z) \cdot g(z)^{\frac{1-\phi}{\phi}}\,dz \right] \cdot g(x)^{-\frac{1}{\phi}}.
\end{aligned}$$
For any given $\{\xi_0, x_0\}$, we know that:
$$K = \xi_0 \cdot g(x_0)^{\frac{1}{\phi}} - \frac{1}{\phi} \int^{x_0} c'(z) \cdot g(z)^{\frac{1-\phi}{\phi}}\,dz,$$
so that:
$$\begin{aligned}
\xi(x) &= \left[ \xi_0 \cdot g(x_0)^{\frac{1}{\phi}} + \frac{1}{\phi} \int_{x_0}^{x} c'(z) \cdot g(z)^{\frac{1-\phi}{\phi}}\,dz \right] \cdot g(x)^{-\frac{1}{\phi}} \\
&= \xi_0 \cdot \left( \frac{g(x_0)}{g(x)} \right)^{\frac{1}{\phi}} + \frac{1}{\phi} \int_{x_0}^{x} \frac{c'(z)}{g(z)} \cdot \left( \frac{g(z)}{g(x)} \right)^{\frac{1}{\phi}} dz.
\end{aligned}$$
Finally, the initial condition is that $x_0$ satisfies $\xi_0 = \underline{\theta} = c_x(x_0)/g_x(x_0)$, which gives the result in the text once the substitution is made.

The first term in the expression goes to zero as $x_0$ goes to zero, implying that the expression also represents an inverse investment function when $x_0 = 0$. Geometric arguments using the direction field can be used to show that this is the unique solution. Suppose there was some other function, $\hat{\xi}$, such that $\hat{\xi}_x(x) = \Gamma(x, \hat{\xi})$ and $\hat{\xi}(x_0) = c_x(x_0)/g_x(x_0) = 0$. For some $x > 0$, if $\hat{\xi}(x) < \xi(x)$, then there will exist some $x' \in (0, x]$ such that $\hat{\xi}(z) < 0$ for $z \in (0, x')$, which is impossible since $\hat{\xi}$ represents a type, and all types are non-negative. Similarly, if $\hat{\xi}(x) > \xi(x)$ then there exists an $x'$ such that $\hat{\xi}_x(z) < 0$ for all $z \in (0, x')$.
This is not allowed either since the inverse investment function must be strictly increasing. Thus, there is a unique solution to the initial values problem for all $x_0 \geq 0$.

A.4 An Illustration

This section utilizes particular functional forms to demonstrate an equilibrium in closed form. This exercise illustrates some of the points made above and introduces some additional points of interest. Let the skill production function be given by $s(x, \theta) = \theta \cdot x^\eta$ for some $\eta \in (0, 1)$, and let investment costs be given by $c(x) = c \cdot x$. Using (2.4), the inverse investment function turns out to be:
$$\xi(x) = \Lambda_0 \cdot \left( \frac{x_0}{x} \right)^{\frac{\eta}{\phi}} x_0^{1-\eta} + \Lambda \cdot x^{1-\eta},$$
where $x_0 = [(\eta/c) \cdot \underline{\theta}]^{1/(1-\eta)}$ and
$$\Lambda_0 \equiv \frac{c}{\eta} \cdot \frac{\phi(1-\eta)}{\eta + \phi(1-\eta)}, \qquad \Lambda \equiv \frac{c}{\eta + \phi(1-\eta)}.$$
Since $\xi$ starts off convex, then turns concave, the investment function starts off concave before turning convex.

To verify that there is over-investment, note that the inverse of the efficient investment function is $\xi^*(x) = (c/\eta) \cdot x^{1-\eta}$. There is over-investment since $\xi(x)/\xi^*(x) \leq 1$ (strict for $x > x_0$).68 Thus, the type that efficiently invests $x$ is greater than the type that does invest $x$. Furthermore, the ratio approaches unity (investments become efficient) as spillovers go to zero.

68 The actual expression equals $\frac{\eta + \phi(1-\eta) \cdot [(x_0/x)^{(\eta/\phi) + 1 - \eta}]}{\eta + \phi \cdot (1-\eta)}$. The result follows since the bracketed term is equal to one when $x = x_0$ and is strictly less than one for $x > x_0$.

To verify that a higher lowest type lowers investment, it is straightforward to see that $\xi$ is increasing in $x_0$ (and therefore in $\underline{\theta}$). Also, the fact that $\xi$ is decreasing in $\phi$ demonstrates that increases in $\phi$ increase equilibrium investment. The output produced by a worker that invests $x$, $g(x) \cdot \xi(x)$, is:
$$y(x) = \Lambda_0 \cdot \left( \frac{x_0}{x} \right)^{\eta \frac{1-\phi}{\phi}} x_0 + \Lambda \cdot x,$$
which is a convex function. If the first term is ignored, the OLS slope will appear to be $\Lambda$. Interestingly, this slope is decreasing in the degree of spillovers.
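The closed form in this illustration can be checked numerically. The following is a minimal sketch rather than part of the original analysis: the parameter values ($\eta = 0.5$, $c = 1$, $\phi = 0.4$, lowest type equal to one) and all function names are assumptions chosen purely for illustration. It evaluates the inverse investment function, the efficient benchmark, the direction field from Section A.3 and output, so that the initial condition, the differential equation and the over-investment ratio can be inspected.

```python
def illustration(eta=0.5, c=1.0, phi=0.4, theta_low=1.0):
    """Closed-form objects for s(x, theta) = theta * x**eta and c(x) = c * x.

    All default parameter values are illustrative assumptions,
    not values taken from the text.
    """
    denom = eta + phi * (1 - eta)
    lam = c / denom                                   # Lambda
    lam0 = (c / eta) * phi * (1 - eta) / denom        # Lambda_0
    x0 = ((eta / c) * theta_low) ** (1 / (1 - eta))   # lowest type's investment

    def xi(x):
        # Equilibrium inverse investment function: the type that invests x.
        return lam0 * (x0 / x) ** (eta / phi) * x0 ** (1 - eta) + lam * x ** (1 - eta)

    def xi_star(x):
        # Efficient inverse investment function.
        return (c / eta) * x ** (1 - eta)

    def gamma(x, t):
        # Direction field: [c_x - s_x(x, t)] / (phi * s_theta(x, t)).
        return (c - t * eta * x ** (eta - 1)) / (phi * x ** eta)

    def output(x):
        # Output of a worker investing x: g(x) * xi(x).
        return x ** eta * xi(x)

    return {"x0": x0, "lam": lam, "lam0": lam0, "xi": xi,
            "xi_star": xi_star, "gamma": gamma, "output": output}


if __name__ == "__main__":
    m = illustration()
    for x in (m["x0"], 0.5, 1.0, 2.0):
        ratio = m["xi"](x) / m["xi_star"](x)
        print(f"x={x:.2f}  xi={m['xi'](x):.4f}  xi/xi*={ratio:.4f}")
```

Under these assumed values the ratio equals one at $x_0$ and falls below one for larger investments, and raising $\phi$ lowers $\Lambda$, in line with the statements above.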
This result is consistent with the observation that Mincerian returns to education tend to be higher in poorer countries and in countries that have lower aggregate educational attainment (since economies with low spillovers also have lower aggregate productivity and educational attainment). The functional form used for $g$ is convenient because the case in which investment is unproductive corresponds to the limiting case in which $\eta \to 0$. In the limit, the inverse investment function is $(c/\phi) \cdot x$, implying that the investment function is $x(\theta) = (\phi/c) \cdot \theta$. Thus, the over-investment result does not rely on investment being productive.69 The case of unproductive investment highlights the way in which returns to education can be misinterpreted, even if ability were observed. This is because output is:
$$E[y_i \mid \theta_i, x_i, \eta = 0] = (1 - \phi) \cdot \theta_i + \phi \cdot \xi(x_i) = (1 - \phi) \cdot \theta_i + c \cdot x_i.$$
Therefore, an OLS regression of output on investment would reveal a positive correlation, even controlling for ability. Often, controlling for ability makes the researcher more comfortable in making claims of causality, and this is true here. However, education causes increased output at the individual level because of the matching market, not because education augments ability.

69 When $\eta = 0$, there is an equilibrium in which workers invest according to $x(\theta) = (\phi/c) \cdot \theta$. Given this, an investment of $x$ yields a partner of skill $\mu(x) = (c/\phi) \cdot x$. Once inserted into the objective function, it becomes clear that payoffs are independent of investment (among equilibrium investments anyway). The investment function therefore constitutes an equilibrium. Uniqueness relies on investments being productive however, since there is another equilibrium surviving the refinement in which all workers invest zero.

An analytic solution for the investment function is available if we let $\underline{\theta} = 0$. Equilibrium investment is:
$$x(\theta \mid \underline{\theta} = 0) = \left( \Lambda^{-1} \cdot \theta \right)^{\frac{1}{1-\eta}},$$
which makes equilibrium output:

$$y(\theta \mid \underline{\theta} = 0) = \Lambda^{-\frac{\eta}{1-\eta}} \cdot \theta^{\frac{1}{1-\eta}}.$$

Equilibrium welfare is:

$$u(\theta \mid \underline{\theta} = 0) = \left[\Lambda^{-\frac{\eta}{1-\eta}} - c \cdot \Lambda^{-\frac{1}{1-\eta}}\right] \cdot \theta^{\frac{1}{1-\eta}},$$

which is a decreasing function of $\phi$. Comparing the equilibrium outcome to that which would arise with global spillovers, it turns out that the equilibrium variables are the same replacing $\Lambda$ with $\Lambda_G \equiv c/[\eta \cdot (1-\phi)]$. Average welfare is greater with visible investments when $\eta > 1/2$ and greater with hidden investments when $\eta < 1/2$ (this holds for all values of $\phi \in (0, 1)$). Welfare is the same across the two cases when $\phi = 0$ or $\phi = 1$. When $\phi = 0$ behaviour is the same across the cases. However, when $\phi = 1$ the behaviour is quite different since investment is zero with hidden investments but is very large with visible investments (so much so that all workers end up with a payoff of zero).

One benefit of a closed-form expression for the investment function is that it allows us to explicitly examine inequality. Although there are many available measures of inequality, a prominent measure is the percentile gap (e.g. the 90/10 and the 90/50 percentile differences are often documented for wage distributions). If $y_p$ is the output (and in this case, wage) at the $p$th percentile, then the inequality measure is $\Delta(p, p') \equiv y_{p'} - y_p$, for $0 \leq p < p' \leq 1$. Since equilibrium output is a strictly increasing function of type, $y_p$ is simply the output produced by the type at the $p$th percentile. That is, the type $\theta_p$ such that $F(\theta_p) = p$. It then follows that the inequality measure is given by:

$$\Delta(p, p') = y(\theta_{p'}) - y(\theta_p) = \Lambda^{-\frac{\eta}{1-\eta}} \cdot \left[\theta_{p'}^{\frac{1}{1-\eta}} - \theta_p^{\frac{1}{1-\eta}}\right].$$

The first point to note is that inequality is decreasing in $\Lambda$, and is therefore increasing in spillovers. Furthermore, if $F$ is not too convex (e.g. a uniform distribution) then the inequality measure is increasing in $p$, holding $p' - p$ fixed.[70] That is to say that inequality is greater at the top end of the distribution.
This feature is not due solely to spillovers, since this holds even when all workers are investing efficiently; the relevant point is that the effect of spillovers on inequality will be most pronounced at the top of the distribution.

[70] Specifically, if $\psi(z) \equiv F^{-1}(z)^{1/(1-\eta)}$ is a convex function. For example, any distribution of the form $F(z) = z^q$ will do as long as $q \leq 1/(1-\eta)$ (e.g. a uniform distribution). All distributions with a non-increasing density will work (e.g. the class of exponential distributions). If we let $a \in (0, \underline{\theta})$ and assume that $(\theta - a)$ is distributed log-normal (with ‘mean’ parameter $m$ and ‘standard deviation’ parameter $s$), then the condition holds for sufficiently large $s$.

Result 15. Rising spillovers induce an increase in inequality. For example, the 90/10 wage differential increases. If $F$ is not too convex (e.g. a uniform distribution), then this increase is concentrated at the top end of the distribution. For example, the 90/50 wage differential increases more than the 50/10 wage differential.

Rising inequality within many OECD countries since the 1970s, particularly in the 1980s, has been well documented (see Gottschalk and Smeeding (1997) for a survey), and Autor, Katz, and Kearney (2006) document the continued growth in the 90/50 (log) wage differential since the late 1980s in the U.S.

In order to better understand potential sources of inequality, it is common to compare outcomes within and between ‘skill groups’. In practical terms, these skill groups often refer to different educational categories, such as ‘college educated’ and ‘high school educated’. Since educational attainment is termed ‘investment’ in the model, we can perform the exercise of comparing outcomes within and between ‘investment groups’. To this end, suppose that the researcher does not observe investment, but, rather, observes the region in which the investment falls.
For instance, the researcher knows that a person has a college degree but does not know what major the degree is in, the person’s grades, the quality of the institution, and so on. This creates a situation in which workers of different skills (in the sense used in the model) are observationally equivalent.

To make some progress on this, assume that abilities are uniformly distributed on $[0, 1]$. The researcher observes whether or not a worker’s investment is greater than some cut-off, $\hat{x}$. At the risk of confusing terminology, let us say that workers with investments less than $\hat{x}$ are ‘un-skilled’, and workers with investments greater than $\hat{x}$ are ‘skilled’. Given the cut-off investment, the equilibrium investment function implies a cut-off ability, $\hat{\theta} \equiv \hat{x}^{1-\eta} \cdot \Lambda$, such that a worker is ‘skilled’ in equilibrium if and only if their type is above this cut-off. Assuming that $\hat{\theta} \leq 1$, the ‘supply of skilled workers’ is therefore given by:

$$S(\hat{x}; \cdot) = 1 - \hat{\theta} = 1 - \hat{x}^{1-\eta} \cdot \Lambda.$$

Since $S$ is decreasing in $\Lambda$, greater spillovers lead to a greater supply of skilled workers. The average ability of a skilled worker is $(1 + \hat{\theta})/2$, and the average ability of an unskilled worker is $\hat{\theta}/2$. Greater spillovers therefore lower both of these averages. The effect on average output is more complex because greater spillovers induce more investment. The average output (and wage) among the unskilled is:

$$Y_U = \frac{1-\eta}{2-\eta} \cdot \hat{x} \cdot \Lambda,$$

which is monotonically decreasing in spillovers.[71] Intuitively, there is an upper limit to the investment made by an unskilled worker (by definition). As spillovers increase, this maximum investment is being made by lower ability workers. The average output (and wage) among the skilled is:

$$Y_S = \frac{1-\eta}{2-\eta} \cdot \hat{x} \cdot \Lambda \cdot \left[\frac{\dfrac{1}{\hat{x}\,\Lambda^{\frac{1}{1-\eta}}} - \hat{x}^{1-\eta}\Lambda}{1 - \hat{x}^{1-\eta}\Lambda}\right].$$

This expression can be either increasing or decreasing in spillovers: greater spillovers induce more investment, but also induce entry of lower ability workers.
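The skill-group expressions can be cross-checked against direct averaging of equilibrium output. The sketch below is illustrative only: it assumes uniform abilities on $[0, 1]$ and $\underline{\theta} = 0$ as in the text, uses the output function $y(\theta) = \Lambda^{-\eta/(1-\eta)}\theta^{1/(1-\eta)}$, and the function names, the cut-off value and the midpoint integration rule are assumptions made for the example.

```python
def spillover_lambda(eta, phi, c):
    """Lambda = c / (eta + phi * (1 - eta))."""
    return c / (eta + phi * (1.0 - eta))


def skill_group_summary(x_hat, eta, phi, c):
    """Closed-form supply of skilled workers, group averages and skill premium."""
    lam = spillover_lambda(eta, phi, c)
    theta_hat = x_hat ** (1.0 - eta) * lam          # cut-off ability
    if not 0.0 < theta_hat < 1.0:
        raise ValueError("the cut-off ability must lie strictly inside (0, 1)")
    y_unskilled = (1.0 - eta) / (2.0 - eta) * x_hat * lam
    premium = (1.0 / (x_hat * lam ** (1.0 / (1.0 - eta))) - theta_hat) / (1.0 - theta_hat)
    return {
        "theta_hat": theta_hat,
        "supply": 1.0 - theta_hat,
        "y_unskilled": y_unskilled,
        "y_skilled": y_unskilled * premium,
        "premium": premium,
    }


def average_output_numeric(lo, hi, eta, phi, c, n=20000):
    """Average of y(theta) over uniform types in [lo, hi], by the midpoint rule."""
    lam = spillover_lambda(eta, phi, c)
    scale = lam ** (-eta / (1.0 - eta))
    width = (hi - lo) / n
    total = sum(scale * (lo + (i + 0.5) * width) ** (1.0 / (1.0 - eta)) for i in range(n))
    return total * width / (hi - lo)


if __name__ == "__main__":
    # Hypothetical values: eta = 0.5, c = 1, cut-off investment 0.25.
    for phi in (0.5, 0.9):
        print(phi, skill_group_summary(0.25, 0.5, phi, 1.0))
```

In this parameterization, raising $\phi$ from 0.5 to 0.9 raises the supply of skilled workers and the ratio $Y_S/Y_U$ while lowering $Y_U$.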
Unlike the unskilled group, there is no upper limit on investment within the skilled group. Notice that the large bracketed term equals $Y_S/Y_U$, a measure of the skill premium. This expression is decreasing in $\Lambda$, implying that greater spillovers increase the apparent skill premium.

[71] The result would not hold for all distributions of ability since it could be the case that there are relatively few workers on the verge of becoming identified as skilled. As greater spillovers induce these workers to change categories, the reduction in average output among the unskilled due to the composition change will be small relative to the increase in average output owing to the fact that all workers invest more.

Result 16. Rising spillovers induce a positive correlation between the skill premium, $Y_S/Y_U$, and the supply of skilled workers, $S$. Furthermore, the rise in the skill premium involves a decline in the average wage of unskilled workers.

This result lends itself to a comment on skill-biased technical change. The above correlation has been widely documented for the US and has been interpreted as ‘skill-biased technical change’ (see Acemoglu (2002)). Furthermore, the rising skill premium is due to falling wages for the unskilled as opposed to rising wages of the skilled, as happens here. Although both models predict this correlation, the policy implications are dramatically different. Under skill-biased technical change, encouraging educational attainment is valuable because the production technology is becoming increasingly geared toward skilled workers. On the other hand, there is already over-investment in education in the above model and any further encouragement will lower welfare.

Another well-documented trend is that of residual inequality. This can be analyzed by examining the inequality within each skill group separately. The worker that has their ability at the $p$th percentile among the unskilled, $\theta_p^U$, is that value that satisfies $F(\theta_p^U)/\hat{\theta} = p$.
Similarly, the worker that has their ability at the $p$th percentile among the skilled, $\theta_p^S$, is that value that satisfies $(F(\theta_p^S) - F(\hat{\theta}))/(1 - F(\hat{\theta})) = p$. Utilizing the uniform distribution, we can write:

$$\theta_p^U = p \cdot \hat{\theta} = p \cdot \hat{x}^{1-\eta} \cdot \Lambda$$
$$\theta_p^S = p + (1-p) \cdot \hat{\theta} = p + (1-p) \cdot \hat{x}^{1-\eta} \cdot \Lambda.$$

Inequality within skill group $K \in \{U, S\}$ then becomes:

$$\Delta^K(p', p) = \left(\frac{\theta_{p'}^K}{\Lambda^{\eta}}\right)^{\frac{1}{1-\eta}} - \left(\frac{\theta_p^K}{\Lambda^{\eta}}\right)^{\frac{1}{1-\eta}}.$$

Result 17. Spillovers increase inequality among the skilled, but lower inequality among the unskilled.

Lemieux (2006) shows trends in residual inequality by education group and shows how higher education groups have greater residual inequality. Further, it seems that this inequality has grown faster for higher educational groups since the early 1970s (and has actually declined for the very lowest education groups). Finally, although there is evidence that inequality has risen in many economies, largely attributable to residual inequality of the most skilled groups, it must be that this inequality is a between-firm phenomenon if the above theory is to concord with the evidence. Some recent papers provide empirical support for this (Faggio, Salvanes, and Van Reenen (2007) and Dunne, Foster, Haltiwanger, and Troske (2004)).

A.5  Optimal Policy

Given the over-investment in equilibrium, it will never be the case that an investment subsidy is optimal despite the positive spillover associated with investment. In order to characterize the optimal policy, suppose a planner is able to charge an investment tax of $\tau(x)$. Incorporating this tax into the model is straightforward since we can use the expressions derived previously using augmented investment costs: $\tilde{c}(x) \equiv c(x) + \tau(x)$. That is, the slope of the inverse investment function becomes:

$$\xi'(x \mid \tau) = \frac{c_x(x) - s_x(x, \xi(x)) + \tau_x(x)}{\phi \cdot s_\theta(x, \xi(x))}.$$

The tax schedule is chosen so that the induced inverse investment function coincides with the inverse efficient investment schedule, $\xi^*(x)$, where this satisfies:

$$c_x(x) = s_x(x, \xi^*(x)).$$

That is, we choose $\tau(\cdot)$ such that $\xi(x \mid \tau) = \xi^*(x)$.

In the Pareto efficient equilibrium, the lowest types invest efficiently and therefore need no distortion: $\tau(x_0) = 0$. In any other separating equilibrium, this initial value is adjusted (increased) until the point at which the lowest types invest efficiently. To ensure that $\xi(x \mid \tau) = \xi^*(x)$ for all higher types, we choose $\tau$ so that the slopes of the two functions are everywhere equal. That is:

$$\frac{c_x(x) - s_x(x, \xi^*(x)) + \tau_x(x)}{\phi \cdot s_\theta(x, \xi^*(x))} = \xi^{*\prime}(x).$$

Noting that $c_x(x) = s_x(x, \xi^*(x))$ by definition of $\xi^*(x)$, and re-arranging gives:

$$\tau_x(x) = \xi^{*\prime}(x) \cdot \phi \cdot s_\theta(x, \xi^*(x)). \qquad \text{(A.1)}$$

The optimal tax schedule for the Pareto efficient separating equilibrium is therefore the solution to the initial values problem defined by (A.1) and $\tau(x_0) = 0$. It is immediate from (A.1) that higher spillovers increase the optimal tax at each investment level.

The optimal tax policy for the illustration presented in Section A.4 can be derived by using (A.1) along with the initial condition. In this case the optimal policy is a (piecewise) linear tax given by:

$$\tau^*(x) = \begin{cases} 0 & \text{if } x \in [0, x_0) \\ \phi \cdot c \cdot \frac{1-\eta}{\eta} \cdot [x - x_0] & \text{if } x \geq x_0. \end{cases}$$

The optimal tax rate is non-zero if and only if spillovers exist. Furthermore, the optimal marginal tax rate is increasing in marginal investment costs and approaches infinity as investment becomes unproductive (as $\eta \to 0$).

A.6  A Model With Classes

The model presented above can be generalized to a setting in which each agent belongs to one of two classes, where a match requires one agent from each class. The classic example is marriage, where the classes are males and females. Let there be a unit measure of both males and females. Each male $i$ is endowed with a type, $\theta_i \in \Theta$.
The distribution of male types is $G_M(\cdot)$, where $G_M'(z) \in (0, \infty)$ for all $z \in \Theta$. Each female $j$ is endowed with a type $\tilde{\theta}_j \in \tilde{\Theta}$. The distribution of female types is $G_F(\cdot)$, where $G_F'(z) \in (0, \infty)$ for all $z \in \tilde{\Theta}$.

I am going to restrict attention to fully revealing equilibria. Since investment is increasing in type, equilibrium will involve a type $\theta$ male being matched with a type $h(\theta)$ female, where $G_M(\theta) = G_F(h(\theta))$. If males perceive a return function of $\mu(x)$, and females perceive a return function of $\tilde{\mu}(x)$, then rational expectations requires:

$$\mu(x(\theta)) = s(\tilde{x}(h(\theta)), h(\theta))$$
$$\tilde{\mu}(\tilde{x}(h(\theta))) = s(x(\theta), \theta),$$

where $x(\cdot)$ and $\tilde{x}(\cdot)$ are the equilibrium investment functions for males and females respectively. Given these return functions, the investment functions must be optimal:

$$x(\theta) = \arg\max_{x \geq 0} u(x, \theta, \mu(x)), \quad \forall\, \theta \in \Theta$$
$$\tilde{x}(h(\theta)) = \arg\max_{x \geq 0} u(x, h(\theta), \tilde{\mu}(x)), \quad \forall\, \theta \in \Theta.$$

If we define $\hat{x}(\theta) \equiv \tilde{x}(h(\theta))$, then the same procedure as described above can be employed, and we end up with a system of first-order differential equations:

$$x'(\theta) = \Gamma(x, \hat{x}, \theta)$$
$$\hat{x}'(\theta) = \hat{\Gamma}(x, \hat{x}, \theta).$$

This system, along with the initial conditions, $\{x(\underline{\theta}), \hat{x}(\underline{\theta})\}$, forms an initial values problem. The solution to this problem will constitute an equilibrium when both functions in the solution are strictly increasing, and the second-order condition is satisfied. Analysis of this problem, especially the geometric arguments, is considerably more complicated than that of the case in which there is a single class because of the extra dimension.[72] Phase diagrams are infeasible since the system of differential equations is non-autonomous, and direction fields need to be given a three-dimensional treatment.

Note however that assuming unproductive investment makes the problem tractable since $\Gamma$ is no longer a function of $\hat{x}$ and $\hat{\Gamma}$ is no longer a function of $x$. Thus, the two equations can be treated separately according to the procedure used in the single-class case.
One way to make some progress is to use the method of undetermined coefficients. That is, employ a particular structure and guess that the solution will take a particular parameterized form, then use the optimality and rational expectations conditions to solve for the parameters. This is of course a much less general approach to the problem since particular functional forms are used; however, the exercise is useful because it allows for some closed-form solutions.

[72] The two-class problem is equivalent to the single-class problem when the distribution of types is the same for both classes ($h$ is the identity function).

Let $G_M$ be the uniform distribution on $[0, 1]$, and let $G_F$ be the uniform distribution on $[0, F]$. This implies that $h(z) = z \cdot F$ for $z \in [0, 1]$. Let $g(x) = x$ and let $c(x) = (c/2) \cdot x^2$. I will guess that the solution is linear: $x(\theta) = \beta \cdot \theta$ and $\hat{x}(\theta) = \hat{\beta} \cdot \theta$. Given $\mu$ and $\tilde{\mu}$, optimality requires that:

$$(1-\phi) \cdot \theta + \phi \cdot \mu'(x) = c \cdot x,$$
$$(1-\phi) \cdot F\theta + \phi \cdot \tilde{\mu}'(x) = c \cdot x,$$

and rational expectations requires:

$$\mu(x(\theta)) = \hat{x}(\theta) \cdot F\theta$$
$$\tilde{\mu}(\hat{x}(\theta)) = x(\theta) \cdot \theta.$$

Taking the derivative of each of these with respect to $\theta$, and substituting the conjectured form gives:

$$\mu'(x(\theta)) = \frac{\hat{\beta}}{\beta} \cdot 2F\theta$$
$$\tilde{\mu}'(\hat{x}(\theta)) = \frac{\beta}{\hat{\beta}} \cdot 2\theta.$$

Substituting these into the first-order conditions, and again applying the conjectured forms, gives:

$$(1-\phi) + \phi \cdot \frac{\hat{\beta}}{\beta} \cdot 2F = c \cdot \beta,$$
$$(1-\phi) \cdot F + \phi \cdot \frac{\beta}{\hat{\beta}} \cdot 2 = c \cdot \hat{\beta} F.$$

An equilibrium of the form conjectured exists if we can find a $\beta > 0$ and a $\hat{\beta} > 0$ such that the above two conditions are satisfied. These conditions can be re-written as follows:

$$\hat{\beta} = \frac{1}{2F\phi} \left[c \cdot \beta^2 - (1-\phi) \cdot \beta\right]$$
$$\beta = \frac{F}{2\phi} \left[c \cdot \hat{\beta}^2 - (1-\phi) \cdot \hat{\beta}\right].$$

Depicting these relationships in $(\beta, \hat{\beta})$ space easily reveals that there are unique positive values of $\beta$ and $\hat{\beta}$ that satisfy these relationships.

Are the equilibrium investments efficient?
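Before taking up that question, the two conditions characterizing the linear equilibrium, $(1-\phi) + \phi(\hat{\beta}/\beta)2F = c\beta$ and $(1-\phi)F + \phi(\beta/\hat{\beta})2 = c\hat{\beta}F$, can be solved numerically for given $(F, \phi, c)$. The following is a rough sketch, not the author's procedure: the substitution $a = c\beta - (1-\phi)$, which reduces the pair of conditions to a single cubic in $a$, is an algebraic rearrangement made for this example, and the function names and the use of bisection are likewise assumptions.

```python
def linear_equilibrium(F, phi, c, iterations=200):
    """Return (beta, beta_hat) solving the two linear-equilibrium conditions.

    With a = c*beta - (1 - phi) and b = c*beta_hat - (1 - phi), the two
    conditions imply a*b = 4*phi**2 and the cubic
        a**3 + (1-phi)*a**2 - 2*F*phi*(1-phi)*a - 8*F*phi**3 = 0,
    which has a single positive root when phi is in (0, 1) and F > 0.
    """
    k = 1.0 - phi

    def cubic(a):
        return a ** 3 + k * a ** 2 - 2.0 * F * phi * k * a - 8.0 * F * phi ** 3

    lo, hi = 0.0, 1.0
    while cubic(hi) < 0.0:          # expand the bracket until the sign changes
        hi *= 2.0
    for _ in range(iterations):     # plain bisection
        mid = 0.5 * (lo + hi)
        if cubic(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    b = 4.0 * phi ** 2 / a
    return (a + k) / c, (b + k) / c


def condition_residuals(beta, beta_hat, F, phi, c):
    """Residuals of the two equilibrium conditions as stated in the text."""
    first = (1.0 - phi) + phi * (beta_hat / beta) * 2.0 * F - c * beta
    second = (1.0 - phi) * F + phi * (beta / beta_hat) * 2.0 - c * beta_hat * F
    return first, second


if __name__ == "__main__":
    # Hypothetical parameter values for illustration.
    for F in (0.5, 1.0, 2.0):
        beta, beta_hat = linear_equilibrium(F, 0.5, 1.0)
        print(F, beta, beta_hat, condition_residuals(beta, beta_hat, F, 0.5, 1.0))
```

With $F = 1$ the two sides are identical and the sketch returns the same slope for both, which matches the single-class case.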
To begin, we can calculate the set of Pareto efficient investment pairs by equating the slope of indifference curves in $(x, \hat{x})$ space. This leads us to the conclusion that if the female invests $\hat{x} = \hat{\beta} \cdot \theta$ (some multiple of the male’s type), then the corresponding Pareto efficient investment from the male is $x = \kappa^*(\hat{\beta}) \cdot \theta$, where:

$$\kappa^*(\hat{\beta}) \equiv \frac{1-\phi}{c} + \left(\frac{\phi}{c}\right)^2 \cdot \frac{1}{\hat{\beta} - \frac{1-\phi}{c}}.$$

From here we can ask the question of whether there exists an $F$ such that the resulting equilibrium investments are Pareto efficient. As it turns out, such a value does not exist. To show this, suppose that the female invested $\hat{\beta} \cdot \theta$. Then, by substituting out for $F$, it can be shown that the male will invest $\kappa(\hat{\beta}) \cdot \theta$ in equilibrium, where:

$$\beta = \kappa(\hat{\beta}) \equiv \frac{1-\phi}{c} + \left(2\,\frac{\phi}{c}\right)^2 \cdot \frac{1}{\hat{\beta} - \frac{1-\phi}{c}}.$$

Since $\kappa(\hat{\beta}) > \kappa^*(\hat{\beta})$, it follows that equilibrium investments are always inefficiently great. In other words, there will always exist lower investments from both the male and female that result in a Pareto-improvement. Furthermore, since higher values of $F$ correspond to lower values of $\hat{\beta}$ (which correspond to higher values of $\beta$), an increase in $F$ leads to more investment for males and less investment for females. Intuitively, an increase in $F$ increases the dispersion of female types, which provides incentives for males to compete harder at each investment level.

One interesting matter of non-existence arises when the two sides are explicitly considered. The reason is that when the two sides become too ‘asymmetric’ (in this case, a very high or very low $F$) the less differentiated side must invest a great deal in equilibrium. This investment can in fact be so great that they would prefer making their Nash investment, even if it meant leaving them without a partner. This is under the assumption that having no partner is equivalent to having a partner of zero skill.
There is no reason why this need be the case, but it does not seem unreasonable.[73] When $F$ is too extreme, all agents on one side of the market have a profitable deviation in investing their Nash level and remaining without a partner. However, if all workers did this, then agents would still be fully separated (since Nash investment is a strictly increasing function of type), which implies that matching would still be positive assortative. Thus, there would be a profitable deviation to be had from raising investment a little in order to pretend to be a worker of higher ability. Thus, there may be no symmetric equilibrium in pure strategies. Having said this, there always exists an equilibrium when the two sides are sufficiently similar (i.e. when $F$ is sufficiently close to unity). The possibility of non-existence never arises in the single-class model because this can be thought of as a model in which the two sides have the same distribution.

[73] In fact, the argument goes through for any utility level above negative infinity that one wishes to assign to the state of not being matched.

A.7  Signaling With Patience

Consider the following dynamic signaling model. A generation of workers is born each period, and each is endowed with some ability, $\theta \in \Theta$. For comparability, take $\Theta = [\underline{\theta}, \overline{\theta}]$ and assume that $\theta$ is distributed on this interval with positive density. Workers are able to invest in a signal, $x$, at some positive increasing cost. The investment is a pure signal: a worker’s output simply equals their productivity. In their first period of existence workers make their investment and enter the labour market. Firms do not observe ability but do observe investment, and offer workers a wage schedule, $w(x)$.
The wage contract is enforceable, implying that if a firm hires a worker that ends up producing less than their wage, their only recourse is to fire the worker.[74] Workers select which firm they wish to work at (free entry ensures that each worker gets a job), get paid $w(x)$ and produce output. At the end of the period the firm decides whether they wish to fire the worker. If the worker is fired, then they reenter the labour market the following period. Finding a new job takes one period. To maintain comparability, I assume that the labour market does not observe the worker’s work history (only their investment). Workers each have a discount factor of $\beta \in [0, 1)$, implying that the present value of lifetime earnings given a wage stream of $\{w_t\}_{t=0}^{\infty}$ is $V \equiv \sum_{t=0}^{\infty} \beta^t \cdot w_t$. The standard static model is a special case in which $\beta = 0$.

[74] An alternative assumption would be that the wage is re-negotiated, but this introduces unnecessary complications. For instance, if the firm must fire the worker before offering them the renegotiated wage (which involves the worker waiting one period), and if the worker has all the bargaining power, then the conclusions (conditions required for efficiency) are exactly the same. Giving bargaining power to the firm only makes a deviation less attractive, in which case the efficient outcome is easier to support, strengthening the argument.

The efficient outcome is to have all workers investing zero, and this can not be supported by a separating equilibrium at $\beta = 0$. Intuitively, if all workers invested zero then all workers appear the same. Any worker can claim to be the highest type worker, and get paid $\overline{\theta}$. The firm offering this wage would fire the worker, but since the future is completely discounted this does not concern the worker. What if $\beta > 0$? There is always a non-degenerate (and continuous) distribution of types such that it is possible to support a ‘separating’ and efficient equilibrium when $\beta > 0$. To see this,
firms offer a continuum of wage contracts of the form $w(x) = w$ for each $w \in \Theta$. Since the wages on offer do not depend on investment, it follows that all workers will invest zero (efficiently). In essence, workers choose their own wage mindful of the fact that if they choose a wage for which they are under-qualified, then the firm will fire them and they will have to spend a period looking for a new job. If a worker chooses a wage for which they are qualified, $w$, their payoff is $(1-\beta)^{-1} \cdot w$. If they instead choose a wage for which they are not qualified, $w'$, then their payoff is $[(1+\beta) \cdot (1-\beta)]^{-1} \cdot w'$.[75] The most attractive deviation possible is that made by the lowest type worker claiming to be the highest type worker. If this deviation is unprofitable, then all deviations are unprofitable. This deviation is unprofitable if $(1+\beta) \cdot \underline{\theta} > \overline{\theta}$. Therefore efficient investments can be supported in a separating equilibrium with a continuum of types if the highest type is not too great relative to the lowest type. Note that if $\beta = 0$ (the static case) such an equilibrium can only be supported if the distribution is degenerate. As patience increases, so too does the maximum allowable value of the highest type.

The conclusion to be taken from this is that different levels of patience can qualitatively change the nature of separating equilibria in standard signaling models with a continuum of types. This highlights a key difference between the matching model and signaling models.

A.8  Pooling Equilibria

This section examines equilibria in which at least some workers of different types invest the same amount. The simplest version is a complete pooling equilibrium in which all workers invest some $x_p$, as depicted in Figure A.1. The rational expectations condition simply requires that $\mu$ passes through the point $(x_p, \bar{s}(x_p))$, where $\bar{s}(x) \equiv \int_{\Theta} s(x, \theta)\,dF(\theta)$ is the expected skill of a worker randomly drawn from the population, given that all workers invest $x$.
There will in general be equilibria in which any finite number of investments are made (as in signaling models). The fact that equilibrium investments are non-decreasing implies that these equilibria can be calculated according to the following.

Proposition 17. For any finite integer $N$, fix any partition of the type space, $\{[\underline{\theta}, \theta_1), [\theta_1, \theta_2), \ldots, [\theta_{N-1}, \overline{\theta}]\}$. There exist investments, $\{x_1, \ldots, x_N\}$, such that a pooling equilibrium exists in which all members in the $n$th partition invest $x_n$.

The key to demonstrating this is to use the following recursive method. Take some $x_1$, and assume that all $\theta \in [\underline{\theta}, \theta_1)$ invest $x_1$. The return function must pass through the point $(x_1, \bar{s}(\underline{\theta}, \theta_1 \mid x_1))$, where $\bar{s}(\theta', \theta'' \mid x) \equiv \int_{\theta'}^{\theta''} s(x, \theta)\,dF(\theta) / (F(\theta'') - F(\theta'))$. From here we can draw in the indifference curve of workers of type $\theta_1$ that passes through this point. Now consider workers with types in $[\theta_1, \theta_2)$. We can depict the curve of $\bar{s}(\theta_1, \theta_2 \mid x)$ as a function of $x$. This curve will cut the indifference curve of the type $\theta_1$ workers at one investment level. Simply let $x_2$ be this investment level, then repeat the same procedure until the final partition is reached. One implication of this is that $x_1 < x_2 < \ldots < x_N$. Apart from these equilibria, there will be hybrid equilibria in which some types pool and some types separate.

[Figure A.1: Possible Pooling Equilibrium. Horizontal axis: investment, $x$; vertical axis: skill. The figure shows the curves $s(x, \theta)$ for the extreme types, the pooled expected skill $\bar{s}(x)$, the return function $\mu(x)$, indifference curves $I_\theta$, and the pooling point $(x_p, \bar{s}(x_p))$.]

[75] To see this, note that the worker alternates between employment and job search each period. While employed they get $w'$ and while unemployed get zero. The payoff stream is therefore: $V = w' + \beta^2 \cdot w' + \beta^4 \cdot w' + \ldots = (1-\beta^2)^{-1} w' = [(1+\beta) \cdot (1-\beta)]^{-1} \cdot w'$.

Appendix B  Appendix to Chapter 3

B.1  Complementarities: Separating Equilibrium vs Random Matching

Consider an alternative benchmark in which all characteristics are hidden, thereby requiring that partners are randomly assigned.
For simplicity, assume perfect altruism ($\alpha = 1$). This allows us to focus on incentives to make parental investments in isolation (since wealth will optimally be zero for all families). In contrast to the illustrations used in the text, this illustration assumes that $\rho < 1$. In particular, as $\rho \to 0$, $h$ approaches the Cobb-Douglas form:

$$h = q(y)^{1-\phi} \cdot q(y')^{\phi}.$$

I use this functional form, with $q$ being the identity function, and assume that costs are given by $c(y, \theta) = (1/2) \cdot y^2/\theta$.

There is always an equilibrium in which all agents invest zero. There is also a more interesting equilibrium in which positive investments are made. The first-order condition, once re-arranged, is:

$$(1-\phi) \cdot \int [y^R(\tau)]^{\phi}\,d\Psi(\tau) = \frac{[y^R(\theta)]^{1+\phi}}{\theta}.$$

The left side is a constant (from the perspective of a particular family). If this is denoted by $A$, then equating this to the right side, we have:

$$y^R(\theta) = A^{\frac{1}{1+\phi}} \cdot \theta^{\frac{1}{1+\phi}}.$$

Using this form in the left side, we have that the value of $A$ satisfies:

$$A = (1-\phi) \cdot \int A^{\frac{\phi}{1+\phi}} \cdot \tau^{\frac{\phi}{1+\phi}}\,d\Psi(\tau),$$

thereby implying that:

$$A^{\frac{1}{1+\phi}} = (1-\phi) \cdot \int \tau^{\frac{\phi}{1+\phi}}\,d\Psi(\tau).$$

The optimal investment under random matching is therefore:

$$y^R(\theta) = (1-\phi) \cdot \left[\int \tau^{\frac{\phi}{1+\phi}}\,d\Psi(\tau)\right] \cdot \theta^{\frac{1}{1+\phi}}.$$

It is straightforward to show that the Nash investments are $y^N(\theta) = (1-\phi) \cdot \theta$. Therefore, comparing these we get:

$$\frac{y^R(\theta)}{y^N(\theta)} = \int \left(\frac{\tau}{\theta}\right)^{\frac{\phi}{1+\phi}} d\Psi(\tau).$$

The term on the right is continuous and strictly decreasing in $\theta$, strictly greater than one at $\theta = \underline{\theta}$, and strictly less than one at $\theta = \overline{\theta}$. Thus, there is a critical type $\hat{\theta} \in (\underline{\theta}, \overline{\theta})$ such that all families with $\theta < \hat{\theta}$ invest more than their Nash level, and all families with $\theta > \hat{\theta}$ invest less than their Nash level.

Although investments made by individual families can not be unambiguously ranked across the benchmarks, average investment can be. Average investment with random matching is

$$E[y^R] = (1-\phi) \cdot \left[\int \tau^{\frac{\phi}{1+\phi}}\,d\Psi(\tau)\right] \cdot \left[\int \theta^{\frac{1}{1+\phi}}\,d\Psi(\theta)\right], \qquad \text{(B.1)}$$

whereas average Nash investment is:

$$E[y^N] = (1-\phi) \cdot \int \theta\,d\Psi(\theta).$$
Jensen’s inequality implies that the average Nash investment is greater than the average investment with random matching, since:

$$E[y^R] = (1-\phi) \cdot E\left[\theta^{\frac{\phi}{1+\phi}}\right] \cdot E\left[\theta^{\frac{1}{1+\phi}}\right] < (1-\phi) \cdot E[\theta] = E[y^N].$$

In terms of human capital, the random matching environment does even worse: not only is the average parental investment lower, matches are formed in a less efficient manner. Average human capital under random matching is:

$$E[h^R] = E[(y^R)^{1-\phi}] \cdot E[(y^R)^{\phi}], \qquad \text{(B.2)}$$

which, again by Jensen’s inequality, is less than $E[y^R]$. Average human capital in the Nash environment is simply $E[y^N]$, which we have already established is greater than $E[y^R]$. Thus, average human capital is greater in the Nash environment. Finally, since average Nash investment is less than average efficient investment, we also know that average welfare in the Nash benchmark is greater than the average welfare in the random matching benchmark.

B.1.1  Comparison with Separating Equilibrium

The fact that the separating equilibrium is independent of $\rho$ means that we can use the equilibrium values derived in Section 3.4. Average welfare in the separating equilibrium is:

$$W^S = (1-\phi) \cdot \left[\frac{1-\phi}{2} \cdot E[\theta] + \phi \cdot \underline{\theta}\right].$$

After a few algebra steps, the expected welfare in the random matching setting is:

$$W^R = (1-\phi) \cdot \frac{1+\phi}{2} \cdot E\left[\theta^{\frac{\phi}{1+\phi}}\right]^2 \cdot E\left[\theta^{\frac{1-\phi}{1+\phi}}\right].$$

Unlike the Nash benchmark, the welfare rank is ambiguous (in the sense that it depends on the distribution of types). Simple manipulation shows the following.

Proposition 18. Welfare is greater in the separating equilibrium than under random matching if and only if:

$$\frac{1-\phi}{1+\phi} \geq \frac{E\left[\theta^{\frac{\phi}{1+\phi}}\right]^2 \cdot E\left[\theta^{\frac{1-\phi}{1+\phi}}\right]}{E[\theta]} - 2 \cdot \frac{\phi}{1+\phi} \cdot \frac{\underline{\theta}}{E[\theta]}.$$

One unusual feature of this is that the rank depends on how low the lowest type is relative to the mean. The above condition becomes easier to satisfy as the gap between the lowest type and the mean shrinks.
This reflects the fact that distortions in the separating equilibrium are made more severe as the lowest type falls.

To illustrate, suppose that types are log-normally distributed: $\ln\theta \sim N(m, \sigma^2)$. It then turns out that:

$$W^R = \frac{(1-\phi)(1+\phi)}{2} \cdot \exp\left(m + \frac{\sigma^2}{2} \cdot \frac{2\phi^2 + (1-\phi)^2}{(1+\phi)^2}\right)$$
$$W^S = \frac{(1-\phi)(1-\phi)}{2} \cdot \exp\left(m + \frac{\sigma^2}{2}\right),$$

so that the separating equilibrium provides the greater welfare if and only if:

$$\frac{1-\phi}{1+\phi} \geq \exp\left(-\sigma^2 \cdot \frac{\phi(2-\phi)}{(1+\phi)^2}\right).$$

In other words, if:

$$\sigma^2 \geq S(\phi) \equiv \ln\left(\frac{1+\phi}{1-\phi}\right) \cdot \frac{(1+\phi)^2}{\phi(2-\phi)}.$$

The function $S(\phi)$ is strictly increasing on $[0, 1]$ with $\lim_{\phi \to 0} S(\phi) = 1$ and $\lim_{\phi \to 1} S(\phi) = \infty$. Three main properties emerge. First, the mean of the distribution of log-types plays no role. Second, the separating equilibrium produces greater welfare than random matching for sufficiently large $\sigma^2$, whereas the opposite is true for sufficiently small $\sigma^2$. Third, random matching always produces a greater welfare for $\phi$ sufficiently close to one.

Of even greater significance is the possibility that average human capital under random matching is greater than in the separating equilibrium. This never occurs when human capital is treated as fixed, since the matching arrangement under the separating equilibrium is more efficient. Average human capital in the separating equilibrium is:

$$E[h^S] = (1-\phi)^2 \cdot E[\theta] + \phi(1+\phi) \cdot \underline{\theta},$$

whereas average human capital under random matching is:

$$E[h^R] = (1-\phi) \cdot E\left[\theta^{\frac{\phi}{1+\phi}}\right]^2 \cdot E\left[\theta^{\frac{1-\phi}{1+\phi}}\right].$$

For simplicity, let $\underline{\theta} = 0$ so that the average human capital level in the separating equilibrium is greater than under random matching if:

$$E\left[\theta^{\frac{\phi}{1+\phi}}\right]^2 \cdot E\left[\theta^{\frac{1-\phi}{1+\phi}}\right] \leq (1-\phi) \cdot E[\theta].$$

Using the log-normal example again, this requires that:

$$\exp\left(m + \frac{\sigma^2}{2} \cdot \frac{2\phi^2 + (1-\phi)^2}{(1+\phi)^2}\right) \leq (1-\phi) \cdot \exp\left(m + \frac{\sigma^2}{2}\right),$$

or, once simplified:

$$\exp\left(-\sigma^2 \cdot \frac{\phi(2-\phi)}{(1+\phi)^2}\right) \leq (1-\phi).$$

In other words, if

$$\sigma^2 \geq S^*(\phi) \equiv \ln\left(\frac{1}{1-\phi}\right) \cdot \frac{(1+\phi)^2}{\phi(2-\phi)}.$$

The properties of $S^*(\phi)$ are similar to those of $S(\phi)$.
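The two critical variances can be evaluated directly, and the welfare threshold can be checked against the log-normal welfare expressions. The sketch below is illustrative: the function names and parameter values are assumptions, the moment formula $E[\theta^a] = \exp(am + a^2\sigma^2/2)$ is the standard log-normal result, and the lowest type is taken to be zero as in the log-normal example.

```python
import math


def welfare_threshold(phi):
    """S(phi): separating equilibrium gives weakly higher welfare iff sigma^2 >= S(phi)."""
    return math.log((1.0 + phi) / (1.0 - phi)) * (1.0 + phi) ** 2 / (phi * (2.0 - phi))


def human_capital_threshold(phi):
    """S*(phi): separating equilibrium gives weakly higher human capital iff sigma^2 >= S*(phi)."""
    return math.log(1.0 / (1.0 - phi)) * (1.0 + phi) ** 2 / (phi * (2.0 - phi))


def lognormal_moment(a, m, sigma2):
    """E[theta**a] when ln(theta) ~ N(m, sigma2)."""
    return math.exp(a * m + 0.5 * a * a * sigma2)


def welfare_pair(phi, m, sigma2):
    """(W^S, W^R) for log-normal types, with the lowest type equal to zero."""
    mean = lognormal_moment(1.0, m, sigma2)
    e_partner = lognormal_moment(phi / (1.0 + phi), m, sigma2)
    e_own = lognormal_moment((1.0 - phi) / (1.0 + phi), m, sigma2)
    w_sep = (1.0 - phi) * (1.0 - phi) / 2.0 * mean
    w_rand = (1.0 - phi) * (1.0 + phi) / 2.0 * e_partner ** 2 * e_own
    return w_sep, w_rand


if __name__ == "__main__":
    for phi in (0.1, 0.5, 0.9):
        print(phi, welfare_threshold(phi), human_capital_threshold(phi))
```

Because $\ln(1+\phi) > 0$, the welfare threshold lies above the human capital threshold for every $\phi \in (0, 1)$, so there is a band of variances in which the separating equilibrium delivers more human capital but less welfare than random matching.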
Figure B.1 depicts both $S(\phi)$ and $S^*(\phi)$.

[Figure B.1: The Functions $S(\phi)$ and $S^*(\phi)$. Vertical axis: critical variance; horizontal axis: spillovers, $\phi$. Region A lies above $S(\phi)$, region B lies between $S(\phi)$ and $S^*(\phi)$, and region C lies below $S^*(\phi)$.]

When $(\phi, \sigma^2)$ lies in region A, average human capital is greater in the separating equilibrium. This is the standard result. When $(\phi, \sigma^2)$ lies in region C, we have that average human capital is actually greater under random matching: despite the fact that matches are formed less efficiently under random matching, parental investment is greater. In region B, average human capital is greater in the separating equilibrium but average welfare is greater under random matching (i.e. the superior matching pattern in the separating equilibrium does not compensate for the extra costs involved in achieving separation).

B.2  Proofs

B.2.1  Proof of Proposition 5

Proof. Suppose to the contrary that a pooling equilibrium exists, and yet $\alpha < 1$. The first-order conditions (the assumptions on $f$ and $h$, along with the fact that $\alpha \in (0, 1)$, guarantee that the solution is interior) imply that:

$$(1-\alpha) \cdot f_x(x) = \alpha \cdot H_y(y) = c_T(x + y, \theta).$$

Since $x$ is a constant across types in a pooling equilibrium (by definition), the first equality implies that $y$ will also be a constant across types (since $H_y(y)$ is strictly increasing). The final equality then implies that the marginal cost is constant across types, which is contradicted by the fact that $c_{T\theta}(\cdot) > 0$.

B.2.2  Proof of Result 3

Proof. The first part comes from the condition that $\mu(x(\theta)) = y(\theta)$. If $x(\cdot)$ is increasing (decreasing), then $y(\cdot)$ is weakly increasing (decreasing). That total investment is increasing in type comes from noting that the payoff function can be expressed as $V(x, y, \mu(x), \theta) = U(x, y) - c(x + y, \theta)$, and applying a revealed preference argument to two different types, $\theta < \theta'$, gives:

$$U(x', y') - c(x' + y', \theta') \geq U(x, y) - c(x + y, \theta')$$

and

$$U(x, y) - c(x + y, \theta) \geq U(x', y') - c(x' + y', \theta).$$
Adding these, re-arranging, and defining $T = x + y$ and $T' = x' + y'$ gives

$$c(T', \theta) - c(T, \theta) \geq c(T', \theta') - c(T, \theta').$$

The fact that $c_{T\theta}(\cdot) > 0$ implies that this inequality can only hold if $T > T'$.

The assumptions on $V$ ensure that optimal parental investment is positive, and, since $V(x, y, \mu, \theta)$ is differentiable (and concave) in $y$, optimal parental investment must be characterized by the first-order condition:

$$V_y(x(\theta), y(\theta), \mu(x(\theta)), \theta) = 0.$$

Again using the equilibrium condition that $\mu(x(\theta)) = y(\theta)$ indicates that $y(\theta)$ is implicitly defined by $V_y(x(\theta), y(\theta), y(\theta), \theta) = 0$. Since $V_y$ is differentiable in all arguments and $x(\cdot)$ is differentiable by assumption, the derivative of $y(\cdot)$ exists and is given by $-[V_{yx} x'(\theta) + V_{y\theta}]/[V_{yy} + V_{yy'}]$ (the denominator is ensured to be non-zero by the regularity assumption that $h_{yy} + h_{yy'} \leq 0$).

B.2.3 Proof of Proposition 6

Proof. The proof demonstrates that the first-order conditions are sufficient for a maximum by showing that the objective function is globally concave when evaluated using a candidate return function, $\mu(x)$. That is, I prove that

$$v(x, y) \equiv (1-\alpha) \cdot f(x) + \alpha \cdot h(y, \mu(x))$$

is concave. I need to show that $v_{xx} \leq 0$, $v_{yy} \leq 0$, and $v_{xx} v_{yy} \geq v_{xy}^2$. These are given by:

$$v_{xx} = (1-\alpha) \cdot f_{xx} + \alpha \cdot [h_{y'} \mu_{xx} + h_{y'y'} (\mu_x)^2] \leq 0 \tag{B.3}$$

$$v_{yy} = \alpha \cdot h_{yy} \leq 0 \tag{B.4}$$

$$v_{xx} v_{yy} - v_{xy}^2 = v_{xx} v_{yy} - (\alpha \cdot h_{yy'} \mu_x)^2 \geq 0, \tag{B.5}$$

for arbitrary values of $\{x, y\}$. It is immediate that (B.4) is satisfied. To make progress with (B.3), note that we can determine $h_{y'} \mu_{xx}$ as follows:

$$h_{y'} \mu_{xx} = h_{y'} \cdot \frac{\partial}{\partial x} \{\Gamma(\mu(x), x)\} = h_{y'} [\Gamma_\mu \mu_x + \Gamma_x] = \frac{1-\alpha}{\alpha} \frac{f_x}{q'(\mu)} q''(\mu) \cdot \mu_x - \frac{1-\alpha}{\alpha} \cdot f_{xx}.$$

Once substituted into (B.3), the condition becomes:

$$v_{xx} = \alpha \cdot \left[ \frac{1-\alpha}{\alpha} \frac{f_x}{q'(\mu)} q''(\mu) \cdot \mu_x + h_{y'y'} (\mu_x)^2 \right],$$

which is non-positive, since $q' > 0$, $q'' \leq 0$, $\mu_x \geq 0$, and $h_{y'y'} \leq 0$.
Turning to (B.5), after expanding and simplifying we end up with:

$$v_{xx} v_{yy} - v_{xy}^2 = \alpha^2 \cdot h_{yy} \cdot \mu_x \cdot \frac{1-\alpha}{\alpha} \frac{q''}{q'} f_x + (\alpha \cdot \mu_x)^2 \cdot [h_{yy} h_{y'y'} - h_{yy'}^2],$$

which is non-negative since it is the sum of two non-negative terms (the latter is non-negative since $h$ is concave), and therefore (B.5) is also satisfied. I conclude that $v$ is a concave function. Since $-c(x + y, \theta)$ is also a concave function, the objective function is concave, and the first-order conditions are sufficient for a global maximum.

B.2.4 Proof of Proposition 8

Proof. Consider the problem:

$$\max_{x, y} \{F(x, y)\} \quad \text{subject to} \quad x + y \leq T, \tag{B.6}$$

where $F$ is such that the '$y$' solution is strictly positive: $y(T) > 0$. Let $F(T)$ be the associated maximum value function, and consider the problem:

$$\max_T \{F(T) - c(T, \theta)\}. \tag{B.7}$$

Consider two different functions, $F$ and $\hat{F}$. The associated solutions, $T^*$ and $\hat{T}^*$, will satisfy $T^* > \hat{T}^*$ if it happens to be the case that $F'(T) > \hat{F}'(T)$ for all $T$. Since $T$ only enters the constraint in the original problem, the envelope theorem can be used to show that this holds if

$$F_y(x(T), y(T)) > \hat{F}_y(\hat{x}(T), \hat{y}(T)). \tag{B.8}$$

If we let $F$ be the objective function facing the social planner, and $\hat{F}$ be the objective function facing an agent in equilibrium, then $F_y = \alpha \cdot q'(y(T))$ and $\hat{F}_y = \alpha \cdot (1-\phi) q'(\hat{y}(T))$.

If $q$ is linear, then $q'(y(T)) = q'(\hat{y}(T))$ and $\hat{F}_y(z')/F_y(z) = (1-\phi) < 1$ for any $(z, z')$, implying that $T^* > \hat{T}^*$. By interpreting $\hat{F}$ as the objective function facing a Nash investor, the same expressions apply (since the marginal return to parental investment is the same in the Nash and equilibrium settings). This observation implies both i) that total efficient investment is strictly greater than total Nash investment, and ii) that total Nash investment equals total equilibrium investment.
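B.3 Derivation: Investments with Noise

The following uses the fact that if $x \sim N(y, a)$ and $y \sim N(b, d)$, then $y \mid x \sim N(\lambda x + (1-\lambda) b, v)$, where $\lambda \equiv d/(d + a)$ and $v \equiv ad/(a + d)$. Since

$$\ln y_i = \ln \beta_y + \gamma \ln \theta_i = \ln \beta_y + \gamma \ln \theta + \gamma \varepsilon_{\theta i},$$

the prior is given by

$$\ln y \sim N(\ln \beta_y + \gamma \ln \theta, \gamma^2 \sigma_\theta^2). \tag{B.9}$$

The structure of the noise implies that:

$$\ln \tilde{y} \sim N(\ln y, \sigma_y^2). \tag{B.10}$$

However, note that we also have:

$$\ln x_i = \ln \beta_x + \gamma \ln \theta_i = \ln(\beta_x/\beta_y) + \ln y_i.$$

By adding noise to this, we have:

$$\ln \tilde{x} + \ln(\beta_y/\beta_x) = \ln y_i + \varepsilon_{xi}.$$

Therefore, we also have:

$$\ln \tilde{x} + \ln(\beta_y/\beta_x) \sim N(\ln y, \sigma_x^2). \tag{B.11}$$

Updating the prior (eqn (B.9)) with the information contained in the signal of parental investment (eqn (B.10)) gives:

$$\ln y \mid \tilde{y} \sim N\left( \lambda \cdot \ln \tilde{y} + (1-\lambda) \cdot A_1, \sigma_1^2 \right), \tag{B.12}$$

where $A_1 \equiv [\ln \beta_y + \gamma \ln \theta]$ is a constant and

$$\lambda \equiv \frac{\gamma^2 \sigma_\theta^2}{\gamma^2 \sigma_\theta^2 + \sigma_y^2} \quad \text{and} \quad \sigma_1^2 \equiv \frac{\gamma^2 \sigma_\theta^2 \sigma_y^2}{\gamma^2 \sigma_\theta^2 + \sigma_y^2}. \tag{B.13}$$

This posterior forms the new prior when using the information contained in the signal of wealth (eqn (B.11)). Using the same results, we have:

$$\ln y \mid \tilde{y}, \tilde{x} \sim N\left( \lambda' \cdot \ln \tilde{x} + (1-\lambda') \lambda \cdot \ln \tilde{y} + A_2, \sigma_2^2 \right), \tag{B.14}$$

where $A_2 \equiv \lambda' \ln(\beta_y/\beta_x) + (1-\lambda')(1-\lambda) \cdot A_1$ is a constant and

$$\lambda' \equiv \frac{\sigma_1^2}{\sigma_1^2 + \sigma_x^2} \quad \text{and} \quad \sigma_2^2 \equiv \frac{\sigma_1^2 \sigma_x^2}{\sigma_1^2 + \sigma_x^2}. \tag{B.15}$$

Then, it follows that:

$$E[\ln y \mid \tilde{x}, \tilde{y}] = \left[ \frac{\sigma_1^2}{\sigma_1^2 + \sigma_x^2} \right] \cdot \ln \tilde{x} + \left[ \frac{\sigma_x^2}{\sigma_1^2 + \sigma_x^2} \cdot \frac{\gamma^2 \sigma_\theta^2}{\gamma^2 \sigma_\theta^2 + \sigma_y^2} \right] \cdot \ln \tilde{y} + \text{constants}, \tag{B.16}$$

where, after simplification, the two bracketed coefficients correspond to $\lambda_x$ and $\lambda_y$ given in the text.

Appendix C

Appendix to Chapter 4

C.1 A Generalization of the Productivity Function

Let the productivity of agent $i$ be given by a function, $y : \mathbb{R}_+^2 \to \mathbb{R}_+$, that can be expressed as:

$$y(\theta_i, \theta_{-i}) = g^{-1}\left( (1-\phi) \cdot g(\theta_i) + \phi \cdot g(\theta_{-i}) \right), \tag{C.1}$$

where $g : \mathbb{R}_+ \to \mathbb{R}$ is a strictly monotone function with the property that $-g''(x)/g'(x) > 0$ for all $x \geq 0$ (i.e. $g$ is any strictly increasing and concave function, or any strictly decreasing and convex function).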
Monotonicity ensures the inverse is well-defined, and the latter condition ensures that types are complements (i.e. $y_{\theta_i \theta_{-i}} > 0$). It is straightforward to verify that $y$ is increasing in both arguments.

This specification is general enough to encompass the class of CES functions: simply let $g(x) = x^\gamma$ for $\gamma \in \{\gamma \mid \gamma \neq 0, \gamma < 1\}$. The case of $\gamma = 0$ corresponds to the Cobb-Douglas case used in the text (and $g$ is the natural log). Types become perfect substitutes as $\gamma \to 1$, and become perfect complements as $\gamma \to -\infty$. The specification is not restricted to the CES functions however, since $g$ can be any strictly increasing and concave or strictly decreasing and convex function. Since we have:

$$g(y_i) = (1-\phi) \cdot g(\theta_i) + \phi \cdot g(\theta_{-i}), \tag{C.2}$$

arguments virtually identical to those used in the proof of Results 11 and 12 can be used to show i) that the average value of $g(y)$, $\mu_{g(y)}$, equals $\mu_{g(\theta)}$, and therefore depends only on the distribution of types, and ii) that the variance of $g(y)$ is:

$$\sigma_{g(y)}^2 = [1 - 2\phi(1-\phi) \cdot [1-\rho]] \cdot \sigma_{g(\theta)}^2, \tag{C.3}$$

where $\sigma_{g(\theta)}^2$ is the variance of $g(\theta)$, and $\rho \equiv \mathrm{corr}(g(\theta_i), g(\theta_{-i}))$. Again, define $M \equiv [1 - 2\phi(1-\phi) \cdot [1-\rho]]$.

To provide an approximation of the expected value of output, let $q(z) \equiv g^{-1}(z)$. Thus, observing $g(y) = z$ tells us that $y = q(z)$. We know that $z$ is distributed with a mean of $\mu \equiv \mu_{g(\theta)}$ and a variance of $\sigma^2 \equiv M \cdot \sigma_{g(\theta)}^2$. Take a second-order Taylor series expansion of $q(\cdot)$ around the mean of $z$:

$$q(z) \approx q(\mu) + [z - \mu] \cdot q'(\mu) + [z - \mu]^2 \cdot \frac{q''(\mu)}{2}, \tag{C.4}$$

so that the expected value of $y$ satisfies

$$E[y] = E[q(z)] \approx q(\mu) + \sigma^2 \cdot \frac{q''(\mu)}{2}. \tag{C.5}$$

As long as $q''(\mu) > 0$, it follows that the expected value of $y$ is increasing in $M$. To verify that $q''(\mu) > 0$, start with the observation that $q(g(z)) \equiv z$. Differentiating both sides with respect to $z$ gives $q'(g(z)) g'(z) = 1$.
Differentiating once again gives:

$$q'(g(z)) g''(z) + q''(g(z)) g'(z) g'(z) = 0. \tag{C.6}$$

Re-arranging, and using the fact that $q'(g(z)) g'(z) = 1$ gives:

$$q''(g(z)) = -\frac{g''(z)}{g'(z)} \cdot \frac{1}{(g'(z))^2} > 0. \tag{C.7}$$

Therefore, once everything is substituted in, we get:

$$E[y] \approx q(\mu_{g(\theta)}) + M \cdot \sigma_{g(\theta)}^2 \cdot q''(\mu_{g(\theta)})/2, \tag{C.8}$$

where $q = g^{-1}$. This approximation is clearly increasing in $M$. Even though a closed-form expression for the exact value of expected output is not available, the proof of Result 14 below establishes that the exact value of expected output is increasing in $M$. To summarize, the analysis in the text generalizes quite naturally to this wider class of functions, and does not rely on specific features of the Cobb-Douglas form, such as separability.

C.2 Proofs

C.2.1 Proof of Results 11 and 12

Proof. The expression for $\mu_{\ln y}$ is:

$$E[\ln y] = (1-\phi) \cdot E[\ln \theta_i] + \phi \cdot E[\ln \theta_{-i}] = (1-\phi) \cdot E[\ln \theta] + \phi \cdot E[\ln \theta] = E[\ln \theta] \equiv \mu_{\ln \theta}.$$

For the variance expression,

$$\sigma_{\ln y}^2 = (1-\phi)^2 \cdot \sigma_{\ln \theta_i}^2 + \phi^2 \cdot \sigma_{\ln \theta_{-i}}^2 + 2\phi(1-\phi) \, \mathrm{cov}[\ln \theta_i, \ln \theta_{-i}] = [(1-\phi)^2 + \phi^2] \cdot \sigma_{\ln \theta}^2 + 2\phi(1-\phi) \sigma_{\ln \theta}^2 \cdot \rho.$$

Simplifying this, using $(1-\phi)^2 + \phi^2 = 1 - 2\phi(1-\phi)$, gives the result.

C.2.2 Proof of Result 13

Proof. The result follows from applying the expression for the Taylor series expansion in the general case, equation (C.8) above, and noting that when $g$ is the log function, $q$ (the inverse) is the exponential function, and as such, $q(\mu_{\ln \theta}) = q'(\mu_{\ln \theta}) = q''(\mu_{\ln \theta}) = \exp(\mu_{\ln \theta})$.

C.2.3 Proof of Result 14

The proof covers the general case discussed above. The case presented in the text corresponds to $g$ being the natural log.

Proof. Let $x_i$ represent $g(y_i)$. From the first two results, we know that $x$ is distributed with a mean of $E[g(\theta)] \equiv \mu$, and a variance of $M \cdot \mathrm{var}[g(\theta)] \equiv \sigma^2$. Let $q(x) \equiv g^{-1}(x)$, so that $y_i = q(x_i)$. Since we are interested in the expected value of $y$, we are interested in the expected value of $q(x)$. The assumptions on $g$ ensure that $q$ is convex (twice differentiate both sides of the identity $g(q(x)) = x$).
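Given $\mu$ and $\sigma$, we seek to determine how $E[q(x)]$ changes with $\sigma$. Define $\varepsilon_i \equiv (x_i - \mu)/\sigma$. By construction, $\varepsilon$ has a mean of zero and a variance of one. If we let the distribution of $\varepsilon$ be denoted by $H$, then we can write:

$$E[q(x) \mid \sigma] = \int q(\mu + \sigma \cdot \varepsilon) \, dH(\varepsilon). \tag{C.9}$$

The derivative of this with respect to $\sigma$ is:

$$\frac{d}{d\sigma} \{E[q(x) \mid \sigma]\} = \int q'(\mu + \sigma \cdot \varepsilon) \, \varepsilon \, dH(\varepsilon). \tag{C.10}$$

We are interested in the sign of this expression. The fact that $q$ is convex implies that $q'(\mu + \sigma \cdot \varepsilon)$ is an increasing function of $\varepsilon$, both when $q'(\cdot) > 0$ and when $q'(\cdot) < 0$. In either case, $[q'(\mu + \sigma \cdot \varepsilon) - q'(\mu)] \cdot \varepsilon$ is strictly positive (for all $\varepsilon \neq 0$). Therefore:

$$\int [q'(\mu + \sigma \cdot \varepsilon) - q'(\mu)] \cdot \varepsilon \, dH(\varepsilon) > 0. \tag{C.11}$$

Splitting this apart, and noting that

$$\int q'(\mu) \, \varepsilon \, dH(\varepsilon) = q'(\mu) \cdot \int \varepsilon \, dH(\varepsilon) = 0, \tag{C.12}$$

gives us the desired result:

$$\int q'(\mu + \sigma \cdot \varepsilon) \, \varepsilon \, dH(\varepsilon) > 0. \tag{C.13}$$

Thus, expected output is increasing in $M$, since this variable increases $\sigma$ without affecting $\mu$.

C.2.4 Proof of Proposition 11

Proof. Let $\Lambda_1$ be the probability with which an investor is matched with a high type, and $\Lambda_0$ be the probability that a non-investor is matched with a high type. That is:

$$\Lambda_0 \equiv \lambda \cdot \xi_0 + (1-\lambda) \cdot \xi_1 \tag{C.14}$$

$$\Lambda_1 \equiv \lambda \cdot \xi_1 + (1-\lambda) \cdot \xi_0. \tag{C.15}$$

Given $\Lambda_1$ and $\Lambda_0$, agents of type $T$ find it optimal to invest if:

$$\theta_T^{1-\phi} \cdot \left[ \Lambda_1 \cdot \theta_H^\phi + (1-\Lambda_1) \cdot \theta_L^\phi \right] - c \geq \theta_T^{1-\phi} \cdot \left[ \Lambda_0 \cdot \theta_H^\phi + (1-\Lambda_0) \cdot \theta_L^\phi \right].$$

Once re-arranged, this is just:

$$c \leq \theta_T^{1-\phi} \cdot [\Lambda_1 - \Lambda_0] \cdot \left[ \theta_H^\phi - \theta_L^\phi \right].$$

Subtracting (C.14) from (C.15) reveals that $\Lambda_1 - \Lambda_0 = (2\lambda - 1) \cdot [\xi_1(\lambda, \sigma) - \xi_0(\lambda, \sigma)]$, which, by definition, equals $G(\lambda, \sigma)$. Thus, a type $T$ agent invests if and only if

$$\frac{c}{\theta_T^{1-\phi} \cdot \left[ \theta_H^\phi - \theta_L^\phi \right]} \leq G(\lambda, \sigma). \tag{C.16}$$

Since $G$ is decreasing in $\sigma$, it follows that if

$$G(\lambda, 0) < \frac{c}{\theta_H^{1-\phi} \cdot \left[ \theta_H^\phi - \theta_L^\phi \right]} \tag{C.17}$$

then an investment equilibrium could never exist, since high types would never find it optimal to invest (even if all other high types invested and no low types invested).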
If

$$\frac{c}{\theta_H^{1-\phi} \cdot \left[ \theta_H^\phi - \theta_L^\phi \right]} \leq G(\lambda, 0) \leq \frac{c}{\theta_L^{1-\phi} \cdot \left[ \theta_H^\phi - \theta_L^\phi \right]} \tag{C.18}$$

then high types find it optimal to invest, but low types do not. Higher values of $\sigma$ will not reverse the latter inequality, implying that the unique equilibrium has $\sigma = 0$. If

$$\frac{c}{\theta_L^{1-\phi} \cdot \left[ \theta_H^\phi - \theta_L^\phi \right]} < G(\lambda, 0) \tag{C.19}$$

then low types find it optimal to invest if no other low types invested. Therefore $\sigma = 0$ is not an equilibrium. However, since $G(\lambda, \sigma)$ is continuous and strictly decreasing in $\sigma$, and has the property that $G(\lambda, 1) = 0$, it follows that there exists a unique $\sigma^* \in (0, 1)$ such that

$$\frac{c}{\theta_L^{1-\phi} \cdot \left[ \theta_H^\phi - \theta_L^\phi \right]} = G(\lambda, \sigma^*). \tag{C.20}$$

For $\sigma < \sigma^*$ all low types find it optimal to invest (requiring that $\sigma = 1 > \sigma^*$), and for $\sigma > \sigma^*$ no low type finds it optimal to invest (requiring that $\sigma = 0 < \sigma^*$). Thus, the unique equilibrium has low types investing with probability $\sigma^*$ since low types are indifferent. The equilibrium is stable in the sense that low types find it optimal to invest when $\sigma < \sigma^*$ (which raises $\sigma$) and find it optimal to not invest when $\sigma > \sigma^*$ (which lowers $\sigma$).

C.2.5 Proof of Proposition 12

Proof. There are $\psi Z$ high-high matches, and $\psi(1-Z)$ high-low matches. This last observation implies that there are also $\psi(1-Z)$ low-high matches, leaving $(1 - \psi Z - 2\psi(1-Z))$ low-low matches. The expected value of log output is therefore:

$$\mu_{\ln y} = Z\psi \cdot \ln \theta_H + \psi(1-Z) \cdot [\ln \theta_H + \ln \theta_L] + [1 - Z\psi - 2\psi(1-Z)] \cdot \ln \theta_L, \tag{C.21}$$

which equals $\psi \cdot \ln \theta_H + (1-\psi) \cdot \ln \theta_L$. This could have been anticipated by the application of the more general formula, noting that $\mu_{\ln \theta} = \psi \cdot \ln \theta_H + (1-\psi) \cdot \ln \theta_L$.

After (considerable) manipulation, the variance of log-output can be written as:

$$\sigma_{\ln y}^2 = \left[ 1 - 2\phi(1-\phi) \cdot \frac{1-Z}{1-\psi} \right] \cdot \psi(1-\psi) \cdot [\ln \theta_H - \ln \theta_L]^2. \tag{C.22}$$

Since $\sigma_{\ln \theta}^2 = \psi(1-\psi) \cdot [\ln \theta_H - \ln \theta_L]^2$, a comparison with the general formula reveals that $(1-Z)/(1-\psi) = 1 - \rho$, producing the result.

C.2.6 Proof of Proposition 13

Proof.
The first point follows immediately from the observation that $Z = \psi$ in a non-investment equilibrium. In the investment equilibrium, let $\Lambda_1$ be the probability with which an investor is matched with a high type, and $\Lambda_0$ be the probability with which a non-investor is matched with a high type. As argued in the text, $Z = \Lambda_1$. Thus, the proof establishes that $\Lambda_1$ is increasing in $\lambda$ when $\sigma^* = 0$ and is decreasing when $\sigma^* > 0$. By the definition of $G$, we have

$$G = \Lambda_1 - \Lambda_0. \tag{C.23}$$

By expressing the probability of observing a 'cross' match in two different ways, it must also be that:

$$\psi(1 - \Lambda_1) = (1-\psi)[\sigma \Lambda_1 + (1-\sigma) \Lambda_0].$$

Together, these imply that

$$\Lambda_1 = \psi + (1-\psi)(1-\sigma) \cdot G(\lambda, \sigma). \tag{C.24}$$

If $\sigma^* = 0$, then $\Lambda_1$ is increasing in $\lambda$ since $G(\lambda, 0)$ is increasing in $\lambda$. If $\sigma^* > 0$, then $G$ is a constant ($c \cdot [\theta_L^{1-\phi} \cdot [\theta_H^\phi - \theta_L^\phi]]^{-1}$), and $\Lambda_1$ is decreasing in $\sigma^*$. The result follows since $\sigma^*$ is increasing in $\lambda$ (since $G(\lambda, \sigma)$ is increasing in $\lambda$ and decreasing in $\sigma$, but equal to a constant when $\sigma^* > 0$).
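The moment formulas used in the proof of Proposition 12 can be checked by direct enumeration of the four match cells. The following is a minimal sketch, assuming the Cobb-Douglas case in which an agent's log output is $(1-\phi)$ times the own log type plus $\phi$ times the partner's log type; the function names and the parameter values in the usage example are hypothetical and chosen only for illustration.

```python
import math

def enumerated_moments(psi, Z, phi, theta_H, theta_L):
    """Mean and variance of log output from the four match cells."""
    a, b = math.log(theta_H), math.log(theta_L)
    # Each cell: (share of agents, own log type, partner's log type).
    cells = [
        (psi * Z, a, a),                          # high matched with high
        (psi * (1 - Z), a, b),                    # high matched with low
        (psi * (1 - Z), b, a),                    # low matched with high
        (1 - psi * Z - 2 * psi * (1 - Z), b, b),  # low matched with low
    ]
    log_y = [(w, (1 - phi) * own + phi * partner) for w, own, partner in cells]
    mean = sum(w * v for w, v in log_y)
    var = sum(w * (v - mean) ** 2 for w, v in log_y)
    return mean, var

def closed_form_moments(psi, Z, phi, theta_H, theta_L):
    """Mean and variance of log output from the closed-form expressions."""
    a, b = math.log(theta_H), math.log(theta_L)
    mean = psi * a + (1 - psi) * b
    var_types = psi * (1 - psi) * (a - b) ** 2
    var = (1 - 2 * phi * (1 - phi) * (1 - Z) / (1 - psi)) * var_types
    return mean, var

if __name__ == "__main__":
    params = dict(psi=0.4, Z=0.7, phi=0.3, theta_H=2.0, theta_L=1.0)
    print(enumerated_moments(**params))
    print(closed_form_moments(**params))
```

The parameters must leave a non-negative share of low-low matches, that is, $1 - \psi Z - 2\psi(1-Z) \geq 0$. Within that range the two functions should agree up to floating-point error, which also confirms that $(1-Z)/(1-\psi)$ plays the role of $1-\rho$ in the general variance formula.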
