An Empirical and Economic Analysis of High School Peer Effects

by

Andrew J. Hill

B.Sc., The University of Cape Town, 2003
B.Com. (Hons), The University of Cape Town, 2004
M.Com., The University of Cape Town, 2006
M.A., The University of British Columbia, 2007

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

The Faculty of Graduate and Postdoctoral Studies
(Economics)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)

October 2013

© Andrew J. Hill 2013

Abstract

Parents are concerned about the influence of friends during adolescence. Using the gender composition of schoolmates in an individual's close neighbourhood as an instrument for the gender composition of an individual's self-reported friendship network, Chapter 2 of this dissertation finds that the share of opposite gender friends has a sizeable negative effect on high school GPA. The effect is found across all subjects for students over the age of sixteen, but is limited to mathematics and science for younger students. Self-reported difficulties getting along with the teacher and paying attention in class are important mechanisms through which the effect operates. The subject-specific effects for younger students and larger estimates for females in general are consistent with a gender socialization hypothesis in which young females conform to traditional gender roles in the presence of males.

Chapter 3 investigates the extent to which course repeaters in high school mathematics courses exert negative externalities on their course-mates. Using individual and school-specific course fixed effects to control for ability and course selection, it shows that doubling the number of repeaters in a given course (holding the number of course-takers constant) results in a 0.15 reduction in GPA scores for first-time course-takers.
Further results suggest that the negative effect is only evident when the share of repeaters reaches a threshold of five to ten percent of the total number of course-takers.

Chapter 4 provides evidence that part-time work during high school affects the college attendance and labour market entry decisions of young adults: 8-10th grade students working more than five hours per week are less likely to attend college and more likely to enter the labour market upon high school graduation than other students. The part-time working behaviour of same-grade schoolmates is used as an instrument for individual part-time working behaviour.

Preface

All chapters of this dissertation are original, unpublished and sole-authored.

This research uses data from Add Health, a program project designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris, and funded by a grant P01-HD31921 from the National Institute of Child Health and Human Development, with cooperative funding from 17 other agencies. Special acknowledgment is due Ronald R. Rindfuss and Barbara Entwisle for assistance in the original design. Persons interested in obtaining data files from Add Health should contact Add Health, Carolina Population Center, 123 W. Franklin Street, Chapel Hill, NC 27516-2524.

Ethics approval under the project title "Adolescent Experiences and Early Adult Outcomes" was obtained through the Behavioural Research Ethics Board of the University of British Columbia (H10-02708).

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements
Dedication
1 Introduction
2 The Girl Next Door: The Effect of Peer Gender Composition on High School Achievement
  2.1 Introduction
  2.2 Empirical Methodology and Specification
  2.3 Data
    2.3.1 Distance and Friendship
    2.3.2 Descriptive Statistics
  2.4 Results
  2.5 Conclusion
  2.6 Figures
  2.7 Tables
3 If At First You Don't Succeed: Negative Externalities in High School Course Repetition
  3.1 Introduction
  3.2 Empirical Methodology
  3.3 Data and Descriptive Statistics
  3.4 Results
  3.5 Conclusion
  3.6 Figures
  3.7 Tables
4 To Go To College or Get A Job? The Effects of Part-Time Work During High School
  4.1 Introduction
  4.2 Empirical Methodology
    4.2.1 The First Stage: Peer Effects in High School Working Behaviour
    4.2.2 The Second Stage: The Effect of High School Working Behaviour
  4.3 Data and Descriptive Statistics
  4.4 Results
  4.5 Conclusion
  4.6 Figures
  4.7 Tables
Bibliography
Appendices
A The Effect of Peer Gender Composition on High School Achievement
  A.1 Sensitivity of Results to Instrument Specification
  A.2 Non-classical Measurement Error Arising from Self-reporting Bias
  A.3 Boundary Concerns
  A.4 Cumulative Effect Interpretation
  A.5 Appendix Tables
B Negative Externalities in High School Course Repetition
  B.1 Appendix Tables
C The Effects of Part-Time Work During High School
  C.1 Appendix Tables

List of Tables

2.1 Descriptive statistics: dyadic pairs
2.2 OLS estimates of friend interactions on distance
2.3 Descriptive statistics: key variables
2.4 OLS estimates of GPA on gender composition of schoolmates and close neighbours
2.5 IV estimates of GPA on gender composition of high school friendship networks
2.6 IV estimates of academic achievement by age
2.7 IV estimates of potential mechanism - school and classroom behaviours
2.8 IV estimates of potential mechanism - social and home behaviours
2.9 IV estimates of selected mechanisms by age
2.10 IV estimates of long-term effects of peer gender composition
3.1 Number of years of high school mathematics courses/credits required for graduation
3.2 Courses required for high school graduation/diploma
3.3 Descriptive statistics - Pooled (Units: student-years)
3.4 Descriptive statistics by math course - Pooled (student-years)
3.5 Transition matrices - shares: Mathematics (student-years)
3.6 Correlation between previous and current mathematics achievement
3.7 Effect of course repeaters on academic performance of first-time course-takers
3.8 Placebo tests: Pseudo course-mate achievement at time t - 1 to t + 2
3.9 Gender and race heterogeneity in effect of course repeaters
3.10 Separating effects of course-mates' course failure and course repetition
4.1 Descriptive statistics I - weekly hours worked and controls
4.2 Descriptive statistics II - outcomes (Wave 4 unless otherwise stated)
4.3 First-stage results - peer effects in part-time work during high school
4.4 Balance tests - OLS results from regressing instrument on controls
4.5 IV results - effect of part-time work during high school on educational outcomes
4.6 IV results - educational outcomes
4.7 IV results - labour market outcomes
A.1 Descriptive statistics: controls
A.2 Instrument balance tests
A.3 Sensitivity analysis - instrument specification
A.4 Sensitivity analysis - network density and friendship definitions
A.5 Sensitivity analysis - school urbanicity
A.6 Placebo test - effect of share even birth month
A.7 Measurement error from self-reporting bias
A.8 Sensitivity of first stage to distance from community origin
A.9 Cumulative effect estimates - math and science GPA
B.1 Descriptive demographic statistics - Pooled (student-years)
B.2 Effect of course repeaters on academic performance of first-time course-takers
B.3 Robustness check - excluding selected subjects and schools
B.4 Correlation between course failure rate and subsequent GPA
C.1 OLS results - educational outcomes
C.2 OLS results - labour market outcomes

List of Figures

2.1 Parent motivation for housing location
2.2 Spatial distribution within selected school 1
2.3 Spatial distribution within selected school 2
2.4 Share of matched friendship nominations
2.5 Friendship nomination and sampling processes
2.6 Distribution of grades
2.7 Distribution of friendship network gender composition
2.8 Distribution of close neighbourhood gender composition
2.9 Balance tests
3.1 Number of students enrolled in math courses per school
3.2 Distribution of math GPA scores by past achievement
3.3 Number of students per school-course-year (class)
3.4 Distribution of number of students repeating a failed course per school-course-year
3.5 Distributional effects
3.6 Threshold effects of share repeaters on GPA of first-time course-takers
4.1 Distribution of part-time hours worked

Acknowledgements

I wish to thank Nicole Fortin for countless hours of advice and supervision. Your honest and thoughtful feedback greatly improved my work. Dissertation committee members Thomas Lemieux and Craig Riddell provided insightful suggestions and much encouragement. Thank you.

I am grateful to University of British Columbia (UBC) faculty members David Green, Joshua Gottlieb, Yoram Halevy, Vadim Marmer, Kevin Milligan, Dotan Persitz, Marit Rehavi, and Francesco Trebbi for valuable input along the way. Seminar participants at UBC, the Canadian Economics Association 2012 Conference and the European Association of Labour Economists 2012 Conference provided useful feedback on selected chapters.

UBC classmates, officemates and labmates were a constant source of constructive criticism and support. I benefited greatly from sharing the graduate school experience with residents of St John's College, many of whom are now great friends.

I also wish to acknowledge classmates and faculty members at the University of Cape Town, particularly Johannes Fedderke, who encouraged my pursuit of a graduate education in economics.

Funding from the Canadian Labour Market and Skills Researcher Network (CLSRN), the Skye Foundation, and the Len Baumann Trust is greatly appreciated.

Dedication

To my parents, for challenging me to think and showing me how to love;
To my brothers, for leading the way;
And to my friends, for seeking adventure.

Chapter 1

Introduction
Individuals are affected by their peers. In particular, friends and schoolmates influence the decisions high school students make, the activities they perform, and the attitudes they possess. This dissertation investigates the effects of three specific dimensions of high school peer group composition on a variety of academic and labour market outcomes. First, it asks how the gender composition of an individual's high school friendship network affects school performance; second, it explores whether course repeaters in high school mathematics courses exert negative externalities on their course-mates; and, third, it investigates how the part-time working behaviour of schoolmates in the same grade affects an individual's own part-time working behaviour, and, consequently, how this peer-induced variation in part-time working behaviour affects the individual's subsequent decisions and outcomes.

Adolescent experiences have the potential to affect both an individual's contemporaneous and subsequent outcomes in economically meaningful ways. The first two chapters of this dissertation focus on academic attainment and achievement in high school as the outcomes of interest. Essentially, these chapters characterize how peers enter the education production function. Economists are interested in understanding the determinants of education because of the overwhelming evidence of positive returns to education in both the labour market (Card, 1999) and various quality of life measures[1]. The outcome of interest in the third chapter is the school-leaving decision to enter the labour market or go to college. This is important because of the path dependence typically observed in labour markets[2]; the effects of either entering the labour market or going to college after graduating high school are likely to persist throughout an individual's life.

High school students belong to a variety of peer groups.
Peers may be friends, neighbours, classmates or schoolmates, and there are several finer dimensions within each of these groupings that are likely to exert an independent influence on incentives and outcomes. Investigating composition effects for each of these peer groups provides unique challenges, but an overarching concern is dealing with non-random selection into the relevant peer group. The first two chapters of this dissertation introduce novel empirical strategies to overcome this problem; the first chapter uses the gender composition of schoolmates in an individual's close neighbourhood as an instrument for the gender composition of an individual's friendship network, and the second chapter extends an existing fixed effects strategy to longitudinal transcript data in a way that allows for a variety of additional controls not feasible in previous analyses. The third chapter uses the familiar across-cohort within-school variation introduced by Hoxby (2000), applies it to a new outcome, and then goes one step further by using the estimated peer composition effect as the first stage in an instrumental variables estimation.

[1] See, for example, Oreopoulos and Salvanes (2009).
[2] See, for example, Keane and Wolpin (1997).

An important feature of each of the analyses is exploring the mechanisms through which the estimated peer composition effects operate.
This is important because although there are several policy instruments available to affect high school peer group composition directly, an understanding of the mechanisms is likely to both improve policy predictions and provide a more fundamental description of how peers affect incentives and behaviour.

This dissertation builds on the existing peer effects and economics of education literature by investigating a new set of peer composition effects. Deepening our understanding of how high school peer composition enters the education production function is important in the continuing quest to improve and ultimately optimize education policy.

Chapter 2

The Girl Next Door: The Effect of Peer Gender Composition on High School Achievement

2.1 Introduction

Peer effects are an important concern in education (Sacerdote, 2011). The questions of whether classrooms should be single-gendered or mixed and whether students should be tracked into classes based on ability are largely based on the premise that peer effects matter. The clear importance of these questions for parents and policy-makers has stimulated an extensive literature investigating peer effects in education. Exogenous peer effects[3] are difficult to identify because peer groups are typically selected; parents choose schools for their children, and teenagers choose their friends. This paper introduces an innovative identification strategy to overcome this selection problem and estimate exogenous peer effects within friendship networks, a peer group that is intensively selected on both observable and unobservable dimensions.

Individuals in school spend a large amount of time with their friends (Fuligni and Stevenson, 1995; Gager et al, 1999), and school friends are generally considered to exert considerable influence over each other's incentives and actions. Peer gender composition is also believed to have a considerable effect on teenage behaviour.
Beyond the general questions around single-sex education, there are debates around how gender-specific social interactions affect the development of cognitive skills, the development of social and interpersonal skills, and the propensity to engage in risky behaviour. This paper takes a new step by considering the effect of the share of opposite gender school friends on academic achievement.[4,5] An instrumental variables approach overcomes the endogeneity of peer composition arising from selection into friendship groups: the gender composition of schoolmates in an individual's close neighbourhood induces plausibly exogenous variation in the gender composition of an individual's friendship network.

[3] Manski (1993) discusses the different types of peer effects. Endogenous peer effects operating through the actions and decisions of friends are not modelled in this paper. Estimating these peer effects is the focus of several papers in the economics of education literature (Bramoullé et al, 2009; Cooley, 2010; De Giorgi et al, 2010; Lin, 2010).
[4] The effect of the share rather than the number of opposite gender friends is modelled in this paper. This is motivated by considering that many adolescent activities are conducted as friendship groups rather than separately as friendship pairs. The paper does not preclude the potential for the number of opposite gender friends to have a separate effect, but the analysis of this is left for future work; the empirical strategy in this paper cannot identify a number of friends effect without several additional assumptions.
[5] Waddell (2012) investigates the role of opposite gender peer drinking on adolescent sexual behaviour.

The study makes four contributions to the economics of education literature. First, it shows that an increase in the share of opposite gender school friends reduces academic achievement.
To the best of my knowledge, there is no prior evidence of a causal effect associated with the gender composition of an individual's friendship network on academic outcomes.[6] A one standard deviation increase in the share of opposite gender friends results in a 0.4 (one half of a standard deviation) reduction in GPA scores. This is approximately twice the mean female-male achievement gap of 0.2 found in the data, suggesting a moderately-sized effect.

Second, this paper studies potential mechanisms through which friendship network gender composition effects operate. It finds that an increase in the share of opposite gender friends increases the reported frequencies of difficulties getting along with the teacher and difficulties paying attention in class, two effects occurring within the classroom and strongly associated with negative academic outcomes. No convincing evidence is found to support channels operating outside the classroom. Lavy and Schlosser (2011) investigate mechanisms through which peer gender composition effects operate for same-grade schoolmates, finding that a higher share of female peers lowers the level of classroom disruption, improves relationships in the classroom, increases students' overall satisfaction in school, and lessens teachers' fatigue. The mechanisms identified in this paper complement those found in their study, and provide further channels through which peer gender composition affects academic achievement.

Third, this paper provides suggestive empirical support for gender socialization effects (Galambos et al, 1990). Specifically, it presents suggestive evidence that young teenage girls may be incentivized to fulfil gender stereotypes in the presence of boys. The negative effect for younger students caused by an increase in the share of opposite gender friends is most evident for females in subjects traditionally considered the domain of males, mathematics and science, and is not found in English and history.
These results are aligned with a small set of papers that have investigated gender socialization effects in the economics of education literature.[7] The findings in this paper complement the existing gender socialization literature by introducing an analysis at the friendship level, a peer group in which socialization pressures are likely to be considerable given the desire of adolescents to be accepted by their friends.

[6] Poulin et al (2011) attempt to identify an effect using longitudinal variation in the composition of friendship networks, but they cannot account for time-varying changes in unobservable characteristics. Cipollone and Rosolia (2007) use a policy change in southern Italy to show that increasing the schooling attainment of boys increases the schooling attainment of girls, but they do not observe friendship networks.
[7] Schneeweis and Zweimüller (2011) use natural variation in the gender composition of adjacent cohorts within schools to show that females with a higher share of male classmates are less likely to choose male-dominated vocational school types, and, in experimental settings, Gneezy et al (2003, 2009) and Booth and Nolen (2012a, 2012b) find that the behaviour of females responds to the gender composition of the group in which they are interacting.

And, fourth, this paper distinguishes a socially-based classroom gender composition effect from other effects operating in the classroom. Previous grade gender composition estimates have not been able to separate the effect arising from peer interactions in the classroom from correlated effects that may be responding to classroom gender composition (such as teaching style and disciplining behaviour).

Existing studies examining exogenous peer effects on academic achievement have used two broad approaches to overcome peer selection. The first approach exploits the institutional random assignment of peers. Sacerdote (2001), Zimmerman (2003), Stinebrickner et al (2006) and Carrell et al (2011) use the random assignment of students to different residences at the same post-secondary institution to investigate the effects of peer characteristics on various student outcomes. This type of random assignment typically only occurs at the post-secondary level, so although it provides compelling identification, its application is limited to a subset of questions. Furthermore, assignment is typically not across genders, limiting the potential for studying gender composition effects.[8]

The second approach exploits some form of conditional exogenous variation in the composition of peer groups. Typically, this approach relies on the peer group being defined so that its composition along the dimension of interest is exogenous conditional on a set of observable characteristics. The primary application uses variation in the composition of students across grades within the same school to identify exogenous peer effects for same-grade schoolmates, and is based on selection into schools being a function of school characteristics rather than cohort-specific deviations from these characteristics. It has been used to investigate exogenous peer effects along multiple dimensions: race (Angrist and Lang, 2004; Hanushek et al, 2009), domestic violence (Carrell and Hoekstra, 2010), home language (Friesen and Krauth, 2011), and, related to this paper, gender (Hoxby, 2000; Lavy and Schlosser, 2011; Schneeweis and Zweimüller, 2011).[9] These studies provide compelling evidence that grade composition matters, but cannot inform our understanding of composition effects for finer peer groups in which observables cannot control for selection. Nonetheless, my finding that friendship network effects operate within the classroom provides further support and justification for the above papers that focus exclusively on peer composition effects at the grade level.

[8] Whitmore (2005) uses the class size randomization of Project STAR to investigate the effects of gender composition in elementary school classrooms.
[9] Bifulco et al (2011) draw attention to some concerns when interpreting these findings, and Krauth (2011) provides a treatment effect interpretation of these effects.

The paper is organized in the following way. Section 2.2 introduces the empirical methodology, particularly the strategy to overcome the endogeneity in the gender composition of high school friendship networks. The subsequent data section is divided into two subsections. First, Section 2.3.1 provides empirical support for the claim that distance is a significant determinant of friendship, the hypothesis on which the identification strategy relies, and Section 2.3.2 describes the data in detail. Section 2.4 begins by reporting the primary findings, and then considers a variety of potential mechanisms through which the effect may operate. It also includes a set of results related to long-term effects of friendship network gender composition. The conclusion provides an overall interpretation of the results.

2.2 Empirical Methodology and Specification

Individuals in a social network are often defined by their type (such as gender, race or age). The number of types, distribution of types and relative distribution of types in an individual's friendship network may all affect high school performance. This paper focuses on the effects associated with other-type friendships.[10] Models of network formation typically impose some additional cost for behaving like or interacting with other types (Bisin et al, 2011).
This study defines type by gender, and proposes the idea that other-type or opposite gender friendships are associated with an additional cost or negative input in the education production function.[11,12]

[10] Homophily in friendship networks (the tendency to form same-type friendships) has been extensively modelled in the network formation literature (Currarini et al, 2009). This paper exploits an aspect of the friendship formation process to obtain exogenous variation in network composition, but otherwise abstracts away from network formation to consider the effects of homophily on academic achievement and other outcomes (rather than understand why homophily arises).

The academic achievement $Y$ of individual $i$ in grade $g$ and school $s$ is modelled as a linear function of a female indicator $F$, a vector of remaining individual and background characteristics $X$, the share of opposite gender friends $O$, and grade and school fixed effects $D$:

$$Y_{igs} = \alpha F_i + \beta X_i + \gamma O_{is} + \delta D_g + \lambda D_s + \varepsilon_{igs} \tag{2.1}$$

The model is estimated on the combined sample of males and females, imposing gender symmetry in the effect. An interaction term $F_i \times O_{is}$ is included when gender symmetry is relaxed. This specification shows that the effect of opposite gender friends goes in the same direction for males and females, but is plausibly of different magnitudes (although not statistically different). The advantage of the symmetry restriction is an increase in the statistical power of the estimation.

The gender composition of an individual's high school friendship network is likely to be correlated with a variety of unobservable characteristics that affect academic achievement. Candidates include parental inputs, personality traits and non-cognitive skills. For example, supportive parents may encourage participation in a wide range of extra-mural activities (resulting in more gender-balanced friendship groups) as well as greater academic achievement.
This introduces correlation between the gender composition variable and the error term in the absence of perfect controls for parental inputs.

Peer gender composition is also measured with error. Beyond the attenuation bias associated with potential classical measurement error, friendship networks are constructed from self-reported friendship nominations. This process may yield systematically biased measures of friendship gender composition. For example, certain students may nominate opposite gender classmates as friends in an effort to appear more popular, generating correlation between the share of opposite gender friends and high school performance if the selection of these students is correlated with their academic achievement.

[11] The debate on whether classrooms should be single-gendered or mixed can also be interpreted in the context of same-type and other-type interactions. The effect of single-gendered classrooms is equivalent to the effect of same-type (own gender) classmates. To the extent that the grade gender composition literature informs the debate on single-sex education, this specification would actually have an easier interpretation than those using female grade share given the impossibility of increasing or decreasing the female grade share for all individuals.
[12] The primary outcome of interest in this paper is academic achievement. Opposite gender friends may also affect non-cognitive development, and given the recognized importance of soft skills in subsequent life outcomes, there may be a trade-off between the negative effects on school performance and potential positive effects on non-cognitive skills. A preliminary analysis found no convincing evidence of an effect of opposite gender friends on measures of soft skills in the Add Health data.

Because of the omitted variables problem and potential measurement error, least squares estimation of Equation 2.1 provides biased estimates of $\gamma$, the effect of the gender composition of school friendship networks on academic achievement. A source of exogenous variation in gender composition is required to obtain consistent estimates of the effect. This paper exploits variation in the gender composition of schoolmates in the close neighbourhood to obtain exogenous variation in the gender composition of school friends.

The idea behind the identification strategy is introduced by an example. Two females, Alice and Barbara, attend the same school and are in the same grade. They share identical individual and background characteristics, and both live next door to someone who attends the same school. Alice lives next door to a male, Charles, and Barbara lives next door to a female, Debbie. Alice catches the bus with Charles, they are friends, and, in addition, Alice has become friends with some of Charles' (mostly male) friends.[13] Barbara and Debbie also catch the bus together, they are friends, and Barbara is also friends with some of Debbie's (mostly female) friends. As a result, Alice has a larger share of male friends than Barbara. This arose by chance, as neither Alice's nor Barbara's parents knew the gender of their neighbours' children when they chose where to live, even though both their choices may have been based on a variety of other factors, such as income and the proximity to a good school.

The relationship between distance and friendship is central to the identification strategy.
The probability of Alice being friends with her neighbour Charles needs to be greater than the probability of Alice being friends with someone identical to Charles who lives on the other side of town. A further requirement is that there needs to be variation in the gender composition of schoolmates across neighbourhoods; the strategy does not work if everyone in the school has one male neighbour and one female neighbour as this would not generate variation in the gender composition of friendship networks. These conditions are discussed further in the data and descriptive statistics section. The strength of the relationship between distance and friendship is also supported by existing empirical evidence. Using the same data as this paper, Mouw and Entwisle (2005) find that friends are more than five times as likely as non-friends to live within 0.25km of one another after conditioning on several observable characteristics.

[13] It is well-established that the probability of friendship increases with the existence of mutual friends (see, for example, Goodreau et al, 2009). This channel is not actually required for the identification strategy to work. It does, however, strengthen it, as the gender composition of Alice's friends is not only affected by her friendship with her male neighbour, Charles, but also by her friendships with Charles' mostly male friends.

The exact instrument used in this paper is a weighted average of the gender composition of someone's nearest twenty same-school neighbours (the set denoted by $J_{20}$ in the specification below). Each neighbour $j$ is identified as being of opposite gender to $i$ by the indicator $O_{ijs}$, and their contribution to the mean function is weighted by an inverse function of the distance between the relevant individual and the neighbour, $D_{ijs}$ (the nearer the neighbour, the greater the weight).
The weighting function $w(D_{ijs})$ takes the form of the standard Epanechnikov kernel with bandwidth equal to the distance to the twentieth nearest neighbour, $D_{J_{20}}$.

Variations of the instrument based on the gender composition of a different number of the nearest neighbours or neighbours within a specified radius, as well as weighted and unweighted versions, were considered. The chosen measure yields the most precise estimates, although the results do not depend on the functional form of the instrument or the weighting function.[14] The equations below specify the first stage for investigating the effect of the share of opposite gender friends.

$$O_{is} = \alpha_0 F_i + \beta_0 X_{is} + \gamma_0 \frac{\sum_{j \in J_{20}} w(D_{ijs})\, O_{ijs}}{\sum_{j \in J_{20}} w(D_{ijs})} + \delta_0 D_g + \theta_0 D_s + \nu_{igs} \qquad (2.2)$$

$$w(D_{ijs}) = \frac{3}{4}\left(1 - \left(\frac{D_{ijs}}{D_{J_{20}is}}\right)^{2}\right) \qquad (2.3)$$

[14] The estimates associated with other instruments and weighting functions are less precise, but very similar in magnitude. Weights simply reflect the empirical observation that the probability of friendship is inversely related to distance. Several of these results are reported in the appendix.

The causal parameter of interest $\gamma$ in Equation 2.1 is identified if the gender composition of an individual's close neighbourhood is restricted to affect academic performance only through the gender composition of the individual's friendship network.[15] This claim is supported by two arguments.

[15] In addition to this exclusion restriction, the monotonicity assumption required for instrument validity is very likely satisfied. An individual exposed to an increase in the share of opposite gender close neighbours is unlikely to decrease their share of opposite gender friends.

First, the gender of an individual's neighbour is very likely random. Essentially, parents do not choose the locations of their homes based on the gender of school-going neighbours. Figure 2.1 reports the distribution of parent motivations for housing locations in the data.
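The construction of the instrument in Equations 2.2 and 2.3 can be illustrated with a minimal sketch. The Python code below is not the code used in the analysis: the function names and the `(x, y, gender)` tuple layout are hypothetical, and the sketch assumes that coordinates are expressed in a common unit relative to the school's arbitrary origin. It computes the Epanechnikov-weighted share of opposite gender schoolmates among the `k` nearest same-school neighbours, with the bandwidth set to the distance to the `k`-th nearest neighbour.

```python
import math


def epanechnikov_weight(distance, bandwidth):
    """Epanechnikov kernel weight 3/4 * (1 - u^2), with u = distance / bandwidth."""
    u = distance / bandwidth
    return 0.75 * (1.0 - u * u) if u < 1.0 else 0.0


def opposite_gender_neighbour_share(person, schoolmates, k=20):
    """Distance-weighted share of opposite gender pupils among the k nearest schoolmates.

    `person` and every element of `schoolmates` are (x, y, gender) tuples
    (an assumed layout). Returns None when the share is undefined.
    """
    x, y, gender = person
    # Pair each schoolmate's Euclidean distance with an opposite-gender indicator,
    # then keep the k nearest.
    nearest = sorted(
        (math.hypot(x - sx, y - sy), sg != gender) for sx, sy, sg in schoolmates
    )[:k]
    if not nearest or nearest[-1][0] == 0.0:
        return None
    bandwidth = nearest[-1][0]  # distance to the k-th nearest neighbour
    weights = [epanechnikov_weight(d, bandwidth) for d, _ in nearest]
    total = sum(weights)
    if total == 0.0:
        return None
    opposite = sum(w for w, (_, is_opposite) in zip(weights, nearest) if is_opposite)
    return opposite / total
```

One consequence of setting the bandwidth equal to the distance to the `k`-th nearest neighbour is that this neighbour receives a weight of exactly zero, so the measure is effectively driven by the remaining nearer neighbours.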
The gender of children in the neighbourhood was not an available option, but the age of children in the neighbourhood was, and was infrequently cited (around five percent). This suggests that locational choice is rarely influenced by the composition of neighbourhood children. Potential correlation with observables is investigated in the empirical section by performing balance tests in which the instrument is regressed on a set of individual and background characteristics; these are shown to have no systematic effect.[16]

[16] Several papers (such as Angrist and Evans, 1998) find and exploit the fact that girls are more likely than boys to come from larger families, particularly for lower income families. This phenomenon does not appear to be sufficiently large to have an effect on the instrument in this paper. Furthermore, the conditioning on extensive controls for family structure in this paper is likely to alleviate potential biases arising through this channel.

Second, the friendship network is defined as a set of weak ties rather than strong friendships.[17] Neighbours may exert influence without being strong friends, but are likely to be included in a weak friendship network. Two individuals are defined as friends if either nominated the other individual as a friend rather than the mutual nomination that would be indicative of a strong friendship. (The procedure for nominating and matching friends is discussed in more detail in the data section.) An alternative way of thinking about this would be defining the weak friendship network as the set of schoolmates with whom friend-like social interactions occur, and interpreting the friendship network in the data as a proxy for this network.[18]

[17] Granovetter (1973) is the seminal paper on the importance of weak ties. Several papers have recognized the independent importance of strong ties (Card and Giuliano, 2011). Patacchini et al (2012) find that both strong and weak friendships have a contemporaneous effect on high school grades, but only strong friendship effects persist in the long run. This supports using both strong and weak ties when analyzing short run education production. Lavy and Sand (2012) find that different types of friends in the classroom have different effects on learning outcomes for Israeli students transitioning from elementary to middle school, a younger population than that studied in this paper.

[18] Same-school neighbours are not required to be (at least) weak friends for the empirical strategy to be valid; it just requires that neighbour's gender be orthogonal to achievement if they are not friends.

There are two primary threats to identification. The first is that the gender of schoolmates in the close neighbourhood may affect another dimension of an individual's friendship network, and this other dimension of the friendship network may affect achievement. Two primary candidates are friendship network age composition and number of friends. For example, a male with only female schoolmates in the close neighbourhood may be more likely to befriend older (or younger) teenagers in the neighbourhood, as well as have fewer friends, both of which may affect school performance. These hypotheses cannot be ruled out, but are challenged by the finding that the gender composition of close neighbours does not affect the age composition of school friends or the number of friends.

The second concern is that the friendship nomination process may be affected by the gender composition of the close neighbourhood. For example, low-performing individuals may disproportionately nominate opposite gender neighbours as friends (to appear more popular) without them actually being friends. The consequence of this would be measurement error in friendship network gender composition (arising from self-reporting bias) being correlated with the instrument in a way that biases results.
The appendix provides a more formal discussion of this problem, and reports results that contest this hypothesis by showing that a constructed proxy for measurement error in the gender composition of the self-reported friendship network is uncorrelated with the gender composition of the close neighbourhood.

It is worth noting for exposition that the instrument would be invalid for investigating the race composition of high school friendship networks. This is because race and neighbourhood characteristics are not independent. An individual with mostly black same-school neighbours is likely to be different along a number of dimensions to another individual in the same school with mostly white neighbours, even if they share the same observable characteristics.

The negative effect of opposite gender friends on education production may arise from a variety of (non-exclusive) sources. Equation 2.1 can be interpreted as the reduced form of a simple linear model in which the share of opposite gender friends affects a vector of intermediate mechanisms $W$, which, in turn, affects academic achievement.[19]

$$W_{m,is} = \alpha_m F_i + \beta_m X_{is} + \gamma_m O_{is} + \delta_m D_g + \theta_m D_s + \varepsilon_{m,igs} \quad \text{for } m = 1, \ldots, M \qquad (2.4)$$

$$Y_{igs} = \alpha_S F_i + \beta_S X_{is} + \sum_{m=1}^{M} \gamma_{S,m} W_{m,is} + \delta_S D_g + \theta_S D_s + \varepsilon_{igs} \qquad (2.5)$$

[19] The parameter of interest in the primary specification, $\gamma = \sum_{m=1,\ldots,M} \gamma_{S,m}\gamma_m$, the effect of the share of opposite gender friends on achievement, is obtained by summing over each mechanism the products of the effect of that mechanism on the outcome and the effect of the share of opposite gender friends on that mechanism. This model does not allow feedback from the academic outcome to the mechanisms.

Using the instrument for the share of opposite gender friends in Equation 2.4 identifies the parameter $\gamma_m$, the effect of peer gender composition on the candidate mechanism $W_m$.
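The reduced-form interpretation can be made explicit by substituting Equation 2.4 into Equation 2.5. The Greek labels in the sketch below are illustrative assumptions rather than the notation of the estimated model: $\gamma_m$ denotes the coefficient on $O_{is}$ in Equation 2.4, $\gamma_{S,m}$ the coefficient on $W_{m,is}$ in Equation 2.5, and the remaining coefficients are labelled analogously.

```latex
\begin{align*}
Y_{igs} &= \alpha_S F_i + \beta_S X_{is}
  + \sum_{m=1}^{M} \gamma_{S,m}\left(\alpha_m F_i + \beta_m X_{is} + \gamma_m O_{is}
  + \delta_m D_g + \theta_m D_s + \varepsilon_{m,igs}\right)
  + \delta_S D_g + \theta_S D_s + \varepsilon_{igs} \\
&= \Big(\alpha_S + \sum_{m} \gamma_{S,m}\alpha_m\Big) F_i
  + \Big(\beta_S + \sum_{m} \gamma_{S,m}\beta_m\Big) X_{is}
  + \Big(\sum_{m} \gamma_{S,m}\gamma_m\Big) O_{is} \\
&\quad + \Big(\delta_S + \sum_{m} \gamma_{S,m}\delta_m\Big) D_g
  + \Big(\theta_S + \sum_{m} \gamma_{S,m}\theta_m\Big) D_s
  + \Big(\varepsilon_{igs} + \sum_{m} \gamma_{S,m}\varepsilon_{m,igs}\Big)
\end{align*}
```

The coefficient on $O_{is}$ in the last expression is the sum over mechanisms of the product of the two effects, so a mechanism contributes to the overall effect only if both of its coefficients are non-zero.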
Evidence that $\gamma_m \neq 0$ (gender composition affects the mechanism) and $\gamma_{S,m} \neq 0$ (the mechanism affects achievement) indicates an operating mechanism $W_m$, while $\gamma_m = 0$ rejects a candidate mechanism for the peer gender composition effect (although the mechanism could still affect achievement). The parameters $\gamma_{S,m}$ cannot be identified without additional exclusion restrictions (we cannot identify the effects of the mechanisms on academic outcomes), so $\gamma_{S,m} \neq 0$ is inferred from non-causal correlations or taken from existing empirical literature. A series of equations taking the form of Equation 2.4 is estimated to investigate the set of mechanisms that may be in operation.

An understanding of the mechanisms through which peer gender composition effects operate is useful for deepening our understanding of adolescent behaviour, as well as potentially informing policy. Friendship networks cannot be directly regulated, but policy instruments may be available to act on the channels through which these effects operate. Furthermore, knowledge of the mechanisms may also inform out-of-sample predictions. The effects of manipulating peer gender composition beyond what was originally observed are predictable only if the mechanisms continue operating in the same way. This is particularly relevant given the evidence in Carrell et al (2011) that reduced-form peer effects estimates do not inform out-of-sample predictions.

This paper broadly groups candidate mechanisms into those operating within and outside the classroom. First, opposite gender friends may reduce the quality of classroom inputs in the education production function. Abstracting away from the friendship formation process, consider that maintaining (the utility associated with) friendships requires regular interactions. Outside of the classroom, high school students typically engage in a range of gender-specific activities.[20] These activities provide ample opportunities for the own gender interactions that characterize and maintain own gender friendships.
The mixed gender classroom provides a relatively scarce opportunity for interactions with opposite gender friends. As a result, individuals in class may distract or be distracted by opposite gender friends more than own gender friends, reducing the quality of classroom inputs for individuals with a greater share of opposite gender friends.

[20] Fuligni and Stevenson (1995) and Gager et al (1999) document typical time use of American teenagers in the 1990s. Fuligni and Stevenson find that studying, part-time work, extracurricular activities (such as sports), watching television and socializing with friends each consume between 10 and 20 hours per week.

Second, opposite gender friendships may reduce the quantity and quality of non-classroom inputs in the education production function, such as homework. For example, high school social activities may be more fun if opposite gender friends are present. This increases the time spent socializing, possibly at the expense of homework, and may reduce both the quantity and quality of homework produced the subsequent day. The idea behind this class of mechanisms is that, holding all else equal, opposite gender friends increase the marginal utility of leisure, resulting in an equilibrium characterized by leisure increasing (and homework decreasing) with an increase in the share of opposite gender friends.

The evidence in the empirical section supports the first set of mechanisms over the second set of mechanisms, suggesting that friendship network gender composition effects operate within rather than outside the classroom.

2.3 Data

This paper uses data from the National Longitudinal Study of Adolescent Health (Add Health). Add Health is a school-based longitudinal study of a nationally representative sample of US adolescents who were in grades 7 to 12 during the 1994-1995 school year. The selected schools were representative of the US with respect to region of country, urbanicity, size, type, and ethnicity.
Students in each school were stratified by grade and sex, and an average of 200 students were selected from each school to form the core sample. This sample was interviewed between April and December 1995 in the first wave of the study. The second wave of the study was conducted the subsequent year, and there have been two further in-home interviews, the most recent being in 2008. This paper primarily uses data from the first wave of the study. The fourth wave of the study is used to investigate the effect of the gender composition of high school friendship networks on long-term outcomes.

2.3.1 Distance and Friendship

There are two aspects of the data that are both unique to the Add Health study and of particular importance: spatial locations[21] and friendship networks. The Euclidean distance between individuals' homes can be calculated from spatial locations recorded in the data. These locations are reported in terms of X and Y-coordinates for each individual in a school relative to an arbitrary origin. Figure 2.2 provides an example of the spatial distribution of individuals in a small school in the Add Health data. Individuals are clustered in the centre of the map, presumably near the location of the school. The grey lines connecting nodes reveal the friendship networks within the school. Females are denoted by red circles and males by blue triangles.

Another example of the spatial distribution of individuals within a school is provided by Figure 2.3. This figure highlights the identification strategy. The friendship network for an arbitrarily chosen female individual in the school is shown in grey. Her nearest twenty schoolmates are circled.
The gender composition of the individual's friendship network is instrumented by the distance-weighted gender composition of the circled individuals.[22] Six of the selected individual's eleven friends are included in the set of the twenty nearest neighbours; the proportion of matched friends within the twenty nearest neighbours is larger than the proportion of matched friends outside the nearest twenty neighbours, suggesting the role of distance in friendship formation for this individual.

[21] Spatial locations have been exploited in a small number of papers in the peer effects and education literature. Helmers and Patnam (2011), for example, incorporate spatial peer interaction into a production function of child cognitive development.

[22] The share of opposite gender friends and the distance-weighted share of opposite gender neighbours are both slightly above 0.3 for this individual.

Friendship networks are constructed using data from the first wave of the study. Surveyed individuals were asked to nominate up to five male friends and five female friends.[23] Individuals could leave nominations blank, but could not exceed the limit of five nominations per gender. These nominations were matched to other individuals in the same school using school rosters.

Sixty-eight percent of friendship nominations by individuals in the sample are matched. Unmatched nominations typically arise from two sources: nominations to individuals in another school or nominations using names that could not be matched on the school roster (for example, nominations using nicknames). The effect investigated in this paper is for the gender composition of matched friends.

[23] Some individuals were asked to nominate only one male and one female friend. The gender compositions of friendship networks computed for these individuals are interpreted as noisier proxies for the gender composition of the underlying friendship network. Results in which the sample is split by the number of friendship nominations or restricted to those with at least two friends show that restricting the number of nominations does not affect the primary conclusions.

Figure 2.4 plots the distribution of the share of matched friendship nominations (ratio of matched nominations to total nominations) by gender, showing that nominations are fully matched for about forty percent of individuals in the sample. The share of matched friendship nominations is orthogonal to the gender composition of schoolmates in the close neighbourhood; the correlation coefficient between the instrument and the share of matched nominations is -0.02. This ensures that the empirical strategy deals with potential bias introduced by the matching process (in addition to other biases), such as nominations from weak students being less likely to be matched.

Friendship networks can be defined in a variety of ways using these data. This paper primarily defines any nomination or receipt of nomination as a friendship, generating a network of weak friendships. This is the preferred definition of friendship as it includes the largest set of potential influences. Results with alternative definitions of friendships are reported in the appendix to show that the findings are generally robust to the definition of friendship that is chosen. Effects associated with the gender composition of nominated friends and the gender composition of nominating friends (individuals from whom friend nominations are received) are also considered.

The friendship nomination and sampling processes are graphically illustrated in Figure 2.5. The first and second panels show a hypothetical school with nine students of which five are randomly sampled to complete the detailed survey. The third and fourth panels show the friendship nomination process, including D nominating an individual who could not be matched to an individual in the school.
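The distinction between weak and strong friendship networks can be illustrated with a minimal sketch. The Python code below is not the procedure applied to the Add Health data; the function names and the toy nominations are hypothetical, and nominations are assumed to be already matched to the school roster. A weak tie exists if either individual nominated the other, a strong tie only if the nomination is reciprocated, and the share of opposite gender friends is undefined for an individual with no matched friends.

```python
def weak_ties(nominations):
    """Undirected friendships: two pupils are friends if either nominated the other."""
    return {frozenset((a, b)) for a, b in nominations if a != b}


def strong_ties(nominations):
    """Undirected friendships that require a reciprocated nomination."""
    directed = set(nominations)
    return {
        frozenset((a, b))
        for a, b in directed
        if a != b and (b, a) in directed
    }


def opposite_gender_share(person, ties, gender):
    """Share of a pupil's friends who are of the opposite gender (None if no friends)."""
    friends = [next(iter(tie - {person})) for tie in ties if person in tie]
    if not friends:
        return None
    return sum(gender[f] != gender[person] for f in friends) / len(friends)


# Toy data (invented): matched, directed nominations (nominator, nominee).
gender = {"A": "F", "B": "F", "C": "M", "D": "M", "E": "F"}
nominations = [("A", "C"), ("C", "A"), ("A", "B"), ("D", "A")]

weak_share_a = opposite_gender_share("A", weak_ties(nominations), gender)
strong_share_a = opposite_gender_share("A", strong_ties(nominations), gender)
```

In this toy example the weak network gives A three friends (B, C and D), two of whom are male, while the strong network retains only the reciprocated tie with C.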
The fifth panel drops the individuals who were not sampled to show the observed school friendship network. Note that this network distinguishes the direction of nominations from which alternative types of friendships can be defined. The sixth panel shows the weak friendship network used in the analysis. I is dropped as peer gender composition is not well-defined for an individual with no matched friends. The direction of nominations is no longer distinguished as any nomination defines a weak friendship.

Each friendship nomination generates a dyadic pair. Table 2.1 describes the 13,142 friendship pairs generated by matched nominations in the analyzed sample. Reciprocated nominations appear as two observations in these data, but the reported interactions may differ as they depend on the response of the surveyed individual. The first row of the table provides simple evidence that distance is a significant determinant of friendship. The mean distance between friends in a school is significantly smaller than the mean distance between two randomly-drawn individuals in a school.[24]

Males nominate a higher share of opposite gender friends than females. Slightly fewer than forty percent of friendship pairs go to each other's home, about half meet after school, and over forty percent spend time together during the weekend. Forty-four percent of friendship pairs in the data "talk about a problem"; interestingly, but perhaps unsurprisingly, this activity is much more likely in friendship pairs nominated by females. The most common activity among friendship pairs is talking on the phone, which occurs in about sixty percent of friendships in the data.

Table 2.2 provides evidence that the distance between individuals affects the intensity of their social interactions.
This table reports results from regressing binary indicators of each of the interactions with nominated friends discussed above on the distance between the two individuals, the gender (and relative gender) of the nominated friend, and a vector of individual characteristics.[25]

Conditional on being friends, individuals are more likely to go to a friend's house, meet after school and spend time together on the weekend if they live closer together. Distance also affects the likelihood of talking on the phone, even though this activity does not require spatial proximity. Under the hypothesis that talking on the phone is correlated with the strength of the friendship, this provides suggestive evidence that friends who are geographically proximate have stronger relationships. Females are less likely to meet after school or during weekends and more likely to talk on the phone or about a problem with their nominated friends. All interactions are less likely with opposite gender friends. Given that interactions are more probable with close neighbours, and that these interactions vary by the gender of the nominated friend, the variation in friendship network gender composition induced by the gender composition of close neighbours will affect an individual's weekly interactions in a meaningful manner.

[24] This comparison does not control for characteristics that may be correlated with the distance between individuals, such as the probability of being the same race. For example, in a school neighbourhood in which everyone west of the school is white and everyone east of the school is black, the mean distance between friends may be smaller just because friends are more likely to be of the same race. The evidence that distance affects friendships provided by Mouw and Entwisle (2005) conditions on several observable characteristics including race.

[25] This analysis is only possible within friendship pairs as individuals were not asked about their potential interactions with all other individuals in the school; this would be prohibitively costly in terms of data collection.

2.3.2 Descriptive Statistics

The Add Health Wave I dataset samples 20,769 individuals from 80 schools.[26] Individuals missing core demographic information, GPA scores or spatial locations are dropped.[27] The gender composition of an individual's friendship network is only well-defined when the individual has at least one friend. Individuals with no matched friends are therefore dropped from the data. Finally, schools in which fewer than twenty students remain in the sample after this process are also dropped. This leaves a final sample of 8,435 individuals from 76 schools.[28]

Descriptive statistics of the variables used in the paper are reported in Table 2.3. The primary outcome variable considered in the paper is an overall mean of self-reported grades across four subjects: English, Mathematics, Science and History. Letter-grades are converted to numerical grades by assigning fours to As and ones to Ds or lower. The overall mean grade is computed by equally weighting all non-missing subject grades for each individual.

Figure 2.6 shows the full distribution of overall high school grades by gender. There are two striking differences in the grade distributions for males and females. First, the female distribution is centred at a higher grade (the mode is 3 for females and 2.5 for males), and, second, there is a spike in the distribution for females at scores of 4 (As in all subjects). The mass of females scoring at the top of the distribution is also noted by Fortin et al (2011) and Bertrand and Pan (2011).
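The construction of the overall grade measure can be illustrated with a minimal sketch. The Python code below is hypothetical: the category labels and field names are invented for illustration, and the intermediate values (three for a B, two for a C) are an assumption implied by, but not stated in, the conversion of As to four and Ds or lower to one. Missing subject grades are skipped, so the mean weights all non-missing subjects equally.

```python
# Assumed mapping: only the endpoints (A = 4, D or lower = 1) are stated in the text.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D or lower": 1}
SUBJECTS = ("english", "mathematics", "science", "history")


def overall_gpa(report):
    """Mean of the non-missing subject grades; None if every subject is missing."""
    points = [
        GRADE_POINTS[report[subject]]
        for subject in SUBJECTS
        if report.get(subject) in GRADE_POINTS
    ]
    return sum(points) / len(points) if points else None


# Invented example: the science grade is missing, so three subjects are averaged.
example = {"english": "A", "mathematics": "B", "science": None, "history": "D or lower"}
example_gpa = overall_gpa(example)
```

Under this scheme the measure takes values in steps that depend on the number of non-missing subjects, and a student reporting As in all subjects has the maximum score of 4.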
Measurement error in self-reported GPA is computed using transcript GPA scores for a subset of the sample for whom these are available.

[26] About half of the 80 schools are school pairs. School pairs are created to represent one school when the sampled high school does not have lower grades (such as ninth grade). This is done by probabilistically matching high schools without lower grades to one feeder school in the area based on the likelihood with which students come from the set of candidate feeder schools.

[27] Individuals with and without GPA scores and spatial locations appear similar along observable dimensions, reducing the concern of sampling bias.

[28] The mean number of students per school is 422, the median is 100, and the smallest and largest schools have 20 and 1515 students, respectively. The sample includes two Catholic schools and five private schools.

The next set of variables describes the gender composition of high school friendship networks. Recall that individuals must be linked to at least one other individual to be included in the data. The type of network for which the variable mean is calculated is denoted in parentheses after the variable. The weak friendship network in which any nomination is considered to establish a friendship is the primary focus of this paper, but a description of the strong friendship network in which reciprocated nominations define a friendship is included for comparison purposes.

The mean share of opposite gender friends is slightly below 0.4 in the weak friendship network. This confirms the tendency towards nominating friends of the same gender. The distribution of the mean share of opposite gender friends is plotted in Figure 2.7. This is done for both the full sample and for a restricted sample in which only individuals matched to at least two friends are included. There are mass points at one and zero in the full sample.
This is because about sixty percent of the sample were only matched to same-gender friends or were only matched to one friend. The distribution for those with at least two friends shows the modal share of opposite gender friends to be 0.5, but retains the strong feature of a tendency towards same gender friendships.[29]

[29] The appendix reports estimation results for the restricted sample that are similar to those for the full sample. This provides evidence that results are not driven by individuals only matched to one friend.

The share of opposite gender friends in strong friendship networks (defined by reciprocated nominations) is considerably lower than that found in weak friendship networks. This shows that opposite gender friends are less likely to reciprocate nominations than friends of the same gender.

Exogenous variation in the neighbourhood gender composition of schoolmates is used to obtain identifying variation in the gender composition of friendship networks. The exact instrument is based on the distance-weighted gender composition of each individual's nearest twenty neighbours (in the same school). The next row of Table 2.3 shows that the mean share of opposite gender close neighbours is very close to the expected 0.5 in the full sample and for males and females.

The distribution of this variable is important for two reasons. First, there is a concern that all individuals may have a similar share of male and female schoolmates in their close neighbourhoods. Under this scenario, even if distance were a significant determinant of friendship, it would not generate variation in the gender composition of friendship networks. The distribution of the weighted gender composition of the nearest twenty neighbours for the full sample, as well as by gender, is plotted in Figure 2.8, confirming variation in neighbourhood gender composition.

Second, we can test whether the distribution of the share of opposite gender neighbours is consistent with a data-generating process in which location decisions are independent of the gender composition of the close neighbourhood. Parents favouring gender-balanced neighbourhoods, for example, would be evident if the standard deviation of the share of opposite gender neighbours were smaller than a comparable series based on random location decisions (although the mean share of opposite gender neighbours may be unaffected and remain 0.5). This would be a concern if these parents also systematically affect their children's school performance. We perform a Kolmogorov-Smirnov test for equality of distributions on the observed measure of neighbourhood gender composition and a constructed pseudo-measure of neighbourhood gender composition in which gender is randomly reassigned to households.[30] The null hypothesis of equality of distributions cannot be rejected, supporting the claim that location decisions and the neighbourhood gender composition are orthogonal.

Self-reported measures describing the extent to which individuals have behavioural troubles at school are used to support the hypothesis that parts of the socialization effects identified in this paper operate within the classroom. Ordinal scales for these variables are converted to numerical scales by assigning zeroes to responses of "Never" and fours to responses of "Every day", the most frequent of five categories. Males are more likely than females to report having both troubles getting along with the teacher and paying attention in school. Both types of troubles occur within the classroom. They are infrequently reported. Considering behaviours outside the classroom, males are more likely to report trouble completing homework and interacting with other students.

Socialization effects may also operate outside the school environment if opposite gender friends affect the marginal utility of leisure. The "number of friends"
variable in Table 2.3 corresponds to the number of matched friends in the weak friendship network. Conditional on being matched to at least one other individual in the data, individuals have an average of 2.6 friends. This is slightly greater for males than females. The reported number of friends is likely to be less than the true number of friends in an individual's school friendship network. This is due to the imposition of a maximum number of nominations and some nominations being unmatched.[31] The composition measure used in this paper is therefore interpreted as a proxy for the true gender composition of the friendship network.

[30] This assignment is based on birth months being odd or even, which is assumed truly random.

[31] It is probable that unmatched nominations occur more frequently within schools in which individuals were less likely to be sampled. Results (not reported) in which the sample is limited to the set of schools in which all individuals were sampled reveal the same pattern of effects.

Over one half of the sample report being in a relationship in the last 18 months. Females are more likely to report a previous romantic relationship.[32]

The gender composition of an individual's friendship network may also affect smoking and drinking behaviour (see, for example, Clark and Loheac, 2007), and smoking and drinking may affect academic achievement. About one quarter of the sample report smoking at least one day in the past month, and this does not differ by gender. Males are more likely than females to report being drunk at least one day in the past year; just under one third of males and just over one quarter of females report this behaviour.
Various other measures of smoking and drinking behaviour were also considered; they convey essentially the same information as these measures.

Finally, we are interested in the persistence of gender composition effects. The long-term outcomes of subsequent-year GPA, graduated high school, attended college and ever married are taken from the fourth wave of the Add Health study in which individuals are asked about their educational and relationship histories. This wave was conducted in 2008 when individuals were 24 to 32 years old. Ninety-five percent of the sample graduates high school and sixty-eight percent of the sample completes at least one year of post-secondary education, the definition of attending college used in this paper. The probability of males attending college is ten percentage points lower than that for females. Almost one half of the sample report being married (or previously being married).

Core demographic characteristics reported in Appendix Table A.1 provide information on the composition of the sample. Just over half the sample is white and one fifth of the sample is black.[33] Ninety percent of the sample is born in the US and the mean age is 16, corresponding approximately to the tenth grade. The means of all other control variables are also reported. These include variables describing parent characteristics, home language, household income, family structure and grade repetition.[34]

[32] The behaviours associated with "being in a romantic relationship" are likely to vary considerably across individuals in high school. Finer measures of relationship-type behaviour would be required to obtain a fuller picture of the potential effects of peer gender composition.

[33] The Add Health study over-sampled black students. Sample weights are not used in this analysis because their application to friendship pairs is ambiguous; it is not clear how friendships with over-sampled individuals should affect the gender composition of that individual's friendship network. At the same time, it is noted that results are insensitive to the inclusion of sample weights at the estimation stage, although they do affect the precision of some of the estimates.

2.4 Results

The objective of this paper is to empirically investigate the effect of gender homophily in high school friendship networks on academic achievement. Opposite gender friends are shown to have a negative effect on high school performance. Subsequent results explore whether the effect differs by gender, across school subjects, and by age. Finally, results investigating the mechanisms through which peer gender composition affects achievement are reported. Standard errors are clustered at the school level throughout the analysis.[35]

The first two columns of Table 2.4 report OLS results from regressing GPA scores on friendship network gender composition measures. The first column reports results from the model that imposes gender symmetry, and the second column reports results from the model that includes a gender interaction on the explanatory variable of interest.[36] Results in these columns show that a higher share of opposite gender friends is associated with better school performance for males, while the correlation for females (the sum of the coefficients) is close to zero. As discussed in the methodology section, this correlation could arise from unobserved parental inputs, bias in self-reported friendship nominations or other unobserved characteristics. The causal effects subsequently reported are consistently of the opposite sign. This sign reversal is consistent with individuals with large shares of opposite gender friends being positively selected, and confirms the importance of an empirical strategy to overcome the endogeneity bias in the gender composition of friendship networks.

The third and fourth columns of Table 2.4 report the direct effect of the instrument on academic achievement.
The gender composition of same-school neighbours is considered exogenous, so these estimates have a causal interpretation. An increase in the share of opposite gender schoolmates in the close neighbourhood reduces school performance.[37] The fourth column shows that the sign of the effect does not differ by gender, justifying the statistically more powerful gender-symmetric model. This paper interprets these effects to be operating through weak friendship networks.

[34] Grade repetition controls are included throughout the paper to control for potential differences in friendship network formation and effects for repeating students. Results are similar if these students are excluded from the analysis.

[35] This is conservative given the inclusion of school fixed effects and that the level of variation exploited in this paper is cross-sectional and at the individual level. The precision of estimates obtained without clustering is generally very similar.

[36] Note that the effect for males in the gender interaction model is the coefficient on the share of opposite gender friends, and the effect for females is the sum of this coefficient and the coefficient on the interaction. Only the first coefficient (the male effect) and the sum (the female effect) are reported in subsequent tables for ease of interpretation.

Goux and Maurin (2007) exploit the institutional environment in France to conclude that an adolescent's outcomes in junior high school are strongly influenced by (and not just correlated with) the performance of neighbours. Foley (2012) finds that neighbourhoods affect university participation. The reduced form result in this paper supports the hypothesis that close neighbours matter.
It provides a potential mechanism for these findings and suggests that part of the neighbourhood effect may be driven by the set of close neighbours that are in an individual's weak friendship network.

The primary causal estimates from the instrumental variable (IV) specification are reported in Table 2.5. The top panel reports results for the model in which gender symmetry in the effect is imposed. The bottom panel reports results for the model that relaxes gender symmetry, confirming that the effect is the same sign and not statistically different for males and females.

Results for the preferred model include the coefficient of interest and F-statistic from the first stage, as well as a weak IV-robust confidence interval (Moreira and Pan, 2001). The first stage coefficients are precise with reasonably-sized F-statistics across specifications. The weak IV-robust confidence intervals allow for potential nonnormality in GMM statistics arising from weak identification (as discussed in Stock et al, 2002). Andrews and Stock (2005) advocate inference based on this confidence interval given its robustness properties; opposite gender friends affect academic achievement if the confidence interval is bounded away from zero.

The negative effect of opposite gender friends is evident both without (first column) and with (second column) controls. The point estimate in the second column is negative and the corresponding weak IV-robust confidence interval does not include zero. The standard deviation of the share of opposite gender friends is 0.4, so the estimate of -1.0 means that a one standard deviation increase in the opposite gender friend share causes a 0.4 decline in GPA. In comparison to other variables in the model, this is roughly twice the magnitude of the coefficient on the female indicator. This is interpreted as a moderately-sized effect.

[37] This reduced form analysis is also performed on the original sample before individuals with no matched friends are dropped. The estimated coefficient of -0.063 (0.035) is not statistically different from the estimated coefficient of -0.132 (0.054) reported in the table.

Results in the bottom panel provide suggestive evidence that the effect is larger for females than males. The magnitude of the effect for females is consistently around three times larger than the effect for males (with the caveat that a lack of power prevents statistically distinguishing these estimates).

The subject-specific results reported in the third and fourth columns of Table 2.5 suggest that the overall effect is larger in mathematics and science than English and history.[38] Table 2.6 shows that this is driven by the absence of an effect in English and history for individuals under the age of sixteen. The negative effect of opposite gender friends for girls in their early teenage years in traditionally male-dominated school subjects is consistent with gender socialization effects in the existing developmental psychology and economics of education literature: adolescent females conform to traditional gender roles (such as not doing well in mathematics and science) in the presence of males. The reduced socialization effects on males are also consistent with previous studies and the commonly-held view that females may have more to gain from reduced gender socialization pressures (such as single-sex classrooms).

Results in Table 2.6 are constructed by splitting the sample at the age of sixteen. As discussed above, the effect for younger individuals is limited to mathematics and science, while the effect for older individuals is larger and equally prevalent across all school subjects. This table also provides insight into the operation of the instrument. Older high school students in the US are likely to be more mobile; the driving age in most states is sixteen.
This suggests that the instrument is likely to be less effective for older students as geographic distance becomes a less important determinant of friendship. The smaller and less precise first stage coefficients for older students are consistent with this hypothesis.

The mechanisms through which friendship network gender composition affects academic achievement are important for understanding how high school peers affect incentives and actions. An investigation of these mechanisms provides a fuller description of the education production function and indirectly informs policy related to gender composition in the school environment. Results in Tables 2.7 and 2.8 explore possible channels through which the gender composition of friendship groups affects school performance. Table 2.7 considers a set of school behavioural troubles and Table 2.8 investigates social behaviours.[39]

[38] Subjects are not considered individually as there are some individuals with missing subject GPA scores and grouping increases the sample sizes.

[39] A more direct mechanism may operate through the academic ability of peers. As a result of the gender gap in school performance, females with a larger share of opposite gender friends will, on average, have a larger share of less academically able friends. This, in turn, may reduce the school performance of these females. No empirical support for this hypothesis was found; the gender composition of the close neighbourhood had no effect on the ability composition of friends for males and females separately, or for the combined sample.

Individuals were asked how frequently they have trouble getting along with the teacher, paying attention in class, getting homework done and relating to other students on a five-point scale (from zero to four with four being the most frequent). The first row of Table 2.7 (OLS coefficient in GPA regression) reports strong negative correlations between the frequencies of the first three of these troubles and GPA scores. Results in the respective columns show that an increase in the share of opposite gender friends increases the frequency of trouble getting along with the teacher and paying attention in class, while the effects on trouble getting homework done and trouble with other students are similarly positive, but imprecisely measured.[40]

[40] The gender composition of friends taking the same classes is highly correlated (0.7) with the gender composition of all friends. This correlation is computed for a small subsample of individuals for whom indices indicating the extent of shared courses with schoolmates were available. It supports using the measure of gender composition based on all friends when investigating friendship network gender composition effects inside the classroom.

The significant effects on the first two school troubles are sizeable. A one standard deviation increase of 0.4 in the opposite gender friend share is associated with a 0.5 increase in the reported frequencies of trouble getting along with the teacher and paying attention in class. These are large effects given that both these series have approximate means and standard deviations of one.[41]

[41] Note that the correlations between the reported troubles and GPA scores are not causal. There are many factors correlated with these troubles and not included in the set of controls that may affect school performance. An alternative interpretation of this result is to consider the first two reported school troubles as proxies for general classroom behaviour, with the gender composition effect operating more generally through classroom behaviour.

The set of mechanisms in Table 2.8 relates to effects most likely occurring outside the classroom. The only significant gender composition effect among this set of mechanisms operates through the probability of being in a romantic relationship. An exogenous increase in the share of opposite gender friends increases the likelihood of reporting being in a romantic relationship in the past 18 months. Being in a romantic relationship is negatively associated with GPA scores when included independently of the other mechanisms (not reported), but is essentially uncorrelated with GPA scores when all four social mechanisms are included (first row). The existing literature finds a correlation but does not make a strong case for a causal relationship between romantic relationships or sexual activity and high school achievement (Halpern, 2000; Sabia, 2007). The fifth column shows that the behavioural troubles found in the classroom are not evident in the home; individuals with greater shares of opposite gender friends are not significantly more likely to have had serious arguments with their mothers in the four weeks preceding the survey.

Results in Table 2.9 provide further analysis of the three potential mechanisms for which precise estimates were obtained. The increased troubles in the classroom and probability of being in a romantic relationship are strongly evident for older high school students, but not for younger students. This is consistent with the negative effect in mathematics and science for younger students being a consequence of broader gender socialization effects rather than any of the direct effects considered here.

The gender composition of an individual's friendship network is likely to fluctuate during high school as individuals move in and out of friendship groups. The Add Health study included friendship nominations during both the initial in-school interview (in which a brief survey was administered to all individuals in each sampled school) and the subsequent Wave 1 interview (conducted on a subset of individuals at each school). The correlations between the friendship network gender compositions are positive and significant, varying between 0.3 and 0.5.
This correlation confirms the presence of an enduring component in peer gender composition, suggesting the potential for long run effects. Note that the opposite gender friend shares used in this paper are generated from Wave 1 friendship nominations, as outcomes and spatial locations are obtained from this wave.

Results in Table 2.10 consider the effect on four long-term outcomes of interest: subsequent-year GPA scores, graduated high school, attended college and ever married. These outcomes are measured in Wave 4 of the Add Health study conducted in 2008; the sample is substantially smaller due to attrition.[42] Estimates suggest an imprecise, negative effect of the opposite gender share of friends on the three academic outcomes, and a significant positive effect on the probability of ever being married. The latter finding is not surprising given that an increase in the share of opposite gender friends increases the probability of being in a romantic relationship in high school, and the likely correlation between this and ever being married. These results suggest persistence in the peer gender composition effects associated with high school friendship groups.

[42] Results using imputed values are similar to those reported here.

The identification strategy relies on the gender composition of same-school close neighbours being orthogonal to all factors affecting academic achievement other than the gender composition of weak friendship networks. Correlation with unobservables is inherently untestable, but correlation with observables is investigated graphically and statistically by regressing the instrument on observable characteristics.[43] Thinking about the gender composition of an individual's close neighbourhood as a random treatment, this loosely investigates whether assignment was truly random.

Figure 2.9 plots the mean share of opposite gender close neighbours for the four categories of mother's and father's education, as well as four bins of annual household income.
Plots suggest the absence of a systematic pattern in the gender composition of close neighbours; means are approximately 0.5 across categories for each variable for both males and females. Plots for the share of white same-school neighbours are included for comparison, and, as expected, means are no longer constant across categories for each variable. This indicates that the share of white same-school neighbours is correlated with the selected socio-economic indicators, and could not be interpreted as a random treatment.

Appendix Table A.2 confirms that the instrument is balanced across observable individual characteristics. The correlation with the female indicator variable reflects mild gender imbalance in the sample, while remaining correlations are not systematic. There are some significant correlations in the gender-specific results in the second and third columns, but these are also not systematic.

The fourth column reports correlations between the share of white same-school neighbours and individual characteristics. This column is included for comparison purposes. The significant negative correlations with mother's education and public assistance receipt confirm that the race composition of close neighbours is not balanced; white neighbourhoods differ systematically from non-white neighbourhoods along observable dimensions. This suggests that these neighbourhoods may also differ along unobservable dimensions under the assumption of correlated selection on observables and unobservables, indicating the invalidity of this approach to investigating the race composition of friendship networks.

Remaining robustness checks are reported and discussed in the appendix. These include showing that the pattern of results is not affected by the functional form of the instrument, the school-specific nomination process or the chosen definition of friendship.

[43] Altonji and Taber (2005) combine estimated selection on observables with various assumptions about selection on unobservables to bound estimates of a treatment effect.

2.5 Conclusion

Parents are typically concerned about the composition of their high school children's peer groups. High school years are considered to be particularly formative, and the general view is that friends exert considerable influence over their peers during this period. This paper supports this hypothesis by finding that an increase in the share of opposite gender friends causes a reduction in high school academic achievement. The magnitude of the effect is moderate: a one standard deviation increase in the share of opposite gender friends reduces GPA by 0.4.

An abundance of existing papers in the economics of education find that classroom gender composition matters (Hoxby, 2000; Lavy and Schlosser, 2011; Schneeweis and Zweimüller, 2011). This paper provides evidence that the gender composition of school friends also plays an important role. Part of the effect is shown to operate within the classroom environment through increased troubles getting along with the teacher and paying attention in class. These mechanisms are similar in type to those proposed by Lavy and Schlosser (2011) for the positive effect of female classmates. They find that female classmates lower the level of classroom disruption and improve relationships in the classroom through changes in classroom composition and not individual behaviour. Taken together with the results in this paper in which individual behaviour is affected by friendship network gender composition, evidence is increasingly supportive of a general hypothesis in which social interactions between genders affect classroom education production.

The negative effect of opposite gender friends for younger students is found in mathematics and science and not in English and history.
Interpreted alongside existing experimental and non-experimental studies, the exclusivity of the effect in the traditionally male-dominated science subjects, and the consistently larger estimates for females, are supportive of a gender socialization hypothesis in which young adolescent females conform to traditional gender roles in the presence of males. Given the sub-optimality of socially-constructed constraints on achievement, this suggests there may be efficiency gains from policy interventions limiting these effects.[44]

[44] One example of such a strategy may be single-sex mathematics and science classrooms. According to the National Association for Single Sex Public Education (www.singlesexschools.org), the number of coeducational schools offering single-sex classrooms has increased from around a dozen in 2002 to 390 in the 2011-2012 school year.

This study can also be interpreted in the context of the continuing debate around single-sex and mixed gender education (Halpern et al, 2011). The difficulties getting along with the teacher and paying attention in class caused by an increase in the share of opposite gender friends are unlikely to occur in single-sex classrooms that exclude opposite gender friends. These effects on classroom behaviour are also indicative of the type of channels through which achievement may be affected by reorganizing classroom gender composition, and suggest that effects may operate through more than better matches between teaching styles and the gender composition of the class.[45]

[45] There are several other factors not captured by this analysis that would need to be considered for an evaluation of single-sex education, but the evidence in this paper contributes to the debate in the absence of random assignment to single-sex and mixed gender classrooms in North America.
Jackson (2011) exploits quasi-random assignment to single-sex and mixed gender high schools in Trinidad and Tobago to investigate the effects of single-sex education.

2.6 Figures

Figure 2.1: Parent motivation for housing location
[Figure: bar chart of the share of parents citing each reason for their housing location (near old work, near current work, outgrown previous, affordable good housing, less crime, less teenage crime, close friends (parents), better schools, age of children, born here), shown separately for the most important reason and the response-weighted reason.]
Data obtained from 6404 parent interviews (75% of sample). The most important reason is an answer to the question: "Which one statement describes the most important reason why you live in this neighborhood?" The response-weighted reason is calculated from binary responses to statements of the form: "You/your household lives here because X." The interviewed parent could answer yes to as many reasons as desired.

Figure 2.2: Spatial distribution within selected school 1
[Figure: scatter plot of student locations, X co-ordinate (m) against Y co-ordinate (m), for females and males, with the complete friendship network.]

Figure 2.3: Spatial distribution within selected school 2
[Figure: scatter plot of student locations, X co-ordinate (m) against Y co-ordinate (m), marking females, males, the selected node (female), her friends and her nearest neighbours; it shows the friendship network and 20 nearest neighbours of the selected node.]
Selected female has 11 friends of which 6 are in the set of 20 nearest schoolmates. There are 152 students in the school.

Figure 2.4: Share of matched friendship nominations
[Figure: density of the share of matched friendship nominations (0 to 1) for males and females.]
The share of matched friendship nominations is defined as the ratio of matched nominations to total nominations. Sixty-eight percent of friendship nominations by individuals in the sample are matched.

Figure 2.5: Friendship nomination and sampling processes
[Figure: six panels based on a school of nine students (A to I): 1. School with nine students; 2. Five students randomly sampled; 3. Sampled student D nominates friends; 4. All sampled students nominate friends; 5. Observed school friendship network; 6. Weak friendship network.]

Figure 2.6: Distribution of grades
[Figure: density of mean grade (A=4, D or lower=1) for males and females.]

Figure 2.7: Distribution of friendship network gender composition
[Figure: two panels (all friends; at least two friends) showing the density of the share opposite gender (0 to 100) for males and females.]
Based on 8,435 observations from in-school component of Add Health survey.

Figure 2.8: Distribution of close neighbourhood gender composition
[Figure: three panels (all, males, females) showing the density of the share opposite gender (0 to 1) among the nearest 20 neighbours.]

Figure 2.9: Balance tests
[Figure: three panels (mother's education, father's education, annual household income) plotting the share opposite gender and the share white among the nearest 20 neighbours, for males and females, across categories (<High, High, <College, College for education; <20k, 20-40k, 40-60k, >60k for income).]
2.7 Tables

Table 2.1: Descriptive statistics: dyadic pairs
Mean (standard deviation)

                                        All       Males     Females
Distance
Distance between friends (m)            5,273     5,043     5,506
                                        (8,404)   (8,199)   (8,602)
                                        [98]      [136]     [139]
Distance between randomly-drawn         7,140     7,038     7,237
  schoolmates (m)                       (6,317)   (6,007)   (6,599)
                                        [68]      [94]      [101]
Gender of nominated friend
Opposite gender friend                  0.37      0.39      0.35
                                        (0.48)    (0.49)    (0.48)
Female friend                           0.52      0.39      0.65
                                        (0.50)    (0.49)    (0.48)
Interactions with nominated friend
Go to friend's house                    0.38      0.41      0.35
                                        (0.48)    (0.49)    (0.48)
Meet after school to hang out           0.51      0.53      0.50
                                        (0.50)    (0.50)    (0.50)
Spend time together during weekend      0.42      0.44      0.41
                                        (0.49)    (0.50)    (0.49)
Talk about a problem                    0.44      0.33      0.55
                                        (0.50)    (0.47)    (0.50)
Talk on the phone                       0.60      0.57      0.64
                                        (0.49)    (0.50)    (0.48)
Observations                            13,142    6,612     6,530
Share                                   1.00      0.50      0.50

Reciprocated nominations appear twice in these data. Standard deviations in parentheses. Standard errors in square brackets.

Table 2.2: OLS estimates of friend interactions on distance
Dyadic data: all nominations

                                            (1)       (2)       (3)       (4)       (5)
                                            Go to     Meet      Spend     Talk      Talk on
                                            friend's  after     time      about a   phone
                                            house     school    during    problem
                                                                w/end
Distance quantiles and interactions
(Omitted category: Large distance between friends)
Small distance between friends              0.26***   0.11***   0.14***   0.01      0.05**
                                            (0.02)    (0.02)    (0.02)    (0.02)    (0.02)
Medium distance between friends             0.08***   0.01      0.03      -0.00     0.05**
                                            (0.02)    (0.02)    (0.02)    (0.02)    (0.02)
Female x small distance                     -0.05**   -0.00     -0.01     -0.02     -0.05**
                                            (0.03)    (0.03)    (0.03)    (0.03)    (0.03)
Female x medium distance                    -0.03     0.03      -0.01     -0.02     -0.04
                                            (0.03)    (0.03)    (0.03)    (0.03)    (0.03)
Opposite gender friend x small distance     -0.10***  -0.03     -0.06**   -0.00     -0.03
                                            (0.03)    (0.03)    (0.03)    (0.03)    (0.03)
Opposite gender friend x medium distance    -0.02     0.02      -0.01     0.04      -0.01
                                            (0.03)    (0.03)    (0.03)    (0.03)    (0.03)
Female x opposite gender friend x small     0.01      -0.03     0.01      0.01      0.06
                                            (0.04)    (0.04)    (0.04)    (0.04)    (0.04)
Female x opposite gender friend x medium    -0.01     -0.05     -0.00     -0.07*    -0.00
                                            (0.04)    (0.04)    (0.04)    (0.04)    (0.04)
Gender and friend gender
Female                                      -0.03     -0.04*    -0.04**   0.31***   0.14***
                                            (0.02)    (0.02)    (0.02)    (0.02)    (0.02)
Opposite gender friend                      -0.20***  -0.21***  -0.18***  -0.02     -0.09***
                                            (0.02)    (0.02)    (0.02)    (0.02)    (0.02)
Female x opposite gender friend             0.01      0.03      0.03      -0.18***  -0.16***
                                            (0.03)    (0.03)    (0.03)    (0.03)    (0.03)
Observations                                13,141    13,141    13,140    13,140    13,140
R-squared                                   0.14      0.10      0.09      0.13      0.09

Table 2.3: Descriptive statistics: key variables
Mean (standard deviation)

                                          All       Males     Females
GPA (A=4, D or lower=1; self-reported)
Overall mean                              2.8       2.7       2.9
                                          (0.8)     (0.8)     (0.8)
Mathematics and Science                   2.7       2.6       2.8
                                          (0.9)     (0.9)     (0.9)
English and History                       2.9       2.7       3.0
                                          (0.9)     (0.9)     (0.8)
School friends
Share opposite gender                     0.38      0.38      0.37
  (any nomination)                        (0.39)    (0.39)    (0.39)
Share opposite gender                     0.17      0.19      0.16
  (reciprocated nomination)               (0.33)    (0.34)    (0.32)
Nearest 20 schoolmates (weighted)
Share opposite gender                     0.49      0.50      0.49
                                          (0.14)    (0.14)    (0.14)
School behavioural troubles (Never=0, every day=4)
Trouble getting along with teacher        0.8       0.9       0.7
                                          (0.9)     (1.0)     (0.9)
Trouble paying attention in class         1.2       1.3       1.1
                                          (1.0)     (1.0)     (1.0)
Trouble getting homework done             1.2       1.3       1.1
                                          (1.1)     (1.1)     (1.0)
Trouble with other students               0.8       0.9       0.8
                                          (1.0)     (1.0)     (1.0)
Friends, relationships, smoking and drinking behaviour
Number of friends                         2.6       2.7       2.6
                                          (2.6)     (2.6)     (2.6)
Relationship in past 18 months            0.56      0.54      0.57
                                          (0.50)    (0.50)    (0.49)
Smoked at least one day in past 30 days   0.26      0.26      0.25
                                          (0.44)    (0.44)    (0.43)
Drunk at least one day in past year       0.28      0.31      0.26
                                          (0.45)    (0.46)    (0.44)
Long-term outcomes (reduced samples)
Graduated high school                     0.94      0.93      0.95
                                          (0.23)    (0.25)    (0.22)
Attended college                          0.68      0.62      0.72
                                          (0.47)    (0.48)    (0.45)
Ever married                              0.45      0.43      0.46
                                          (0.50)    (0.49)    (0.50)
Observations                              8,435     4,124     4,311
Share                                     1.00      0.49      0.51
Table 2.4: OLS estimates of GPA on gender composition of schoolmates and close neighbours
Dependent variable: Overall GPA (A=4, D or lower=1)

                                    (1)         (2)         (3)         (4)
School friends
Share opposite gender               0.07***     0.11***
                                    (0.02)      (0.03)
Female x share opposite gender                  -0.08**
                                                (0.04)
Nearest 20 schoolmates
Share opposite gender                                       -0.13**     -0.08
                                                            (0.06)      (0.09)
Female x share opposite gender                                          -0.11
                                                                        (0.12)
Controls
Female                              0.20***     0.23***     0.19***     0.25***
                                    (0.02)      (0.02)      (0.02)      (0.06)
Other controls (a)                  x           x           x           x
School and grade fixed effects      x           x           x           x
Gender-specific correlations
Share opposite gender: males                    0.11***                 -0.08
                                                (0.03)                  (0.09)
Share opposite gender: females (b)              0.03                    -0.18**
                                                (0.02)                  (0.08)
Observations                        8,435       8,435       8,435       8,435
R-squared                           0.24        0.24        0.23        0.23

(a) Other controls include individual demographics, parent demographics and education, household income and family structure. (b) Estimate obtained by summing share opposite gender and female x share opposite gender coefficients. Indicator variables for school in saturated sample and period of interview included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.
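The gender-specific rows of Table 2.4 follow from the interaction model: the male correlation is the coefficient on the share of opposite gender friends, and the female correlation is that coefficient plus the coefficient on the interaction with the female indicator. A minimal Python sketch of this arithmetic, using the rounded coefficients printed in the table (the function name is illustrative and not taken from the thesis):

```python
def gender_specific_effects(share_coef, interaction_coef):
    """Return (male_effect, female_effect) from a gender interaction model.

    share_coef: coefficient on share opposite gender (the male effect).
    interaction_coef: coefficient on female x share opposite gender.
    """
    male_effect = share_coef
    female_effect = share_coef + interaction_coef
    return male_effect, female_effect


# Column (2) of Table 2.4: school friends.
male_friends, female_friends = gender_specific_effects(0.11, -0.08)

# Column (4) of Table 2.4: nearest 20 schoolmates.
male_neigh, female_neigh = gender_specific_effects(-0.08, -0.11)

print(round(male_friends, 2), round(female_friends, 2))
print(round(male_neigh, 2), round(female_neigh, 2))
```

Summing the rounded column (4) coefficients gives -0.19 rather than the -0.18 printed in the table, presumably because the table rounds each unrounded estimate separately. The standard error of the female effect cannot be recovered this way, since it also depends on the covariance between the two coefficients, which the table does not report.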
Table 2.5: IV estimates of GPA on gender composition of high school friendship networks

                                  Overall GPA                 Math &        English &
                                  (A=4, D or lower=1)         Science GPA   History GPA

Gender-symmetric effects
                                  (1)           (2)           (3)           (4)
School friends
Share opposite gender             -0.84*        -1.05**       -1.58**       -0.67
                                  (0.51)        (0.53)        (0.72)        (0.54)
  weak IV-robust 95% CI           [-2.3, 0.1]   [-2.6, -0.2]  [-4.0, -0.5]  [-2.1, 0.3]
Controls
Female                            0.21***       0.18***       0.13***       0.24***
                                  (0.02)        (0.02)        (0.03)        (0.02)
Other controls (a)                              x             x             x
School and grade fixed effects    x             x             x             x
First-stage coefficients (b)
Share opposite gender in close    0.13***       0.13***       0.12***       0.13***
  neighbourhood                   (0.03)        (0.03)        (0.03)        (0.03)
Diagnostics
F-stat on excluded instrument     16.12         15.32         12.17         15.03

Gender-specific effects (c)
                                  (5)           (6)           (7)           (8)
School friends
Share opposite gender: males      -0.47         -0.54         -0.89         -0.33
                                  (0.88)        (0.80)        (1.14)        (0.79)
Share opposite gender: females    -1.28         -1.63         -2.37         -1.05
                                  (1.05)        (1.15)        (1.79)        (1.04)
p-value of gender difference      0.61          0.50          0.56          0.63
Controls
Female                            0.52          0.60          0.68          0.51
                                  (0.61)        (0.61)        (0.95)        (0.55)
Other controls                    x             x             x             x
School and grade fixed effects    x             x             x             x
Observations                      8,435         8,435         8,169         8,410

(a) Other controls include individual demographics, parent demographics and education, household income and family structure. (b) Each coefficient is from the corresponding first stage regression for that column. (c) These models include interaction female x share opposite gender, so female estimate obtained by summing share opposite gender and female x share opposite gender coefficients. Indicator variables for school in saturated sample and period of interview included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table 2.6: IV estimates of academic achievement by age

                                  Overall GPA   Math and      English and
                                                Science GPA   History GPA

Age ≤ 16
                                  (1)           (2)           (3)
School friends
Share opposite gender             -0.70         -1.39*        0.09
                                  (0.60)        (0.82)        (0.61)
Controls
Female                            0.19***       0.14***       0.26***
                                  (0.03)        (0.04)        (0.03)
Other controls                    x             x             x
School and grade fixed effects    x             x             x
First-stage coefficients
Share opposite gender             0.14**        0.14***       0.14***
                                  (0.05)        (0.05)        (0.05)
F-stat on excl instrument         9.15          8.76          9.07
Observations                      4,142         4,133         4,134

Age > 16
                                  (4)           (5)           (6)
School friends
Share opposite gender             -1.60**       -1.83*        -1.88*
                                  (0.81)        (1.01)        (1.00)
Controls
Female                            0.17***       0.12***       0.23***
                                  (0.03)        (0.03)        (0.03)
Other controls                    x             x             x
School and grade fixed effects    x             x             x
First-stage coefficients
Share opposite gender             0.11**        0.09**        0.11***
                                  (0.04)        (0.04)        (0.04)
F-stat on excl instrument         8.85          5.94          8.82
Observations                      4,293         4,036         4,276

Indicator variables for school in saturated sample and period of interview included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table 2.7: IV estimates of potential mechanism - school and classroom behaviours

                              Trouble       Trouble       Trouble       Trouble
                              getting       paying        getting       with other
                              along with    attention     homework      students
                              teacher       in class      done

OLS coefficient in            -0.10***      -0.06***      -0.15***      0.02
  GPA regression (a)          (0.01)        (0.01)        (0.01)        (0.01)

Gender-symmetric effects (b)
                              (1)           (2)           (3)           (4)
School friends
Share opposite gender         1.22*         1.27*         1.11          0.69
                              (0.73)        (0.75)        (0.76)        (0.58)
Controls
Female                        -0.18***      -0.13***      -0.20***      -0.03
                              (0.03)        (0.03)        (0.03)        (0.03)
All other controls            x             x             x             x
Observations                  8,493         8,492         8,491         8,491

(a) Estimates in this row from OLS regression of GPA on potential mechanisms. (b) Results from gender-specific regressions not reported as no significant gender differences. Indicator variables for school in saturated sample and period of interview included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.
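The IV estimates in Tables 2.5 to 2.7 use one instrument for one endogenous regressor. In that just-identified case the IV coefficient equals the reduced-form coefficient divided by the first-stage coefficient. The sketch below illustrates this on simulated data, not on Add Health data: the data-generating values, variable names and helper functions are all assumptions made for illustration, and the controls and fixed effects used in the thesis are omitted.

```python
import random


def slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def simulate(n=100_000, true_effect=-1.0, seed=0):
    """Simulate an instrument z, an endogenous regressor x and an outcome y."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)  # hypothetical exogenous instrument
        ui = rng.gauss(0, 1)  # unobserved confounder affecting x and y
        xi = 0.5 * zi + ui + rng.gauss(0, 1)  # first stage
        yi = true_effect * xi + 2.0 * ui + rng.gauss(0, 1)  # outcome
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
ols = slope(x, y)  # biased upwards by the confounder
first_stage = slope(z, x)  # effect of the instrument on the regressor
reduced_form = slope(z, y)  # effect of the instrument on the outcome
iv = reduced_form / first_stage  # just-identified IV (Wald) estimate

print(f"OLS: {ols:.2f}, first stage: {first_stage:.2f}, "
      f"reduced form: {reduced_form:.2f}, IV: {iv:.2f}")
```

In the simulation the OLS slope is pulled towards zero by the confounder while the ratio recovers the assumed effect of -1.0, which mirrors the sign reversal between the OLS and IV results in the chapter. Applied to the rounded numbers in the tables, the same ratio gives roughly -0.13 / 0.13 = -1.0 for overall GPA (reduced form from column (3) of Table 2.4, first stage from column (2) of Table 2.5), close to the reported IV estimate of -1.05; the remaining gap presumably reflects rounding.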
TablesTable 2.8: IV estimates of potential mechanism - social and home behavioursRelationshipin Smoked in SeriousNumber of past 18 past 30 Drunk in argumentfriends months days past year with momOLS coefficient inGPA regressiona 0.02*** 0.00 -0.29*** -0.13*** -0.08***(0.01) (0.01) (0.03) (0.02) (0.02)Gender-symmetric effectsb(1) (2) (3) (4) (5)School friendsShare opposite gender -1.73 0.89** -0.08 0.31 0.46(1.13) (0.38) (0.28) (0.26) (0.30)ControlsFemale -0.03 0.05*** -0.00 -0.04*** 0.08***(0.05) (0.02) (0.02) (0.01) (0.01)All other controls x x x x xObservations 8,497 8416 8,440 8,483 7,984aEstimates in this row from OLS regression of GPA on potential mechanisms. bResultsfrom gender-specific regressions not reported as no significant gender differences. Indicatorvariables for school in saturated sample and period of interview included. Robust standarderrors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.442.7. TablesTable 2.9: IV estimates of selected mechanisms by ageTrouble Trouble Relationshipgetting paying inalong with attention past 18teacher in class monthsAge ? 16(1) (2) (3)School friendsShare opposite gender 0.76 0.36 0.09(0.95) (0.79) (0.47)ControlsFemale -0.19*** -0.12*** 0.03(0.04) (0.03) (0.02)All other controls x x xObservations 4,141 4,141 4,133Dependent variableMean 0.94 1.19 0.46[standard deviation] [0.99] [1.00] [0.50]Age > 16(4) (5) (6)School friendsShare opposite gender 1.60** 2.41** 1.89***(0.77) (1.11) (0.66)ControlsFemale -0.18*** -0.14*** 0.07**(0.03) (0.04) (0.03)All other controls x x xObservations 4,293 4,292 4,283Dependent variableMean 0.74 1.27 0.65[standard deviation] [0.89] [1.03] [0.48]Indicator variables for school in saturated sample and period ofinterview included. Robust standard errors clustered by schoolin parentheses. *** p<0.01, ** p<0.05, * p<0.1.452.7. 
Table 2.10: IV estimates of long-term effects of peer gender composition

                                  Subsequent      Graduated       Attended        Ever
                                  year GPA        high school     college         married
Gender-symmetric effectsᵃ         (1)             (2)             (3)             (4)
School friends
  Share opposite gender           -0.68           -0.13           -0.13           0.52**
                                  (0.52)          (0.11)          (0.25)          (0.26)
Controls
  Female                          0.19***         0.01            0.08***         0.04***
                                  (0.02)          (0.01)          (0.01)          (0.01)
  All other controls              x               x               x               x
Observations                      5,822           6,646           6,647           5,894

ᵃ Results from gender-specific regressions not reported as no significant gender differences. Indicator variables for school in saturated sample and period of interview included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Chapter 3

If At First You Don't Succeed: Negative Externalities in High School Course Repetition

3.1 Introduction

The questions of whether low-achieving students should be retained in a grade or required to repeat a failed course are answered by the extent to which grade or course repetition affects the retained or repeating individual and the extent to which grade or course repeaters affect their classmates. An extensive literature has investigated the effect of grade or course repetition on the individual, but there is a surprising lack of evidence on the potential effects of grade or course repeaters on their classmates.⁴⁶ This paper addresses the gap in the literature by investigating whether course repeaters in high school mathematics courses exert significant negative externalities on their course-mates. Using individual and school-specific course fixed effects to control for ability and course selection, it shows that doubling the number of repeaters in a given course (holding the number of course-takers constant) results in a 0.15 reduction in GPA scores for first-time course-takers.

Many US states have both increased the number of mathematics credits required for high school graduation and specified particular mathematics courses that need to be passed (Reys et al, 2007).
Media reports indicate that this has increased the likelihood of repetition for students who fail high school mathematics courses (Helfand, 2006). Seven percent of students in the sample are repeating a failed mathematics course. For Algebra I, this increases to fifteen percent. The effects of repetition in high school mathematics courses are therefore important to understand. The state-specific policies as of 2006 are summarized in Tables 3.1 and 3.2, and confirm that a majority of US states have specific mathematics requirements for high school graduation. The negative externalities exerted by repeaters on their classmates found in this paper suggest a cost to course repetition ignored by previous analyses, and, to the extent that the above policies encourage course repetition, a cost to these policies that has been overlooked by policy-makers.⁴⁷

⁴⁶ Lavy, Paserman and Schlosser (2011) come closest by investigating how the share of students who are old for their grade (having been retained) affects their same-grade schoolmates.

⁴⁷ There may, of course, be a benefit or cost experienced by the repeating individual. This is not the focus of this paper, but is clearly important for a complete policy analysis. Rose and Betts (2004) find that advanced high school mathematics courses have greater effects on students' earnings a decade after graduation than less advanced courses. This may be interpreted as suggesting a possible benefit to repeating and passing a difficult mathematics course. A recent article in the New York Times (Hacker, 2012) criticizing policies that require algebra for high school graduation evoked substantial debate and strong opinions on both sides, although little in the way of convincing empirical evidence.

Understanding the externalities imposed by repeaters in high school mathematics courses may also inform the grade retention debate.
This is because both grade retention and course repetition result in students being exposed to a set of low-achieving classmates who are likely to share similar characteristics.⁴⁸ To the extent that repeating and retained students exert similar externalities on their classmates, this paper suggests grade retention analyses should include effects exerted on classmates of the retained individual.

Course repeaters may exert externalities on their course-mates in a variety of ways. These course composition effects can be grouped into two categories: general effects arising from repeaters being low-achievers and specific repeater effects not exerted by other low-achievers. Low-achieving students are likely to disproportionately extract teacher inputs or redirect teacher inputs away from first-time course-takers. They may need more time to understand concepts, slowing the pace of the class, and may also be more likely to misbehave in the classroom, requiring teacher intervention, given that disruptive behavior is generally negatively correlated with classroom ability. Low-achieving classmates may also be more likely to directly distract their classmates, lowering education production even without affecting teacher inputs.

In addition to these low-achiever effects, course repeaters may exert additional externalities specifically related to failing and retaking a course. They may be bored and inattentive when encountering course material for the second time, increasing the likelihood of disruptive behavior. Repeaters may also have a poor attitude or be uncooperative because they failed the course the previous year, and this may negatively affect both their classmates and the teacher.⁴⁹

⁴⁸ The effects of grade retention and course repetition on the individual, however, are likely to differ along several dimensions. This is primarily because retained and repeating students are likely to be of different ages and maturities (retention typically occurs in junior and middle schools while course repetition typically occurs in high school). In addition, retained and repeating students are exposed to a different peer group shock (retained students repeat all courses associated with a particular grade so are exposed to a completely new set of peers while repeating students are only exposed to new peers in the course they repeat).

⁴⁹ Another potential repeater mechanism operates in the other direction; repeaters may provide examples of the consequences of failure, incentivizing more effort from first-time course-takers at risk of failing. This paper finds an overall negative effect, so this channel is at most a mitigating factor.

Course repeaters may also exert externalities through course size and class assignment (for courses with more than one class). Course size effects are fully controlled for in the estimation procedure, but are unlikely to be a factor given the large changes in class sizes required to observe effects.⁵⁰ Class assignment may matter if repeaters are assigned to classes non-randomly. For example, repeaters may be assigned to the best teacher for a particular course if failing for a second time is particularly costly (either from the perspective of the school or the student). This may increase the likelihood of first-time course-takers being assigned to another class with a worse teacher, leading to poorer performance for first-time course-takers.

The primary focus of this study is an analysis of the combined low-achiever and repeater effects that course repeaters exert on their course-mates. This is the appropriate level of analysis for an overall evaluation of course repetition effects. Secondary results attempt to separate the general low-achiever and specific repeater externalities.

The paper uses a fixed effects strategy on longitudinal transcript data for multiple cohorts of US high school students to estimate the causal effect of course repeaters on their classmates.
Essentially, the study compares the achievement of first-time course-takers in the same mathematics course (such as Algebra I) in the same high school in different years, using year-to-year variation in the number of repeaters in the course to identify the effect. It is assumed that unobserved year-specific shocks to classroom education production in the previous year provide variation in the number of course repeaters. An example of this is an absent teacher causing a higher course failure rate.

Holding the number of students in a course constant (either parametrically or using course size fixed effects), the academic achievement of first-time course-takers is shown to be negatively correlated with the number of repeaters in the course that year. This relationship is robust to a variety of different specifications. The effect is concentrated in the lower and middle parts of the achievement distribution, and males and females are similarly affected. Suggestive evidence that the negative externalities exerted by course repeaters are due to their being low-achieving and repeating is provided.

These results are best compared with those obtained by Lavy, Paserman and Schlosser (2011). Defining low-ability students as students who are old for their grade (most likely having repeated kindergarten or first grade), they find that the proportion of low-ability peers is negatively correlated with the academic achievement of regular students. Variation in the composition of seven adjacent cohorts of 10th grade students in Israeli high schools (from 1994 to 2000) is used to identify the effect. It is argued that the majority of students had little experience with their peers prior to entering high school, so results are not driven by common cohort-specific shocks.

⁵⁰ See, for example, Hanushek (1999).

This paper has three key distinctions from Lavy et al (2011).
First, we observe course enrolment and achievement for all students in a set of high schools for multiple years, allowing the inclusion of both individual and school-specific course fixed effects. This approach deals with potentially confounding individual effects (such as cohort-specific shocks and ability differences) and course effects (such as repeaters being more likely to repeat difficult courses) that cannot be dealt with using repeated cross-sectional data. Second, it isolates the effects of course-mates rather than grade-mates. Students in the same grade may have little interaction and may not take many of the same courses, which would attenuate effects for analyses performed at the grade level. And, third, it focuses on high school mathematics courses in the US, which is particularly relevant given that policies stipulating minimum mathematics requirements for graduation in US high schools increase the likelihood of mathematics course repetition.

Repeaters are low-achieving peers for first-time course-takers. Results can therefore be compared with the literature investigating ability peer effects in high school. These papers exploit a variety of identification strategies and typically find moderately-sized, negative achievement effects for individuals exposed to low-ability peers.⁵¹

The externalities exerted by course repeaters may also be placed in the context of the related literature investigating the effects of grade retention. Babcock and Bedard (2011) investigate the long run effects of primary school retention rates on both the retained and the promoted. They cannot separate the effects of retention for the retained and promoted, but find that a one standard deviation increase in early grade retention is associated with a 0.7 percent increase in mean male hourly wages that is evident throughout the wage distribution.
They do, however, find that the relationship between retention rates and educational attainment is statistically insignificant and economically small, and acknowledge the possibility that retention rates may affect the retained and promoted in opposite directions. In the same way, it is plausible that repetition may be beneficial to the repeating student and costly to the classmates of the repeating student.

The literature investigating the causal effect of retention on the retained has exploited a variety of policies to overcome selection into retention. It provides evidence of both positive and negative effects. Positive achievement effects of retention for third grade students are found by Jacob and Lefgren (2004) and Greene and Winters (2007). These papers use Chicago and Florida accountability policies respectively to obtain exogenous variation in grade retention. Ding (2010) finds that holding children back in kindergarten has positive but diminishing effects on their academic performance up to third grade. Eide and Showalter (2001) use kindergarten entry dates as an instrument for retention and find that retention reduces the probability of dropping out of high school for white students.⁵² These findings suggest generally positive effects of retention on students retained up to the third grade.

⁵¹ See, for example, Lavy et al (2012) and Burke and Sass (2013).

The effects for older students (more like those studied in this paper) appear to be more nuanced. Jacob and Lefgren (2004, 2009) find that retention in the sixth grade does not significantly affect achievement or high school graduation, while retention in the eighth grade reduces the probability of high school graduation.
Using data from junior high schools in Uruguay and a policy of automatic grade failure for certain low-achieving students, Manacorda (2012) shows that grade failure increases dropout rates and lowers educational attainment.

Fruehwirth, Navarro and Takahashi (2011) recognize that retention effects are likely to differ by the grade at which the student is retained and the unobservable behavioural and cognitive abilities of the student. They allow for heterogeneous effects in their econometric model and obtain generally negative effects from retention, suggesting grade retention is not an effective policy for raising the performance of low-ability students.

The remainder of this paper is organized in the usual way: methodology, data, results and then interpretation.

3.2 Empirical Methodology

The academic achievement of first-time course-taker $i$ in course $j$, high school $s$, cohort $c$ and year $t$ is modeled as a linear function of the natural logarithm of one plus the number of repeaters in the course $R_{jst}$,⁵³ the natural logarithm of the number of students in the course $C_{jst}$ (course size), and a composite error term:

\[ \mathit{GPA}_{ijsct} = \beta \ln(1 + R_{jst}) + \gamma \ln C_{jst} + \alpha_i + \delta_{js} + \theta_{sc} + \tau_t + \lambda_{jst} + \varepsilon_{ijsct} \tag{3.1} \]

⁵² Estimates for black students were uninformative.

⁵³ The addition of one to the number of repeaters ensures that the natural logarithm of zero is avoided. Results are qualitatively similar when courses with zero repeaters are dropped from the sample.

The coefficient $\beta$ represents the level change in student GPA score for a percentage change in the number of course repeaters. The $\ln C_{jst}$ term controls for the potential negative effect of course size on achievement and is important because of the mechanical relationship between the number of course repeaters and course size.
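As an illustration of how the two regressors in equation (3.1) could be built from student-year records, the following minimal Python sketch counts course-takers and repeaters within each school-course-year cell and takes logs. The record layout, the function name `course_year_regressors` and the toy data are hypothetical and are not taken from the Add Health files.

```python
import math
from collections import Counter

# Hypothetical student-year records: (school, course, year, is_repeater).
records = [
    ("s1", "algebra1", 1994, False),
    ("s1", "algebra1", 1994, True),
    ("s1", "algebra1", 1994, False),
    ("s1", "algebra1", 1994, True),
    ("s1", "geometry", 1994, False),
    ("s1", "geometry", 1994, False),
]


def course_year_regressors(rows):
    """Map each (school, course, year) cell to (ln(1 + R), ln C)."""
    size = Counter()       # C: number of course-takers in the cell
    repeaters = Counter()  # R: number of repeaters in the cell
    for school, course, year, is_repeater in rows:
        cell = (school, course, year)
        size[cell] += 1
        repeaters[cell] += int(is_repeater)
    return {
        cell: (math.log(1 + repeaters[cell]), math.log(size[cell]))
        for cell in size
    }


regressors = course_year_regressors(records)
for cell, (ln_rep, ln_size) in sorted(regressors.items()):
    print(cell, round(ln_rep, 3), round(ln_size, 3))
```

Cells with no repeaters receive a value of zero for the first regressor, which is the purpose of adding one before taking the logarithm.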
Without controlling for course size, estimates of the negative externalities exerted by repeaters would exaggerate the effect.

This parametrization of the education production function is chosen so estimated coefficients are easy to interpret. Results from a variety of other specifications in which course repeaters enter linearly, quadratically and as shares are reported in the appendix. The interpretation of results is consistent across specifications.

The error term is modeled to consist of individual ability $\alpha_i$, school-specific course difficulty $\delta_{js}$, a school-specific cohort effect $\theta_{sc}$, a general time trend $\tau_t$, a school-specific course time trend $\lambda_{jst}$, and a remaining idiosyncratic shock to achievement $\varepsilon_{ijsct}$.

Several components of this error term may be correlated with the number of course repeaters, which would bias estimates of the effect. A variety of fixed effects and time trends are included in the estimation to remove these potential biases.⁵⁴ Several of these rely on observing multiple years of student achievement in high school for multiple cohorts, representing an advantage over repeated cross-sectional analyses or longitudinal analyses of one cohort.

\[ \mathit{GPA}_{ijsct} = \beta \ln(1 + R_{jst}) + \gamma \ln C_{jst} + \delta_{js} + \tau_t + \lambda_{jst} + \alpha_i + \varepsilon_{ijsct} \tag{3.2} \]

School-specific course fixed effects $\delta_{js}$ control for course difficulty as well as any other course-specific factor affecting both the achievement of first-time course-takers and the number of course repeaters. A positive correlation between course difficulty and the number of course repeaters is expected if students are more likely to fail and repeat difficult courses. Alternatively, low-ability students who consider themselves more likely to repeat a course may select out of difficult courses (if the course is not required for graduation). This would generate a negative correlation between course difficulty and the number of repeaters.
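The first of these two scenarios can be illustrated with a small, noise-free Python sketch. It is not the estimation procedure used in this paper: it uses a single set of course fixed effects, made-up course effects (`COURSE_EFFECT`) and a made-up coefficient (`BETA`), and only shows how demeaning within course removes a course-level difference that a pooled regression would attribute to repeaters.

```python
import math
from collections import defaultdict


def slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's mean (the within transformation)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


BETA = -0.15  # hypothetical true effect of ln(1 + repeaters) on GPA
COURSE_EFFECT = {"harder": -0.5, "easier": 0.0}  # hypothetical difficulty effects

# (course, number of repeaters) in three different years; the harder course
# has more repeaters, so difficulty and repeaters are positively correlated.
cells = [("harder", 8), ("harder", 10), ("harder", 12),
         ("easier", 1), ("easier", 2), ("easier", 3)]

courses = [c for c, _ in cells]
x = [math.log(1 + r) for _, r in cells]
gpa = [3.0 + COURSE_EFFECT[c] + BETA * xi for c, xi in zip(courses, x)]

pooled = slope(x, gpa)  # mixes the repeater effect with course difficulty
within = slope(demean_by_group(x, courses), demean_by_group(gpa, courses))
print(f"pooled: {pooled:.3f}, within course: {within:.3f}")
```

In this toy case the pooled slope is more negative than `BETA`, while the within-course slope recovers it; with the opposite correlation between difficulty and repeaters the sign of the discrepancy would reverse.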
The net direction of the correlation between course difficulty and the number of course repeaters could be either positive or negative, which respectively would bias the effect upwards or downwards in the absence of these fixed effects.

⁵⁴ These are implemented using a two-stage procedure in which fixed effects are applied to demean variables in the first stage before the analysis is performed on the demeaned variables in the second stage. This is because the final estimation is only performed on first-time course-takers, but the fixed effects need to capture the influence of repeaters.

Year fixed effects $\tau_t$ and linear school-specific course trends $\lambda_{jst}$ control for any correlated trends in student achievement and course repetition. Consider grade inflation. Every subsequent year, fewer students fail a given course, resulting in fewer course repeaters every subsequent year. At the same time, first-time course-takers perform better every year. This generates a pattern of increased achievement associated with fewer course repeaters that has nothing to do with course repetition. In the absence of this set of controls, estimates of the effect of course repeaters would be upwardly-biased.

School-specific cohort fixed effects $\theta_{sc}$ may be included to control for cohort effects. An alternative approach to dealing with cohort effects is the inclusion of individual fixed effects $\alpha_i$.⁵⁵ Individual fixed effects are preferred as they improve precision by controlling for individual ability. They also control for other forms of individual selection not considered in the above discussion.

Finally, it is noted that grading to a curve would bias the estimated effects. Repeaters are low-achieving students, so maintaining a constant course average in the presence of an increase in the number of repeaters would necessitate higher GPA scores for first-time course-takers.
This would attenuate estimates of the externalities exerted by course repeaters on first-time course-takers, so results would be a lower bound of the true effect. However, variation in the unconditional means of school-specific course GPA scores in different years suggests that year-to-year grading to a curve may not be that pervasive.

Descriptive statistics include results from ordinary least squares regressions of current achievement on an individual's past mathematics course achievement (such as failing and repeating the course). Estimates from this equation do not have a causal interpretation as we cannot control for non-random selection into course repetition, but are included to describe what happens to individual students when they repeat a course.⁵⁶

⁵⁵ Individual fixed effects nest cohort fixed effects as individuals belong to one cohort.

⁵⁶ Existing studies have used a variety of natural experiments and policies to obtain causal estimates of this relationship. These are discussed in the introduction.

Placebo tests in which achievement depends on the number of repeaters in the same course and same school but in different years are conducted using the following equation, where $p \in \{t-1, t, t+1, t+2\}$:

\[ \mathit{GPA}_{ijsct} = \beta_p \ln(1 + R_{jsp}) + \gamma_p \ln C_{jsp} + \delta_{js} + \tau_t + \lambda_{jst} + \alpha_i + \varepsilon_{ijsct} \tag{3.3} \]

The number of repeaters at time $t-1$ and time $t+2$ should be uncorrelated with the achievement of first-time course-takers at time $t$, so it is expected that $\beta_{t-1} = \beta_{t+2} = 0$. In addition, there may be a negative relationship between the number of repeaters at time $t+1$ and the achievement of first-time course-takers at time $t$.
This is because there may be fewer repeaters when first-time course-takers perform well the previous year and more repeaters when first-time course-takers perform poorly.

The general low-achiever and specific repeater effects are separated by including separate variables for the number of course repeaters who failed the course the previous year $R$ (the variable used in the primary specification above), the number of course-mates who failed their mathematics course the previous year $F$ (but are not necessarily repeating the course that they failed), and the number of students who are repeating the course even though they may not have failed it the previous year $Q$.⁵⁷

\[ \mathit{GPA}_{ijsct} = \beta_R \ln(1 + R_{jst}) + \beta_F \ln(1 + F_{jst}) + \beta_Q \ln(1 + Q_{jst}) + \gamma \ln C_{jst} + \varepsilon_{ijsct} \tag{3.4} \]

The effect of the number of students who failed their mathematics course the previous year $\beta_F$ captures externalities exerted by low-achieving course-mates, while specific repeater externalities are reflected in the coefficient on the number of students who are repeating without necessarily having failed $\beta_Q$. These effects are contrasted with the externalities associated with course-mates repeating the course after failing the course the previous year $\beta_R$.

Results from this specification need to be interpreted with some caution. First, the variables $R$, $F$ and $Q$ are highly collinear, increasing specification sensitivity and reducing out-of-sample performance. And, second, the variation that identifies the coefficients is generated by endogenous student choices. We do not have a policy or natural experiment that determines whether a student who fails a mathematics course chooses to repeat it or chooses to do another mathematics course, and this may affect the externalities they exert.

⁵⁷ There are a surprisingly large number of students repeating a course after passing it the previous year. This phenomenon is discussed in more detail in the data section.

Mechanisms through which course repeaters exert negative externalities are investigated by considering how the number of repeaters in a course affects the self-reported educational experience of first-time course-takers. Students were surveyed during high school and asked about the frequency with which they experienced a set of difficulties in the classroom $D_{isct}$. These data are only available for two years (there were only two survey waves conducted while students were in high school) and for a small number of students, so these estimations do not include individual fixed effects:

\[ D_{isct} = \beta \ln(1 + R_{jst}) + \gamma \ln C_{jst} + \delta_{js} + \tau_t + \lambda_{jst} + \varepsilon_{ijsct} \tag{3.5} \]

The subsequent section provides a full description of the data used in the analysis.

3.3 Data and Descriptive Statistics

This paper uses data from the National Longitudinal Study of Adolescent Health (Add Health). The Add Health is a school-based longitudinal study of a nationally representative sample of US adolescents who were in grades 7 to 12 during the 1994-1995 school year. A core sample was selected to participate in a series of detailed surveys, the most recent being in 2008 when individuals were aged between 24 and 32.

Complete high school transcript data (grades 9 to 12) are available for individuals selected for the core sample. For all of these individuals, the transcript data include a categorization of the mathematics course taken in every year of high school (or an indication that no mathematics courses were taken that year), the GPA score obtained in each of these courses, and a failure index variable describing whether the student passed or failed these courses.⁵⁸ This information is required for all students in a school in order to accurately compute the course composition measures used in the analysis. If transcript information is only available for a subset of students in the school, information is only available for a subset of a student's course-mates.
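The need for complete transcript coverage can be seen in a toy Python example. The roster, the repeater flags and the every-second-student sampling rule below are all hypothetical; the sketch only shows that counting repeaters among sampled students understates the number of repeaters a student actually shares a course with.

```python
# Hypothetical course-year roster: student id -> repeating a failed course?
roster = {student_id: student_id in {1, 4, 7, 10} for student_id in range(12)}


def composition(students):
    """Return (course size, number of repeaters) for the given students."""
    return len(students), sum(students.values())


# Full coverage, as in a school where every student has transcript data.
full_size, full_repeaters = composition(roster)

# Partial coverage: transcripts observed only for even-numbered students.
sampled = {sid: flag for sid, flag in roster.items() if sid % 2 == 0}
seen_size, seen_repeaters = composition(sampled)

print("full coverage:", full_size, full_repeaters)
print("partial coverage:", seen_size, seen_repeaters)
```

Both the course size and the repeater count are mismeasured under partial coverage, and the size of the error depends on which students happen to be sampled.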
The study is therefore restricted to fifteen schools in which all students in the school were selected for the core sample. This is known as the saturated sample. Figure 3.1 plots the number of students who enrolled in at least one mathematics course in each of these schools, showing that there are two large schools and thirteen smaller schools.

⁵⁸ A subset of students may have taken more than one mathematics course in a given year. For these students, the provided course categorization is for the highest level mathematics course taken that year, the reported GPA score is the mean GPA score over all mathematics courses taken, and the failure index describes the share of mathematics courses failed.

The analysis is restricted to the years between 1992 and 1996 to ensure that courses are mostly populated by students included in the above sample. The pooled sample includes 6341 student-years. Appendix Table B.1 reports the demographics of the sample. There are 2270 unique students in the sample, so course achievement data are observed an average of 2.8 times per student. There are 3191 student-year observations describing the achievement of male students and 3150 student-year observations describing the achievement of female students. Descriptive statistics are provided in Table 3.3. Female students consistently perform better than male students across all measures of academic achievement.

Past achievement for each student in each year is described by three variables: an indicator for repeating a mathematics course that was failed the previous year, an indicator for failing any mathematics course the previous year (without necessarily repeating it the subsequent year), and an indicator for repeating the same mathematics course (without necessarily having failed it the previous year). There are some student-year observations with missing past achievement information.
These students are considered first-time course-takers, although removing them from the sample does not change the results. Figure 3.2 reports the distribution of mathematics GPA scores by previous performance. As expected, first-time course-takers perform considerably better than repeat course-takers, although there are some repeat course-takers who obtain the maximum GPA score of 4.

The first column of Table 3.3 indicates that 7 percent of students in the sample are repeating a failed mathematics course.⁵⁹ The externalities exerted by these students on first-time course-takers are the primary focus of this study. The other two achievement indicators provide secondary evidence to distinguish the externalities associated with general low-achievement and specific repetition: 17 percent of students failed their previous mathematics course and 14 percent of students are repeating a mathematics course.

Course composition measures are obtained by averaging the individual achievement indicators of students in the same course in the same school in the same year. Course-mates may not be classmates if courses are divided into multiple classes within a school. The mean number of students per mathematics course is 112,⁶⁰ indicating that an average course consists of more than one class. On average, first-time course-takers are exposed to five students who are repeating the course after failing it the previous year. Course composition is also described in terms of shares rather than counts.

⁵⁹ Technically, these are student-years, so 7 percent of student-year observations describe students repeating a failed math course.

⁶⁰ Note that these means are computed by equally weighting student-year observations and not by equally weighting course-year observations, so large course-years with many students receive a greater weight. This also explains why the mean shares are not simply the ratios of the mean counts.

The distribution of course sizes for all of the course-years included in the analysis is plotted in Figure 3.3. The identification relies on variation in the number of repeaters in the larger courses; effects are imprecisely estimated if these courses are excluded. Figure 3.4 plots the variation in the number of students repeating a failed course per school-course-year. Thirty percent of student-course observations correspond to school-course-years in which no students are repeating a failed course, and the median and mean number of students repeating a failed course per school-course-year are 3 and 5.5, respectively.

Course-specific descriptive statistics are provided in Table 3.4. Mathematics courses are categorized into nine different groupings by survey administrators.⁶¹ These are loosely ordered by difficulty from Basic/Remedial Mathematics to Calculus. The three most popular high school mathematics courses (by enrolment) are Algebra I, Geometry and Algebra II. Results are largely driven by variation in the number of repeaters across these three courses.

Fifteen percent of students in Algebra I are repeating the course after failing it the previous year. The shares of students repeating the more advanced courses of Geometry and Algebra II are smaller. This indicates that low-ability students who are most likely to fail and repeat select out of mathematics courses after taking Algebra I. This may be by choice or because they are not allowed to progress given their achievement in Algebra I.

The transition of students between mathematics courses is described in the two panels of Table 3.5. The first panel is based on 3741 student-year observations and describes the course transition of students who passed their previous mathematics course. The second row of this panel indicates that of the 450 students who passed General/Applied Mathematics, 66 percent take Algebra I the following year. Seventy-one percent of students follow Algebra I with Geometry while seven percent follow with Algebra II. Somewhat surprisingly, 16 percent of students repeat Algebra I after passing it. The primary source of this irregularity appears to be one large school. This school is excluded from the analysis in a sensitivity check to confirm that this anomaly is not affecting results.⁶² Ninety percent of students who pass Geometry follow it with Algebra II and 74 percent of students who pass Algebra II follow it with Calculus. A typical progression for passing students is a subset of the path General/Applied Mathematics to Algebra I to Geometry to Algebra II to Calculus, although several other course paths are also observed.

⁶¹ The actual categorization process is not important for this paper given that students in the same course in the same school in the same year are necessarily categorized as taking the same course.

The second panel of Table 3.5 describes the course transitions of students who failed their previous mathematics course. It indicates that, for most courses, repetition is the modal behavior of students who failed. Interestingly, a nontrivial number of students still progress. Twenty-three percent of students who fail General/Applied Mathematics take Algebra I the next year, 29 percent of students who fail Algebra I take Geometry, and 37 percent of students who fail Geometry take Algebra II.

The final set of descriptive statistics is provided in Table 3.6. This table describes how current student achievement is associated with past achievement in a series of OLS regressions. The negative coefficients in the first three columns reflect that students repeating a failed math course are lower achievers (and likely of lower ability) than first-time course-takers. The coefficient drops from -0.8 to -0.4 in the third column when school-specific course fixed effects control for course difficulty.
This suggests that part of the reduced achievement of repeaters is because they are repeating more difficult courses than those taken by other students.

The remaining three columns in Table 3.6 include individual fixed effects to control for individual ability. The correlations between repeating a failed course and current achievement are imprecise but positive in the fourth and fifth columns, suggesting an increase in achievement for students repeating a failed course relative to when they took it for the first time. The sixth column includes the other past achievement indicators to separate the associations with failing and repeating. Students perform better after failing their previous mathematics course and when repeating the same mathematics course, but there is no additional improvement for specifically repeating a failed course. It is emphasized that these associations are descriptive and non-causal.

[62] One possible hypothesis is that two different courses at this school were categorized as Algebra I.

3.4 Results

Table 3.7 reports the primary set of results. Controls are added sequentially across columns. The coefficient on the log number of students failed and repeating of -0.25 in the first column is estimated without controlling for course size and excluding school-course and individual fixed effects. The coefficient falls in magnitude to -0.20 when course size is controlled in the second column, confirming that the previous estimate exaggerated the effect. Course size and GPA are shown to be negatively correlated.
The specification in the third column includes school-course fixed effects to control for course difficulty, and the magnitude of the effect falls further to -0.17. This indicates that the net effect of course difficulty biased the estimates downwards; first-time course-takers systematically perform worse in more difficult courses with more repeaters.

The fourth column reports results from the preferred specification, controlling for course size and including the full set of fixed effects. The coefficient of -0.15 is the level change in GPA scores for first-time course-takers caused by a doubling of the number of repeaters in a course (a 100 percent increase in the number of course repeaters, approximated by a 1 unit increase in the natural logarithm of the number of course repeaters). For the average mathematics course, this is an increase from five to ten repeaters in a course of around 100 students. Course repeaters do exert negative externalities.

The relationship between course size and GPA is no longer significant. Without information on class assignment (within courses), class size effects cannot be directly investigated with these data. This result does, however, suggest that repetition effects may be a more important concern than class size effects (which have received considerable attention).

Results from placebo tests in Table 3.8 support the empirical strategy. The first column indicates that doubling the number of repeaters in the course the year before it was taken is associated with a -0.02 (no) change in GPA scores for first-time course-takers. The second column is the original specification, while the third column reveals that the achievement of first-time course-takers is negatively correlated with the number of repeaters the next year, although the estimate is not significant. This is expected as course repeaters are course-takers from the previous year that performed poorly.

Distributional effects are investigated in Figure 3.5.
This graph plots estimates from a series of linear probability models in which binary indicators for attaining at least the specified GPA score are the dependent variables. Results in this figure partially inform our understanding of the negative externalities exerted by repeaters. Negative effects at the top of the distribution may indicate teachers transferring inputs from high achievers to low achievers (such as slowing the pace of the class), negative effects throughout the ability distribution may indicate repeaters being generally disruptive, while negative effects concentrated at the bottom of the distribution may indicate repeaters specifically distracting other low achievers. The negative externalities exerted by repeaters are evident in the middle and lower parts of the distribution. This is evidence against the hypothesis that teachers transfer inputs away from high achievers when there are more repeaters in a course, and suggests repeaters may specifically distract other students in similar parts of the achievement distribution.

Course repeaters may exert negative externalities on first-time course-takers only when they reach a threshold share of the course. This form of nonlinearity cannot be captured by the above specifications. Figure 3.6 investigates threshold effects by plotting coefficients from a series of regressions taking the form of Equation 3.2, but with the explanatory variable being a binary indicator of whether the share of repeaters exceeds the specified level. The estimated effect is the difference in GPA scores between first-time course-takers exposed to a share of repeaters above the specified level and first-time course-takers exposed to a share of repeaters below the specified level. The plot suggests the negative effect is already evident when the share of repeaters reaches five percent of course-takers, although it is only statistically significant when the share reaches nine percent of course-takers.
The negative effect on first-time course-takers remains relatively flat until the share of repeaters reaches fifteen percent, after which it becomes very imprecise.

Gender and race heterogeneity in the effect is investigated by interacting the number of repeaters with gender and race indicators. These results are reported in Table 3.9. The third column includes both gender and race interactions. Doubling the number of repeaters in a course reduces the GPA scores of white males (the omitted category) by 0.27. Females are slightly less affected than males, but the gender difference is not statistically significant. The negative externalities exerted by repeaters on black first-time course-takers are significantly smaller than those exerted on white students, while other differences by race are imprecisely estimated. Descriptive statistics in Appendix Table B.1 indicate that black students are more likely to fail and repeat mathematics courses. The smaller effect for black students suggests smaller effects in schools with more black students, and, given that black students are more likely to repeat, may indicate a declining effect for each additional percentage point increase in the number of repeaters.[63]

Results in Table 3.10 attempt to distinguish the externalities exerted by course repeaters because they are low achievers and the externalities exerted specifically because they are repeating. The number of students who failed their previous mathematics course is considered a proxy for the number of low achievers in the course. All repeaters should exert the specific externalities associated with course repetition to some extent, so including the number of repeaters who previously passed or failed therefore captures the specific repeater externality. (Recall from Table 3.5 that a surprising number of students repeat a passed course.) As discussed in the empirical methodology section, these measures are highly correlated and results are somewhat sensitive.
They are interpreted as suggestive rather than conclusive.

The second and third columns reveal that both the number of low achievers and the number of repeaters are negatively correlated with the GPA of first-time course-takers when included in separate regressions. The fourth column includes both of these measures and the original variable. Only the coefficient of -0.14 on the log number of students failed and repeating is negative and statistically significant, although the number of low achievers (as measured by the number of students who failed their previous mathematics course) also enters the estimated GPA production function negatively. This suggests that both specific repeater and general low achiever effects may be in operation. One implication of this is that encouraging low-achieving students to progress to higher-level mathematics courses rather than repeat may not fully address the issue, as the negative externalities exerted by these students would persist in the higher-level courses. A more appropriate policy for negating these externalities may be to direct failing students away from mathematics courses or towards less cognitively-demanding numeracy courses.

[63] The logarithmic functional form captures some nonlinearity in the effect, but actual nonlinearities may be more pronounced or take a different form. The small sample and the related absence of statistical power do not allow a fuller investigation of this; a non-parametric analysis in which a series of bins for the number of repeaters were included as explanatory variables was uninformative.

3.5 Conclusion

Mathematics is difficult for many students, and course repetition in high school mathematics courses is a common occurrence. This repetition is promoted by policies in several US states that stipulate a minimum level of mathematics to graduate high school. Mathematics is also generally
considered important for future job market success, acting as further encouragement for students to repeat failed mathematics courses. Previous discussions around the benefits and costs of course repetition have focused on the potentially-repeating individual student.

This paper takes a new step by considering the externalities exerted by course repeaters on other students taking the course for the first time. A doubling of the number of repeaters in a mathematics course leads to a 0.15 reduction (approximately equal to the mean female-male achievement gap) in GPA scores for first-time course-takers. The effect appears to dominate course size effects, and, given the relationship between course size and class size and the extensive literature on class size, warrants more attention.

Using Israeli data, Lavy et al (2011) finds that higher proportions of low-ability students in a grade are associated with reductions in the general quality of the classroom environment. This provides a candidate mechanism through which the negative externalities reported in this paper may operate. The estimated distributional effects indicate that course repeaters negatively affect students at the middle and lower parts of the achievement distribution. This suggests that course repeaters may be more likely to distract classmates who are located in similarly-low parts of the achievement distribution rather than high achievers, which is particularly concerning given these students are already at risk.
The effect does not appear to operate through teachers redirecting resources to low-ability students from high-ability students, so policies that promote maintaining a constant level of teacher inputs irrespective of the classroom distribution of repeaters may not be effective in alleviating the negative externalities.

Results also suggest that the negative externalities exerted by course repeaters arise because these students are both low-achieving and repeating. This is important because policies that reduce course repetition may not deal with the low-achiever effects. If the negative externalities exerted by course repeaters outweigh the potential benefits of repetition for the repeating student, a more fitting solution may be promoting numeracy courses rather than Algebra and Geometry for high school students who do not display an aptitude for mathematics.

Finally, suggestive evidence indicates that the negative effect is mitigated if the share of repeaters remains below five percent. This presents a possible policy response of stipulating a maximum share of repeaters permitted in a course. The overall finding of negative externalities emphasizes the need to include the effect of repeaters on their classmates when considering optimal grade retention, course repetition and high school graduation policies.

3.6 Figures

Figure 3.1: Number of students enrolled in math courses per school
[Figure: two panels, "All fifteen schools in sample" and "Excluding largest two schools (for rescaling)"; horizontal axis: school (1 to 15); vertical axis: number of students.]

Figure 3.2: Distribution of math GPA scores by past achievement
[Figure: two density panels, "First-time course-takers" and "Repeat course-takers (who failed course in previous year)"; horizontal axis: GPA score (0 to 4); vertical axis: density.]
Note: The unit of observation is a student-course. There are 3,379 student-course observations corresponding to first-time course-takers and 310 student-course observations corresponding to repeat course-takers (who failed course in previous year).

Figure 3.3: Number of students per school-course-year (class)
[Figure: two panels, "All school-course-years" and "Excluding largest thirty school-course-years (for rescaling)"; vertical axis: number of students.]

Figure 3.4: Distribution of number of students repeating a failed course per school-course-year
[Figure: density plot; horizontal axis: number of students failed and repeating (0 to 30); vertical axis: density.]
Note: The unit of observation is a student-course. Thirty percent of student-course observations correspond to school-course-years (classes) in which no students are repeating a failed course. The median and mean number of students repeating a failed course per school-course-year are 3 and 5.5, indicated by the green and red vertical lines, respectively.

Figure 3.5: Distributional effects
[Figure: estimated effect with 95% confidence interval, plotted against GPA (0 to 4).]
Note: Each estimated effect and associated confidence interval is from a separate linear probability model. The dependent variable is an indicator for attaining the specified GPA and the independent variable is the natural logarithm of the number of course repeaters. The full set of fixed effects is included.

Figure 3.6: Threshold effects of share of repeaters on GPA of first-time course-takers
[Figure: estimated effect with 95% confidence interval, plotted against share of course repeaters (0 to 0.2).]
Note: Each estimated effect and associated confidence interval is from a separate regression of the GPA of first-time course-takers on a binary variable indicating whether the share of course repeaters exceeds the specified threshold. The full set of fixed effects is included.
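The threshold comparison described in the note to Figure 3.6 can be illustrated with a minimal sketch. It assumes made-up data and omits the fixed effects and course-size control used in the thesis, so each gap below is only the raw difference in mean GPA (which is what an OLS regression on a single binary indicator returns when there are no other regressors). The names `observations` and `threshold_gap` are hypothetical and not from the original analysis.

```python
from statistics import mean

# Hypothetical (share_of_repeaters, first_time_taker_gpa) pairs; not Add Health data.
observations = [
    (0.00, 2.6), (0.02, 2.5), (0.04, 2.4), (0.06, 2.2),
    (0.08, 2.1), (0.10, 2.0), (0.12, 2.1), (0.16, 1.9),
]


def threshold_gap(data, threshold):
    """Mean GPA where share > threshold minus mean GPA where share <= threshold.

    This equals the OLS slope on a binary 'share exceeds threshold' indicator
    when no controls or fixed effects are included.
    """
    above = [gpa for share, gpa in data if share > threshold]
    below = [gpa for share, gpa in data if share <= threshold]
    if not above or not below:
        return None  # the indicator has no variation at this threshold
    return mean(above) - mean(below)


# One comparison per candidate threshold, mirroring one point per threshold in the plot.
gaps = {t: threshold_gap(observations, t) for t in (0.05, 0.09, 0.15, 0.20)}
```

A threshold with no observations on one side returns `None`, which loosely corresponds to the imprecision at high shares of repeaters where few school-course-years remain above the cut-off.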
3.7 Tables

Table 3.1: Number of years of high school mathematics courses/credits required for graduation

Years                     States                                          Total
Specified at local level  CO, IA, ME, MA, NE                              5
1 year                                                                    0
2 years                   AK, AZ, CA, ID, MT, ND, WI                      7
3 years                   CT, DC, DoDEA, HI, IL, KS, KY, LA, MD, MN,      24
                          MO, NH, NM, NJ, NV, NY, OH, OK, OR, PA, TN,
                          UT, VT, WY
4 years                   AL, AR, DE, FL, MI, MS, RI, SC, TX, WA, WV      11
Varies by diploma         IN (2-4 yrs), GA (3-4 yrs), NC (3-4 yrs),       5
                          SD (3-4 yrs), VA (3-4 yrs)

Source: Reys et al, 2007

Table 3.2: Courses required for high school graduation/diploma

Course                                    States                                    Total
Algebra I                                 AL, AR, CA, DoDEA, DC*, FL*, GA*, IL,     19
                                          KY, MD, MI, MS, ND, NH, NM**, OK**,
                                          SD, TX, UT*
Algebra I or Integrated Mathematics I     IN, LA*, NC, TN*                          4
Geometry                                  AL, AR, DoDEA, IL, KY, MD, MI, TX, UT*    9
Geometry or Integrated Mathematics II                                               0
Algebra II                                AR, MI                                    2
Algebra II or Integrated Mathematics III  DE*                                       1
Algebra I, Geometry, Algebra II           LA, TN*, VA                               3
OR Integrated Mathematics I-III

* Or an equivalent course, ** Minimum requirement.
Source: Reys et al, 2007
Table 3.3: Descriptive statistics - Pooled (Units: student-years)
Mean (standard deviation)

                                            All       Males     Females
Academic outcomes:
  Math GPA score (transcript) [a]           2.17      2.05      2.28
                                            (1.17)    (1.16)    (1.17)
Individual past achievement - binary indicators: [b]
  Failed [c] and repeating math course      0.07      0.08      0.06
  Failed math course in previous year       0.17      0.19      0.15
  Repeating math course from previous year  0.14      0.16      0.13
Course-mates: [d]
  Course size (number of students)          111.63    113.11    110.12
                                            (88.10)   (87.47)   (88.72)
  Number of students failed and repeating   5.46      5.55      5.37
                                            (6.55)    (6.57)    (6.52)
  Number of students failed                 13.22     13.48     12.96
                                            (14.34)   (14.29)   (14.38)
  Number of students repeating              12.44     12.72     12.15
                                            (18.36)   (18.47)   (18.25)
  Share of students failed and repeating    0.08      0.08      0.08
                                            (0.10)    (0.10)    (0.11)
  Share of students failed                  0.18      0.19      0.18
                                            (0.19)    (0.19)    (0.19)
  Share of students repeating               0.19      0.19      0.19
                                            (0.24)    (0.24)    (0.24)
Observations                                6341      3191      3150
Share                                       1         0.50      0.50

[a] The math GPA score is the mean GPA over all math courses taken in a given year if more than one course is taken in the year. [b] Means for these binary indicators are based on smaller samples due to missing past achievement for some individuals. [c] Failed is a binary indicator that equals one if there was any failure in the previous year's math courses. [d] Course-mates are students in the same school, taking the same course, in the same year.

Table 3.4: Descriptive statistics by math course - Pooled (student-years)

                                            Basic/    General/  Pre-      Algebra   Geometry  Algebra   Advanced  Pre-      Calculus
                                            Remedial  Applied   algebra   I                   II                  calculus
Academic outcomes:
  Math GPA score (transcript) [a]           1.66      2.07      1.96      1.92      2.26      2.32      3.00      2.65      3.04
Individual past achievement - binary indicators: [b]
  Failed [c] and repeating math course      0.11      0.08      0.12      0.15      0.05      0.04      0.00      0.01      0.00
  Failed math course in previous year       0.54      0.36      0.51      0.21      0.13      0.10      0.00      0.05      0.00
  Repeating math course from previous year  0.22      0.28      0.20      0.36      0.07      0.05      0.04      0.04      0.00
Individual current achievement - binary indicators:
  Fail and repeat math course               0.10      0.03      0.08      0.08      0.04      0.05      0.00      0.02      0.00
  Fail math course                          0.31      0.19      0.25      0.23      0.17      0.18      0.04      0.09      0.05
  Repeat math course                        0.20      0.12      0.15      0.20      0.07      0.07      0.25      0.12      0.20
Course-mates: [d]
  Course size (number of students)          51.12     89.09     41.61     159.69    130.24    100.96    8.07      58.28     24.85
  Number of students failed and repeating   3.70      1.92      2.18      10.94     4.48      4.34      0.00      0.79      0.00
  Number of students failed                 21.08     6.11      11.32     16.11     16.72     12.57     0.21      3.43      0.00
  Number of students repeating              8.51      7.24      4.17      29.04     7.01      6.11      0.25      2.41      0.00
  Share of students failed and repeating    0.12      0.12      0.13      0.14      0.04      0.04      0.00      0.01      0.00
  Share of students failed                  0.53      0.34      0.40      0.20      0.12      0.10      0.01      0.05      0.00
  Share of students repeating               0.22      0.47      0.19      0.35      0.06      0.05      0.04      0.03      0.00
Observations                                340       571       313       1790      1469      1160      72        483       143
Share                                       0.05      0.09      0.05      0.28      0.23      0.18      0.01      0.08      0.02

[a] The math GPA score is the mean GPA over all math courses taken in a given year if more than one course is taken in the year. [b] Means for these binary indicators are based on smaller samples due to missing past achievement for some individuals. [c] Failed is a binary indicator that equals one if there was any failure in the previous year's math courses. [d] Course-mates are students in the same school, taking the same course, in the same year.
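An earlier footnote remarks that the mean shares in these tables are not simply the ratios of the mean counts, because means are taken over student-level observations and larger school-course-years therefore receive more weight. The following minimal sketch illustrates that point with two made-up classes; the numbers and variable names are hypothetical and are not taken from the Add Health sample.

```python
# Hypothetical school-course-years: (course_size, number_failed_and_repeating).
classes = [(200, 10), (20, 6)]

n_students = sum(size for size, _ in classes)

# Student-weighted means: every student in a class contributes one observation,
# so each class-level value is weighted by the class size.
mean_size = sum(size * size for size, _ in classes) / n_students
mean_repeaters = sum(size * rep for size, rep in classes) / n_students
mean_share = sum(size * (rep / size) for size, rep in classes) / n_students

# Ratio of the two mean counts, which generally differs from the mean share.
ratio_of_means = mean_repeaters / mean_size
```

In this toy case the student-weighted mean share exceeds the ratio of the mean counts, the same direction as the Algebra I column of Table 3.4, where the mean share (0.14) is larger than 10.94 / 159.69.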
Table 3.5: Transition matrices - shares: Mathematics (student-years)

Panel A: No math course failure in previous year

                      Current course
Previous course       1      2      3      4      5      6      7      8      9      Total
1 - Basic/Remedial    0.19   0.10   0.21   0.47   0.02   0      0      0      0      163
2 - General/Applied   0.12   0.11   0.06   0.66   0.03   0.01   0      0.00   0      450
3 - Pre-algebra       0.06   0.05   0.08   0.78   0.02   0.00   0      0      0      202
4 - Algebra I         0.01   0.02   0.00   0.16   0.71   0.07   0      0.01   0      1,264
5 - Geometry          0.00   0.01   0.00   0.02   0.03   0.90   0.01   0.02   0      947
6 - Algebra II        0.00   0.04   0.00   0.01   0.12   0.03   0.05   0.74   0.01   539
7 - Advanced          0      0      0      0      0.11   0      0.33   0.44   0.11   9
8 - Pre-calculus      0      0      0      0      0      0.01   0.09   0.07   0.83   167
9 - Calculus          0      0      0      0      0      0      0      0      0      0
Total                 115    139    89     758    1,012  972    64     449    143    3,741

Panel B: Any math course failure in previous year

                      Current course
Previous course       1      2      3      4      5      6      7      8      9      Total
1 - Basic/Remedial    0.41   0.07   0.28   0.24   0      0      0      0      0      68
2 - General/Applied   0.46   0.18   0.12   0.23   0.01   0      0      0      0      95
3 - Pre-algebra       0.30   0.11   0.39   0.18   0      0.02   0      0      0      56
4 - Algebra I         0.10   0.08   0.07   0.44   0.29   0.02   0      0.00   0      323
5 - Geometry          0.06   0.09   0.09   0.05   0.34   0.37   0      0.01   0      163
6 - Algebra II        0.04   0.11   0.04   0.03   0.08   0.53   0      0.18   0      79
7 - Advanced          0      0      0      0      0      0      0      0      0      0
8 - Pre-calculus      0      0.13   0      0      0.13   0.13   0      0.63          8
9 - Calculus          0      0      0      0      0      0      0      0      0      0
Total                 134    78     93     199    156    110    0      22     0      792

Table 3.6: Correlation between previous and current mathematics achievement

Dependent variable: GPA score     (1)       (2)       (3)       (4)       (5)       (6)
Previous year academic achievement:
  Failed and repeating course     -0.89***  -0.81***  -0.39***  0.36      0.30      -0.11
                                  (0.04)    (0.02)    (0.08)    (0.25)    (0.21)    (0.12)
  Failed course in previous year                                                    0.34***
  (not necessarily repeating)                                                       (0.06)
  Repeating course from previous year                                               0.18***
  (not necessarily having failed)                                                   (0.04)
Fixed effects:
  Year (5)                                  x         x         x         x         x
  School-cohort (56)                        x         x         x         x         x
  School-course (84)                                  x                   x         x
  Individual (2047)                                             x         x         x
Observations (student-years)      4533      4533      4533      4533      4533      4533
Number of students                2047      2047      2047      2047      2047      2047

Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table 3.7: Effect of course repeaters on academic performance of first-time course-takers

Sample: First-time course-takers
Dependent variable: Math GPA score  (1)       (2)       (3)       (4)
Course-mates:
  Log number of students failed     -0.25***  -0.20***  -0.17***  -0.15**
  and repeating                     (0.04)    (0.03)    (0.02)    (0.04)
  Log number of students in course            -0.18***  -0.11***  -0.03
                                              (0.03)    (0.04)    (0.15)
Fixed effects:
  Year (5) and school-cohort (53)   x         x         x         x
  School-course (78)                                    x         x
  School-course trends (78)                             x         x
  Individual (1810)                                               x
Observations (student-years)        3379      3379      3379      3379
Number of students                  1810      1810      1810      1810

Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table 3.8: Placebo tests: Pseudo course-mate achievement at time t-1 to t+2

Sample: First-time course-takers at time t
Dependent variable: Math GPA score  (1)       (2)       (3)       (4)
Pseudo course-mate achievement
at time:                            t-1       t         t+1       t+2
Pseudo course-mates:
  Log number of students failed     -0.02     -0.15**   -0.09     0.10
  and repeating                     (0.04)    (0.04)    (0.08)    (0.09)
Fixed effects [a]                   x         x         x         x
Observations (student-years)        3160      3379      3324      3337
Number of students                  1739      1810      1790      1783

[a] Year, school-cohort, school-course, school-course trends and individual fixed effects, as well as log number of students in course included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.
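As a reading aid for the log coefficients in Tables 3.7 and 3.8, the sketch below works through the standard arithmetic of a level-log specification, in which the implied change in GPA equals the coefficient times the change in the natural logarithm of the number of repeaters. It is an illustration only: the function name `gpa_change` is hypothetical, and the only figure taken from the text is the preferred estimate of -0.15.

```python
import math


def gpa_change(beta, ratio):
    """Implied GPA change when the number of repeaters is multiplied by `ratio`,
    under a level-log specification GPA = ... + beta * ln(repeaters)."""
    return beta * math.log(ratio)


beta = -0.15  # preferred estimate, Table 3.7 column (4)

# A one-unit increase in ln(repeaters) multiplies the count by e (about 2.72).
one_log_unit = gpa_change(beta, math.e)

# An exact doubling changes ln(repeaters) by ln(2) (about 0.69).
exact_doubling = gpa_change(beta, 2.0)
```

Under these assumptions a one-unit log change reproduces the coefficient itself, while an exact doubling implies a somewhat smaller change in magnitude, so the "doubling" reading of the coefficient is best understood as an approximation.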
Table 3.9: Gender and race heterogeneity in effect of course repeaters

Sample: First-time course-takers
Dependent variable: Math GPA score  (1)       (2)       (3)
Course-mates:
  Log number of students failed     -0.21*    -0.21***  -0.27***
  and repeating                     (0.11)    (0.05)    (0.08)
  x Female                          0.11                0.12
                                    (0.14)              (0.13)
  x Black                                     0.21*     0.22*
                                              (0.11)    (0.10)
  x Hispanic                                  -0.01     -0.01
                                              (0.11)    (0.11)
  x Asian                                     0.19      0.19
                                              (0.06)    (0.07)
  x Other                                     0.25      0.25
                                              (0.41)    (0.40)
Fixed effects [a]                   x         x         x
Observations (student-years)        3377      3377      3377
Number of students                  1808      1808      1808

[a] Year, school-cohort, school-course, school-course trends and individual fixed effects, as well as log number of students in course included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table 3.10: Separating effects of course-mates' course failure and course repetition

Sample: First-time course-takers
Dependent variable: Math GPA score  (1)       (2)       (3)       (4)
Course-mates:
  Log number of students failed     -0.15**                       -0.14***
  and repeating                     (0.04)                        (0.04)
  Log number of students failed               -0.18***            -0.12
                                              (0.09)              (0.09)
  Log number of students repeating                      -0.09***  0.05
                                                        (0.15)    (0.04)
Fixed effects [a]                   x         x         x         x
Observations (student-years)        3379      3379      3379      3379
Number of students                  1810      1810      1810      1810

[a] Year, school-cohort, school-course, school-course trends and individual fixed effects, as well as log number of students in course included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Chapter 4

To Go To College or Get A Job? The Effects of Part-Time Work During High School

4.1 Introduction

Over half the students in US high schools engage in some form of market work during the school year.[64] Part-time work during high school may affect subsequent labour market outcomes in a variety of ways.
The focus of this paper is an exploration of the extent to which high school work experience incentivizes labour market entry and college attendance after high school.

Part-time work during high school may increase the opportunity cost of attending college. Working during high school increases the likelihood of employment after high school, and may also increase initial wages if there are returns to high school work experience. This would increase the probability of labour market entry after high school. At the same time, working in an unskilled occupation during high school may provide motivation to pursue postsecondary education as a means to greater job satisfaction and higher wages in the future, encouraging college attendance.

The effects of part-time work during high school are also likely to be heterogeneous. The effect of an additional hour of market work is probably of different magnitude, and possibly of different sign, for a high school student who works relatively few hours per week and a high school student who works relatively many hours per week. It is also likely to differ by the grade and age during which the part-time work occurs. An important contribution of this paper is providing a joint analysis of grade or age and work intensity heterogeneity in the effects of part-time work.

This paper contributes to the literature by finding that part-time work during high school reduces college attendance and lowers the age of entry into the full-time labour market for 8-10th grade students with high work intensities. There is no effect on the probability of dropping out. These outcomes have not been fully investigated in existing studies. It also considers the effect on subsequent self-reported job satisfaction, finding no effect of part-time work irrespective of the grade in which the work occurred and the intensity of the work.
Noting the concerns associated with using self-reported job satisfaction as a dependent variable (Bertrand and Mullainathan, 2001), this suggests that part-time work during high school may affect the career paths of individuals but not their subsequent wellbeing.

[64] Pabilonia (2001) provides a full description of part-time working behaviour during high school in the 1990s using data from the National Longitudinal Survey of Youth 1997. The paper also includes a discussion of the Federal Fair Labor Standards Act, the law that governs the ages at and intensities with which children are allowed to work.

There are likely to be several unobservable factors that influence both high school part-time working behaviour and the decision to work or study after high school. This endogeneity problem is overcome by exploiting peer-induced variation in part-time hours worked. The basic idea is that conditionally random variation in the working behaviour of an individual's peers induces exogenous variation in the individual's own working behaviour. Essentially, the presence of peer effects allows peer behaviour to be used as an instrument for an individual's own behaviour. The exclusion restriction is that peer working behaviour does not affect the outcome of interest through any channel other than individual working behaviour. The first part of the empirical strategy applies an existing methodology to provide evidence of peer effects in part-time working behaviour, and the second part of the paper uses this estimated peer effect as the first stage of an instrumental variables estimation.

There is an established literature investigating the effects of part-time work during high school on both academic achievement and labor market outcomes. The primary challenge in investigating the effect of working during school is controlling for the endogeneity in the decision to work during high school.
The endogeneity problem arises because unobserved factors (such as ability, motivation or parental inputs) affect both the decision to work and the respective outcome variable.

The methodology employed in this paper partly exploits the fact that contemporaneous academic achievement effects associated with part-time work are zero or close to zero. This conclusion is supported by several papers using a variety of datasets and empirical strategies.[65] One notable exception to finding no or negligible effects is Tyler (2003). This study uses variation in the labour supply of 12th grade students generated by interstate variation in child labour laws to find that decreasing work intensity improves mathematics scores. Oettinger (1999) and Montmarquette (2007) find some evidence of negative academic achievement effects for individuals with very high work intensities. Given the local nature of these estimated effects, it seems reasonable to conjecture the general absence of an effect with some underlying heterogeneity.

[65] Dustmann and van Soest (2007) uses the UK National Child Development Study, Rothstein (2007) uses the National Longitudinal Survey of Youth 1997, Sabia (2009) uses the National Longitudinal Study of Adolescent Health, and Buscha et al (2012) uses the National Education Longitudinal Study of 1988.

Effects of part-time work on subsequent labour outcomes are mixed. The existing literature has focused on wage and employment effects. Ruhm (1997) finds that hours worked during an individual's senior year in high school and future earnings are correlated. The paper argues that an extensive set of controls is sufficient to overcome the endogeneity problem. Other papers find no effects.
Light (1999) uses an instrumental variables strategy and concludes that the direct effect of high school employment on subsequent wages is small and relatively short-lived, while Hotz (2002) uses dynamic selection methods to reach a similar conclusion.

Related papers investigate the effects of part-time work during college on academic and employment outcomes. These are useful for comparison purposes, although working during high school and college are likely to have quite different effects. Stinebrickner and Stinebrickner (2003) compare ordinary least squares, fixed effects and instrumental variable approaches, stressing the importance of dealing with the endogeneity of hours worked. They find that working during college has a small, negative effect on academic achievement. Hakkinen (2006) shows that working during college has various short-term effects on earnings and time-to-degree, but concludes that there are ultimately no significant returns to student employment.

The structure of this paper is as follows. The second section of the paper introduces the empirical strategy. After explaining the identification strategy, a detailed exposition of the first-stage and second-stage regressions needed for identifying the effect is presented. The first-stage subsection outlines the conditional effect of the instrument on the explanatory variable of interest. This is non-trivial as there are various complexities that need to be considered when deriving a causal peer effect. The second-stage subsection considers the causal relationship of interest: the effect of high school working behavior on school performance. The third section of the paper explains the data used in the analysis and the fourth section reports the estimation results.

4.2 Empirical Methodology

Several unobservable factors affect both part-time working behaviour during high school and future labour market outcomes.
For example, parents who promote academic achievement during high school may discourage market work, and these may be the same parents who encourage their children to pursue the postsecondary education that leads to positive job market outcomes. An observed negative correlation between the intensity of part-time work during high school and future labour market outcomes may then be driven by differences in parental inputs across students. As a result, an ordinary least squares regression of educational or labour market outcomes on hours worked during high school is likely to yield biased estimates of the effect of part-time work.

This paper uses an instrumentation strategy to overcome the endogeneity problem. The part-time working behavior of an individual's peers is employed as an instrument for an individual's own part-time working behavior. The key idea is that students are induced to work varying numbers of hours during high school by variation in the working intensity of their peers. Grade fixed effects, school fixed effects and the use of past rather than contemporaneous peer behavior support the claim of instrument exogeneity.

There are a variety of channels through which the hours worked by students in the same grade may be correlated. First, individuals in the same grade are likely to share information. This information may be about the general costs and benefits of working during high school, or about actual job opportunities. For example, an individual working at a 10-hour per week job may inform schoolmates of job opportunities for similar work at the same employer. This would introduce positive correlation in hours worked during school. This channel is supported by the extensive literature on the role of job information networks in job search.[66]

Second, individuals may work similar hours to other students in the same grade because of similarities in their recreational activities and expenditure patterns.
Individuals may have an incentive to work more hours (and obtain more disposable income) if they want to engage in costly social activities with schoolmates who work more hours (and therefore have more disposable income). Examples of costly social activities include anything from watching movies to drinking alcohol or smoking cigarettes.

And, third, there are various local neighborhood effects that may result in positive correlation among the hours worked by students in the same grade and school. An example of this may be the proximity of the high school to employers of high school workers (such as fast food chains).

[66] Ioannides and Loury (2004) provide a survey.

4.2.1 The First Stage: Peer Effects in High School Working Behaviour

The first requirement for using the empirical strategy outlined above is establishing a causal link between the hours worked by an individual's schoolmates in the same grade and an individual's own hours worked. A standard specification in the empirical social interactions and peer effects literature is the linear-in-means model in which an individual's weekly hours worked is a function of the mean weekly hours worked by the peer group:

$$H_{i}^{t} = \beta_0 + \beta_1 \frac{1}{n-1} \sum_{j \neq i} H_{j}^{t} + \varepsilon_{i}^{t} \tag{4.1}$$

Individual i's peers are indexed by j and n denotes the size of the peer group. This model is simple to estimate, but the limitations associated with interpreting the relationship between individual behavior and mean group behavior as causal are well understood.

Manski (1993, 2000) considers three reasons why we may observe correlations between individual and group behavior. These are termed endogenous, exogenous and correlated effects in the peer effects literature. Endogenous effects arise when individuals respond to the actions of other group members.
This is the nature of the causal relationship considered in this paper, and is typically difficult to identify.

Exogenous effects (also known as contextual effects) arise when an individual's behavior is a function of the individual's exogenous or background characteristics, and these exogenous characteristics are shared by group members. For example, white males are more likely to work in high school than other demographic groups. Under the empirical regularity that the peer groups of white males are more likely to consist of other white males (Currarini et al, 2009), observed correlation in peer group working behavior may be due to this exogenous effect.

Finally, correlated effects arise when group members respond to environmental or institutional factors or shocks that are common to members of the group. This is particularly relevant in the school setting considered in this paper. Observed correlation in the working behavior of schoolmates may be a consequence of school location if, for example, some schools are close to potential employers of high school workers and other schools are not. This would be considered a correlated effect.

One of the primary difficulties in identifying endogenous effects arises because an individual's own behavior and the mean behavior of that individual's peer group are simultaneously determined. In other words, the behavior of an individual both affects and is affected by the behavior of group members. This is known as the reflection problem.[67]

[67] This is less of a concern when the relationship between individual behavior and group behavior is nonlinear (see, for example, Bramoulle, Djebbari and Fortin, 2009, and Brock and Durlauf, 2001). A developing strand of the social interactions literature exploits nonlinearity to identify peer effects, but this is not considered in this paper.

In order to separate the different effects and deal with the reflection problem, the standard linear-in-means model is amended in three ways: the reference behavior of the group is lagged by one period, the mean characteristics of the reference group are included as explanatory variables, and school fixed effects are included. The amended specification is as follows:

$$H_{igs}^{t} = \beta_0 + \beta_1 X_{igs} + \beta_2 \frac{1}{n_i - 1} \sum_{j \neq i} H_{jgs}^{t-1} + \beta_3 \frac{1}{n_i - 1} \sum_{j \neq i} X_{jgs} + D_g + D_s + \varepsilon_{igs}^{t} \tag{4.2}$$

The peer group in this paper is defined to be individuals in the same grade in the same school, that is, same-grade schoolmates. This is a natural definition of a high school peer group if we consider that individuals in high school spend most of their time with other individuals of similar ages who are likely to be in the same grade. Classmates may be a finer measure of the relevant peer group, but this introduces bias due to nonrandom selection into classes (see, for example, Hoxby, 2000). $H_{igs}^{t}$ denotes the hours worked at time t by individual i in grade g and school s. Other students in the same grade are indexed by j, and $n_i$ denotes the size of individual i's peer group (the number of students in individual i's school and grade). Exogenous individual characteristics are denoted by $X_{igs}$, and grade and school fixed effects are denoted by $D_g$ and $D_s$.

The inclusion of mean group characteristics controls for potential contextual peer effects under the assumption that the effect of group characteristics on the outcome variable is linear. Bifulco et al (2011) consider the effect of classmate characteristics on various economic and social outcomes, and find that only mother's education plays a statistically significant role in determining these early adult outcomes. This paper includes a variety of group characteristics in addition to mother's education.

Grade fixed effects control for grade-specific variation in part-time work during high school. This controls for the correlation induced by older students both working more hours per week and being in higher grades.
School fixed effects control for correlated effects arising from school-specific factors or shocks that may affect the working behavior of members of the same school. These include the proximity of the school to employers of high school workers, as well as local labor market conditions.

Note that the identifying variation is cross-sectional. Time-varying changes in local labour market conditions would simultaneously affect peer working hours and the decision to attend college, which would invalidate the instrument. Relying on across-cohort within-school variation in peer working hours at a particular point in time (with fixed local labour market conditions) eliminates this concern.

The reflection problem is solved by using weekly hours worked the previous year by students currently in the same grade; an individual's current working behavior is affected by but cannot affect the previous working behavior of students in the same grade. Several papers have used this approach to deal with the reflection problem (see, for example, Clark and Loheac, 2007). The lag length of one year is chosen based on data availability.[68]

One caveat in using this approach to solve the reflection problem is that hours worked during school cannot be static over time. The approach would fail, for example, if hours worked were fixed throughout high school, because we cannot disentangle the simultaneity in the determination of an individual's own hours worked and the hours worked by students in the same grade if hours worked do not change over time.

4.2.2 The Second Stage: The Effect of High School Working Behaviour

The causal relationship between an individual's same-grade schoolmates' weekly market hours worked and an individual's own weekly market hours worked is interesting in its own right. This provides evidence of peer effects in high school, and informs understanding of adolescent decision-making in general.
Peer effects relating to alcohol and substance use in high school have been documented in the social interactions literature, but to my knowledge there is no prior evidence of peer effects in market working behavior during high school. This paper takes an additional step and uses the exogenous variation in hours worked induced by same-grade schoolmates' hours worked to identify the effect of part-time work during high school on future educational and labour market outcomes.

The proposed empirical strategy identifies the causal effect of part-time work during high school under the assumption that the part-time working behavior of an individual's same-grade schoolmates only affects individual outcomes through the part-time working behaviour of the considered individual and not through any other channel. There are scenarios in which this exclusion restriction would be violated. Consider a world in which part-time work during high school negatively affects school performance and the academic achievement of same-grade schoolmates affects an individual's own academic achievement. Now consider a student in this world exposed to same-grade schoolmates who work an above-average number of hours. The school performance of this student will be reduced both because she is induced to work more hours by her peers and because she has lower-achieving peers (because they work more hours). The suggested instrumentation strategy would combine these effects, exaggerating the estimated effect of part-time working behavior on academic achievement.

[68] Richer data would allow an analysis of the role played by lag length.

Two arguments support the above identifying assumption. First, the contemporaneous effect of part-time work on school performance is zero or close to zero. This claim is supported by both the existing literature and by empirical results in this paper.
Rothstein (2007) and Sabia (2009) find no effects on academic achievement, Tyler (2003) and Dustmann (2007) find small effects, and, using the instrumentation strategy proposed above (that would exaggerate the estimates), my study finds no effect.

And, second, classroom peer effects on achievement are known to be small (see, for example, Zimmerman, 2003). This means that even in the presence of small effects of part-time work on school performance, the compounded effects operating through peer achievement are likely to be negligibly small. Essentially, this study shows that part-time work during high school only affects an individual's future educational and labour market decisions, and these decisions are not affected by an individual's previous same-grade schoolmates.

The effect of hours worked during high school on future educational and labour market outcomes is modeled by the following equation:

$$Y_{igs}^{t+1} = \gamma_0 + \gamma_1 X_{igs} + \gamma_2 \frac{1}{n_i - 1} \sum_{j \neq i} X_{jgs} + \gamma_3 H_{igs}^{t} + D_g + D_s + \varepsilon_{igs}^{t+1} \tag{4.3}$$

The outcome of interest at time t+1 (some future time) for individual i in grade g and school s at time t is denoted by $Y_{igs}^{t+1}$. Hours worked at time t is instrumented by same-grade schoolmates' hours worked at time t-1. The need to instrument is made explicit in the above equation because we expect that own hours worked $H_{igs}^{t}$ is correlated with the error term $\varepsilon_{igs}^{t+1}$ through some unobserved characteristic that affects both hours worked during school and the future educational or labour market outcome. Controls for both the characteristics of the student and the student's same-grade schoolmates are included, and grade and school fixed effects capture grade-specific and school-specific differences in the outcome of interest and high school working behavior.

Results from the initial specification in which the coefficient on the hours worked term $\gamma_3$ is not allowed to vary by grade are interpreted as an average effect of high school market work on the outcome of interest. Subsequent results explore grade (or age) heterogeneity by separately estimating the equation for 8-10th grade students and 11-12th grade students.[69]

Heterogeneity is also explored along the dimension of part-time work intensity. These results are obtained using a two-step procedure. The first-stage relationship is estimated on the full sample (as above), but the second-stage analysis is performed separately on individuals working fewer than 5 hours per week and individuals working at least 5 hours per week (during the school term). These results have a specific interpretation. For example, the $\gamma_3$ coefficient from the equation estimated on the sample of individuals working fewer than 5 hours per week is the effect of an additional hour of work on the outcome of interest conditional on having chosen to work fewer than 5 hours per week. These results cannot account for the initial decision to work few or many hours.

4.3 Data and Descriptive Statistics

This paper uses data from the National Longitudinal Study of Adolescent Health (Add Health). Add Health is a longitudinal study of a nationally representative sample of US adolescents who were in grades 7 to 12 during the 1994-1995 school year. The second wave of the study was conducted the subsequent year, and there have been two further in-home interviews, the most recent being in 2008. This paper uses data from the first, second and fourth waves of the study.

Descriptive statistics of the explanatory and control variables are provided in Table 4.1. The core sample consists of 8,429 students (after dropping observations with missing information). The part-time work information was obtained by asking survey participants how many hours they spend working for pay during a typical non-summer week. In the second wave of the study, students in the sample worked an average of 8.5 hours per week during high school.
The means for 8-10th grade students and 11-12th grade students are 5.1 and 12.6 hours, confirming that older students work more than younger students in high school. Over 40 percent of the students in the sample report no market work during high school. The share increases to 55 percent when adding students working fewer than five hours per week, leaving 45 percent of the sample working at least five hours per week.[70] The distribution of part-time hours worked for all surveyed students working 40 hours or less is plotted in Figure 4.1. The cut-off of five hours is chosen when investigating heterogeneity as it lies approximately midway between the median and mean, ensuring enough variation in hours worked above and below the cut-off to estimate effects. Results remain qualitatively similar with small variations in this cut-off. Ruhm (1997) reports lower means from the NLSY, suggesting an increase in the intensity of part-time work during high school in the US from the early 1980s (NLSY) to the mid 1990s (Add Health).

[69] Note that the corresponding first-stage relationship is also estimated separately, allowing the structure of peer effects in market work during high school to vary across the two groups.

The subsequent sets of variables describe demographic differences in part-time working behavior. Males engage in more market work than females during high school. There is a disproportionately high share of white students working over five hours per week while the share of black students working over five hours per week is disproportionately low. Evidence suggests that individuals with less educated parents work more intensively.

Table 4.2 describes the outcome variables. Most of these were measured during the fourth wave of the study when individuals were aged between 24 and 32. The first set of variables relates to education.
Mean overall GPA scores (measured at the same time as the explanatory variable during Wave 2) vary little by grade or work intensity. 11-12th grade students are more likely to have graduated high school by Wave 4 of the study than 8-10th grade students. This is because some of the students in the 8-10th grade sample will choose to drop out by the time they would have been in the 11-12th grade sample. This form of selection is more evident when looking at college attendance; 68 percent of 8-10th grade students in the sample attend college while 74 percent of 11-12th grade students do so.

The second set of variables describes labour market outcomes in early adulthood. Differences in labour market outcomes across grades (or ages) and work intensities are mostly small and generally statistically insignificant. Older students earn more (which is somewhat mechanical as they are older and more experienced at the time of the Wave 4 study), are less likely to do physical work, and are more likely to be satisfied with their jobs. In terms of work intensity, students working over five hours per week earn more, work longer hours, are more likely to do physical work, and are more likely to be satisfied with their job than other students.

These correlations include the effects of several observed and unobserved factors that vary with part-time working behavior and the outcome variable. The subsequent section reports causal effects.

[70] Note that individuals working zero hours per week are included in the group working zero to four hours per week.

4.4 Results

Results from the first stage of the instrumental variables estimation are reported in Table 4.3. Specifications include the full set of controls with errors clustered at the school-grade level. The first column reports that an increase of one hour in the mean hours worked by same-grade schoolmates the previous year is associated with a 0.51 hour increase in individual hours worked for the full sample.
The coefficient is precisely estimated and the F-statistic is over 40, confirming a suitably strong relationship for implementation of the instrumental variables strategy. The second and third columns report the first-stage results for the samples of 8-10th and 11-12th grade students, respectively. The effect of mean peer hours worked on own hours worked drops to 0.31 and 0.36 for the two groups. It remains statistically significant, although the F-statistic falls (to 6.34 and 9.38, respectively). The estimated coefficients on the controls show that females and black students work fewer hours per week (than males and white students), while older students (within a grade) work more than younger students.

Table 4.4 reports results from balance tests in which the instrument is regressed on the full set of controls. The purpose of this table is to show that observable controls are uncorrelated with the instrument after conditioning on the grade and school fixed effects necessary for identification. This suggests that the same is true for unobservable characteristics to the extent that observable and unobservable characteristics are correlated (Altonji et al, 2005), providing some support for the claim of instrument exogeneity. Grade and school fixed effects are included sequentially. The p-value of 0.26 on the F-statistic associated with the full specification in the third column indicates that we cannot reject the hypothesis that the coefficients on the individual controls are jointly equal to zero; the instrument is conditionally independent of observables.[71]

[71] The observable characteristics are considered non-identifying controls. They are included to increase the precision of the estimates rather than to control for some form of selection. The school and grade fixed effects are considered necessary for identification of the effect. They deal with grade-specific and school-specific factors that may otherwise bias the estimates.

The effects of part-time work on four educational outcomes are reported in Table 4.5. All regressions include controls for individual characteristics, mean grade characteristics, grade fixed effects and school fixed effects. The first column shows that the number of hours worked during high school has no effect on contemporaneous GPA scores. This supports the findings of Rothstein (2007) and Sabia (2009) in which the achievement effects of part-time work are zero. The absence of an effect on high school graduation reported in the second column is broadly consistent with this result.

There are, however, negative effects on the number of years of education and the probability of attending college. These are shown in the third and fourth columns. An additional hour of work reduces education by 0.06 years and reduces the probability of attending college by one percentage point. These results suggest that part-time work during high school affects the choices individuals make after graduating from high school without having affected achievement during high school. Essentially, students who engage in market work during high school appear more likely to enter the job market and less likely to pursue further education after high school. A variety of reasons for this are proposed.

First, part-time work during high school may lower job search costs upon high school graduation. Students may be able to continue working in their high school jobs after school or be given other opportunities with the same employer. The reduced uncertainty of finding work increases the expected returns from pursuing market work. Second, the opportunity cost of attending college may be greater for students who worked part-time during high school. This is because giving up a paying job to study is more costly than giving up staying at home and watching television (for example). And, third, students who work part-time during high school may have acquired more independence and therefore be more attached to the job market than other students.
They may have developed spending habits and other behaviours that encourage working rather than studying.

These mechanisms cannot be directly investigated with the available data, but their plausibility is explored by analyzing the effects of part-time work on college expectations and a variety of subsequent labour market outcomes. The remaining results are all presented in the form of three-by-three tables for each outcome to reflect heterogeneous effects. Each cell reports the estimated coefficient from a regression estimated on the specified subsample. The columns consider grade (or age) heterogeneity and correspond to the full sample, 8-10th grade students, and 11-12th grade students, while the rows consider work intensity heterogeneity and correspond to the full sample, students working fewer than five hours per week, and students working at least five hours per week.

Results in Table 4.6 show that the only nonzero effect of part-time work on high school outcomes is for 11-12th grade students working fewer than five hours per week. For these students, there is a 0.1 reduction in GPA for an additional hour of work.

The negative effects of part-time work on years of education and college attendance are driven by 8-10th grade students working more than five hours per week. An additional hour of work for these students reduces education by 0.44 years and reduces the probability of attending college by eleven percentage points. These students are likely to have a strong attachment to the job market by the time they finish high school and their opportunity cost of studying may be greater than students working fewer or zero hours. Results in the table also indicate that these students have a reduced desire to attend college and expectation of attending college. These effects on expectations are somewhat consistent with those found by Neumark and Joyce (2011) in which school-to-work programs increased the perceived likelihood of future labor market activity.
Interestingly, 11-12th grade students working fewer than five hours per week are more likely to both expect to attend and attend college. Working during high school may provide information that reinforces the desire to pursue postsecondary education for these students.

The final table of results investigates the effect of working during high school on labour market outcomes. Recall that these were measured during the fourth wave of the study when individuals were aged between 24 and 32. Table 4.7 indicates that students who work more during high school have their first full-time job at younger ages than other students. This is driven by 8-10th grade students working more than five hours per week, the same group who were less likely to attend college. They enter the full-time labour market 0.23 years younger for every additional weekly hour of part-time work during high school. This supports the hypothesis that 8-10th grade students working more than five hours per week during high school are incentivized to enter the labour market rather than study after high school.

Working during high school is associated with increases in income for 11-12th grade students irrespective of their work intensity, although the results for hours worked also indicate that these students work more hours. For students working less than five hours, results in Table 4.6 indicate that this could be due to an increased probability of attending college. Generally, students working more than five hours per week during high school remain more hard-working than their same-grade schoolmates in early adulthood.

The final two variables describe the type of work and job satisfaction. These are proxies for job quality. Individuals who study rather than enter the labour market after school may be employed in higher quality jobs when aged between 24 and 32, so given that part-time work during high school encourages early entry into the labour market, students who work more during high school may have lower quality jobs. Alternatively, students who enter the labour market early may have accumulated sufficient work experience to be promoted into more satisfying and higher quality jobs by their late twenties.

Results are not conclusive. There is no effect of part-time work during high school on self-reported job satisfaction, and the only significant effects on the probability of doing physical labour are for 11-12th grade students working more than five hours per week. For these students, there is some evidence that part-time work during high school results in subsequent selection out of jobs requiring physical labour.

4.5 Conclusion

This paper contributes to our understanding of the effects of part-time work during high school by exploring grade and work intensity heterogeneity and focusing on early adult outcomes. In doing so, it provides an alternative narrative on the benefits and costs of market work during high school. Consistent with the existing literature (Rothstein, 2007; Sabia, 2009), there appears to be no effect on contemporaneous academic achievement. This paper instead focuses on effects on post-high school decision-making with respect to college attendance and entry into the full-time labour market.

These effects of part-time work during high school on subsequent labour market outcomes differ by the grade in which the work occurred and the time intensity of the work. This paper finds that 8-10th grade students working more than five hours per week are both less likely to attend college than other students and begin full-time work at a younger age than other students. There is no effect on the college attendance decision and the age of first full-time job for 8-10th grade students working less than five hours per week, as well as 11-12th grade students working any number of hours.
The effects for 8-10th grade students with high work intensity may arise because they are strongly attached to the labour market by the time they graduate high school and have higher opportunity costs of postsecondary education than other students.

The effects on subsequent income appear to operate through other channels. An additional hour of part-time work during high school increases subsequent income for 11-12th grade students working any number of hours, while 8-10th grade students working less than five hours per week experience a negative effect on income from part-time work. The positive effects on income for older students may be due to information or motivation gained from working during high school, or, more directly, the acquisition of skills and work experience that yield subsequent returns in the labour market. There is some evidence that these students are also more likely to attend college, which would also increase subsequent income.

4.6 Figures

Figure 4.1: Distribution of part-time hours worked

[Histogram not reproduced. Horizontal axis: part-time hours worked (0 to 40). Vertical axis: density (0 to 0.4).]

The median and mean number of hours worked are 3 and 9.8, indicated respectively by the solid green and blue vertical lines. The dashed vertical line indicates the sample split at 5 hours used in the regression analysis.
4.7 Tables

Table 4.1: Descriptive statistics I - weekly hours worked and controls
Mean (standard deviation)

                                            8-10th  11-12th   Hours worked (Wave 2)
                                      All   grades   grades        0      0-4   Over 5
Part-time work during high school
  Weekly hours worked in Wave 2      8.55     5.10    12.63
                                  (11.47)   (8.84)  (12.82)
  Weekly hours worked in Wave 1      5.78     3.31     8.71
                                   (9.62)   (6.99)  (11.34)
Individual characteristics
  Female                             0.50     0.50     0.51     0.53     0.51     0.48
  White                              0.67     0.68     0.64     0.57     0.61     0.73
  Black                              0.15     0.15     0.16     0.20     0.18     0.12
  Hispanic                           0.12     0.11     0.12     0.15     0.13     0.10
  Asian                              0.04     0.03     0.05     0.05     0.04     0.03
  Other                              0.03     0.03     0.03     0.03     0.03     0.03
  Age (years and months)            16.27    15.34    17.83    15.94    15.80    16.83
  Not born in US                     0.05     0.04     0.07     0.06     0.06     0.05
Mother's education
  Less than high school              0.16     0.17     0.15     0.18     0.17     0.16
  High school                        0.35     0.36     0.33     0.32     0.32     0.37
  Some college                       0.18     0.18     0.19     0.18     0.18     0.19
  College                            0.26     0.25     0.27     0.26     0.27     0.24
Father's education
  Less than high school              0.13     0.13     0.14     0.14     0.13     0.14
  High school                        0.24     0.25     0.22     0.22     0.22     0.27
  Some college                       0.13     0.12     0.14     0.12     0.12     0.14
  College                            0.24     0.23     0.25     0.23     0.25     0.22
Household income (Wave 2)
  Less than $20k                     0.15     0.17     0.12     0.17     0.16     0.14
  $20k - $40k                        0.24     0.24     0.23     0.23     0.23     0.25
  $40k - $60k                        0.20     0.21     0.19     0.19     0.20     0.20
  More than $60k                     0.22     0.21     0.25     0.21     0.23     0.22
Observations                         8429     4570     3859     3558     4632     3797
Share                                   1     0.54     0.46     0.42     0.55     0.45
Table 4.2: Descriptive statistics II - outcomes (Wave 4 unless otherwise stated)
Mean (standard deviation)

                                            8-10th  11-12th   Hours worked (Wave 2)
                                      All   grades   grades        0      0-4   Over 5
Educational outcomes
  Mean GPA score                     2.83     2.83     2.84     2.82     2.85     2.81
    (Wave 2)                       (0.75)   (0.77)   (0.73)   (0.76)   (0.76)   (0.74)
  Graduated high school              0.95     0.94     0.97     0.95     0.95     0.95
                                   (0.21)   (0.25)   (0.16)   (0.21)   (0.22)   (0.21)
  Years of education                14.46    14.26    14.71    14.48    14.51    14.40
                                   (2.17)   (2.21)   (2.09)   (2.21)   (2.21)   (2.12)
  Desire to attend college [a]       4.44     4.46     4.42     4.48     4.49     4.37
    (Wave 2)                       (1.03)   (1.02)   (1.04)   (0.99)   (0.98)   (1.08)
  Expectation of attending           4.22     4.20     4.24     4.24     4.25     4.18
    college (Wave 2)               (1.13)   (1.11)   (1.15)   (1.10)   (1.09)   (1.17)
  Attended college                   0.71     0.68     0.74     0.70     0.71     0.70
                                   (0.46)   (0.47)   (0.44)   (0.46)   (0.45)   (0.46)
Labour market outcomes
  Age at first full-time job        20.42    20.04    20.95    20.52    20.49    20.32
                                   (2.56)   (2.53)   (2.50)   (2.62)   (2.59)   (2.51)
  Number of jobs                     3.42     3.66     3.15     3.47     3.55     3.27
                                   (2.66)   (2.62)   (2.67)   (2.79)   (2.79)   (2.48)
  Income                            35088    31661    39095    33609    33617    36858
                                  (42198)  (37419)  (46864)  (45811)  (45462)  (37832)
  Hours worked per week             41.11    40.93    41.31    40.62    40.70    41.60
                                  (11.20)  (11.41)  (10.96)  (11.16)  (11.32)  (11.04)
  Do physical work                   0.57     0.61     0.54     0.55     0.56     0.58
                                   (0.49)   (0.49)   (0.50)   (0.50)   (0.50)   (0.49)
  Satisfied with job                 0.74     0.72     0.75     0.71     0.72     0.76
                                   (0.44)   (0.45)   (0.43)   (0.45)   (0.45)   (0.43)

[a] The desire to attend college and expectation of attending college are self-reported rankings from 1 to 5. These were obtained during Wave 2 of the study.
Table 4.3: First-stage results - peer effects in part-time work during high school
Dependent variable: part-time hours worked during high school

                                                   All    8-10th   11-12th
                                                grades    grades    grades
                                                   (1)       (2)       (3)
Hours worked by same-grade schoolmates         0.51***    0.31**   0.36***
  (in previous year)                            (0.08)    (0.12)    (0.12)
Individual characteristics
  Female                                      -0.98***    -0.56*  -1.81***
  Black                                       -1.84***     -1.34   -2.55**
  Hispanic                                        0.04      0.42     -0.54
  Asian                                          -0.31     -0.27      0.11
  Other                                           0.68      0.40      1.44
  Age (years and months)                       1.87***   2.22***    1.54**
  Not born in US                                 -0.42     -0.77     -0.17
Mother's education (Omitted: high school)
  Less than high school                          -0.44     -0.09     -1.00
  Some college                                   -0.22    -0.76*      0.55
  College                                       -0.81*     -0.46     -1.38
Father's education (Omitted: high school)
  Less than high school                           0.45     -0.22     1.74*
  Some college                                   -0.22     -0.33      0.05
  College                                      -1.08**    -0.98*     -1.06
Household income (Omitted: >$60k)
  <$20k                                           0.82      0.63      0.73
  $20k - $40k                                   1.01**      0.74      1.28
  $40k - $60k                                     0.55      0.36      0.65
Other controls [a]                                   x         x         x
Diagnostics
  F-statistic on excluded instrument             43.33      6.34      9.38
  Number of school-grade clusters                  506       287       219
Observations                                      8429      4570      3859

[a] Other controls include indicators describing household structure, grade repetition history, school year in progress and school in saturated sample, as well as school-grade characteristics and grade and school fixed effects. Robust standard errors clustered by school-grade in parentheses. *** p<0.01, ** p<0.05, * p<0.1.
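The key regressor in Table 4.3 is the leave-one-out mean that appears in equation (4.2): for each student, the average of the previous-year hours worked by the other students in the same school and grade. A minimal Python sketch of that construction follows. The record layout and the field names (school, grade, hours_lag) are hypothetical and chosen only for illustration; this is not the code used to produce the estimates in this paper, and the toy values are not Add Health data.

```python
from collections import defaultdict


def leave_one_out_peer_means(students):
    """Mean lagged hours of same-grade schoolmates, excluding the student.

    `students` is a list of dicts with the hypothetical keys 'school',
    'grade' and 'hours_lag' (weekly hours worked in the previous year).
    A student with no same-grade schoolmates gets None.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for s in students:
        cell = (s["school"], s["grade"])
        totals[cell] += s["hours_lag"]
        counts[cell] += 1

    peer_means = []
    for s in students:
        cell = (s["school"], s["grade"])
        n = counts[cell]
        if n < 2:
            peer_means.append(None)  # no peers to average over
        else:
            # Remove the student's own hours, then divide by n - 1.
            peer_means.append((totals[cell] - s["hours_lag"]) / (n - 1))
    return peer_means


# Invented example: three students in one school-grade cell, one alone.
toy = [
    {"school": "A", "grade": 10, "hours_lag": 0.0},
    {"school": "A", "grade": 10, "hours_lag": 6.0},
    {"school": "A", "grade": 10, "hours_lag": 12.0},
    {"school": "A", "grade": 11, "hours_lag": 20.0},
]
peer_means = leave_one_out_peer_means(toy)
print(peer_means)
```

Excluding the student's own hours means that the peer measure differs slightly across students within the same school-grade cell, which is why the sum in equation (4.2) runs over j not equal to i.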
Table 4.4: Balance tests - OLS results from regressing instrument on controls

Dependent variable: hours worked by same-grade schoolmates (previous period)

                                            (1)         (2)         (3)

Individual characteristics (Omitted: male, white, born in US)
Female                                      0.09        -0.05       -0.05
Black                                       -1.25***    -1.32***    -0.15*
Hispanic                                    -0.84***    -0.99***    -0.06
Asian                                       -1.73***    -1.96***    -0.17
Other                                       -0.49*      -0.52*      0.09
Age (years and months)                      2.07***     0.29***     0.04
Not born in US                              -0.23       -0.20       -0.28**

Mother's education (Omitted: high school)
Less than high school                       0.03        0.13        0.12
Some college                                -0.05       -0.17       -0.08
College                                     -0.28**     -0.31***    -0.01

Father's education (Omitted: high school)
Less than high school                       0.14        0.04        0.09
Some college                                0.23        0.17        0.21**
College                                     -0.09       -0.19       0.06

Household income (Omitted: >$60k)
<$20k                                       -0.23       -0.06       -0.07
$20k - $40k                                 -0.14       -0.11       -0.05
$40k - $60k                                 -0.07       -0.02       -0.10

Other controls^a                            x           x           x

Identifying controls
Grade fixed effects                                     x           x
School fixed effects                                                x

Diagnostics
F-statistic on non-identifying controls     22.04       3.55        1.17
p-value                                     0.00        0.00        0.26
Observations                                8429        8429        8429

^a Other controls include indicators describing household structure, grade repetition history, school year in progress and school in saturated sample, as well as school-grade characteristics. Robust standard errors clustered by school-grade in parentheses. *** p<0.01, ** p<0.05, * p<0.1.
Table 4.5: IV results - effect of part-time work during high school on educational outcomes

                                Mean          Graduated     Years         Attended
                                GPA           high          of            college
                                score         school        education
                                (1)           (2)           (3)           (4)

Hours worked                    -0.004        -0.001        -0.06**       -0.01**
(instrumented)                  (0.010)       (0.003)       (0.03)        (0.01)

Individual characteristics
Female                          0.19***       0.01          0.36***       0.06***
Black                           -0.14***      0.01          -0.12         0.02
Hispanic                        -0.21***      -0.04*        -0.24*        -0.03
Asian                           0.14**        0.01          0.18          0.01
Other                           -0.05         -0.03         -0.20         -0.01
Age (years and months)          -0.03         -0.04***      -0.20***      -0.04**
Not born in US                  0.08          0.02          0.34**        0.08**

Mother's education
Less than high school           -0.04         -0.05***      -0.25***      -0.07***
Some college                    0.10***       0.01          0.41***       0.08***
College                         0.15***       0.01          0.68***       0.10***

Father's education
Less than high school           -0.03         -0.03**       -0.18*        -0.05*
Some college                    0.14***       0.02*         0.42***       0.08***
College                         0.21***       0.01*         0.76***       0.11***

Household income
<$20k                           -0.11***      0.00          -0.61***      -0.09
$20k - $40k                     -0.11***      0.00          -0.41***      -0.05
$40k - $60k                     -0.02         0.01          -0.21**       -0.02

Other controls^a                x             x             x             x

Identifying controls
Grade fixed effects             x             x             x             x
School fixed effects            x             x             x             x

Observations                    8339          8429          8429          8429

^a Other controls include indicators describing household structure, grade repetition history, school year in progress and school in saturated sample, as well as school-grade characteristics. Robust standard errors clustered by school-grade in parentheses. *** p<0.01, ** p<0.05, * p<0.1.
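The estimates in Table 4.5 come from an instrumental variables design in which schoolmates' hours worked instrument for an individual's own hours. The following is a minimal sketch of just-identified two-stage least squares on simulated data, assuming a single endogenous regressor, a single instrument and no further controls. It is not the author's estimation code: the variable names, the data-generating process and every parameter value (including the "true" effect of -0.06) are illustrative assumptions, and the fixed effects and clustered standard errors used in the thesis are omitted.

```python
import numpy as np


def ols_slope(y, x):
    """Slope from an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


def two_stage_least_squares(y, x, z):
    """Just-identified 2SLS with an intercept: endogenous x, instrument z.

    Returns the first-stage slope and the second-stage slope.
    """
    n = len(y)
    Z = np.column_stack([np.ones(n), z])
    # First stage: project the endogenous regressor on the instrument.
    pi_hat, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_fitted = Z @ pi_hat
    # Second stage: regress the outcome on the fitted values.
    return pi_hat[1], ols_slope(y, x_fitted)


# Simulated data (all values hypothetical).
rng = np.random.default_rng(0)
n = 200_000
beta_true = -0.06                       # assumed effect of one hour of work
peer_hours = rng.normal(5.0, 2.0, n)    # instrument: schoolmates' hours
taste = rng.normal(0.0, 1.0, n)         # unobserved; raises hours, lowers schooling
own_hours = 1.0 + 0.5 * peer_hours + 2.0 * taste + rng.normal(0.0, 1.0, n)
years_educ = 14.0 + beta_true * own_hours - 0.5 * taste + rng.normal(0.0, 1.0, n)

first_stage, beta_iv = two_stage_least_squares(years_educ, own_hours, peer_hours)
beta_ols = ols_slope(years_educ, own_hours)

print(f"first stage: {first_stage:.3f}")
print(f"OLS: {beta_ols:.3f}, IV: {beta_iv:.3f}, assumed true effect: {beta_true}")
```

In this simulation the unobserved taste for work makes OLS overstate the negative effect, while the IV estimate stays close to the assumed value because the instrument is generated independently of the unobservable.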
Table 4.6: IV results - educational outcomes

                                    All grades      8-10th grades   11-12th grades

Mean GPA score (Wave 2)
All hours                           -0.004          -0.01           0.004
Hours worked<5                      -0.01           0.02            -0.09***
Hours worked≥5                      -0.003          -0.04           0.03

Graduated high school
All hours                           -0.001          0.002           0.002
Hours worked<5                      -0.003          0.010           -0.004
Hours worked≥5                      0.002           -0.011          0.006

Years of education
All hours                           -0.06**         -0.23**         0.03
Hours worked<5                      0.01            0.04            0.13
Hours worked≥5                      -0.11***        -0.44***        -0.04

Desire to attend college (Wave 2)
All hours                           -0.01           -0.11**         0.08
Hours worked<5                      0.01            -0.04           0.04
Hours worked≥5                      -0.04*          -0.24***        0.02

Expect to attend college (Wave 2)
All hours                           0.001           -0.12*          0.05
Hours worked<5                      0.02            -0.02           0.14**
Hours worked≥5                      -0.02           -0.20***        -0.04

Attended college
All hours                           -0.01**         -0.07**         0.01
Hours worked<5                      -0.002          -0.02           0.03**
Hours worked≥5                      -0.02**         -0.11***        -0.003

Robust standard errors clustered by school-grade in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table 4.7: IV results - labour market outcomes

                                    All grades      8-10th grades   11-12th grades

Age at first full-time job
All hours                           -0.08*          -0.38**         -0.02
Hours worked<5                      -0.05           -0.13           -0.07
Hours worked≥5                      -0.11**         -0.23*          -0.003

Log(number of jobs)
All hours                           -0.002          0.01            0.04***
Hours worked<5                      0.01            0.04            0.05*
Hours worked≥5                      -0.01           -0.06*          0.04*

Log(income)
All hours                           0.01            -0.01           0.09*
Hours worked<5                      0.004           -0.08*          0.10*
Hours worked≥5                      0.01            0.09            0.10***

Log(hours worked per week)
All hours                           0.009**         0.01            0.03**
Hours worked<5                      -0.003          -0.01           -0.01
Hours worked≥5                      0.02***         0.04**          0.05***

Do light or hard physical work
All hours                           -0.01**         -0.03           -0.02
Hours worked<5                      -0.01           -0.02           0.02
Hours worked≥5                      -0.01           0.01            -0.04***

Satisfied with job
All hours                           0.002           -0.003          0.004
Hours worked<5                      -0.01           -0.01           0.00
Hours worked≥5                      0.01            0.03            0.01

Robust standard errors clustered by school-grade in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Appendix A

The Effect of Peer Gender Composition on High School Achievement

The appendix describes results from a variety of robustness and sensitivity checks. It also provides an econometric framework from which the estimated parameters can be given a cumulative effect interpretation. Appendix Table A.1 describes the composition of the sample, and Appendix Table A.2 provides a balance test suggesting orthogonality of the instrument. These are discussed in the main body of the paper. The remaining appendix discussion is split into four subsections.
The first subsection discusses the sensitivity and robustness of the instrument, the second subsection outlines the potential for non-classical measurement error and explains how the empirical strategy overcomes it, the third subsection shows that the first stage relationship is not driven by spatial outliers, and the fourth subsection derives the cumulative effect interpretation.

A.1 Sensitivity of Results to Instrument Specification

The instrument is constructed using the weighted gender composition of the nearest twenty schoolmates. Results in Appendix Table A.3 show that an unweighted version of this instrument, as well as weighted and unweighted instruments based on the gender composition of schoolmates within 2km, generate similar findings. The estimates using the distance-based measure are less precise but of similar magnitude to those reported in Table 2.5.

Appendix Table A.4 considers the sensitivity of results both to the density of friendship networks from which the gender composition measures are derived and to the chosen definition of friendship. The first set of results splits the sample according to the number of friend nominations asked of surveyed individuals and restricts the sample to individuals with at least two friends. The second set of results repeats the primary analysis using different definitions of friendship. These results address concerns related to differences between true and observed friendship networks, and the friendship definition on which these networks are based.

Individuals were asked to nominate either one friend or five friends. The study was designed so that all individuals in the same school nominated the same number of friends. The gender composition measures derived from friendship networks based on five friend nominations are likely to be measured with less error than those based on single nominations. Appendix Table A.4 shows that potential biases introduced by this aspect of the design do not affect the initial findings.
Results are consistent with those originally reported, although they are measured imprecisely due to the smaller sample sizes.

The friendship network gender compositions of individuals matched to only one friend are extreme; observed opposite gender friend shares are either zero or one. To show that these individuals do not drive the result, the analysis is performed on a sample restricted to individuals with at least two friends. The point estimate of interest is very similar in this specification. The results are also similar when the analysis is performed on the restricted sample of individuals for whom at least seventy-five percent of friendship nominations were matched. The gender composition of friendship networks for these individuals is likely measured with less error, explaining the maintained precision of the estimates despite the smaller sample.

The nominating process discussed in the data section allows for different definitions of friendships. The preferred definition of friendship for this paper considers any friendship nomination to form a friendship. This is because the identification strategy relies on neighbours affecting outcomes only through the friendship network, and the weakest definition of the friendship network is most likely to satisfy this exclusion restriction. Two alternative friendship network definitions based on the nomination process are directional: nominated and nominating friendship networks. These definitions only consider either sent or received nominations to form friendships, respectively. A fourth definition of the friendship network is the strong friendship network discussed in the body of the paper in which only reciprocated nominations form friendships.

Estimates in Appendix Table A.4 are similar to those reported in the paper for weak friendship networks, although they are less precisely measured due to the smaller sample sizes.
(The sample sizes are smaller because the stronger definitions result in greater exclusion from the sample. Recall that individuals are excluded from the sample if they are not matched to any friends as the gender composition of friendship networks is not well-defined for these individuals.)

The urbanicity of the community in which the school is located may affect the results. This is both because the dependence of achievement on peer gender composition may vary between urban, suburban and rural schools, and because the first stage relationship between distance and friendship may have a different structure across these types of communities. Results in Appendix Table A.5 show that the first stage is weak in urban schools (the first column), so the instrument cannot inform our understanding of the effect in these communities. However, restricting the sample to suburban and rural schools (the fourth column) shows the negative effect of opposite gender friends on achievement.

The validity of the IV strategy is tested in two ways in Appendix Table A.6. First, it considers an experiment similar to randomly reassigning the gender of schoolmates and showing that the reassigned gender composition of close neighbours does not affect the original gender composition of friendship networks. Second, it confirms the first stage relationship for a composition measure for which randomness is even less contestable than gender.

These tests are performed by introducing another composition measure: share even birth month. Consider an individual with a set of neighbours in the data. Now consider an experiment in which the gender of neighbours is reassigned so that neighbours with an even birth month are of the opposite gender. The share of reassigned opposite gender close neighbours (equivalent to the share of even birth month close neighbours) should not be correlated with the share of (true) opposite gender friends.
The first column of Appendix Table A.6 reports results from regressing the share of opposite gender friends on the share of even birth month close neighbours and shows that they are uncorrelated.

The share of even birth month friends should have no effect on academic achievement, but, given that individuals are more likely to be friends with schoolmates living in the close neighbourhood, the share of even birth month close neighbours should affect an individual's share of even birth month friends. The second column of Appendix Table A.6 confirms the presence of this relationship. Using this as the first stage of an (unnecessary) instrumental variables strategy, the share of even birth month friends is shown to have no effect on school performance (as expected). This table supports the validity of the first stage in two ways: the placebo test finds no spurious relationship, and there is no plausible alternative explanation for the relationship between the shares of even birth month neighbours and friends.

A.2 Non-classical Measurement Error Arising from Self-reporting Bias

The gender composition of an individual's friendship network is derived from self-nominated friends, and the measure of academic achievement is self-reported GPA. These data may suffer from self-reporting bias. This section outlines the potential for such biases, and shows that an instrumental variables strategy deals with these concerns under the assumption that the instrument is orthogonal to self-reporting bias in the outcome and explanatory variable. This result is somewhat obvious, but is useful for understanding the direction of the potential bias under various assumptions about the self-reporting bias and its correlation with other variables in the model.

Consider a simple model in which the true value of an outcome $y^*$ is a linear function of the true value of an explanatory variable $x^*$.

\[
y^* = x^* \beta + e \tag{A.1}
\]

Allow for some form of endogeneity, so $\operatorname{cov}(x^*, e) \neq 0$.

We can write the measured values of the outcome and explanatory variables $y$ and $x$ as the sum of the true values and a measurement error term. These error terms are not random, allowing for some form of systematic self-reporting bias.

\[
y = y^* + u_y \tag{A.2}
\]

\[
x = x^* + u_x \tag{A.3}
\]

We can substitute Equations A.2 and A.3 into the true model in Equation A.1. Because $x^* = x - u_x$, the measurement error in $x$ enters the composite error term with coefficient $-\beta$.

\[
y = x \beta + (e + u_y - u_x \beta) \tag{A.4}
\]

The error term in Equation A.4 is clearly correlated with the explanatory variable. The correlation between $x$ and $e$ follows from the endogeneity, and there is a mechanical correlation between $x$ and $u_x$ from Equation A.3.

We can investigate the consistency of an OLS estimate $\hat{\beta}$ more formally.

\[
\begin{aligned}
\operatorname{plim} \hat{\beta} &= \frac{\operatorname{cov}(y, x)}{\operatorname{var}(x)} \\
&= \frac{\operatorname{cov}(y^* + u_y,\; x^* + u_x)}{\operatorname{var}(x^* + u_x)} \\
&= \frac{\operatorname{cov}(x^* \beta + e + u_y,\; x^* + u_x)}{\operatorname{var}(x^* + u_x)} \\
&= \frac{1}{\operatorname{var}(x^* + u_x)} \bigl\{ \beta \operatorname{var}(x^*) + \beta \operatorname{cov}(x^*, u_x) + \operatorname{cov}(x^*, e) \\
&\qquad\quad + \operatorname{cov}(u_x, e) + \operatorname{cov}(x^*, u_y) + \operatorname{cov}(u_x, u_y) \bigr\}
\end{aligned}
\tag{A.5}
\]

Here $\operatorname{var}(x^* + u_x) = \operatorname{var}(x^*) + \operatorname{var}(u_x) + 2 \operatorname{cov}(x^*, u_x)$. This equation breaks down the potential biases into six components (the terms of the sum inside the braces). The first and third components are the familiar attenuation and endogeneity biases. The remaining components are best understood in terms of the example of this paper. Consider the outcome to be GPA and the explanatory variable to be peer gender composition.

The second component $\operatorname{cov}(x^*, u_x)$ is the bias introduced by correlation between true peer gender composition and self-reporting bias in peer gender composition. For example, individuals with few opposite gender friends may over-report opposite gender friendships, such that $\operatorname{cov}(x^*, u_x) < 0$. This would either bias the estimate towards zero or change the sign of the estimate depending on the relative magnitude of the attenuation bias.

The fourth component $\operatorname{cov}(u_x, e)$ relates to correlation between unobserved determinants of GPA and self-reporting bias in peer gender composition.
This may involve some personality trait such as overconfidence. Overconfident individuals may both perform poorly academically and over-report opposite gender friendships, for example. This would also bias the point estimate downwards as $\operatorname{cov}(u_x, e) < 0$.

The fifth component $\operatorname{cov}(x^*, u_y)$ is bias introduced by the correlation between true peer gender composition and self-reporting bias in GPA. For example, males with a larger share of female friends may systematically over-report GPA if female friends are of higher ability (on average), and individuals have a propensity toward reporting the mean GPA of their friendship networks.

Finally, the sixth component $\operatorname{cov}(u_x, u_y)$ relates to correlation between self-reporting bias in peer gender composition and self-reporting bias in GPA. This correlation would be generated by a world in which some people consistently tell the truth and others consistently distort the truth. For example, some individuals may systematically exaggerate all self-reported data in the direction that is perceived to be more socially favourable.

The above analysis highlights the potential concerns of using self-reported data. The subsequent analysis shows that using an instrumental variables strategy deals with these concerns under some assumptions about the instrument. (Of course it is already well-known that instruments deal with the attenuation and endogeneity biases.)

Consider an instrument $z^*$ for the explanatory variable, such that $\operatorname{cov}(z^*, e) = 0$. It is assumed to be of the same scale for ease of exposition.

\[
z^* = x^* + \nu \tag{A.6}
\]

Now consider the covariance between the measured outcome variable $y$ and the instrument $z^*$.

\[
\begin{aligned}
\operatorname{cov}(y, z^*) &= \operatorname{cov}(y^* + u_y,\; z^*) \\
&= \operatorname{cov}(x^* \beta + e + u_y,\; z^*) \\
&= \operatorname{cov}(x \beta + e + u_y - u_x \beta,\; z^*) \\
&= \beta \operatorname{cov}(x, z^*) + \operatorname{cov}(z^*, e) + \operatorname{cov}(z^*, u_y) - \beta \operatorname{cov}(z^*, u_x) \\
&= \beta \operatorname{cov}(x, z^*) + \operatorname{cov}(z^*, u_y) - \beta \operatorname{cov}(z^*, u_x)
\end{aligned}
\tag{A.7}
\]

In order for $\beta = \operatorname{cov}(y, z^*) / \operatorname{cov}(x, z^*)$, the familiar exactly-identified univariate IV result, it is required that $\operatorname{cov}(z^*, u_y) = 0$ and $\operatorname{cov}(z^*, u_x) = 0$.
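The conditions on the instrument can be illustrated numerically. The sketch below is a simulation under assumed parameter values, not part of the original analysis: it generates data satisfying Equations A.1 to A.3 with correlated self-reporting errors, then compares the OLS ratio of Equation A.5 with the IV ratio implied by Equation A.7. All variable names, the value of `beta`, and the loadings on the hypothetical traits `w` and `exag` are illustrative choices.

```python
import numpy as np


def cov(a, b):
    """Sample covariance between two one-dimensional arrays."""
    return np.cov(a, b)[0, 1]


rng = np.random.default_rng(1)
n = 500_000
beta = -0.5                          # assumed true effect

z = rng.normal(size=n)               # instrument z*, independent of all errors
w = rng.normal(size=n)               # endogenous part of x*
exag = rng.normal(size=n)            # tendency to exaggerate all self-reports

x_true = z + w                       # x* (so z* = x* + nu with nu = -w)
e = 0.8 * w + rng.normal(size=n)     # cov(x*, e) != 0: endogeneity
y_true = beta * x_true + e           # Equation A.1

# Self-reporting errors: u_x is correlated with x* and e (through w),
# and with u_y (through exag). Neither depends on the instrument.
u_x = -0.3 * w + 0.5 * exag + 0.5 * rng.normal(size=n)
u_y = 0.5 * exag + 0.5 * rng.normal(size=n)
x = x_true + u_x                     # Equation A.3
y = y_true + u_y                     # Equation A.2

beta_ols = cov(y, x) / np.var(x, ddof=1)
beta_iv = cov(y, z) / cov(x, z)

# Violation: reporting error in x now loads on the instrument,
# so cov(z*, u_x) = 0.3 var(z*) and the IV ratio no longer recovers beta.
x_bad = x + 0.3 * z
beta_iv_bad = cov(y, z) / cov(x_bad, z)

print(f"true beta: {beta}")
print(f"OLS: {beta_ols:.3f}, IV: {beta_iv:.3f}, IV with violated condition: {beta_iv_bad:.3f}")
```

With these particular choices the OLS ratio converges to roughly -0.02, the IV ratio to the assumed -0.5, and the IV ratio with a violated condition to -0.5/1.3, which is what Equation A.7 predicts when the covariance between the instrument and the reporting error in the explanatory variable is 0.3 times the instrument's variance.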
In other words, the self-reporting biases in the outcome and explanatory variables need to be uncorrelated with the instrument.

In terms of this paper, self-reporting biases in GPA and peer gender composition need to be uncorrelated with neighbourhood gender composition. The only real concern is that neighbourhood gender composition may affect self-reporting bias in peer gender composition. For example, an individual with a large share of opposite gender close neighbours may over-report opposite gender friends.

Appendix Table A.7 reports results from regressing constructed proxies for measurement error in peer gender composition and GPA on the instrument, a female indicator, and self-reported GPA. Proxies for measurement error are constructed by differencing the observed measure from another measure that does not suffer from potential self-reporting biases. The gender composition measurement error proxy is the difference between the weak friendship network (in which all nominations generate friendships, so susceptible to self-reporting biases) and strong friendship network (reciprocated nominations generate friendships, so less susceptible to self-reporting bias) gender composition measures, and the achievement measurement error proxy is the difference between self-reported and transcript GPA scores (for the subsample for which transcript GPA scores are available). These results suggest that the instrument is uncorrelated with measurement error in self-reported GPA and peer gender composition, providing support for the empirical strategy.

A.3 Boundary Concerns

One remaining concern is that distance to the community origin may generate the first stage relationship between the gender composition of close neighbours and the gender composition of friends. Consider a world in which gender is spatially uniformly distributed throughout the school community and individuals are only friends with their neighbours.
Individuals close to the community origin will have an equal share of own and opposite gender neighbours, and therefore an equal share of own and opposite gender friends. Individuals at the community boundary, though, may have only one neighbour, and therefore only one friend.

This type of community organization and friendship formation would generate the observed first stage, but variation would be driven completely by individuals at the community boundary. These individuals are likely to differ systematically from individuals at the community origin, and therefore reduce the generalizability of the estimated local effect.

Appendix Table A.8 shows that this is not a concern in the data. This table splits the sample into three groups according to the distance between individuals and the community origin (defined as the mean X- and Y-coordinates in a school community). The first stage is strongest for those in the middle third (in terms of distance to the community origin) and not the furthest third, showing that the effect is not driven by individuals at the boundary. Interestingly, the effect is weak for individuals closest to the community origin. This is consistent with the idea that the increased density of schoolmates in the neighbourhood close to the community origin may result in increased opportunities for friendship formation. This reduces the relationship between distance and friendship as individuals are able to choose among their close neighbours for matches that make better friends.

A.4 Cumulative Effect Interpretation

The discussion in the main body of the paper provides a contemporaneous interpretation of the friendship network gender composition effect.
This section recognizes that peer gender composition affects education production in every period, allows education production to depend on past production,[72] and, given persistence in friendship networks, shows that the parameter estimated in the paper can be interpreted as the cumulative effect of opposite gender friends. It subsequently provides a set of assumptions that allow separation of the cumulative and contemporaneous effects.

[72] Hanushek (2003) and Todd and Wolpin (2003) discuss different formulations of education production functions, particularly the different assumptions underlying level and value-added specifications.

Consider the simplified education production of individual \(i\) at age \(t\) to be a function of individual characteristics \(X\) and the share of opposite gender friends \(O\):

\[ Y_{it} = \beta_t X_{it} + \gamma_t O_{it} + \varepsilon_{it} \tag{A.8} \]

In this formulation, the age-varying parameters \(\beta_t\) and \(\gamma_t\) can be interpreted as the cumulative effects of individual characteristics and opposite gender friends on achievement (up to age \(t\)). We can explicitly include lagged achievement to give these parameters a contemporaneous (or value-added) interpretation (dropping the \(i\) subscript for clarity).

\[ Y_t = \beta_t X_t + \gamma_t O_t + \lambda_t Y_{t-1} + e_t \tag{A.9} \]

Education production at age \(t\) can then be expressed as a function of the history of individual characteristics \(\{X_t, X_{t-1}, \ldots, X_1\}\), opposite gender friend shares \(\{O_t, O_{t-1}, \ldots, O_1\}\), initial ability \(Y_0\), and the history of production shocks \(\{e_t, e_{t-1}, \ldots, e_1\}\):

\[
\begin{aligned}
Y_t ={}& \beta_t X_t + \sum_{j=1}^{t-1}\beta_{t-j}\prod_{k=1}^{j}\lambda_{t+1-k}\,X_{t-j} + \gamma_t O_t + \sum_{j=1}^{t-1}\gamma_{t-j}\prod_{k=1}^{j}\lambda_{t+1-k}\,O_{t-j} \\
&+ \prod_{k=1}^{t}\lambda_{t+1-k}\,Y_0 + e_t + \sum_{j=1}^{t-1}\prod_{k=1}^{j}\lambda_{t+1-k}\,e_{t-j}
\end{aligned}
\tag{A.10}
\]

We now make two simplifying assumptions.

A1. \(X_u = X_{u-1}\) for all \(u\).

A2. The share of opposite gender friends evolves according to the following process:

\[
\begin{aligned}
O_1 &= \delta_0 Z + u_1 \\
O_2 &= \delta_2 O_1 + u_2 \\
&\;\;\vdots \\
O_t &= \delta_t O_{t-1} + u_t
\end{aligned}
\tag{A.11}
\]

The first assumption fixes individual characteristics as an individual ages.
This assumption is not restrictive for characteristics such as gender, race and immigrant status. It may have some bite for characteristics such as parent education and household income that potentially vary for some school-going individuals over time.

The second assumption describes a simple evolution for the share of opposite gender friends. The initial share of opposite gender friends \(O_1\) is a linear function of the share of opposite gender friends in the close neighbourhood \(Z\) with all other initial determinants of friendship composition in the error term \(u_1\).[73] Opposite gender friend shares then follow an AR(1) process where friendship network gender composition depends on lagged friendship network gender composition and an additive shock (which is not necessarily orthogonal to other components of the model). This is reasonable given persistence in friendship networks as an individual ages (networks do not reset every period). The model does not allow families to relocate.

[73] The effect of individual characteristics on friendship network gender composition is omitted at this stage for tractability; it would not change the econometric analysis if included. It is included when the model is taken to the data.

These assumptions allow us to express education production as a function of individual characteristics \(X_t\), current share of opposite gender friends \(O_t\), initial ability \(Y_0\), the history of shocks to production \(\{e_t, e_{t-1}, \ldots, e_1\}\), and the history of shocks to friendship network gender composition that affect current achievement through their effect on past achievement \(\{u_t, u_{t-1}, \ldots, u_1\}\).

\[
\begin{aligned}
Y_t ={}& \Big(\beta_t + \sum_{j=1}^{t-1}\beta_{t-j}\prod_{k=1}^{j}\lambda_{t+1-k}\Big)X_t + \Big(\gamma_t + \sum_{j=1}^{t-1}\gamma_{t-j}\prod_{k=1}^{j}\frac{\lambda_{t+1-k}}{\delta_{t+1-k}}\Big)O_t \\
&+ \prod_{k=1}^{t}\lambda_{t+1-k}\,Y_0 + e_t + \sum_{j=1}^{t-1}\prod_{k=1}^{j}\lambda_{t+1-k}\,e_{t-j} \\
&- \sum_{j=1}^{t-1}\gamma_{t-j}\prod_{k=1}^{j}\lambda_{t+1-k}\sum_{l=1}^{j}\Big(\prod_{m=l}^{j}\delta_{t+1-m}\Big)^{-1}u_{t+1-l}
\end{aligned}
\tag{A.12}
\]

Along with the direct effects of characteristics and peer gender composition operating through \(\beta_t\) and \(\gamma_t\), the additional terms in the respective coefficients describe the indirect effects operating through past achievement. Under this framework, consistent estimates of the coefficients on \(X_t\) and \(O_t\) should therefore be interpreted as the cumulative effects of individual characteristics and the share of opposite gender friends on achievement.

The initial endogeneity problem in estimating the effect of opposite gender friends arose because of the correlation between friendship network gender composition and the contemporaneous unobservable determinants of education production. This formulation suggests additional concern due to potential correlation with both the history of education production shocks and the past shocks to friendship network composition that affect current achievement through past achievement.

The neighbourhood gender composition instrument, however, remains correlated with the current share of opposite gender friends and orthogonal to all components of the error term (initial ability and the shocks), allowing identification of the cumulative effect of opposite gender friends.
The orthogonality with initial ability and unobservable determinants of achievement follows from the same arguments as provided in the initial exposition. Orthogonality with the shocks to friendship network gender composition is a consequence of the assumption that the only direct effect of close neighbourhood gender composition on friendship network gender composition is in the initial period.

The correlation between current share of opposite gender friends and share of opposite gender close neighbours is evident in the expression below for \(O_t\), which is essentially the first stage for using \(Z\) as an instrument for \(O_t\).

\[
\begin{aligned}
O_t &= \delta_t O_{t-1} + u_t \\
&= \prod_{j=1}^{t-1}\delta_{j+1}\,\delta_0 Z + \sum_{j=1}^{t-1}\prod_{k=1}^{j}\delta_{t+1-k}\,u_{t-j} + u_t
\end{aligned}
\tag{A.13}
\]

The predicted friendship network gender composition \(\hat{O}_t = \prod_{j=1}^{t-1}\hat{\delta}_{j+1}\,\hat{\delta}_0 Z\) identifies the cumulative effect of opposite gender friends in Equation A.12.

The following two equations define the reduced form first and second stage parameters for an individual of age \(t\). The dependence of the share of opposite gender friends on characteristics \(X\) is made explicit.

\[ O_t = \psi_t X_t + \pi_t Z + v_t \tag{A.14} \]

\[ Y_t = \phi_t X_t + \theta_t O_t + \xi_t \tag{A.15} \]

The reduced form parameters of interest can be expressed in terms of the underlying parameters.

\[ \pi_t = \prod_{j=1}^{t-1}\delta_{j+1}\,\delta_0 \tag{A.16} \]

\[ \theta_t = \Big(\gamma_t + \sum_{j=1}^{t-1}\gamma_{t-j}\prod_{k=1}^{j}\frac{\lambda_{t+1-k}}{\delta_{t+1-k}}\Big) \tag{A.17} \]

The time-varying underlying parameters are not identified even with data for individuals of different ages. Additional assumptions separate the contemporaneous and cumulative effects of opposite gender friends.

A3. \(\beta_u = \beta_{u-1}\), \(\gamma_u = \gamma_{u-1}\), \(\lambda_u = \lambda_{u-1}\) and \(\delta_u = \delta_{u-1}\) for all \(u\).

A4. \(\lambda \neq 1\).

A5. \(\lambda/\delta \neq 1\).

The third assumption imposes constancy in the parameters over all ages in both the education production function described by Equation A.12 and the friendship network gender composition process described by Equation A.13.[74] Assumption A3 identifies the underlying parameters if we observe individuals of different ages.
The fourth and fifth assumptions simply allow us to use the formula for summation of a geometric sequence. These are easily relaxed, and an alternative derivation is provided below for when A5 does not hold.

Given these assumptions, education production for an individual of high school age \(t\) is given by:

\[
\begin{aligned}
Y_t ={}& \beta\,\frac{1-\lambda^{t}}{1-\lambda}\,X_t + \gamma\,\frac{1-(\lambda/\delta)^{t}}{1-(\lambda/\delta)}\,O_t + \lambda^{t}Y_0 + e_t + \sum_{j=1}^{t-1}\lambda^{j}e_{t-j} \\
&- \gamma\sum_{j=1}^{t-1}\lambda^{j}\sum_{l=1}^{j}\delta^{\,l-j-1}u_{t+1-l}
\end{aligned}
\tag{A.18}
\]

The simplified first stage (omitting dependence on \(X\) as before) is given by

\[ O_t = \delta^{\,t-1}\delta_0 Z + \sum_{j=1}^{t-1}\delta^{\,j}u_{t-j} + u_t \tag{A.19} \]

[74] This may be less restrictive if we limit the model to describe individuals of middle and high school ages where initial ability is that accumulated by the beginning of middle school and middle school initiates a new friendship gender composition process.

The parameters of interest are identified, up to the choice of the age index \(a\), from the first and second stage estimates for two adjacent age groups.[75] Table 2.6 reports estimates for those above and below the age of sixteen, providing the necessary information. Finer sample splits provide overidentifying restrictions, but are costly in terms of statistical precision given the small samples.

The following four equations determine the contemporaneous effect of opposite gender friends \(\gamma\), the effect of lagged achievement \(\lambda\), the friendship network gender composition process \(\delta\), and the correlation between initial friendship network gender composition and close neighbourhood gender composition \(\delta_0\). The standard errors of these parameters need to be bootstrapped given that \(\lambda\) is the solution to a higher order polynomial for which an analytical solution may not exist.

\[ \delta = \frac{\pi_{a+1}}{\pi_a} \tag{A.20} \]

\[ \delta_0 = \frac{\pi_a^{\,a}}{\pi_{a+1}^{\,a-1}} \tag{A.21} \]

\[ \theta_a\Big(\frac{\lambda}{\delta}\Big)^{a+1} - \theta_{a+1}\Big(\frac{\lambda}{\delta}\Big)^{a} + (\theta_{a+1} - \theta_a) = 0 \tag{A.22} \]

\[ \gamma = \theta_a\,\frac{1-(\lambda/\delta)}{1-(\lambda/\delta)^{a}} \tag{A.23} \]

\(\lambda\) and \(\delta\) describe AR(1) processes for education production and friendship network gender composition.
It is not unreasonable to consider the case where these parameters are equal, violating A5 (the assumption that allowed us to express the coefficient as the summation of a geometric sequence). An alternative assumption results in the following formulation (where \(\delta\) and \(\delta_0\) are defined as before).

[75] The age index \(a\) is a free parameter. For example, choosing \(a = 1\) assumes that the modelled education production process begins at high school. Age increments are described by the integers, but need not correspond to the same period of time over the education process. For example, two years at high school could correspond to one year at primary school. This provides flexibility in the choice of \(a\), so estimates for a wide range of possible values are reported.

A5'. \(\lambda/\delta = 1\)

\[
Y_t = \beta\,\frac{1-\lambda^{t}}{1-\lambda}\,X_t + \gamma\,t\,O_t + \lambda^{t}Y_0 + e_t + \sum_{j=1}^{t-1}\lambda^{j}e_{t-j} - \gamma\sum_{j=1}^{t-1}\lambda^{j}\sum_{l=1}^{j}\delta^{\,l-j-1}u_{t+1-l}
\tag{A.24}
\]

\[ \gamma = \frac{\theta_a}{a} \tag{A.25} \]

\[ \lambda = \delta \tag{A.26} \]

\[ a = \frac{\theta_a}{\theta_{a+1} - \theta_a} \tag{A.27} \]

This formulation restricts the index \(a\), which is useful given that the contemporaneous effect of friendship network gender composition \(\gamma\) would otherwise only be identified up to a multiplicative constant.

Appendix Table A.9 reports estimates of these parameters under the different sets of assumptions: the first six columns under A5 and assumptions on the age index, and the seventh column under A5'. The friendship network gender composition process \(\delta\) is stable and indicates that two-thirds of the share of opposite gender friends persists as an individual ages. It is precisely estimated because it is the ratio of precise first stage estimates \(\pi_{a+1}/\pi_a\). The dependence of initial friendship network gender composition on the gender composition of the close neighbourhood is only precisely estimated if it is assumed that the friendship process begins at high school (\(a = 1\)). It gets larger as the gender composition process is assumed to begin at earlier ages (which is mechanical given the AR(1) friendship process), but is less precisely estimated.
This is because it is the ratio of exponential functions of imprecise first stage estimates. Similarly, the correlation between current and lagged GPA \(\lambda\) gets larger as GPA accumulation is assumed to begin at earlier ages. Finally, the contemporaneous effect of the share of opposite gender friends on achievement \(\gamma\) remains relatively precise over assumptions on the age index, confirming a negative effect. Under assumption A5', in which the age index is determined by the model, high school is estimated to begin at an age index \(a = 3\). The associated contemporaneous effect of opposite gender friends is negative, but imprecisely estimated.

This section has provided a cumulative interpretation of the effects of friendship network gender composition. Without assumption A3, the contemporaneous and cumulative effects cannot be separated. Empirical results in the paper can therefore be interpreted in two ways. First, this assumption can be discarded, and the original estimated parameter may be interpreted as the cumulative effect of exogenous variation in the share of opposite gender friends induced by an initial dependence of friendship composition on the gender composition of the close neighbourhood. Second, this assumption can be adopted, and data for individuals of different ages identify the parameters that describe the evolution of education production and friendship network gender composition. Furthermore, this allows us to separate the contemporaneous effect of opposite gender friends from the cumulative effect operating through past production. The estimated parameters are consistent with opposite gender friends negatively affecting high school performance under both interpretations.
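The identification argument of this section can be sketched numerically. The Python functions below take first stage and second stage coefficients for two adjacent age groups, together with an assumed integer age index, and back out the persistence of friendship composition, the initial dependence on neighbourhood composition, the lagged achievement effect and the contemporaneous effect, following the logic of Equations A.20 to A.23 and A.25 to A.27. The function names, argument names and numbers are hypothetical. The bisection step is one assumed way of solving the polynomial in Equation A.22 and is not necessarily how the estimates in Appendix Table A.9 were computed.

```python
def geometric_ratio(r, a):
    """(1 + r + ... + r**a) / (1 + r + ... + r**(a - 1)), well defined at r = 1."""
    numerator = sum(r ** j for j in range(a + 1))
    denominator = sum(r ** j for j in range(a))
    return numerator / denominator


def reduced_form(persistence, initial_dependence, lagged, contemporaneous, t):
    """First and second stage coefficients implied by constant parameters at age t."""
    first_stage = persistence ** (t - 1) * initial_dependence
    r = lagged / persistence
    second_stage = contemporaneous * sum(r ** j for j in range(t))
    return first_stage, second_stage


def recover_parameters(first_a, first_a1, second_a, second_a1, a):
    """Back out the underlying parameters for a given integer age index a >= 1."""
    if a < 1:
        raise ValueError("age index must be a positive integer")
    persistence = first_a1 / first_a                     # ratio of first stages
    initial_dependence = first_a ** a / first_a1 ** (a - 1)
    target = second_a1 / second_a
    if target <= 1.0:
        raise ValueError("second stage ratio must exceed one for a positive root")
    # Solve geometric_ratio(r, a) = target for r > 0 by bisection. Working with
    # the ratio of sums avoids the trivial root r = 1 of the polynomial form.
    low, high = 0.0, 1.0
    while geometric_ratio(high, a) < target:
        high *= 2.0
    for _ in range(200):
        mid = 0.5 * (low + high)
        if geometric_ratio(mid, a) < target:
            low = mid
        else:
            high = mid
    r = 0.5 * (low + high)
    return {
        "persistence": persistence,
        "initial_dependence": initial_dependence,
        "lagged_achievement": r * persistence,
        "contemporaneous": second_a / sum(r ** j for j in range(a)),
    }


def recover_parameters_equal_case(first_a, first_a1, second_a, second_a1):
    """Variant in which lagged achievement and friendship persistence are equal,
    so the age index is pinned down by the second stage coefficients."""
    persistence = first_a1 / first_a
    age_index = second_a / (second_a1 - second_a)
    return {
        "persistence": persistence,
        "lagged_achievement": persistence,
        "age_index": age_index,
        "contemporaneous": second_a / age_index,
    }


if __name__ == "__main__":
    # Hypothetical parameters: generate reduced-form coefficients, then recover them.
    f2, s2 = reduced_form(0.67, 0.14, 0.40, -1.0, 2)
    f3, s3 = reduced_form(0.67, 0.14, 0.40, -1.0, 3)
    print(recover_parameters(f2, f3, s2, s3, a=2))
```

The round trip in the example is only a consistency check on the algebra: coefficients generated from known parameters should return those parameters. It says nothing about sampling variability, which the text handles by bootstrapping.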
A.5 Appendix Tables

Table A.1: Descriptive statistics: controls

| Mean | All | Males | Females |
|---|---|---|---|
| Core demographics | | | |
| White | 0.52 | 0.52 | 0.52 |
| Black | 0.20 | 0.20 | 0.20 |
| Hispanic | 0.16 | 0.16 | 0.16 |
| Asian | 0.09 | 0.10 | 0.09 |
| Other | 0.03 | 0.03 | 0.03 |
| Not born in US | 0.09 | 0.09 | 0.09 |
| Age (years and months) | 16.16 | 16.24 | 16.07 |
| Home language | | | |
| English spoken at home | 0.89 | 0.89 | 0.88 |
| Spanish spoken at home | 0.08 | 0.08 | 0.08 |
| Mother education | | | |
| Mother did not graduate high school | 0.18 | 0.17 | 0.19 |
| Mother graduated high school | 0.32 | 0.34 | 0.31 |
| Mother attended some college | 0.18 | 0.17 | 0.19 |
| Mother graduated college | 0.25 | 0.25 | 0.25 |
| Father education | | | |
| Father did not graduate high school | 0.14 | 0.14 | 0.14 |
| Father graduated high school | 0.23 | 0.24 | 0.22 |
| Father attended some college | 0.13 | 0.14 | 0.13 |
| Father graduated college | 0.22 | 0.22 | 0.21 |
| Parent characteristics | | | |
| Interviewed parent not born in US | 0.15 | 0.16 | 0.15 |
| Not receiving public assistance | 0.07 | 0.05 | 0.08 |
| Receiving public assistance | 0.78 | 0.80 | 0.77 |
| Household income | | | |
| Household income: <$20k | 0.14 | 0.14 | 0.15 |
| Household income: $20k-$40k | 0.21 | 0.22 | 0.21 |
| Household income: $40k-$60k | 0.19 | 0.19 | 0.18 |
| Household income: >$60k | 0.19 | 0.19 | 0.19 |
| Household structure | | | |
| Mother in household | 0.91 | 0.91 | 0.91 |
| Father in household | 0.71 | 0.73 | 0.69 |
| Biological mother in household | 0.86 | 0.86 | 0.86 |
| Biological father in household | 0.60 | 0.62 | 0.58 |
| Grade repetition | | | |
| Has repeated at least one grade | 0.19 | 0.24 | 0.15 |
| Observations | 8,435 | 4,124 | 4,311 |

Categories for missing such that shares sum to one not reported but included in all analyses.
Table A.2: Instrument balance tests

| | (1) Share opposite gender: All | (2) Share opposite gender: Males | (3) Share opposite gender: Females | (4) Share white: All |
|---|---|---|---|---|
| Core demographics | | | | |
| Female | -0.01*** | | | 0.01 |
| Black | 0.00 | -0.01 | 0.01 | -0.23*** |
| Hispanic | 0.00 | -0.02 | 0.02 | -0.10*** |
| Asian | 0.01 | -0.02 | 0.04*** | -0.12*** |
| Other | 0.00 | -0.00 | 0.00 | -0.05*** |
| Age (years and months) | -0.01 | -0.01 | -0.01 | -0.01*** |
| Not born in US | 0.00 | 0.01 | -0.01 | 0.01* |
| Home language | | | | |
| Spanish spoken at home | 0.00 | -0.01 | 0.01 | -0.04*** |
| Other language spoken at home | -0.02** | -0.06*** | 0.01 | -0.01 |
| At least one ESL course taken | 0.01 | 0.01 | 0.01 | -0.01** |
| Parent characteristics | | | | |
| Mother did not graduate high school | 0.01 | 0.00 | 0.00 | -0.01** |
| Mother attended some college | 0.00 | -0.00 | 0.01 | 0.00 |
| Mother graduated college | 0.00 | 0.01* | -0.01 | 0.01 |
| Father did not graduate high school | -0.00 | -0.00 | -0.01 | 0.00 |
| Father attended some college | -0.00 | -0.01 | 0.00 | 0.00 |
| Father graduated college | 0.01 | -0.01 | 0.01* | 0.01 |
| Not born in US | -0.00 | 0.03*** | -0.03*** | -0.01 |
| Receiving public assistance | 0.01 | -0.00 | 0.01 | -0.02*** |
| Annual household income | | | | |
| Zero | 0.01 | -0.01 | -0.00 | -0.01 |
| <$20k | 0.00 | -0.00 | 0.01 | -0.03*** |
| $20k-$40k | 0.00 | 0.00 | 0.00 | -0.02*** |
| $40k-$60k | 0.01 | 0.00 | 0.01* | -0.01 |
| Household structure | | | | |
| Mother in household | 0.01 | 0.02 | 0.00 | 0.03** |
| Father in household | 0.01 | 0.01 | 0.01 | -0.01 |
| Biological mother in household | -0.01 | -0.01 | -0.01 | -0.02*** |
| Biological father in household | -0.00 | -0.00 | -0.00 | 0.00 |
| Grade repetition | | | | |
| Has repeated a grade | -0.01 | 0.01 | -0.01* | 0.00 |
| Observations | 8,435 | 4,124 | 4,311 | 8,435 |

School and grade fixed effects included. Indicator variables for school in saturated sample and period of interview included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.
Table A.3: Sensitivity analysis - instrument specification

Dependent variable: Overall GPA (A=4, D or lower=1).

Instrument specification: 20 nearest schoolmates (x x); schoolmates within 2km (x x); weighted (x x); unweighted (x x).

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| School friends | | | | |
| Share opposite gender | -1.05** | -1.02** | -0.79 | -1.38 |
| | (0.53) | (0.47) | (0.68) | (1.06) |
| Controls | | | | |
| Female | 0.18*** | 0.18*** | 0.19*** | 0.19*** |
| | (0.02) | (0.02) | (0.02) | (0.02) |
| Other controls | x | x | x | x |
| School and grade fixed effects | x | x | x | x |
| First-stage coefficients | | | | |
| Share opposite gender in close neighbourhood | 0.13*** | 0.16*** | 0.08** | 0.06** |
| | (0.03) | (0.03) | (0.03) | (0.03) |
| Diagnostics | | | | |
| F-statistic on excluded instrument | 15.32 | 14.46 | 6.37 | 4.46 |
| Observations | 8,435 | 8,435 | 8,160 | 8,160 |

Indicator variables for school in saturated sample and period of interview included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table A.4: Sensitivity analysis - network density and friendship definitions

Dependent variable: Overall GPA (A=4, D or lower=1).

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
|---|---|---|---|---|---|---|---|---|
| Sample restriction | | | | | | | | |
| None | x | | | | | x | x | x |
| Single friend nomination | | x | | | | | | |
| Five friend nomination | | | x | | | | | |
| At least two friends | | | | x | | | | |
| At least 75% nominations matched | | | | | x | | | |
| Friendship definition | | | | | | | | |
| Any nomination | x | x | x | x | x | | | |
| Nominating friendships (out) | | | | | | x | | |
| Nominated friendships (in) | | | | | | | x | |
| Reciprocated nominations | | | | | | | | x |
| School friends | | | | | | | | |
| Share opposite gender | -1.05** | -0.84 | -1.19 | -0.83* | -1.08* | -0.59 | -1.08* | -0.45 |
| | (0.53) | (0.66) | (0.79) | (0.49) | (0.55) | (0.56) | (0.55) | (1.32) |
| Controls | | | | | | | | |
| Female | 0.18*** | 0.18*** | 0.19*** | 0.17*** | 0.20*** | 0.16*** | 0.21*** | 0.17*** |
| | (0.02) | (0.03) | (0.03) | (0.02) | (0.03) | (0.03) | (0.03) | (0.06) |
| Other controls | x | x | x | x | x | x | x | x |
| School and grade fixed effects | x | x | x | x | x | x | x | x |
| Observations | 8,435 | 4,559 | 3,876 | 4,110 | 4,283 | 6,174 | 6,017 | 2,834 |

Indicator variables for school in saturated sample and period of interview included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.
Table A.5: Sensitivity analysis - school urbanicity

Dependent variable: Overall GPA (A=4, D or lower=1).

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| School urbanicity | | | | |
| Urban | x | | | |
| Suburban | | x | | x |
| Rural | | | x | x |
| School friends | | | | |
| Share opposite gender | 0.01 | -0.96 | -2.44 | -1.31*** |
| | (1.24) | (0.64) | (1.67) | (0.60) |
| Controls | | | | |
| Female | 0.20*** | 0.20*** | 0.17*** | 0.19*** |
| | (0.06) | (0.03) | (0.05) | (0.02) |
| Other controls | x | x | x | x |
| School and grade fixed effects | x | x | x | x |
| First-stage coefficients | | | | |
| Share opposite gender in close neighbourhood | 0.08 | 0.14*** | 0.11* | 0.13*** |
| | (0.07) | (0.04) | (0.06) | (0.03) |
| Diagnostics | | | | |
| F-statistic on excluded instrument | 1.39 | 10.04 | 3.70 | 14.30 |
| Observations | 1,892 | 4,456 | 2,087 | 6,543 |

Indicator variables for school in saturated sample and period of interview included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table A.6: Placebo test - effect of share even birth month

| | (1) First stage: school friends, share opposite gender | (2) First stage: school friends, share even birth month | (3) OLS: GPA (A=4, D=1) | (4) IV: GPA (A=4, D=1) |
|---|---|---|---|---|
| Nearest 20 schoolmates | | | | |
| Share even birth month | 0.03 | 0.21*** | | |
| | (0.04) | (0.04) | | |
| School friends | | | | |
| Share even birth month | | | 0.01 | -0.38 |
| | | | (0.02) | (0.32) |
| Controls | | | | |
| Female | -0.01 | 0.01 | 0.20*** | 0.20*** |
| | (0.01) | (0.01) | (0.02) | (0.01) |
| Other controls | x | x | x | x |
| School and grade fixed effects | x | x | x | x |
| Observations | 8,435 | 8,435 | 8,430 | 8,430 |

Indicator variables for school in saturated sample and period of interview included. Robust standard errors clustered by school in parentheses.
*** p<0.01, ** p<0.05, * p<0.1.

Table A.7: Measurement error from self-reporting bias

Dependent variable: measurement error proxy (calculated for subset of sample). Columns (1) to (4): share opposite gender (all nominations minus reciprocated nominations). Columns (5) to (8): GPA (A=4, D or lower=1; self-reported minus transcript).

Nearest 20 schoolmates, share opposite gender: -0.00 (0.04), -0.01 (0.04), -0.06 (0.07), -0.03 (0.07)
Female: 0.03*** (0.01), 0.03*** (0.01), -0.10*** (0.02), -0.12*** (0.02)
GPA (A=4, D or lower=1; self-reported): -0.01 (0.01), -0.01 (0.01), 0.09*** (0.01), 0.10*** (0.01)
Observations, columns (1) to (8): 2,834, 2,834, 2,834, 2,834, 3,811, 3,811, 3,811, 3,811
R-squared, columns (1) to (8): 0.00, 0.00, 0.00, 0.00, 0.00, 0.01, 0.01, 0.03

Indicator variables for school in saturated sample and period of interview included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table A.8: Sensitivity of first stage to distance from community origin

Dependent variable: School friends, share opposite gender.

| | (1) Closest third | (2) Middle third | (3) Furthest third |
|---|---|---|---|
| Nearest 20 schoolmates | | | |
| Share opposite gender | 0.00 | 0.18*** | 0.15*** |
| | (0.06) | (0.05) | (0.06) |
| Controls | | | |
| Other controls | x | x | x |
| School and grade fixed effects | x | x | x |
| Diagnostics | | | |
| F-statistic of excluded instrument | 0.01 | 10.46 | 5.74 |
| Observations | 2,961 | 2,755 | 2,719 |

Indicator variables for school in saturated sample and period of interview included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table A.9: Cumulative effect estimates - math and science GPA

Identifying assumption: \(\lambda/\delta \neq 1\) (A5) in columns (1) to (6); \(\lambda/\delta = 1\) (A5') in column (7).

| Cumulative effect parameters | (1) | (2) | (3) | (4) | (5) | (6) | (7) |
|---|---|---|---|---|---|---|---|
| Age index \(a\) at high school | 1 | 2 | 3 | 4 | 5 | 6 | 3.20 |
| | | | | | | | (10) |
| \(\delta\): friendship | 0.67 | 0.67* | 0.67* | 0.67** | 0.67 | 0.67* | 0.67** |
| | (0.53) | (0.35) | (0.38) | (0.31) | (0.44) | (0.38) | (0.32) |
| \(\delta_0\): instrument | 0.14*** | 0.21 | 0.31 | 0.47 | 0.70 | 1.05 | 0.34 |
| | (0.04) | (0.34) | (1.21) | (253) | (2313) | (334) | (126) |
| \(\lambda\): GPA | 0.21 | 0.49 | 0.64 | 0.73 | 0.78 | 0.81 | 0.67* |
| | (1.23) | (5.70) | (5.63) | (0.70) | (1.53) | (3.04) | (0.32) |
| \(\gamma\): opp gender | -1.39* | -0.80*** | -0.48** | -0.30 | -0.20 | -0.14* | -0.43 |
| | (0.82) | (0.34) | (0.22) | (0.20) | (0.34) | (0.08) | (1.06) |
| Observations | 8169 | 8169 | 8169 | 8169 | 8169 | 8169 | 8169 |
| Replications | 50 | 50 | 50 | 50 | 50 | 50 | 50 |

Bootstrapped standard errors in parentheses (school-grade-gender strata). *** p<0.01, ** p<0.05, * p<0.1.

Appendix B

Negative Externalities in High School Course Repetition

The appendix describes results from a variety of robustness and sensitivity checks.

B.1 Appendix Tables

Table B.1: Descriptive demographic statistics - Pooled (student-years)

| | All | First-time course-takers | Failed and repeating course-takers | Course-takers who pass |
|---|---|---|---|---|
| Math GPA score (transcript) | 2.17 | 2.29 | 1.26 | 2.61 |
| Gender and race: | | | | |
| Female | 0.50 | 0.51 | 0.41 | 0.52 |
| White | 0.45 | 0.45 | 0.32 | 0.51 |
| Black | 0.13 | 0.12 | 0.22 | 0.11 |
| Hispanic | 0.21 | 0.21 | 0.22 | 0.18 |
| Asian | 0.18 | 0.21 | 0.19 | 0.18 |
| Other | 0.02 | 0.02 | 0.04 | 0.02 |
| Age (years and months) | 16.80 | 17.03 | 17.12 | 16.65 |
| Immigrant status and home language: | | | | |
| Not born in US | 0.13 | 0.15 | 0.11 | 0.13 |
| Home language: English | 0.84 | 0.83 | 0.85 | 0.85 |
| Home language: Spanish | 0.10 | 0.10 | 0.13 | 0.09 |
| Home language: Other | 0.06 | 0.07 | 0.02 | 0.06 |
| Parent characteristics: | | | | |
| Mother ed: Less than high school | 0.19 | 0.18 | 0.21 | 0.16 |
| Mother ed: High school | 0.32 | 0.31 | 0.26 | 0.33 |
| Mother ed: Some college | 0.18 | 0.18 | 0.19 | 0.19 |
| Mother ed: College | 0.24 | 0.26 | 0.25 | 0.25 |
| Father ed: Less than high school | 0.16 | 0.15 | 0.18 | 0.14 |
| Father ed: High school | 0.24 | 0.23 | 0.21 | 0.24 |
| Father ed: Some college | 0.16 | 0.17 | 0.15 | 0.17 |
| Father ed: College | 0.22 | 0.25 | 0.14 | 0.25 |
| Parent not born in US | 0.23 | 0.26 | 0.26 | 0.22 |
| Household income: | | | | |
| Household income: <$20k | 0.09 | 0.08 | 0.11 | 0.09 |
| Household income: $20k-$40k | 0.23 | 0.23 | 0.24 | 0.23 |
| Household income: $40k-$60k | 0.20 | 0.21 | 0.17 | 0.21 |
| Household income: >$60k | 0.18 | 0.19 | 0.16 | 0.19 |
| Observations | 6341 | 3379 | 310 | 3937 |
| Share | 1 | 0.53 | 0.05 | 0.62 |

Table B.2: Effect of course repeaters on academic performance of first-time course-takers

Dependent variable: Math GPA score. Sample: First-time course-takers. Columns (1) to (7).

Course-mates: number of students failed and repeating
Natural log: -0.58** (0.23)
Linear: -0.02*** (0.004), -0.09** (0.03), -0.04*** (0.01), -0.13*** (0.04)
Quadratic: 0.001 (0.001), 0.004 (0.003)
Share of students failed and repeating: -0.20 (0.82), -0.11 (1.46)

Course size (number of students)
Linear: -0.001*** (0.0003), 0.003 (0.003)
Quadratic: 0.000 (0.000)
Non-parametric: x x x x

Fixed effects [a]: x x x x x x x
Observations (student-years): 3379 in each column
Number of students: 1810 in each column

[a] Year, school-cohort, school-course, school-course trends and individual fixed effects included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table B.3: Robustness check - excluding selected subjects and schools

Dependent variable: Math GPA score. Sample: First-time course-takers.

Exclusions: Algebra I (x x); selected schools (x x).

| | (1) | (2) | (3) |
|---|---|---|---|
| Course-mates: | | | |
| Log number of students failed and repeating | -0.06 | -0.13** | -0.04 |
| | (0.04) | (0.06) | (0.06) |
| Fixed effects [a] | x | x | x |
| Observations (student-years) | 2828 | 2414 | 2023 |
| Number of students | 1565 | 1291 | 1130 |

[a] Year, school-cohort, school-course, school-course trends and individual fixed effects, as well as log number of students in course included.
Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table B.4: Correlation between course failure rate and subsequent GPA

Dependent variable: Subsequent year math GPA score. Sample: Course-takers who pass. Columns (1) to (4).

Course-mates:
Log number of students who fail and repeat current course: -0.09 (0.11), -0.13 (0.09)
Log number of students who fail current course: -0.10* (0.05), -0.03 (0.07)
Log number of students who repeat current course: 0.02 (0.07), 0.09 (0.06)

Fixed effects [a]: x x x x
Observations (student-years): 3276 in each column
Number of students: 1860 in each column

[a] Year, school-cohort, school-course, leading school-course and individual fixed effects, as well as log number of students in course included. Robust standard errors clustered by school in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Appendix C

The Effects of Part-Time Work During High School

The appendix describes results from OLS regressions of the various dependent variables on part-time hours worked during high school. These estimates do not have a causal interpretation.
C.1 Appendix Tables

Table C.1: OLS results - educational outcomes

| | All grades | 8-10th grades | 11-12th grades |
|---|---|---|---|
| Mean GPA score (Wave 2) | | | |
| All hours | -0.002* | 0.0004 | -0.005*** |
| Hours worked <5 | 0.004 | 0.01 | -0.01 |
| Hours worked ≥5 | -0.004*** | -0.001 | -0.007*** |
| Graduated high school | | | |
| All hours | -0.0008* | -0.001* | -0.0004 |
| Hours worked <5 | -0.002 | -0.004 | -0.0003 |
| Hours worked ≥5 | -0.001 | -0.001 | -0.0005 |
| Years of education | | | |
| All hours | -0.012*** | -0.013*** | -0.012*** |
| Hours worked <5 | 0.02 | 0.03 | -0.04 |
| Hours worked ≥5 | -0.016*** | -0.01** | -0.017*** |
| Desire to attend college (Wave 2) | | | |
| All hours | -0.005*** | -0.004 | -0.006** |
| Hours worked <5 | 0.01 | -0.003 | 0.04 |
| Hours worked ≥5 | -0.008*** | -0.006* | -0.012*** |
| Expect to attend college (Wave 2) | | | |
| All hours | -0.003* | -0.001 | -0.006** |
| Hours worked <5 | 0.01 | 0.02 | -0.03 |
| Hours worked ≥5 | -0.008*** | -0.004 | -0.012*** |
| Attended college | | | |
| All hours | -0.001 | -0.002** | 0.0002 |
| Hours worked <5 | 0.016** | 0.02** | 0.01 |
| Hours worked ≥5 | -0.001 | -0.001 | -0.001 |

Robust standard errors clustered by school-grade in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table C.2: OLS results - labour market outcomes

| | All grades | 8-10th grades | 11-12th grades |
|---|---|---|---|
| Age at first full-time job | | | |
| All hours | -0.024*** | -0.031*** | -0.02*** |
| Hours worked <5 | -0.01 | 0.00 | -0.09 |
| Hours worked ≥5 | -0.017*** | -0.02*** | -0.02** |
| Log(number of jobs) | | | |
| All hours | -0.001 | -0.002 | 0.000 |
| Hours worked <5 | 0.01 | 0.01 | 0.004 |
| Hours worked ≥5 | -0.002 | 0.001 | -0.002 |
| Log(income) | | | |
| All hours | 0.004*** | 0.003 | 0.006** |
| Hours worked <5 | 0.01 | 0.005 | 0.02 |
| Hours worked ≥5 | 0.003 | 0.002 | 0.003 |
| Log(hours worked per week) | | | |
| All hours | 0.0013*** | 0.001 | 0.002*** |
| Hours worked <5 | 0.001 | 0.003 | -0.004 |
| Hours worked ≥5 | 0.002** | 0.001 | 0.003*** |
| Do light or hard physical work | | | |
| All hours | 0.0002 | 0.001 | 0.000 |
| Hours worked <5 | 0.03*** | 0.03*** | 0.02 |
| Hours worked ≥5 | 0.000 | 0.001 | -0.001 |
| Satisfied with job | | | |
| All hours | 0.001 | 0.0002 | 0.001 |
| Hours worked <5 | 0.01 | 0.01 | -0.004 |
| Hours worked ≥5 | 0.001 | 0.001 | 0.0002 |

Robust standard errors clustered by school-grade in parentheses. *** p<0.01, ** p<0.05, * p<0.1.
An empirical and economic analysis of high school peer effects Hill, Andrew J. 2013
Item Metadata
Title | An empirical and economic analysis of high school peer effects |
Creator |
Hill, Andrew J. |
Publisher | University of British Columbia |
Date Issued | 2013 |
Description | Parents are concerned about the influence of friends during adolescence. Using the gender composition of schoolmates in an individual's close neighbourhood as an instrument for the gender composition of an individual's self-reported friendship network, Chapter 2 of this dissertation finds that the share of opposite gender friends has a sizeable negative effect on high school GPA. The effect is found across all subjects for students over the age of sixteen, but is limited to mathematics and science for younger students. Self-reported difficulties getting along with the teacher and paying attention in class are important mechanisms through which the effect operates. The subject-specific effects for younger students and larger estimates for females in general are consistent with a gender socialization hypothesis in which young females conform to traditional gender roles in the presence of males. Chapter 3 investigates the extent to which course repeaters in high school mathematics courses exert negative externalities on their course-mates. Using individual and school-specific course fixed effects to control for ability and course selection, it shows that doubling the number of repeaters in a given course (holding the number of course-takers constant) results in a 0.15 reduction in GPA scores for first-time course-takers. Further results suggest that the negative effect is only evident when the share of repeaters reaches a threshold of five to ten percent of the total number of course-takers. Chapter 4 provides evidence that part-time work during high school affects the college attendance and labour market entry decisions of young adults: 8-10th grade students working more than five hours per week are less likely to attend college and more likely to enter the labour market upon high school graduation than other students. The part-time working behaviour of same-grade schoolmates is used as an instrument for individual part-time working behaviour. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2013-10-24 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0165625 |
URI | http://hdl.handle.net/2429/45364 |
Degree |
Doctor of Philosophy - PhD |
Program |
Economics |
Affiliation |
Arts, Faculty of Vancouver School of Economics |
Degree Grantor | University of British Columbia |
GraduationDate | 2013-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |