Open Collections

UBC Theses and Dissertations

The local political economy of conditional cash transfers in Brazil Frey, Anderson 2016

The Local Political Economy of Conditional Cash Transfers in Brazil

by

Anderson Frey

B.A., University of São Paulo, 2002
M.A., New York University, 2010

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in
The Faculty of Graduate and Postdoctoral Studies (Economics)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

October, 2016

© Anderson Frey 2016

Abstract

Conditional cash transfers (CCT) are currently one of the most popular poverty reduction policies worldwide. Nevertheless, there is still limited evidence of their impact on the local political dynamics of developing nations plagued by corruption, clientelism, and vote buying. This thesis studies this issue using data from Brazil’s Bolsa Família, the largest CCT program in the world.

Chapter 2 proposes a theoretical mechanism to illustrate how a CCT program could affect local politics in such an institutional setting, when local politicians cannot manipulate program eligibility. In a nutshell, when mayors are able to buy votes by diverting public resources into private payments to swing voters, a CCT program works as an income shock that reduces the voter’s marginal utility coming from vote buying.

Chapter 3 uses data from Brazil’s Bolsa Família to test the theoretical predictions from Chapter 2. It shows that, when transfers are shielded from the influence of political intermediaries, they trigger a reduction in incumbency advantage, an increase in both political competition and the quality of candidates, and a reduction in the support for high-clientelism parties. Transfers also lead incumbents to shift spending toward redistributive health and education services. These results are estimated with a nonparametric multivariate regression discontinuity design.

Despite the improvements brought about by the CCT program, when politicians can control access to the program, some of these political impacts might be negative in the short term.
Chapter 4 tells this other side of the story: what happens when politicians are able to manipulate program enrollment. Using administrative data from the Bolsa Família registry, and a regression discontinuity design in close elections, this Chapter shows that mayors with reelection incentives are more likely to promote income underreporting fraud by households, for the purpose of eligibility to CCT. This fraud is rewarded by voters, as corrupt mayors have a higher reelection probability. Finally, the Chapter also shows the need for disciplining devices to reduce this type of corruption. A higher risk of audits by the federal government is shown to drastically reduce the effects of reelection incentives on fraud.

Preface

This dissertation is original, unpublished, independent work by the author, Anderson Frey.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Acknowledgements
Dedication
1 Introduction
2 A Probabilistic Voting Model of Vote Buying
    2.1 Basic Model Setup
    2.2 Vote Buying and Incumbency Advantage
    2.3 Model Analysis
    2.4 Incorporating Rent Seeking
3 Cash Transfers, Clientelism, and Political Enfranchisement
    3.1 Institutional Background
    3.2 Using a Discontinuity in the FHP Funding as Research Design
    3.3 Data Sources and Description of Variables
    3.4 Empirical Strategy
    3.5 Results and Interpretation
    3.6 Figures and Tables
4 Reelection Incentives and Fraud in Cash Transfers
    4.1 CCT Program Design, Credit Claiming and Manipulation of Eligibility
    4.2 Theory: Reelection Incentives and Reverse Accountability
    4.3 Data and Construction of Variables
    4.4 Empirical Strategy
    4.5 Result and Interpretation
    4.6 Figures and Tables
5 Conclusion
Bibliography
Appendices
A Appendix to Chapter 2
    A.1 Proofs of Propositions
    A.2 A Generic Model
    A.3 Incorporating Rent Seeking
B Appendix to Chapter 3
    B.1 Effects on Hiring
    B.2 Bandwidth Selection
    B.2 Figures and Tables
C Appendix to Chapter 4
    C.1 Income Reporting, Cancellation and Distribution of BF Benefits
    C.1 Figures and Tables

List of Tables

3.1 Balance of Fixed and Pre-determined Variables
3.2 Regression Discontinuity Results
3.3 Main Results: Political Outcomes
3.4 Robustness to Kernel Choice: Political Outcomes
3.5 Other Robustness Tests: Political Outcomes
3.6 Main Results by Subsamples
4.1 Main Results: Fraud (Income Underreporting)
4.2 Distribution of Benefits
4.3 Other Household Characteristics Reported in CadUnico
4.4 Measures of the Mayor’s Performance in 2012
4.5 Robustness to the Experience and Ability Biases
4.6 Effect of Audits on Fraud, by Different Specifications
4.7 Impact of BF Fraud on the Probability of Reelection
B.1 Optimal Bandwidths
B.2 Main Results: Other Political Outcomes
B.3 Political Outcomes at Different FPM Population Thresholds
B.4 Change in the Political Impact of the FHP post-2004
B.5 Robustness of the Exclusion Restriction
B.6 Health Outcomes
B.7 Hiring Outcomes
C.1 Bolsa Família Targeting
C.2 Balance of Fixed and Pre-determined Variables
C.3 Income Underreporting by Year of Last Update
C.4 Distribution of Benefits for Households Updating in 2012
C.5 Heterogeneity of Results: by Mayor’s Age and Poverty Rate
C.6 Balance Before and After Matching

List of Figures

3.1 Timeline of Events
3.2 Map of Municipalities in the Sample
3.3 CCT Coverage vs. Number of Poor Families
3.4 Potential Sample and Treatment Frontier
3.5 Heterogeneity of the CATE for CCT Coverage
3.6 Conditional ATE for Political Outcomes
3.7 Conditional ATE for Political Outcomes (Continued)
4.1 Reported Monthly Income in Brazil: CadUnico vs. Census
4.2 Discontinuity in Reported Income
4.3 Balance of Fixed or Pre-Determined Variables
4.4 The Impact of Reelection Incentives on Fraud, by Update Year
4.5 The Impact of Fraud on Electoral Performance, by Update Year
B.1 CCT Coverage: One-dimension RDD
B.2 CCT Coverage: Three Dimensional Parametric Fit
B.3 Conditional ATE for the log of Total Doctor’s Visits
B.4 Conditional ATE for the Total Doctor’s Visits per Family

Acknowledgements

I am very grateful to Francesco Trebbi for the advice, encouragement, and constructive criticism at all stages of this project.
I would also like to thank my other committee members, Siwan Anderson, Patrick Francois and Thomas Lemieux, for their invaluable contributions.

My research also benefited from conversations with numerous faculty members and seminar participants at the UBC economics department. Although they are too many to list here, I would like to single out David Green and Thorsten Rogall. I am also grateful to my fellow PhD students for their help and advice at multiple stages of this project, and their companionship during this journey, especially Oscar Becerra, Nathan Canen, Thomas Cornwall, Hugo Jales, Chad Kendall, and Rogério Santarrosa.

I am indebted to my family, for their longstanding support, and most of all, I am grateful to my wife Michelle, who always provided unconditional love, support and encouragement, and never once doubted our decision to embark on this path together, even when I did.

Dedication

Soli Deo Gloria.

Chapter 1
Introduction

Conditional cash transfer (CCT) programs, present in more than 40 countries, are now one of the leading global poverty alleviation policies. These programs have been successful in providing a stable income for the very poor, provided that they comply with conditions related to school attendance and regular doctor visits (Fiszbein et al., 2009). An often overlooked but important feature of CCT programs is that many were designed to prevent capture by political intermediaries in both the distribution of resources and the selection of beneficiaries.[1]

Local political leaders in many developing democracies have historically been able to reap electoral rewards from the capture of public resources. The problem is exacerbated when the practice is fueled by funds transferred from foreign or higher levels of government, as in the case of CCT programs.
Capture can take many forms, from undeserved credit claiming to clientelism and vote buying.[2] While shielding transfers from the influence of local political manipulation is at the core of most CCT designs, there is still limited evidence of the long-term impact of these programs on local political dynamics.[3] If this feature of the CCT design is successful, would it reduce the power and influence of local clientelistic networks? Would it make the poor less vulnerable to vote buying? How would it change the electoral process, the profile of successful politicians, and the local public good distribution? Would politicians try to manipulate the program resources using fraud? Would this be effective?

[1] De La O (2015) provides a qualitative discussion on the potential impact of CCT program design on clientelism.
[2] For the purpose of this study, the terms clientelism and vote buying are used interchangeably to mean the promise of a private transfer to a select group of voters, using public resources, with the expectation that it will influence electoral support. Stokes (2005) also uses the term machine politics to refer to this political exchange. The relationship between clientelism and economic development is discussed in the following reviews: Boix and Stokes (2009) and Hicken (2011). It is also a constant in the literature examining clientelism, in both political science and economics, such as: Brusco, Nazareno, and Stokes (2004); Vicente and Wantchekon (2009); Fujiwara and Wantchekon (2013); Anderson, Francois, and Kotwal (2015); Cruz, Keefer, and Labonne (2015).
[3] At the national level, it is reasonable to expect that such programs would deliver positive electoral rewards to the federal governments that implemented them. This is a topic that has been more extensively explored in the literature. For example, Manacorda, Miguel, and Vigorito (2011); Baez et al. (2012); De La O (2013) and Zucco (2013) show that this is the case in Uruguay, Colombia, Mexico, and Brazil, respectively.

This thesis answers such questions by examining the political impacts of a CCT program in local (municipal) Brazilian governments. The country provides a unique context for this analysis, for a few reasons. First, Brazil runs the largest CCT program in the world, Bolsa Família (BF), reaching 13 million families. Second, local politics are characterized by clientelism and pork barrel spending across all levels (Alston and Mueller, 2006; Fried, 2012), and vote buying is abundant.[4] Third, unlike most other CCT experiences, BF has simple eligibility rules and self-enrollment. In addition to enrolling themselves, households know exactly why they are eligible and where the funds come from,[5] and they are paid without intermediaries. Fourth, the program enrollment is conducted by local offices. This allows politicians to defraud the income reporting process, and to claim credit for CCT funds transferred to some households.

[4] More than 450 mayors were impeached due to the practice between 2000 and 2009. Sugiyama and Hunter (2013) show that, in three poor Northeastern municipalities, 66% of the survey respondents were aware of quid pro quo offers for votes, and 28% reported having received recent offers.
[5] The BF program has been aggressively marketed as a federal program, thereby limiting the ability of local politicians to claim credit for its implementation. This is the most recognized welfare program in the country.

Within this institutional context, this thesis analyzes the local political impact of the CCT program from two different angles. In Chapters 2 and 3, it examines the side of the story in which the CCT design succeeds, and the arrival of transfers is effectively shielded from both local political manipulation and credit claiming. These Chapters show that transfers have a positive and long-lasting impact on the local political dynamic.
Higher CCT coverage reduces the incumbency advantage for mayors, increases both electoral competition and the quality of the candidates, weakens support for parties associated with clientelism, and prompts an increase in redistributive (health care and education) spending.[6] These effects are observed for up to nine years after the start of the program.

In Chapter 2, I propose a probabilistic voting model of vote buying to illustrate the mechanism by which the CCT program affects local politics in the long term. In this model, incumbent mayors are able to buy votes by diverting public resources into private payments to impoverished swing voters. The CCT program, in this context, is a positive income shock that reduces the voter’s marginal utility coming from these private payments. This leads to a reduction in the incumbency advantage, and a consequent increase in electoral competition. As vote buying becomes less attractive, incumbents try to recover the lost votes by funneling resources back into public goods.

Because of the manipulation opportunities arising from the CCT design in Brazil, coverage cannot be considered exogenous to local politicians. Chapter 3 tests the model predictions empirically, using a multivariate regression discontinuity design (MRDD) to ensure the exogeneity of the municipal CCT coverage. It relies on the part of the cross-municipal variation in coverage generated by the Family Health Program (FHP), a household-based health program run by municipalities, and funded by the federal government. The intuition is simple: the FHP funding is discontinuous, and exogenous to mayors.[7] The FHP health-care teams had capillarity and penetration among the poor, and were able to better spread information about BF in the early stages of the CCT program. More FHP funding led to better information, which in turn created differentials in CCT enrollment that were able to persist over time due to the program rules.

[6] In Brazil, public health and education are available to all, but are consumed primarily by the poor. The wealthy and middle class are much more likely to use private alternatives. Given this context, a shift in spending toward these categories can be seen as redistributive (Fujiwara, 2015).
[7] Municipalities with populations below 30,000, and a human development index (HDI) below 0.7, are eligible to receive 50% additional federal funding for the FHP.

Together these Chapters find the following: the additional FHP funding generates a 7.9 percentage point (pp) higher CCT coverage.[8] A 10pp increase in CCT coverage would trigger an 8.2pp vote loss for the incumbent, which is mostly driven by the effect on politicians that belong to high-clientelism parties. Higher CCT coverage also prompts a 6.3pp fall in the margin of victory, 0.4 more candidates in the mayoral race, 50% lower private campaign donations to incumbents, and a 24pp increase in the share of competitive challengers having at least a high school education. With respect to budget allocation, there are increases of 2.3pp and 2.7pp in the shares of budget allocated to education and health care, along with a 1.5pp decrease in the share allotted to capital investment.

There is, however, another side to this story. Chapter 4 explores one way in which mayors can manipulate and claim credit for the CCT resources: income reporting fraud, which is measured using the administrative registry of the BF program. I estimate that at least 8 million households are currently receiving BF benefits by means of underreporting their income.

In a nutshell, Chapter 4 shows three things. First, using a regression discontinuity design on close elections, it demonstrates that the possibility of reelection for mayors creates incentives for fraud, generating expenses with the program that are nearly 7% higher in these municipalities.
This gives mayors ample ability to claim credit for the distribution of resources worth five times the average campaign spending in local elections. Second, using random program audits by the federal government, it shows that a higher expectation of being audited significantly reduces the likelihood of mayors promoting fraud. Third, using matching over 40 observed characteristics of municipalities, parties and mayors, it shows that fraud helps incumbents to be reelected, especially in more competitive political environments.

[8] For municipalities having at least 25% of poor families.

Both the approach of Chapters 2 and 3 and the approach of Chapter 4 estimate the same thing: the effect of higher CCT coverage on local politics. However, while Chapters 2 and 3 show that CCT coverage reduces incumbency advantage and increases electoral competition, Chapter 4 shows that incumbents can benefit from the program. These results can be reconciled after considering the nature and the persistence of each effect, in the context of Brazil. The negative effects shown in Chapter 4 are characteristic of the period when beneficiaries are joining the program, and the manipulation of eligibility by politicians can create credit claiming opportunities. Similar results already exist in the literature (De Janvry, Finan, and Sadoulet, 2012; Labonne, 2013). However, given the program characteristics (long permanence, easy access to the central government office for complaints), and the local political environment (two-term limit for mayors), these effects are likely short-lived.

In Chapters 2 and 3, the effects are not only estimated for a longer period, but they also represent the lasting impact of the CCT program. They only exist if beneficiaries see the CCT income as permanent, in a way that would change their consumption decisions and utility.
Accordingly, these effects should last as long as the CCT benefit is stable, and should survive through different political cycles and incumbents (they do not concern the relationship between voters and one specific politician, i.e., the one that might have given them access to CCT). In other words, after a short period in which credit can be claimed by politicians for getting the program benefits to some voters (Chapter 4), the long-term effects of the CCT program that come through the impact of the permanent income increase on vote buying should be positive (Chapters 2 and 3). Although the different empirical techniques employed in this study do not allow me to compare the magnitude of the opposing effects estimated in Chapters 3 and 4, a back-of-the-envelope calculation indicates that the long-term (positive) effects are at least four times larger than the short-lived (negative) effects of CCT on local politics.

This thesis contributes to five main strands of the literature pertaining to both economics and political science. First, it takes into account the extensive literature that looks at various aspects of vote buying around the world.[9] A constant throughout this literature is the association between vote buying and poverty.[10] Chapters 2 and 3 propose and test a plausible channel through which income transfers can reduce vote buying. Chapter 2 also addresses the literature concerned with the mechanisms that support vote buying where there is imperfect monitoring of voters. Stokes (2005) argues that politicians can partially monitor voters with the help of social networks, and both Nichter (2008) and Hidalgo and Nichter (2015) show that vote buying can actually be turnout buying under weak monitoring. The model proposed here provides theoretical support for vote buying of swing voters in secret-ballot elections.

Second, the literature on the political effects of CCT is generally concerned with national politics (the level at which the program is implemented).
A number of papers point out that the electoral support for the incumbent increases following a CCT implementation.[11] Labonne (2013) and Rodriguez-Chamussy (2015)[12] examine the impact of CCT on incumbency advantage using randomized program roll-outs in the Philippines and Mexico, respectively. Unlike the results presented in Chapter 3, both studies find that incumbents benefit from higher coverage, most likely because mayors are able to claim part of the credit for the introduction of CCT. De Janvry, Finan, and Sadoulet (2012) examine an older Brazilian CCT program (Bolsa Escola), in which mayors could claim credit for the policy (they selected the beneficiaries). The authors show that better program performance was associated with higher incumbency advantage.

[9] Finan and Schechter (2012) show the role of reciprocity in the way voters are targeted for vote buying in Peru. Anderson, Francois, and Kotwal (2015) discuss the role of land ownership and caste relations in clientelism in India. Brusco, Nazareno, and Stokes (2004) discuss the role of the discount rate in vote buying in Argentina. Sugiyama and Hunter (2013) show evidence that public goods are used in clientelistic exchanges in Brazil. Fujiwara and Wantchekon (2013) and Cruz, Keefer, and Labonne (2015) discuss the role of information in the dynamics of vote buying in Benin and the Philippines, respectively. Larreguy, Marshall, and Trucco (2015) show how institutional changes in an urban land titling program in Mexico reduced clientelism.
[10] The causes and nature of this relationship are reviewed in Hicken (2011).
[11] In Uruguay (Manacorda, Miguel, and Vigorito, 2011), in Colombia (Baez et al., 2012), in Mexico (De La O, 2013), and in Brazil (Zucco, 2013).
[12] Unlike this thesis, Rodriguez-Chamussy (2015) focuses on party incumbency rather than candidate incumbency, given that mayors cannot be reelected in Mexico.
This thesis provides a comprehensive picture of the way in which cash transfers affect a variety of local political outcomes, in a context where the ability of mayors to claim credit for the program is limited.

Third, by extending the multivariate regression discontinuity design (MRDD) method proposed by Zajonc (2012), and by using a formal approach to bandwidth selection, Chapter 3 departs from the strategies commonly used for the estimation of regression discontinuity designs based on more than one running variable. Also, to my knowledge, this is the first study that uses the heterogeneity of the average treatment effect generated by an RDD to inform an instrumental variable regression.

Fourth, this thesis extends the literature on the relationship between policy, political institutions and corruption in developing countries, with a unique and comprehensive contribution. It shows, both theoretically and empirically, the multiple impacts of a policy design (the CCT program) on local politics from two different perspectives: the one in which the policy succeeds in its intent to reduce clientelism and political manipulation, and the other in which it fails, thereby increasing fraud and the electoral rewards of corruption.[13] Moreover, by showing evidence of the impact of disciplining devices on corruption, it contributes to a still sparse literature (Pande and Olken, 2012).[14]

Finally, Chapter 4 contributes to the political accountability literature by showing how reelection incentives can increase corruption in some cases, and how corruption can, in turn, help incumbents to be reelected. These outcomes are opposite to the prediction given by the agency models of rent seeking,[15] because this specific type of fraud is aligned with both the preferences of voters and the politician’s objective. Camacho and Conover (2011) also examine politically motivated fraud in a CCT program (Colombia), and in line with this Chapter, they find that fraud is higher in environments of more political competition. But given that mayors have a one-term limit in Colombia, that paper focuses on the mechanics of program manipulation, as opposed to the incentives for fraud (e.g. reelection and audit risk), or the electoral rewards of fraud.

[13] Specifically in the case of Brazil, these are a few examples of related articles that examine the relationship between corruption, institutions and elections: Brollo et al. (2013) show that corruption increases after a resource windfall to municipalities; Ferraz and Finan (2011) show how reelection incentives determine corruption; and Fujiwara (2015) shows that a change in voting technology increased electoral turnout among the poor and prompted more redistributive health care policies.
[14] Again in the case of Brazil: Ferraz and Finan (2008) show that the information about corruption audits, when disseminated, reduced incumbency advantage; and Lichand, Lopes, and Medeiros (2016) show that the expectation of future audits reduced corruption in health services.
[15] See Barro (1973), Ferejohn (1986), and more recently Duggan and Martinelli (2016), for the theory. Ferraz and Finan (2011) provide an empirical application in Brazil.

Chapter 2
A Probabilistic Voting Model of Vote Buying

The motivation for starting this study with a theoretical framework is threefold. First, the literature currently lacks a proper theory that illustrates how a CCT program could affect multiple local political outcomes in a context of clientelism and vote buying, and where the program design successfully shields the transfers from local political capture.[16] Accordingly, this model provides a well-structured narrative to connect all the empirical results to be presented in Chapter 3.

Second, it incorporates in the theory some features of vote buying in the local context.
Vote buying in Brazil characteristically involves the use of public resources, the targeting of the poor, payment in more than one installment, and a high risk of impeachment for mayors.[17] It is also abundant, even in the presence of secret-ballot elections. Sugiyama and Hunter (2013) report survey results from Northeastern Brazil in which 28% of respondents had received recent vote buying offers. They also show that government programs are highly associated with vote buying. A survey conducted by a local polling institute after the 2004 elections (IBOPE, 2005) shows that vote buying is more common among the poor and, in at least 67% of the cases, includes offers of public goods and services.

[16] In the probabilistic voting models proposed by both Labonne (2013) and Camacho and Conover (2011), local incumbents are thought to fully claim credit for CCT transfers coming from the central government, and vote buying is not included.
[17] The following examples of media coverage (in Portuguese) provide insights on the mechanics of this practice: […]; […]; and […].

Third, it also provides one plausible solution for what I call the secret-ballot puzzle. The literature on vote buying has tried to understand how and why the practice is so widespread in the context of poverty, even in the apparent absence of a commitment device, i.e., when monitoring of voters is limited by secret-ballot elections. Several explanations exist for this phenomenon, such as the partial monitoring of voters using polling station data (Querubin and Snyder, 2013), or social networks in poor communities (Finan and Schechter, 2012; Boix and Stokes, 2009). Also, part of the literature has claimed that the prevailing practice under secret-ballot elections is turnout buying. In this case, instead of buying the support of voters that would otherwise vote for another candidate, politicians pay their own supporters to go vote (Nichter, 2008).
In Brazil, however, voting is mandatory and turnout is already high (more than 80%), so even the manipulation of turnout is made difficult.[18] In this proposed model, incumbents signal their promise of post-election benefits to targeted voters, if reelected, by paying a first installment before the election. They are committed to honoring their promises for fear of being exposed (and prosecuted) for the illegal practice of vote buying. Targeted voters, in turn, have incentives to vote for the incumbent for fear of not being targeted for private transfers in the future election cycle (by the new candidates).

Accordingly, the model intuition is as follows. Incumbent mayors can reap electoral rewards from converting public resources intended for the poor into private payments (vote buying) to swing voters.[19] Vote buying here consists of a cash payment[20] accompanied by a promise that, on the condition that the incumbent is reelected, the payment will be repeated. Because vote buying is illegal and punishable with impeachment, incumbents have incentives to pay the second installment if reelected. The swing voters targeted today might not be targeted in a future election by the new candidates, so they have an incentive to maximize the reelection probability of the incumbent in order to receive their second installment.

[18] Hidalgo and Nichter (2015) show that local politicians can indeed buy turnout in Brazil by buying voters from other municipalities (paying them to migrate).
[19] The targeting of swing voters is in keeping with the vote buying model proposed by Stokes (2005), and the framework for redistributive policy proposed by Dixit and Londregan (1996).
[20] These payments may also come in the form of goods and services redirected from public resources (e.g. medicine, food, cement). These resources are perceived as close substitutes for cash given their immediate necessity to the poor voters, who would at any rate have used cash to buy such goods and services.
As long as a group of swing voters can be identified in the population, this mechanism makes it optimal for the incumbent to pay voters without monitoring.

In this context, cash transfers represent a positive income shock to the poor that reduces the marginal utility of these vote buying payments.[21] With CCT, vote buying becomes a less effective electoral strategy, and it falls. Accordingly, the incumbency advantage is also reduced, given that it comes from the incumbent's unique ability to target swing voters with public resources. As a consequence, incumbents channel resources back into pro-poor public goods. I also show that, if politicians have a type that makes them more effective in vote buying than in public good distribution, then higher-type politicians (i.e. relatively better at vote buying) will lose more votes, and be less able to shift resources towards public goods.

[21] The mechanism here relies on CCT programs having a significant impact on the consumption standards of the targeted populations. In the case of Brazil, this is expected due to the significant increase in income provided by the program (~50%), and its long term existence. Fiszbein et al. (2009) also show that this is the case for most CCT programs around the world. They also show that, in many cases, CCT beneficiaries tend to treat the transfer as permanent income in their consumption decisions.

2.1 Basic Model Setup

Consider a two-period model, where there is an incumbent politician and a challenger, indexed by $P \in \{I, C\}$. In period 1, the incumbent carries out policy, and faces a challenger in the election at the end of the period. There is a two-term limit, so only the challenger could potentially be reelected in the election taking place at the end of period 2.

There are three groups of voters, indexed by $J \in \{H, L, S\}$.[22] The groups differ in size, income, and in the distribution of political preferences. The share of group $J$ in the population is given by $\alpha_J$.
Group $H$ is comprised of wealthy voters with income $w$. Groups $L$ and $S$ are comprised of poor voters, with income $y < w$. The parameter $\sigma^{Ji}$ denotes the preference of voter $i$ in group $J$ for the challenger. This parameter has a group-specific, uniform distribution on $\left[-\frac{1}{2\phi_J}, \frac{1}{2\phi_J}\right]$, with density $\phi_J$. Group $S$ is the "swing" group. When compared to $L$, $S$ has a higher density of political preferences ($\phi_S > \phi_L$), and is smaller ($\alpha_S < \alpha_L$). Define $\phi_S = z\phi_L$, with $z > 1$. The intuition here is that, within the larger group of poor voters, incumbents can observe a group of voters that have close-to-neutral political preferences for the current candidates, and are an attractive target for vote buying. Finally, the challenger's relative popularity in the entire population is given by $\delta$, with a uniform distribution on $\left[-\frac{1}{2\psi}, \frac{1}{2\psi}\right]$.

Political preferences here can reflect any dimension in which the candidates differ from each other, apart from policy. These characteristics of each candidate are observable and cannot be altered. While group income, size, and density remain constant, individual political preferences ($\sigma^{Ji}$) are redrawn for every electoral cycle, given the change in candidates. The poor voters belonging to groups $L$ and $S$ are therefore different in each race. As an example, an individual with a strong preference for candidate A over candidate B is allowed to have a strong preference for candidate C over A in a different race. Also, a voter with $\sigma^{Li} > \frac{1}{2\phi_S}$ in period 1 would not be eligible to be part of group $S$. However, in period 2 every poor voter has an equal probability of being in the swing group, which is given by $\alpha_S/(\alpha_L + \alpha_S)$. All the group parameters are exogenous and, while politicians observe these parameters, they do not observe individual preferences.

[22] This is in the spirit of the probabilistic voting model of Persson and Tabellini (2000).
2.1.1 Probability of Reelection

Voters have utility, under politician $P$, coming from public goods ($g_P$) and private consumption ($x_P$), which includes their income and private payments from politicians. I will use a log utility function, separable in the consumption of public and private goods: $U(g_P, x_P) = \beta \log(g_P) + (1-\beta)\log(x_P)$. This functional form for the utility was chosen so the model can be solved explicitly. The main assumption behind this choice is that public and private consumption enter the utility independently. I provide an appendix with the results for a generic concave utility function to show how a more flexible functional form would affect the main model predictions.

Voters choose the candidate that is expected to provide the highest utility in period 2, taking into account their political preference and the challenger's relative popularity. An individual $i$ will vote for the incumbent if $U(g_I, x_I) - U(g_C, x_C) - \delta - \sigma^{Ji} > 0$. In that case, every voter with $\sigma^{Ji} < U(g_I, x_I) - U(g_C, x_C) - \delta$ votes for the incumbent. The incumbent's vote share is:

$$\pi_I = \sum_J \alpha_J \phi_J \left[ U^J(g_I, x_I) - U^J(g_C, x_C) - \delta + \frac{1}{2\phi_J} \right] \qquad (2.1)$$

With $\phi = \sum_J \alpha_J \phi_J$, and under the model's distributional assumptions, the probability that the incumbent is reelected is:

$$\mathrm{Prob}\left[\pi_I > \tfrac{1}{2}\right] = \frac{1}{2} + \frac{\psi}{\phi} \sum_J \alpha_J \phi_J \left[ U^J(g_I, x_I) - U^J(g_C, x_C) \right] \qquad (2.2)$$

This is the objective function to be maximized by the politicians. I assume that they do not extract rents from office other than ego rents, in which case they care only about the reelection probability. Later in this Chapter, I discuss the implications of including rent seeking in the model.

2.1.2 Budget Allocation

The budget $b$ is exogenous, and fixed across time and between politicians. It can be allocated across two types of public goods: one that is consumed by the poor groups ($t_P$), and one that is consumed by the wealthy group ($b - t_P$).[23] Politicians can also allocate resources to vote buying, in the form of cash payments to the swing group.
Because they are unable to observe individual political preferences, the amount of vote buying resources ($v_P$) is divided equally among all voters in the group. These funds are diverted from pre-existing allocations to pro-poor and pro-wealthy public goods, with exogenous shares $s$ and $1-s$. If $N$ is the total number of voters, and $\mu = 1/(\alpha_S N)$, the utilities of the groups of voters under politician $P$ are:

$$H:\quad U(g^H_P, x^H_P) = \beta \log\big(b - t_P - (1-s)v_P\big) + (1-\beta)\log(w)$$
$$L:\quad U(g^L_P, x^L_P) = \beta \log(t_P - s v_P) + (1-\beta)\log(y)$$
$$S:\quad U(g^S_P, x^S_P) = \beta \log(t_P - s v_P) + (1-\beta)\log(y + \theta_P \mu v_P)$$

Candidates also differ in their ability to deliver vote buying payments, denoted by $\theta_P$. While $t_P$ and $v_P$ are decision variables, this ability is exogenous, fixed over time, and observable by voters. It enters the model as a multiplier on $v_P$. If $\theta_P < 1$, the politician is relatively less efficient at delivering vote buying payments when compared to public goods, and vice-versa (the implicit multiplier on public goods is equal to one). Finally, vote buying offers are only made with public resources.

[23] These assumptions reflect the Brazilian context, where municipalities receive most of their revenues from higher levels of government, and where public health and education services are available to all but are only consumed by the poor.

2.2 Vote Buying and Incumbency Advantage

Given that voters are prospective, the incumbent's policy in period 1 is only useful to them as an indication of the expected policy for period 2. Policy here is defined as the allocation of the budget across groups of voters, and types of public goods and vote buying. If the incumbent wins, he replicates the policy from period 1 due to the lack of reelection incentives.[24] The public goods allocation is maintained,[25] and the same vote buying payments are made to the same voters in period 2. Here, the period 1 payments are the first installment of a full transfer to be completed upon reelection.
However, since the incumbent is not eligible in period 2, the question is: why would he honor his promises?

[24] This assumption would be slightly different if rent extraction was allowed, with no change to the results.

[25] I assume that there is a very small cost of shifting policy.

Vote buying is an electoral crime in Brazil.[26] The law prohibits candidates from offering goods or services to voters in exchange for votes. The candidate does not even need to explicitly demand the vote in the exchange, if there is enough evidence that the practice falls in the category of "vote buying". Punishment for this crime includes losing the post, and the political rights for a period of 8 years. In 2000-2009, more than 600 politicians were prosecuted and lost their tenures due to this practice in Brazil. By paying the swing group the first installment, the incumbent undertook an illegal activity. Accordingly, if voters do not receive the promised amount, they may expose the practice to the electoral courts, and thereby trigger an impeachment process.

[26] Article 41 of the Law 9.504/1997.

Voters will still face a competing utility given by the challenger, but challengers are not able to make credible promises without access to budget resources. In the absence of a first installment, challengers are not at risk, and are therefore in a position where they might not honor their promises. However, if they win, they will seek reelection. Thus, voters infer their expected utility coming from challengers from their allocation of public goods and vote buying in period 2, which follows the incumbent's optimization problem. The use of the illegality of vote buying in this model simplifies the commitment problem between voters and politicians, and clearly illustrates the difference in the ability of incumbent and challenger to effectively buy votes.
These features of the local electoral rules, however, might not be present in other contexts, and yet vote buying is still observed around the world under similar characteristics. Modeling the specific and complex features of the bargaining problem between voters, politicians, and possibly brokers (Stokes et al., 2013), is beyond the scope of this study.

Poor voters in groups $L$ and $S$ expect to have the following private consumption under the incumbent: $x^L_I = y$ and $x^S_I = y + \frac{\theta_I v_I}{\alpha_S N}$, respectively. Because the composition of groups in period 2 is not yet known, all poor voters expect from the challenger the same private consumption: $x^L_C = x^S_C = y + \frac{\theta_C v_C}{(\alpha_S + \alpha_L) N}$. While the incumbent's allocation is more advantageous to the swing voters, the challenger's policies benefit the non-targeted poor voters. In this context, incumbency advantage exists because the incumbent is the only politician that can target the swing voters of the current electoral cycle, and their votes are more likely to define the current election than those of non-targeted poor voters.

If politicians do not differ in type, the values of $t_P$ and $v_P$ are the same under both candidates. For the sake of simplicity, I assume that $\theta = \theta_I = \theta_C$,[27] $t_I = t_C = t$, and $v_I = v_C = v$. In this case, it is the private consumption alone that determines the incumbency advantage. Combining the utility of private consumption under each candidate with equation 2.1, with $\mu = 1/(\alpha_S N)$ and $\epsilon = 1/((\alpha_L + \alpha_S) N)$, the vote share of the incumbent is now:

$$\pi_I = \alpha_L \phi_L \log\left(\frac{y}{y + \theta v \epsilon}\right) + \alpha_S \phi_S \log\left(\frac{y + \theta v \mu}{y + \theta v \epsilon}\right) - \delta + \frac{1}{2} \qquad (2.3)$$

According to the equation above, for high enough values of $(\mu - \epsilon)$ and $\alpha_S \phi_S$, there will exist incumbency advantage (i.e., $\pi_I > 1/2$).

[27] This assumption will be relaxed in Proposition 6.
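As a rough numerical illustration of equation 2.3, the sketch below evaluates the incumbent's vote share at a neutral popularity shock ($\delta = 0$). The function name and all parameter values are hypothetical, chosen only so that the utility differences stay inside the support of the uniform preference distributions; they are not calibrated to the data. Vote buying $v$ is held fixed, so the comparison isolates the direct effect of income in equation 2.3 and not the full equilibrium response.

```python
import math


def incumbent_vote_share(y, v, theta, alpha_L, alpha_S, phi_L, z, n_voters):
    """Incumbent vote share from equation 2.3, evaluated at delta = 0.

    Assumes identical politician types (theta_I = theta_C = theta),
    identical allocations under both candidates, and phi_S = z * phi_L.
    All argument names are illustrative, not taken from the thesis.
    """
    mu = 1.0 / (alpha_S * n_voters)               # payment share per swing voter
    eps = 1.0 / ((alpha_L + alpha_S) * n_voters)  # payment share per poor voter
    phi_S = z * phi_L

    # Non-targeted poor voters expect less private consumption from the incumbent.
    non_targeted = alpha_L * phi_L * math.log(y / (y + theta * v * eps))
    # Swing voters expect more private consumption from the incumbent.
    swing = alpha_S * phi_S * math.log((y + theta * v * mu) / (y + theta * v * eps))
    return non_targeted + swing + 0.5


# Hypothetical parameters: poor income normalized to 1 before the transfer.
params = dict(v=20.0, theta=1.0, alpha_L=0.5, alpha_S=0.2,
              phi_L=1.0, z=3.0, n_voters=1000)

before = incumbent_vote_share(y=1.0, **params)
after = incumbent_vote_share(y=1.5, **params)  # 50% income shock
print(f"vote share before transfer: {before:.4f}")
print(f"vote share after transfer:  {after:.4f}")
```

Under these assumptions, differentiating equation 2.3 with respect to $y$ while holding $v$ fixed gives a falling vote share whenever $z > (y + \theta v \mu)/y$, which is why the example uses a swing group that is much denser than the rest of the poor ($z = 3$).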
The equilibrium values of the pro-poor public good ($t$) and vote buying ($v$) are explicitly calculated in the appendix to this Chapter, and they show under which conditions vote buying will exist.

2.3 Model Analysis

This model aims to provide a theoretical framework for the analysis of the effect of a CCT program on incumbency advantage, budget allocation, the profile of politicians, and vote buying. Conditional cash transfers enter the model as a positive income shock to the poor. Such a shock is assumed to be exogenous to local politicians, i.e., they can neither claim credit for its arrival, nor manipulate its allocation, magnitude or timing. In this section, I show how CCT would change the model equilibrium, and affect the political outcomes. To respect the intent of the program, I assume that $w > y$ even after the income shock has occurred. All proofs for the following propositions are in the appendix.

Proposition 1. The implementation of a CCT program reduces the amount of resources allocated to vote buying ($\partial v / \partial y < 0$).

The marginal utility of private consumption of the poor decreases with the income shock. Given that, by assumption, there is no change in the marginal utility of consumption of public goods, it is optimal for the incumbent to reduce the amount allocated to vote buying.

Proposition 2. The implementation of a CCT program shifts the budget allocation toward the pro-poor public good ($\partial t / \partial y > 0$), and away from the pro-wealthy public good. This happens for a low enough value of the parameter $s$, and only if vote buying exists.

An income shock does not affect the marginal utility of public good consumption, so without vote buying, the incumbent would not change the policy allocation. If there is vote buying, it follows from Proposition 1 that the incumbent will channel resources back into public goods.
The magnitude of the reallocation (between the pro-poor allocation $t$ and the pro-wealthy allocation $b - t$) depends on the size and density of the poor and wealthy groups, and on $s$, which is the exogenous share of vote buying resources taken from the pro-poor public good allocation.[28] Incumbents increase their allocation to pro-poor goods for a low enough size ($\alpha_H$) and density ($\phi_H$) of the wealthy group, and for a low $s$. If $s$ is too high, then the increase in the pro-poor spending will be low (or even negative), due to the direct effect of a reduction in $v$ on the allocation to the pro-poor good.

[28] The value of $s$ is not observed in the data and cannot be estimated empirically in this study. However, anecdotal evidence points to a value that is close to neither one nor zero, as surveys indicate that vote buying offers can come in various and flexible forms such as doctor visits (e.g. taken from $t$) or bags of cement or water (e.g. taken from $b - t$).

Proposition 3. The implementation of a CCT program reduces the vote share of the incumbent ($\partial \pi_I / \partial y < 0$). This happens if and only if there is vote buying, and if the swing group has a large enough density of political preferences around the neutral level.

Without vote buying, incumbency advantage would not exist, and the CCT program would not affect vote shares. With vote buying, voters face different utilities coming from the incumbent and the challenger (equation 2.3). Given that the incumbency advantage depends on $v$, it will also decline after the income shock. The necessary condition is that the ratio of the densities of groups $S$ and $L$ ($z = \phi_S / \phi_L > 1$) is at least equal to the ratio of the marginal utilities of private consumption for the groups ($z \geq \frac{y + \theta \mu v}{y}$).

Proposition 4. Both the decline in the incumbent's vote share and the shift to the pro-poor public good are larger when the political preferences of the poor are more neutral ($\frac{\partial^2 \pi_I}{\partial y \, \partial \phi_L} < 0$ and $\frac{\partial^2 t}{\partial y \, \partial \phi_L} > 0$).
This happens if the poor group is large enough, or if it has a more neutral density of political preferences when compared to the wealthy group.

The density of the political preferences of the poor population is given by $\phi_L$ ($\phi_S = z\phi_L$). A higher $\phi_L$ makes the poor a more attractive target for both public goods and vote buying. If group $L$ is an attractive target for public good distribution (i.e., if it either has a large size or a high density of political preferences around zero), then the effect on the attractiveness of public goods dominates. In this case, for a higher value of $\phi_L$, more resources are moved into pro-poor public goods, and the incumbent's vote loss is larger.

Proposition 5. Both the decline in the incumbent's vote share and the shift to pro-poor public goods are larger when the poor population is large ($\frac{\partial^2 \pi_I}{\partial y \, \partial \alpha_L} < 0$ and $\frac{\partial^2 t}{\partial y \, \partial \alpha_L} > 0$). This happens if the distance between the densities of the wealthy and poor groups ($\phi_H$ and $\phi_L$) is relatively small.

A larger $\alpha_L$ relative to $\alpha_S$ increases the uncertainty of poor voters about receiving a vote buying offer from the challenger, improving the incumbency advantage. However, it also makes the poor a more attractive target for public good distribution, which works against the incumbent. When the latter effect dominates the former, the decline in the incumbent's vote share is higher for a larger poor population ($\frac{\partial^2 \pi_I}{\partial y \, \partial \alpha_L} < 0$). This depends on the levels of $\phi_L$ and $\phi_H$. It suffices for the result that these densities are close in magnitude, independent of the direction. The intuition is that both wealthy and poor groups must be relatively attractive to receive the incoming resources from vote buying.

Proposition 6. Incumbent politicians with a higher type will face a larger decline in vote shares ($\frac{\partial^2 \pi_I}{\partial y \, \partial \theta_P} < 0$), and will carry out a smaller shift to pro-poor goods ($\frac{\partial^2 t}{\partial y \, \partial \theta_P} < 0$).

To test this Proposition, I allow politicians to differ in type ($\theta_P$).
A higher type means that the politician is more efficient when it comes to buying votes rather than delivering public goods. Given that transfers reduce vote buying, high-type incumbents reduce the resources invested in an electoral strategy where they are relatively more efficient, which generates the larger vote loss. Also, given that high-type incumbents are less efficient in delivering public goods, their budget shift will be smaller.

2.4 Incorporating Rent Seeking

The appropriation of public resources by politicians is a common feature of developing democracies (Pande and Olken, 2012). In this section, I allow for politicians to extract rent from the public budget $b$. Based on the previous model, an amount $r$ of rents can be extracted from $v$ and, instead of being paid to swing voters, is appropriated by the incumbent ($r$ is a third decision variable). Assume that $\bar{r}$ is the maximum rent that can be extracted without legal consequences, or the politician being exposed for vote buying.

It is easy to see that winning incumbents always extract $\bar{r}$ in period 2, due to the absence of a commitment device after reelection. Because the utility of voters only considers the expected performance of candidates in period 2, incumbents also have no incentive to extract less rent in period 1. Thus, for both candidates, it is optimal to always extract $\bar{r}$ in all periods. The new allocations for $v$ and $t$ are shown in the appendix. Under these assumptions, there is always a positive amount of rent being extracted, but there is no change in the model predictions for $\partial v / \partial y$, $\partial t / \partial y$ or $\partial \pi_I / \partial y$.

A further refinement could be made in the spirit of the agency models, where politicians have an incentive to extract less than maximum rents if it helps them to get reelected. This is the approach used in the existing corruption literature in Brazil (Brollo et al., 2013; Ferraz and Finan, 2011).
The intuition is that candidates differ in their willingness or ability to seek rents, and high-rent politicians might find it optimal to extract less in period 1, so they can hide their true type. Instead of providing a completely new theoretical framework, I illustrate this mechanism in the context of this model by assuming that politicians have an ability to extract rents uniformly distributed in $(\underline{r}, \bar{r})$, unobserved by voters.

They cannot extract more rent than their true type allows. As before, reelected incumbents always extract their true rent type in period 2. When incumbents extract $r$ in period 1, voters then update their expectation for $r_I$ in period 2 as follows: $r_I = 0.5(r + \bar{r})$. For simplicity, I assume that the politician's utility is linear in $r$. Denote the probability of reelection shown in equation 2.2 by $p$, and the true type of the incumbent by $R$. With $0 < \gamma < 1$, the utility of the incumbent is defined as $V_I = \gamma r_I + pR$. This is the new equation to be maximized by the incumbent.

In the appendix I show the new equilibrium values of $\partial v / \partial y$, $\partial t / \partial y$ and $\partial r / \partial y$. Here, rent seeking works as another resource reallocation alternative for an incumbent reducing $v$. The attractiveness of this alternative is given by $\gamma$, and it is easy to see that, when $\gamma = 0$, the equilibrium conditions become similar to the ones in the base case model. As before, an income shock will reduce vote buying ($\partial v / \partial y < 0$), and increase the allocation to pro-poor goods for low enough values of $s$ ($\partial t / \partial y > 0$). Also, the income shock causes an increase in rent extraction. This result comes from the specific assumption made regarding the utility function. If public good and private consumption were complementary in the utility, there would be an equilibrium in which rents are reduced as a result of a CCT program.

For the reelection probability, the intuition is that a higher maximum rent ($\bar{r}$) reduces the incumbent's vote share.
In this case, the expected utility of voters from the incumbent in period 2 is reduced due to rent extraction (which is not the case in the utility coming from the challenger). Accordingly, the income shock would also have a lower effect on the fall in vote shares ($\partial \pi_I / \partial y$), as the incumbency advantage created by vote buying is already less effective with the existence of rents. All in all, although the intuition does not change, the result from Proposition 3 now depends on stronger conditions on the exogenous parameters to hold (shown in the appendix).

Chapter 3

Cash Transfers, Clientelism, and Political Enfranchisement

Motivated by the previous theoretical predictions, this Chapter shows that transfers that are shielded from the influence of local politicians improve the electoral process and contribute to the political enfranchisement of the poor. As mentioned before, the Bolsa Família (BF) program in Brazil was aggressively marketed as a federal program, limiting the ability of local politicians to claim credit for its implementation. There are, however, two main challenges to identifying the exogenous effects of this program in the local political environment: the implementation was not randomized, and the program is jointly run by the central and municipal governments, giving mayors some leeway to manipulate the enrollment of beneficiaries.

Accordingly, this Chapter estimates the political effects of the CCT program using a novel identification strategy. It relies on the part of the cross-municipal variation in CCT coverage generated by the Family Health Program (FHP), a household-based health program run by municipalities, and funded by the central government. For this identification strategy to be effective, two things are required. First, there must exist exogenous assignment of FHP funding or coverage.
In this respect, since August 2004, municipalities with population below 30,000, and a human development index (HDI) below 0.7, have been eligible to receive 50% additional federal funding for the FHP. This discontinuous assignment of program funding allows this study to instrument local CCT coverage with this funding differential, using a multivariate regression discontinuity design (MRDD).

Second, it is required that the FHP funding has a strong impact on the CCT coverage (first stage), and no meaningful direct effect on political outcomes (exclusion restriction). With respect to the first stage, the impact of the FHP funding on CCT coverage comes from two institutional features of the BF program: the way information about BF was disseminated, and the program permanence rules. Access to information about BF was key to the enrollment of potential beneficiaries in the early years (2004-06). Where local city administrations had few resources to promote enrollment, the FHP teams had sufficient capillarity and penetration among the poor to disseminate the information, therefore supporting higher enrollment rates where the FHP was better funded. Moreover, BF program rules allow households to continuously receive the benefit for a minimum of four years once they are enrolled (much more in many cases), which allowed the coverage differentials created between 2004 and 2006 to persist over the long term (2012+).

With respect to the exclusion restriction, there is strong evidence that the political effects of the FHP operate almost exclusively through the channel of CCT coverage. This evidence emerges from the use of an MRDD for the estimation. MRDDs differ from single-variable cases on the basis of their implementation methods and interpretive framework.
For example, treatment effects are identified along a frontier of points on a two-variable space, rather than on a single point (for the variables for which the discontinuous assignment exists, in this case: population and HDI). This allows the researcher to observe the heterogeneity of these treatment effects for municipalities with different values of population and HDI along this frontier. In this context, the exclusion restriction is supported by the fact that, although the FHP funding differential is applied to the entire frontier, effects on political outcomes are only observed for the same group of municipalities for which differentials in CCT coverage are also observed.

I estimate the MRDD nonparametrically, and most results are robust to the choice of kernel, bandwidth, and the inclusion of control variables. This Chapter provides estimates based on a data-driven plug-in algorithm for bandwidth selection in the multivariate case, adapted to the specific demands of this study, and fully described in the appendix.

3.1 Institutional Background

In this Section I briefly describe the main institutional features of the Brazilian CCT program, the Family Health Program, and the local political environment in Brazil.

3.1.1 Brazilian CCT program

Bolsa Família (BF) is the largest CCT program in the world, covering 13 million households (nearly 20% of the population in Dec 2012). It was created in 2003 and unified other smaller CCT programs previously run by different government ministries.[29] In a nutshell, households with per capita income below a certain threshold are eligible to receive a monthly BF grant, which varies according to the number of children in the family under 18 years of age.
For example, a family of two adults and two school-age children with per capita income at the lower threshold (R$70) would receive R$134, roughly a 50% income increase.

Eligibility is based on self-declared income, but households are subject to audits run by both the local and federal program offices. Program permanence is subject to compliance with conditionalities, particularly school attendance and, for targeted populations, health check-ups. The BF operations are run jointly by the central and local governments.[30] The former determines the major guidelines (the annual budget, the total cap on the benefits, the eligibility thresholds and the municipal coverage targets), controls the approval and cancellation of benefits, and pays the beneficiaries directly through a cash-card system. Local offices are responsible for the enrollment process, household data collection, requesting cancellations and additions, keeping the registry updated, and checking whether the conditionalities are being met by the families.

[29] See Fried (2012) and Zucco (2013) for a historical account of the program.

[30] For a more extensive analysis, see Lindert et al. (2007).

3.1.2 CCT and Local Politics

The Brazilian political system has a low level of ideological identification (Ames and Smith, 2010). The party system is highly fragmented and local political alliances tend to span the entire ideological spectrum. Mayors have a two-term limit and elections are held every four years in one round, under majority rule.[31] Between 2000 and 2012, 70% of eligible incumbents ran for reelection, and nearly 60% were reelected. Voting is mandatory and the average turnout in the 2008 and 2012 elections was 83%.
Accordingly, this study will focus on the reelection of candidates and not parties.

[31] The runoff system exists for larger municipalities in Brazil, which are not included in this sample.

The implementation of most government policies in Brazil is decentralized and programs are often financed jointly by federal, state, and municipal administrations. The majority of revenues for small municipalities come from transfers from higher levels of government; and clientelism and pork barrel spending are abundant at all levels (Alston and Mueller, 2006; Fried, 2012). Given that mayors have significant control over the budget allocation (Ferraz and Finan, 2011), vote buying offers that are financed through public funds and services are endemic to the political culture. Many of these exchanges also include bestowing administrative favors, such as access to health services or redirecting supplies from public construction projects.

The innovative BF program was specifically designed to reduce local political interference and promote the central government brand. Surveys indicate that its beneficiaries perceive BF as being more resistant to local political manipulation than other government programs (Rego and Pinzani, 2013; Sugiyama and Hunter, 2013). BF funds now represent the second most important source of federal government transfers to municipalities, comprising more than 12% of the total transfers, and these funds represent a disproportionately important source of revenues in less populated areas. While BF total spending represents ~0.5% of Brazil's GDP, the BF transfers to small municipalities represent nearly 5% of the local budget.

3.1.3 The Family Health Program (FHP)

The FHP was created by the Ministry of Health (MH) in 1994. The program provides teams of health professionals that regularly visit households to provide basic health care; teams include a minimum of one family doctor, one nurse, one assistant nurse, and six health agents.
Each team is responsible for a geographic area within the municipality, and serves a population of up to 4,000 by keeping a registry of clients, providing home visits, and functioning as the first point of access to the broader health care system. Given that the majority of the middle and high income population use only the private health care system (Alves and Timmins, 2003), the FHP is, in practice, providing services to the poor population.

The identification strategy in this study uses a discontinuity in funding for the FHP program to instrument the cross-municipality variation in CCT coverage. Basic health care in municipalities is co-financed by federal, state, and local resources. The federal transfers used to finance the FHP teams are paid monthly as a fixed amount per team.[32] These payments were uniform across municipalities until August 2004, when municipalities with population below 30,000, and an HDI below 0.70,[33] started to receive an extra 50% funding per team (see the timeline of events in Figure 3.1). The HDI for eligibility was calculated based on the 2000 census, and the population was referenced using the 2003 estimates of the government's statistics department (IBGE). Both these values, and the list of eligible municipalities, have not been updated since.[34]

[32] The federal FHP transfers represent roughly two-thirds of the basic attention funds, which in turn represent 6% of all the direct transfers to municipalities, including CCT.

[33] The population limit is 50,000 for the states that form the legal Amazon, including the entire North region (7 states), and the states of Maranhão and Mato Grosso. This region is excluded from our sample.

[34] The list of locations eligible for the benefit that was released in 2004 has not changed even with the publication of new population estimates and a new census in 2010. This means that the eligibility for treatment could not have been manipulated by local political authorities. The original list is contained in the following decree: PORTARIA Nº 1.434/GM, July 14, 2004.

3.2 Using a Discontinuity in the FHP Funding as Research Design

The Bolsa Família program was first implemented in late 2003 and has had full, or nearly full, global coverage since 2006. Nevertheless, there is significant variation in coverage between municipalities, even after taking into account the number of eligible poor families. In 2012, the average local coverage exceeded the estimated number of eligible families by 10%, with a standard deviation of 16% (Figure 3.3).[35] This study uses the cross-municipality coverage variation to identify the effects of CCT on political variables. Given that the joint administration of the program between central and local governments allows for local political manipulation of eligibility, this study will not treat this variation as exogenous, in contrast to previous attempts in the literature.[36]

[35] The global coverage target for BF is the sum of local targets and is binding. The local coverage targets, however, are not binding. The sample has a large number of municipalities with coverage higher than 100%.

[36] Zucco (2013) argues that this differential can be treated as a quasi-random distribution within groups of similar cities, and Fried (2012) shows that this differential is not determined by national politics.

I use a discontinuity in the funding for the FHP (described above) as an instrument for CCT coverage. I argue that the allocation of FHP funds prompted a differential in program enrollment across municipalities in the early years of the BF program.
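The funding discontinuity used as the instrument rests on the two-variable eligibility rule described in Section 3.1.3. A minimal sketch of that rule is given below. The function and argument names are hypothetical, and treating both cutoffs as strict inequalities is an assumption based on the wording "below"; the Legal Amazon branch is included only for completeness, since that region is excluded from the sample.

```python
def eligible_for_extra_fhp_funding(population_2003: float,
                                   hdi_2000: float,
                                   legal_amazon: bool = False) -> bool:
    """Return True if a municipality lies inside the eligibility region.

    Hypothetical helper: the cutoffs (30,000 inhabitants, or 50,000 in the
    Legal Amazon, and an HDI of 0.70) come from the text; the strict
    inequalities are an assumption.
    """
    population_cutoff = 50_000 if legal_amazon else 30_000
    return population_2003 < population_cutoff and hdi_2000 < 0.70


# Illustrative municipalities (made-up values, not from the data).
print(eligible_for_extra_fhp_funding(25_000, 0.65))  # inside both cutoffs
print(eligible_for_extra_fhp_funding(25_000, 0.72))  # HDI above the cutoff
print(eligible_for_extra_fhp_funding(35_000, 0.65))  # population above the cutoff
```

Because eligibility requires both conditions at once, the treatment boundary is a frontier in the population-HDI plane and not a single cutoff point, which is what motivates the multivariate design.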
While the smaller, older CCT programs had at most 6 million beneficiaries in 2002 (Zucco, 2013),[37] BF’s coverage target was 11.1 million families (Figure 3.3). In order to achieve this, a massive migration from older programs had to be undertaken, along with a concerted effort to reach new entrants. Smaller municipalities lacked the resources necessary to reach all of the eligible households, given that they did not receive federal resources for program administration prior to 2006. At this time, FHP teams provided one of the most important sources of information about government benefits, given their reach and penetration within poor communities.

[37] This number is an overestimation, given that hundreds of thousands of households which were benefiting from more than one program at the same time were double counted.

Evidence supporting this mechanism comes from a 2009 survey[38] including more than 10,000 CCT-eligible households. The survey shows that health care agents asked the household about their coverage status in 50% of the visits. In 12% of the households surveyed, the information about BF came first from a health professional.[39] Another 9% of the respondents stated that they would speak to a health professional when it came to questions about the program, in preference to other local officials.

[38] AIBF - Avaliação de Impacto do Bolsa Família.
[39] This percentage is calculated on the basis of including only respondents that learned about the program from a first-hand source. This excludes information coming from family and friends, which in turn may also have been acquired from a first-hand source such as the media, schools, health professionals, etc.

There are two main reasons that account for early CCT coverage differentials persisting in the long run. First, the target of 11.1 million households was reached in 2006 and capped until 2009, which meant that new beneficiaries could only join when someone left BF.[40] Second, although beneficiaries are required to update their information every two years, under penalty of losing the benefit, this rule was not properly enforced prior to 2009. When the government started to increase the number of audits, it also created a permanence rule,[41] which allowed households that surpassed the income threshold to receive the benefit for two additional years. This long term persistence of coverage differentials allows me to examine the political impact of the program up to 9 years after its implementation.

[40] The global coverage target for BF based on the 2004 PNAD survey was 11.1 million families, effective between 2004 and 2009. The target was changed in 2009 and in 2011, in order to incorporate data from the 2006 PNAD survey and the 2010 census, respectively.

This identification strategy also relies on the assumption that the FHP funding differentials do not have a significant direct effect on political outcomes (exclusion restriction). Supporting evidence emerges from the use of a MRDD for the estimation, which allows me to observe the heterogeneity of the FHP funding effects on both CCT coverage and the political outcomes, over different values of population and HDI (details in Section 3.4 and Figure 3.4). The idea is that, although the FHP funding differential applies to the whole sample, effects on political outcomes are only observed for the same group of municipalities for which differentials in CCT coverage are also observed.

The intuition for the existence of heterogeneous effects on CCT coverage is the following: CCT coverage post-2004 is only likely to respond to FHP funding in municipalities that are neither very poor nor very wealthy. Very poor municipalities (low HDI) were the main targets of the old CCT programs, which means that the information about BF was widely available in these locations prior to 2004.
On the other hand, wealthy municipalities (high HDI), especially the ones with low population, are likely to have a higher budget, more employees, and better coverage of health services. In these places, it was easier for city administrators to boost BF enrollment in the early years, severely limiting the impact of the additional FHP funding on information dissemination. If CCT coverage differentials are only observed in the intermediate municipalities, the same should hold for the political effects.

[41] Legislation: Portaria MDS No 617 from August 11, 2010.

Before proceeding to the empirical strategy, there are two other issues to address. First, given that the FHP funding discontinuity was introduced after the CCT program (Figure 3.1), the exclusion restriction cannot be directly tested. Section 3.5.3 provides detailed evidence to support a valid exclusion restriction.

Second, the identification relies on the assumption that no other relevant variable follows the same pattern as the FHP funding at the discontinuity. The most important source of transfers from the Brazilian central government to municipalities is the Fundo de Participação dos Municípios (FPM). The FPM is distributed in a discontinuous form across several population thresholds, where larger municipalities receive more funding. One of the population thresholds, at 30,564, is close to the population frontier of 30,000 in this study. Nevertheless, Section 3.5.4 shows strong evidence that the FPM does not drive the results. Two other alternative tests are performed to rule out the existence of potential omitted variable biases. First, the effects are estimated for all outcome variables in two periods, pre-treatment (see Table 3.4).
Second, the paper shows the balance of fixed and pre-determined municipality characteristics, also before treatment (Table 3.1).

3.3 Data Sources and Description of Variables

Data on municipal CCT coverage has been available on a monthly basis from the MDS since January 2004. CCT coverage is measured as the percentage difference between the number of households receiving the benefit, and the number of eligible households.[42] In addition to Bolsa Família, I also include households receiving the older CCT programs from the Ministry of Education (Bolsa Escola), the Ministry of Health (Bolsa Alimentação) and the MDS (Cartão Alimentação).[43] Nearly all the beneficiaries of these programs migrated to BF between 2003 and 2006, which means that their number is not meaningful after 2008.

[42] The number of eligible families was estimated three times by the MDS, based on the PNAD surveys from 2004 and 2006, and the 2010 census. The 2006 and 2010 estimates are an improvement over the previous one on the basis of including a coefficient for income volatility. For the coverage in 2008 and 2012, I use the estimates of poor families from 2006 and 2010, respectively. For the pre-treatment CCT coverage calculated in June 2004, I use the 2004 estimate.

The federal government provides monthly data with respect to the basic health transfers to municipalities, including the FHP funding.
The MH makes data available with respect to coverage by health teams, as well as some health outcomes. These outcomes are measured within the scope of the public health system; while they might not be a good proxy for the overall quality of health services in a given municipality, they are a good proxy for the quality of the services provided to the poor share of the population.

[43] Although households could only migrate to BF if they no longer received other benefits, there was a portion of recipients of the older programs that had more than one benefit. They would be double-counted in the sample. Since the number of beneficiaries in these older programs is not discontinuous at our thresholds of interest, and their number is 0.1% of the sample in 2008, and 0.0% in 2012, this should not be cause for bias.

Annual budget allocation data has been obtained from the National Treasury database (FINBRA), which breaks public expenses down in two ways: first, in terms of capital investment, personnel expenses, and other expenses; and second, in terms of function (e.g., education and health).[44] Not all municipalities release the data every year. I only use municipalities that released four years of data in at least one of the mayoral tenures of interest here (2005-08 and 2009-12), as well as for the base period of 1997-00, which is used as a control.[45] Thus, the sample used to estimate the effects on budget shares is a subset of the main sample.

[44] I exclude from the sample all the municipalities that report a zero share of budget in either personnel or capital expenses, and also in education or health expenses. This is most likely a reporting error.
[45] Health expenses were only reported as a separate category after 2000. Previously they were aggregated with spending in sanitation. Thus, all regressions including health expenses for the mayoral tenure of 1997-2000 also include sanitation expenses. They are not fully comparable to the ones using the later tenures (although sanitation was on average only 6% of the aggregated expenses in 2004-2012). For the year of 2001, I simply assume that health and sanitation expenses had the same ratio as in 2002-04 (same mayoral tenure), and I adjust the data accordingly.

Election data comes from the Federal Electoral Authority (TSE). For the four municipal elections held between 2000 and 2012, I extract the following variables: the incumbent’s vote share, as a percentage of valid votes; the margin of victory, as the difference between the winner and the runner-up in percentage points; the number of candidates; the education level of the challengers, as the number of challengers with high school education among the ones that ranked first or second in a competitive election;[46] campaign donations from the private sector;[47] and turnout. Data from the 2008 and 2012 mayoral elections is used. The 2004 election happened two months after the introduction of the discontinuity, so any effects are unlikely to be observed. For robustness, the paper shows the estimation results for the 2000-2004 period.

I classify the main Brazilian political parties according to their level of clientelism, on the basis of data from the Democratic Accountability and Linkages Project (DALP).[48] The variable labeled “challenger’s party” is defined as the number of challengers in clientelistic parties, among the ones that ranked first or second in a competitive election. In keeping with the proposed model, the sample includes only municipalities where the mayor has reelection incentives. The subset is determined using the results from the previous election.[49] Cases in which an eligible mayor did not run for reelection are not excluded from the main sample, as this decision is likely to be endogenous.
However, for the estimation of the election outcomes, only municipalities where the incumbent is actually running can be used.[50]

[46] A competitive election is one where the margin of victory was less than 10pp.
[47] I only use campaign donations from corporations and individuals (which excludes donations from parties and candidates). I also aggregate all direct donations to the mayoral candidate, to the municipal election committee, and to the municipal party branch (noting that the last two can also be spent on campaigning for assembly members). Donations data are not available for the 2000 election nor for all incumbents.
[48] The DALP (Democratic Accountability and Linkages Project) is a survey from 2008 where political experts from several countries respond to questions about the political behavior of local parties. The project is supported and made available by the Political Science Department at Duke University. I use the scores from the four questions related to the intensity with which parties use clientelistic exchanges to gather votes. The parties with an average score above 3 (out of 4 in an increasing scale) are identified as clientelistic parties, with value = 1. All other parties are identified with value = 0. All small parties that were not evaluated by DALP are identified as non-clientelistic. Between 2008 and 2012, 95% of municipalities were governed by a party that was represented in the DALP survey.
[49] In some cases the municipal election was ruled illegal by the electoral courts and a new, supplementary election was called. In this case, the results of the supplementary election were used to appoint the incumbent.
[50] I do not include mayors that did not achieve their post by way of an election (e.g. vice mayors who may have inherited the position following a resignation), given that they did not have a vote share in the previous election. Also, the timing of such an event may be too close to the forthcoming election, which suggests that the reelection incentives for these mayors may be insignificant to budget allocation. Although most of these non-elected incumbents can be identified in the data, adding them to the sample does not alter the results.

The following pre-determined variables come from the 2000 census: age profile, as the share of population aged 20-50; income inequality, as the population share of the top 10% in income divided by the share of the bottom 40%; share of urban population; share of males; and schooling, calculated as the share of household heads having completed high school. The GDP per capita is the average from 2000 to 2002. For the purposes of this study, municipalities with a small number of poor households are likely just generating noise. Thus, the main specification only uses municipalities with at least 25% of poor households, roughly 60% of the full sample.[51] The sample only includes municipalities created prior to 2000, and all data in R$ was converted to real values based on the CPI from Dec 2012 (IPCA).

[51] Results for the estimation using the full sample are provided in the appendix.

3.4 Empirical Strategy

In this Section I briefly describe the use of a multivariate regression discontinuity design as an estimation strategy, and provide a link between the model of Chapter 2 and the statistical parameters estimated in this Chapter.

3.4.1 The Multivariate Regression Discontinuity Design (MRDD)

Single score RD designs have been widely explored in recent economic applications, and are generally seen as one of the most credible identification strategies (Lee and Lemieux, 2010; Keele and Titiunik, 2014). An extension of the RD approach is the case where the treatment eligibility is determined by two running variables, e.g., latitude and longitude (Gerber, Kessler, and Meredith, 2011; Dell, 2010; Keele and Titiunik, 2014) or test scores (Jacob and Lefgren, 2004; Papay, Willett, and Murnane, 2011; Zajonc, 2012; Clark and Martorell, 2014). In the two-score case (MRDD), the average treatment effect (ATE) is identified for a frontier of points, in contrast to a single point in the one-score case. In this study, a municipality \(m\) with population \(p_m\) and HDI \(h_m\), with respective treatment cut-offs at 30 (thousands) and 0.70, has the ATE defined over the following frontier (Figure 3.4):

\[ F = \{(p_m, h_m) : (p_m \le 30,\ h_m = 0.7) \cup (p_m = 30,\ h_m \le 0.7)\} \]

This changes the estimation and interpretation of the treatment effects within the RD framework, mainly due to potential heterogeneity of these effects along the frontier. The literature on identification and estimation of MRDDs is sparse and lacks consensus on a definitive strategy. Papay, Willett, and Murnane (2011) propose a framework to estimate the ATEs nonparametrically when there are multiple treatments, which is not the case for this project. Although Reardon and Robinson (2010) and Wong, Steiner, and Cook (2013) review potential estimation strategies, they focus on the average effects, without emphasis on the heterogeneity along the frontier.

In several applications, researchers approached the problem by reducing it to a single-score RDD. This can be accomplished by estimating two separate ATEs for the two running variables (Reardon and Robinson, 2010; Wong, Steiner, and Cook, 2013), or by collapsing the variables into one single score. This single score is usually defined as the minimum distance to the frontier, among the values of the multiple scores (Jacob and Lefgren, 2004; Dell, 2010; Clark and Martorell, 2014).
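The single-score reduction described above can be illustrated with a minimal Python sketch. It is not the procedure used in this chapter (which keeps both scores), and it shows only one common variant, the "binding score": each running variable is centered so that positive values mean eligibility on that dimension, and the smaller of the two margins is kept. The function names, the default scaling factors, and the choice to treat units exactly at the cut-off as eligible are all assumptions made for illustration.

```python
def collapse_to_single_score(pop_thousands, hdi,
                             pop_cutoff=30.0, hdi_cutoff=0.70,
                             pop_sd=1.0, hdi_sd=1.0):
    """Collapse two running variables into one 'binding' score.

    Each margin is positive when the municipality is eligible on that
    dimension. Dividing by a (hypothetical) standard deviation puts the
    two margins on a comparable scale. The minimum is non-negative only
    when both eligibility conditions hold.
    """
    pop_margin = (pop_cutoff - pop_thousands) / pop_sd
    hdi_margin = (hdi_cutoff - hdi) / hdi_sd
    return min(pop_margin, hdi_margin)


def is_treated(score):
    """Assumed rule: treated when the collapsed score is non-negative."""
    return score >= 0


# Hypothetical municipalities, for illustration only.
inside = collapse_to_single_score(25.0, 0.65, pop_sd=10.0, hdi_sd=0.1)
outside = collapse_to_single_score(35.0, 0.60, pop_sd=10.0, hdi_sd=0.1)
print(is_treated(inside), is_treated(outside))
```

The sketch also makes the scale problem visible: without the standard-deviation scaling, a margin of 5 (thousand inhabitants) and a margin of 0.05 (HDI points) are not comparable, so the HDI margin would almost always be the binding one.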
The minimum-distance approach, however, is more compelling when the variables are on the same scale, as with test scores. Another noteworthy alternative is to assume a parametric[52] function over the two score variables, as suggested by Reardon and Robinson (2010). Dell (2010) provides an application of this strategy by using a cubic polynomial in a geographical MRDD.[53] In both these cases, if there are enough observations on both sides of the entire frontier, the heterogeneity of the ATE can be consistently estimated using fixed effects for frontier segments, interacted with the treatment dummy. This is not, however, the case of the sample here (Figure 3.4).

[52] In this parametric approach, the ATE along the frontier can only be inferred if the polynomial on the scores does not include interactions with the treatment dummy (spline). Under the inclusion of such interactions, the coefficient measuring the treatment dummy will have a different interpretation. It will reflect the conditional ATE at the point where the running variables equal the cut-offs, in contrast to the average treatment effect for the entire frontier.
[53] More precisely, Dell (2010) adopts a semi-parametric approach as the local polynomial is estimated for different bandwidths in distance to the treatment border.

There are two good reasons to explore the heterogeneity of the effects here. First, the two scores have a distinct nature. The sub-populations being compared along the frontier, by way of either a minimum distance approach or a parametric approach, might differ considerably. This would defeat the spirit of the RDD.[54] Second, the estimation of the effect of CCT on political outcomes depends on having a strong instrument for CCT coverage. The former approaches would mask areas of the frontier where the instrument is weak or ineffective. This paper follows the general estimation approach in Zajonc (2012), in which the conditional ATE (CATE) is estimated for several points of the treatment frontier, and the average effect for any frontier segment is derived by averaging those CATEs. This approach makes the treatment heterogeneity along the frontier fully observable, and ensures that the observations being used are in the same neighborhood with respect to the scores.

[54] A municipality with a 30,000 population with HDI below 0.60 might be compared to one with a 3,000 population and a 0.7 HDI. The spirit of the RDD is to match observations that are in the same neighborhood in regards to the score variables, which is not necessarily the case under these approaches.

For scores \(p_m\) and \(h_m\), the CATE at a point \((p, h)\) in the frontier is given by equation 3.1 below. \(N^{+}_{\epsilon}(p, h)\) and \(N^{-}_{\epsilon}(p, h)\) are neighborhoods of radius \(\epsilon\) around point \((p, h)\), comprised of treated and non-treated observations, respectively. Zajonc (2012) shows that this effect can be identified by way of assumptions similar to the single-score problem. Namely, the orthogonality of treatment assignment to the outcome variable; the positivity of the frontier, to assure that points near the frontier do exist; and the continuity of both the conditional regression functions \(E[y_m(1) \mid p_m = p, h_m = h]\) and \(E[y_m(0) \mid p_m = p, h_m = h]\), and the marginal joint density of the scores along the frontier.

\[ \mathrm{CATE}(p, h) = \lim_{\epsilon \to 0} E\left[y_m \mid (p_m, h_m) \in N^{+}_{\epsilon}(p, h)\right] - \lim_{\epsilon \to 0} E\left[y_m \mid (p_m, h_m) \in N^{-}_{\epsilon}(p, h)\right] \tag{3.1} \]

Following the recommendation for the similar, single-score RDD (Imbens and Lemieux, 2008; Imbens and Kalyanaraman, 2012), the CATE for the point \((p, h)\) can be estimated nonparametrically by local linear regression, using the two scores as regressors. In municipality \(m\) at period \(t\),[55] CCT coverage is defined as \(cct_{mt}\).
The first stage of the instrumental variable estimation is shown in equation 3.2 below.

\[ cct_{mt} = \alpha_0 + \alpha_1 D_m + \alpha_2 p^c_m + \alpha_3 h^c_m + \alpha_4 D_m p^c_m + \alpha_5 D_m h^c_m + \delta_t + \delta_s + X_m + \nu_{mt} \tag{3.2} \]

The treatment effect is denoted by \(\alpha_1\), where \(D_m = 1[p_m \le 30,\ h_m \le 0.7]\). The values of population and HDI centered around point \((p, h)\) are denoted by \((p^c_m, h^c_m)\). I will also include state effects (\(\delta_s\)), a period dummy (\(\delta_t\)), and a vector of municipal controls (\(X_m\)).[56] This is usual in RDDs to reduce the sample variability (Lee and Lemieux, 2010). The local linear regression will be weighted by the edge kernel in the main specification, but the robustness to the choice of kernel will also be tested. All kernels are two-dimensional, defined as the following product kernel: \(K(u_1, u_2) = K(u_1) \cdot K(u_2)\).

[55] A period is defined as the 4-year electoral term, and the CCT coverage is measured at the end of the period. In this case, the estimation is done for the values of coverage in 2008 and 2012 (December).
[56] Unless otherwise noted, the following variables are included as controls: latitude, longitude, their interaction, area, pre-treatment CCT coverage, pre-treatment FHP coverage, share of Old CCT beneficiaries in the population, GDP per capita (log) and the share of males in the population. The regressions for the incumbent’s vote share and campaign donations also include the share of votes in the last election and a dummy indicating if the candidate belongs to the federal party. For variables that measure the education and clientelistic party affiliation of the challengers, dummies are included for both the federal party, and the clientelistic party affiliation of the incumbent in that election. Finally, for the budget variables, the past share of the budget (1997-2000 tenure) is also included as a control.

This equation can be estimated for any point of the frontier, by centering the scores at the desired point. The effects are estimated for 19 points, limiting the data to a bandwidth defined over the two score variables.[57] For a simpler interpretation of the bandwidths, I will normalize the score variables to the same scale, according to their standard deviation. The average effect for any frontier segment (\(Avg\)) can be estimated as the average of CATEs for \(K\) points along the frontier, weighted by the joint density \(f(p_k, h_k)\) of each point (equation 3.3).

\[ \widehat{Avg} = \frac{\sum_{k=1}^{K} \mathrm{CATE}(p_k, h_k)\, \hat{f}(p_k, h_k)}{\sum_{k=1}^{K} \hat{f}(p_k, h_k)} \tag{3.3} \]

[57] As an example, for the point \(p_m = 25\) and \(h_m = 0.7\), the centered values are \(p^c_m = -5\) and \(h^c_m = 0\). For bandwidths of 10,000 in population and 0.1 in HDI, the data used in the estimation is \(D = \{(p_m, h_m) : 15 \le p_m \le 35,\ 0.6 \le h_m \le 0.8\}\).

Given that the subsamples used in the estimation of each CATE might overlap, the standard errors of the averaged coefficients are bootstrapped. Confidence intervals are calculated using the bias corrected and accelerated bootstrap method (Efron, 1979).

Another advantage of using a nonparametric strategy is the possibility of using a formal process for bandwidth selection. I present results estimated under optimal bandwidths, calculated with a data-driven plug-in algorithm for bandwidth selection in two dimensions. This algorithm is based on Zajonc (2012), in the spirit of Imbens and Kalyanaraman (2012). In contrast to the procedure proposed by Zajonc (2012), this algorithm allows the use of a different optimal bandwidth for each frontier point for which the CATE is estimated. It also allows the use of different bandwidths for the two score variables, reducing the estimated mean squared error of the coefficients. Excessively wide bandwidths are a common problem in outputs of plug-in algorithms. This procedure puts a cap on the maximum bandwidth, effectively limiting the amount of bias in the estimation. Finally, the algorithm is expanded to estimate the bandwidth for different kernels.
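To make the weighting steps above concrete, the following Python sketch shows a product kernel built from a one-dimensional edge kernel, and the density-weighted average of point estimates from equation 3.3. It is only an illustration under stated assumptions: the edge kernel is taken to be the triangular kernel, the kernel arguments are assumed to be distances already divided by the bandwidth, and the function names and numbers are hypothetical rather than taken from the estimation code behind this chapter.

```python
def edge_kernel(u):
    """Triangular ("edge") kernel: weight 1 - |u| inside [-1, 1], zero outside."""
    return max(0.0, 1.0 - abs(u))


def product_kernel(u1, u2):
    """Two-dimensional product kernel K(u1, u2) = K(u1) * K(u2)."""
    return edge_kernel(u1) * edge_kernel(u2)


def segment_average(cates, densities):
    """Density-weighted average of point estimates along a frontier segment."""
    if len(cates) != len(densities):
        raise ValueError("need one density value per CATE")
    total = sum(densities)
    if total <= 0:
        raise ValueError("densities must sum to a positive number")
    return sum(c * f for c, f in zip(cates, densities)) / total


# Hypothetical numbers, for illustration only (not estimates from the thesis).
example_cates = [0.10, 0.30]
example_densities = [1.0, 3.0]
print(segment_average(example_cates, example_densities))
```

With these made-up inputs the second point carries three times the weight of the first, so the segment average is roughly 0.25 rather than the unweighted mean of 0.20. The product form of the kernel also means an observation receives zero weight as soon as it falls outside the bandwidth on either score.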
I describe the construction of the bandwidth-selection algorithm in the appendix (Section B.2). As a robustness check, I also present results for constant bandwidths of 0.90 and 0.75 standard deviations.

For any political outcome \(out_{mt}\), equation 3.4 below shows the second stage of the 2SLS estimation. Here, the effect of CCT on political outcomes is estimated using the predicted values of coverage from the first stage. The interaction between treatment \(D_m\) and the score variables is also included in the second stage.[58] The coefficient \(\beta_1\) represents the conditional ATE of CCT coverage on political outcomes, at the specific frontier point for which it was calculated. Average effects for any frontier segment, defined here as \(\widehat{IV}_{Avg}\), can be calculated using equation 3.5 below.

\[ out_{mt} = \beta_0 + \beta_1 \widehat{cct}_{mt} + \beta_2 p^c_m + \beta_3 h^c_m + \beta_4 D_m p^c_m + \beta_5 D_m h^c_m + \delta_t + \delta_s + X_m + \epsilon_{mt} \tag{3.4} \]

\[ \widehat{IV}_{Avg} = \frac{\sum_{k=1}^{K} \beta_{1k}(p_k, h_k)\, \hat{f}(p_k, h_k)}{\sum_{k=1}^{K} \hat{f}(p_k, h_k)} \tag{3.5} \]

[58] The single instrument used in the estimation is the ATE for CCT coverage at each discontinuity point.

Although I run the regression for a total of 19 points along the frontier, the main specification in this paper uses the frontier segment where the instrument is deemed strong.[59] Section 3.5.1 considers the reasons why the instrument is strong in only selected parts of the sample. The average optimal bandwidths for each outcome variable in this segment are shown in Table B.

[59] I use the segment that aggregates a continuum of 6 points where the average of the t-tests for the coefficient of the CCT coverage variable is at least 3.2, which corresponds to the rule of thumb statistic for the instrument to be considered strong in a just identified RDD. The segment selected using this strategy corresponds to the following: \(Segment = \{(p_m, h_m) : (p_m \ge 27.5,\ h_m = 0.7) \cup (p_m = 30,\ h_m \ge 0.65)\}\).

3.4.2 Mapping the Theory to the Data

The model in the last Chapter presented six predictions for the effect of the CCT program on budget allocation, incumbency advantage, quality of politicians, and vote buying.
These predictions are tested empirically using the cross-municipal variation in CCT coverage. This variation is instrumented by the discontinuity in the FHP funding, as described in Section 3.2, for municipalities at the treatment frontier. The effect of the CCT program on any political outcome is always calculated, for a given segment of the treatment frontier, by equation 3.5 defined in Section 3.4.

Proposition 2 predicts that transfers would increase pro-poor spending by incumbents, and it is tested using the local budget allocation. The data categorizes the spending on the basis of type and function. By type, the three observed categories are capital investment, personnel expenses, and other expenses. By function, the paper uses six categories, as follows: education, health, administration, urbanization and housing, social security, and transportation. They represent nearly 90% of the total spending. I define the pro-poor public good in terms of spending in education and health services. The effects of CCT coverage on the pro-poor and pro-wealthy spending are used to test the sign of the terms \(\partial t/\partial y\) and \(\partial (b - t)/\partial y\), respectively.

The variable that maps the data to Proposition 3 (transfers reduce an incumbent’s vote share) is the share of votes of the incumbent. The effect of CCT on this variable is a proxy for the value of \(\partial I/\partial y\). In addition, I will examine secondary political outcomes that might lend support for this result (e.g. campaign donations, margin of victory in elections, number of candidates in elections).

Proposition 6 states that incumbents that are less efficient in public good distribution lose more votes and conduct a smaller shift towards pro-poor goods with the arrival of CCTs.
This proposition is tested by mapping the candidate’s type to his level of education, under the assumption that less educated politicians are likely to be relatively less efficient at public good distribution (and more efficient at vote buying, i.e., of a higher type).[60] The education levels of the competitive challengers are used as an outcome (competitive challengers are the ones that win, or become the runner-up in a close election). The effect of CCT on this variable corresponds to the cross-partial derivative of \(I\) with respect to \(y\) and the politician’s type. Additional tests for the sign of the incumbent’s vote loss and budget shift to pro-poor goods by politician’s quality (i.e., the cross-partial derivatives of \(I\) and \(t\) with respect to \(y\) and the politician’s type) are performed by splitting the data according to the education level of the incumbent. The effect of the CCT program on the incumbent’s vote share and budget allocation is then observed for the different subsamples.[61]

[60] This assumption is largely based on the following fact regarding candidates for mayor in 2000-12 in Brazil: candidates from parties with a high clientelism score are on average less educated than candidates from parties with a low clientelism score (e.g. 12% less likely to have a college degree).
[61] Low education refers to politicians having less than a college degree, and high education refers to politicians having at least some post-secondary education (more than high school).

Although Propositions 4 and 5 do not provide new insights on political outcomes, the alignment of the empirical results with the predictions provides support for the mechanism under examination (they predict how the main results should vary with both the size of the poor population, and the political preferences of voters). Proposition 4 is tested, for both budget allocation and the incumbent’s vote share, by splitting the data into municipalities according to the density of political preference around the neutral level,[62] and by examining the coefficients for each subsample. In a similar fashion, Proposition 5 is tested, for both outcomes, by splitting the data into municipalities according to the share of poor families. In fact, this paper uses the subsample of municipalities with a high share (at least 25%) of poor families as the paper’s main specification.

[62] Although local Brazilian politics are largely candidate-centered, as opposed to party-centered, federal elections are more ideological, and political platforms of congressional candidates are more aligned with the coalitions at the federal level. Based on this, I identify a group of “swing parties” in Brazil (PMDB, PP and PTB). These are the large parties that between 1994 and 2012 always participated in the national government coalition, independent of the party holding power. Between 1994 and 2002, they composed the center-right coalition with PSDB and PFL. Between 2002 and 2012, they composed the center-left coalition led by PT. These parties had nearly 40% of the votes in the congressional elections of 1998 and 2002. Their share of votes in these congressional elections is used to proxy the density of the political preferences of municipalities, pre-treatment. Municipalities with an above-median share are identified as “high density”. Unfortunately, I do not observe the vote shares by level of income in these municipalities. The model prediction is based on the level of political preference of the poor group. Therefore, in the empirical estimation, I implicitly assume that this split, based on the political preferences of the entire municipality, is a good proxy for the poor share of the population.

Proposition 1 states that the arrival of the CCT program reduces vote buying. Although the existence of vote buying cannot be directly tested within the scope of this study, its inclusion here is key to reconcile the mechanism that connects the other results. One alternative test can be performed using the DALP survey, which classifies Brazilian parties according to the level of clientelism. With \(\partial v/\partial y < 0\), the model predicts that the loss in vote shares should be larger in municipalities governed by high-clientelism parties. The construction of all variables referenced above is fully described in Section 3.3.

3.5 Results and Interpretation

Before proceeding with the discussion of the results, note that Table 3.1 shows that the municipalities on the two sides of the treatment frontier are comparable. The coefficients are shown for the preferred frontier segment. The coefficients are not statistically significant at either the optimal bandwidth (column 3) or the alternative bandwidths (columns 1 and 2).

3.5.1 First Stage: Regression Discontinuity Results

Table 3.2 shows the first stage results for the health funding and CCT coverage, for the preferred frontier segment. The coefficients estimated along the entire frontier can be seen in Figure 3.6. Column (1) is the preferred specification. It shows that a municipality at the discontinuity would have annual basic health transfers of R$1.8bn (pre-treatment), with the treatment triggering an increase of ~25% (R$0.44bn). CCT coverage is 7.9pp (percentage points) higher at that segment.
The additional amount of resources received by voters in a treated municipality is then R$1.2bn, which is nearly 3x higher than the differential in health funding generated by the discontinuity.

The remainder of the table shows the robustness of these results to the choice of bandwidth, kernel, and the use of controls and state effects. There is no significant variation in the magnitude of the coefficients under most specifications. Finally, as the bandwidth increases, the magnitude of the CCT coverage coefficient becomes weaker. This means that the potential bias coming from widening the bandwidth, although seemingly small, would work in favor of the results.

Heterogeneity of the ATE

Figure 3.5 shows the heterogeneity of the coefficient estimated for CCT coverage along the treatment frontier. The strength of the effect varies significantly across municipal characteristics. The first plot shows the CCT coverage coefficients for each of the 19 bins in the frontier. The instrument is only strong at the filled blue circles (2 points on the left and 5 points on the right). Wealthier and smaller municipalities (left side of the figure) have a coefficient that is barely statistically significant, and weak in magnitude. The instrument is also weak for that frontier segment. For very low-HDI municipalities (extreme right side), the coefficient is weak in magnitude and statistically insignificant, which renders the instrument ineffective.

The second and third plots show that, in line with expectations, CCT coverage only responds to the FHP discontinuity in locations with the following characteristics: a relatively weak infrastructure of public services, and a relatively low coverage of old CCT programs (see the discussion in Section 3.2). Accordingly, the second plot shows the pre-treatment average coverage of health programs for the same points as the first plot.
The strength of the CCT coefficient (first plot) coincides with low coverage, either by family health teams or health agents.

The intuition is that, in areas well covered by health services before treatment, the FHP extra funding would not have a significant impact on how well information about CCT was disseminated in those municipalities (i.e., health teams already had enough funds to cover most of the poor population with frequent visits). This is supported by the coefficients estimated for FHP visits along the treatment frontier, shown in Figure B.3 and Figure B.4 in the appendix. They show that both the total number of visits and the rate of visits per family are only significantly higher for treated municipalities on the right side of the plot.

The third plot shows the quality of infrastructure before treatment (per capita budget and per capita number of public employees), and the coverage of old CCT programs. Again, the instrument is weak where the infrastructure for public services is better. This indicates that the funding discontinuity had a limited role in the information channel for CCT enrollment in those areas. The low-HDI areas (extreme right) show a much higher coverage from older CCT programs, indicating that information about BF would likely have been well disseminated in these locations prior to 2004 (beneficiaries of old CCT programs only had to migrate to BF).

3.5.2 Second Stage: Political Outcomes

Table 3.3 shows the main results for the political outcomes, for the paper’s main specification. Column (1) shows the pre-treatment value; column (2) shows the reduced form coefficients from the regression discontinuity; column (3) shows the results of an OLS regression; and column (4) shows the results of the IV regression. Table 3.4 shows the robustness of the MRDD results to kernel and bandwidth choice, and to the inclusion of municipal controls. Table 3.4 also contains a placebo test, with the coefficients estimated for the past values of the variables.63 All results are estimated for the preferred frontier segment. The results for the entire treatment frontier are shown in Figure 3.6 and Figure 3.7. Table 3.6 shows the coefficients for subsamples of the data, according to the following categories: education of the incumbent, party of the incumbent, density of political preference, and the number of poor families in the municipality. All results are for a sample of municipalities with at least 25% of poor households. An exception is column (7) in Table 3.6, where the results are shown for a sample of municipalities with a small share of poor households (less than 40%). For reference, Table B.2 in the appendix presents the estimation results for electoral turnout, as well as all the remaining budget variables not included here.

Election Outcomes: Proposition 3. The main result of interest here is the loss of support by the incumbent, which is a direct model prediction. From Table 3.3, the pre-treatment average vote share for incumbents is 51% in the elections of 2008 and 2012. In line with the theory, columns (2) and (4) show that the overall effect of CCT on the incumbent’s vote share is negative. From the IV regression, for a 10pp increase in CCT coverage, there is an 8.2pp vote loss for the incumbent. This result is robust to the choice of kernel, bandwidth, and the exclusion of municipal controls (Table 3.4).
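The 10pp figures quoted in this section can be reproduced by rescaling the IV coefficients in column (4) of Table 3.3. The sketch below assumes that those coefficients are expressed per 1pp of CCT coverage and that the effect is linear over a 10pp change; the dictionary and function names are illustrative and do not come from the thesis. Only outcomes measured in levels are included, since log outcomes would need a separate conversion.

```python
# Per-1pp IV coefficients copied from column (4) of Table 3.3.
# Assumption: each coefficient is the change in the outcome for a
# 1 percentage-point increase in CCT coverage.
IV_PER_PP = {
    "incumbent_vote_share_pp": -0.82,
    "number_of_candidates": 0.04,
    "margin_of_victory_pp": -0.63,
}


def scale_effect(coef_per_pp, delta_pp=10.0):
    """Linearly rescale a per-pp coefficient to a change of delta_pp."""
    return coef_per_pp * delta_pp


effects_10pp = {name: round(scale_effect(c), 2) for name, c in IV_PER_PP.items()}

for name, effect in effects_10pp.items():
    print(f"{name}: {effect:+.1f} for a 10pp increase in CCT coverage")
```

The rescaled values correspond to the 8.2pp vote loss, the 0.4 additional candidates, and the 6.3pp lower margin of victory discussed in the text.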
Column (8) in Table 3.4 shows that this effect was not present in the elections of 2000 and 2004, when in fact the coefficient was statistically insignificant and positive, at 1.1pp.

The loss in vote shares by the incumbent is supported by the effects found for other election outcomes that are not direct predictions of the model.64 These variables measure different dimensions of electoral competition (margin of victory and number of candidates in elections), and support for the incumbent (campaign donations). From the IV results in column (4) of Table 3.3, for a 10pp increase in CCT coverage, there are 0.4 more candidates running for mayor,65 a 6.3pp lower margin of victory, and 50% less private campaign donations. These results are also robust to the choice of kernel, bandwidth, and the exclusion of controls (Table 3.4). As before, no significant effect on those variables is observed in past elections. Finally, voter turnout does not change significantly at the discontinuity (Table B.2).

63 Column (8) with the past coefficients includes pooled data from the elections in 2000 and 2004, or the 1997-2000 and 2001-2004 mayoral tenures when available, unless otherwise indicated.

64 Although the results for the loss of vote shares by incumbents are significant, the results using the probability of victory for incumbents are not statistically significant. In other words, incumbents lose votes, but most of them are still reelected. This is mostly because elections were not very competitive before treatment (the average margin of victory was very high at ~17pp). If we look at the results by sub-sample, we see that in the case of less educated incumbents, there is also a significant reduction in the probability of victory (in addition to the loss in vote shares, which is also much higher for this sub-sample).

65 The fall in vote shares of the incumbent is not a mechanical consequence of the increase in the number of candidates. Even when using only 2-candidate races, the effect on vote shares is still statistically significant.

Election Outcomes: Proposition 6. I also test the prediction that less educated candidates would lose more support after the CCT arrival. From the IV results in column (4) of Table 3.3, for a 10pp increase in municipal CCT coverage, there is a 24pp higher share of competitive challengers that have at least a high school education. This result is robust to the choice of kernel, bandwidth, and the exclusion of municipal controls, and it is also absent from the pre-treatment period. This prediction can also be tested by splitting incumbents into more and less educated, as shown in Table 3.6, in columns (1) and (2).66 As expected, the magnitude of the negative coefficient for the incumbent’s vote share, for the group of less educated incumbents, is almost double that of the group of more educated incumbents.67

66 High education is defined here as having more than 12 years of formal schooling, i.e., some post-secondary education; and low education is defined here as up to and including high school.

67 Note that the confidence intervals for the two coefficients overlap. This is the case with most of the results in this Table: since the subsamples have a much lower number of observations, the estimation has high variance. Nevertheless, it is important to notice that most deviations in magnitudes respect the theoretical propositions, and the coefficients in the sub-groups where stronger results are expected remain significant in most cases.

Election Outcomes: Propositions 4 and 5. The tests of the predictions from Propositions 4 and 5, for the incumbent’s vote share, are shown in Table 3.6. First, the sample is split into municipalities with low and high density of political preference (see Section 3.4.2). The results are shown in columns (3) and (4).
In line with the theory, in areas of less dense political preferences, the coefficient for the incumbent’s vote loss becomes insignificant. It has roughly half the magnitude of the coefficient estimated for the high-density group. The results for Proposition 5 are shown in columns (7) and (8), and are also in keeping with the theory. The vote loss for the incumbent is higher in municipalities with a larger poor population.

Election Outcomes: Proposition 1. Although the existence of clientelism cannot be directly tested in the scope of this study, one alternative test can be performed using the DALP survey. The sample is split into groups of parties that have a high or low clientelism score, as defined by the survey. Columns (5) and (6) show that the incumbent’s vote loss is higher when they belong to a high-clientelism party, indicating that the mechanism currently in place is, in fact, affecting vote buying practices. In addition, Table 3.4 shows that, for a 10pp increase in municipal CCT coverage, there is a 20pp lower share of competitive challengers for high-clientelism parties. This result is robust to the exclusion of municipal controls and to narrower bandwidths, and it is not present in the pre-treatment elections, but it loses significance under under-smoothing kernels. For all variables, under narrower bandwidths, the magnitude of the results increases. As in the case of CCT coverage, increases in bandwidth represent a conservative approach and offer no reason for concern with respect to bias.

Election Outcomes: Discussion. The overall direction shown by the results lines up well with the theoretical predictions. As for the magnitude, the loss in the incumbent’s vote share is more than one-to-one in relation to the additional number of households receiving the benefit. This suggests some form of propagation of the voting effects of the CCT.
If there are positive economic spillovers from higher CCT coverage, the electoral effects are expected to be higher than the ones restricted to the households receiving the benefits. The same goes for the increase in the number of candidates, given that even wealthier households will face a larger number of choices. As far as campaign donations go, it is unlikely that the reduction comes from donations from poor voters. Donations probably follow the reduction in the allocation to the pro-wealthy public good (see below), given that this group provides the bulk of financial support to candidates. Accordingly, lower campaign spending could also feed into further vote loss. All in all, this study’s identification strategy supports the hypothesis that the loss of votes by the incumbent goes beyond the affected poor families, but it cannot identify which of the propagating effects is more relevant.

Budget Outcomes: Proposition 2. The main result of interest here is the budget allocation in health care and education. From Table 3.3, the pre-treatment average budget share of these services together is 52%. In line with the theory, columns (2) and (4) show an increase in the budget allocation to these areas. From the IV regression, for a 10pp increase in CCT coverage, there are increases of 2.3pp and 2.7pp in the shares allocated to education and health care, respectively. The health results are robust to the choice of kernel, bandwidth, and the exclusion of municipal controls. For education, the coefficients are less robust, losing significance both in the IV regression and without the municipal controls (Table 3.4). Although the research design predicts a small automatic increase in health spending as a direct result of the FHP funding gap, this increase would be at most 0.2pp. This is 5 times smaller than the lower limit of the confidence interval (1.1pp), and 10 times smaller than the point estimate.

As a placebo test, the positive effect for both variables was not present in the electoral tenures prior to 2004, when the aggregate result was negative and insignificant (see column (8) of Table 3.4).68 For all other categories of budget allocation by function, the results were insignificant (Table B.2). The corresponding reduction in budget shares was likely spread across several functions. Two categories shown in Table B.2 warrant a comment. First, the effect in urbanization spending is positive at 2.1pp in 1997-04, despite being low and insignificant in 2005-2012. This indicates that urbanization spending could have provided the main source for funds reallocated to health and education. Second, although transportation spending69 (2.7% of the total spending) was negative and significant, this result has to be treated with skepticism. The effect is not robust to the IV estimation, and it also pre-dates the treatment, given that it was present between 1997 and 2004.

The spending by type provides a better picture of the mechanism in place. From Table 3.3, columns (2) and (4) show a decrease in the budget allocation to capital investment. From the IV regression, for a 10pp increase in CCT coverage, there is a 1.5pp decrease in the share allocated to this category. Between 1997 and 2004, this effect was positive and insignificant. The result is robust to kernel choice, but it loses power with the exclusion of the controls, or at narrower bandwidths. This suggests that incumbents reallocate resources to redistributive spending from infra-structure investment.
In line with this dynamic, personnel expenses were higher and robust at the discontinuity, most likely due to the shift from capital investment to labor-intensive spending in education and health.70 While the potential implications of this budget shift for the profile of public employees are outside the focus of this study, they are briefly discussed in the appendix (Section B.1).

68 These effects were also insignificant when estimated for these two variables using only data from 2004.

69 Transportation spending in small municipalities is mostly infra-structure spending in road transport, i.e., construction of roads and bridges. Most small municipalities have little spending on a public system of transportation.

70 Although treated locations have a slightly lower annual budget, and a slightly higher number of employees in 2008-12, these effects are not statistically significant (Tables B.2 and B.7). Nevertheless, their combination allows for treated locations having a higher share of budget spent on personnel, but no significant difference in average wages.

Budget Outcomes: Propositions 4, 5 and 6. The tests of the remaining theoretical predictions for the budget allocation variables are shown in Table 3.6. Columns (1) and (2) show that a less educated incumbent will also be less effective in accomplishing the budget shift, in keeping with the theoretical prediction from Proposition 6. Columns (3) and (4) show that the budget shift is stronger in municipalities with high density of political preference. Columns (7) and (8) show that the results are all weaker and statistically insignificant for a low-poverty sample.

Finally, columns (5) and (6) show the budget effects by clientelism scores. The expectation here is less straightforward than the negative effects predicted for the incumbent’s support.
Following the result found for less educated incumbents, the budget shift should be smaller in high-clientelism parties if the politicians in these parties possess a personal competitive advantage in vote buying. However, the budget shift should be larger in high-clientelism parties if the local strength of such a party is mainly supported by a specific municipal characteristic (e.g., a high number of voters with low political preference), following the results from Proposition 4. Accordingly, the only significant result here was a larger increase in health spending for the high-clientelism group.

3.5.3 Exclusion Restriction

The direct effects of the program’s funding cannot be tested separately from the effects happening through CCT, given that the discontinuity was introduced at the early stages of the BF program (Figure 3.1). Instead, there are four pieces of evidence indicating that the political outcomes are predominantly generated by the CCT variation, and not by the FHP funding.

First, the funding discontinuity is small and does not significantly change the total budget (Table B.2). Even if these funds were used for purposes other than health, it is unlikely that this funding was relevant. Second, a channel through which the FHP could affect politics is that of health outcomes. However, Table B.6 shows that the funding differential did not have a significant effect on the primary health outcomes measured by the basic health system (e.g., rate of vaccinated children, infant mortality, and rate of pre-natal exams).

Third, the heterogeneity of the average treatment effects of the FHP funding on CCT coverage along the treatment frontier provides strong evidence in favor of the exclusion restriction. The coefficients for political outcomes show a weak response in the same areas of the frontier where the instrument is weak (Figures 3.6 and 3.7).
To further investigate this, I selected two equal-size ranges of the frontier (6 bins): the first was the preferred segment for this study, where the instrument is statistically strong, and the second was the segment with population from 7,500 to 17,500, and HDI of 0.7 (weak instrument).71 The coefficients for these segments are shown in Table B.5. In the weak-IV segment (Segment A), all variables that were significant in segment B become insignificant, with the exception of health spending. This variable was still significant but with a lower power and magnitude.

71 I selected these bins (0.7 HDI and low population) over the 6 bins on the other extreme of the frontier (low HDI and 30,000 population) for two reasons. First, the low population side provides a much larger sample. Second, the low HDI side is not balanced with respect to the number of poor families (i.e., treated municipalities tend to have more poor families), whereas the other two selected areas are balanced in all variables from Table 1.

Fourth, I attempt to identify a change in the direct effect of the FHP in politics after the arrival of the BF program using a different strategy. I run a regression using the political outcomes ($y_{mt}$) as dependent variables, explained by a dummy that reflects the presence of the FHP in municipality $m$ at time $t$ ($FH_{mt}$). This regression suffers from omitted variable bias. Political outcomes could be affected by the same factors influencing the decision to implement the FHP. Nevertheless, if the omitted variable bias is constant over time, then the interaction between a time dummy for post-2004 ($\delta_{2008-12}$) and the FHP presence ($FH_{mt}$) should not be significant unless there is another variable changing the program’s impact on politics.72 This effect is measured by $\beta_1$ in the equation below.

$$y_{mt} = \beta_0 + \beta_1 \, FH_{mt} \ast \delta_{2008-12} + \beta_2 \, FH_{mt} + \delta_{2008-12} + \delta_{2004} + \mu_{micro} + X_m + \varepsilon_{mt} \qquad (3.6)$$

I also include municipality-specific controls (see Table B.4 for details), denoted by $X_m$, and dummies for micro regions,73 denoted by $\mu_{micro}$. Columns (1-2) in Table B.4 show the coefficient $\beta_2$, i.e., the direct effect of the FHP presence on the political outcomes. This coefficient cannot be trusted for inference, due to the omitted variable bias. Columns (3-4) show the value of $\beta_1$. After 2004, there is a statistically significant change in the program effect when it comes to the incumbent’s vote share, the education of the challenger, health spending, capital investment, and personnel expenses. This change is always in the same direction (positive or negative) as the one pointed out by the main results of the MRDD in this study.

I cannot affirm that the estimates in this paper are an upper or lower bound for the true effect of CCT on political outcomes without making assumptions about the sign of a potential direct impact of the FHP. This evidence, however, supports the hypothesis that another variable is driving these political outcomes, and confirms the findings with respect to the direction of these effects.

Finally, potential direct impacts are not necessarily intuitive. The FHP is the best rated health program in Brazil (IPEA, 2011). It is unlikely that it triggers a massive negative effect on the support for the mayors that implement it.
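The logic of the interaction test in equation (3.6) can be illustrated with a stripped-down sketch. It assumes only two periods and no controls, micro-region dummies, or 2004 dummy, in which case the OLS coefficient on the interaction between FHP presence and the post-2004 dummy equals a difference of cell means. The data are made up for illustration and all names are hypothetical; this is not the specification estimated in Table B.4.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def interaction_coefficient(rows):
    """Difference-in-differences estimate of the FH x post interaction.

    rows: iterable of (y, fh, post) with fh and post coded 0/1.
    In a regression of y on fh, post and fh*post (no other controls),
    the OLS coefficient on fh*post equals this difference of cell means.
    """
    cells = {(0, 0): [], (0, 1): [], (1, 0): [], (1, 1): []}
    for y, fh, post in rows:
        cells[(fh, post)].append(y)
    gap_post = mean(cells[(1, 1)]) - mean(cells[(0, 1)])
    gap_pre = mean(cells[(1, 0)]) - mean(cells[(0, 0)])
    return gap_post - gap_pre


# Made-up illustrative data: (outcome, FH presence, post-2004 dummy).
# FH municipalities differ from the rest by a constant 5 points (the
# time-invariant bias); after 2004 their outcome shifts by a further -8.
toy_rows = [
    (50.0, 0, 0), (52.0, 0, 0),
    (55.0, 1, 0), (57.0, 1, 0),
    (49.0, 0, 1), (51.0, 0, 1),
    (46.0, 1, 1), (48.0, 1, 1),
]

interaction_estimate = interaction_coefficient(toy_rows)
print(f"Estimated interaction coefficient: {interaction_estimate:.1f}")
```

In this toy setting the constant 5-point gap between the two groups drops out of the estimate, which is the sense in which a time-invariant omitted variable bias does not contaminate the interaction term.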
As for budget allocation, even if the entire FHP extra funding was actually spent on health services,74 the predicted mechanical increase in the budget share of health services is only 0.2pp, nearly 10 times lower than the point estimate found in the paper.

72 I also include a dummy for 2004, so the effect measured by the interaction coefficient is compared to the effect in 2000. Although the 2004 electoral cycle started after the BF program, there was insufficient time for significant changes in the political outcomes to appear.

73 Micro regions are not administrative regions, but they are used by IBGE for statistical purposes. They are comprised of contiguous municipalities with similar social and economic characteristics. There are 558 micro-regions in Brazil, each one with an average of 10 municipalities.

3.5.4 Alternative Policy Discontinuities

The main potential source of omitted variable bias would be a policy with discontinuous implementation around the same thresholds of population and/or HDI used in this study. The Fundo de Participacao dos Municipios (FPM) is the most important source of federal transfers, and the main source of revenues for small municipalities. The FPM is distributed in a discontinuous form across several population thresholds, where larger municipalities receive a higher amount.75 Although one of the FPM thresholds, at 30,564, is close to the 30,000 threshold used in this design, there is strong evidence that the FPM is not the cause of the political effects observed in this study.

First, the methodology of fund allocation in this design differs from the FPM methodology. The population threshold that determines the eligibility for higher FHP funding was fixed in 2003, while the thresholds change every year for the FPM. This difference creates confounding effects in cases where municipalities crossed the FHP threshold at any time between 2003 and 2012, especially in the estimation under the edge kernel.
Second, the absence of a significant effect on the total budget is evidence that the FPM is not generating a funding gap at this threshold (Table B.2). This is not surprising, given that the theoretical FPM differential is much higher at lower population thresholds.76

Third, the FPM dates back to the 1980s, and the population thresholds have remained the same since 2000. Thus, any direct effects of the FPM on political outcomes would have been observed in past elections (2000 and 2004), which is not the case (Table 3.4). Fourth, Table B.3 shows the reduced form coefficients for political outcomes in 2008 and 2012, setting the population cut-off at different FPM thresholds. I use populations of 23,772 and 37,356 (one threshold lower and one higher than 30,564). None of the variables had a significant result in the same direction as the results revealed by this study.

74 The FHP is jointly financed by central, state and municipal governments. Thus, although the federal funds have to be spent in the program, municipalities would have been allowed to reduce their own contribution to the program and spend in other budget areas as they saw fit.

75 This variation was recently explored in the political economy literature (Brollo et al., 2013).

76 The difference in funding at the first 7 population thresholds for the FPM is: 33% for 10,188; 25% for 13,584; 20% for 16,980; 17% for 23,772; 14% for 30,564; 13% for 37,356; and 11% for 44,148.

3.5.5 Alternative Explanations for the Mechanism

In this Section I briefly discuss two potential alternative explanations for the mechanism driving the empirical results. The main novelty in the framework presented in these Chapters is the inclusion of clientelistic exchanges in the canonical probabilistic voting model, as the cause of incumbency advantage.
Although clientelism cannot be directly observed, there are two main reasons for this inclusion. First, the traditional model would not be able to explain why incumbents lose votes with changes in the utility of voters (this is shown in the Appendix). Second, clientelism as the channel for the political effects of CCT is supported both by the context and by the results in this study (e.g., the fall in vote share of high-clientelism parties, or the results of the secondary propositions). It is possible that these (or other) alternative mechanisms co-exist with the one being tested in this thesis. These alternative stories, however, are neither tested empirically in this study, nor developed theoretically. Here I simply provide a few arguments on why I believe the proposed model is the best fit for the results in the data.

Change in Preferences. There is a class of models (e.g., median-voter theorem) that treats voters’ preferences as single-peaked over a certain policy dimension. In this context, the permanent income boost generated by the CCT program could shift their preferences, which would lead them to vote away from the politician that represented their old preference set (e.g., the incumbent, the less educated), and vote for the challenger. The main difficulty would be to reconcile this with the fact that voters are receiving more of the goods targeted to the poor at a time when they are becoming less poor, and their preferences should be moving away from public services of education and health care. This model would also fail to explain why the only party effects observed are based on the clientelism score (there are no significant effects for party by coalition or ideology).

Increase in Demand for Public Services.
The CCT benefit comes attached to the condition that beneficiaries attend school and have regular health check-ups. This could increase the demand for these services, which would lead politicians to provide more of them, and voters to replace politicians that provide little with the ones that could be "better" at delivering public goods (thereby voting the incumbents out). This story, however, fails to account for the fact that the poor population in Brazil already lacked proper access to schooling and health care before BF, and the demand for these services was likely already there before the program came. If providing these public goods was an effective electoral strategy, incumbents would have done it before the program arrived. Moreover, mayors had no obligation attached to the CCT program to increase spending in these areas.

3.6 Figures and Tables

Figure 3.1: Timeline of Events
The FHP started in 1995. The Bolsa Família program started in Oct 2003 and the FHP funding discontinuity in Aug 2004. All elections happened in early October.

Figure 3.2: Map of Municipalities in the Sample
Municipalities in white are either outside of the bandwidth used for the RDD, located in the legal Amazon (upper-left side) or have a share of poor population below 25% (most of the south). The map shows a total of 1,577 colored municipalities.

Figure 3.3: CCT Coverage vs. Number of Poor Families
The y-axis shows in percentage points how much the CCT coverage was below the target. ** I adjust 2004 for 974,000 households receiving Bolsa Família and Bolsa Escola, as per the MDS. Even so, this coverage is still likely overstated (there is no adjustment for duplication across old CCT benefits). CCT coverage includes the Bolsa Família, Bolsa Escola, Bolsa Alimentação and Cartão Alimentação.

Figure 3.4: Potential Sample and Treatment Frontier
The potential sample includes all municipalities within the central 95% percentile in population and HDI. The treatment frontier is the red line.
Light blue dots represent municipalities eligible for treatment.

Figure 3.5: Heterogeneity of the CATE for CCT Coverage
Panels: Conditional ATEs for CCT Coverage; Average Coverage of Health Programs; Average Quality of Public Infra-Structure and Coverage of Old CCTs.
The y-axis shows in the first plot the conditional ATEs for CCT Coverage. The y-axis on the other plots shows the pre-treatment average of the variables (in common scale). For all charts, the left side has HDI fixed at 0.7 and population in 7,500-30,000. The right side has population fixed at 30,000 and HDI in 0.5875-0.7. The size of the dots represents the number of observations in each one of the 19 bins. I repeat the bin located at the origin on both sides.

Figure 3.6: Conditional ATE for Political Outcomes
Panels: (a) Basic Health Funds; (b) CCT Coverage; (c) Incumbent’s Vote Share; (d) Number of Candidates; (e) Campaign Donations; (f) Margin of Victory.
The y-axis shows the conditional ATEs along the treatment frontier. The left side has HDI=0.7 and population in 7,500-30,000. The right side has population=30,000 and HDI in 0.7-0.5875. Regressions use the edge kernel, year and state effects, and controls. Bandwidth is set at 0.9 standard deviations for both scores. The size of the dots represents the number of observations in each bin. I repeat the bin located at the origin on both sides. The bandwidth is the optimal for each variable.

Figure 3.7: Conditional ATE for Political Outcomes (Continued)
Panels: (g) Challenger’s Education; (h) Challenger’s Party; (i) Capital Investment; (j) Personnel Spending; (k) Education Spending; (l) Health Care Spending.
The y-axis shows the conditional ATEs along the treatment frontier. The left side has HDI=0.7 and population in 7,500-30,000. The right side has population=30,000 and HDI in 0.7-0.5875. Regressions use the edge kernel, year and state effects, and controls.
Bandwidth is set at 0.9 standard deviations for both scores. The size of the dots represents the number of observations in each bin. I repeat the bin located at the origin on both sides. The bandwidth is the optimal for each variable.

Table 3.1: Balance of Fixed and Pre-determined Variables

                      PT Mean            Coefficient [90% CI]                                     Opt. band (Pop,HDI)
                      [90% CI]           (1)                (2)                (3)                {Obs. / bin}
Bandwidth             Optimal            Optimal            0.90               0.75

Latitude              -40.59             -0.16              -0.21              -0.28              (1.00,0.97)
(degrees)             [-41.18,-40.02]    [-0.63,0.29]       [-0.71,0.27]       [-0.85,0.25]       {584}
Longitude             -13.16             -0.06              -0.04              -0.04              (0.98,1.00)
(degrees)             [-13.96,-12.29]    [-0.66,0.53]       [-0.71,0.60]       [-0.86,0.76]       {596}
Schooling             9.43               0.65               0.73               0.66               (1.00,0.97)
(% with high school)  [8.97,9.95]        [-0.09,1.46]       [-0.05,1.61]       [-0.24,1.64]       {574}
Income Inequality     21.42              2.31               2.26               2.50               (1.00,1.00)
(top 10% / bot. 40%)  [20.43,22.67]      [-0.06,4.81]       [-0.37,5.01]       [-0.61,5.81]       {610}
Age Profile           39.60              -0.44              -0.36              -0.29              (1.00,1.00)
(share with 20-50)    [39.30,39.92]      [-1.09,0.16]       [-1.09,0.27]       [-1.14,0.43]       {607}
GDP per capita (a)    2.83               0.07               0.05               0.02               (0.99,0.99)
(R$ ’000)             [2.58,3.22]        [-0.15,0.30]       [-0.20,0.31]       [-0.27,0.35]       {603}
Area (a)              0.82               0.26               0.28               0.31               (0.96,0.99)
(’000 km2)            [0.67,0.99]        [-0.12,0.61]       [-0.14,0.66]       [-0.17,0.77]       {574}
Urban pop.            63.01              1.13               1.06               -0.30              (1.00,1.00)
(% share)             [60.43,65.70]      [-3.56,5.80]       [-4.16,5.99]       [-6.37,5.24]       {609}
Gender                49.90              0.02               0.03               0.09               (1.00,1.00)
(% share of male)     [49.71,50.09]      [-0.33,0.51]       [-0.37,0.61]       [-0.43,0.80]       {607}
FHP teams             58.56              -0.19              0.02               -0.58              (1.00,1.00)
(% coverage)          [52.75,64.02]      [-11.65,11.01]     [-12.32,12.58]     [-14.68,14.67]     {611}
Poverty               39.87              0.75               0.70               0.78               (1.00,1.00)
(% share)             [38.31,41.23]      [-2.06,2.99]       [-2.40,3.20]       [-2.86,3.87]       {609}
CCT Coverage          -14.34             1.20               1.66               3.16               (0.99,1.00)
(% over target)       [-19.11,-8.76]     [-6.97,9.75]       [-6.92,11.35]      [-6.55,14.65]      {605}
Old CCT benefits      18.57              -0.85              -1.03              -1.35              (1.00,0.99)
(% of pop.)           [17.27,20.02]      [-3.15,1.47]       [-3.53,1.42]       [-4.20,1.38]       {606}
Obs. per bin                                                {476}              {314}

(a) Estimated in log(Variable).
Significant at: 99% ***, 95% **, 90% *. Standard errors are clustered by municipality. The coefficients represent the average effect for the preferred frontier segment (6 bins). Regressions include year and state effects. Pre-treatment means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment.

Table 3.2: Regression Discontinuity Results
Cells report the coefficient with its [90% CI]; for the main specification, the rows below each estimate give the avg. band. (Pop,HDI) and {Obs. per bin}.
Variable (unit) | PT Mean [90% CI] | (1) | (2) | (3) | (4)
Kernel | Edge | Edge | Epanec. | Gaussian | Uniform
Main specification, optimal bandwidth
Basic Health Funds^a (R$ million) | 1.76 [1.64,1.86] | 0.22*** [0.14,0.29] | 0.21*** [0.13,0.29] | 0.20*** [0.10,0.30] | 0.22*** [0.11,0.34]
  avg. band. (Pop,HDI) | | (0.71,0.99) | (0.64,0.97) | (0.53,0.94) | (0.51,0.90)
  {Obs. per bin} | | {329} | {269} | {180} | {166}
CCT Coverage (% over target) | 0.52 [-2.64,3.07] | 7.90*** [3.84,11.60] | 7.65*** [3.60,11.27] | 7.29*** [3.32,10.91] | 6.92*** [2.82,10.36]
  avg. band. (Pop,HDI) | | (1.00,1.00) | (1.00,0.98) | (1.00,0.94) | (1.00,0.92)
  {Obs. per bin} | | {612} | {590} | {544} | {530}
No municipal controls or state effects, optimal bandwidth
Basic Health Funds^a (R$ million) | | 0.25*** [0.15,0.34] | 0.24*** [0.14,0.34] | 0.25*** [0.14,0.36] | 0.27*** [0.16,0.39]
CCT Coverage (% over target) | | 5.92** [0.98,10.31] | 5.81** [1.30,10.07] | 5.36** [1.03,9.43] | 5.23** [0.93,9.26]
Main specification, bandwidth = 0.90
Basic Health Funds^a (R$ million) | | 0.25*** [0.18,0.31] | 0.27*** [0.20,0.33] | 0.29*** [0.22,0.35] | 0.30*** [0.23,0.36]
CCT Coverage (% over target) | | 8.44*** [4.11,12.43] | 8.07*** [3.90,11.96] | 7.53*** [3.41,11.34] | 7.09*** [2.90,10.81]
Main specification, bandwidth = 0.75
Basic Health Funds^a (R$ million) | | 0.21*** [0.13,0.29] | 0.23*** [0.15,0.30] | 0.23*** [0.16,0.31] | 0.24*** [0.16,0.31]
CCT Coverage (% over target) | | 8.93*** [3.91,13.41] | 8.54*** [3.66,12.81] | 8.25*** [3.57,12.34] | 8.07*** [3.44,12.14]
^a Estimated in log(Variable). Significant at: 99% ***, 95% **, 90% *. Standard errors are clustered by municipality. The coefficients represent the ATE for the preferred frontier segment.
The list of included municipal-level controls is described in the text. Pre-treatment means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment.

Table 3.3: Main Results: Political Outcomes
Cells report the coefficient with its [90% CI].
Variable (unit) | (1) PT Mean | (2) RDD | (3) OLS | (4) IV | Avg Band. (Pop,HDI) {Obs. / bin}
ELECTIONS
Incumbent’s Vote Share (%) | 50.90 [47.71,53.71] | -7.73*** [-12.42,-3.60] | 0.03 [-0.08,0.12] | -0.82** [-1.91,-0.28] | (0.99,0.96) {431}
Number of candidates (number) | 2.31 [2.24,2.40] | 0.39*** [0.20,0.63] | 0.00 [0.00,0.00] | 0.04*** [0.02,0.10] | (1.00,0.98) {450}
Margin of Victory (pct points) | 16.53 [13.62,20.59] | -5.96** [-11.69,-1.74] | 0.07 [-0.04,0.17] | -0.63** [-1.70,-0.09] | (1.00,1.00) {467}
Campaign Donations^a (R$’000) | 11.29 [11.03,11.46] | -0.41* [-0.89,-0.06] | 0.00 [-0.01,0.01] | -0.05* [-0.14,0.00] | (1.00,1.00) {448}
Challenger’s Education (% with high school) | 85.56 [74.87,92.72] | 19.27** [4.56,33.57] | -0.04 [-0.42,0.34] | 2.38* [0.36,22.59] | (0.75,0.75) {198}
Challenger’s Party (1=clientelistic) | 51.58 [43.63,60.24] | -19.30* [-33.85,-3.42] | -0.07 [-0.45,0.29] | -1.99* [-5.26,-0.09] | (1.00,1.00) {378}
BUDGET SHARES (by type)
Capital Investment (% share) | 10.56 [9.70,11.73] | -1.73* [-3.34,-0.22] | -0.02 [-0.05,0.00] | -0.15* [-0.35,-0.02] | (1.00,1.00) {352}
Personnel Spending (% share) | 46.58 [44.82,47.97] | 2.85** [0.66,5.20] | 0.01 [-0.03,0.05] | 0.25** [0.07,0.60] | (1.00,1.00) {352}
BUDGET SHARES (by function)
Education (% share) | 29.64 [28.33,30.90] | 1.84** [0.34,3.91] | 0.01 [-0.02,0.03] | 0.23 [-0.02,1.04] | (1.00,1.00) {350}
Health (% share) | 22.22 [21.18,23.38] | 2.17*** [1.05,3.73] | 0.00 [-0.02,0.03] | 0.27** [0.07,1.28] | (1.00,0.96) {329}
^a Estimated in log(Variable). Significant at: 99% ***, 95% **, 90% *. Standard errors are clustered by municipality. The coefficients represent the ATE for the preferred frontier segment, under the edge kernel. All regressions include year and state effects. The list of included municipal-level controls is described in the text.
Pre-treatment means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment.

Table 3.4: Robustness to Kernel Choice: Political Outcomes
Cells report the coefficient with its [90% CI].
Variable (unit) | (1) | (2) | (3) | (4)
Kernel | Edge | Epanech. | Normal | Uniform
ELECTIONS
Incumbent’s Vote Share (%) | -7.73*** [-12.42,-3.60] | -7.39*** [-11.83,-3.50] | -7.08*** [-11.17,-3.19] | -6.68*** [-10.71,-2.82]
Number of candidates (number) | 0.39*** [0.20,0.63] | 0.37*** [0.18,0.59] | 0.34*** [0.15,0.54] | 0.32*** [0.13,0.52]
Margin of Victory (pct points) | -5.96** [-11.69,-1.74] | -5.40** [-10.41,-1.37] | -4.79** [-9.34,-1.03] | -4.37** [-8.64,-0.73]
Campaign Donations^a (R$’000) | -0.41* [-0.89,-0.06] | -0.38* [-0.85,-0.02] | -0.46* [-0.93,-0.06] | -0.46* [-0.94,-0.05]
Challenger’s Education (% with high school) | 19.27** [4.56,33.57] | 17.51** [3.68,31.20] | 13.96* [1.11,25.75] | 12.68* [0.14,23.88]
Challenger’s Party (1=clientelistic) | -19.30* [-33.85,-3.42] | -16.61* [-31.17,-0.57] | -13.80 [-28.24,1.53] | -10.65 [-25.70,4.26]
BUDGET SHARES (by type)
Capital Investment (% share) | -1.73* [-3.34,-0.22] | -1.65* [-3.15,-0.25] | -1.68** [-3.10,-0.42] | -1.62** [-2.98,-0.41]
Personnel Spending (% share) | 2.85** [0.66,5.20] | 2.62** [0.53,4.85] | 2.56** [0.57,4.74] | 2.43** [0.50,4.64]
BUDGET SHARES (by function)
Education (% share) | 1.84** [0.34,3.91] | 1.64* [0.19,3.59] | 1.54* [0.12,3.37] | 1.40 [-0.08,3.14]
Health (% share) | 2.17*** [1.05,3.73] | 2.09*** [1.03,3.56] | 2.03*** [0.97,3.43] | 2.00*** [0.98,3.30]
^a Estimated in log(Variable). Significant at: 99% ***, 95% **, 90% *. Standard errors are clustered by municipality. The coefficients represent the ATE for the preferred frontier segment, under the optimal bandwidth. All regressions include year and state effects. The list of included municipal-level controls is described in the text.

Table 3.5: Other Robustness Tests: Political Outcomes
Cells report the coefficient with its [90% CI].
Variable (unit) | (1) | (2) | (3) | (4)
Bandwidth | Optimal | 0.90 | 0.75 | Optimal
Controls | No | Yes | Yes | Yes
Period | 2008-12 | 2008-12 | 2008-12 | 2000-04
ELECTIONS
Incumbent’s Vote Share (%) | -7.60*** [-11.95,-3.03] | -7.89*** [-12.95,-3.40] | -7.99** [-14.03,-2.25] | 1.08 [-2.96,5.71]
Number of candidates (number) | 0.35*** [0.14,0.59] | 0.42*** [0.22,0.68] | 0.46*** [0.23,0.77] | 0.12 [-0.06,0.32]
Margin of Victory (pct points) | -5.69* [-11.09,-0.91] | -6.50** [-12.94,-1.77] | -6.94** [-14.85,-1.28] | -0.73 [-4.17,3.28]
Campaign Donations^a (R$’000) | -0.40* [-0.89,-0.04] | -0.47** [-0.97,-0.08] | -0.55** [-1.12,-0.13] | -0.58 [-1.47,0.11]
Challenger’s Education (% with high school) | 17.65** [4.33,30.17] | 13.68* [1.79,25.01] | 19.31** [4.60,33.60] | -0.65 [-21.37,17.04]
Challenger’s Party (1=clientelistic) | -20.19* [-35.42,-3.49] | -23.63** [-39.12,-6.49] | -29.28** [-47.60,-10.10] | -8.38 [-25.10,7.18]
BUDGET SHARES (by type)
Capital Investment (% share) | -1.43 [-3.01,0.02] | -1.62 [-3.45,0.13] | -1.67 [-4.05,0.70] | 0.77 [-1.18,3.32]
Personnel Spending (% share) | 2.73** [0.60,5.19] | 2.88* [0.34,5.54] | 3.08 [-0.32,6.66] | 0.20 [-1.65,2.21]
BUDGET SHARES (by function)
Education (% share) | 0.36 [-1.64,2.56] | 1.95* [0.26,4.38] | 2.25* [0.07,5.09] | -1.20 [-2.56,0.09]
Health (% share) | 2.61*** [1.21,4.28] | 2.29*** [1.03,4.10] | 2.44** [0.74,4.77] | 0.15 [-1.76,1.81]
^a Estimated in log(Variable). Significant at: 99% ***, 95% **, 90% *. Standard errors are clustered by municipality. The coefficients represent the ATE for the preferred frontier segment, under the edge kernel. All regressions include year and state effects. The list of included municipal-level controls is described in the text.

Table 3.6: Main Results by Subsamples
Cells report the coefficient with its [90% CI] and {Obs. per bin}.
Dependent Variable | Incumbent’s Vote Share | Capital Investment | Education Spending | Health Care Spending
by the Incumbent’s Education
High | -6.05** [-12.26,-1.39] {263} | -2.69** [-5.60,-0.60] {150} | 3.47*** [1.72,6.53] {155} | 2.27** [0.68,4.90] {147}
Low | -11.13** [-20.04,-3.21] {205} | 0.08 [-3.35,3.14] {129} | -1.08 [-3.83,1.63] {122} | 1.49 [-0.60,3.66] {114}
by the Density of Political Preferences Around the Neutral Level
High | -10.79*** [-21.70,-4.82] {215} | -3.63*** [-6.11,-1.97] {176} | 4.69*** [2.10,8.21] {174} | 2.52** [0.57,5.25] {164}
Low | -4.43 [-10.68,1.50] {215} | -0.89 [-3.09,1.54] {176} | 0.07 [-2.18,2.58] {175} | 1.25 [-0.65,3.21] {164}
by the Party’s Clientelism Score
High | -8.72*** [-15.69,-3.17] {248} | -1.17 [-3.63,0.92] {160} | 1.20 [-0.69,4.14] {153} | 2.50** [0.82,5.40] {143}
Low | -2.51 [-11.07,4.86] {183} | -1.40 [-4.48,0.75] {100} | 1.22 [-0.72,4.74] {104} | 1.94 [-0.96,4.57] {99}
by Poverty in the Municipality
High | -7.73*** [-12.42,-3.60] {431} | -1.73* [-3.34,-0.22] {352} | 1.84** [0.34,3.91] {350} | 2.17*** [1.05,3.73] {329}
Low | -4.43 [-11.44,4.58] {355} | -0.59 [-2.62,1.46] {353} | 1.64 [-1.07,4.85] {345} | 1.64 [-1.07,4.85] {345}
Significant at: 99% ***, 95% **, 90% *. Standard errors are clustered by municipality. Highly educated incumbents have some post-secondary education; less educated ones have less than college. High density municipalities have above-median votes for "swing" parties in the 1998-2002 congress elections. High clientelism parties are defined by the DALP survey. High poverty sample has at least 25% of poor families (main specification). Low poverty sample has at most 45% of poor families. The coefficients represent the ATE for the preferred frontier segment (edge kernel). All regressions include year and state effects, and the municipal-level controls described in the text.
Chapter 4

Reelection Incentives and Fraud in Cash Transfers

The previous two Chapters show that, when cash transfers are effectively shielded from both local political manipulation and credit claiming, they reduce incumbency advantage in local elections. In this Chapter, I examine the scenario in which incumbent politicians are able to manipulate eligibility for the CCT, and also benefit from it in the elections.

Corruption in the administration of public resources is a threat to the functioning of democracies. While corruption can take many forms, it is often driven by one of two motivations: politicians seeking personal rewards, or using the spoils of office to influence voters and gain reelection. The literature has well-documented examples of rent extraction for personal benefit.77 Against this rent-motivated corruption, the possibility of reelection can serve as a disciplining device that aligns the preferences of voters with the actions of politicians, therefore reducing corruption (Barro, 1973; Ferejohn, 1986; Duggan and Martinelli, 2016). On the other hand, when politicians illegally capture resources with the intent to direct them to voters, and influence elections,78 the possibility of reelection creates incentives for fraud.79

77 See Ferraz and Finan (2011) and Brollo and Troiano (2016) for work in Brazil; Eggers and Hainmueller (2009) in Britain; and Querubin and Snyder (2013) in the USA.

This Chapter uses the administrative registry of the BF program (CadUnico) to measure income underreporting fraud. CadUnico contains the timing of enrollment, and the income reported by more than 24 million households in Brazil, for the purpose of program eligibility. I estimate that roughly 8 million households are currently receiving benefits by means of underreporting their income. Not all fraud, however, can be attributed to the households alone.
The program enrollment is managed and overseen by municipal offices under the mayoral administrations, allowing for political manipulation of reported income.

I first demonstrate that the possibility of reelection for mayor creates incentives for fraud in the BF program. Using a regression discontinuity design (RDD), I identify the effects of reelection incentives on fraud by comparing municipalities where incumbents barely won the 2008 election (and were ineligible in 2012 due to a two-term limit) to ones where they barely lost (and new mayors were eligible in 2012). The evidence of fraud comes from the following: while the 2010 Census shows no significant differences in the income distribution and poverty levels of the two groups, a significantly higher share of the population reports very low income (in CadUnico) in the municipalities with reelection incentives.

This pattern is stronger near the election period, and it is uncorrelated with both contemporaneous measures of administrative and economic performance in the municipality and other demographic information reported by poor households. It also lines up with the outcome of random government audits on reported income. This type of fraud generates BF expenses that are nearly 7% higher in these municipalities, giving mayors ample ability to claim credit over the distribution of resources that are 5 times the average campaign spending in municipal elections.

78 See Camacho and Conover (2011) for an example in Colombia. The two motivations for rent extraction can co-exist, but are often mutually exclusive (the politician can choose to use the rents either for personal benefit or for vote buying). This Chapter focuses on cases in which the institutional design leaves little room for extraction of personal benefits from the public sphere, but ample room for manipulation favoring voters. This is the case of the CCT program in Brazil.

79 Politicians can also use public resources to influence voters without committing fraud (and without offering universal public goods). As an example, they can claim undeserved credit for the arrival of resources that they do not control. The case of foreign aid to the Philippines is well documented in Cruz and Schneider (2016). They can also use public resources to conduct tactical redistribution, by targeting selected groups of voters in order to maximize reelection probabilities (Dixit and Londregan, 1996). See Stokes et al. (2013) for a detailed account of clientelistic practices, which include the use of public resources for vote buying. This Chapter focuses on cases where the politician can only claim credit or manipulate a specific source of funds by illegal means (i.e. fraud or corruption).

Second, when the expectation of being audited increases in the municipality, mayors are less likely to promote fraud. Since 2004, the Brazilian government has audited a random sample of municipalities every year, in order to examine the spending of federal resources, including the BF payments. Income underreporting fraud can trigger financial charges and legal action against both the household and members of the administration. The occurrence of audits in nearby municipalities should make the consequences of an audit more salient to mayors, and is used as a proxy for the mayor’s audit risk perception. Given that audits are random, this Chapter can identify the change caused by audit risk in the effect of reelection incentives on the type of fraud mentioned above. Accordingly, the occurrence of at least one audit in a nearby municipality is enough to curtail all the politically-motivated fraud in election years.

Third, this Chapter shows that fraud helps incumbents to be reelected.
Using matching over 40 observed characteristics of municipalities, parties and mayors, I compare the administrations that are in the top 25% in fraud to the remaining ones. For the sub-sample of all municipalities where the mayors actually ran for reelection in 2012, higher fraud increases the reelection probability by a significant 36%. This effect is higher where past elections were more competitive. Here again, there is strong evidence that fraud is uncorrelated with the administration’s performance in other areas, indicating that the increase in reelection probability is likely due to the mayor’s ability to claim credit over illegally obtained BF transfers. All these results fit a probabilistic voting model tailored to include conditional cash transfers, and federal audits as an accountability device.

4.1 CCT Program Design, Credit Claiming and Manipulation of Eligibility

As described in Section 3.1 (Chapter 3), Bolsa Família (BF) is the largest CCT program in the world, and represents a significant share of all non-discretionary federal transfers to municipalities in Brazil. Given the country’s history of abundant clientelism (Fried, 2012; Sugiyama and Hunter, 2013), and a decentralized structure of public spending (Ferraz and Finan, 2011; Brollo and Nannicini, 2012), it is not surprising that BF’s political impact has attracted the interest of both researchers80 and the local press since its inception.81

CCT programs are known to increase the support for the federal government that implemented them. At the local level, although most CCT experiences severely limit the role of local politicians in the implementation, incumbents can still benefit from claiming credit for the program’s arrival (Labonne, 2013; Rodriguez-Chamussy, 2015).
In Brazil, however, undeserved credit claiming opportunities for local politicians are curtailed by a combination of program design and the political context.

80 See examples in De Janvry, Finan, and Sadoulet (2012); Sugiyama and Hunter (2013); Zucco (2013).

81 The role of BF in shaping the Brazilian vote has been widely discussed by the press (in articles in Portuguese) since the program’s creation in 2003.

First, unlike most other CCTs around the world (e.g. Mexico, Philippines, Colombia), BF benefits are granted solely based on self-reported income (Handa and Davis, 2006). When designing the eligibility mechanism for a CCT, there is a trade-off between inclusion and the quality of targeting. Simpler enrollment rules, such as self-reporting, are easily understood by beneficiaries and allow for more inclusion. They, however, come at a cost to targeting, given that ineligible households can easily access the benefit by underreporting income. While the Brazilian CCT was designed to maximize inclusion (Soares, Ribas, and Osório, 2010), the political implications of this enrollment technology are often overlooked. When individuals fully understand the eligibility rules and self-enroll, they are also less likely to attribute undeserved credit to a local politician.

Second, BF was designed to promote the federal government brand. The money is deposited directly into the beneficiary’s account without the need for brokers or intermediaries, and a toll-free number gives them direct access to the federal management office if they need support.
It is clear to beneficiaries who is eligible for the benefit, and who is paying for it.82 Accordingly, surveys indicate that its beneficiaries perceive BF as being more immune to local political manipulation than other government programs (Rego and Pinzani, 2013; Sugiyama and Hunter, 2013).

82 The BF card carries the brand of the federal government on the front, and the toll-free number on the back.

Third, the Brazilian political environment has multiple parties (35), low ideological identification (Ames and Smith, 2010), and local elections are candidate-driven. Local coalitions do not respect party alliances at the federal level. In fact, it is common to observe coalitions including parties that are on opposing sides at the central government. In this context, local politicians have limited ability to claim credit for a successful policy implemented by their party at a higher level. Municipal elections in Brazil happen every four years, and there is a two-term limit for mayors.

While undermining credit claiming efforts, the BF management structure favors manipulation of program eligibility by local politicians. The program is managed jointly by the central and local administrations. Municipal governments are mainly responsible for household enrollment and data collection, which is done through the Cadastro Unico (CadUnico).83 The federal government sets program targets, funding, and eligibility criteria, and approves and denies benefits. The municipal offices do not control either the timing or the approval of benefits.84 All households with monthly per capita (pc) income below half the minimum wage (R$311 in 2012) are eligible to enroll in CadUnico. However, only families with pc income below R$140 are eligible for BF benefits, which are mainly of two types. The full benefit of R$70 is granted to all households with pc income below R$70, and it is not conditional on anything other than the self-reported income. The variable benefit (R$32-R$38) targets children (below 18) and pregnant women, and it is available to all households with pc income below R$140, conditional on school attendance and health check-ups.

83 CadUnico is an integrated registry that contains a vast array of demographic information reported by households at the time of their enrollment/update. It can be accessed and updated by the local offices, and the data is used by the federal office to grant or deny BF benefits. CadUnico is also the platform for accessing other (smaller) poverty alleviation policies run by the federal government.

84 For a more extensive analysis, see Lindert et al. (2007).

Although local BF offices are responsible for the accuracy of the information entered into CadUnico, the only existing source of accountability for BF targeting is the audits run periodically by the federal government. The Office of the Comptroller General (CGU) audits the spending of earmarked federal resources in 120-180 random municipalities every year, which gives each municipality a probability of around 3% of being audited in any given year. These audits include the resources transferred to mayors to manage the CadUnico enrollment, and also interviews with a subset of the program beneficiaries to verify their reported income.85 Income underreporting is considered fraud, and households or public servants found responsible for this act can be prosecuted in a civil court.

85 In addition to audits, the federal government has recently started to cross-check the CadUnico data against data from other government databases such as RAIS (a database containing wage information for all employees in the formal market in Brazil), and CNIS (the database of retirement benefits). MDS also had a pilot program to use the remaining demographic information included in CadUnico (e.g. housing conditions, reported expenses) to detect inconsistencies in reported income, but this is still not the formal procedure for eligibility.

To the extent that they can influence the enrollment/update process of CadUnico, incumbent mayors can use different strategies to benefit from the program.86 This paper focuses on the income underreporting fraud motivated by the reelection incentives of the mayor. Self-reported income as the only eligibility criterion generates a massive amount of income underreporting by households. Figure 4.1 shows the difference between the distributions of reported income in CadUnico and the 2010 Census.87 The share of the population reporting income below R$70 (eligible for the full BF benefits) is much higher in CadUnico. While part of this underreporting can be attributed to the beneficiaries themselves,88 this paper shows strong evidence that reelection incentives can lead mayors to promote income reporting fraud, and benefit electorally from it.

86 There are several reports in the local press (in Portuguese) describing attempts by local politicians to use the program for their own benefit. These include enrolling friends and family in the program, threatening households with the cancellation of their benefits, or enrolling beneficiaries that are not eligible in exchange for votes.

4.2 Theory: Reelection Incentives and Reverse Accountability

When politicians face the trade-off between providing public goods and extracting rent for personal benefit, accountability mechanisms such as reelection incentives have been shown to be effective in reducing corruption, as they realign the incentives of voters and politicians.89 In the case of BF, the incentives of mayors and voters are aligned to increase program fraud.
Income underreporting sanctioned by the local politician increases both the transfers to voters and the probability of reelection for incumbents, as it allows them to claim full credit for the extra benefit at a very low cost (the BF resources do not come out of the local budget). I call this realignment reverse accountability.

87 The Census is conducted by a different government institute, and it is not used in any way for eligibility purposes in the BF program. The MDS, however, uses the Census data to update its estimates of eligible families in each municipality.

88 Additionally, part of this gap can be attributed to the timing of both surveys, and to income volatility. In fact, the government’s official target of BF beneficiaries is not solely calculated based on reported income in the Census, but also includes an income volatility component. Nevertheless, there is clear evidence of income underreporting even after this adjustment. For example, for 2010, 21% of households reported income below R$140 in the Census survey, while the BF target is 29% of the population. However, the effective share of the households reporting income below R$140 in CadUnico is 41%.

89 See Ferraz and Finan (2011) for an example in Brazil, and Duggan and Martinelli (2016) for a recent literature review on the subject.

This mechanism can be illustrated by a version of the probabilistic voting model adapted for CCT. This is a much simpler model than the one presented in the second Chapter. It follows the same general framework and, while it includes the possibility of claiming credit for CCTs, it does not include vote buying.90 Let $p$ be the share of poor households in the population $n$, i.e. the share that is eligible for at least some benefits. Let $\lambda$ be the share of the poor ($p$) that are very poor, i.e. eligible for the full BF benefit ($w$). The very poor group cannot defraud the program, as they receive the full benefit without underreporting income. The share $(1-\lambda)$ could receive $w$ only if they underreport income. Define the share of households defrauding BF as $f$, i.e. the share of $np(1-\lambda)$ that is actually underreporting income.91

90 It is a model in the spirit of Persson and Tabellini (2000), and more recently Camacho and Conover (2011).

91 For simplicity, I assume that the share $(1-p)$ of the population is not willing to commit fraud to enter the BF program in exchange for political support, so it will not be targeted by the municipality. This assumption has no relevant influence on the mechanism in place, and follows the literature on clientelism and vote buying to the extent that these practices thrive in the context of poverty.

In the spirit of the previous model, voters are prospective. They have utility coming from public goods received from the incumbent ($g$) and transfers, as long as their credit can be attributed to the local politician. I assume that $g$ is exogenous, and constant over time and politician. Let $h$ be the exogenous probability that the challenger will not allow and/or promote income underreporting if elected. Thus, when voters are allowed to underreport their income, they know that they continue to receive the extra benefit with probability $1-h$ if the challenger wins and probability 1 if they reelect the incumbent. Let $\theta$ be the perceived probability that the municipality will be audited. The voter’s utility coming from the incumbent is $g + f n p (1-\lambda)(1-\theta) w - \varepsilon$, where $\varepsilon$ is a random variable that represents the overall popularity of the challenger, uniformly distributed in $[-\frac{1}{2\phi}, \frac{1}{2\phi}]$, so that $\phi$ is the density of the distribution. The cost to the incumbent of promoting fraud is increasing in $f$ (assume $x f^2$), where $x$ is exogenous.92

92 In practice, $x$ should be fairly low.

Following the Brazilian case, mayors can be reelected only once. In a two-period model, where winning incumbents have no reelection incentives in period 2, their only objective is the rents of office. Let $r$ be the exogenous, maximum share of $g$ that can be extracted as rent for personal gain. Because promoting BF fraud has a cost, but it only benefits the mayor through reelection probabilities, in period 2 the optimal fraud ($f_2$) for reelected mayors is zero, and rents are maximized at $rg$. Solving back to period 1, the incumbent’s objective is to maximize his reelection probability $\rho$, which is shown below, subject to the cost of promoting fraud.

$$\rho = \text{Prob}\left[\left(rg + f_1\, n p (1-\lambda)(1-\theta) h w - rg\right) > \varepsilon\right] \qquad (4.1)$$

Because the challenger cannot manipulate the eligibility, he only offers $rg$ to voters.93 Under the model assumptions, first-term incumbents maximize $\rho\, rg - x f_1^2$, solving for $f_1$ as shown below.

$$f_1 = \frac{n p (1-\lambda)(1-\theta)\, w h\, r g\, \phi}{2x} \qquad (4.2)$$

93 In this model, because the rent extraction in period 1 does not serve as a signal of period 2 extraction (voters know that lame-duck mayors will always extract maximum rent), rent extraction is always maximized in period 1 as well.

Several predictions of this model can be tested in the data. First, it is easy to see that, while second-term mayors have no incentives to promote income reporting fraud if $x > 0$, $f_1$ is positive for first-term mayors. Also, fraud should be higher in municipalities where the poor population is higher ($\partial f_1 / \partial p > 0$), simply because fraud is more likely to have a meaningful impact on the election in these locations. Second, when the perceived probability of being audited is higher ($\partial f_1 / \partial \theta < 0$), mayors have weaker incentives to promote fraud, in keeping with the government’s accountability mechanism. Third, higher fraud enhances the chances of reelection ($\partial \rho / \partial f_1 > 0$) by aligning the preferences of poor voters and politicians.94 Fourth, more competitive elections should increase both fraud ($\partial f_1 / \partial \phi > 0$), and the effect of fraud on the reelection probability ($\partial^2 \rho / \partial f_1 \partial \phi > 0$).
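The step from the reelection probability in (4.1) to the optimal fraud in (4.2) can be sketched as follows. The notation is an assumption made for this sketch: $\lambda$ is the share of the poor that are very poor, $\theta$ the perceived audit probability, $\varepsilon$ the challenger's popularity shock (uniform on $[-\frac{1}{2\phi}, \frac{1}{2\phi}]$ with density $\phi$), $\rho$ the reelection probability, and the cost of fraud is taken to be $x f_1^2$. The shorthand $A$ is introduced here only for compactness.

```latex
% Shorthand (hypothetical): per-unit electoral return to fraud
A \equiv n p (1-\lambda)(1-\theta) h w

% Reelection probability under the uniform shock (interior case)
\rho = \Pr\left[ f_1 A > \varepsilon \right] = \tfrac{1}{2} + \phi f_1 A

% First-term incumbent's problem and first-order condition
\max_{f_1} \; \rho\, r g - x f_1^2
\quad\Longrightarrow\quad
\phi A\, r g - 2 x f_1 = 0
\quad\Longrightarrow\quad
f_1 = \frac{\phi A\, r g}{2x}

% Comparative statics used in the predictions
\frac{\partial f_1}{\partial p} = \frac{\phi\, n (1-\lambda)(1-\theta) h w\, r g}{2x} > 0, \qquad
\frac{\partial f_1}{\partial \theta} = -\frac{\phi\, n p (1-\lambda) h w\, r g}{2x} < 0

\frac{\partial \rho}{\partial f_1} = \phi A > 0, \qquad
\frac{\partial f_1}{\partial \phi} = \frac{A\, r g}{2x} > 0, \qquad
\frac{\partial^2 \rho}{\partial f_1\, \partial \phi} = A > 0
```

The sketch holds only at an interior solution, i.e. when $f_1 \le 1$ and $f_1 A$ lies inside the support of $\varepsilon$; with a different cost function or support for the shock, the constants would change but the signs of the comparative statics would not.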
Although this paper’s identification strategy does not allow for a formal test of these last two predictions, there is strong evidence indicating that they are in fact in line with the empirical results.

94 Given that the CCT transfers are funded by the federal government, and municipalities do not have binding coverage targets, the fact that some voters might be receiving the benefit does not affect the utility of other voters in the same municipality, i.e., they are not less or more likely to receive the benefit.

4.3 Data and Construction of Variables

The main data source in this paper is the administrative database with the full information of the CadUnico registry.95 This database was extracted in Dec 2012, and it contains all the demographic information provided by the households during their most recent update process (households are required to update their information every two years, under the risk of losing their benefits). It also has data on which households are effectively receiving each type of BF benefit, and the date on which the information was last updated by the household, allowing me to observe how the income reporting changes as elections approach. Given that the identifying variation in this study is at the municipality level (i.e. the reelection incentives), all the household data is aggregated at this level.

95 This database is not available for public access; it was provided by the Ministry of Social Development (MDS) upon request.

Within the group of households that are eligible to enroll in the CadUnico (pc income ≤ R$311),96 I define poor households as the ones with pc income below R$140, and very poor households as the ones with pc income below R$70. The main variable used to measure fraud is the share of CadUnico-eligible households that are very poor, which are the ones eligible for the full benefit. I will also report supporting results for the following two variables: (1) the share of CadUnico-eligible households that are poor; and (2) the share of BF-eligible households that are very poor. I built the same variables using data from the sample of the 2010 Census, which was taken only two years before the 2012 election.

96 Approximately 8% of the entries in CadUnico had pc income above R$311. These households are excluded from the sample. The higher income could be a consequence of a data entry error, or of the fact that households that have pc income above the limit, but still have total monthly income of up to three minimum salaries, can also enroll (although they could never access BF benefits).

The Ministry of Social Development (MDS) provides monthly data on municipal CCT coverage, and the total value of transfers related to BF, from 2004 to 2015. This data, although only reported at the aggregated level, complements the CadUnico information for the period after the 2012 election (CadUnico data is not available after 2012). It also provides the IGD index, which measures the performance of the local administration in managing the BF program. The index is composed of four different scores, measuring the performance in checking health and school conditionalities, registry update, and coverage of the CadUnico.

Data on historical precipitation comes from the Climate Research Unit (CRU) of the University of East Anglia, and it is available for a 0.5-degree grid in latitude and longitude, since 1970.97 The remaining geographical (e.g. area, latitude, longitude) and demographic (e.g. population) data comes from the Brazilian Institute of Geography and Statistics (IBGE).
Data on gender, income distribution, literacy and urban population comes from the 2000 IBGE Census.

As in Chapter 3, election data comes from the Superior Electoral Authority (TSE) and includes, in addition to election results, personal information on candidates such as gender, education, occupation, age, assets and campaign donations. Mayors with reelection incentives are the ones elected in 2008 for their first term (first consecutive term). Mayors reelected in 2008 for their second consecutive term are the lame-duck mayors. The sample excludes municipalities where the lame-duck mayor stepped down before the end of the term. The reason is that the replacement mayor would actually have reelection incentives, without being elected in 2008.[98] I classify the main Brazilian political parties according to their level of clientelism (high or low), and ideology (left or right), based on the survey from the Democratic Accountability and Linkages Project (DALP).[99]

I use the number of recent CGU audits in neighbor municipalities[100] to proxy the expected probability that a certain municipality will be audited. The idea that audits in nearby locations make the audit risk more salient to politicians has recently been used in Lichand, Lopes, and Medeiros (2016). The main sample in this study includes only municipalities in the upper tercile in poverty, which are the ones with at least 67% of the households eligible to enroll in CadUnico (as of 2007).[101] This is in line with the model prediction that fraud should be lower in municipalities with low poverty rates, and also with the estimation strategy in Chapter 3. Table C.5 in the appendix shows the results for the low poverty sample of roughly the same size (bottom tercile in poverty).

[97] I use the average monthly rainfall in 1970-2008, and the coefficient of variation of the rainfall, calculated as the standard deviation divided by the mean in this period.
[98] These municipalities are identified using the candidates in the 2012 election that reported their occupation as mayors, in municipalities where reelection incentives were absent. The results are robust to this exclusion, and the excluded observations represent only 1.6% of the sample.
[99] The DALP (Democratic Accountability and Linkages Project) is a survey from 2008 where political experts from several countries respond to questions about the political behavior of local parties. The project is supported and made available by the Political Science Department at Duke University. I use the scores from the four questions related to the intensity with which parties use clientelistic exchanges to gather votes. The parties with an average score above 3 (out of 4 on an increasing scale) are identified as clientelistic parties, with value = 1. All other parties are identified with value = 0. All small parties that were not evaluated by DALP are identified as non-clientelistic. For ideology, all major parties with a score lower than 5 on a 0-10 scale were classified as left. I also included in the left group the small parties with an openly socialist ideology.
[100] I do not include past audits in the municipality of interest, as those might have a confounding effect. There might be direct effects of recent audits on the amount of BF fraud in a given municipality.
[101] I calculate the poverty rate using the estimate of CadUnico-eligible households made by the MDS, based on the 2006 PNAD survey, and the population count of 2007. This was the official target of BF beneficiaries until 2011, when it was re-calculated based on data from the 2010 Census. I still use the pre-2010 variable, as it pre-dates the 2008 election.
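As a minimal sketch of how the municipality-level fraud measures defined in this section could be computed from household records, the Python function below aggregates per capita incomes into the three shares. The record layout, the function name, and the reading of "BF-eligible" as households at or below R$140 are assumptions made for illustration. The thresholds come from the text and are applied with the ≤ convention used in the table labels. This is not the code used to build the thesis variables.

```python
from collections import defaultdict

# Income thresholds in R$ per capita, taken from the text.
CADUNICO_LIMIT = 311.0   # ceiling for CadUnico enrollment
POOR_LIMIT = 140.0       # "poor" households (assumed here to be the BF-eligible group)
VERY_POOR_LIMIT = 70.0   # "very poor" households, eligible for the full benefit


def fraud_measures(households):
    """Aggregate (municipality, pc_income) records into percentage shares.

    The input layout is hypothetical: any iterable of 2-tuples works.
    Households above the CadUnico ceiling are dropped, as in the text.
    """
    counts = defaultdict(lambda: {"eligible": 0, "poor": 0, "very_poor": 0})
    for municipality, pc_income in households:
        if pc_income > CADUNICO_LIMIT:
            continue  # excluded from the sample
        tally = counts[municipality]
        tally["eligible"] += 1
        if pc_income <= POOR_LIMIT:
            tally["poor"] += 1
        if pc_income <= VERY_POOR_LIMIT:
            tally["very_poor"] += 1

    measures = {}
    for municipality, tally in counts.items():
        measures[municipality] = {
            # Main measure: very poor as a share of CadUnico-eligible households.
            "very_poor_share_cadunico": 100.0 * tally["very_poor"] / tally["eligible"],
            # Supporting measure (1): poor as a share of CadUnico-eligible households.
            "poor_share_cadunico": 100.0 * tally["poor"] / tally["eligible"],
            # Supporting measure (2): very poor as a share of BF-eligible households.
            "very_poor_share_bf": (
                100.0 * tally["very_poor"] / tally["poor"] if tally["poor"] else None
            ),
        }
    return measures


# Toy records for two made-up municipalities.
example = [("A", 50.0), ("A", 100.0), ("A", 200.0), ("A", 400.0), ("B", 60.0)]
print(fraud_measures(example))
```

Keeping the three counts in one pass makes it easy to recompute the same shares on a different source, such as the Census sample, by feeding in records with the same layout.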
4.4 Empirical Strategy

In this Section I briefly describe the three different empirical strategies employed in this study.

4.4.1 Reelection Incentives and Fraud

The first part of this Chapter estimates the effect of reelection incentives on income underreporting fraud by comparing mayors elected for a first term in office to mayors reelected for a second and last term. This strategy carries two main potential sources of bias (Ferraz and Finan, 2011; De Janvry, Finan, and Sadoulet, 2012), as illustrated in equation 4.3 below:

$$F_m = \alpha + \beta R_m + \gamma\,\text{experience}_m + \delta\,\text{ability}_m + \varepsilon_m \qquad (4.3)$$

$F_m$ is the level of fraud in municipality $m$, and $R_m$ indicates a first-term mayor with reelection incentives ($R_m = 1$). The coefficient of interest is $\beta$. First, the experience bias reflects the fact that second-term mayors have, by design, more consecutive years of experience on the job than first-term mayors. For example, if political experience increases the willingness or ability of a mayor to promote fraud, then $\beta$ will be biased upwards. The ability bias comes from a selection problem. Second-term mayors are the only group that was reelected to office after revealing their political ability. The estimate of $\beta$ will again be biased if both the politician’s reelection chances and the level of fraud are correlated with either the politician’s ability or any characteristic of the municipality.

I employ a regression discontinuity design (RDD) to eliminate the potential ability bias coming from unobserved municipal characteristics that might determine reelection probabilities. The spirit of the RDD is to compare municipalities where incumbent mayors barely won the 2008 election, and therefore had no reelection incentives in 2012 (control group), to municipalities where they barely lost, and were replaced by first-term mayors that could be reelected in 2012 (treated group). This strategy provides a quasi-random assignment of first- and second-term mayors where elections were very close (Lee, 2008; Eggers et al., 2015), i.e. the margin of victory ($MV_m$) was nearly zero. I define the margin of victory as the difference in percentage points between the winner and the runner-up. The application of the RDD to close elections has been widely used in Brazil.[102] Under the assumptions of the RDD (Lee and Lemieux, 2010), the local average treatment effect (LATE) is identified for $MV_m = 0$. I estimate this effect non-parametrically using a local linear regression, as shown in equation 4.4 below.

$$F_m = \alpha + \beta R_m + \gamma MV_m + \delta R_m MV_m + \sigma_s + \pi_p + X_m + \varepsilon_m \qquad (4.4)$$

The treatment effect is again denoted by $\beta$. I also include state effects ($\sigma_s$), party effects ($\pi_p$),[103] and a vector of municipal controls ($X_m$).[104] This is usual in RDDs to reduce the sample variability (Lee and Lemieux, 2010). The local linear regression is weighted by the edge kernel, and estimated for a sample limited by a bandwidth around $MV_m = 0$. The main specification uses a bandwidth of 16pp (percentage points), which is roughly the average optimal bandwidth calculated for the three variables that measure fraud. For robustness, results will be shown for lower and higher bandwidths (13pp and 19pp), and also for an optimal bandwidth calculated by the widely used method proposed by Calonico, Cattaneo, and Titiunik (2014).

[102] For examples, see Boas and Hidalgo (2011); Ferraz and Finan (2011); Brollo and Nannicini (2012); Brollo and Troiano (2016).
[103] Brazil has a total of 26 states. There are 23 political parties represented in the sample.
[104] Unless otherwise noted, the following variables are included as controls: latitude; longitude; their interaction; the shares of male population and urban households (2000); the mayor’s gender and education level (=1 if attended college); log of population; the share of votes PT had in the municipality in the 2006 presidential election; and the IGD index for the municipality for 2006-08.

The RDD is only valid if covariates that are either fixed or pre-determined (determined before treatment) are balanced around the discontinuity, so the control and treatment groups are not significantly different. Figure 4.3 (and C.2 in the appendix) show that the sample is balanced, under all bandwidths, for 26 out of 27 variables that include characteristics of the municipality, elected party, and mayor. The exception is the age of the elected mayor, which is significantly lower in the group of first-term mayors, most likely as a consequence of the empirical design. In the appendix to this Chapter (Table C.5), I also show that the results estimated for a sub-sample of mayors between 40 and 55 years of age, where the variable age is balanced, are still significant and very close to the ones reported for the main sample.

In addition to the 27 pre-determined variables above, Figure 4.3 also shows that the variables measuring reported income levels in the 2010 Census are balanced. It is key for the results in this Chapter that municipalities with and without reelection incentives have the same share of poor households when measured by the income reported in the Census survey. Evidence of income underreporting fraud comes from these municipalities having different shares of poor households when measured by income reported in the CadUnico survey (the one used for BF eligibility purposes).

To address the issues of both experience and ability biases, I follow Ferraz and Finan (2011).
First, I show in Table 4.5 that the results remain robust if estimated only for a sample of first-term mayors that have previous political experience in the same municipality, i.e. they were mayors in 1996-2000 or 2000-04, or members of the local council in 2000-2004 or 2004-2008. The same Table also shows that the results remain robust when estimated only for a sample of first-term mayors with comparable political ability to second-term mayors, i.e., first-term mayors that actually ran for reelection in 2012, and won.

If the income underreporting is actually caused by an electoral strategy, as opposed to being a spillover of the mayor’s administrative practices and performance, one should expect two things. First, the effects of reelection incentives on fraud should be uncorrelated with other performance measures of the mayoral administration, especially the ones related to the management of the BF program. I test this hypothesis in Table 4.4.

Second, the effects should be more significant near the election. Accordingly, the results estimated in this paper correspond to the fraud measured only for households that updated or enrolled in CadUnico in 2012, the year of the municipal election. These correspond to 45% of the households in the registry. Because households are only required to update their information every two years, and many do not comply, a sizable part of the sample had their last income update in or before 2011. This allows me to observe whether these effects also occur in pre-election updates (Figure 4.4).

4.4.2 Expected Probability of Audits and Fraud

I use the number of CGU audits in the municipality’s neighborhood in the first two years of the mayoral tenure (2009-10), which precede the pre-election period (2011-12), to test the model prediction that a higher expected probability of being audited reduces the mayor’s incentives to promote fraud.
I define a neighbor municipality as one in the same micro-region (as defined by IBGE). The effect of audit risk on fraud is estimated by interacting the variable that measures reelection incentives ($R_m$) with $A_m$ ($A_m = 1$ if there was any audit in a neighbor municipality), as shown below:

$$F_m = \alpha + \beta R_m + \gamma MV_m + \delta R_m MV_m + \theta_1 A_m + \theta_2 R_m A_m + \theta_3 A_m MV_m + \theta_4 R_m A_m MV_m + \sigma_s + \pi_p + X_m + \varepsilon_m \qquad (4.5)$$

The original treatment effect (reelection incentives on fraud) is again denoted by $\beta$, but the impact of audit risk on this effect is given by $\theta_2$, the coefficient on the interaction $R_m A_m$. In addition to the main results, I also show results including past audits in the same municipality, and restricting the neighborhood to radii of 110km and 55km, within the micro-region. As a placebo test, I measure $A_m$ using audits in 2012-13 (the election was in 2012), when no significant effects should be observed.

4.4.3 Program Fraud and Reelection Probability

The identification of the causal effect of fraud on elections is not as straightforward. There are many variables that could potentially influence both the election results and the level of income underreporting in a given municipality. Here, I use a non-parametric matching procedure. Matching has been widely used in political science applications, and it relies on the same identification assumption as the OLS regression (selection-on-observables), i.e., that there are no variables other than the model covariates that affect both the outcome (election results) and the explanatory variable (fraud). It has, nevertheless, two major advantages: a less restrictive functional form assumption, and a guarantee that the control and treatment groups are balanced on the selected covariates (Sekhon, 2009).

For this purpose, I define the explanatory variable non-linearly, as an indicator of the top 25% administrations in fraud.[105] The matching procedure used here is the entropy balancing method proposed by Hainmueller (2012). This method assigns weights to the observations in the control group (i.e. municipalities that are in the bottom 75% in fraud) to ensure that treatment and control are balanced on all the selected covariates, and the groups are comparable. In order to minimize the potential omitted variable bias arising from the selection-on-observables assumption, I condition the effects on 40 different covariates. These include characteristics of the municipality such as GDP, poverty rate and regional location; of the mayor, such as gender, education, occupation and assets; and of parties, such as coalition status and ideology. Table C.6 in the appendix shows that, as expected, several municipal characteristics are imbalanced before matching. The number of households reporting low income is expected to be correlated with some observed variables such as GDP and poverty rate. However, all covariates are perfectly balanced after matching.

[105] That is, the top 25% of administrations by share of incomes reported below R$70 in the election year (2012) in CadUnico.

4.5 Result and Interpretation

In this Section I present the results, their interpretation, and their link to the theoretical framework.

4.5.1 Do Reelection Incentives Increase Fraud?

The results show that mayors with reelection incentives promote more fraud in the BF program than lame-duck mayors. Table 4.1 shows the local average treatment effects obtained with the RDD, for the sample including all households that updated/enrolled in CadUnico in 2012 (election year), and for all municipalities with a share of poor families above 67%.[106] Under the main empirical specification (bandwidth of 16pp including all controls), the share of CadUnico-eligible households reporting income below R$70 is 5.4pp higher in municipalities where mayors have reelection incentives (7% higher). Similar results are found for the share of CadUnico-eligible households reporting income below R$140 (2.1pp higher), and BF-eligible households reporting income below R$70 (3.9pp higher).
These results are robust to the exclusion of controls, state and party effects, and to different bandwidths (Table 4.1).

[106] In line with the model prediction, the effects described here are not observed for a sample of “wealthier” municipalities, as shown in the appendix (Table C.5).

In addition to the robust evidence of income underreporting fraud, the results from this section provide two interesting insights on the mechanism in place. First, from the larger universe of CadUnico-eligible households, much higher effects are observed for income reported below R$70 (eligible for the full BF benefit), as opposed to R$140 (eligible for some BF benefits). This implies that the core of the fraud is to get households that are already poor (and likely eligible for some BF benefit) to receive the full BF benefit.[107] This practice is in line with the distribution of reported income shown in Figure 4.1, and well illustrated by the model. That is, this type of fraud has a lower search cost for the local offices (many of these households come to update their CadUnico profile regularly), it is harder to detect through audits, and these beneficiaries are less costly to monitor (no extra health or school conditions attached).

Second, Figure 4.2 shows the distribution of municipalities across the discontinuity, according to their income underreporting level. Although the effects are not identified for points away from the discontinuity ($MV_m = 0$), a visual examination shows that, for municipalities with less competitive elections on the right (with reelection incentives), the effects of higher fraud disappear.
This is in line with the theoretical prediction, i.e., mayors elected by a large vote margin in 2008 are expected to rely less on fraud to be reelected in 2012.

4.5.2 The Magnitude of the Losses Triggered by Fraud

In Dec 2012, municipalities with reelection incentives had total BF spending that was 7% higher than municipalities governed by lame-duck mayors (Table 4.2). To put this figure in perspective, it represents R$0.3mn in annual expenses for a municipality at the discontinuity, which is roughly five times the average cost of an election campaign. Additionally, it represents roughly three times what the municipal administration receives from the federal government to spend on the management of CadUnico and update/enrollment efforts. While 7% might not seem a major leakage for a program of the scale of BF, it represents a major source of resources that can be controlled by mayors looking for reelection.

[107] The full benefit includes the monthly payment of R$70 with no conditions attached.

Additionally, these effects are slightly conservative. It might take several months for the actual distribution of benefits to reflect changes in the income reported in recent enrollments/updates.[108] Table 4.2 illustrates this temporary disparity between income reporting and benefits. At the end of 2012, the share of households receiving the full BF benefit is not significantly higher for the entire set of beneficiaries in municipalities with reelection incentives.
However, in 2014, the share of households receiving the full BF benefit is already significantly higher (by 0.6pp) for the entire set of beneficiaries in treated municipalities, where higher BF spending is also observed (12% higher).

4.5.3 Other Robustness, Placebo and Specification Tests

Figure 4.4 (full results in Table C.3 in the appendix) shows that, in line with expectations, fraud is only observed near the election (in 2011, and more strongly in 2012), being absent from the first two years of the mayoral tenure. Table 4.4 shows that reelection incentives had no impact either on the performance of mayors in managing the BF program, measured by the components of the IGD index, or on contemporaneous economic growth, measured by the GDP growth between 2011-12 and 2010-12 in the municipality. These results show that it is unlikely that income underreporting is a spillover from the practices and performance of the administration, as opposed to fraud.

Table 4.5 shows the robustness of the main results to the potential experience and ability biases. The effects on election-year fraud remain significant for both the sub-samples of mayors with similar experience and ability to second-term mayors. In fact, the magnitude of the coefficients is much higher in the high-ability sub-sample, indicating that the main sample might be giving conservative estimates of the actual effects of reelection incentives on fraud. In other words, less political experience and ability might in fact be slightly reducing fraud.

[108] The local program administration has no control over the timing, or over the actual approval and denial of benefits; see the appendix for a more detailed description of this process.
The appendix (Table C.5) also shows the results for the sample with balanced age, which are similar to the main specification.

Households also report other characteristics in CadUnico such as their monthly food and electricity expenses, the size of their house, and the existence of a water connection. These variables are not actively used to define program eligibility.[109] It is possible that income underreporting is systematic enough that households also misrepresent the values of all variables consistently. It is more likely, however, that more attention is paid to the only variable that matters for eligibility, i.e., income. Accordingly, there are no significant differences in the effects of reelection incentives on the reporting of these variables in 2012, which is additional evidence that the income underreporting effects observed in the same year are indeed a consequence of fraud.

Finally, using information from the CadUnico, I trace the past income (pre-2012) of all households updating in 2012,[110] and aggregate the municipal variables only for the households that had their last update in the first two years of the mayoral tenure (2009-10). These results can be found in the appendix, Table C.4. There was no significant difference in past income across the groups of municipalities (with and without reelection incentives in 2012). This is evidence that the income reporting effects found in this study are indeed generated only in 2011-2012 (near the election year).

[109] The government uses the additional information in CadUnico to calculate an index of household development (IDF). There was a pilot project in place to use this index for detection of fraud in income reporting, but to the best of my knowledge, this was never implemented before 2012.
[110] This information is not available for households first enrolling in the registry in 2012.
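For readers who want to see the mechanics behind the local average treatment effects reported in this section, the sketch below computes the point estimate of equation 4.4 without state effects, party effects, or controls. It rests on three assumptions that are not taken from the thesis: that the edge kernel is the triangular kernel, that the running variable is signed so that positive values correspond to first-term mayors, and that all function names are hypothetical. It uses the fact that a fully interacted local linear regression is equivalent to fitting a separate weighted line on each side of the cutoff, so the treatment effect is the difference between the two intercepts. It is an illustration, not the estimation code used here, and it produces no standard errors.

```python
def triangular_weight(margin, bandwidth):
    """Triangular kernel: weight falls linearly to zero at the bandwidth."""
    return max(0.0, 1.0 - abs(margin) / bandwidth)


def weighted_line(xs, ys, ws):
    """Weighted least squares fit of y = intercept + slope * x."""
    total = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    s_xy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def rdd_late(margins, outcomes, bandwidth=16.0):
    """Difference in intercepts at a zero margin (treated minus control)."""
    sides = {0: ([], [], []), 1: ([], [], [])}
    for margin, outcome in zip(margins, outcomes):
        weight = triangular_weight(margin, bandwidth)
        if weight <= 0.0:
            continue  # outside the bandwidth
        xs, ys, ws = sides[1 if margin >= 0 else 0]
        xs.append(margin)
        ys.append(outcome)
        ws.append(weight)
    intercept_control, _ = weighted_line(*sides[0])
    intercept_treated, _ = weighted_line(*sides[1])
    return intercept_treated - intercept_control


# Simulated data: a linear trend in the margin plus a jump of 5 at the cutoff.
margins = [i + 0.5 for i in range(-16, 16)]
outcomes = [80.0 + 0.1 * m + (5.0 if m >= 0 else 0.0) for m in margins]
print(round(rdd_late(margins, outcomes), 6))
```

Because the simulated outcome is exactly linear on each side, the estimator recovers the simulated jump up to floating-point error; with real data the estimate depends on the bandwidth, which is why results are reported for several bandwidths.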
4.5.4 External Accountability: Does Audit Risk Reduce Fraud?

The hypothesis arising from the theoretical model is that a higher risk of audits will reduce fraud in the election year. Table 4.6 shows the coefficient of interest, estimated using equation 4.5. The occurrence of recent audits in a municipality’s neighborhood reduces the positive effect of reelection incentives on fraud. In fact, the magnitude of the results indicates that recent audits wipe out all these effects, given that an audit leads 6.8pp fewer households to report income below R$70, when compared to the effect observed for municipalities without increased audit risk. This result is robust to bandwidth and model specification, but it loses power and magnitude when the neighborhood is contracted to a narrower radius. Table 4.6 also shows a placebo test. The effects disappear when estimated using audits that happened after the election.

This is evidence that audits are an effective and relatively inexpensive instrument to reduce program fraud, in the presence of reelection incentives. For perspective, the country-wide BF spending was R$20bn in 2012, which puts the estimated impact of fraud at more than R$1bn per year. At the same time, the annual budget of the CGU is currently less than R$100mn.

4.5.5 Does Fraud Help the Incumbent Politician?

In municipalities in the top 25% in fraud, incumbent mayors have a 36% higher probability of reelection (18pp higher), for a 6% higher share of votes (3pp higher). These results are observed for a sub-sample of municipalities where the past election (2008) had a margin of victory of 10pp or less (Table 4.7). They are also in line with both of the following predictions of the theoretical model: fraud should help incumbents to get reelected, and this effect should be higher where elections are more competitive.
While these results are still significant, but with a lower magnitude, for the municipalities where the 2008 margin of victory was less than 15pp, they fade when the entire sample is used (margin of victory up to 100pp).

Table 4.7 also shows that being in the group with top 25% election-year fraud is not correlated with contemporaneous performance indicators of the mayors, measured by the components of the IGD index and GDP growth, as before. The electoral rewards of fraud are not correlated with the mayor’s performance in other areas (BF management and economic growth). Finally, Figure 4.5 shows the results of an additional placebo test for the matching specification. I estimate the electoral rewards of being a top 25% municipality in low reported income, based on updates made in CadUnico in the pre-election period (2009-2011). If the results found in this section came from some omitted variable bias that causes both a higher incumbency advantage and lower reported income, then the coefficients observed for the years other than 2012 should also be significant and strong. The fact that no effects are observed for these years is evidence that the electoral rewards estimated under this specification are coming from income reporting fraud in 2012 only.

4.6 Figures and Tables

Figure 4.1: Reported Monthly Income in Brazil: CadUnico vs. Census
Data from the sample of the 2010 Census, and CadUnico updated as of Dec 2012. It excludes the households that reported zero income (10% of the CadUnico sample, and 11% of the Census sample).

Figure 4.2: Discontinuity in Reported Income
The right side shows the municipalities with reelection incentives. The solid lines are the local fit of a second degree polynomial, and the dashed lines are the 90% confidence intervals. The sample includes municipalities in the two upper terciles in poverty.
Figure 4.3: Balance of Fixed or Pre-Determined Variables
The chart shows the normalized confidence intervals (C.I.) for a confidence level of 90%. The C.I. are heteroskedasticity robust.

Figure 4.4: The Impact of Reelection Incentives on Fraud, by Update Year
The regression is estimated using the full specification, under a bandwidth of 16pp. The bars show the 90% confidence interval.

Figure 4.5: The Impact of Fraud on Electoral Performance, by Update Year
The regression is estimated using a matching algorithm, where the explanatory variable is a dummy indicating if the municipality is top 25% in fraud. The bars show the 90% confidence interval.

Table 4.1: Main Results: Fraud (Income Underreporting)

                                PT Mean   Coefficient, [S.E.]                     Band.
Dependent Variable              (1)       (2)       (3)       (4)       (5)       {Obs.}
Bandwidth                       16.0      13.0      16.0      19.0      Optimal   Optimal

Excludes Municipal Covariates, State dummies, and Region-Party dummies
Very Poor, ri≤R$70              81.227    4.848**   5.157**   5.196***  5.024**   15.0
(% of CadUnico Eligible)        [0.458]   [2.180]   [1.997]   [1.841]   [2.049]   {589}
Very Poor, ri≤R$70              88.464    3.576**   3.796**   3.828***  3.690**   14.8
(% of BF Eligible)              [0.341]   [1.641]   [1.502]   [1.383]   [1.553]   {585}
Poor, ri≤R$140                  91.435    1.642     1.813*    1.848**   1.867**   17.0
(% of CadUnico Eligible)        [0.218]   [1.067]   [0.974]   [0.894]   [0.947]   {644}

Includes State dummies and Region-Party dummies
Very Poor, ri≤R$70                        6.001***  5.944***  5.768***  5.894***  15.0
(% of CadUnico Eligible)                  [2.095]   [1.902]   [1.753]   [1.956]   {589}
Very Poor, ri≤R$70                        4.199***  4.150***  4.023***  4.128***  14.8
(% of BF Eligible)                        [1.615]   [1.453]   [1.330]   [1.511]   {585}
Poor, ri≤R$140                            2.366**   2.383***  2.330***  2.409***  17.0
(% of CadUnico Eligible)                  [0.955]   [0.880]   [0.817]   [0.858]   {644}

Full Specification (includes Municipal Covariates, State dummies, and Region-Party dummies)
Very Poor, ri≤R$70                        5.500***  5.444***  5.263***  5.375***  15.0
(% of CadUnico Eligible)                  [1.921]   [1.744]   [1.611]   [1.794]   {589}
Very Poor, ri≤R$70                        3.977***  3.916***  3.769***  3.885***  14.8
(% of BF Eligible)                        [1.478]   [1.336]   [1.229]   [1.388]   {585}
Poor, ri≤R$140                            2.058**   2.084**   2.049***  2.104***  17.0
(% of CadUnico Eligible)                  [0.893]   [0.819]   [0.760]   [0.800]   {644}

Observations                              544       617       698

Significant at: 99% ***, 95% **, 90% *. Standard errors are heteroskedasticity robust. The list of included municipal-level covariates is described in the text. Pre-treatment (PT) means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment. Optimal bandwidths are calculated using the CCT algorithm.

Table 4.2: Distribution of Benefits

                                PT Mean   Coefficient, [S.E.]                     Band.
Dependent Variable              (1)       (2)       (3)       (4)       (5)       {Obs.}
Bandwidth                       16.0      13.0      16.0      19.0      Optimal   Optimal

Sample Aggregating All Households in CadUnico (MDS Data at the Municipal Level)
Full Benefits, Dec 2012         97.465    0.509     0.563     0.574*    0.579     17.1
(% of total benefits)           [0.099]   [0.422]   [0.376]   [0.339]   [0.362]   {651}
BF pc Benefit, Dec 2012 (a)     5.064     0.012     0.016     0.014     0.014     19.2
(R$ ’000/yr)                    [0.006]   [0.024]   [0.022]   [0.020]   [0.020]   {700}
BF tot Benefits, Dec 2012 (a)   12.741    0.067*    0.073**   0.067**   0.066**   19.4
(R$ mn/yr)                      [0.034]   [0.037]   [0.034]   [0.032]   [0.032]   {704}
Full Benefits, Dec 2014         98.409    0.618     0.618*    0.574*    0.534*    22.5
(% of total benefits)           [0.089]   [0.384]   [0.338]   [0.304]   [0.280]   {748}
BF pc Benefit, Dec 2014 (a)     5.245     0.069**   0.065**   0.058**   0.066**   14.4
(R$ ’000/yr)                    [0.007]   [0.030]   [0.027]   [0.025]   [0.029]   {577}
BF tot Benefits, Dec 2014 (a)   12.959    0.122**   0.119***  0.107***  0.113***  17.8
(R$ mn/yr)                      [0.034]   [0.049]   [0.045]   [0.041]   [0.043]   {669}

Sample Aggregating Households that Updated their Information in the Election Year (2012)
Full Benefits, Dec 2012         97.862    0.458     0.499     0.523*    0.524     17.6
(% of total benefits)           [0.088]   [0.386]   [0.341]   [0.307]   [0.322]   {663}
BF Benefits, Dec 2012           78.544    -1.541    -1.247    -1.004    -1.371    14.8
(% of total enrollment)         [0.290]   [1.317]   [1.222]   [1.130]   [1.258]   {585}
BF pc Benefit, Dec 2012 (a)     5.086     0.027     0.031     0.028     0.030     17.0
(R$ ’000/yr)                    [0.006]   [0.025]   [0.023]   [0.022]   [0.023]   {648}
BF tot Benefits, Dec 2012 (a)   12.112    0.150     0.159*    0.149*    0.130*    23.3
(R$ mn/yr)                      [0.038]   [0.093]   [0.086]   [0.079]   [0.072]   {760}

Observations                              544       617       698

(a) Coefficient estimated in log(Variable). Significant at: 99% ***, 95% **, 90% *. Standard errors are heteroskedasticity robust. The list of included municipal-level covariates is described in the text. Pre-treatment (PT) means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment. Optimal bandwidths are calculated using the CCT algorithm.

Table 4.3: Other Household Characteristics Reported in CadUnico

                                PT Mean   Coefficient, [S.E.]                     Band.
                                (1)       (2)       (3)       (4)       (5)       {Obs.}
Bandwidth                       16.0      13.0      16.0      19.0      Optimal   Optimal

Sample of Households that Updated their Information in the Election Year (2012)
Food Expenses                   48.555    -3.584    -3.205    -2.897    -3.218    15.5
(pc avg)                        [0.588]   [2.494]   [2.277]   [2.097]   [2.312]   {599}
Electricity Expenses            6.717     -0.332    -0.264    -0.176    -0.168    19.3
(pc avg)                        [0.097]   [0.384]   [0.352]   [0.327]   [0.326]   {700}
Rooms                           1.344     0.034     0.029     0.025     0.030     15.5
(pc avg)                        [0.010]   [0.032]   [0.029]   [0.027]   [0.030]   {599}
Floor                           63.823    -5.317*   -4.663    -4.166    -4.865    15.0
(share of built floor)          [0.798]   [3.215]   [2.966]   [2.738]   [3.044]   {587}
Water                                     544       617       698
(share of water connection)     [0.500]   [2.251]   [1.705]   [1.265]   [1.435]   {2164}

Observations                              872       1626      2556

Significant at: 99% ***, 95% **, 90% *. Standard errors are heteroskedasticity robust. The list of included municipal-level covariates is described in the text. Pre-treatment (PT) means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment. Optimal bandwidths are calculated using the CCT algorithm.

Table 4.4: Measures of the Mayor’s Performance in 2012

                                PT Mean   Coefficient, [S.E.]                     Band.
                                (1)       (2)       (3)       (4)       (5)       {Obs.}
Bandwidth                       16.0      13.0      16.0      19.0      Optimal   Optimal

School Conditionalities         0.895     0.016     0.016     0.015     0.016     13.5
(index in 0-1, 2012)            [0.004]   [0.014]   [0.013]   [0.012]   [0.014]   {555}
Health Conditionalities         0.835     -0.010    -0.005    -0.002    -0.010    13.0
(index in 0-1, 2012)            [0.005]   [0.022]   [0.020]   [0.018]   [0.022]   {544}
CadUnico Updates                0.793     -0.006    -0.004    -0.005    -0.004    16.0
(index in 0-1, 2012)            [0.004]   [0.017]   [0.016]   [0.015]   [0.016]   {619}
CadUnico Coverage               0.992     0.002     0.002     0.001     0.002     17.4
(index in 0-1, 2012)            [0.002]   [0.007]   [0.007]   [0.006]   [0.007]   {656}
GDP growth                      0.003     -0.009    -0.016    -0.018    -0.006    12.5
(%, 2012/2011)                  [0.006]   [0.016]   [0.015]   [0.015]   [0.016]   {531}
GDP growth                      0.050     0.020     0.006     0.001     0.012     14.5
(%, 2012/2010)                  [0.007]   [0.026]   [0.025]   [0.025]   [0.026]   {579}

Observations                              544       617       698

Significant at: 99% ***, 95% **, 90% *. Standard errors are heteroskedasticity robust. The list of included municipal-level covariates is described in the text. Pre-treatment (PT) means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment. Optimal bandwidths are calculated using the CCT algorithm.

Table 4.5: Robustness to the Experience and Ability Biases

                                PT Mean   Coefficient, [S.E.]                     Band.
                                (1)       (2)       (3)       (4)       (5)       {Obs.}
Bandwidth                       16.0      13.0      16.0      19.0      Optimal   Optimal

EXPERIENCE: mayors with reelection incentives that have previous political experience
Very Poor, ri≤R$70              80.896    5.395**   5.012**   4.582**   5.032**   16.3
(% of CadUnico Eligible)        [0.534]   [2.207]   [2.106]   [1.961]   [2.114]   {383}
Very Poor, ri≤R$70              88.216    3.528**   3.291**   3.060**   3.332**   15.9
(% of BF Eligible)              [0.394]   [1.733]   [1.655]   [1.532]   [1.674]   {372}
Poor, ri≤R$140                  91.301    2.313**   2.161**   1.939**   1.896**   16.9
(% of CadUnico Eligible)        [0.261]   [1.031]   [0.966]   [0.893]   [0.875]   {394}
Observations                              403       462       528       454

ABILITY: mayors with reelection incentives that ran and won in 2012
Very Poor, ri≤R$70              82.371    9.242***  9.186***  8.695***  9.139***  14.0
(% of CadUnico Eligible)        [0.500]   [2.165]   [2.039]   [1.920]   [2.025]   {419}
Very Poor, ri≤R$70              89.360    6.951***  6.783***  6.393***  6.887***  13.4
(% of BF Eligible)              [0.364]   [1.645]   [1.542]   [1.448]   [1.547]   {407}
Poor, ri≤R$140                  91.844    3.282***  3.424***  3.240***  3.366***  14.9
(% of CadUnico Eligible)        [0.242]   [1.083]   [1.013]   [0.944]   [0.992]   {432}
Observations                              397       460       531       419

Significant at: 99% ***, 95% **, 90% *. Standard errors are heteroskedasticity robust. The list of included municipal-level covariates is described in the text. Pre-treatment (PT) means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment. Optimal bandwidths are calculated using the CCT algorithm.

Table 4.6: Effect of Audits on Fraud, by Different Specifications

                                PT Mean   Coefficient, [S.E.]
                                (1)       (2)       (3)       (4)       (5)       (6)
Dependent Variable: (Share of Poor with Income below R$70)
Bandwidth                       16.0      16.0      16.0      13.0      16.0      19.0

Effect of Recent Audits (2009-10)
Micro-region                    0.567     -4.641    -5.100    -7.336**  -6.779**  -6.170*
(1=any audit)                   [0.020]   [3.911]   [3.471]   [3.661]   [3.406]   [3.153]
Micro-region + (a)              0.608     -5.193    -6.293*   -7.695**  -7.175**  -6.559**
(1=any audit)                   [0.020]   [3.933]   [3.492]   [3.684]   [3.424]   [3.166]
Micro-region, 111km rad.        0.532     -3.276    -5.973*   -8.033**  -7.298**  -6.752**
(1=any audit)                   [0.020]   [4.019]   [3.454]   [3.648]   [3.378]   [3.125]
Micro-region, 55km rad.         0.364     1.267     -2.147    -3.235    -2.981    -2.657
(1=any audit)                   [0.019]   [4.260]   [3.512]   [3.685]   [3.415]   [3.190]

Placebo Test
Micro-region                    0.334     -2.091    0.574     -1.017    -0.152    0.084
(1=any audit, 2012-13)          [0.019]   [4.328]   [3.566]   [4.011]   [3.703]   [3.389]

Observations                    617       617       617       544       617       698
State, Party-Region Dum.        N         Y         Y         Y         Y         Y
Controls                        N         N         N         Y         Y         Y

(a) It also includes past audits in the municipality. Significant at: 99% ***, 95% **, 90% *. Standard errors are heteroskedasticity robust. The list of included municipal-level covariates is described in the text. Pre-treatment (PT) means correspond to the predicted values of the audit dummies (i.e. the share of municipalities that experienced audits in their neighborhood) for a municipality at the discontinuity segment, before treatment.

Table 4.7: Impact of BF Fraud on the Probability of Reelection

                                PT Mean   Coefficient, [S.E.]
Dependent Variable              (1)       (2)       (3)       (4)
Margin of Victory, 2008         ≤15pp     ≤10pp     ≤15pp     All

Election Results
Share of Votes, 2012            47.970    3.060**   2.115*    -0.315
(%)                             [11.984]  [1.308]   [1.185]   [0.983]
Prob. 
of Reelection, 2012 48.354 17.611*** 12.017** 1.726(%) [50.036] [6.680] [5.609] [4.708]Mayoral Performance IndexHealth Conditionalities, 2011-12 0.844 -0.002 -0.007 -0.016(index in 0-1) [0.130] [0.020] [0.014] [0.013]School Conditionalities, 2011-12 0.905 -0.007 -0.017 -0.013(index in 0-1) [0.072] [0.013] [0.013] [0.009]CadUnico Coverage, 2011-12 0.995 0.003 -0.001 -0.002(index in 0-1) [0.024] [0.005] [0.004] [0.003]CadUnico Update, 2011-12 0.801 -0.005 -0.003 -0.003(index in 0-1) [0.090] [0.015] [0.012] [0.010]Overall Index (Igd), 2011-12 0.886 -0.003 -0.007 -0.009*(index in 0-1) [0.048] [0.007] [0.006] [0.005]GDP Growth, 2012-11 -0.332 0.020 0.016 0.012(% yoy) [11.413] [0.012] [0.011] [0.010]GDP Growth, 2012-10 4.688 -0.001 -0.006 -0.006(% yoy) [14.516] [0.018] [0.015] [0.012]Observations 527 406 527 740Significant at: 99% ***, 95% **, 90% *. Standard errors are heteroskedasticity robust. The regression wasweighted by weights generated in the matching procedure. The explanatory variable is a dummy indicatingif the municipality is top 25% in fraud (% of CadUnico-eligible households reporting income below R$70).99Chapter 5ConclusionThis thesis studies the impact of a CCT program in the local politics of Brazil,a developing nation plagued by clientelism, vote buying and corruption. Chapters2 and 3 show that, when transfers are properly shielded from the influence andmanipulation of local politicians, the CCT program reduces incumbency advantage,increases both electoral competition and the quality of candidates, and weakensthe support for clientelistic parties. Cash transfers also contribute to the politicalenfranchisement of the poor by shifting spending into redistributive health and edu-cation services. 
The theory reconciles these empirical findings by showing that cash transfers reduce the ability of incumbents to raise support through vote buying. These effects are observed over a nine-year period.

Chapter 4 examines the relationship between CCT and local politics from a different angle. When the program can be used by politicians to influence voters, the possibility of reelection aligns corruption with both the preferences of voters and the politician’s objective, giving rise to a mechanism of reverse accountability. This Chapter exposes one way in which politicians illegally extract rents from the BF program and use them to increase their reelection prospects. By promoting income underreporting fraud, mayors can make not-so-poor households eligible to receive federal funds. These households will, in turn, reward mayors with votes. This Chapter also shows that this reverse accountability calls for external disciplining devices to keep politicians and voters in check. The expectation of being audited is shown to drastically reduce the effects of reelection incentives on income reporting fraud in the BF program.

Both the approach of the first two Chapters and the approach of Chapter 4 estimate the same thing: the effect of higher CCT coverage on local politics. The results obtained with the different approaches, however, point in opposite directions. While Chapters 2 and 3 show that CCT coverage reduces incumbency advantage and increases electoral competition, Chapter 4 shows that incumbents can benefit from the program through fraud. These results can be reconciled after considering the nature and the persistence of each effect in the context of Brazil. The negative effects shown in Chapter 4 are characteristic of the period when beneficiaries are first joining the program, and the manipulation of eligibility by politicians can create credit claiming opportunities. 
Similar results already exist in the literature (De Janvry, Finan, and Sadoulet, 2012; Labonne, 2013). However, given the program characteristics (long permanence, easy access to the central government office for complaints) and the local political environment (two-term limit for mayors), these effects are likely short-lived.

In Chapters 2 and 3, the effects are not only estimated for a longer period (two election cycles), but they also represent a lasting impact of the CCT program. The mechanism proposed requires that beneficiaries see the CCT income shock as permanent, in a way that would change their consumption decisions and utility. Accordingly, these effects should last as long as the CCT benefit is stable, and should survive through different political cycles and incumbents (they do not concern the relationship between voters and one specific politician, i.e., the one that might have given them access to CCT). In other words, after a short period in which credit can be claimed by politicians for getting the program benefits to some voters, the long-term effects of the CCT program that come through the impact of the permanent income increase on vote buying should be positive. They should work against incumbents that replace public goods with clientelistic exchanges.

This set of results also has various policy implications. First, Chapter 3 suggests that policy spillovers matter. In addition to the political impacts of the CCT program, it shows that a small differential in funding for a health program generated a much larger impact on the CCT distribution. Second, CCT programs may increase the ability of national governments to shape local politics in their favor. 
This is even more significant when considering that the literature has shown that politicians are able to reap electoral rewards, at the national level, by implementing the program. Third, when properly shielded from local political interests, a CCT program can be treated as an exogenous income increase. In this context, these findings are useful to inform other policies that aim for a similar impact on the incomes of the poor.

These final results also have significant policy implications. In a program of the scale and scope of BF (and in many other CCT programs around the world), even a small source of fraud could represent billions in losses. This study estimates losses from fraud that could surpass R$1bn a year. Put in perspective, disciplining devices such as audits are a very efficient strategy to reduce fraud, at a relatively low cost (the budget of the auditing unit CGU is less than R$100mn/year).

The overall findings in this thesis raise at least one important question that might be a theme for future research: what are the social welfare implications of the political effects of a CCT program? There are a few potential developments here.

First, while Chapters 2 and 3 show that the CCT program, when effectively bypassing local politicians, is positive for the poor sectors of the population, the overall welfare effects remain uncertain. Anticipating the long-term consequences of less capital-intensive public goods is beyond the ability of the identification strategy employed here. Also, the magnitude of the effects has to be treated carefully when applied to a more general context. The level of the coefficients may well depend on institutional features that are specific to the Brazilian case, and the magnitude may still be affected by a residual direct impact of the FHP funding, even though the evidence on the exclusion restriction convincingly supports the direction of the results presented here.

Second, to the extent that politicians can defraud the CCT program, corruption might be offsetting part of the welfare improvements generated by the program. Although this thesis shows that fraud can counteract some of the incumbency advantage reduction effects shown in Chapter 3, a precise estimation of the overall impact of fraud on the entire range of political outcomes is beyond the scope of the empirical strategy employed here (a back-of-the-envelope calculation, however, indicates that the long-term positive effects of the CCT are at least four times larger than the short-lived negative effects). At least from a pure economic standpoint, it is likely that fraud is inefficient from a social welfare perspective. Assuming that the marginal utility of consumption increases as income decreases, fraud is welfare reducing if very poor households cannot access the full BF benefit due to income underreporting by undeserving households. In 2011, the government estimated that nearly 800 thousand poor households in Brazil did not have the benefit, even though the average BF coverage across municipalities was above 100%.

Bibliography

Alston, Lee J. and Bernardo Mueller. 2006. “Pork for Policy: Executive and Legislative Exchange in Brazil.” Journal of Law, Economics, and Organization 22 (1):87–114.
Alves, Denisard and Chris Timmins. 2003. “Social Exclusion and the Two-Tiered Healthcare System of Brazil.” In J.R. Behrman, A.G. Trujillo, and M. Szekely (eds.), Who is In and Who is Out: Social Exclusion in Latin America. Washington, DC: Inter-American Development Bank.
Ames, Barry and Amy Erica Smith. 2010. “Knowing Left from Right: Ideological Identification in Brazil, 2002-2006.” Journal of Politics in Latin America 2 (3):3–38.
Anderson, Siwan, Patrick Francois, and Ashok Kotwal. 2015. 
“Clientelism in Indian Villages.” American Economic Review 105 (6):1780–1816.
Baez, Javier E., Adriana Camacho, Emily Conover, and Roman A. Zarate. 2012. “Conditional Cash Transfers, Political Participation, and Voting Behavior.” Policy Research Working Papers.
Barro, Robert J. 1973. “The control of politicians: An economic model.” Public Choice 14 (1):19–42.
Boas, Taylor C. and F. Daniel Hidalgo. 2011. “Controlling the Airwaves: Incumbency Advantage and Community Radio in Brazil.” American Journal of Political Science 55 (4):869–885.
Boix, Carles and Susan C. Stokes. 2009. “Political Clientelism.” The Oxford Handbook of Comparative Politics.
Brollo, Fernanda and Tommaso Nannicini. 2012. “Tying Your Enemy’s Hands in Close Races: The Politics of Federal Transfers in Brazil.” American Political Science Review 106:742–761.
Brollo, Fernanda, Tommaso Nannicini, Roberto Perotti, and Guido Tabellini. 2013. “The Political Resource Curse.” American Economic Review 103 (5):1759–96.
Brollo, Fernanda and Ugo Troiano. 2016. “What Happens When a Woman Wins an Election? Evidence from Close Races in Brazil.” Working Paper.
Brusco, Valeria, Marcelo Nazareno, and Carol Stokes. 2004. “Vote Buying in Argentina.” Latin American Research Review 39 (2):66–88.
Calonico, Sebastian, Matias D. Cattaneo, and Rocio Titiunik. 2014. “Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs.” Econometrica 82 (6):2295–2326.
Camacho, Adriana and Emily Conover. 2011. “Manipulation of Social Program Eligibility.” American Economic Journal: Economic Policy 3 (2):41–65.
Clark, Damon and Paco Martorell. 2014. “The Signaling Value of a High School Diploma.” Journal of Political Economy 122 (2):282–318.
Cruz, Cesi, Philip Keefer, and Julien Labonne. 2015. “Incumbent Advantage, Voter Information, and Vote Buying.” Working Paper.
Cruz, Cesi and Christina J. Schneider. 2016. “Foreign Aid and Undeserved Credit Claiming.” Working Paper.
De Janvry, Alain, Frederico Finan, and Elisabeth Sadoulet. 2012. “Local electoral incentives and decentralized program performance.” Review of Economics and Statistics 94 (3):672–685.
De La O, Ana L. 2013. “Do Conditional Cash Transfers Affect Electoral Behavior? Evidence from a Randomized Experiment in Mexico.” American Journal of Political Science 57 (1):1–14.
———. 2015. Crafting Policies to End Poverty in Latin America. Cambridge University Press.
Dell, Melissa. 2010. “The Persistent Effects of Peru’s Mining Mita.” Econometrica 78 (6):1863–1903.
Dixit, Avinash and John Londregan. 1996. “The Determinants of Success of Special Interests in Redistributive Politics.” The Journal of Politics 58 (4):1132–1155.
Duggan, John and César Martinelli. 2016. “The Political Economy of Dynamic Elections: Accountability, Commitment, and Responsiveness.” Working Paper.
Efron, Bradley. 1979. “Bootstrap Methods: Another Look at the Jackknife.” The Annals of Statistics 7 (1):1–26.
Eggers, Andrew C., Anthony Fowler, Jens Hainmueller, Andrew B. Hall, and James M. Snyder. 2015. “On the Validity of the Regression Discontinuity Design for Estimating Electoral Effects: New Evidence from Over 40,000 Close Races.” American Journal of Political Science 59 (1):259–274.
Eggers, Andrew C. and Jens Hainmueller. 2009. “MPs for Sale? Returns to Office in Postwar British Politics.” American Political Science Review 103:513–533.
Ferejohn, John. 1986. “Incumbent Performance and Electoral Control.” Public Choice 50 (1/3):5–25.
Ferraz, Claudio and Frederico Finan. 2008. “Exposing Corrupt Politicians: The Effects of Brazil’s Publicly Released Audits on Electoral Outcomes.” Quarterly Journal of Economics 123 (2):703–745.
———. 2011. “Electoral Accountability and Corruption: Evidence from the Audits of Local Governments.” American Economic Review 101 (4):1274–1311.
Finan, Frederico and Laura Schechter. 2012. “Vote-Buying and Reciprocity.” Econometrica 80 (2):863–881.
Fiszbein, Ariel, Norbert Schady, Francisco H.G. Ferreira, Margaret Grosh, Niall Keleher, Pedro Olinto, and Emmanuel Skoufias. 2009. “Reducing Present and Future Poverty. World Bank Policy Research Report.” World Bank Policy Research Report.
Fried, Brian J. 2012. “Distributive Politics and Conditional Cash Transfers: The Case of Brazil’s Bolsa Familia.” World Development 40 (5):1042–1053.
Fujiwara, Thomas. 2015. “Voting Technology, Political Responsiveness, and Infant Health: Evidence From Brazil.” Econometrica 83 (2):423–464.
Fujiwara, Thomas and Leonard Wantchekon. 2013. “Can Informed Public Deliberation Overcome Clientelism? Experimental Evidence from Benin.” American Economic Journal: Applied Economics 5 (4):241–55.
Gerber, Alan S., Daniel P. Kessler, and Marc Meredith. 2011. “The Persuasive Effects of Direct Mail: A Regression Discontinuity Based Approach.” The Journal of Politics 73:140–155.
Hainmueller, Jens. 2012. “Entropy Balancing for Causal Effects: A Multivariate Reweighting Method to Produce Balanced Samples in Observational Studies.” Political Analysis 20 (1):25–46.
Handa, Sudhanshu and Benjamin Davis. 2006. “The Experience of Conditional Cash Transfers in Latin America and the Caribbean.” Development Policy Review 24 (5):513–536.
Hicken, Allen. 2011. “Clientelism.” Annual Review of Political Science 14 (1):289–310.
Hidalgo, F. Daniel and Simeon Nichter. 2015. “Voter Buying: Shaping the Electorate through Clientelism.” American Journal of Political Science.
IBOPE. 2005. “Compra de Votos nas Eleicoes 2004.” Ibope Opinião.
Imbens, Guido W. and Karthik Kalyanaraman. 2012. “Optimal Bandwidth Choice for the Regression Discontinuity Estimator.” The Review of Economic Studies 79 (3):933–959.
Imbens, Guido W. and Thomas Lemieux. 2008. “Regression discontinuity designs: A guide to practice.” Journal of Econometrics 142 (2):615–635.
IPEA. 2011. “SIPS - Sistema de Indicadores de Percepção Social.” IPEA.
Jacob, Brian A and Lars Lefgren. 2004. “Remedial education and student achievement: A regression-discontinuity analysis.” Review of Economics and Statistics 86 (1):226–244.
Keele, Luke J. and Rocío Titiunik. 2014. “Geographic Boundaries as Regression Discontinuities.” Political Analysis.
Labonne, Julien. 2013. “The local electoral impacts of conditional cash transfers: Evidence from a field experiment.” Journal of Development Economics 104 (0):73–88.
Larreguy, Horacio, John Marshall, and Laura Trucco. 2015. “Breaking Clientelism or Rewarding Incumbents? Evidence from an Urban Titling Program in Mexico.” Working Paper.
Lee, David S. 2008. “Randomized experiments from non-random selection in U.S. House elections.” Journal of Econometrics 142 (2):675–697.
Lee, David S. and Thomas Lemieux. 2010. “Regression Discontinuity Designs in Economics.” Journal of Economic Literature 48 (2):281–355.
Lichand, Guilherme, Marcos F. M. Lopes, and Marcelo C. Medeiros. 2016. “Is Corruption Good For Your Health?” Working Paper.
Lindert, Kathy, Anja Linder, Jason Hobbs, and Benedicte de la Briere. 2007. “The Nuts and Bolts of Brazil’s Bolsa Familia Program: Implementing Conditional Cash Transfers in a Decentralized Context.” World Bank Working Papers.
Manacorda, Marco, Edward Miguel, and Andrea Vigorito. 2011. “Government Transfers and Political Support.” American Economic Journal: Applied Economics 3 (3):1–28.
Nichter, Simeon. 2008. “Vote Buying or Turnout Buying? Machine Politics and the Secret Ballot.” American Political Science Review 102:19–31.
Pande, Rohini and Benjamin Olken. 2012. “Corruption in Developing Countries.” Annual Review of Economics 4:479–505.
Papay, John P., John B. Willett, and Richard J. Murnane. 2011. “Extending the regression-discontinuity approach to multiple assignment variables.” Journal of Econometrics 161 (2):203–207.
Persson, Torsten and Guido E. Tabellini. 2000. “Political Economics: Explaining Economic Policy.” Cambridge, Massachusetts: MIT Press.
Querubin, Pablo and James M. Snyder. 2013. “The Control of Politicians in Normal Times and Times of Crisis: Wealth Accumulation by U.S. Congressmen, 1850–1880.” Quarterly Journal of Political Science 8 (4):409–450.
Reardon, Sean F. and Joseph P. Robinson. 2010. “Regression discontinuity designs with multiple rating-score variables.” Center for Education Policy Analysis Working Papers.
Rego, Walquiria L. Rego and Alessandro Pinzani. 2013. “Vozes do Bolsa Familia.” Editora Unesp.
Rodriguez-Chamussy, Lourdes. 2015. “Local Electoral Rewards from Centralized Social Programs: Are Mayors Getting the Credit?” IDB Working Paper Series (550).
Ruppert, D. and M. P. Wand. 1994. “Multivariate Locally Weighted Least Squares Regression.” The Annals of Statistics 22 (3):1346–1370.
Sekhon, Jasjeet S. 2009. “Opiates for the Matches: Matching Methods for Causal Inference.” Annual Review of Political Science 12 (1):487–508.
Soares, Fábio Veras, Rafael Perez Ribas, and Rafael Guerreiro Osório. 2010. “Evaluating the impact of Brazil’s Bolsa Familia: Cash transfer programs in comparative perspective.” Latin American Research Review 45 (2):173–190.
Stokes, Susan, Thad Dunning, Marcelo Nazareno, and Valeria Brusco. 2013. “Brokers, Voters and Clientelism.” New York: Cambridge University Press.
Stokes, Susan C. 2005. “Perverse Accountability: A Formal Model of Machine Politics with Evidence from Argentina.” American Political Science Review 99 (3):315–325.
Sugiyama, Natasha Borges and Wendy Hunter. 2013. “Whither Clientelism? Good Governance and Brazil’s Bolsa Familia Program.” Comparative Politics 46 (1):43–62.
Vicente, Pedro C. and Leonard Wantchekon. 2009. “Clientelism and vote buying: lessons from field experiments in African elections.” Oxford Review of Economic Policy 25 (2):292–305.
Wong, Vivian C., Peter M. Steiner, and Thomas D. Cook. 2013. 
“AnalyzingRegression-Discontinuity Designs With Multiple Assignment Variables A Compar-ative Study of Four Estimation Methods.” Journal of Educational and BehavioralStatistics 38 (2):107–141.Zajonc, Tristan. 2012. “Essays on Causal Inference for Public Policy.” PhD Disser-tation. Harvard University .Zucco, Cesar. 2013. “When Payouts Pay Off: Conditional Cash-Transfers and VotingBehavior in Brazil: 2002-2010.” American Journal of Political Science 47 (3).111Appendix AAppendix to Chapter 2A.1 Proofs of PropositionsProof of proposition 1. The incumbent maximizes the total utility of votersgiven by the equation below:idiOjI = HϕH [ log2w3 5 21− 3 log2w− t− 21− s3v3]5aϕa[ log2y3 5 21− 3 log2t− sv3]5hϕh [ log2y 5 v3 5 21− 3 log2t− sv3]The first order conditions for the maximization problem are:t : HϕH21− 3w− t− 21− s3v = aϕa21− 3t− sv 5 hϕh21− 3t− svv : HϕH21− 321− s3w− t− 21− s3v 5 2aϕa 5 hϕh321− 3st− sv = hϕhy 5 vThe equations above can be solved for for t and v , with  = ()−) :t =2hϕhs5 ϕ− HϕH3w 5 y2ϕ21− s3− HϕH32hϕh5 ϕ3112A.1. Proofs of Propositionsv =hϕhw − ϕy2hϕh5 ϕ3Taking the value of v calculated above, this is the derivative with respect toincome:yvyy= − ϕ2hϕh5 ϕ3Proof of proposition 2. Using the results obtained above, the value of ytyy andy(b−t)yy are given by:ytyy=ϕ21− s3− HϕH2hϕh5 ϕ3and y2w− t3yy= −ϕ21− s3− HϕH2hϕh5 ϕ3The condition on the parameters that give ytyy S : are given by the inequalitybelow:2a 5 zh3ϕaHϕHSs21− s3It remains to show that this result would not be obtained in the absence of votebuying. Take the utility below, given by the incumbent, in a model without v andwithout the swing group.idiOjI = HϕH [ log2w3 5 21− 3 log2w− t3] 5 aϕa[ log2y3 5 21− 3 log2t3]The maximization problem can be solved for the value of t, which does notdepend on income, as follows:t =aϕawϕ113A.1. Proofs of PropositionsProof of proposition 3. Equation 2.3 shows the total utility faced by thevoter, from both the challenger and the incumbent. 
The incumbent’s vote share isgiven by .I = idiOjI − idiOjC 5 . Substituting into that equation the valuesof t and v found before, and rearranging, I have:.I = aϕa log2 yy 5 vϵ3 5 hϕh log2y 5 vy 5 vϵ3− 5 12Taking the derivative with respect to income, and rearranging,y.Iyy= [aϕay 5 vϵy2ϵ2v − y yvyy 32y 5 vϵ323− hϕh y 5 vϵy 5 v22v − y yvyy 32 − ϵ32y 5 vϵ323] (A.1)For ( yIyy < :), the term in brackets needs to be negative, which results in thefollowing condition:z =ϕhϕa≥ y 5 vyThe ratio of the density of the political preference parameters for the swing andpoor groups has to be at least the ratio of the marginal utilities of consumption forthe same groups. The condition can also be written as a function of the parameters,as:z Sϕah2 by 5 13− 2ϕ− hϕh325 13ϕah3Proof of proposition 4. First, I show that vote buying is decreasing on thedensity of political preference. For the value of v calculated above, the derivativewith respect to ϕa is:yvyϕa=−ahϕaz2w 5 y32hϕh5 ϕ3222114A.1. Proofs of Propositionsand the derivative of this equation with respect to income is:yvyϕayy=−ahϕaz2hϕh5 ϕ3222Now, the derivative of the incumbent’s vote share with respect to income, shownin equation A.1, simplifies to:y.Iyy= ϵ1y 5 vϵ2v − yyvyy3[aϕaz1y− hϕaz 1y 5 v2 − ϵ3ϵ]Rearranging the terms even further, and using the identity (−ϵ)ϵ = LS, I have:y.Iyy= ϵaV[ϕa21y− zB3]Where,B =12y 5 v3=hϕaz5 ϕhϕaz2w 5 y3with yByϕa=2aϕa − HϕH3hz2w 5 y32hϕaz3222w 5 y32Defining  = −ϵ = LL+S, I also have:2v − yyvyy3 =hϕazwhϕaz5 ϕand 2y 5 vϵ3 = hϕaz2wϵ 5 y3 5 ϕyhϕaz5 ϕAnd,V =2v − y yvyy 32y 5 vϵ3=hϕazwhϕaz2wϵ 5 y3 5 ϕywith yVyϕa=wyhzHϕH[ϕaz2wϵ 5 y3 5 ϕy]2S :Now, the equation of interest is the following:115A.1. Proofs of Propositionsy.Iyyyϕa= ϵayVyϕa[ϕa21y− zB3] 5 ϵaV[21y− zB3− ϕaz yByϕa]Rearranging,y.Iyyyϕa= ϵa[yVyϕaϕa21y− zB3 5V21y− zB3−Vϕaz yByϕa]The equation above has three terms inside the brackets. From the assumptionsupporting the result of Proposition 3, I have 2 )y −zB3 < :. 
Thus, because yVyϕL < :,the first two terms inside the brackets are negative. If the sum of these two termshas an absolute value that is higher than Vϕaz yByϕL, then yIyyyϕL< :. A strongercondition to achieve this result is to simply require that yByϕL≥ :. This occurs ifaϕa ≥ HϕH , i.e., the poor group in the population is a relatively more attractivetarget for public good distribution than the wealthy group. Even if, aϕa < HϕH ,the result holds if the value of yByϕLis small enough.Now for the prediction related to budget allocation, the derivative of ytyy withrespect to ϕa gives the result, as shown below.ytyyyϕa=2a 5 zh3HϕH2hϕh5 ϕ32S :Proof of proposition 5. With V and B defined as before, I use the derivativeof the incumbent’s vote share with respect to income shown in equation A.1, whichsimplifies to:y.Iyy= ϵaV[ϕa21y− zB3]This result depends on the value of yIyyyL, which is calculated maintaining thesize of the swing group fixed (i.e. an increase in a causes a decrease of the samemagnitude in H). Defining C = [ϕa )y − ϕhB] < :, the equation of interest is the116A.1. Proofs of Propositionsfollowing:y.Iyyya= − aa 5 hϵVC 5 ϵVC 5 ϵayVyaC 5 ϵaVyCya(A.2)Taking the derivative of V with respect to a,yVya=hϕhw[hϕh2wϵ − ϕy3− 2ϕa − ϕH3ay]2a 5 h322hϕh2wϵ 5 y3 5 ϕy32The condition that makes yVyLS : is hϕh2wϵ − ϕy3 S 2ϕa − ϕH3ay . Nowtaking the derivative of B with respect to a,yBya=2ϕa − ϕH3hϕh2w 5 y3The derivative is positive if 2ϕa − ϕH3. Equation A.2 has four terms. The sumof the first and second term is negative. For yIyyyL< :, it is sufficient to have one ofthe last two terms being negative and higher than the other. For simplicity, I willpresent the conditions to make both terms negative. If ϕa S ϕH , the last term isnegative, so as long as the difference between these densities is small, the third termis also negative. If, however, ϕa < ϕH , automatically the third term is negativeand the fourth is positive. 
As long as we have − yVyL[ϕa2 )y − zB3] S −zϕa yByL , theresult also holds. Overall, the relative distance between the densities in the politicalpreferences of the wealthy and poor groups should be small, independent on thedirection it takes.Now for the prediction related to budget allocation, I take the derivative of theequation below with respect to a:ytyy=ϕ21− s3− HϕH2hϕh5 ϕ3117A.1. Proofs of Propositionswhich is positive as follows:ytyyya=2ϕa − ϕH3[21− s3hϕh5 HϕH ] 5 ϕH2hϕh5 ϕ32hϕh5 ϕ32S :Finally, we show the that yvyyyL< : if ϕa S ϕH , as follows:yvyyya=−2ϕa − ϕH32w 5 y3hϕh2hϕh5 ϕ32Proof of proposition 6. Now, if the politicians differ in type, we might havedifferent allocations of t and v across candidates. For the incumbent, taking thefirst derivative of the voter’s utility with respect to income, and substituting theoptimal values, I have,yidiOjIyy= 21−3ϕs5 HϕHy5aϕa1y521−3 2aϕa 5 hϕh32ϕw 5 y2aϕa 5 hϕh35hϕh1w 5 yTaking the derivative with respect to type,yidiOjIyyy= w[−21− 3 ϕ2aϕa 5 hϕh32[ϕw 5 y2aϕa 5 hϕh3]2−  hϕh2w 5 y32] < :With yjIyyy < :, a clientelistic type would always have a higher utility fall comingfrom an income shock than a public good-type of politician. As for the allocationto pro-poor goods, it is easy to see that if ytyy S :, then ytyyy < :, as it is given bythe expression below:ytyyy= −ϕ21− s3− HϕH2hϕh5 ϕ32118A.2. A Generic ModelA.2 A Generic ModelIn this section, I present the same model using a generic form for the utilityfunction. This version shows the sensitivity of the main theoretical predictionsto the assumption regarding the complementary of the consumption of public andprivate goods in the voter’s utility. I continue to assume that the utility is concave,and all the properties for maximization apply. In this section, I denote the firstand second derivatives of the utility using subscripts.111 Also, define HϕH = H,aϕa = a and hϕh = h, for simplicity.Proposition A1. 
A CCT program will reduce vote buying (yvyy < :) if theconsumption of private and public goods is complementary or neutral in the voter’sutility (jgx ≥ :), and jgxx ≥ :. If they are substitute, or jgxx < :, the result holdsfor a small enough magnitude of jgx.The first order conditions for maximization are now:t : −HjHg 5 ajag 5 hjhg = :v : −H21− s3jHg − asjag − hsjhg 5 hjhx = :Taking the total derivative of the equations above, and rearranging,[HjHgg5ajagg5hjhgg]yt5[H21−s3jHgg−shjhgg−sajagg5hjhgx]yv5[ajagx5hjhgx]yy = :(A.3)111 For example, ULLL denotes the second derivative of the utility of group L with respect to theconsumption of the public good.119A.2. A Generic Model[H21− s3jHgg − shjhgg − sajagg 5 hjhgx]yt5 [H21− s32jHgg 5 s2jagg 5 hs2jhgg − hsjhgx5h22jhxx]yv 5 [−asjagx − hsjhgx 5 hjhxx]yy = :(A.4)Now for simplicity, I rewrite the equations above as vyt 5 wyv 5 yyy = : andwyt5 xyv5 zyy = :. The two equation system can now be solved as yvyy = by−vevx−b2 . Bythe second order conditions of the maximization problem, vx − w2 S :. Also, aftersubstituting equations A.3 and A.4 I have:wy = [H21− s3jHgg − shjhgg − sajagg 5 hjhgx] ∗ [ajagx 5 hjhgx]−vz = [HjHgg 5 ajagg 5 hjhgg] ∗ [asjagx 5 hsjhgx − hjhxx]These terms can be rearranged into the equations below:iEgbO1 = HhjHggjhgx 5 2HjHgg 5 hjhgx3ajagxiEgbO2 = −2HjHgg 5 ajagg3hjhxx 5 h22−jhggjhxx 5 2jhgx323yvyy=iEgbO1 5 iEgbO2vx− w2 (A.5)By the concavity of the utility function, iEgbO2 < :. In the sequence, Iexamine potential cases for the values of jagx and jhgx from iEgbO1. First, it iseasy to see that, if jgx = :, then iEgbO1 < : and yvyy < :.Now, consider the case in which jgx S :. If jgxx ≥ :, then jhgx S jagx. Replacingthe value of jagx by jhgx in the equation above, we have the following condition:120A.2. A Generic Modeljhgx <HjHgghjhgx −HjHgghjhxx 5 h2a5 h32−jhggjhxx 5 2jhgx323−HajHggThe last term in the nominator was set at its highest value when I imposedjagx = jhgx, but it remains negative. 
Thus, the entire term on the right side of the inequality above is negative, which means that any positive values of $U_{Lgx}$ and $U_{Sgx}$ would satisfy the conditions for $dv/dy < 0$. If in fact $U_{gxx} < 0$, and $U_{Sgx} < U_{Lgx}$, it is easy to see from equation A.5 that $dv/dy < 0$ for any values of $U_{Lgx}$, if $-H U_{Hgg} > S U_{Sgx}$. This is still a stronger condition than the one required for the result: if it does not hold, we can still have $dv/dy < 0$ if:

\[ U_{Lgx} < \frac{H U_{Hgg} S U_{Sgx} - (H U_{Hgg} + L U_{Lgg}) S U_{Sxx} + S^2\left( -U_{Sgg} U_{Sxx} + (U_{Sgx})^2 \right)}{-L (H U_{Hgg} + S U_{Sgx})} \]

Now, consider the case in which $U_{gx} < 0$. The following condition must hold for the result:

\[ H S U_{Hgg} U_{Sgx} + (H U_{Hgg} + S U_{Sgx}) L U_{Lgx} < (H U_{Hgg} + L U_{Lgg}) S U_{Sxx} - S^2\left( -U_{Sgg} U_{Sxx} + (U_{Sgx})^2 \right) \]

Proposition A2. If there is no vote buying, a CCT program will shift the budget allocation towards the pro-poor public good ($dt/dy > 0$) only if the consumption of private and public goods is complementary in the voter's utility ($U_{gx} > 0$). With vote buying, the result also holds if public and private consumption are substitutes or neutral ($U_{gx} \leq 0$), for small enough values of $U_{gx}$ and $s$.

If public and private consumption are complements, CCT will increase the marginal utility of the public good for the poor, relatively more than for the wealthy, because $y$ increases and $w$ does not. In this case, $dt/dy > 0$ even without vote buying, as shown below. However, if they are substitutes or neutral, the result can only be achieved with vote buying. It follows from Proposition A1 that, under certain conditions, the incumbent will reallocate resources from $v$ back into $t$ and $w - t$. If $U_{gx} = 0$, then the income shock will have no direct effect on the marginal utility of the public good for the poor, so the amount of $t$ and $w - t$ reallocated will depend on three things. First, the relative marginal utilities $U_{Hg}$, $U_{Lg}$ and $U_{Sg}$. Second, the relative attractiveness of each group as an electoral target. Third, it will depend on $s$.

First, I show that without clientelism the result can only be achieved if $U_{Lgx} > 0$. The first order condition is:

\[ t:\quad -H U_{Hg} + L U_{Lg} = 0 \]

Taking the total derivative of the equation and rearranging, $dt/dy$ is positive if and only if $U_{gx} > 0$, by the second order conditions for maximization of a concave function.

\[ \frac{dt}{dy} = \frac{L U_{Lgx}}{-H U_{Hgg} - L U_{Lgg}} \]

Now, using the notation from Proposition A1, the system shown in equations A.3 and A.4 can be solved as $\frac{dt}{dy} = \frac{be - cd}{ac - b^2}$. Again, by the second order conditions of the maximization problem, $ac - b^2 > 0$. After substituting equations A.3 and A.4, I have:

\[ be = [H(1-s) U_{Hgg} - s S U_{Sgg} - s L U_{Lgg} + S U_{Sgx}] * [-L s U_{Lgx} - S s U_{Sgx} + S U_{Sxx}] \]
\[ -cd = [H(1-s)^2 U_{Hgg} + L s^2 U_{Lgg} + S s^2 U_{Sgg} - 2 S s U_{Sgx} + S U_{Sxx}] * [-L U_{Lgx} - S U_{Sgx}] \]

The sign of the sum of these two terms defines the sign of $dt/dy$. I will split these two terms into the following four terms:

\[ T_1 = -s [H U_{Hgg} + S U_{Sgg} + L U_{Lgg}] * S U_{Sxx} \]
\[ T_2 = -S L U_{Sxx} U_{Lgx} \]
\[ T_3 = -H(1-s) U_{Hgg} * [L U_{Lgx} + S U_{Sgx}] \]
\[ T_4 = S s U_{Sgx} * [L U_{Lgx} + S U_{Sgx}] + S H U_{Hgg} U_{Sxx} \]

So $dt/dy > 0$ if $T_1 + T_2 + T_3 + T_4 > 0$. First, $T_4$ is always positive and $T_1$ is always negative. Also, it is easy to see that, if $U_{gx} \geq 0$, then $T_2 + T_3 > 0$, by the concavity of the utility function. There remains the following condition on $s$ to achieve the result.

\[ s < \frac{Term_1 + Term_2}{[H U_{Hgg} + S U_{Sgg} + L U_{Lgg}] * S U_{Sxx}} \]

where

\[ Term_1 = -S L U_{Sxx} U_{Lgx} - H(1-s) U_{Hgg} * [L U_{Lgx} + S U_{Sgx}] \]
\[ Term_2 = S s U_{Sgx} * [L U_{Lgx} + S U_{Sgx}] + H S U_{Hgg} U_{Sxx} \]

From the equation above, if $U_{gx} < 0$, then $T_2 + T_3 < 0$. Because the first term in the numerator is positive for a concave function, small enough values of $U_{gx}$ will give the result. The condition on $s$ remains, as long as $T_4 > -(T_1 + T_2 + T_3)$. Finally, for $U_{gx} = 0$, we have that $T_2 = T_3 = 0$, so the result holds if $-H(1-s) U_{Hgg} > -s(S U_{Sgg} + L U_{Lgg})$.

Proposition A3.
If the density of the political preference of the swing group ishigh enough around zero, then there are parameter values for which CCT reduce thevote share of the incumbent.First, it is easy to see that the result cannot be achieved without vote buying,as there is no incumbency advantage. Equation 2.1 gives the vote share of theincumbent. The total derivative of this equation with respect to income is zerowithout vote buying.Now with vote buying, taking the total differential of equation 2.3 for the in-cumbent’s portion, I have:yidiOjIyy= −Hj IHg [ytyy5 21− s3yvyy] 5 aj Iag [ytyy5 syvyy]5hj Ihg [ytyy5 syvyy] 5 aj Iax 5 hjIhx 5 hjIhxyvyyFor the challenger’s portion, both poor and swing groups will receive the sameamount of public and private transfers. Therefore, jCa = jCh , and the totaldifferential is:yidiOjCyy= −HjCHg [ytyy5 21− s3yvyy] 5 2a5 h3jCag [ytyy5 syvyy]52a5 h3jCax 5 2a5 h3ϵjCaxyvyyFirst, we know that, with equal allocations, we have HjCHg = Hj IHg . Now,the change in the vote share of the incumbent after CCT is defined by:124A.3. Incorporating Rent Seekingy.Iyy=ytyy2aj Iag 5 hjIhg − 2a5 h3jCag 3 5 aj Iax 5 hj Ihx − 2a5 h3jCax5yvyy2hj Ihx − 2a5 h3ϵjCax 5 asj Iag 5 hsj Ihg − 2 5 h3sjCag 3Rearranging, and using the identity ϕh = zϕa, we find the equation for z thatsupports yIyy < :.z Sa[ ytyy 2jIag − jCag 3 5 2j Iax − jCax 3 5 yvyy 2−ϵjCax 5 sj Ihg − sjCag 3]−h [ ytyy 2j Ihg − jCag 3 5 2j Ihx − jCax 3 5 yvyy 2j Ihx − ϵjCax 5 sj Ihg − sjCag 3]A.3 Incorporating Rent SeekingThe new allocations for v and t in the simpler version of the rent seeking modelare seen below. Under these assumptions, there is always a positive amount of rentbeing extracted, but there is no change in the model predictions for yvyy , ytyy or yIyy .v =hϕh2w− r3− ϕy2hϕh5 ϕ3t =2hϕhs5 ϕ− HϕH32w− r3 5 y2ϕ21− s3− HϕH32hϕh5 ϕ35srhϕhFor the more refined model, the first order conditions are:r :  =g ϕ[HϕH21− 321− s3w− t− 21− s32v 5 r3 5 2aϕa 5 hϕh321− 3st− s2v 5 r3 ]125A.3. 
Incorporating Rent Seekingt : HϕH21− 3w− t− 21− s32v 5 r3 = aϕa21− 3t− s2v 5 r3 5 hϕh21− 3t− s2v 5 r3v : HϕH21− 321− s3w− t− 21− s32v 5 r3 5 2aϕa 5 hϕh321− 3st− s2v 5 r3 = hϕhy 5 vAnd the values of yvyy , ytyy and yryy are:ytyy=2ϕ21− s3− HϕH3g ϕ2g 2hϕh5 ϕ3 5 ϕ3yvyy=−ϕ2g 5 32g 2hϕh5 ϕ3 5 ϕ3yryy=ϕ2g 2hϕh5 ϕ3 5 ϕ3For the reelection probability, the intuition is that a higher maximum rent (r)reduces the incumbent’s vote share because the expected utility of voters from theincumbent in period 2 is lower, while it is not for the challenger. This can been seenin the equation below:idiOjI = HϕH21− 3 log2w− t− 21− s32v 5 :O52r 5 r333 5 aϕa log2y352aϕa 5 hϕh321− 3 log2t− s2v 5 :O52r 5 r333 5 hϕh log2y 5 v3However, the income shock would also have a lower effect on the fall in voteshares (yIyy ), as the incumbency advantage created by vote buying is already lesseffective with the existence of rents, as described above. Below I show the new126A.3. Incorporating Rent Seekingequation for the vote shares.112 The last two terms are the same as before, andthe sum of their derivatives with respect to income should be negative to sustainyIyy < :..I = HϕH21− 3 log2w− t− 21− s32v 5 :O52r 5 r33w− t− 21− s32v 5 r3 352hϕh 5 aϕa3 log2 t− s2v 5 :O52r 5 r33t− s2v 5 r3 321− s32yryy2w− t− 21− s32v 5 :O52r 5 r333]The first two terms differ from the previous model, and their derivative withrespect to income is positive, as seen below. This means that the existence of rentsimposes stronger conditions on the minimum necessary strength of the vote buyingnetworks to achieve yIyy < :.y.I21st3yy= xons1 ∗ [2− ytyy− 21− s3yvyy− 21− s32yryy3221− s322r − r33521− s32yryy2w− t− 21− s32v 5 :O52r 5 r333]with xons1 = HϕH21− 3 1[w− t− 21− s32v 5 r3][w− t− 21− s32v 5 :O52r 5 r3]y.I22ny3yy= xons2 ∗ [2 ytyy− syvyy− s2yryy3s22r − r3 5 s2yryy2t− s2v 5 :O52r 5 r333]112To simplify the analysis, assume that both politicians have an equal true type g.127A.3. 
Incorporating Rent Seekingwith xons2 = 2hϕh 5 aϕa3 1[t− s2v 5 r3][t− s2v 5 :O52r 5 r33]

Appendix B

Appendix to Chapter 3

B.1 Effects on Hiring

The empirical evidence in this study indicates that the budget shift to redistributive services of health and education happens in tandem with an increase in the budget share spent on personnel. I briefly examine the available hiring data[113] to provide insight on how this budget reallocation can affect public services in the long term. The estimation results are shown in Table B.7.

In Brazil, public servants are hired through a competitive, meritocratic process, which includes a written exam. Employees hired through this process have job security guaranteed by legislation. Mayors are allowed to bypass the process in two ways. First, through the creation of "political" positions, usually reserved for high-level executive jobs in the administration. These are often allocated to allies, or used as political exchange. Second, through the hiring of temporary employees for a specific project and a limited tenure.

Table B.7 shows the effects, at the discontinuity, on both the total employment and the share of the permanent work force hired in political jobs. These effects are shown for both the level and rate of change in these variables. The only noteworthy effect is an 8.2pp higher change in the share of temporary hiring in 2008-12. In the data, temporary hiring is generally positively correlated with budget changes.[114] This indicates that it might be a procedure used by mayors to conduct budget changes quickly, without committing to long-term labor costs.

The education level of the labor force provides further evidence that this type of hiring is consonant with the budget reallocation narrative in this study. The share of temporary workers with less than a high school degree is 17pp lower for treated municipalities. This is in line with the expectation that a shift to health and education would increase the demand for high-skilled jobs (e.g. doctors, health agents, teachers), while the fall in capital investment should reduce the demand for low-skilled jobs (construction workers).

While these results are consistent with the budget shift, their persistence over the long term might have implications for public good distribution that are still unknown, and beyond the scope of this study. If temporary jobs outlive the budget shift and are not replaced by permanent ones,[115] at least two potential issues arise. First, there is a trade-off between quality and effort in public service. While the lack of a formal selection process for temporary jobs might negatively impact the quality of public service, the absence of job security might work as a mechanism to extract higher effort. Second, the persistence of this type of hiring could be evidence that mayors are using a different avenue to conduct clientelistic exchanges with the wealthier population. Because there is no significant effect observed in 2008-12 in the political hires, the evidence for this argument seems limited now, but it might be worth examining in the future.

[113] Hiring data comes from the IBGE annual survey of municipalities conducted in 2004, 2008 and 2012. Not all municipalities reported information and the 1999 survey has no data on interns or temporary employees. For this reason, I only use data from 2004-12. Whereas the 2008 and 2012 surveys report separate categories for interns and temporary employees, the 2004 survey only reports the interns category. The patterns in the data indicate that this category also includes temporary employees, and I assume that it does. I only include in the sample the municipalities that had a non-zero change in the calculated variables, as a zero change is more likely to be a reporting error. Nevertheless, the results are not sensitive to this exclusion. Finally, for consistency the sample is limited to the municipalities for which the budget allocation data is also available.

[114] I calculate a coefficient of variation for the budget allocation, as the sum of the squared changes in budget shares across tenures, for 2004-12. This coefficient is positively correlated (0.1) with the absolute change in the share of temporary employees.

[115] In 2007-2011, 77% of the municipalities opened at least one formal hiring process. However, there is no significant difference between treated and non-treated locations.

B.2 Bandwidth Selection

Bandwidth selection is a significant component of RDDs, as the bandwidth is usually the tool used to control the trade-off between bias and efficiency in the estimation. Although there are plenty of approaches to select the bandwidth in one-score RD designs,[116] there is still sparse literature discussing similar methodologies for the MRDD. In fact, most of the literature applying some form of MRDD does not discuss the issue at all.[117] I will use a plug-in algorithm for bandwidth selection building on the work of Zajonc (2012). In this section I briefly discuss the main practical challenges in implementing the procedure in the multivariate context. I describe the algorithm in that paper and how it tackles those problems, and I propose alterations. All technical notation in this section is taken from that paper to facilitate reference.

The method follows the plug-in algorithm developed for the single-score case in Imbens and Kalyanaraman (2012). In a nutshell, plug-in methods aim to find an optimal bandwidth by minimizing an expression for the mean squared error (MSE) at the cut-off, in three main steps. First, a theoretical expression for the minimum MSE is calculated, as a function of the bandwidth and other parameters from the data. The MSE expression is a combination of terms for the bias and the variance in the estimation.
Second, these parameters are estimated using the data, with the exception of the bandwidth. Third, they are plugged back into the original MSE expression to derive the optimal bandwidth.

[116] See Imbens and Lemieux (2008); Imbens and Kalyanaraman (2012); ? for various discussions of different methodologies of bandwidth selection for the single-score RDD.

[117] The discussion is absent in most cases either because the multiple variables are collapsed to one and the one-dimensional approaches are used, or because the methodology is parametric. Dell (2010) shows the results for different bandwidths in longitude and latitude, but there is no formal approach to defining their levels.

The main difficulties for the two-score case are described here. First, the plug-in method does not have a closed form solution for more than one bandwidth. Zajonc (2012) uses the same bandwidth for both scores for a feasible solution. Second, bandwidths cut the kernel differently along the frontier, so the shape of the treated and non-treated subsets is endogenous to the bandwidth selection near the origin.[118] Here, the solution was to calculate the bandwidth only for points far from the origin. Finally, unreliable bandwidth values can arise due to the assumptions involved in the calculation of the MSE parameters. Imbens and Kalyanaraman (2012) correct this problem with a regularization term. Zajonc (2012) calculates the optimal bandwidth for various points and uses the minimum value for under-smoothing.

I propose a different approach for tackling these three problems. First, I allow the bandwidths to be different for each score variable. Thus, instead of using an expression for the single optimal bandwidth, I will minimize the MSE expression numerically for different pairs of bandwidths. This provides efficiency gains in nearly all cases, as the two-dimensional bandwidth outperforms the unique one in terms of MSE.
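As a rough illustration of why a search over pairs of bandwidths cannot do worse than a single common bandwidth, the sketch below minimizes a stylized squared-bias-plus-variance MSE over a grid. The function `stylized_mse`, its constants, and the grid are hypothetical and chosen only for illustration; they are not the MSE expression or the parameter estimates used in this study.

```python
import itertools


def stylized_mse(h_p, h_h, b_p=0.8, b_h=0.2, sigma2=1.0, n=500, density=1.0):
    """Squared bias plus variance for a pair of bandwidths.

    All constants are hypothetical: the bias grows with the squared
    bandwidths, and the variance shrinks with the area h_p * h_h.
    """
    bias = b_p * h_p ** 2 + b_h * h_h ** 2
    variance = sigma2 / (n * h_p * h_h * density)
    return bias ** 2 + variance


def grid(start, stop, step):
    """Evenly spaced candidate bandwidths, inclusive of both ends."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 10) for i in range(count)]


def best_pair(candidates):
    """Search over all pairs (h_p, h_h), allowing different bandwidths."""
    return min(itertools.product(candidates, repeat=2),
               key=lambda h: stylized_mse(*h))


def best_common(candidates):
    """Search restricted to a single bandwidth shared by both scores."""
    return min(((h, h) for h in candidates), key=lambda h: stylized_mse(*h))


candidates = grid(0.05, 2.0, 0.05)
pair = best_pair(candidates)
common = best_common(candidates)
print("pair:", pair, "MSE:", stylized_mse(*pair))
print("common:", common, "MSE:", stylized_mse(*common))
```

Because the common-bandwidth search is a restriction of the pairwise search, the pairwise minimum is never larger. In this stylized setup the score with the larger bias constant receives the smaller bandwidth.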
Furthermore, I use an elliptical bandwidth instead of a rectangular one, to ensure that the points within a lower distance from the cut-off are used.[119]

Second, if the sample is not balanced along the frontier, the optimal bandwidth calculated away from the origin will not be optimal, or relevant, for the estimation at the origin. This is a key problem in this study, as the area of the frontier used for the main specification is near the origin. Thus, I expand the algorithm to include the calculation of optimal bandwidths at this point (Pop=30 and HDI=0.7), where the kernel will cut the treatment frontier in a well-defined manner. This change requires the estimation of cross derivatives from a second-degree polynomial on the score variables, which was not required before.

Third, for regularization I will run a constrained minimization of the MSE expression, using a cap for the bandwidth. The nature of the problem of bandwidth selection is to find the optimal value, in light of the trade-off between bias and variance. However, whereas variance is salient in the regression results, bias is not. Thus, proposing a cap effectively limits the amount of bias that the researcher is willing to accept, at a cost of higher variance. There remains the issue of setting an appropriate cap.

[118] The origin is defined as the point where the two segments of the two-dimensional treatment frontier connect. In the case of this study, it is where population = 30 and HDI = 0.7.

[119] The algorithm will produce the values for the sides of a rectangular bandwidth. I will use an ellipse centered at the same cut-off with an area that equals the area of the rectangle produced by the selection algorithm. It will have radii that are slightly larger than the sides of the rectangular bandwidth, but it will exclude the distant points in the corners of the rectangle.
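The elliptical bandwidth described in footnote 119 can be sketched as follows. The example assumes that the rectangle produced by the selection algorithm has half-widths (h_p, h_h) around the cut-off and that the ellipse keeps the same aspect ratio; both are assumptions made for this illustration, and the function names are hypothetical.

```python
import math


def equal_area_ellipse(h_p, h_h):
    """Semi-axes of an ellipse with the same area and aspect ratio as the
    rectangle [-h_p, h_p] x [-h_h, h_h].

    Rectangle area: 4 * h_p * h_h. Ellipse area: pi * r_p * r_h.
    Scaling both half-widths by sqrt(4 / pi) equates the two areas.
    """
    scale = math.sqrt(4.0 / math.pi)
    return scale * h_p, scale * h_h


def in_ellipse(d_p, d_h, r_p, r_h):
    """True if a point at normalized distances (d_p, d_h) from the cut-off
    lies inside the ellipse with semi-axes (r_p, r_h)."""
    return (d_p / r_p) ** 2 + (d_h / r_h) ** 2 <= 1.0


r_p, r_h = equal_area_ellipse(1.0, 0.8)
print("semi-axes:", r_p, r_h)
print("corner of the rectangle kept?", in_ellipse(1.0, 0.8, r_p, r_h))
print("point just beyond the side kept?", in_ellipse(1.05, 0.0, r_p, r_h))
```

Under these assumptions each radius is about 13% longer than the corresponding half-width, so points slightly beyond the sides of the rectangle are kept while its corners are dropped.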
For simplicity, I will run the original algorithm for the first stage (CCT coverage) at points away from the origin, and select one minimum unique bandwidth for the scores, using that as a cap for the new algorithm.

For reference, all the other parameters, including the pilot bandwidths, are kept as proposed by the original algorithm. Finally, whenever the description of an equation is not detailed, it is because it fully replicates a step shown in Zajonc (2012). The algorithm is described below. Items 1-4 are taken from that paper, with a small adjustment to item 4. Steps 5-7 were modified as described above. The score variables are also normalized by their standard deviation to be in a common scale.

1. Using the entire sample, calculate the standard deviations for population ($\sigma_p$) and HDI ($\sigma_h$).

2. Select a pilot bandwidth for each variable using Scott's rule (e.g. $\hat{h}_p = \hat{\sigma}_p n^{-1/6}$) and limit the sample to those bandwidths.

3. Calculate the conditional variance $\hat{v}(p, h)$ and density $\hat{f}(p, h)$.

4. Apply again a rule-of-thumb bandwidth to create a subsample for the estimation of the second derivatives, which are calculated using a second-degree local polynomial regression on both sides of the discontinuity. Here, although I keep the rule-of-thumb bandwidth originally proposed, I add a cap of 1.65 in order to avoid using the extreme points in the estimation of the second derivatives, i.e. 5% of the sample on each side. Estimate the second derivatives for both sides.

5. Plug in the parameters calculated above and the Hessian matrix $M^j$, where $j = (0, 1)$ represents the treatment status, in the following formula for the MSE:

\[ MSE = (Bias_1 + Bias_2 + Bias_3)^2 + 2 * Variance \tag{B.1} \]

where,

\[ Bias_1 = (M^0_{12} - M^1_{12})\, h_p h_h \sigma_p \sigma_h\, C_2 \]
\[ Bias_2 = (M^0_{11} - M^1_{11})\, h_p^2 \sigma_p^2\, C_3 \]
\[ Bias_3 = (M^0_{22} - M^1_{22})\, h_h^2 \sigma_h^2\, C_4 \]
\[ Variance = \frac{v(p, h)}{n\, h_p h_h \sigma_p \sigma_h\, f(p, h)}\, C_1 \]

The constants $(C_1, C_2, C_3, C_4)$ are specific to the kernel and the region of the frontier used for the MSE estimation.
The horizontal frontier is defined as the region where population varies between 0 and 30 (thousand) and HDI=0.7. The vertical frontier has population = 30 and HDI in the range 0.5-0.7. The values of $(C_1, C_2, C_3, C_4)$ are shown in the table below. Each cell lists the values for (Away with Pop=30, Origin, Away with HDI=0.7).

| Kernel | C1 | C2 | C3 | C4 |
|---|---|---|---|---|
| Edge | (3.20,11.91,3.20) | (0.00,-0.06,0.00) | (0.08,-0.05,-0.05) | (-0.05,-0.05,0.08) |
| Epanechnikov | (2.70,9.80,2.70) | (0.00,-0.07,0.00) | (0.10,-0.06,-0.06) | (-0.06,-0.06,0.10) |
| Normal | (2.06,7.28,2.06) | (0.00,-0.11,0.00) | (0.15,-0.08,-0.08) | (-0.08,-0.08,0.15) |
| Uniform | (2.00,7.00,2.00) | (0.00,-0.13,0.00) | (-0.17,-0.08,-0.08) | (-0.08,-0.08,-0.17) |

The term $Bias_1$ goes to zero when the equation is estimated away from the origin. The expressions above are an expansion of the components of bias and variance used in the theoretical MSE expression defined in Ruppert and Wand (1994). They are reproduced below with the notation from this paper.

Conditional bias:

\[ E[\hat{m}(p, h) - m(p, h) \mid (P, H)] = \tfrac{1}{2}\, e_1' N_{p,h}^{-1} \int_{D_{p,h,H}} w' K(u)\, u' H^{1/2} M(p, h) H^{1/2} u \, du + O_p(\mathrm{tr}(H)) \]

Conditional variance:

\[ V[\hat{m}(p, h) \mid (P, H)] = \left[ n^{-1} |H|^{-1/2}\, e_1' N_{p,h}^{-1} T_{p,h} N_{p,h}^{-1} e_1 / f(x) \right] * v(p, h) * (1 + o_p(1)), \]

where

\[ N_{p,h} = \int_{D_{p,h,H}} w' w\, K(u)\, du \]
\[ T_{p,h} = \int_{D_{p,h,H}} w' w\, K(u)^2\, du \]
\[ w = [1 \;\; u'] \]

As for notation, $H^{1/2}$ is a bandwidth matrix assumed to be diagonal as $\mathrm{diag}([h_p \sigma_p,\; h_h \sigma_h])$, $M^0$ and $M^1$ are the Hessian matrices for the second-degree polynomial estimated using pilot bandwidths for the non-treated and treated subsamples, respectively. $K(u)$ is the kernel, $u = [u_1, u_2]'$, and $e_1$ is defined as a vector of the same length as $w$, with 1 as the first element and 0 in all other elements. $D^1_{x,H_1}$ and $D^0_{x,H_0}$ are the sets of treatment and control points, respectively, within a bandwidth from $x$ and within the support of the kernel $K$.

6. Find the pair $(h_p, h_h)$ that minimizes the MSE expression, constraining the maximum bandwidth to a cap.
I will use the cap of 1.0 for the entire sample, which is the minimum bandwidth found for the instrument using the original algorithm and one unique bandwidth for the two variables under the edge and Epanechnikov kernels.

7. Steps 1-6 are repeated for 5 points away from the origin on both the vertical and horizontal frontiers. Where HDI = 0.7, I use Pop=(5-15) in 2.5 intervals and where Pop=30, I use HDI=(0.575-0.625) in 0.0125 intervals. Pick the minimum of $(h_p, h_h)$ for each frontier and use it as a starting point (first bin on each side). Calculate the $(h_p, h_h)$ for the origin and linearly interpolate for all the k points used to estimate the CATE between the extremes and the origin. For example, for incumbency advantage under the edge kernel, the minimum bandwidth in the horizontal dimension is $(h_p, h_h) = (0.68, 0.87)$, the bandwidth at the origin was $(h_p, h_h) = (1.00, 1.00)$ and the minimum bandwidth in the vertical dimension was $(h_p, h_h) = (1.00, 0.79)$.

8. Steps 1-7 are repeated for each kernel.

Figure B.1: CCT Coverage: One-dimension RDD
The vertical line represents the treatment frontier (Pop = 30,000 and HDI = 0.7). Blue dots are the average CCT coverage for each one of the 25 bins on each side of the discontinuity. The x-axis shows the minimum distance of each observation to the treatment frontier. The red lines are fitted by local linear regression on the unbinned data. The grey lines are the 90% confidence level.

Figure B.2: CCT Coverage: Three Dimensional Parametric Fit
The surface is fitted with a quadratic polynomial on normalized population and HDI. The sample includes all observations within 2 distance units from treatment.

Figure B.3: Conditional ATE for the log of Total Doctor's Visits

Figure B.4: Conditional ATE for the Total Doctor's Visits per Family
The y-axis shows the conditional ATEs along the treatment frontier. The left side has HDI=0.7 and population in 7,500-30,000.
The right side has population=30,000 and HDI in 0.7-0.5875. Regressions use the edge kernel, year and state effects, and controls. Bandwidth is set at 0.9 standard deviations for both scores. The size of the dots reflects the number of observations in each bin. I repeat the bin located at the origin on both sides.

Table B.1: Optimal Bandwidths

| Kernel | Edge | Epanechnikov | Normal | Uniform |
|---|---|---|---|---|
| Basic Health Funds | (0.71,0.99) | (0.64,0.97) | (0.53,0.94) | (0.51,0.90) |
| CCT Coverage | (1.00,1.00) | (1.00,0.98) | (1.00,0.94) | (1.00,0.92) |
| Incumbent's Vote Share | (0.99,0.96) | (0.99,0.93) | (0.99,0.90) | (0.99,0.89) |
| Number of candidates | (1.00,0.98) | (0.99,0.96) | (0.99,0.93) | (0.99,0.92) |
| Margin of Victory | (1.00,1.00) | (1.00,1.00) | (1.00,0.99) | (1.00,0.99) |
| Campaign Donations (log) | (1.00,1.00) | (0.99,1.00) | (0.86,0.96) | (0.82,0.95) |
| Challenger's Education | (0.75,0.75) | (0.75,0.75) | (0.75,0.75) | (0.75,0.75) |
| Challenger's Party | (1.00,1.00) | (1.00,1.00) | (1.00,0.99) | (1.00,0.98) |
| Turnout | (1.00,1.00) | (0.99,1.00) | (0.86,0.98) | (0.81,0.97) |
| Total Budget (log) | (0.94,1.00) | (0.85,0.99) | (0.71,0.96) | (0.68,0.95) |
| Capital Investment | (1.00,1.00) | (1.00,1.00) | (1.00,1.00) | (1.00,1.00) |
| Personnel Spending | (1.00,1.00) | (1.00,1.00) | (1.00,0.96) | (0.99,0.95) |
| Other Spending | (1.00,1.00) | (1.00,0.99) | (0.99,0.95) | (0.99,0.94) |
| Education | (1.00,1.00) | (0.99,1.00) | (0.98,1.00) | (0.97,1.00) |
| Health | (1.00,0.96) | (1.00,0.94) | (0.99,0.92) | (0.99,0.91) |
| Administration | (1.00,1.00) | (1.00,1.00) | (0.98,0.90) | (0.98,0.84) |
| Urbanization | (1.00,1.00) | (1.00,1.00) | (1.00,1.00) | (1.00,1.00) |
| Social Security | (0.98,1.00) | (0.96,1.00) | (0.93,0.96) | (0.93,0.95) |
| Transportation | (0.99,0.98) | (0.99,0.97) | (0.99,0.93) | (0.99,0.93) |

Average optimal bandwidths calculated for the preferred frontier segment. They are expressed as (Pop., HDI).

Table B.2: Main Results: Other Political Outcomes
Coefficient, [90% CI], {Obs per bin}

| Outcome | P.T. Mean (1) | RDD (2), 2008-12 | OLS (3), 2008-12 | IV (4), 2008-12 | RDD (5), 2000-04 |
|---|---|---|---|---|---|
| ELECTIONS | | | | | |
| Turnout (%) | 84.55 [83.60,85.41] | 0.44 [-1.13,2.03] {468} | 0.03 [-0.01,0.06] {468} | 0.03 [-0.17,0.27] {468} | -0.06 [-1.52,1.37] {494} |
| BUDGET SHARES (by type) | | | | | |
| Total Budget (a) (R$ million) | 149.14 [139.80,161.01] | -0.04 [-0.13,0.03] {325} | 0.00 [0.00,0.00] {325} | 0.00 [-0.01,0.00] {325} | 0.08 [-0.02,0.20] {438} |
| Other Spending (% share) | 42.75 [41.59,43.91] | -1.15 [-3.05,0.82] {351} | 0.02 [-0.02,0.05] {351} | -0.10 [-0.34,0.05] {351} | -0.81 [-3.08,1.35] {477} |
| BUDGET SHARES (by function) | | | | | |
| Administration (% share) | 14.36 [13.42,15.40] | -1.15 [-3.16,0.96] {353} | 0.01 [-0.03,0.04] {353} | -0.13 [-0.67,0.25] {353} | 0.30 [-1.94,2.83] {454} |
| Urbanization (% share) | 9.75 [9.10,10.40] | -0.12 [-1.76,1.24] {353} | -0.01 [-0.03,0.02] {353} | -0.03 [-0.50,0.19] {353} | 2.05* [0.13,4.48] {454} |
| Social Security (% share) | 6.25 [5.59,7.10] | -0.06 [-1.50,1.12] {347} | -0.01 [-0.02,0.01] {347} | 0.00 [-0.23,0.18] {347} | -0.45 [-1.63,0.60] {447} |
| Transportation (% share) | 2.71 [2.28,3.21] | -0.79** [-1.34,-0.19] {344} | -0.01 [-0.02,0.00] {344} | -0.10 [-0.29,0.05] {344} | -1.09** [-2.03,-0.19] {441} |

(a) Estimated in log(Variable). (b) Coefficient for the period 2000-04 contains data from 2004 only. Significant at: 99% ***, 95% **, 90% *. Standard errors are clustered by municipality. The coefficients represent the ATE for the preferred frontier segment, under the edge kernel. All regressions include year and state effects. The list of included municipal-level controls is described in the text. Pre-treatment means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment.

Table B.3: Political Outcomes at Different FPM Population Thresholds
Coeff., [90% CI], {Obs. per bin}

| Outcome | Pop. nod = 23,772 | Pop. nod = 37,356 |
|---|---|---|
| Basic Health Funds (a) (R$ million) | -0.07* [-0.14,0.00] {1053} | 0.02 [-0.10,0.14] {354} |
| CCT Coverage (% over target) | -3.00 [-7.00,0.77] {1053} | -4.81 [-10.08,1.61] {354} |
| Incumbent's Vote Share (%) | 2.08 [-1.90,5.99] {804} | 0.36 [-7.94,6.51] {271} |
| Number of candidates (number) | -0.16 [-0.34,0.01] {804} | 0.07 [-0.19,0.33] {271} |
| Margin of Victory (pct points) | 4.18* [0.44,8.29] {804} | -0.51 [-12.94,8.81] {271} |
| Campaign Donations (a) (R$'000) | 0.44* [0.03,0.91] {761} | -0.26 [-0.88,0.38] {260} |
| Challenger's Education (% with high school) | -0.01 [-0.12,0.10] {637} | 10.56 [-7.30,31.92] {219} |
| Challenger's Party (% clientelistic) | 0.10 [-0.06,0.25] {637} | -5.70 [-37.88,28.66] {219} |
| Capital Investment (% share) | 0.10 [-1.18,1.28] {591} | 0.97 [-1.97,4.27] {207} |
| Personnel Spending (% share) | 0.48 [-1.72,2.65] {591} | 1.05 [-3.37,5.46] {207} |
| Education (% share) | 1.22 [-0.37,2.72] {591} | -1.72 [-4.90,0.80] {203} |
| Health (% share) | -0.91 [-2.18,0.24] {591} | 1.64 [-2.08,6.62] {203} |

(a) Estimated in log(Variable). Significant at: 99% ***, 95% **, 90% *. In order to isolate the effect of the FPM, treatment is attributed to municipalities only on the basis of population, so in this case municipalities with population below the cutoff and HDI above 0.7 are considered as treated. Standard errors are clustered by municipality. The coefficients represent the ATE for the frontier with HDI between 0.65-0.70, and population as indicated in the table header. Bandwidths are set as one standard deviation of population and HDI. Regressions include year and state effects, and the municipal-level controls listed in the text.

Table B.4: Change in the Political Impact of the FHP post-2004
Coefficient, [S.E.]. Columns (1A) and (1B): Does it have FHP? 1=Yes. Columns (2A) and (2B): FHP in 2008-12.

| Outcome | (1A) | (1B) | (2A) | (2B) | {Obs.} |
|---|---|---|---|---|---|
| ELECTIONS | | | | | |
| Incumbent's Vote Share (%) | 2.90*** [0.53] | 2.76*** [0.51] | -2.44* [1.29] | -2.55** [1.25] | {7675} |
| Number of candidates (number) | -0.05** [0.02] | -0.06** [0.02] | 0.05 [0.05] | 0.06 [0.05] | {7675} |
| Margin of Victory (pct points) | 0.67 [0.55] | 0.85 [0.53] | -0.12 [1.31] | -0.78 [1.24] | {7675} |
| Challenger's Education (% with high school) | 1.01 [1.94] | -0.02 [1.88] | 8.75* [4.61] | 5.07 [4.43] | {5877} |
| Challenger's Party (% clientelistic) | -0.50 [2.01] | -0.29 [1.90] | -2.78 [4.82] | -3.58 [4.59] | {5879} |
| BUDGET SHARES (by type) | | | | | |
| Capital Investment (% share) | 0.06 [0.19] | -0.05 [0.19] | -0.77** [0.38] | -0.78** [0.38] | {7606} |
| Personnel Spending (% share) | -0.71** [0.25] | -0.58** [0.25] | 2.89*** [0.45] | 2.62*** [0.47] | {7606} |
| Education (% share) | -0.23 [0.17] | -0.22 [0.17] | -0.30 [0.39] | -0.41 [0.41] | {7382} |
| Health (% share) | 1.22*** [0.16] | 1.27*** [0.17] | 1.02** [0.31] | 0.85** [0.30] | {7382} |

(a) Estimated in log(Variable). Significant at: 99% ***, 95% **, 90% *. The first two columns estimate the effect on political outcomes of having the FHP present in the municipality in 2000. The following two columns (2A and 2B) estimate this effect in 2008-12. Columns (1A) and (2A) include dummies for micro-regions (10 municipalities on average) and columns (1B) and (2B) dummies for macro-regions (40 municipalities on average). The sample includes only smaller municipalities (below 60 thousand) and is not limited by poverty levels in order to allow enough variation in the FHP presence in 2008-12. All regressions include a dummy for 2004 and control for latitude, longitude, their interaction and 2000 HDI of the municipality. The regression for the vote shares of the incumbent also includes the past vote share of the candidate as a control.

Table B.5: Robustness of the Exclusion Restriction
Coefficient, [90% CI], {Obs. per bin}

| Outcome | Segment B (Strong IV), Controls: Yes | Segment B (Strong IV), Controls: No | Segment A (Weak IV), Controls: Yes | Segment A (Weak IV), Controls: No |
|---|---|---|---|---|
| CCT Coverage (% over target) | 7.90*** [3.84,11.60] | 6.97** [2.15,11.38] | 4.14** [0.76,7.54] | 3.85* [0.17,7.39] |
| Incumbent's Vote Share (%) | -7.73*** [-12.42,-3.60] | -7.60*** [-11.95,-3.03] | 0.78 [-2.30,4.27] | 1.02 [-1.99,4.58] |
| Number of candidates (number) | 0.39*** [0.20,0.63] | 0.35*** [0.14,0.59] | 0.00 [-0.12,0.12] | 0.00 [-0.12,0.12] |
| Margin of Victory (pct points) | -5.96** [-11.69,-1.74] | -5.69* [-11.09,-0.91] | -2.66 [-6.68,1.23] | -2.58 [-6.39,1.29] |
| Campaign Donations (a) (R$'000) | -0.41* [-0.89,-0.06] | -0.40* [-0.89,-0.04] | -0.04 [-0.36,0.29] | -0.05 [-0.38,0.29] |
| Challenger's Education (% with high school) | 19.27** [4.56,33.57] | 17.65** [4.33,30.17] | 2.51 [-9.19,14.28] | 1.74 [-10.00,13.73] |
| Challenger's Party (% clientelistic) | -19.30* [-33.85,-3.42] | -20.19* [-35.42,-3.49] | -1.83 [-14.91,9.73] | -2.37 [-15.57,9.43] |
| Capital Investment (% share) | -1.73* [-3.34,-0.22] | -1.43 [-3.01,0.02] | -0.79 [-1.90,0.16] | -0.60 [-1.68,0.41] |
| Personnel Spending (% share) | 2.85** [0.66,5.20] | 2.73** [0.60,5.19] | 0.36 [-1.00,1.74] | -0.03 [-1.44,1.40] |
| Education (% share) | 1.84** [0.34,3.91] | 0.36 [-1.64,2.56] | 0.03 [-1.12,1.32] | -0.41 [-1.69,0.91] |
| Health (% share) | 2.17*** [1.05,3.73] | 2.61*** [1.21,4.28] | 1.20** [0.34,2.10] | 1.24** [0.27,2.26] |

(a) Coefficient estimated in log(Variable). Significant at: 99% ***, 95% **, 90% *. Segment A: 7,500≤Pop≤17,500, HDI = 0.7. Segment B: 0.65≤HDI≤0.70, Pop = 30,000. Standard errors are clustered by municipality. The average effect is estimated using the edge kernel for 6 bins. Regressions include year and state effects. The municipal controls are listed in the text.

Table B.6: Health Outcomes
Coefficient, [90% CI], {Obs. per bin}

| Outcome | PT Mean [90% CI] | 2008-12 | 2000-04 |
|---|---|---|---|
| Number of Visits (a) ('000 per year, 4-year avg.) | 86.54 [83.39,89.50] | 0.11*** [0.04,0.17] {476} | 0.04 [-0.10,0.20] {542} |
| Children below 2y (a) ('000 in any given month) | 0.38 [0.36,0.39] | 0.10** [0.02,0.17] {476} | 0.07 [-0.06,0.21] {542} |
| Number of Babies Born (a) ('000 per year, 4-year avg.) | 0.11 [0.10,0.11] | 0.10* [0.01,0.19] {476} | 0.10 [-0.02,0.23] {542} |
| Visits per Family (per year) | 11.41 [11.07,11.75] | 1.06** [0.22,1.87] {476} | -0.54 [-1.74,0.49] {542} |
| Mortality Rate (children less than 11m) | 10.79 [9.59,12.18] | 1.30 [-0.59,3.20] {470} | 1.65 [-1.66,4.97] {529} |
| Pre-Natals (% of pregnancies) | 66.27 [63.20,69.26] | 0.49 [-5.01,6.19] {423} | 3.20 [-3.12,10.17] {408} |
| Children < 2y Vaccinated (% of children) | 95.71 [94.93,96.32] | 0.03 [-0.87,1.01] {476} | 1.46 [-1.04,3.75] {542} |

(a) Estimated in log(Variable). Significant at: 99% ***, 95% **, 90% *. Standard errors are clustered by municipality. All regressions include year and state effects and are estimated using the edge kernel and a bandwidth of 0.90, set as before. Pre-treatment means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment, in 2008-12.

Table B.7: Hiring Outcomes

| Outcome | PT Mean [90% CI] | Coefficient [90% CI] | {Obs. per bin} |
|---|---|---|---|
| Total Employment (a) ('000) | 1.19 [1.12,1.27] | -0.04 [-0.14,0.05] | {475} |
| Share of Political Employment (a) (%) | 9.31 [8.25,10.57] | -0.05 [-0.28,0.17] | {476} |
| Share of Temporary Employment (a) (%) | 19.26 [15.23,23.19] | 0.17 [-0.15,0.56] | {431} |
| Total Employment (chg.) (pp) | 28.06 [22.63,38.47] | 0.66 [-10.29,9.38] | {474} |
| Share of Political Employment (chg.) (pp) | 16.68 [2.79,64.59] | -5.65 [-24.24,2.27] | {471} |
| Share of Temporary Employment (chg.) (pp) | 2.11 [-1.37,4.96] | 8.19*** [4.09,13.63] | {464} |
| Share of Less Educated Employees (% of total employment) | 28.06 [25.44,30.80] | -4.27 [-9.55,0.92] | {421} |
| Share of Less Educated Employees (% of temporary employment) | 30.81 [25.28,42.88] | -17.20*** [-34.89,-7.30] | {421} |
| Formal hiring process in 2007-11 (1=Yes) | 0.76 [0.66,0.83] | -0.06 [-0.23,0.11] | {476} |

(a) Estimated in log(Variable). Significant at: 99% ***, 95% **, 90% *.
The coefficients represent the ATE for the preferred frontier segment. Standard errors are clustered by municipality. All regressions include year and state effects and are estimated using the edge kernel and a bandwidth of 0.9 standard deviations of population and HDI. The list of municipal-level controls is in the text. Pre-treatment means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment.

Appendix C
Appendix to Chapter 4

C.1 Income Reporting, Cancellation and Distribution of BF Benefits

There are two main reasons why the actual distribution of benefits in 2012 will not immediately reflect the fraud in that same year. First, 10% of all CadUnico households entered the system for the first time in 2012. This is 22% of all updates in the election year. Local offices can manipulate income reporting, but not the timing or approval of the actual benefit, which in many cases takes several months. Second, BF has had a permanence rule since 2009, stating that households whose reported income rises above the threshold are still eligible to keep their former benefit for a grace period that could last up to two years (i.e. until their next scheduled update).

Accordingly, Table C.1 below shows that 26% of the CadUnico households should be receiving BF benefits (yellow cells), but are not (mostly due to recent enrollment/updates), and 11% of households are receiving benefits while they should not, mostly due to the permanence rule.

C.1. Figures and Tables

Table C.1: Bolsa Família Targeting

% of Households, by reported p.c. monthly income (R$) | [0,70] | (70,140] | (140,311] | Total
Full Benefits | 43.3 | 7.6 | 3.5 | 54.3
Variable Benefit Only | 0.3 | 4.7 | 0.9 | 6.0
No Benefit | 14.4 | 11.0 | 14.3 | 39.7
Total | 58.0 | 23.4 | 18.6 | 100.0

The cells contain the percentage of households in each category. The yellow cells on the left side represent households that are receiving fewer benefits than they should.
The blue cells (right side) are the households that are receiving more benefits than they should.

Table C.2: Balance of Fixed and Pre-determined Variables

Variable | PT Mean [S.E.] (bandwidth 16.0) | Coefficient [S.E.], bandwidth 13.0 | Coefficient [S.E.], bandwidth 16.0 | Coefficient [S.E.], bandwidth 19.0 | Coefficient [S.E.], optimal bandwidth | Optimal bandwidth {Obs.}
Latitude (degrees) | -7.962 [0.188] | -0.565 [1.002] | -0.385 [0.904] | -0.298 [0.824] | -0.307 [0.831] | 18.69 {688}
Longitude (degrees) | -41.986 [0.247] | -1.157 [1.156] | -0.821 [1.067] | -0.581 [0.988] | -1.471 [1.222] | 11.45 {498}
Area^a (km2) | 0.701 [0.001] | 0.325 [0.245] | 0.283 [0.225] | 0.245 [0.207] | 0.326 [0.245] | 12.97 {544}
Population, 2008^a (thousand) | 13.230 [1.033] | -0.066 [0.154] | -0.023 [0.142] | -0.019 [0.131] | -0.020 [0.133] | 18.34 {679}
GDP, avg 2001-08^a (pc R$ ’000) | 4.175 [1.016] | 0.116 [0.072] | 0.096 [0.066] | 0.078 [0.061] | 0.105 [0.068] | 14.81 {585}
Male, Census 2000 (share of pop) | 50.854 [0.055] | 0.402 [0.248] | 0.357 [0.230] | 0.320 [0.214] | 0.418* [0.252] | 12.47 {531}
Urban, Census 2000 (share of pop) | 45.228 [0.706] | -0.226 [3.434] | -0.644 [3.125] | -1.099 [2.862] | -0.727 [3.087] | 16.38 {629}
Inequality, Census 2000 (Gini) | 0.572 [0.003] | -0.016 [0.014] | -0.015 [0.012] | -0.014 [0.011] | -0.016 [0.013] | 14.8 {585}
Poverty, MDS 2006 (% of BF eligible) | 55.282 [0.293] | 1.236 [1.326] | 1.267 [1.226] | 1.414 [1.131] | 1.393 [1.142] | 18.62 {684}
Poverty, MDS 2006 (% of CadUnico eligible) | 76.183 [0.245] | 0.897 [1.103] | 0.936 [1.018] | 1.073 [0.939] | 1.020 [0.971] | 17.72 {667}
BF Coverage, 12/2008 (% of BF Target 06) | 94.771 [0.680] | 0.065 [3.275] | 0.284 [3.016] | 0.424 [2.781] | 0.533 [2.612] | 22.19 {742}
BF Expenses, 12/2008^a (R$ ’000/month) | 154.879 [0.001] | -0.050 [0.157] | -0.002 [0.144] | 0.009 [0.133] | 0.009 [0.132] | 19.33 {704}
r_i ≤ R$70, Census 2010 (% of CadUnico Eligible) | 30.250 [0.346] | 1.522 [1.701] | 1.688 [1.551] | 1.806 [1.421] | 1.557 [1.604] | 14.83 {586}
r_i ≤ R$70, Census 2010 (% of BF Eligible) | 67.831 [0.309] | 1.418 [1.491] | 1.554 [1.357] | 1.600 [1.244] | 1.434 [1.431] | 14.19 {573}
r_i ≤ R$140, Census 2010 (% of CadUnico Eligible) | 63.447 [0.285] | 0.444 [1.392] | 0.673 [1.272] | 0.868 [1.168] | 0.624 [1.285] | 15.63 {602}
Mayor’s Education (share with college) | 0.410 [0.020] | 0.066 [0.093] | 0.060 [0.085] | 0.050 [0.078] | 0.064 [0.090] | 14.07 {570}
Mayor’s Gender (share of female) | 0.154 [0.015] | 0.100 [0.075] | 0.093 [0.068] | 0.089 [0.062] | 0.092 [0.067] | 16.17 {624}
Mayor’s Age (years) | 47.801 [0.391] | -3.885** [1.907] | -3.487** [1.736] | -3.381** [1.588] | -3.533** [1.776] | 15.22 {593}
Mayor’s Party (share of PT) | 0.076 [0.011] | -0.051 [0.048] | -0.038 [0.045] | -0.029 [0.041] | -0.052 [0.048] | 12.71 {538}
Mayor’s Party (share of PT coalition) | 0.390 [0.020] | -0.047 [0.089] | -0.032 [0.082] | -0.017 [0.076] | -0.036 [0.084] | 15.27 {595}
Mayor’s Party (share of state govt party) | 0.148 [0.014] | 0.070 [0.070] | 0.063 [0.064] | 0.067 [0.059] | 0.068 [0.069] | 13.51 {557}
Mayor’s Party (share of leftist) | 0.284 [0.018] | -0.068 [0.082] | -0.046 [0.076] | -0.040 [0.070] | -0.068 [0.082] | 13.15 {548}
Campaign Donations (R$ ’000 in 2008) | 53.712 [0.001] | -0.009 [0.245] | -0.003 [0.227] | 0.001 [0.211] | 0.004 [0.200] | 22.37 {737}
Occupation: Business (share of business owners) | 0.206 [0.016] | -0.117 [0.072] | -0.081 [0.066] | -0.067 [0.061] | -0.109 [0.071] | 13.65 {559}
Occupation: Agriculture (share in agriculture) | 0.166 [0.015] | -0.078 [0.062] | -0.074 [0.058] | -0.072 [0.055] | -0.074 [0.056] | 17.71 {667}
Occupation: Public Sector (share in public sector) | 0.094 [0.012] | 0.051 [0.062] | 0.026 [0.055] | 0.013 [0.050] | 0.047 [0.060] | 13.41 {555}
Avg Rain (mm, 1970-2008) | 86.596 [1.613] | 2.472 [8.005] | 0.849 [7.244] | 0.102 [6.602] | 1.057 [7.477] | 14.97 {588}
Rain Volatility (se/avg, 1970-2008) | 1.023 [0.009] | 0.022 [0.047] | 0.029 [0.043] | 0.031 [0.039] | 0.029 [0.037] | 21.14 {726}
BF Management Index (IGD, 2006-2008) | 0.769 [0.003] | -0.017 [0.016] | -0.012 [0.014] | -0.009 [0.013] | -0.015 [0.015] | 14.15 {573}
Audits in Microregion (number in 2006-2008) | 1.388 [0.048] | -0.229 [0.202] | -0.186 [0.187] | -0.207 [0.174] | -0.222 [0.169] | 20.38 {714}
Observations | | 544 | 617 | 698 | |

^a Coefficient estimated in log(Variable). Significant at: 99% ***, 95% **, 90% *. Standard errors are heteroskedasticity robust. Pre-treatment means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment.

Table C.3: Income Underreporting by Year of Last Update

Variable | Coefficient [S.E.], update year 2009 | Coefficient [S.E.], update year 2010 | Coefficient [S.E.], update year 2011 | Coefficient [S.E.], update year 2012
Very Poor, r_i ≤ R$70 (% of CadUnico Eligible) | 1.989 [3.006] | 0.573 [2.293] | 3.669** [1.687] | 5.444*** [1.744]
Very Poor, r_i ≤ R$70 (% of BF eligible) | 1.457 [1.953] | 0.995 [1.614] | 2.884** [1.192] | 3.916*** [1.336]
Poor, r_i ≤ R$140 (% of CadUnico Eligible) | 0.633 [2.492] | -0.861 [1.611] | 1.226 [1.035] | 2.084** [0.819]
Observations | 617 | 617 | 617 | 617

Significant at: 99% ***, 95% **, 90% *. Standard errors are heteroskedasticity robust. The list of included municipal-level covariates is described in the text. Pre-treatment means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment. Optimal bandwidths are calculated using the CCT algorithm.

Table C.4: Distribution of Benefits for Households Updating in 2012

Variable | (1) PT Mean [S.E.], bandwidth 16.0 | (2) Coefficient [S.E.], bandwidth 13.0 | (3) Coefficient [S.E.], bandwidth 16.0 | (4) Coefficient [S.E.], bandwidth 19.0 | (5) Coefficient [S.E.], optimal bandwidth | Optimal bandwidth {Obs.}
Income Reporting (income reported in the previous update, which took place in 2009-2010)
Very Poor, r_i ≤ R$70 (% of CadUnico Eligible) | 85.061 [0.400] | 0.905 [1.590] | 1.277 [1.475] | 1.536 [1.367] | 1.126 [1.505] | 15.2 {590}
Very Poor, r_i ≤ R$70 (% of BF eligible) | 88.889 [0.309] | 1.060 [1.244] | 1.186 [1.146] | 1.276 [1.056] | 1.207 [1.139] | 16.2 {625}
Poor, r_i ≤ R$140 (% of CadUnico Eligible) | 95.439 [0.186] | -0.213 [0.735] | 0.083 [0.696] | 0.298 [0.649] | -0.237 [0.741] | 12.7 {538}
Change in Income (from previous report before 2012)
Income went above R$70 (% of CadUnico) | 6.313 [0.267] | -3.201*** [1.042] | -2.980*** [0.938] | -2.831*** [0.861] | -3.123*** [1.020] | 13.5 {558}
Income went below R$70 (% of CadUnico) | 5.063 [0.173] | -0.420 [0.590] | -0.691 [0.538] | -0.827* [0.495] | -0.722 [0.532] | 16.4 {631}
Observations | | 544 | 617 | 698 | |

Significant at: 99% ***, 95% **, 90% *. Standard errors are heteroskedasticity robust. The list of included municipal-level covariates is described in the text.
Pre-treatment means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment. Optimal bandwidths are calculated using the CCT algorithm.

Table C.5: Heterogeneity of Results: by Mayor’s Age and Poverty Rate

Variable | (1) PT Mean [S.E.], bandwidth 16.0 | (2) Coefficient [S.E.], bandwidth 13.0 | (3) Coefficient [S.E.], bandwidth 16.0 | (4) Coefficient [S.E.], bandwidth 19.0 | (5) Coefficient [S.E.], optimal bandwidth | Optimal bandwidth {Obs.}
Sample of municipalities where the elected mayor was between 40 and 55 years of age
Very Poor, r_i ≤ R$70 (% of CadUnico Eligible) | 81.803 [0.573] | 6.385*** [2.165] | 6.250*** [2.039] | 6.016*** [1.920] | 6.236*** [2.025] | 16.3 {383}
Very Poor, r_i ≤ R$70 (% of BF eligible) | 88.919 [0.422] | 3.739** [1.645] | 3.837** [1.542] | 3.802*** [1.448] | 3.834** [1.547] | 15.9 {372}
Poor, r_i ≤ R$140 (% of CadUnico Eligible) | 91.624 [0.273] | 3.187*** [1.083] | 2.943*** [1.013] | 2.729*** [0.944] | 2.891*** [0.992] | 16.9 {394}
Mayor’s Age (years) | 47.343 [0.229] | -1.029 [1.150] | -0.888 [1.043] | -0.855 [0.947] | -0.921 [1.088] | 14.6 {352}
Observations | | 332 | 375 | 422 | 383 |
Sample of municipalities in the bottom tercile in poverty
Very Poor, r_i ≤ R$70 (% of CadUnico Eligible) | 42.571 [0.729] | -4.610 [3.585] | -4.283 [3.217] | -3.668 [2.973] | -4.743 [3.719] | 16.3 {383}
Very Poor, r_i ≤ R$70 (% of BF eligible) | 59.728 [0.668] | -2.852 [3.353] | -2.748 [3.015] | -2.172 [2.784] | -2.820 [3.473] | 15.9 {372}
Poor, r_i ≤ R$140 (% of CadUnico Eligible) | 69.340 [0.521] | -3.206 [2.369] | -2.688 [2.136] | -2.379 [1.988] | -3.225 [2.376] | 16.9 {394}
Observations | | 452 | 533 | 585 | 423 |

Significant at: 99% ***, 95% **, 90% *. Standard errors are heteroskedasticity robust. The list of included municipal-level covariates is described in the text. Pre-treatment means correspond to the predicted values of the variables for a municipality at the discontinuity segment, before treatment. Optimal bandwidths are calculated using the CCT algorithm.

Table C.6: Balance Before and After Matching

Variable | Unweighted Mean, Treat.=0 | Unweighted Mean, Treat.=1 | Difference | Weighted Mean, Treat.=0
Latitude | -8.52 | -6.90 | 1.62*** | -6.90
Longitude | -41.97 | -42.79 | -0.820 | -42.79
Population, 2008^a | 2.53 | 2.60 | 0.080 | 2.60
Region, Northeast | 0.80 | 0.87 | 0.07* | 0.87
Region, Southeast | 0.04 | 0.01 | -0.03*** | 0.01
Region, South | 0.02 | 0.00 | -0.02*** | 0.00
Region, Midwest | 0.00 | 0.01 | 0.010 | 0.01
Rain Volatility, 1970-2008 | 1.01 | 0.99 | -0.020 | 0.99
Avg Rain, 1970-2008 | 100.85 | 107.90 | 7.040 | 107.90
pc GDP, 2001-08^a | 1.45 | 1.33 | -0.11*** | 1.33
Pop Literacy, 2000 | 63.19 | 60.04 | -3.15*** | 60.04
Pop Gender, 2000 | 50.71 | 50.86 | 0.150 | 50.86
Urban Pop, 2000 | 46.06 | 41.95 | -4.11** | 41.95
GINI, 2000 | 0.57 | 0.57 | 0.000 | 0.57
Transfers from federal govt, 2008^a | 9.12 | 9.17 | 0.050 | 9.17
Rate of Very Poor, 2010 | 67.54 | 70.33 | 2.79*** | 70.33
Poverty, 2006 | 54.70 | 58.29 | 3.59*** | 58.29
BF Coverage, Dec 08 | 93.94 | 94.98 | 1.040 | 94.98
CadUnico Coverage, Dec 08 | 103.37 | 101.53 | -1.840 | 101.53
Total BF Spending, Dec 08^a | 11.87 | 12.02 | 0.15** | 12.02
IGD Index, 2006-08 | 0.76 | 0.75 | -0.010 | 0.75
Audits in City, 2009-10 | 0.08 | 0.08 | -0.010 | 0.08
Audits in Micro-region, 2009-10 | 0.86 | 0.86 | 0.000 | 0.86
PT’s Presidential Votes, 2006 | 67.27 | 69.64 | 2.37* | 69.64
Margin of Victory, 2008 | 6.60 | 6.25 | -0.350 | 6.25
Gender | 0.13 | 0.17 | 0.040 | 0.17
Education, College | 0.42 | 0.38 | -0.040 | 0.38
Age | 45.08 | 46.73 | 1.66* | 46.73
Campaign Donations, 2008^a | 10.88 | 10.89 | 0.010 | 10.89
Assets, 2008^a | 5.03 | 4.99 | -0.050 | 4.99
Mayor’s Votes, 2008 | 50.31 | 49.91 | -0.400 | 49.91
Occupation, Public Sector | 0.12 | 0.13 | 0.010 | 0.13
Occupation, Business | 0.24 | 0.16 | -0.08** | 0.16
Occupation, Agriculture | 0.13 | 0.19 | 0.060 | 0.19
Political Experience (Mayor) | 0.13 | 0.11 | -0.030 | 0.11
Political Experience (Legislator) | 0.13 | 0.20 | 0.07* | 0.20
PT | 0.11 | 0.08 | -0.020 | 0.08
PSDB | 0.13 | 0.10 | -0.030 | 0.10
PT’s Local Coalition | 0.43 | 0.36 | -0.070 | 0.36
Leftist Party | 0.31 | 0.28 | -0.030 | 0.28

^a log(Variable). Significant at: 99% ***, 95% **, 90% *.
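The targeting shares quoted in Section C.1 (26% of households under-served, 11% over-served) can be checked against the Bolsa Família targeting table with simple arithmetic. The Python sketch below is illustrative only and is not part of the thesis. The cell colours did not survive in the text, so the sets `YELLOW_CELLS` and `BLUE_CELLS` are assumptions, chosen because they reproduce the quoted figures after rounding; all names in the block are hypothetical.

```python
# Shares of CadUnico households (%), transcribed from the Bolsa Familia
# targeting table. Rows: benefit actually received.
# Columns: reported per-capita monthly income bracket (R$).
TARGETING = {
    "Full Benefits":         {"[0,70]": 43.3, "(70,140]": 7.6,  "(140,311]": 3.5},
    "Variable Benefit Only": {"[0,70]": 0.3,  "(70,140]": 4.7,  "(140,311]": 0.9},
    "No Benefit":            {"[0,70]": 14.4, "(70,140]": 11.0, "(140,311]": 14.3},
}

# ASSUMPTION: cells treated as "yellow" (receiving less than they should).
YELLOW_CELLS = [
    ("No Benefit", "[0,70]"),
    ("No Benefit", "(70,140]"),
    ("Variable Benefit Only", "[0,70]"),
]

# ASSUMPTION: cells treated as "blue" (receiving more than they should).
BLUE_CELLS = [
    ("Full Benefits", "(70,140]"),
    ("Full Benefits", "(140,311]"),
]


def share(cells):
    """Sum the household shares (%) over a list of (row, column) cells."""
    return sum(TARGETING[row][col] for row, col in cells)


under_served = share(YELLOW_CELLS)
over_served = share(BLUE_CELLS)

# Wider reading: also count variable-benefit recipients above R$140.
over_served_wide = over_served + TARGETING["Variable Benefit Only"]["(140,311]"]

print(f"Under-served: {under_served:.1f}%")
print(f"Over-served: {over_served:.1f}%")
print(f"Over-served (wider reading): {over_served_wide:.1f}%")
```

Under these assumed cell sets the under-served share rounds to the 26% in the text and the over-served share rounds to 11%. Including the variable-benefit cell for incomes above R$140 would raise the over-served share by 0.9 percentage points, so the exact set of blue cells remains uncertain.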

