UBC Theses and Dissertations
Essays on Econometrics, by Zhengfei Yu (2015)
Essays on Econometrics

by

Zhengfei Yu

B.A., Zhejiang University, 2009
M.A., The University of British Columbia, 2010

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in The Faculty of Graduate and Postdoctoral Studies (Economics)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

July 2015

© Zhengfei Yu 2015

Abstract

This thesis studies two topics in econometric models: multiple equilibria and weak instruments. Chapter 1 is an introduction.

Chapter 2 considers nonparametric structural equations which may have multiple solutions for the endogenous variables. The main finding is that multiple equilibria reveal themselves in the form of jump(s) in the density function of the endogenous variables. When there is a unique equilibrium, the density function of the endogenous variables is continuous, whereas when there are multiple equilibria, under reasonable conditions, the density has a jump at some point. Our test statistic is based on maximizing local jumps over the support of the endogenous variables, and the critical value is computed via a Gaussian multiplier bootstrap.

Chapter 3 shows that in games with incomplete information, even when the payoff functions and the latent distributions are all smooth, the observed conditional choice probabilities may have a jump with respect to continuous covariates. This chapter provides a theoretical analysis of the relationship between the equilibrium behaviour of the game and the presence of a jump in the conditional choice probabilities. Such jump(s) matter in empirical research for two reasons. Statistically, they affect the estimation of the conditional choice probabilities. Economically, whether the conditional choice probabilities have a jump or not reveals information about the equilibrium behaviour of the game.
Our findings are robust to correlated private information and to unobserved heterogeneity independent of covariates.

Chapter 4 considers efficient inference for the coefficient of the endogenous variable in linear regression models with weak instrumental variables (Weak-IV). We focus on the power of tests against alternative hypotheses that are determined by arbitrarily large deviations from the null. We derive the power envelope for such alternatives in the Weak-IV scenario. We then compare the power properties of popular Weak-IV robust tests, focusing on the Anderson-Rubin (AR) and Conditional Likelihood Ratio (CLR) tests. We find that their relative performance depends on the degree of endogeneity in the model. In addition, we propose a Conditional Lagrange Multiplier (CLM) test. We also extend our analysis to heteroskedastic models.

Preface

Chapter 4 is co-authored with Prof. Vadim Marmer. I have actively participated in all stages of this project, including reviewing the literature, deriving and proving propositions, conducting numerical computations, and writing the manuscript.

Table of Contents

Abstract  ii
Preface  iv
Table of Contents  v
List of Tables  viii
List of Figures  ix
Acknowledgements  x

1 Introduction  1
  1.1 Dissertation Outline  2

2 Testing for Multiple Equilibria in Continuous Dependent Variables  4
  2.1 Introduction  4
    2.1.1 Related Literature  7
    2.1.2 Organization of Chapter 2  9
  2.2 The Framework and Examples  9
    2.2.1 The Econometric Model  9
    2.2.2 Examples  11
  2.3 Testing the Presence of Multiple Equilibria via Discontinuity  13
    2.3.1 An Equilibrium Selection Rule  13
    2.3.2 A Testing Criterion  18
    2.3.3 An Extension: Jumps in the Latent Distribution  30
  2.4 A Test Statistic  31
    2.4.1 Construction of the Test  31
    2.4.2 Asymptotic Properties of the Test  36
  2.5 Monte Carlo Simulations  38
    2.5.1 Data Generating Processes (DGP)  38
    2.5.2 Performance of the Test  43
  2.6 Conclusions and Remarks  47

3 Jumps of the Conditional Choice Probabilities in Incomplete Information Games  49
  3.1 Introduction  49
    3.1.1 Related Literature  50
    3.1.2 Organization of Chapter 3  52
  3.2 The Set-up of the Game  52
  3.3 An Implication of Equilibrium Behaviour on Data  55
    3.3.1 The Equilibrium Behaviour and Presence of A Jump in Conditional Choice Probabilities  57
    3.3.2 A Numerical Example  65
    3.3.3 An Extension: Unobserved Heterogeneity  67
    3.3.4 Testing for Multiple Equilibria via Discontinuity  73
  3.4 An Estimation Problem Due to Jumps  75
  3.5 Testing for the Presence of Jump(s)  77
  3.6 Conclusions and Remarks  79
4 Efficient Inference in Econometric Models When Identification Can Be Weak  81
  4.1 Introduction  81
    4.1.1 Organization of Chapter 4  83
  4.2 An Asymptotic Experiment for Linear IV Models  83
  4.3 The Optimal Rotational Invariant and Asymptotically Similar Test  87
  4.4 The Power Envelope Under an Unknown Nuisance Parameter  91
  4.5 Power Comparisons of Robust Tests  94
    4.5.1 Popular Weak-IV Robust Tests  94
    4.5.2 A New Test: Conditional Lagrange Multiplier (CLM) Test  96
    4.5.3 Power Calculations  100
  4.6 Heteroskedastic Models  107
    4.6.1 The Optimal Rotational Invariant and Asymptotically Similar Test in Heteroskedastic Models  110
    4.6.2 The Generalized Likelihood Ratio (GLR) Statistic in Heteroskedastic Models  112
  4.7 Concluding Remarks  115

Bibliography  116

Appendices

A Appendix for Chapter 2  121
  A.1 Proofs for Section 2.3  121
  A.2 Conditions in Section 2.4  131
  A.3 Proofs for Section 2.4  133

B Appendix for Chapter 3  153
  B.1 A Discussion of the Alternative Equilibrium Notion  153
  B.2 Mathematical Proofs  154

List of Tables

2.1 Conditions for multiple equilibria to produce jump(s) in density  22
2.2 Equilibrium prices  41
2.3 Rejection frequencies of the test  45
2.4 Rejection frequencies under different bandwidths  46
2.5 Rejection frequencies under different maximizing sets  47
3.1 Normal-form of the game  53
3.2 Normal-form of the numerical example  65
3.3 Parameters for the games  67
3.4 Normal-form of the game with T = t  68

List of Figures

2.1 Illustration of jumps in density  6
2.2 Reduced-form functions  15
2.3 Multiple equilibria and jump(s)  19
2.4 Characterization of equilibria  40
2.5 Data generated by DGP0,1,2,3  42
2.6 Jump location curves  44
3.1 The equilibrium belief(s) σ1  66
3.2 Equilibria correspondence from covariate x to belief σ1  68
3.3 Observed conditional choice probability  69
4.1 Power comparisons among robust tests, 2 IVs  101
4.2 Power comparisons among robust tests, 5 IVs  102
4.3 Relative power of the AR versus CLR tests, 2 IVs  103
4.4 Relative power of the AR versus CLR tests, 5 IVs  104
4.5 Relative power of the AR versus CLR tests, 10 IVs  105
4.6 Relative power of the AR versus CLR tests, 20 IVs  106

Acknowledgements

I am very grateful to my thesis supervisors, Prof. Vadim Marmer and Prof. Kyungchul (Kevin) Song, for their enlightening guidance, helpful advice, and continuous encouragement. I would like to thank Prof. Hiroyuki Kasahara for his valuable comments and generous support. I also benefited a lot from Prof. Yanqin Fan and Prof.
Paul Schrimpf.

Special thanks are owed to my parents, who have supported me throughout my years of pursuing this degree. They are thousands of miles away from their only child, but they always pass their confidence and optimism on to me. I thank my friend Zhe Chen for living through the Ph.D. life together with me, and my colleague Jun Ma, from whom I have learned a lot of econometrics. Thanks also go to my old friends, Zi (Lindsay) Lin, Hui Jiang, Tianfang Wu and Yike Mao. Although I have experienced a lot of "new" things in the past ten years, fortunately I still keep the curiosity, enthusiasm and belief that we once shared.

Chapter 1

Introduction

This thesis studies two topics in econometric models: multiple equilibria and weak instruments.

The presence of multiple equilibria poses challenges for comparative statics and counterfactual experiments (see Berry, Levinsohn and Pakes (1999), Echenique and Komunjer (2009) and Borkovsky et al. (2014)), and it may also affect identification and estimation approaches (see Aradillas-Lopez (2010), Paula (2012), Aguirregabiria and Mira (2013), Wan and Xu (2014) and Berry and Haile (2014), among others). Chapter 2 considers nonparametric structural equations which may admit multiple solutions for the dependent variables. The main finding is that uniqueness or multiplicity of equilibria produces testable implications for the continuity or discontinuity of the density function of the dependent variables. We then propose a test for multiple equilibria based on checking for the existence of jump(s) in the density function of the dependent variables. Chapter 3 considers incomplete information games with possibly correlated private information. We show that even when the payoff functions and the latent distributions are all smooth, the observed conditional choice probabilities may have jump(s) with respect to the continuous covariates. Statistically, the possibility of such a jump affects estimation of the conditional choice probabilities.
Economically, the presence of such jumps reveals information about the equilibrium behaviour of the game.

Weak instrumental variables have received a lot of attention in econometrics (see Staiger and Stock (1997), Kleibergen (2002, 2007), Moreira (2001, 2003) and Andrews, Moreira and Stock (2006)). Chapter 4 studies efficient inference on the structural parameter in linear instrumental variables regression models when the instruments may be weak, with a focus on the alternatives that are determined by arbitrarily large deviations from the null.

1.1 Dissertation Outline

Chapter 2, Testing for Multiple Equilibria in Continuous Dependent Variables, proposes a test for the presence of multiple equilibria when the structural equations are nonparametric and the dependent variables are continuous. Multiple equilibria may arise when the structural equations admit multiple solutions. Such multiplicity can occur in economic models of price competition, social interactions, macroeconomics and many other fields. This chapter finds that multiple equilibria reveal themselves in the form of discontinuities in the (conditional) density function of the dependent variables. Under some regularity assumptions, a model with a unique equilibrium necessarily makes the conditional density continuous with respect to the endogenous variables. On the other hand, under reasonable conditions, a model with multiple equilibria produces jump(s) in the conditional density with respect to the endogenous variables. Such jump(s) are typically caused by changes in the number of solutions to the structural equations. With an additional assumption, uniqueness or multiplicity of equilibrium leads to continuity or discontinuity of the unconditional density function of the endogenous variables. In this way we transform the problem of testing for multiple equilibria into one of testing for the presence of a jump in the density of the dependent variables. Our testing procedure consists of three steps.
Firstly, for a given point, we compute the local jump of the density function. Secondly, we obtain the test statistic by computing the maximal local jump. Lastly, we simulate the critical value via the Gaussian multiplier bootstrap proposed by Chernozhukov, Chetverikov and Kato (2013).

In Chapter 3, titled Jumps of the Conditional Choice Probabilities in Incomplete Information Games, we show that in static incomplete information games, the conditional choice probabilities may have a jump with respect to the continuous covariates, even when the payoff functions and latent distributions are all smooth. The possibility of such jump(s) affects the estimation strategy for the conditional choice probabilities. We further establish the relationship between the equilibrium behaviour and jump(s) in the conditional choice probabilities. In particular, when the equilibrium characterizing equations always admit a unique solution, the conditional choice probabilities are continuous. On the other hand, when there are multiple equilibria, or the single equilibrium present in the data varies in type, the conditional choice probabilities have a jump, except under some special equilibrium selection rules. This relationship is robust to correlated private information across players and to unobserved heterogeneity independent of covariates. Hence testing for the presence of a jump in the conditional choice probabilities also provides information about the equilibrium behaviour of the game.

The last chapter, Efficient Inference in Econometric Models When Identification Can Be Weak, considers efficient inference for the coefficient of the endogenous variable in instrumental variables regression models with weak instrumental variables (Weak-IV). We focus on the power of tests for alternatives that are determined by arbitrarily large deviations from the null. We derive the power envelope for such alternatives in the Weak-IV scenario.
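The Weak-IV setting just described can be made concrete with a small simulation. The following sketch (mine, not the thesis code; the data-generating constants are illustrative assumptions) computes the Anderson-Rubin (AR) statistic, one of the robust tests studied in Chapter 4, whose null rejection rate is controlled regardless of instrument strength:

```python
import numpy as np
from scipy import stats

def ar_statistic(y, x, Z, beta0):
    """Anderson-Rubin statistic for H0: beta = beta0 in y = x*beta + u with
    instruments Z. With homoskedastic errors it is approximately F(k, n - k)
    distributed under H0, whatever the strength of the instruments."""
    n, k = Z.shape
    e = y - x * beta0                                 # residual under the null
    Pe = Z @ np.linalg.lstsq(Z, e, rcond=None)[0]     # projection of e onto Z
    num = (e @ Pe) / k                                # e'P_Z e / k
    den = (e @ (e - Pe)) / (n - k)                    # e'M_Z e / (n - k)
    return num / den

# Simulated weak-IV design (all numbers below are illustrative choices).
rng = np.random.default_rng(0)
n, k, beta = 500, 2, 1.0
Z = rng.standard_normal((n, k))
v = rng.standard_normal(n)
u = 0.8 * v + 0.6 * rng.standard_normal(n)   # endogeneity: corr(u, v) > 0
x = Z @ np.full(k, 0.1) + v                  # small first stage => weak IV
y = x * beta + u

ar = ar_statistic(y, x, Z, beta0=beta)       # evaluated at the true value
p_value = stats.f.sf(ar, k, n - k)           # size controlled even with weak IV
```

Because the AR statistic uses only the null-restricted residual, its distribution does not depend on the first-stage coefficients, which is what makes it robust to weak instruments.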
We then compare the power properties of popular Weak-IV robust tests, focusing on the Anderson-Rubin (AR) and Conditional Likelihood Ratio (CLR) tests. We find that their relative performance depends on the degree of endogeneity in the model. This differs from Andrews, Moreira and Stock (2006), who found that the CLR test numerically dominates the AR test when weighted average power is the criterion. In addition, we propose the Conditional Lagrange Multiplier (CLM) test, which is asymptotically efficient when the instruments are strong, robust to Weak-IV, and exhibits the same power as the AR test for arbitrarily large deviations from the null. We also study the heteroskedastic case, and find that the generalized likelihood ratio statistic under heteroskedasticity reduces to the AR statistic in the Weak-IV scenario when the alternatives are determined by arbitrarily large deviations from the null.

Chapter 2

Testing for Multiple Equilibria in Continuous Dependent Variables

2.1 Introduction

Multiple equilibria are commonly generated by economic models in industrial organization, social interactions, macroeconomics, and many other fields. From an econometric perspective, multiple equilibria can be regarded as situations in which the exogenous variables cannot uniquely determine the endogenous variables.[1] Jovanovic (1989) described three cases where multiple equilibria may arise: optimization problems, linear simultaneous equations and non-linear simultaneous equations.

Detecting the presence of multiple equilibria matters for several reasons. Firstly, multiplicity of equilibria poses challenges for comparative statics and counterfactual analysis. Empirical researchers usually assume uniqueness of equilibrium when conducting counterfactual experiments (see, for example, Berry, Levinsohn and Pakes (1999)[2]). When multiple equilibria arise, the predicted outcome of a policy change becomes indeterminate.
Secondly, uniqueness or multiplicity of equilibria is closely related to identification of nonparametric simultaneous equations. While nonparametric simultaneous equations can be identified under an abstract completeness condition, an alternative identification approach requires uniqueness of equilibrium as a maintained assumption (see Berry and Haile (2013, 2014) and Matzkin (2008), among others, for identification of nonparametric simultaneous equations, and Horowitz (1996) for the transformation model). Thirdly, we will show that multiplicity of equilibria is associated with jump(s) in the density function of the dependent variables, under regularity conditions. This affects the estimation approach for the density. Lastly, knowing the presence of multiple equilibria helps one understand whether the variation in outcomes is better explained by fundamentals or by a self-fulfilling mechanism.[3]

While tests for multiple equilibria in discrete games have been developed by Paula and Tang (2012) and Aguirregabiria and Mira (2013), among others, research on continuous dependent variables (for example, prices) with multiple equilibria has been rare. To the best of my knowledge, this chapter makes the first attempt to propose a test for uniqueness against multiplicity of equilibria when the dependent variables are continuous and the model is nonparametric. Our framework is a system of equations r(Y) = g(X) + U, where r(·) and g(·) are not parametrically specified. The observed data are an i.i.d. sample {Xi, Yi}, i = 1, ..., n. We do not assume that the unknown function r(·): Y → R^J is one to one, where Y ⊂ R^J.

[1] We use dependent variables, endogenous variables and outcome variables interchangeably in this chapter.
[2] They wrote: "This assumes both that the equilibrium without the VER is also Nash in prices and that the equilibrium is unique (or at least that we solve for the relevant one)."
Therefore, the dependent variables Y may not be uniquely determined by the exogenous variables (X, U). An equilibrium selection rule is introduced to complete the model. Such a framework is general enough to cover applications ranging from price competition to social interactions.

For structural equations r(y) = g(x) + u and an equilibrium selection rule, we say there is a unique equilibrium if the structural equations r(y) = g(x) + u admit a unique solution for y given any value (x, u), or if they admit multiple solutions but the equilibrium present in the data is uniquely determined by the value (x, u). In contrast, we say there are multiple equilibria if the structural equations admit multiple solutions at some (x, u), and more than one of the solutions can be realized in the data with positive probability, after fixing (x, u). Assuming that the functions r and g are twice continuously differentiable, that the unobservable U has a continuous density function, and that U is independent of X, this chapter finds that when there is a unique equilibrium, the conditional density fY|X(y|x) is continuous[4] in y within its support, for all x.

[3] For example, Dagsvik and Jovanovic (1994) considered the question "was the Great Depression the outcome of a massive coordination failure? Or was it a unique equilibrium response to adverse shocks?"
[4] Throughout this chapter, continuity at a point means that the limits from all directions coincide. It does not place any condition on the value of the function at the limit point.

On the other hand, when there are multiple equilibria, under reasonable conditions the conditional density fY|X(y|x) will have a jump in y, such that the limits
from different directions are not all equal.[5] With an additional assumption, uniqueness or multiplicity of equilibria leads to a continuous or discontinuous unconditional density function fY(y). The source of such jump(s) can be described as follows. The values of the dependent variables are determined in equilibrium. Thus the (conditional) density of the dependent variables depends on an equilibrium selection rule, which picks one of multiple solutions with a selection probability. When there are multiple equilibria, one would typically see that the number of solutions to the structural equations varies, as explained below. Suppose that there is a region where each y is realized as the unique solution to the structural equations, while there is another region where each y is realized as one of multiple solutions. On one side of the boundary between these two regions, the selection probability is always 1, while on the other side, the selection probability tends to be below 1, because the selection rule now has to pick one equilibrium from multiple candidates. Thus the selection probability has a jump between regions with distinct numbers of solutions. This creates a jump in the density of the dependent variables.

[Figure 2.1: Illustration of jumps in density. Left panel: jumps in the lower prices; right panel: jumps in the higher prices. Both panels plot draws of (P1, P2) whose joint density exhibits jump(s) under multiple equilibria.]

Figure 2.1 shows such jumps in the density function of data (P1, P2) generated by a simplified Berry, Levinsohn and Pakes (BLP) model, but with heteroskedasticity in the idiosyncratic term of the consumers' preferences. The details of the model setup and parameter choices are given in Section 2.5. Here the dependent variables are the prices P1 and P2.
We can clearly see that the thickness of the points changes abruptly along a line: this is the jump location curve of the joint density of prices, fP1,P2(p1, p2).

In this way we translate the problem of testing for multiple equilibria into testing for the presence of a jump in the density function. Our testing procedure consists of three steps. Firstly, we compute the local jump of the density at a fixed y, checking directions along all coordinates of y. Secondly, we maximize those local jumps over the support of Y; this gives our test statistic. Lastly, we compute the critical value via the Gaussian multiplier bootstrap proposed by Chernozhukov, Chetverikov and Kato (2013).

[5] Throughout this chapter, a discontinuity or a jump refers to a jump discontinuity where the directional limits are not all equal.

2.1.1 Related Literature

In discrete games with incomplete information and independent types of players, tests for multiple equilibria have been developed by Paula and Tang (2012) and Aguirregabiria and Mira (2013). Their tests were based on dependence among players' choices. We use a different testing criterion, which fits our framework of continuous dependent variables. Moreover, our framework is nonparametric in both the structural equations and the equilibrium selection rule. This is different from Dagsvik and Jovanovic (1994), who formulated a parametric model and a parametric equilibrium selection rule to study whether depression can be explained as a low-level equilibrium. Our problem also differs from Kasy (2012), which provided confidence sets for the number of solutions to the equation g(y) = 0, assuming that the function g can be identified from some moment restrictions. To the best of my knowledge, Echenique and Komunjer (2009) is the only paper that considers nonparametric models with multiple equilibria and a continuous dependent variable.
Instead of testing for multiple equilibria, they developed a test for complementarity between the dependent variable (at extremal values) and covariates. As a matter of dimension, Echenique and Komunjer (2009) focused on a one-dimensional equilibrium (i.e., the dependent variable is a scalar). When the dependent variables in an economic model are multidimensional (as happens in many empirical applications), they reduced the number of equilibrium characterizing equations by substitution. The difficulty with substitution is that it usually breaks the separability of unobservables and observables in the resulting structural equation.[6]

[6] For example, if an equilibrium (y1, y2) is characterized by r1(y1, y2) = u1 and r2(y1, y2) = u2, suppose the second equation yields the best response function y2 = φ2(y1, u2). By substitution, we get the equation r1(y1, φ2(y1, u2)) = u1, in which the unobservable u2 no longer enters separably.

To avoid this problem, Echenique
Inthis sense, this chapter proposes a test for one of the key assumptionsmaintained in the standard nonparametric simultaneous equations models.Our test is also related to statistics literature on testing the presence ofa jump in a density or a regression function. Such a test differs from thebetter known approach to detect the location of the jump given itsexistence. (For the latter, see Hall and Titterington (1992), Mu¨ller (1992)and Delgado and Hidalgo (2000), among others.) Chu and Cheng (1996)proposed a test for the presence of a jump in univariate densities. Theircritical values were computed from a Gumbel distribution under the nullhypothesis of continuity. A similar test for a univariate regression functionhas been developed by Hamrouni (1999). Mu¨ller and Stadtmu¨ller (1999)showed that in univariate equidistant regression models, the sum ofsquared jump sizes can be represented as the coefficient of an asymptoticlinear model, with the squared difference of y as the dependent variableand twice the standard deviation of the error term as the intercept. Hencethey developed a test by checking whether the sum of the squared jumpsizes equals zero. Gijbels and Goderniaux (2004) proposed a two-stepbootstrap test for discontinuities in univariate regression functions, whichidentified a discontinuity as a point with the largest derivative. Bowman,Pope and Ismail (2006) developed a test based on the sum of squaredpointwise jumps for univariate and bivariate regression functions. Our testis based on maximized local jumps, which resembles Chu and Cheng(1996) and Qiu (2002). To compute the critical value, instead of usingGumbel distribution as an asymptotic approximation, we apply the7For the aforementioned example, this means that they began with r1(y1, y2) = 0and r2(y1, y2) = 0 in the first place, reduced it to r1(y1, φ2(y1)) = 0, and then added adisturbance term u to yield r1(y1, φ2(y1)) = u.82.2. 
Gaussian multiplier bootstrap proposed by Chernozhukov, Chetverikov and Kato (2013).

2.1.2 Organization of Chapter 2

This chapter is organized as follows. Section 2.2 presents the basic setup and provides examples. Section 2.3 establishes a link between uniqueness or multiplicity of equilibrium and continuity or discontinuity of the density function of the dependent variables. Section 2.4 constructs a test and derives its asymptotic properties. Section 2.5 conducts Monte Carlo simulations. Section 2.6 concludes. All technical conditions and proofs are presented in Appendix A.

2.2 The Framework and Examples

2.2.1 The Econometric Model

Consider the following system of structural equations

r(Y) = g(X) + U,    (2.1)

where the dependent (endogenous) vector Y has dimension J and support Y ⊂ R^J. The vector-valued function r(·): Y → R^J is twice continuously differentiable. The exogenous vector X has dimension L and support X ⊂ R^L. The vector g(X) = [g1(X), g2(X), ..., gJ(X)]′, where each real-valued function gj: X → R is twice continuously differentiable. The functions r and g are nonparametric. In addition, we denote the support of U by U and assume it to be connected. A generalization of model (2.1) is

r(Y, W) = g(X, W) + U,    (2.2)

where (W, X) are observable exogenous variables. In the following, we focus on (2.1), since (2.2) reduces to (2.1) once we condition on W.

The essential difference between (2.1) and the standard nonparametric simultaneous equations model is that we do not impose the assumption that the function r: Y → R^J is one to one. This allows for the possibility of multiple y satisfying (2.1) given exogenous (x, u). When such multiplicity occurs for some (x, u), model (2.1) is incomplete. One can introduce a probability measure to account for the randomness of Y given (X, U). For that purpose, we first introduce some notation and impose Assumption D1. Let

E(v) ≡ {y ∈ R^J : r(y) = v}, for any v ∈ V,
where V ≡ {g(x) + u : (x, u) ∈ X × U}. E(v) is the set of solutions to the equation r(y) = v. In this chapter, we assume that E(v) is non-empty and has finite cardinality for any v ∈ V. This is Assumption D1.

Assumption D1. (i) V ⊂ {v ∈ R^J : r(y) = v for some y ∈ R^J}. (ii) Pr(#(E(g(X) + U)) < ∞) = 1, where #(G) is the cardinality of a generic set G.

If the range of the function r is R^J, Assumption D1(i) always holds. Assumption D1(ii) rules out functions r having a constant part such that #(E(v)) is uncountable for some v.

Now we describe the aforementioned probability that accounts for the randomness of Y given (X, U). For any (x, u) ∈ X × U, let π(x, u) be a probability on the power set of E(g(x) + u) specified by

Pr(Y ∈ A | X = x, U = u) = π(x, u)(A),    (2.3)

for all subsets A of E(g(x) + u).

Under Assumption D1, we classify the model (i.e., the structural equations (2.1) together with the probability π(x, u)) into one of the following three categories based on equilibrium behaviour.

(C1) #(E(g(x) + u)) = 1 for all (x, u) ∈ X × U.

(C2) Pr(#(E(g(X) + U)) > 1) > 0 and Pr(#(supp(π(X, U))) = 1) = 1, where supp(π(x, u)) denotes the support of the probability π(x, u).

(C3) Pr(#(E(g(X) + U)) > 1) > 0 and Pr(#(supp(π(X, U))) > 1) > 0.

Category (C1) means that the structural equations (2.1) admit a unique solution in y for all (x, u) ∈ X × U. Category (C2) means that the structural equations (2.1) have more than one solution in y for (x, u) with positive probability; however, given (x, u) there is only one element of the solution set E(g(x) + u) that Y can take with positive probability, and that probability is 1. Category (C3) means that the structural equations (2.1) have more than one solution in y for some (x, u) with positive probability, and at those (x, u) there are strictly more than one element of E(g(x) + u) that Y can take with positive probability.
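A toy scalar computation may help fix the role of E(v). With a non-monotone r (the cubic below is my hypothetical choice, not from the chapter), the cardinality of the solution set changes with v; whenever #E(v) > 1, a selection rule π(x, u) must pick which solution is realized:

```python
import numpy as np

def solution_set(v, tol=1e-8):
    """Real solutions E(v) = {y : r(y) = v} for the illustrative scalar
    structural function r(y) = y**3 - y (hypothetical, for exposition only)."""
    roots = np.roots([1.0, 0.0, -1.0, -v])          # solve y^3 - y - v = 0
    real = roots[np.abs(roots.imag) < tol].real     # keep real roots only
    return np.sort(real)

# #E(v) switches at the critical values +/- 2/(3*sqrt(3)) ~ 0.385:
assert len(solution_set(1.0)) == 1   # unique solution: (C1)-like region
assert len(solution_set(0.0)) == 3   # multiple solutions: a selection rule
                                     # must pick one, as in (C2)/(C3)
```

The jump-testing idea of Section 2.3 rests on exactly this change in cardinality: as g(x) + u crosses a critical value of r, the number of candidate equilibria changes discretely.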
Throughout this chapter, a unique equilibrium refers to (C1) or (C2), whereas multiplicity of equilibria refers to (C3). In addition, we do not consider an exceptional case in which the structural equations (2.1) admit multiple solutions for some (x, u) ∈ X × U, but Pr(#(E(g(X) + U)) > 1) = 0.

2.2.2 Examples

We present three examples.

[Example 1] A Nonparametric BLP Model. The model is based on Berry and Haile (2014). There are T markets, J firms and each firm produces a product. For the t-th market, pt = (p1t, ..., pJt)′ denotes product prices, xt = (x1t, ..., xJt)′ denotes covariates in consumers’ preferences, wt = (w1t, ..., wJt)′ denotes covariates in cost functions, st = (s1t, ..., sJt)′ denotes market shares, and ζt = (ζ1t, ..., ζJt)′ and ωt = (ω1t, ..., ωJt)′ denote product-level unobservables in the preference and cost functions respectively. Conditional on (xt, ζt), the utility vector for consumer i, vit = (vit0, vit1, ..., vitJ)′, from consuming each product (where vit0 denotes the utility of consumer i from consuming nothing), is i.i.d. across consumers i = 1, ..., I and markets t = 1, ..., T. The market share of product j in market t is the choice probability

sjt = σj(pt, ζt, xt) = Pr(argmax_{k=0,1,...,J} vitk = j).

According to Berry and Haile (2014), under assumptions including demand index restrictions [8] and connected substitutes [9], the nonparametric BLP model can be characterized by a system of 2J equations:

σj^{-1}(st, pt) = xjt + ζjt, (2.4)
πj^{-1}(st, pt) = wjt + ωjt, (2.5)

for j = 1, 2, ..., J. This system of equations corresponds to (2.1).
Berry and Haile (2014) show that σj^{-1}(·) and πj^{-1}(·) can be identified under “a unique equilibrium” assumption (Assumption 13 in Berry and Haile (2014)): “There is a unique vector of equilibrium prices associated with any (δjt, κjt)” (where δjt = xjt + ζjt and κjt = wjt + ωjt). [10] While imposing that assumption, they noted that “it is not hard to construct examples admitting multiple equilibria” and that the unique equilibrium assumption “rules out random equilibrium selection or equilibrium selection based on xjt or ζjt instead of their sum δjt (and similarly for κjt).”

[8] “Demand index” is Assumption 1 of Berry and Haile (2014), which leads to sjt = σj(pt, δt, x(2)t), where δjt = x(1)jt + ζjt, x(1)jt is one element of xjt and x(2)jt are the remaining elements. As in their paper, x(2)jt is suppressed because we can condition on an arbitrary value of x(2)jt. For notational simplicity, write δjt = xjt + ζjt.

[9] Connected substitutes (Assumption 2 of Berry and Haile (2014)) leads to δjt = σj^{-1}(pt, st) for all j = 1, ..., J and for any (pt, st).

[10] The model with “a unique equilibrium” in Berry and Haile (2014) corresponds to the union of Category (C1) and a subset of Category (C2). We include in the unique equilibrium case (more precisely, Category (C2)) the case where the equilibrium selection rule depends on (xt, ζt). (Similarly, (C2) allows the equilibrium selection rule to depend on (wt, ωt).)

[Example 2] A Semiparametric Social Interaction Model. This is an extension of the standard linear-in-means model in social interaction (Manski (1993), Brock and Durlauf (2001b)).
Here, the endogenous effect is captured by ϕ(mg), where the function ϕ(·) is not parametrically specified. The behavioural equation is

wgi = c′xgi + d′x̄g + e′yg + ϕ(mg) + αg + ugi, (2.6)

where wgi denotes individual outcome, xgi individual characteristics, yg group-level characteristics, x̄g = E(xgi | yg, αg, ug) group-level expected individual characteristics, mg = E(wgi | yg, αg, ug) the expected group outcome, ϕ(·) the effect of the expected group outcome on individual outcomes, αg unobservable group characteristics and ugi unobservable individual characteristics. The exogeneity assumption is

E(ugi | xgi, yg, αg) = 0. (2.7)

By a self-consistency condition, we have

mg − ϕ(mg) = (c′ + d′)x̄g + e′yg + αg. (2.8)

Equation (2.8) corresponds to (2.1). If we impose the ex-ante assumption that ϕ′(m) < 1, there is a unique solution in mg given the observables and unobservables, and ϕ(·) is identified as in the standard transformation model. Otherwise, there may be multiple equilibria in mg.

[Example 3] A Macro Model of Employment. Dagsvik and Jovanovic (1994) set up a simple macro model which allows for multiple equilibria in the employment rate. The structural equation characterizing an equilibrium employment rate y is

z − β log φ(1/(1 + exp(−z))) = βδ log x + β log u − log v,

where z = log(y/(1 − y)), x denotes monetary supply, u denotes a shock to aggregate demand or productivity, and v denotes an aggregate shock to the supply of labour. Dagsvik and Jovanovic (1994) assumed a cubic form of φ(·) and a parametric form of the equilibrium selection rule.

(Footnote 10, continued.) The unique equilibrium assumption in Berry and Haile (2014) corresponds to our Assumption D10 in Section 2.3. Thus, by Proposition 3 of this chapter, we can use the continuity of the unconditional density to test the validity of the unique equilibrium assumption in Berry and Haile (2014).
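Self-referential equations like (2.8) or the employment equation above can switch between one and several solutions depending on how steep the endogenous response is, mirroring the ϕ′(m) < 1 condition in Example 2. The sketch below counts equilibria of z − a·tanh(z) = c, where a·tanh(z) is a purely hypothetical smooth stand-in for the unspecified response term (it is not the φ of either example), and a, c are illustrative numbers.

```python
import numpy as np

def count_equilibria(a, c, grid=np.linspace(-5.0, 5.0, 4001)):
    """Count solutions z of z - a*tanh(z) = c via sign changes on a grid.
    a*tanh(z) plays the role of a smooth 'self-fulfilling' response term."""
    h = grid - a * np.tanh(grid) - c
    return int(np.sum(np.sign(h[:-1]) * np.sign(h[1:]) < 0))

# Flat response (slope < 1 everywhere): a unique equilibrium for any shock c.
print(count_equilibria(0.5, 0.1))  # 1
# Steep response (slope > 1 near the origin): three equilibria for moderate c.
print(count_equilibria(2.0, 0.1))  # 3
```

With the steep response, a large, a medium and a small equilibrium coexist for moderate shocks, exactly the pattern discussed below under Assumption D7(i).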
In our framework, φ(·) is not parametrically specified.

2.3 Testing the Presence of Multiple Equilibria via Discontinuity

This section analyses the relations between the uniqueness or multiplicity of equilibria and the continuity or discontinuity of the density function fY|X(y|x) in y. In particular, under regularity conditions, a model with a unique equilibrium ((C1) or (C2)) necessarily makes the conditional density fY|X(y|x) continuous over y ∈ int(Y(x)) for all x ∈ X, where Y(x) is the support of Y conditional on X = x. On the other hand, a model with multiple equilibria ((C3)), under reasonable conditions, gives rise to a discontinuity of fY|X(y|x) in y. Here a discontinuity refers to a jump of fY|X(y|x) between two non-zero values as we perturb y. Under an additional condition, a model with a unique equilibrium leads to a continuous unconditional density fY(y), while a model with multiple equilibria gives rise to a jump in the unconditional density fY(y). Proofs in this section are collected in Appendix A.1.

2.3.1 An Equilibrium Selection Rule

The possibility of #(E(g(x) + u)) > 1 makes the model given by the structural equations (2.1) incomplete in the sense that the realization of the exogenous variables (x, u) cannot uniquely determine the value of the endogenous variables. We address this issue by introducing an equilibrium selection rule, which specifies the probability with which each “type” of equilibrium is selected. [11] Let us first give an index system to indicate types of equilibria. Towards this end, we begin with some notation and assumptions. Let

Y¯ ≡ {y ∈ R^J : r(y) ∈ V}.

[11] It is possible to use π(x, u) in (2.3) as an equilibrium selection rule; see Echenique and Komunjer (2009). However, we take a different approach here, which is more convenient for our analysis. Our equilibrium selection rule will be defined on “types” (or indices) of equilibria.
The set Y¯ collects all possible values of equilibria, and it is a superset of Y since some equilibria may not be realized. We impose a smoothness condition on the functions r and g.

Assumption D2. The functions r : Y¯ → R^J and g : X → R^J are twice continuously differentiable over the interiors of their respective domains. The Jacobian determinant of r(y), denoted J(y), is bounded and twice continuously differentiable over int(Y¯).

A large class of functions with a sufficient degree of smoothness satisfies Assumption D2.

If the structural equations (2.1) are in (C1), there is a unique reduced-form function which maps the exogenous term v = g(x) + u to y. Otherwise, the relation from v to y turns out to be a correspondence. Each branch of the correspondence can be viewed as a reduced-form function from v to y. Definition 1 below formalizes this. The family {q1(·), ..., qM(·)} can be viewed as a collection of reduced-form functions mapping an exogenous term g(x) + u to a value of the endogenous Y.

Definition 1. Let M be the smallest number such that there exists a family of twice continuously differentiable functions {q1(·), q2(·), ..., qM(·)}, where qm : Bm → R^J, Bm is an open and connected subset of V, and the following conditions are satisfied:

(i) For any (y, v) ∈ Y¯ × Bm satisfying r(y) = v, we have y = qm(v). Furthermore, r(qm(v)) = v for all v ∈ Bm.

(ii) ∪_{m=1}^M B¯m = V. (Here B¯m denotes the closure of Bm.)

(iii) For any v ∈ Bm ∩ Bk and m ≠ k, qm(v) ≠ qk(v).

Definition 1(i) implies that qm : Bm → Am is one to one on each Bm. (Suppose there are v1 ≠ v2 in Bm satisfying y = qm(v1) = qm(v2). Then we would have r(y) = v1 ≠ v2 = r(y), which cannot hold.) Definition 1(ii) means that for almost all v ∈ V, there is a function qm (m may vary with v) defined at it. Definition 1(iii) says that the qm are distinct; otherwise we could exclude such a v from either Bm or Bk. The sets B1, ..., BM are not all disjoint if the structural equations (2.1) admit multiple solutions at some (x, u).
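For a scalar cubic such as r(y) = −y^3 + 15y^2 − 59y + 45 (the example used in Figure 2.2 below), the branches of Definition 1 can be read off numerically: r is monotone between its two critical points, so each solution of r(y) = v can be assigned to q1, q2 or q3 according to which monotonicity interval it falls in. The sketch below only illustrates Definition 1 for this scalar example; the branch-assignment rule does not generalize as-is to J > 1.

```python
import numpy as np

# r(y) = -y^3 + 15y^2 - 59y + 45, with r'(y) = -3y^2 + 30y - 59.
coef = np.array([-1.0, 15.0, -59.0, 45.0])
crit = np.sort(np.roots([-3.0, 30.0, -59.0]).real)  # y ~ 2.69 and 7.31

def branches(v, tol=1e-8):
    """Return {m: q_m(v)} for the solutions of r(y) = v, where branch m
    is the m-th monotonicity interval of r (split at the critical points)."""
    poly = coef.copy()
    poly[-1] -= v                       # real roots of r(y) - v = 0
    roots = np.roots(poly)
    real = np.sort(roots[np.abs(roots.imag) < tol].real)
    return {1 + int(np.searchsorted(crit, y)): y for y in real}

print(branches(0.0))    # all three branches active: {1: 1.0, 2: 5.0, 3: 9.0}
print(branches(40.0))   # only branch 1 is active at this v
```

The keys of the returned dictionary are exactly the equilibria indices set M(v) introduced after Lemma 1.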
Figure 2.2 illustrates the collection of reduced-form functions {q1(·), ..., qM(·)} using a simple example. That is, the structural equation is r(y) = u, where r(y) = −y^3 + 15y^2 − 59y + 45.

[Figure 2.2: Reduced-form functions. The three branches q1(v), q2(v), q3(v) are plotted against v = g(x) + u, with their domains B1, B2 and B3 marked on the horizontal axis.]

By Definition 1, the example has three reduced-form functions q1, q2, q3 and their domains are B1, B2 and B3.

We impose the following assumption on the structural equations (2.1).

Assumption D3. (i) The constant M in Definition 1 is finite. (ii) If the structural equations (2.1) admit a unique solution in y for all (x, u) ∈ X × U, the Jacobian J(y) > 0 for all y ∈ Y¯.

Assumption D3(i) requires that the collection of reduced-form functions mapping from g(x) + u to y has finitely many elements. Assumption D3(ii) guarantees that for any model in (C1), the Jacobian J(y) is bounded away from zero.

Let Am = qm(Bm); the family of reduced-form functions {q1, q2, ..., qM} in Definition 1 produces a collection of subsets of Y¯ with the properties in Lemma 1 below.

Lemma 1. Let A = {Am, m = 1, 2, ..., M}, where Am = qm(Bm). Then under Assumptions D1-D3, the collection A has the following properties:

(i) The function r is one to one on each Am.
(ii) Y¯ = ∪_{m=1}^M A¯m.
(iii) For any m ≠ k, Am ∩ Ak = ∅.
(iv) If a model is in (C1), the constant M = 1 in Definition 1.
(v) If a model is in (C2) or (C3), the constant M > 1 in Definition 1.

Lemma 1 is a direct consequence of Assumptions D1-D3 and Definition 1. The key role of Lemma 1 is to generate an index system of equilibrium types. If y ∈ Am, we call it the type-m equilibrium and index it by m. Define an equilibria indices set (given the exogenous value v = g(x) + u) as the collection of m's such that (2.1) is satisfied for some y ∈ Am. That is, for any fixed v ∈ V,

M(v) ≡ {m : v ∈ r(Am)}.
(2.9)

Note that the function M : V → the power set of {1, 2, ..., M} specified in (2.9) is fully determined by the structural equations (2.1). Letting

W = {(x, u) : g(x) + u ∈ ∪_{m=1}^M Bm},

we introduce the equilibrium selection rule.

Definition 2. Let an equilibrium selection rule λ be a measurable function W → [0, 1]^M defined as

λ(x, u) = [λ1(x, u), ..., λM(x, u)],

where

λm(x, u) = Pr(Y ∈ Am | X = x, U = u).

Clearly, λh(x, u) = 0 if h ∉ M(g(x) + u), and Σ_{m=1}^M λm(x, u) = 1. Given a function M, the collection of all equilibrium selection rules is

C(M) = {λ : W → [0, 1]^M : Σ_{m=1}^M λm(x, u) = 1, λh(x, u) = 0 if h ∉ M(g(x) + u), for all (x, u) ∈ W}.

In some situations, the econometrician may impose restrictions on the equilibrium selection rule. For example, later we will consider a class C0(M) of equilibrium selection rules that no longer depend on the unobservables, conditional on the equilibria indices set and the observable characteristics. That is, letting

M = {M(v) : v ∈ ∪_{m=1}^M Bm}, (2.10)

we have

C0(M) = {λ ∈ C(M) : there is a φ : M × X → [0, 1]^M such that λ(x, u) = φ(M(g(x) + u), x), for all (x, u) ∈ W}. (2.11)

We say an equilibrium selection rule λ is degenerate at (x, u) if λm(x, u) = 1 for some m. Now by Lemma 1 and Definition 2, we obtain equivalent statements of (C1) to (C3) about the equilibrium behaviour of the model (the structural equations (2.1) and the equilibrium selection rule λ):

(C1) M = 1.
(C2) M > 1, and Pr(λm(X, U) = 1 for some m) = 1.
(C3) M > 1, and Pr(λm(X, U) < 1 for all m) > 0.

2.3.2 A Testing Criterion

We assume that the unobservable U has a smooth density function.

Assumption D4. (i) The unobservable U is independent of X. (ii) The distribution of U is absolutely continuous with respect to the Lebesgue measure and its density fU(u) = dFU(u)/du is bounded and twice continuously differentiable.
(iii) The support U is connected.

Assumption D4 means that the unobservable vector U is independent of the observable characteristics and its density is smooth enough. Therefore, any jump in the density of the dependent variables is not due to the unobservable U. In the next subsection, we will relax this assumption to allow for jumps in fU(u).

When the structural equations (2.1) have more than one solution, a typical situation is that the equilibria indices set M(v) defined in (2.9) changes with v. This is the main source of a jump in the conditional density functions. Suppose there are y1, y2 ∈ Am associated with distinct equilibria indices sets (i.e., M(r(y1)) ≠ M(r(y2))). Based on that, we can further divide Am into disjoint subsets. To formalize this idea, recall that M in (2.10) is the collection of all equilibria indices sets generated by the structural equations (2.1). Let Mm be the subset of M such that each equilibria indices set in Mm equals M(r(y)) for some y ∈ Am:

Mm = {G ∈ M : G = M(r(y)), for some y ∈ Am}.

Let Lm = {1, ..., #(Mm)} so that we can list the elements of Mm:

Mm = {G_l^m : l ∈ Lm}.

Using this notation, for each m = 1, ..., M, we can produce a collection of subsets of Am based on the equilibria indices sets in Mm,

Aml = {y ∈ Am : M(r(y)) = G_l^m}, for l ∈ Lm. (2.12)

By construction, we have (i) A¯m = ∪_{l∈Lm} A¯ml, (ii) Aml ∩ Aml′ = ∅ for l ≠ l′, (iii) M(r(Aml)) is defined, and (iv) M(r(Aml)) ≠ M(r(Aml′)) for all m and l ≠ l′. Note that (iii) and (iv) use the following convention: for any V ⊂ V, define M(V) = M(v) if M(v) is the same over all v ∈ V.

[Figure 2.3: Multiple equilibria and jump(s). The curve r(y) is plotted against y for the case J = 1, with the regions A11, A12, A2, A31 and A32 marked along the horizontal axis.]

Consider the previous simple example without covariates X: r(y) = u, where r(y) = −y^3 + 15y^2 − 59y + 45. In Figure 2.3, r(y) is plotted against
y. There can be one or three equilibria (with positive probabilities). In this example, the three reduced-form functions illustrated in Figure 2.2 divide Y¯ into A1, A2 and A3. Each of them is labelled with a different colour on the horizontal axis. The function r(y) is one to one within each Am. An equilibrium is indexed as type m if it is within Am, m = 1, 2, 3. Furthermore, we have M = {{1}, {3}, {1, 2, 3}}, and M1 = {{1}, {1, 2, 3}}, M2 = {{1, 2, 3}}, M3 = {{3}, {1, 2, 3}}. Based on the distinct equilibria indices sets (corresponding to (2.12)), we have A¯1 = A¯11 ∪ A¯12 and A¯3 = A¯31 ∪ A¯32, where the equilibria indices sets associated with A11, A12, A2, A31 and A32 are:

M(r(A11)) = {1}, M(r(A32)) = {3}, M(r(A12)) = M(r(A2)) = M(r(A31)) = {1, 2, 3}.

Now we begin to analyse the conditional density function fY|X(y|x) and its relation to the uniqueness or multiplicity of equilibria. Lemma 2 gives a formula for the conditional density function.

Lemma 2. Suppose that Assumptions D1-D4 hold. Then for any y ∈ ∪_{m=1}^M ∪_{l∈Lm} Aml and x ∈ X,

fY|X(y|x) = fU(r(y) − g(x)) |J(y)| λs(x, r(y) − g(x)),

where s ∈ {1, 2, ..., M} and h ∈ Ls are such that y ∈ Ash.

Lemma 2 shows that under some regularity conditions, the conditional density fY|X(y|x) can be written as the product of the density of the unobserved U, the Jacobian of r(y), and a selection probability (a component of the equilibrium selection rule). Since the Aml are mutually exclusive across m and l, for any y ∈ ∪_{m=1}^M ∪_{l∈Lm} Aml there is a unique pair (s, h) with s ∈ {1, 2, ..., M} and h ∈ Ls such that y ∈ Ash. Also, note that ∪_{m=1}^M ∪_{l∈Lm} A¯ml = Y¯. From Assumptions D2 and D4 and Lemma 2, we can see that a jump in fY|X(y|x), if it exists, must come from the selection probability specified by the equilibrium selection rule. The following proposition establishes a testable implication of a unique equilibrium.

Proposition 1.
Suppose Assumptions D1-D4 hold, and the model (i.e., the structural equations (2.1) and the equilibrium selection rule λ) generates a unique equilibrium (i.e., the model falls into Category (C1) or (C2)). Then

(i) The conditional density fY|X(y|x) is continuous in y on int(Y(x)) given any x ∈ X. (Y(x) is the support of Y given x.) That is, for any x ∈ X, y ∈ int(Y(x)) and any two sequences {y1n}, {y2n} such that y1n, y2n ∈ Y(x), y1n → y and y2n → y, we have

lim_{y1n→y} fY|X(y1n|x) = lim_{y2n→y} fY|X(y2n|x).

(ii) If X is a continuous random vector, fY|X(y|x) is continuous in (x, y) over x ∈ int(X), y ∈ int(Y(x)). That is, for any x ∈ int(X), y ∈ int(Y(x)) and any two sequences {(y1n, x1n)}, {(y2n, x2n)} such that y1n ∈ Y(x1n), y2n ∈ Y(x2n), (y1n, x1n) → (y, x) and (y2n, x2n) → (y, x), we have

lim_{(y1n,x1n)→(y,x)} fY|X(y1n|x1n) = lim_{(y2n,x2n)→(y,x)} fY|X(y2n|x2n).

Remark 1. (i) Here continuity at y means that the limits towards y from all directions in the support coincide. It does not impose any restriction on the value of the density function at y itself. Throughout this chapter, “continuity” refers to “lack of a jump discontinuity”.

(ii) The conditional density fY|X(y|x) may jump in y from some positive number to zero at the boundary of Y(x). Therefore Proposition 1 is stated for y ∈ int(Y(x)). Later, when constructing the test statistic, we exclude data points near the boundary of the support.

(iii) Though the main testing criterion of this chapter is a jump of the density function in y, Proposition 1(ii) tells us that a jump in x actually provides additional evidence for multiple equilibria.

(iv) If the model is in (C1), it produces another restriction on the observable data (Y, X): the support Y(x) is connected for all x. In particular, when the model is in (C1), the function r is one to one, hence Y = r^{-1}(g(X) + U). Note that U is connected, U is independent of X (Assumption D4) and r is continuous (Assumption D2).
Therefore, for any x, Y(x) is connected by Theorem 23.5 of Munkres (1999). (That is, the image of a connected space under a continuous map is connected.) For example, if we assume U = R^J, we can further distinguish a model in (C1) from another model in (C2), because (C1) leads to Y = R^J whereas (C2) produces hole(s) in Y.

Proposition 1 states that a model with a unique equilibrium necessarily leads to a continuous conditional density fY|X(y|x). The next questions are whether and when a model with multiple equilibria would produce a jump of fY|X(y|x) in y. The intuition is that if a model has multiple equilibria, one would typically see that the equilibria indices set M(r(y)) changes with y. Then, at a fixed x, consider what happens at the boundary between regions of y associated with different equilibria indices sets. Suppose there is a region with a unique equilibrium and another region with multiple equilibria. On one side of the boundary the selection probability is one, while on the other side the selection probability tends to be below one because the equilibrium selection rule has to pick one from multiple candidates. As a result, the selection probability tends to have a jump at the boundary between subsets of Y¯ with distinct equilibria indices sets. This leads to a jump of fY|X(y|x) in y.

In the following, each of Proposition 2, Corollary 2 and Corollary 3 establishes the relation between multiple equilibria and jump(s) of fY|X(y|x) in y, under a different set of assumptions. For ease of reference, Table 2.1 lists the three sets of assumptions sufficient for multiple equilibria to produce such jump(s) in the conditional density.

Table 2.1: Conditions for multiple equilibria to produce jump(s) in density

Result        | Assumptions on structural equations (2.1) | Assumptions on equilibrium selection rule
Proposition 2 | D5(i), D6(i)                              | D6(ii)
Corollary 2   | D5, D7(i)                                 | D7(ii), D8
Corollary 3   | D5, D7(i), D9(i)                          | D7(ii), D9(ii)
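The intuition above can be checked numerically for the covariate-free cubic example r(y) = u with r(y) = −y^3 + 15y^2 − 59y + 45. The sketch below evaluates Lemma 2's formula fY(y) = fU(r(y)) |J(y)| λ(y) on branch 1 under two illustrative assumptions that are not part of the model: U ~ N(0, 15^2) and a purely random selection rule that picks each available equilibrium with probability 1/3 when three coexist. The selection probability drops from 1 to 1/3 where the equilibria indices set changes (the boundary between A11 and A12), so the density jumps by a factor of 3 there.

```python
import numpy as np

def r(y):
    return -y**3 + 15*y**2 - 59*y + 45

def J(y):                       # |dr/dy|, the (scalar) Jacobian
    return abs(-3*y**2 + 30*y - 59)

def f_U(u, sigma=15.0):         # assumed N(0, sigma^2) latent density
    return np.exp(-0.5 * (u / sigma)**2) / (sigma * np.sqrt(2 * np.pi))

# Local maximum of r: the larger critical point y = (30 + sqrt(192)) / 6.
v_max = r((30 + np.sqrt(192)) / 6)

# Boundary y_d on branch 1 (r decreasing on [0, 2.5]): solve r(y_d) = v_max.
lo, hi = 0.0, 2.5
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if r(mid) > v_max else (lo, mid)
y_d = 0.5 * (lo + hi)

def density(y):
    """Lemma 2 on branch 1: f_U(r(y)) * |J(y)| * selection probability."""
    lam = 1.0 if r(y) > v_max else 1.0 / 3.0   # hypothetical random selection
    return f_U(r(y)) * J(y) * lam

eps = 1e-6
left, right = density(y_d - eps), density(y_d + eps)
print(left / right)   # close to 3: the density jumps at y_d
```

Since fU and J are smooth, the entire 3:1 ratio across y_d is attributable to the selection probability, which is exactly the mechanism exploited by the test.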
Keep in mind that Assumptions D1-D4 are maintained for all three results in Table 2.1.

The following Assumption D5 is a technical assumption. Assumption D5(i) will be used in Proposition 2, while Assumptions D5(ii) and (iii) will be used later to give a sufficient condition for the equilibria indices set to vary on some r(Am).

Assumption D5. (i) For any ε > 0, there exists a ς > 0 such that J(y) > ς for all y ∈ Åm ⊂ Am and all m, where Åm satisfies μL(Am \ Åm) < ε and μL denotes the Lebesgue measure on Y¯.

(ii) For any v ∈ int(V), if for any ε > 0 there exists a v0 such that |v − v0| < ε and r(y) = v0 has a unique solution in y, then the Jacobian J(y) > 0 for all y satisfying r(y) = v.

(iii) For any m ∈ {1, ..., M}, let v0 be an arbitrary point in B¯m \ Bm, and let y* satisfy r(y*) = v0, y* ≠ lim_{v→v0, v∈Bm} qm(v) ≡ ym. If such a y* exists and there is no y′ such that every component of y′ − ym has the same sign as the corresponding component of y* − ym and |y′ − ym| < |y* − ym|, then J(y*) > 0.

Assumption D5(i) means that the Jacobian of the function r is bounded away from zero over a large enough subset of Am, for all m. Assumptions D5(ii) and (iii) bound the Jacobian J(y) away from zero at some particular points.

Suppose Assumptions D1-D4 and D5(i) hold; the following Assumption D6 is then sufficient for multiple equilibria to produce jump(s) in the conditional density of the dependent variables.

Assumption D6.
If the model generates multiple equilibria (i.e., the model falls into Category (C3)), there exists an s ∈ {1, 2, ..., M} satisfying the following conditions:

(i) The set As has disjoint subsets Ash and Ash′ satisfying A¯sh ∩ A¯sh′ ≠ ∅ and M(r(Ash)) ≠ M(r(Ash′)).

(ii) For some y ∈ A¯sh ∩ A¯sh′, where Ash and Ash′ satisfy Assumption D6(i), there is a set XD ⊂ X with Pr(X ∈ XD) > 0 such that the following conditions hold for any x ∈ XD:

lim_{y′∈Ash, y′→y} λs(x, r(y′) − g(x)) ≠ lim_{y′∈Ash′, y′→y} λs(x, r(y′) − g(x)),

and

lim_{y′∈Ash, y′→y} λs(x, r(y′) − g(x)) · lim_{y′∈Ash′, y′→y} λs(x, r(y′) − g(x)) ≠ 0.

Remark 2. (i) If the equilibrium selection rule belongs to C0(M) in (2.11) (that is, it does not depend on u, given M(g(x) + u) and x), Assumption D6(ii) becomes: for some y ∈ A¯sh ∩ A¯sh′, where Ash and Ash′ satisfy Assumption D6(i), there is a set XD ⊂ X with Pr(X ∈ XD) > 0 such that the following conditions hold:

φs(M(r(Ash)), x) ≠ φs(M(r(Ash′)), x),
φs(M(r(Ash)), x) · φs(M(r(Ash′)), x) ≠ 0,

for all x ∈ XD.

(ii) Assumption D6 identifies the direct source of a jump (between non-zero values) of fY|X(y|x) as we perturb y. Note that Ash and Ash′ are associated with distinct equilibria indices sets, M(r(Ash)) ≠ M(r(Ash′)). Assumption D6 assumes two properties for a model with multiple equilibria: (i) the equilibria indices set M(r(y)) varies over Y¯; (ii) a component of the equilibrium selection rule has a jump (between non-zero values) at the boundary between some neighbouring subsets of Y¯ with distinct equilibria indices sets. Assumption D6 reasonably holds in many interesting cases with multiple equilibria. For example, Corollary 1 below states that if Assumptions D5(ii) and (iii) hold, and the structural equations admit a unique solution for some values of the exogenous variables while admitting multiple solutions for other values, Assumption D6(i) is necessarily satisfied.
For our example in Figure 2.3, the equilibria indices set changes at the boundary between A11 and A12, as well as at the boundary between A31 and A32. Imposing Assumption D6(ii) in that example covers, for instance, the case where the equilibrium selection rule assigns to equilibrium 1 a probability strictly between zero and one whenever the equilibria indices set contains equilibria 1, 2 and 3.

(iii) Suppose that Assumption D6(i) holds. Then the equilibrium selection rules that do not satisfy Assumption D6(ii) can only take values in a set of measure zero “around” (x, r(y) − g(x)) for all x ∈ X and y ∈ A¯sh ∩ A¯sh′. Precisely, let the equilibrium selection rules that do not satisfy Assumption D6(ii) form a collection CE(M). It must have the following property:

{(lim_{y′∈Ash, y′→y} λs(x, r(y′) − g(x)), lim_{y′∈Ash′, y′→y} λs(x, r(y′) − g(x))) : λ ∈ CE(M)}

has measure zero in

{(lim_{y′∈Ash, y′→y} λs(x, r(y′) − g(x)), lim_{y′∈Ash′, y′→y} λs(x, r(y′) − g(x))) : λ ∈ C(M)},

for all x ∈ X, y ∈ A¯sh ∩ A¯sh′ and all s, h, h′ satisfying A¯sh ∩ A¯sh′ ≠ ∅ and M(r(Ash)) ≠ M(r(Ash′)). (See Assumption D6(i).)

The following Proposition 2 establishes a testable implication for models that have multiple equilibria and satisfy Assumption D6.

Proposition 2. Suppose Assumptions D1-D4, D5(i) and D6 hold and the model generates multiple equilibria (i.e., the model falls into Category (C3)). Then for all x ∈ XD,

(i) The conditional density fY|X(y|x) has a jump at some y ∈ int(Y(x)). That is, there are disjoint open subsets C1, C2 ⊂ int(Y(x)) with C¯1 ∩ C¯2 ≠ ∅ and {yd} = C¯1 ∩ C¯2 satisfying

| lim_{y∈C1, y→yd} fY|X(y|x) − lim_{y∈C2, y→yd} fY|X(y|x) | = δ > 0.

(ii) If X is a continuous random vector, fY|X(y|x) has a jump at (x, y) for some x ∈ int(X), y ∈ int(Y(x)).

In the following, we consider a typical case of multiple equilibria (to be specified in Assumption D7).
That is, (i) the structural equations (2.1) admit a unique solution for some values of the exogenous variables and have multiple solutions for other values, and (ii) the equilibrium selection rule λ ∈ C0(M) in (2.11) (random selection, for example). We state the relations from multiple equilibria to the presence of a jump of the density fY|X(y|x) in y for models with such structural equations and equilibrium selection rules.

Assumption D7. If the model generates multiple equilibria, the following conditions are satisfied.

(i) Pr(g(X) + U ∈ S) > 0, where S = {v ∈ V : #(M(v)) = 1}.

(ii) The equilibrium selection rule λ ∈ C0(M) as defined in (2.11).

Assumption D7(i) means that the structural equations admit a unique solution for some values of the exogenous variables, while they generate multiple solutions for other values. Assumption D7(ii) means that the only way the unobservable u affects the equilibrium selection rule is through the equilibria indices set M(g(x) + u). In other words, given M(g(x) + u), the selection rule is purely random or depends only on the observable covariates x. Assumption D7(i) is satisfied in several models with multiple equilibria; see Dagsvik and Jovanovic (1994) and Echenique and Komunjer (2007). It is reasonable that when the exogenous term g(x) + u takes a very large (or small) value, its effect dominates the self-fulfilling mechanism that may lead to multiple equilibria, so that the exogenous variables uniquely determine the dependent variables. In contrast, when the exogenous term g(x) + u takes a moderate value, the self-fulfilling mechanism tends to dominate and may produce (for example) three equilibria: a large one, a medium one and a small one.

Assumption D7(ii) says that the equilibrium selection rule λ ∈ C0(M) as defined in (2.11). Echenique and Komunjer (2009) made an assumption very similar to Assumption D7(ii) when introducing their equilibrium selection
rule. [12] Under Assumption D7(ii), for all (x, u) ∈ W, we have λ(x, u) = φ(M(g(x) + u), x) = φ(M(r(y)), x) = φ(M(r(C)), x), where y ∈ C, r(y) = g(x) + u, and M(r(C)) is defined. In the following, whenever Assumption D7(ii) holds, we will use φm(M(r(C)), x) to denote the m-th component of the equilibrium selection rule.

In addition, let CE0(M) be the collection of equilibrium selection rules that satisfy Assumption D7(ii) but do not satisfy Assumption D6(ii) (i.e., that may not produce a jump in the conditional density when there are multiple equilibria). By Assumptions D6 and D7(i) and Remark 2, we must have that

{(φs(M(r(Ash)), x), φs(M(r(Ash′)), x)) : φ ∈ CE0(M)}

has measure zero in

{(φs(M(r(Ash)), x), φs(M(r(Ash′)), x)) : φ ∈ C0(M)},

for all x ∈ X, and all s, h, h′ such that A¯sh ∩ A¯sh′ ≠ ∅ and M(r(Ash)) ≠ M(r(Ash′)). The existence of such s, h, h′ is ensured by Assumption D6(i), or by the following Corollary 1, which states that Assumption D7(i) is sufficient for Assumption D6(i) under the technical conditions in Assumptions D5(ii) and (iii).

Corollary 1. Suppose Assumptions D1, D2, D5(ii) and (iii) hold. If the model (2.1) generates multiple equilibria, Assumption D7(i) is sufficient for Assumption D6(i).

Now we give sufficient conditions for Assumption D6(ii).

Assumption D8. Suppose the model generates multiple equilibria, and Assumption D7 holds. Then there exist an m ∈ {1, 2, ..., M} satisfying r(Am) ∩ S ≠ ∅ (S is defined in Assumption D7), and an XD ⊂ X with Pr(X ∈ XD) > 0, such that the following two conditions hold for all x ∈ XD.

(i) For all C ⊂ Am such that M(r(C)) is defined, the m-th component of the equilibrium selection rule satisfies

φm(M(r(C)), x) > 0.

(ii) For some C0 ⊂ Am such that M(r(C0)) is defined, we have

φm(M(r(C0)), x) < 1.

[12] They wrote: “For a given x, different realizations of u can affect the support of Pxu, but not the probabilities assigned to different outcomes in the support.”
Assumption D8 assumes that there is an Am ⊂ Y¯ such that one subset of Am is associated with the singleton equilibria indices set {m}, while another subset of Am is associated with a larger equilibria indices set (which also contains the element m). Furthermore, within Am, the probability of choosing equilibrium m is always positive, and is sometimes strictly less than one. Corollary 2 gives another version of Proposition 2 in which Assumption D6 is replaced by Assumptions D7 and D8.

Corollary 2. Suppose the model generates multiple equilibria, and Assumptions D1-D5, D7 and D8 hold. Then there is a set XD1 ⊂ X with Pr(X ∈ XD1) > 0 such that for all x ∈ XD1 the conclusions of Proposition 2 hold.

The example in Figure 2.3 satisfies Assumption D7(i). Recall that there is no X in that example. Thus the structural equation is r(y) = u and the equilibrium selection rule λ is a function of u. As a result, the example satisfies Assumption D7(ii) if its equilibrium selection rule satisfies

λ(r(y)) = φ(M(r(Amh))),

for all y ∈ Amh, all m and h ∈ Lm. Furthermore, it is not hard to see that Assumption D8 also holds in that example. Indeed, the m's satisfying r(Am) ∩ S ≠ ∅ are m = 1 and 3. Suppose Assumption D8 is violated, that is, φ1(M(r(A12))) and φ3(M(r(A31))) are each either 0 or 1. Then there are three cases (recall φ1(M(r(A12))) + φ3(M(r(A31))) ≤ 1, which means the two terms cannot both equal 1); however, all of them fall into Category (C2), which corresponds to a unique equilibrium. As a result, when there are multiple equilibria, at least one of φ1(M(r(A12))) and φ3(M(r(A31))) must be in the open interval (0, 1). Suppose 0 < φ1(M(r(A12))) < 1; the density fY(y) will then have a jump at {yd1} = A¯11 ∩ A¯12 (i.e., the boundary between A11 and A12).

The following Assumption D9 spells out an alternative set of sufficient conditions for Assumption D6. This time we do not assume that a component of the equilibrium selection rule is strictly less than 1.

Assumption D9.
Suppose that the model generates multiple equilibria and Assumption D7 holds. Then,

(i) For any v ∈ V, there exists an m ∈ {1, 2, ..., M} such that v ∈ r(Am) and r(Am) ∩ S ≠ ∅. (S is defined in Assumption D7.)

(ii) For any m satisfying r(Am) ∩ S ≠ ∅, we have φm(M(r(C)), x) > 0 for all C ⊂ Am with M(r(C)) defined, and for all x ∈ X.

Assumption D9(i) is imposed on the structural equations (2.1). It automatically holds when Y is a scalar and the structural equations have one solution for some values of the exogenous variables and three solutions for other values (like the example in Figure 2.3). Such examples can be found in Dagsvik and Jovanovic (1994) and Echenique and Komunjer (2007). Assumption D9(ii) is a stronger version of Assumption D8(i), as the former requires Assumption D8(i) to hold for all m such that r(Am) ∩ S ≠ ∅, and for all x. The following Corollary 3 states another version of Proposition 2 in which Assumption D6 is replaced by Assumptions D7 and D9.

Corollary 3. Suppose the model generates multiple equilibria, and Assumptions D1-D5, D7 and D9 hold. Then there is a set XD2 ⊂ X with Pr(X ∈ XD2) > 0 such that for all x ∈ XD2 the conclusions of Proposition 2 hold.

The example in Figure 2.3 satisfies Assumption D9(i). Due to the simplicity of the structural equation, Assumption D9(ii) is sufficient but not necessary for that example to produce a jump in the density when there are multiple equilibria.

Up to now, Propositions 1 and 2 (and Corollaries 2 and 3) have all focused on the conditional density fY|X(y|x). However, a jump in the unconditional density fY(y) is usually easier to detect. The next proposition shows that with the additional Assumption D10 below, the relation from uniqueness/multiplicity of equilibria to continuity/discontinuity of the conditional density fY|X(y|x) can be transferred to the unconditional density fY(y).

Assumption D10.
If the model generates a unique equilibrium, there exists a function $\psi: \cup_{m=1}^{M} B_m \to [0, 1]^M$ such that the equilibrium selection rule satisfies $\lambda(x, u) = \psi(g(x) + u)$ for all $(x, u) \in \mathcal{W}$.

When a model is in Category (C1), Assumption D10 holds trivially. Assumption D10 becomes a restriction for a model in (C2) by requiring that the equilibrium selection rule only depend on $g(x) + u$. Using the structural equations (2.1), we can re-write the restriction in Assumption D10 as $\lambda(x, u) = \psi(r(y))$. Assumption D10 is useful in its own right. Recall the example of the nonparametric BLP model in Section 2.2: the unique equilibrium assumption imposed by Berry and Haile (2014) corresponds to models with a unique equilibrium satisfying Assumption D10.

Proposition 3 below establishes the relations from the equilibrium behaviour of the model to continuity or discontinuity of the unconditional density of the dependent variables.

Proposition 3. Suppose Assumptions D1-D4 hold.
(i) Under the additional Assumption D10, if the model generates a unique equilibrium, the unconditional density $f_Y(y)$ is continuous over $y \in \mathrm{int}(\mathcal{Y})$.
(ii) Under additional Assumptions D5, D7 and D8 (D8 can be replaced by D9), if the model generates multiple equilibria, the unconditional density $f_Y(y)$ has a jump at some $y \in \mathrm{int}(\mathcal{Y})$. That is, there exist disjoint open subsets $C_1, C_2 \subset \mathrm{int}(\mathcal{Y})$ with $\bar{C}_1 \cap \bar{C}_2 \neq \emptyset$, and $y_d = \bar{C}_1 \cap \bar{C}_2$ satisfying
$$\left| \lim_{y \in C_1,\, y \to y_d} f_Y(y) - \lim_{y \in C_2,\, y \to y_d} f_Y(y) \right| = \delta > 0.$$

Remark 3.
(i) Assumption D10 is required for part (i) of Proposition 3 but not for part (ii).
(ii) An intuition for part (ii) of Proposition 3 is that under Assumption D7(i), a jump occurs at the boundary of two subsets of $\bar{\mathcal{Y}}$ such that one of them is associated with a singleton equilibria indices set, while the other is associated with an equilibria indices set containing more than one element. As a result, the signs of the jumps are the same over all $x$'s at which the conditional density $f_{Y|X}(y|x)$ has a jump in $y$. Hence the jump remains in the unconditional density after integrating out $x$ (weighted with its probability).
(iii) Without Assumption D10 (but with the other assumptions of Proposition 3 kept), if $f_Y(y)$ is continuous, the underlying model must have a unique equilibrium. On the other hand, if $f_Y(y)$ has a jump, we can further check whether $f_{Y|X}(y|x)$ has a jump in $y$ for some $x$ to determine whether the model is in (C2) or (C3).
(iv) Proposition 3 greatly facilitates the implementation of our test when the density jump is estimated nonparametrically. The dimension is reduced substantially by excluding the observable characteristics $X$.

In conclusion, we have established a testing criterion that translates testing for multiple equilibria into testing for the presence of a jump in the (conditional) density of the dependent variables.

2.3.3 An Extension: Jumps in the Latent Distribution

In this subsection, we consider an extension of the discontinuity criterion discussed so far. Recall that Assumption D4(ii) requires that the density function of the unobservable $U$ be twice continuously differentiable. As a result, if the conditional density $f_{Y|X}(y|x)$ exhibits a jump in $y$, it is due to a jump in the equilibrium selection rule $\lambda(x, u)$, which in turn indicates multiple equilibria. Here we relax Assumption D4(ii) to allow for jumps in the density function $f_U(u)$.

Assumption D4'. (i) The unobservable $U$ is independent of $X$.
(ii) The distribution of $U$ is absolutely continuous with respect to the Lebesgue measure and its density $f_U(u) = dF_U(u)/du$ is bounded and twice continuously differentiable for almost all $u \in \mathcal{U}$.

Assumption D4' allows for jump(s) in the density $f_U(u)$, provided the collection of jump locations has zero measure in $\mathcal{U}$.

Although a jump in $f_U(u)$ will produce a jump in $f_{Y|X}(y|x)$, the following proposition ensures that the jump in $f_{Y|X}(y|x)$ due to jump(s) in $f_U(u)$ will vanish after one integrates out $x$ with its weighted probability, if the distribution of $g(X)$ does not have a mass point.

Assumption D11. $\Pr(g(X) \in F_0) = 0$ for any $F_0 \subset \mathbb{R}^J$ with $\mu(F_0) = 0$ (where $\mu$ denotes the Lebesgue measure on $\mathbb{R}^J$).

Proposition 4. Suppose Assumptions D1-D3, D4', D10 and D11 hold. If the model has a unique equilibrium, the unconditional density $f_Y(y)$ is continuous on $y \in \mathrm{int}(\mathcal{Y})$.

The intuition of Proposition 4 is as follows. For an arbitrary $y_0$, the jumps in the latent density $f_U(u)$ can produce a jump in the conditional density $f_{Y|X}(y|x)$ at $y_0$ only for $x \in \mathcal{X}_{JL} = \{x : f_U(u) \text{ has a jump at } u = r(y_0) - g(x)\}$. Since $f_U$ has jump(s) on a zero measure set and Assumption D11 holds, we have $\Pr(X \in \mathcal{X}_{JL}) = 0$. Therefore, the jump of the conditional density at $y_0$ will disappear in the unconditional density of $Y$ after we integrate out $x$ with its weighted probability. Propositions 3 and 4 together imply that the presence of a jump in the unconditional density $f_Y(y)$ is able to distinguish a jump due to multiple equilibria from one due to jump(s) in the latent density $f_U(u)$.

2.4 A Test Statistic

This section proposes a nonparametric test for multiple equilibria via testing for the presence of a jump in the density function of the dependent variables. To focus on the main problem, we assume that the assumptions for Proposition 3 hold, so that it suffices to consider the unconditional density function $f_Y(y)$. Thus in this section, we can suppress the subscript $Y$ in $f_Y(y)$ without causing confusion.
Our testing procedure consists of three steps. Firstly, at a fixed $y$, we compute the local jump of the density. Secondly, we maximize those local jumps over an appropriate set of $y$. This gives our test statistic. Lastly, we compute the critical value by a Gaussian multiplier bootstrap. The way we compute the local jump at a fixed $y$ resembles that in Qiu (2002), which focused on estimating the locations of discontinuities in a bivariate regression. The method to compute the critical value is adapted from Chernozhukov, Chetverikov and Kato (2013). We establish the asymptotic properties of our test in Propositions 5 and 6. The technical conditions SL1-SL5 for Propositions 5 and 6 are stated in Appendix A.2. The proofs for Propositions 5 and 6 are collected in Appendix A.3.

2.4.1 Construction of the Test

Assumption S1. $\{Y_i : i = 1, 2, \ldots, n\}$ is an i.i.d. sample generated by the structural equations (2.1) and an equilibrium selection rule.

Let $\mathcal{J} = \{1, 2, \ldots, J\}$. By Proposition 3, our testing problem is
$$H_0: f(y) \text{ is continuous over } y \in \mathrm{int}(\mathcal{Y}),$$
$$H_1: f(y) \text{ has at least one jump within } y \in \mathrm{int}(\mathcal{Y}).$$
Recall from Remark 1 that continuity at $y$ means that the limits from all directions coincide. By Assumptions D2, D4 and Lemma 2, the density $f(y)$ is twice continuously differentiable for $y \in \mathrm{int}(\mathcal{Y})$ except at the jump(s), if they exist. Let $y_j$ denote the $j$th component of $y$ and $y_{-j}$ denote the components of $y$ other than the $j$th, and let $f(y_j^+, y_{-j}) = \lim_{\varepsilon \downarrow 0} f(y_j + \varepsilon, y_{-j})$ and $f(y_j^-, y_{-j}) = \lim_{\varepsilon \downarrow 0} f(y_j - \varepsilon, y_{-j})$. Using this notation, at a fixed $y$, the local density jump along the direction of the $j$th coordinate is
$$\Delta^{(j)}(y) = f(y_j^+, y_{-j}) - f(y_j^-, y_{-j}).$$
By Proposition 3, if the model has a unique equilibrium, $\Delta^{(j)}(y) = 0$ for all $j \in \mathcal{J}$ and $y \in \mathrm{int}(\mathcal{Y})$. On the other hand, if the model has multiple
We construct a kernel estimator13 of ∆(j)(y),∆(j)n (y) = fˆ(y+j , y−j)− fˆ(y−j , y−j),where fˆ(y+j , y−j) and fˆ(y−j , y−j) are constructed using one-sided kernelsdefined as follows.For j = 1, ..., J , let K+j (vj , v−j) and K−j (vj , v−j) be two non-negativefunctions which satisfy the following conditions:(i) The support of K+j (vj , v−j) is [0, 1]× [−1/2, 1/2]J−1 and the supportof K−j (vj , v−j) is [−1, 0]× [−1/2, 1/2]J−1.(ii)∫ 1−1∫ 1−1 ...∫ 1−1K+j (vj , v−j)dv = 1 and∫ 1−1∫ 1−1 ...∫ 1−1K−j (vj , vj)dv =1.A typical example is a product kernel. That is, for j = 1, ..., J ,K+j (vj , v−j) = K+(vj)×∏k 6=jK(vk),K−j (vj , v−j) = K−(vj)×∏k 6=jK(vk), (2.13)where K+ : [0, 1] → R+ ∪ {0} and K− : [−1, 0] → R+ ∪ {0} are one-sidedkernels functions while K : [−1/2, 1/2] → R+ ∪ {0} is a usual two-sidedkernel function.Using one-sided kernel functions, we construct estimators of the limitsof the density at y and along the jth coordinate asfˆ(y+j , y−j) =1npnhJ−1nn∑i=1K+j(Yi,j − yjpn,Yi,−j − y−jhn),fˆ(y−j , y−j) =1npnhJ−1nn∑i=1K−j(Yi,j − yjpn,Yi,−j − y−jhn). (2.14)where Yi,j is the jth component of Yi and Yi,−j is the components of Yiother than the jth. pn and hn are bandwidths of the kernel estimator. In13Under H0, f(y) is twice continuously differentiable, hence the kernel estimator of∆(j)(y) is consistent for any y and j, under standard conditions. Under H1, f(y) isnot continuous at some y. However, for those y, as long as f(y) is twice continuouslydifferentiable over a neighbourhood of (yj + ε, y−j) and (yj − ε, y−j) for sufficiently smallε, the one-sided kernel estimator of ∆(j)(y) remains consistent. Otherwise, the estimatormay not be consistent. (for example, if a bivariate density function f(y1, y2) has a jumpat y1 = c, f(c, y2) is not continuous for all y2). However, this does not affect the level ofour test. Moreover, the power of our test will not be reduced if supj∈J ,y∈int(Y) ∆(j)(y) isnot underestimated.322.4. 
addition, define a function $\varphi_n^{(j)}: \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}$ as follows:
$$\varphi_n^{(j)}(t, y) = \frac{1}{p_n h_n^{J-1}} \left[ K_j^+\!\left(\frac{t_j - y_j}{p_n}, \frac{t_{-j} - y_{-j}}{h_n}\right) - K_j^-\!\left(\frac{t_j - y_j}{p_n}, \frac{t_{-j} - y_{-j}}{h_n}\right) \right], \qquad (2.15)$$
where $t_j$ denotes the $j$th component of $t$ and $t_{-j}$ denotes the components of $t$ other than the $j$th; the same subscript convention is used for $y$. Therefore, we have
$$\Delta_n^{(j)}(y) = \frac{1}{n} \sum_{i=1}^{n} \varphi_n^{(j)}(Y_i, y).$$
The variance of $\sqrt{n p_n h_n^{J-1}}\, \Delta_n^{(j)}(y)$ is
$$\sigma_{j,n}^2(y) = p_n h_n^{J-1}\, \mathrm{E}\left[ \varphi_n^{(j)}(Y_i, y) - \mathrm{E}[\varphi_n^{(j)}(Y_i, y)] \right]^2, \qquad (2.16)$$
and an estimator of the variance above is
$$\hat{\sigma}_{j,n}^2(y) = \frac{p_n h_n^{J-1}}{n} \sum_{i=1}^{n} \left( \varphi_n^{(j)}(Y_i, y) - \Delta_n^{(j)}(y) \right)^2. \qquad (2.17)$$
Therefore, the normalized local jump at the fixed point $y$ and along the fixed direction parallel to the $j$th coordinate can be computed as
$$\left| \sqrt{n p_n h_n^{J-1}}\, \Delta_n^{(j)}(y) \big/ \hat{\sigma}_{j,n}(y) \right|.$$

Then we consider maximizing the local jumps across the $J$ coordinates and over an appropriate subset of the support of $Y$. We have to exclude points near the boundary of the support $\mathcal{Y}$ because they tend to distort our test: the local jump may appear large at a point near the boundary of the support $\mathcal{Y}$ simply because few sample points are available on one side. In the following, we construct two maximizing sets $\hat{\mathcal{Y}}_{n,\kappa_1}$ and $\hat{\mathcal{Y}}_{n,\kappa_2}$. Firstly, for each $y$, we fix a coordinate $j$ and compute the proportion of data points whose $j$th component lies between $y_j - \hat{w}_j$ and $y_j$, conditional on the value of the components other than the $j$th. We denote this proportion by $\hat{P}_j^-(y)$. Here the distance $\hat{w}_j$ is given as $1/H$ of the distance between the largest and the smallest $j$th components of the $Y_i$. Similarly, we compute $\hat{P}_j^+(y)$ as the proportion of data points whose $j$th component lies between $y_j$ and $y_j + \hat{w}_j$, conditional on the value of the components other than the $j$th. Then we construct $\hat{\mathcal{Y}}_{n,\kappa_1}$ as the collection of those $y$'s for which the ratios of $\hat{P}_j^-(y)$ and $\hat{P}_j^+(y)$ are neither very large nor very small for all coordinates. The cutoff
The data dependent set Yˆn,κ1 serves as a maximizing set for thetest statistic while the set Yˆn,κ2 is for the critical value. We set κ2 slightlysmaller than κ1 to facilitate the investigation of asymptotic properties. If thedifference between κ1 and κ2 is small, the power loss due to the discrepancybetween κ1 and κ2 will also be small. In the simulation studies below, wetry (κ1, κ2) = (0.20, 0.19) and (0.25, 0.24). Note that our construction ofPˆ−j (y) and Pˆ+j (y) is conditional on the value of components other than thejth. This conditioning is necessary when the support of Y is not a rectangle,which is common in the models we consider. Ideally, the choice of H shoulddepend on the shape of the support of Y . For example, when the supporthas a hole (zero density) in the middle, it is better to choose bigger H sothat the data points near the boundary of such a hole will be excluded.In our simulation studies below, we set H = 3. As described so far, themaximizing sets are constructed as follows.Yˆn,κ1 = {y ∈ Y : Pˆ−j (y)/Pˆj(y) > κ1, Pˆ+j (y)/Pˆj(y) > κ1 all j},Yˆn,κ2 = {y ∈ Y : Pˆ−j (y)/Pˆj(y) > κ2, Pˆ+j (y)/Pˆj(y) > κ2 all j},where κ2 < κ1 are two constants, andPˆ−j (y) =1nn∑i=11{yj − wˆj ≤ Yi,j < yj , |Yi,k − yk| ≤ hn/2, all k 6= j},Pˆ+j (y) =1nn∑i=11{yj ≤ Yi,j ≤ yj + wˆj , |Yi,k − yk| ≤ hn/2, all k 6= j},Pˆj(y) = Pˆ−j (y) + Pˆ+j (y), wˆj = (maxiYi,j −miniYi,j)/H, (2.18)for all j ∈ J .The next step is to take the supremum of∣∣∣∣√npnhJ−1n ∆(j)n (y)/σˆj,n(y)∣∣∣∣across coordinates and over y ∈ Yˆj,κ1 . This gives our test statistic teststatistic ∆n.∆n = supj∈J ,y∈Yˆn,κ1∣∣∣∣∣∣√npnhJ−1n ∆(j)n (y)σˆj,n(y)∣∣∣∣∣∣. (2.19)The rejection rule isReject H0 when ∆n ≥ critical value c˜S1,n.342.4. 
The process $\left\{ \sqrt{n p_n h_n^{J-1}} \left( \Delta_n^{(j)}(y) - \mathrm{E}[\Delta_n^{(j)}(y)] \right) \big/ \hat{\sigma}_{j,n}(y),\ y \in \mathcal{Y} \right\}$ does not converge weakly to a Gaussian process, therefore the calculation of the critical value is not standard.$^{14}$ Here we use the supremum of the Gaussian multiplier bootstrap proposed by Chernozhukov, Chetverikov and Kato (2013) to simulate critical values. The method consists of three steps.

First, draw a sequence of standard normal random variables $\{\xi_1, \xi_2, \ldots, \xi_n\}$ independent of $\{Y_i\}_{i=1}^n$ and calculate
$$\hat{G}_{n,j}(y) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \xi_i \frac{\sqrt{p_n h_n^{J-1}} \left[ \varphi_n^{(j)}(Y_i, y) - \Delta_n^{(j)}(y) \right]}{\hat{\sigma}_{j,n}(y)}. \qquad (2.20)$$
Then, repeat the first step $R$ times and compute $\hat{c}_{S1,n}(\alpha)$ as the conditional $(1-\alpha)$th empirical quantile of $\sup_{j \in \mathcal{J},\, y \in \hat{\mathcal{Y}}_{n,\kappa_2}} |\hat{G}_{n,j}(y)|$.
Lastly, set the critical value as
$$\tilde{c}_{S1,n} = \hat{c}_{S1,n}(\alpha) + C \hat{c}_{1,n}(\gamma_n), \qquad (2.21)$$
where $1 \geq \gamma_n \to 0$ and $\hat{c}_{1,n}(\gamma_n)$ is the $(1-\gamma_n)$th empirical quantile of $\sup_{j \in \mathcal{J},\, y \in \hat{\mathcal{Y}}_{n,\kappa_2}} |\hat{G}_{n,j}(y)|$.

The constant $C$ satisfies $C \geq \bar{M} C_4 / ((1 - \epsilon_n) c_2)$. Details of the parameters $(\gamma_n, \bar{M}, C_4, \epsilon_n, c_2)$ will be specified in Conditions SL2-SL4 (in Appendix A.2), Proposition 5 and Condition SH3 (in Appendix A.3). However, under Conditions SL2, SL4 and SL5 (in Appendix A.2), any positive value of $C$ (however small) will give a correct level for our test under the null hypothesis of continuity. Therefore one can set $C = 0$ in practice. The validity of the Gaussian multiplier bootstrap does not depend on the weak convergence of the underlying empirical process; instead it comes from two coupling inequalities (see Theorems 3.1 and 3.2 of Chernozhukov, Chetverikov and Kato (2013)).

$^{14}$ In this occasion, the problem of computing the critical value is related to constructing confidence bands for kernel density estimators, which has been studied by Bickel and Rosenblatt (1973), Giné and Nickl (2010), and Chernozhukov, Chetverikov and Kato (2013), among others.
When $J = 1$, Chu and Cheng (1996) applied the result of Bickel and Rosenblatt (1973) to derive the limit of $\Pr\left(\sup |\Delta_n(y)| < a_n + b_n z\right)$ as $\exp(-2\exp(-z))$ for carefully chosen $a_n$ (diverging) and $b_n$ (converging to 0).

2.4.2 Asymptotic Properties of the Test

In this section, we describe the asymptotic performance of our test, where the statistic is constructed as in (2.19) and the critical value is computed by (2.21). The main results are stated in Propositions 5 and 6. To obtain those results, we impose five technical conditions, namely Conditions SL1 to SL5, among which Conditions SL1 to SL4 are adapted from Chernozhukov, Chetverikov and Kato (2013), and Condition SL5 is an under-smoothing condition imposed on the bandwidths $p_n$ and $h_n$ in (2.14). All these technical conditions are stated in Appendix A.2. Here we briefly describe them. Recalling the function $\varphi_n^{(j)}$ defined in (2.15), Condition SL1 requires that the following class of functions
$$\Delta\mathcal{K}_{n,j} = \left\{ \frac{\sqrt{p_n h_n^{J-1}}\, \varphi_n^{(j)}(\cdot, y)}{\sigma_{n,j}(y)} : y \in \mathcal{Y}_{\kappa_2} \right\}$$
be bounded, and that its $L_2$ covering number have a specified bound (i.e., it is a VC (Vapnik-Cervonenkis) class in the sense of Chernozhukov, Chetverikov and Kato (2013), or an Euclidean class in the sense of Pakes and Pollard (1989)). Note that here the set $\mathcal{Y}_{\kappa_2}$ is the population version of the maximizing set $\hat{\mathcal{Y}}_{n,\kappa_2}$; the set $\mathcal{Y}_{\kappa_2}$ is specified at the beginning of Appendix A.2. Condition SL2 imposes bounds on the variance given by (2.16). If one uses the product kernel in (2.13), Conditions SL1 and SL2 are satisfied if the kernel functions $K(\cdot)$, $K^+(\cdot)$ and $K^-(\cdot)$ have a compact support and a bounded variation. Condition SL3 bounds the bias of $\Delta_n^{(j)}(y)$. Conditions SL4 and SL5 impose conditions on the bandwidths $p_n$, $h_n$. In this chapter, we let the bandwidths be $p_n = p_0 \times n^{-\gamma_p}$ and $h_n = h_0 \times n^{-\gamma_h}$. Condition SL5 requires
Condition SL5 requiresthe rate γp and γh to satisfy γp + (J − 1)γh < 1, γp + (J + 1)γh > 1,3γp + (J − 1)γh > 1, which is an under-smoothing condition.The validity of the critical value can be described heuristically asfollows. After controlling the difference between σj,n(y) and σˆj,n(y), thedesired critical value is approximately the (1 − α)th quantile of thedistribution of supj,y∈Yˆn,κ1|Zj,n(y)| whereZj,n(y) =√npnhJ−1n(∆(j)n −E[∆(j)n (y)])σj,n(y).Because κ2 < κ1, the (1 − α)th quantile of the distribution ofsupj∈J ,y∈Yˆn,κ1|Zj,n(y)| will be asymptotically bounded by the (1 − α)th362.4. A Test Statisticquantile of the distribution of supj∈J ,y∈Yκ2 |Zj,n(y)|. The random variablesupj∈J ,y∈Yκ2 |Zj,n(y)|, which is the supremum of an empirical processindexed by a VC class, can be approximated by a random variable withthe same distribution as supj∈J ,y∈Yκ2 |Gj,n(y)| , where{Gj,n(y) : j ∈ J , y ∈ Yκ2} is a Gaussian process with mean zero andvariance one (Theorem 3.1 of Chernozhukov et al. (2013)). On the otherhand, the supremum of the Gaussian multiplier process in (2.20) can beapproximated by a random variable with the same distribution as therandom variable supj∈J ,y∈Yκ2 |Gj,n(y)| (Theorem 3.2 of Chernozhukov etal. (2013)). Combining both approximations, the (1− α)th quantile of thedistribution of supj∈J ,y∈Yκ2 |Zj,n(y)| can be approximated by the (1− α)thquantile of the supremum of the Gaussian multiplier process.Proposition 5 and 6 below establish the asymptotic behaviour of ourtest.Proposition 5. Let the test statistic ∆n be constructed as in (2.19) andthe critical value c˜S1,n be computed as in (2.21). Suppose Assumption S1 andConditions SL1-SL5 hold. Then, under H0,lim infn→∞Pr(∆n ≤ c˜S1,n)≥ 1− α.The next proposition shows that our test has a power converging toone for any fixed alternative in which a jump occurs not too close to theboundary of the support. 
Formally, we assume that a jump occurs within the set $\mathcal{Y}_{n,L,c}$ defined as follows:
$$\mathcal{Y}_{n,L,c} = \left\{ y \in \mathcal{Y} : P_j^-(y)/P_j(y) > \kappa_1 + c n^{-d},\ P_j^+(y)/P_j(y) > \kappa_1 + c n^{-d} \text{ for all } j \in \mathcal{J} \right\},$$
where $P_j^+(y)$, $P_j^-(y)$ and $P_j(y)$ are the population versions of $\hat{P}_j^+(y)$, $\hat{P}_j^-(y)$ and $\hat{P}_j(y)$ in (2.18) and are specified at the beginning of Appendix A.2, $c$ is some positive constant, and the restriction on the rate $d$ will be specified in Proposition 6.

Proposition 6. Let the test statistic $\Delta_n$ be constructed as in (2.19) and the critical value $\tilde{c}_{S1,n}$ be computed as in (2.21). Suppose Assumption S1 and Conditions SL1-SL5 hold, and the following conditions are satisfied.
(i) The maximal jump in $\mathcal{Y}_{n,L,c}$ is positive, that is, $\sup_{j \in \mathcal{J},\, y \in \mathcal{Y}_{n,L,c}} \left|\Delta^{(j)}(y)\right| = \delta > 0$.
(ii) Either $n p_n h_n^{J-1} \gamma_n / \log n \to \infty$, or $C = 0$ and $n p_n h_n^{J-1} / \log n \to \infty$. ($\gamma_n$ appears in (2.21).)
(iii) The rate $d$ in $\mathcal{Y}_{n,L,c}$ satisfies $0 < d < \min\left(\gamma_h, \gamma_p, (1 - \gamma_p - (J-1)\gamma_h)/2\right)$.
Then
$$\liminf_{n \to \infty} \Pr\left( \Delta_n > \tilde{c}_{S1,n} \right) = 1.$$

In this section, we computed the local jump of the density function by checking the jump along each coordinate. A generalization would be to compute the jump along all directions. This can be achieved by using the rotational kernel functions introduced by Qiu (1997); rotational kernel functions for higher dimensions can be constructed from spherical coordinate systems.$^{15}$ A test based on local jumps that take into account all directions tends to have greater power; however, it is computationally more intensive. We do not go into those details in this chapter.

2.5 Monte Carlo Simulations

2.5.1 Data Generating Processes (DGP)

The DGPs used in the Monte Carlo simulation are based on Echenique and Komunjer (2007). It is a simplified BLP model, but the consumer's preference has a heteroskedastic idiosyncratic term depending on prices. Assume there are two products $j = 1, 2$ produced by two firms. The utility of a consumer is
$$v_{ijt} = -\alpha p_{jt} + g(p_{jt} p_{-jt})\, \varepsilon_{jit}. \qquad (2.22)$$
Note that there is no product-specific unobservable. The component $\varepsilon_{jit}$ in the idiosyncratic term is i.i.d.
across consumers and markets, and has a Gumbel distribution with density $f(t) = \exp(-t - \exp(-t))$. The variance of the idiosyncratic term, however, depends on $p$ through $g(p_j p_{-j})$. Let $g(x) = \rho + \exp(-\tau/x)$. The marginal cost is assumed to be
$$c_{jt} = e^{\gamma} \omega_{jt}, \qquad (2.23)$$
where $\omega_{jt}$ is a product-level unobservable in the cost. The first-order conditions yield the following structural equations in the form of (2.1):
$$\ln\!\left( p_{1t} - \frac{F(\delta_{1t})\, g(p_{1t}p_{2t})}{f(\delta_{1t})\, \alpha} \left[ 1 + \frac{p_{1t}p_{2t}\, g'(p_{1t}p_{2t})\, (p_{2t} - p_{1t})}{g(p_{1t}p_{2t})\, p_{1t}} \right]^{-1} \right) = \gamma + \ln \omega_{1t},$$
$$\ln\!\left( p_{2t} - \frac{F(\delta_{2t})\, g(p_{1t}p_{2t})}{f(\delta_{2t})\, \alpha} \left[ 1 + \frac{p_{1t}p_{2t}\, g'(p_{1t}p_{2t})\, (p_{1t} - p_{2t})}{g(p_{1t}p_{2t})\, p_{2t}} \right]^{-1} \right) = \gamma + \ln \omega_{2t}, \qquad (2.24)$$
where $\delta_{jt} = \alpha(p_{-jt} - p_{jt})/g(p_{jt}p_{-jt})$ for $j = 1, 2$, and an equilibrium in market $t$ is $(p_{1t}, p_{2t})$. The function $F(t) = 1/(1 + \exp(-t))$ is the CDF of the logistic distribution. The number of solutions to equations (2.24) depends on the values of the parameters $(\alpha, \gamma, \rho, \tau)$ and $(\omega_1, \omega_2)$. Table 2.2 and Figure 2.4 show the cases of a unique solution and of three solutions when fixing $(\omega_1, \omega_2)$ at $(1, 1)$. The values of the parameters $(\alpha, \gamma, \rho, \tau)$ follow those in Echenique and Komunjer (2007). In Table 2.2, the left column computes the three equilibrium prices for the structural equations (2.24), with parameters as specified in the table and $(\omega_1, \omega_2) = (1, 1)$. The right column, on the other hand, computes the unique equilibrium price for the structural equations (2.24), with parameters as specified in the table and $(\omega_1, \omega_2) = (1, 1)$. In Figure 2.4, the upper graph plots the case of multiple solutions while the lower graph plots the case of a unique solution. In each graph, the blue line is the equilibria characterizing function, derived as follows: we first solve the best response function for $p_2$ in terms of $p_1$ from the second equation of (2.24), and then plug the best response function into the first equation of (2.24).

$^{15}$ The spherical coordinate system for dimensions larger than two can be found in Blumenson (1960).
This way we obtain an equation characterizing the equilibrium $p_1$, and then write it as a function set equal to zero. Such a function is the equilibria characterizing function plotted in Figure 2.4. The red line is the zero horizontal line. The intersection of the red and blue lines gives the equilibrium $p_1$. Since equations (2.24) are symmetric when $(\omega_1, \omega_2) = (1, 1)$, we have $p_2 = p_1$ in every equilibrium.

In the Monte Carlo study, we consider the following four DGPs:

DGP0: $\alpha = 0.3841$, $\gamma = 0.0923$, $\rho = 0.1755$, $\tau = 11.0009$, $\omega_1 \sim U[0.8, 1]$, $\omega_2 \sim U[0.6, 2.5]$.
DGP1, 2, 3: $\alpha = 0.2464$, $\gamma = 0.0776$, $\rho = 0.1074$, $\tau = 12.9913$, $\omega_1 \sim U[0.8, 1]$, $\omega_2 \sim U[0.6, 1.5]$.

[Figure 2.4: Characterization of equilibria. Upper panel: $\alpha = 0.2464$, $\gamma = 0.0776$, $\rho = 0.1074$, $\tau = 12.9913$ (multiple solutions); lower panel: $\alpha = 0.3841$, $\gamma = 0.0923$, $\rho = 0.1755$, $\tau = 11.0009$ (unique solution). Each panel plots the equilibria characterizing function against the equilibrium price $p_2$.]

Table 2.2: Equilibrium prices

$\alpha = 0.2464$, $\gamma = 0.0776$, $\rho = 0.1074$, $\tau = 12.9913$: $(p_1, p_2) = (1.753, 1.753)$, $(3.098, 3.098)$, $(4.843, 4.843)$.
$\alpha = 0.3841$, $\gamma = 0.0923$, $\rho = 0.1755$, $\tau = 11.0009$: $(p_1, p_2) = (1.853, 1.853)$; this is the unique equilibrium.

The support of $\omega_2$ is larger in DGP0 so that the generated data $(P_1, P_2)$ have a range comparable with the other three. For DGP0, equations (2.24) always admit a unique solution. On the other hand, DGP1, DGP2 and DGP3 generate three solutions with a probability of around 0.46 and a unique solution with a probability of around 0.54. The difference among DGP1, DGP2 and DGP3 lies in the equilibrium selection rule.
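The mechanism by which such selection rules translate into a density jump can be sketched in a one-dimensional toy model of our own (much simpler than the BLP-type DGPs above): $r(y) = y^3 - 3y = u$ has three solutions for $|u| < 2$ and one otherwise, and a rule that randomizes between the smallest and largest root plays the role of the selection rules described below. All numbers here (the uniform support of $u$, the $1/4$-$3/4$ split, the boundary $y = -2$) are assumptions of this illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
u = rng.uniform(-3.0, 3.0, size=30_000)   # latent shock with a smooth density

def solve_and_select(u_i, rng):
    """Solve y^3 - 3y = u_i and apply a stochastic equilibrium selection rule."""
    roots = np.roots([1.0, 0.0, -3.0, -u_i])
    real = np.sort(roots[np.abs(roots.imag) < 1e-7].real)
    if len(real) == 1:                    # unique equilibrium
        return real[0]
    # multiple equilibria: smallest root w.p. 1/4, largest w.p. 3/4
    return real[0] if rng.random() < 0.25 else real[-1]

y = np.array([solve_and_select(ui, rng) for ui in u])

# The multi-root region maps the smallest root into (-2, -1], where it is
# chosen with probability only 1/4, while for u < -2 the unique solution
# falls below -2 and is chosen with probability 1. Hence the density of Y
# jumps at y = -2: the mass just left of -2 is several times the mass just
# right of it.
mass_left = np.mean((y > -2.1) & (y <= -2.0))
mass_right = np.mean((y > -2.0) & (y <= -1.9))
```

The same histogram comparison applied to a model with a single equilibrium (selection probability identically one on each branch) would show no such discontinuity, which is exactly the testable implication of Proposition 3.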
We assume that an equilibrium is randomly selected given the equilibria indices set. When the equilibria indices set contains equilibria 1, 2 and 3, the equilibrium selection rule of DGP1 always chooses equilibrium 3; that is, DGP1 falls into Category (C2) and the model has a unique equilibrium. The equilibrium selection rule of DGP2 chooses equilibria 1 and 3 with probabilities 1/4 and 3/4, respectively. The equilibrium selection rule of DGP3 assigns equal probabilities to equilibria 1, 2 and 3. As a result, DGP2 and DGP3 fall into Category (C3) and produce multiple equilibria. The following table summarizes the equilibrium selection rules.

DGP                          0      1           2               3
# of solutions to (2.24)     1      1 or 3      1 or 3          1 or 3
Equilibrium selection rule   [1]    [0, 0, 1]   [1/4, 0, 3/4]   [1/3, 1/3, 1/3]
# of realized equilibria     1      1           1 or 2          1 or 3

In each simulation, the data is an i.i.d. sample of $(P_1, P_2)$. Figure 2.5 plots four samples generated by DGP0 to DGP3. Note that the data with a unique equilibrium may have a connected support (like DGP0), or may not (like DGP1). Similarly, the data with multiple equilibria may (like DGP3) or may not (like DGP2) have a connected support.

[Figure 2.5: Data generated by DGP0 (upper left), DGP1 (upper right), DGP2 (lower left) and DGP3 (lower right); each panel is a scatter plot of $P_2$ against $P_1$.]

Figure 2.6 magnifies the data generated by DGP2 and DGP3 near the locations of the jump. We clearly see that there is a line separating the graph into a dense part and a sparse part. Such a line is the jump location curve, which is the collection of loci where jumps in the density occur.

2.5.2 Performance of the Test

The data is the prices $\{P_1, P_2\}$.
We compute the test statistic $\Delta_n$ as in (2.19) and the critical value $\tilde{c}_{S1,n}$ as in (2.21), with $C = 0$. The number of simulations is $R = 500$. The number of bootstrap draws used to compute the critical value is $B = 199$. Condition SL5 implies that the bandwidths $p_n = p_0 \times n^{-\gamma_p}$ and $h_n = h_0 \times n^{-\gamma_h}$ must satisfy (with $J = 2$)
$$\gamma_p + \gamma_h < 1, \qquad 3\gamma_p + \gamma_h > 1, \qquad \gamma_p + 3\gamma_h > 1.$$
Here we use $\gamma_p = \gamma_h = 0.3$. In the construction of $\Delta_n^{(j)}(y)$, $j = 1, 2$, we allow $p_0$ and $h_0$ to be data dependent by letting $p_0 = p_c \times \hat{\sigma}_j$ and $h_0 = h_c \times \hat{\sigma}_{-j}$, where $\hat{\sigma}_j$ and $\hat{\sigma}_{-j}$ are the standard deviations of $P_j$ and $P_{-j}$, respectively, $j = 1, 2$. We use the following one-sided Epanechnikov kernels:$^{16}$
$$K_1^+(u_1, u_2) = \tfrac{12}{11}(1 - u_2^2)\, 1\{|u_2| \leq 0.5\} \cdot \tfrac{12}{11}(1 - (u_1 - 0.5)^2)\, 1\{0 \leq u_1 \leq 1\},$$
$$K_1^-(u_1, u_2) = \tfrac{12}{11}(1 - u_2^2)\, 1\{|u_2| \leq 0.5\} \cdot \tfrac{12}{11}(1 - (u_1 + 0.5)^2)\, 1\{-1 \leq u_1 \leq 0\},$$
$$K_2^+(u_1, u_2) = \tfrac{12}{11}(1 - u_1^2)\, 1\{|u_1| \leq 0.5\} \cdot \tfrac{12}{11}(1 - (u_2 - 0.5)^2)\, 1\{0 \leq u_2 \leq 1\},$$
$$K_2^-(u_1, u_2) = \tfrac{12}{11}(1 - u_1^2)\, 1\{|u_1| \leq 0.5\} \cdot \tfrac{12}{11}(1 - (u_2 + 0.5)^2)\, 1\{-1 \leq u_2 \leq 0\}.$$
Table 2.3 reports the rejection frequencies of our test under the different DGPs with parameters $(p_c, h_c) = (0.25, 0.25)$, $(\kappa_1, \kappa_2) = (0.20, 0.19)$ and $H = 3$. The first two rows show the level of our test under DGP0 and DGP1. The last two rows show the power of the test under DGP2 and DGP3. The reported rejection frequencies suggest that the test has a desirable level when there is a unique equilibrium (DGP0 and DGP1) and also exhibits reasonable power to detect multiple equilibria in models generated by DGP2 and DGP3. Table 2.4 shows the rejection frequencies for different choices of $(p_c, h_c)$, when $(\kappa_1, \kappa_2) = (0.20, 0.19)$ and the sample size is $N = 2000$. Table 2.5 uses a different $(\kappa_1, \kappa_2) = (0.25, 0.24)$. Overall, our test performs reasonably well under the specified set of tuning parameters.

$^{16}$ Qiu (2002) uses them for a local discontinuity test.
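Each univariate factor of these product kernels is normalized so that the kernels integrate to one over their supports, as condition (ii) of Section 2.4.1 requires; a quick midpoint-rule check of ours:

```python
import numpy as np

# Midpoint-rule verification that the Epanechnikov factors integrate to one:
# the one-sided factor (12/11)(1 - (u - 1/2)^2) on [0, 1] and the two-sided
# factor (12/11)(1 - u^2) on [-1/2, 1/2].
m = 100_000
u_plus = (np.arange(m) + 0.5) / m        # midpoints of [0, 1]
u_two = u_plus - 0.5                     # midpoints of [-1/2, 1/2]

int_one_sided = np.mean((12 / 11) * (1 - (u_plus - 0.5) ** 2))
int_two_sided = np.mean((12 / 11) * (1 - u_two ** 2))
# both integrals equal 1 up to O(m^-2) quadrature error
```

Because each factor integrates to one, the two-dimensional product kernels $K_j^{\pm}$ automatically satisfy the normalization in condition (ii).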
[Figure 2.6: Jump location curves. Upper panel: a subsample from DGP2; lower panel: a subsample from DGP3; each panel plots $P_2$ against $P_1$ near the jump location.]

Table 2.3: Rejection frequencies of the test

sample size              1500               2000
Unique equilibrium
nominal $\alpha$     DGP0     DGP1       DGP0     DGP1
0.05                 0        0.016      0        0.032
0.10                 0.002    0.084      0.008    0.072
Multiple equilibria
                     DGP2     DGP3       DGP2     DGP3
0.05                 0.502    0.366      0.602    0.414
0.10                 0.632    0.492      0.732    0.552

Table 2.4: Rejection frequencies under different bandwidths (entries are (DGP0, DGP1) or (DGP2, DGP3) pairs)

$(p_c, h_c)$           (0.15, 0.6)     (0.2, 0.45)      (0.3, 0.15)    (0.35, 0.1)    (0.5, 0.06)
Unique equilibrium (DGP0, DGP1)
0.05                   (0, 0.056)      (0.006, 0.062)   (0, 0.068)     (0, 0.044)     (0, 0.064)
0.10                   (0.08, 0.092)   (0.013, 0.126)   (0, 0.136)     (0, 0.102)     (0, 0.142)
Multiple equilibria (DGP2, DGP3)
0.05                   (0.236, 0.194)  (0.598, 0.402)   (0.428, 0.344) (0.292, 0.248) (0.354, 0.226)
0.10                   (0.344, 0.296)  (0.704, 0.548)   (0.578, 0.456) (0.460, 0.396) (0.582, 0.406)

Table 2.5: Rejection frequencies under different maximizing sets ($\kappa_1 = 0.25$, $\kappa_2 = 0.24$)

$(p_c, h_c)$         (0.25, 0.25)       (0.2, 0.45)
Unique equilibrium
nominal $\alpha$     DGP0     DGP1      DGP0     DGP1
0.05                 0.004    0.018     0.012    0.005
0.10                 0.004    0.046     0.012    0.044
Multiple equilibria
                     DGP2     DGP3      DGP2     DGP3
0.05                 0.228    0.280     0.396    0.304
0.10                 0.304    0.364     0.492    0.442

Our test involves calculating the maximal local jump over the values of the dependent variables and across the coordinates. Computation of the maximizer may take some time when implementing this test. In our simulation study, the average CPU time (per simulation) for the results in the first two rows of Table 2.3 (sample size 1500) is about 10 to 11 seconds (in particular, DGP0 10.4 seconds, DGP1 10.3 seconds, DGP2 10.5 seconds and DGP3 10.7 seconds). In our simulations, the dependent variables (the prices for products 1 and 2) are of two dimensions.
We expect the computation time for implementing our test to increase when the dimension of the dependent variables becomes higher.

2.6 Conclusions and Remarks

This chapter makes the first attempt to propose a test for multiple equilibria when the dependent variables are continuous and the structural equations are nonparametric. We show that uniqueness or multiplicity of equilibria produces testable implications for the continuity or discontinuity of the (conditional) density function of the dependent variables. Based on that, we develop a test for multiple equilibria by testing for the presence of jumps in a multivariate density function. The test statistic is constructed as the supremum of the local density jumps, and the critical value is computed via the Gaussian multiplier bootstrap. Monte Carlo studies show that the test performs reasonably well under different DGPs. Focusing on continuous dependent variables, our test complements the recent literature on testing multiple equilibria in discrete games.

We are aware of other potential testing criteria for multiple equilibria with continuous dependent variables. For example, Berry and Haile (2013) showed that the derivative $\nabla r(y)$ in (2.1) is over-identified if the model is in Category (C1). If we are agnostic about the equilibrium behaviour, the over-identification restrictions can potentially serve as a testing criterion for $\nabla_x \lambda(x, u) = 0$ for almost all $x$, which is closely related to the equilibrium behaviour of the model (2.1). A detailed analysis of a test using over-identification restrictions is beyond the scope of this chapter. It is worth noticing that a potential test for multiple equilibria based on over-identification restrictions is a complement rather than a substitute for the test via jump(s) in the density proposed in this chapter.
The main reason is that the over-identification restrictions will be functions of the conditional density $f_{Y|X}(y|x)$, whose estimation is much easier if we know the presence and location of the jump(s).

Chapter 3

Jumps of the Conditional Choice Probabilities in Incomplete Information Games

3.1 Introduction

Research on the identification and estimation of simultaneous games with incomplete information has been growing in the past two decades. Related literature includes Sweeting (2009), Bajari, Hong, Krainer and Nekipelov (2010), Aradillas-Lopez (2010), Wan and Xu (2014), and Aguirregabiria and Mira (2013), among others. In most of the literature, the conditional choice probabilities are the main object the econometrician can observe.$^{17}$ Furthermore, these conditional choice probabilities are usually the starting point for parameter identification and estimation (see, for example, Bajari, Hong, Krainer and Nekipelov (2010), Paula and Tang (2012), Wan and Xu (2014), and Aguirregabiria and Mira (2013), among others). If all covariates are discrete, the conditional choice probabilities can be nonparametrically estimated using sub-sample means. However, this chapter shows that the conditional choice probabilities may have jump(s) with respect to the continuous covariates (if they exist), even when the payoff functions and the distribution of the latent variables are all smooth. The source of such jumps lies in the equilibrium behaviour of the game. As a result, it may be too strong to impose the standard smoothness condition on the conditional choice probabilities, and the standard nonparametric techniques (kernel

$^{17}$ To be precise, in this chapter, when the private information can be correlated across players, a conditional choice probability means the probability of a particular player's choice, conditional on the choices of other players as well as the value of the covariates. As a special case, when the private information is independent across players, the conditioning set contains only the covariates.
Introductiontype estimators or series estimators) may not apply immediately. If theeconometrician knows that there must be at least a jump in theconditional probability, she/he can first estimate the jump location by themethod of Mu¨ller (1992) and Delgado and Hidalgo (2000). However, forincomplete information games, the econometrician typically knows neitherthe presence nor the location of the jump in the conditional choiceprobabilities. Therefore, a test for the presence of a jump in theconditional choice probabilities would help the econometrician in choosingappropriate estimation approaches.Apart from the statistical concern due to the possibility of a jump, thepresence of a jump in the conditional choice probabilities providesinformation about the equilibrium behaviour of the game. As this chapterwill show, when the equilibrium characterizing equations always admit aunique solution, the conditional choice probabilities will be continuous onthe support of continuous covariates. On the other hand, if there aremultiple equilibria, or the single equilibrium present in the data varies intype over the support of covariates, the conditional choice probabilitieshave a jump, under some reasonable conditions. Such a relationshipbetween the equilibrium behaviour and the presence of a jump in theconditional choice probabilities is robust to correlated private informationand unobserved heterogeneity independent of covariates.Uniqueness or multiplicity of equilibria is important in game-theoreticeconometric models. For games with incomplete information, the presenceof multiple equilibria affects various stages of the empirical research, suchas identification, estimation, comparative static and conterfactualexperiments. 
Aradillas-Lopez (2010) and Wan and Xu (2014) maintained the unique equilibrium assumption for their identification and estimation strategies in incomplete information games with correlated private information.

3.1.1 Related Literature

Aguirregabiria and Mira (2013) noticed that the conditional choice probability may have a jump with respect to covariates in incomplete information games with independent private information. This chapter elaborates on the theoretical foundation of the jumps in the conditional choice probabilities. Firstly, the game set-up in this chapter allows for correlated private information among players. Secondly, we explicitly state the conditions under which the conditional choice probabilities have a jump, and we also relate the presence of jump(s) to the equilibrium behaviour of the game. As a result, the jump is more than a complication for estimation of the conditional choice probabilities; it also reveals information about the equilibrium behaviour of the game. Thirdly, we emphasize that the econometrician is typically agnostic about the presence of a jump. Aguirregabiria and Mira (2013) suggested using the jump location estimator developed by Müller (1992) and Delgado and Hidalgo (2000) when there are jumps. However, the validity of that method relies on the existence of a jump. When there is actually no jump, the aforementioned method no longer applies. Hence we recommend a test for the presence of a jump at the first stage.

When the private information is independent among players within the game, a test for multiple equilibria has been proposed by Paula and Tang (2012). They exploited the equivalence between the uniqueness of equilibrium and the conditional independence among players' actions. Unfortunately, such an equivalence breaks down for correlated private information. In this case, even a unique equilibrium can yield correlation among players' actions.
Our result, which relates the equilibrium behaviour to the presence of a jump, is robust to correlated private information across players, although it is weaker than that of Paula and Tang (2012) when players' private information is indeed independent.

This chapter is also related to the literature on the identification and estimation of incomplete information games, because the equilibrium behaviour may affect the identification and estimation approaches. When the private information is independent across players, identification and estimation of incomplete information games have been studied by Bajari, Hong, Krainer and Nekipelov (2010), Aguirregabiria and Mira (2013), Xiao (2014) and many others. Not much research has been conducted for games with correlated private information. Aradillas-Lopez (2010) and Wan and Xu (2014) are such examples, and both assumed uniqueness of equilibrium.

A test for the presence of a jump in the conditional choice probabilities (or, more generally, in a regression function) usually involves a supremum of an empirical process which does not weakly converge. This problem, especially for a univariate regression function (or a density function), has been treated by Bickel and Rosenblatt (1973), Johnston (1982), Chu and Wu (1994), Hamrouni (1999), among others. These papers all established a Gumbel-type limiting distribution. Recent developments on this issue can be found in Chernozhukov, Chetverikov and Kato (2012, 2013) and Chernozhukov, Lee and Rosen (2013). Qiu (1997, 2002) provided practical methods to detect the jump of a bivariate regression function at a fixed point.

3.1.2 Organization of Chapter 3

The organization of this chapter is as follows. Section 3.2 sets up the framework of games with incomplete information. Section 3.3 investigates the implications of the equilibrium behaviour of the game for continuity or discontinuity of the conditional choice probabilities; a numerical example illustrates such relations.
We also allow for heterogeneity in the payoff functions unobserved by the econometrician (but independent of the covariates). Section 3.4 points out an econometric consequence of not knowing the presence of a jump in the conditional choice probabilities. Section 3.5 briefly discusses possible methods for testing the presence of a jump in the conditional choice probabilities. The last section concludes. All proofs are collected in Appendix B.2.

3.2 The Set-up of the Game

We consider a simultaneous game with incomplete information. To illustrate the main idea, let us focus on games with two players. Let k = 1, 2 denote the identity of the players. Each player chooses an action Yk ∈ {0, 1}. For k = 1, 2, the payoff function for player k can be written as

πk(Yk, Y−k, Xk) − Uk(Yk),    (3.1)

where Y−k represents the choice of the player other than k, Xk is a vector of exogenous covariates for player k's payoff function, and Uk(Yk) is the random shock to player k's payoff when she chooses action Yk. Without loss of generality, we normalize Uk(0) = 0 for all k and simplify the notation of Uk(1) to Uk. We summarize the payoffs of the game in Table 3.1. The information structure is as follows. The shock Uk to player k's payoff is private information observed only by player k. The joint CDF of (U1, U2), denoted as F1,2(u1, u2), the functional forms {π1(·), π2(·)}, and the exogenous covariates X = (X1, X2) are common knowledge for both players. Denote the support of X as X. The game is played n times and the econometrician observes a random sample {(Y1i, Y2i, X′1i, X′2i)}, i = 1, ..., n.

We impose two assumptions on the payoff functions and the information structure of the game.

Assumption 1. (i) The distribution of the covariates X = (X1, X2) is absolutely continuous with respect to the Lebesgue measure and its density fX is bounded and twice continuously differentiable.
Table 3.1: Normal-form of the game

            y2 = 0                   y2 = 1
y1 = 0   0, 0                    0, π2(1, 0, x2) − u2
y1 = 1   π1(1, 0, x1) − u1, 0    π1(1, 1, x1) − u1, π2(1, 1, x2) − u2

(ii) The function πk(yk, y−k, xk) is twice continuously differentiable with respect to xk, for k = 1, 2.

(iii) The covariates X are observed by the players and the econometrician.

Assumption 2. (i) The random vector U = (U1, U2) ∈ R² is continuously distributed with a joint CDF F1,2(·), which is twice continuously differentiable.

(ii) U is independent of X.

(iii) The support of (U1, U2) satisfies

inf U1 < inf_{x∈X} min(π1(1, 0, x1), π1(1, 0, x1) + ∆1(x1)),
inf U2 < inf_{x∈X} min(π2(1, 0, x2), π2(1, 0, x2) + ∆2(x2)),

where ∆k(xk) = πk(1, 1, xk) − πk(1, 0, xk) for k = 1, 2.

Assumption 2 (iii) is sufficient for the structural functions ϕ(σ, x) in equations (3.7), defined below, to be twice continuously differentiable.

In the following, we will characterize the equilibrium of the game under Assumptions 1 and 2. Wan and Xu (2014) pointed out that two notions of equilibrium have been utilized in game-theoretic econometric models with correlated private information. In the first notion, an equilibrium is a pair of self-consistent beliefs (σ1(1|1, x), σ2(1|1, x))′ satisfying equations (3.2), where σk(1|1, x) is player −k's belief that player k chooses action 1, conditional on player −k herself choosing action 1 and the covariate value x. The players' optimal actions are determined by

Y1i = 1{v1(1, Xi) − U1i > 0},
Y2i = 1{v2(1, Xi) − U2i > 0},    (3.2)

where

v1(1, x) = π1(1, 0, x1) + σ2(1|1, x)∆1(x1),
v2(1, x) = π2(1, 0, x2) + σ1(1|1, x)∆2(x2),    (3.3)

are the expected payoffs of player k conditional on herself choosing action 1 and the realization of the covariates X, for k = 1, 2.
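The fixed-point nature of these self-consistent beliefs can be illustrated with a small numerical sketch. It deliberately simplifies the model above: it assumes independent private information (so that conditioning on the other player's action drops out and σk(1|1, x) reduces to Fk(vk(1, x))) and a logistic CDF for the shocks. The parameters p1, p2, d1, d2 are illustrative stand-ins for πk(1, 0, xk) and ∆k(xk), not values taken from the chapter.

```python
import math

def F(u):
    # Assumed logistic CDF for the private shocks (illustrative choice only).
    return 1.0 / (1.0 + math.exp(-u))

def solve_beliefs(p1, p2, d1, d2, tol=1e-12, max_iter=10_000):
    """Best-response iteration for self-consistent beliefs (s1, s2) under
    independent private information, i.e. the fixed point of
        s1 = F(p1 + s2 * d1),   s2 = F(p2 + s1 * d2),
    where p_k stands in for pi_k(1, 0, x_k) and d_k for Delta_k(x_k)."""
    s1 = s2 = 0.5
    for _ in range(max_iter):
        n1, n2 = F(p1 + s2 * d1), F(p2 + s1 * d2)
        if max(abs(n1 - s1), abs(n2 - s2)) < tol:
            break
        s1, s2 = n1, n2
    return s1, s2

# Moderate interaction: |d_k| * max F' = 0.8 * 0.25 = 0.2 < 1, so the map is a
# contraction and the iteration converges to the unique belief pair.
s1, s2 = solve_beliefs(p1=0.3, p2=-0.2, d1=-0.8, d2=-0.8)
```

With stronger interaction the map need not be a contraction and several self-consistent belief pairs can coexist, which is exactly the multiplicity whose observable footprint the chapter studies.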
Aradillas-Lopez (2010) used this notion of equilibrium.

In the second notion, an equilibrium is defined as a pair of cut-offs in the value of private information (u∗1(x), u∗2(x)), such that the following conditions hold:

Yki = 1{Uki ≤ u∗k(Xi)}, for k = 1, 2,

and

π1(1, 0, x1) + ∆1(x1) σ̃2(x) = u∗1(x),
π2(1, 0, x2) + ∆2(x2) σ̃1(x) = u∗2(x),    (3.4)

where

σ̃k(x) = Pr(Uki ≤ u∗k(x) | Xi = x, U−ki = u∗−k(x)).

Note that the existence of a monotone pure Nash equilibrium is assumed. The differences between the two notions are as follows. (i) In the first notion, the belief about the other player's action is conditional on the player's own action, whereas in the second one, it is conditional on the player's own private information. (ii) In the second notion, there is an additional assumption of monotonicity of the equilibrium. In the main text of this chapter, we will follow Aradillas-Lopez (2010) and use the first notion of equilibrium. In Appendix B.1, we investigate the outcomes under the second notion of equilibrium. We will see that all the results obtained under the first notion remain valid under the second notion, with a slight modification.

Now we characterize the equilibrium of the first notion, following Aradillas-Lopez (2010). Given a realization of X, an equilibrium is a vector (σ1(1|1, x), σ2(1|1, x))′ satisfying the following system of equations:

σk(1|1, x) = Pr(v1(1, x) > U1i, v2(1, x) > U2i | X = x) / Pr(v−k(1, x) > U−ki | X = x)
           = F1,2(v1(1, x), v2(1, x)) / F−k(v−k(1, x)),    (3.5)

for k = 1, 2, where F1(·), F2(·) are the marginal distribution functions of U1, U2. Note that the first equality in (3.5) comes from (3.2) and the second uses Assumption 2 (ii). In other words, the system of equations characterizing the equilibrium beliefs (σ1(1|1, x), σ2(1|1, x))′ can be written as

σ1(1|1, x) = F1,2(v1(1, x), v2(1, x)) / F2(v2(1, x)),
σ2(1|1, x) = F1,2(v1(1, x), v2(1, x)) / F1(v1(1, x)).    (3.6)

To save on notation, define ϕ(σ, x) ≡ (ϕ1(σ, x), ϕ2(σ, x))′, where

ϕ1(σ, x) ≡ σ1 − F1,2(π1(1, 0, x1) + σ2∆1(x1), π2(1, 0, x2) + σ1∆2(x2)) / F2(π2(1, 0, x2) + σ1∆2(x2)),
ϕ2(σ, x) ≡ σ2 − F1,2(π1(1, 0, x1) + σ2∆1(x1), π2(1, 0, x2) + σ1∆2(x2)) / F1(π1(1, 0, x1) + σ2∆1(x1)).

The system of structural equations can be stated as

ϕ(σ, x) = 0.    (3.7)

Note that it follows from Assumptions 1 and 2 that the function ϕ : [0, 1]² × X → R² is twice continuously differentiable.

3.3 An Implication of Equilibrium Behaviour on Data

The observed data for the econometrician can be summarized by the conditional choice probabilities defined as

Qk(yk | y−k, x) ≡ Pr(Yk = yk | Y−k = y−k, X = x),

for k = 1, 2 and (y1, y2) ∈ {0, 1}². Let Q(1|1, x) = (Q1(1|1, x), Q2(1|1, x))′. In this section, we analyse the implications of the equilibrium behaviour of the game for continuity or discontinuity of the conditional choice probabilities Q(1|1, x). We begin by studying the relation from x to the equilibrium belief σ. When (3.7) admits multiple solutions, the relation from x to σ can be a correspondence. Each function qm in the following definition can be viewed as a branch of this correspondence.

Definition 1. Let M be the smallest number such that there exists a family of twice continuously differentiable functions {q1(·), q2(·), ..., qM(·)}, where qm : Bm → [0, 1]² and Bm is an open and connected subset of X, and the following conditions are satisfied:
(i) For any (σ, x) ∈ [0, 1]² × Bm satisfying ϕ(σ, x) = 0, we have σ = qm(x). Furthermore, ϕ(qm(x), x) = 0 for all x ∈ Bm.

(ii) ∪_{m=1}^{M} B̄m = X. (Here B̄m denotes the closure of Bm.)

Remark 1. Without loss of generality, we can further assume that for any x ∈ Bm ∩ Bk and m ≠ k, we have qm(x) ≠ qk(x). Otherwise we can exclude x from either Bm or Bk.

For the numerical example in Section 3.3.2, Figure 3.2 plots the family of functions {q1, q2, q3}.

The following assumption is imposed on the constant M and the structural equations ϕ(σ, x).

Assumption 3. (i) The system of structural equations (3.7) has at least one solution for any x ∈ X.

(ii) The constant M in Definition 1 is finite.

(iii) If the system of structural equations (3.7) admits a unique solution in σ for all x ∈ X, the partial derivative ∇σϕ(σ, x) is invertible for all (σ, x) ∈ (0, 1)² × int(X).

Assumption 3 (i) and (ii) require that a solution to (3.7) must exist and that the number of solutions be finite. Assumption 3 (iii) is a technical condition that ensures the partial derivative of ϕ(σ, x) with respect to σ is invertible in the case of a unique solution. Definition 1 yields an index system with regard to the "types" of equilibrium. For each x, if a solution σ(x) to (3.7) belongs to qm(Bm), we call it a type-m equilibrium and denote it as σ(m)(1|1, x) = (σ1(m)(1|1, x), σ2(m)(1|1, x))′. If σ ∈ qm(Bm) for multiple m's, then we take the smallest such m as its index. Thus, we have σ(m)(1|1, x) = qm(x) for all x ∈ ∪_{m=1}^{M} Bm. We define Υ(x) as the equilibria indices set for a generic value x ∈ X:

Υ(x) ≡ {m : ϕ(qm(x), x) = 0}.    (3.8)

By definition, σ(m)(1|1, x) exists if and only if m ∈ Υ(x). Furthermore, we define an equilibrium selection rule π as follows.

Definition 2. Let an equilibrium selection rule π be a measurable function π : ∪_{m=1}^{M} Bm → [0, 1]^M defined as

π(x) = (π1(x), π2(x), ..., πM(x)),

where πm(x) = Pr(Y ∈ qm(Bm) | X = x).

The component πm(x) is the probability of selecting equilibrium m conditional on X = x. Clearly, πm(x) = 0 if m ∉ Υ(x). Also, ∑_{m=1}^{M} πm(x) = 1.

The following equation describes the relationship between the observed conditional choice probabilities Q(1|1, x) and the equilibrium beliefs (σ1(1|1, x), σ2(1|1, x))′ at a given x:

Q(1|1, x) = ∑_{h∈Υ(x)} πh(x) σ(h)(1|1, x),    (3.9)

where σ(h)(1|1, x) = (σ1(h)(1|1, x), σ2(h)(1|1, x))′ = qh(x). Equation (3.9) says that a conditional choice probability is a mixture of the equilibrium beliefs that are solutions to (3.7).
The mixing probabilities are the corresponding components of π(x).

3.3.1 The Equilibrium Behaviour and the Presence of a Jump in Conditional Choice Probabilities

Multiplicity of equilibria in games with incomplete information has drawn increasing attention in the literature. Theoretically, multiple equilibria may arise when the system of equations (3.7) admits more than one solution. The main result of this section is the relationship between the equilibrium behaviour and the presence of a jump in the conditional choice probabilities Q(1|1, x). In particular, we show that if the system of equations (3.7) admits a unique solution for all x ∈ X, then the conditional choice probabilities Q(1|1, x) must be continuous in x on int(X). On the other hand, if the system of equations (3.7) admits multiple solutions and the index of the realized equilibrium varies over x, then Q(1|1, x) will have a jump at some x, under reasonable conditions. Note that Q(1|1, x) is said to be continuous when both Q1(1|1, x) and Q2(1|1, x) are continuous. On the other hand, that Q(1|1, x) has a jump at some x means that at least one of the two conditional choice probabilities has a jump at x.

We first make clear the meaning of uniqueness and multiplicity of equilibria. We classify the game into one of the following three categories.

(C1) The system of equations (3.7) admits a unique solution in σ for all x ∈ X.

(C2) There exists a set XA ⊂ X with Pr(X ∈ XA) > 0, such that the system of equations (3.7) admits multiple solutions in σ for all x ∈ XA.
Moreover, for any x ∈ XA, there exists an m∗, which may depend on x, such that πm∗(x) = 1.

(C3) There exists a set XA ⊂ X with Pr(X ∈ XA) > 0, such that the system of equations (3.7) admits multiple solutions in σ for all x ∈ XA. Moreover, there exists a set XB ⊂ XA with Pr(X ∈ XB) > 0, satisfying the following condition: for any x ∈ XB, there are m1 ≠ m2 such that πm1(x) > 0 and πm2(x) > 0.

Note that this classification does not include the exceptional case in which the structural equations (3.7) admit multiple solutions on a non-empty but zero-measure set of x. Also, it does not include another exceptional case in which the set {x ∈ X : πm1(x) > 0, πm2(x) > 0, for some m1, m2} is non-empty but has zero measure. We do not consider those cases in this chapter. In the literature on games with incomplete information, games fitting into Category (C1) or (C2) are usually viewed as having a unique equilibrium, whereas games in Category (C3) are viewed as having multiple equilibria.

We further classify (C2) into two sub-categories:

(C2-1) There is an m∗, which does not depend on x, such that πm∗(x) = 1 for all x ∈ X.

(C2-2) Two disjoint subsets of X, C1 and C2, satisfying C̄1 ∩ C̄2 ≠ ∅, Pr(X ∈ C1) > 0 and Pr(X ∈ C2) > 0, have the following property: πm∗1(x) = 1 for x ∈ C1 and πm∗2(x) = 1 for x ∈ C2, for some m∗1 ≠ m∗2.

Similarly, we do not consider the exceptional case where the set {x ∈ X : πm∗(x) = 1} is non-empty but has zero measure. Sub-category (C2-1) can be viewed as belonging to Category (C1). Indeed, the system of structural equations (3.7), coupled with the equilibrium selection rule πm∗(x) = 1 for some m∗ and all x ∈ X, is observationally equivalent to the structural equations σ = qm∗(x) with X = Bm∗. Clearly, the latter fits into Category (C1).
Therefore, in the rest of this chapter, we treat Category (C1) and Sub-category (C2-1) as a whole.

The following proposition associates uniqueness of the solution with continuity of the conditional choice probabilities.

Proposition 1. Under Assumptions 1-3, if the game fits into Category (C1) or Sub-category (C2-1), the conditional choice probabilities Q(1|1, x) will be twice continuously differentiable in x on int(X).

As a result, when we are testing the null hypothesis that Q(1|1, x) is continuous in x on int(X), we are actually testing a sufficient condition for uniqueness of equilibrium.

Now let us investigate what happens when the game has multiple equilibria (i.e., the game fits into Category (C3)). We will show that under some reasonable conditions on the structural equations (3.7), when there are multiple equilibria, the conditional choice probabilities Q(1|1, x) will have a jump at certain x, except for some special equilibrium selection rules. Here a jump at a point xd (we use the words jump and discontinuity interchangeably throughout this chapter) means that the limit of Q(1|1, x) as x approaches xd is not the same in every direction. That is, there are disjoint subsets C1 and C2 of X with Pr(X ∈ C1) > 0, Pr(X ∈ C2) > 0, and some xd ∈ C̄1 ∩ C̄2 such that, for k = 1 or 2,

| lim_{x→xd, x∈C1} Qk(1|1, x) − lim_{x→xd, x∈C2} Qk(1|1, x) | = δ > 0.

The main source of such a jump lies in the equilibrium behaviour.[18] When the structural equations (3.7) admit multiple solutions, an important and economically relevant scenario is that the set of equilibrium indices Υ(x) varies over X. (A typical case is that the number of solutions to (3.7) changes over X.) Consider what happens at the boundary between regions of x with different equilibria indices sets. Suppose that there is a region of x where (3.7) has a unique solution and another region where (3.7) has more than one solution.
Equation (3.9) implies that on one side of the boundary, Q(1|1, x) equals the unique solution, whereas on the other side of the boundary, Q(1|1, x) is a mixture of more than one solution. As a result, except for some special equilibrium selection rules, Q(1|1, x) will have a jump at that boundary. This idea will be formalized in Proposition 2.

Let us give conditions for a jump to happen when the game has multiple equilibria. Recall that the equilibria indices set is defined in (3.8); if there is a set C ⊂ X such that Υ(x) is the same for all x ∈ C, we say that Υ(C) is the equilibria indices set on the given set C. Clearly, the value of Υ(C) is given by Υ(C) = Υ(x) for an arbitrary x ∈ C. The following assumption requires that Υ(C) changes over subsets of X.

Assumption 4H. If the game fits into Category (C2-2) or (C3), there is an l ∈ {1, ..., M} such that for some disjoint C1, C2 ⊂ Bl satisfying Pr(X ∈ C1) > 0, Pr(X ∈ C2) > 0 and C̄1 ∩ C̄2 ≠ ∅, we have Υ(C1) ≠ Υ(C2).

Assumption 4H says that the equilibria indices set changes within some Bl. It includes a typical case specified in Assumption 4L, provided that Assumption 5 below holds. Assumption 4L is easier to interpret than Assumption 4H.

Assumption 4L. If the game fits into Category (C2-2) or (C3), there is a subset C of X with Pr(X ∈ C) > 0, such that equations (3.7) admit a unique solution in σ for all x ∈ C.

Assumption 4L says that the structural equations (3.7) admit a unique solution for some values of x while admitting multiple solutions for other values of x (by the definition of (C3)). This often happens in games with multiple equilibria.

[Footnote 18] There is another possible source of a jump, which is related to the form of the equilibrium selection rule. As that source requires stronger conditions on the equilibrium selection rules, we view it as a secondary source of a jump and will discuss it later.
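The boundary argument can be checked with elementary arithmetic. In the sketch below, the limit beliefs on the two sides of a boundary point and the selection probabilities are made-up numbers chosen purely for illustration; they do not come from any particular game in this chapter.

```python
# Hypothetical branch limits at a boundary point x_d (illustrative numbers):
# branch 1 is the unique equilibrium on one side of the boundary and survives
# on the other side, where branches 2 and 3 also appear.
s1_lim, s2_lim, s3_lim = 0.46, 0.70, 0.92

def jump(pi1, pi2):
    """Jump of a conditional choice probability at x_d when the selection
    probabilities on the multiple-equilibrium side are (pi1, pi2, 1-pi1-pi2)."""
    pi3 = 1.0 - pi1 - pi2
    mixture = pi1 * s1_lim + pi2 * s2_lim + pi3 * s3_lim
    return s1_lim - mixture

generic = jump(1.0 / 3.0, 1.0 / 3.0)  # equal-weight selection: nonzero jump
knife_edge = jump(1.0, 0.0)           # always select branch 1: no jump
```

Only selection rules whose side limits land on a lower-dimensional set (here, full weight on the surviving branch) remove the jump, which is the "zero measure" argument formalized in Proposition 2 below.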
Multiple equilibria in players' beliefs can be viewed as a result of a coordination mechanism. When the covariates take a very large (or small) value, the effect of the covariates dominates the coordination mechanism and uniquely determines the equilibrium beliefs (i.e., there is only one solution to (3.7) given such an extreme x). In contrast, when the values of the covariates are moderate, the self-fulfilling mechanism tends to dominate and produces (for example) three equilibria: a large one, a medium one and a small one. The following Assumption 5 is a technical condition under which Assumption 4L is sufficient for Assumption 4H.

Assumption 5. (i) For an arbitrary x ∈ int(X), if for any ε > 0 there exists an x0 such that |x − x0| < ε and ϕ(σ, x0) has a unique solution in σ, then ∇σϕ(σ, x) is invertible for any σ satisfying ϕ(σ, x) = 0.

(ii) For any m ∈ {1, ..., M}, let x0 be an arbitrary point in B̄m\Bm, and let σ∗ satisfy ϕ(σ∗, x0) = 0 and σ∗ ≠ lim_{x→x0, x∈Bm} qm(x) ≡ σm. If σ∗ exists and there is no σ′ such that every component of σ′ − σm has the same sign as σ∗ − σm and |σ′ − σm| < |σ∗ − σm|, then the partial derivative ∇σϕ(σ∗, x0) is invertible.

Assumption 5 assumes that the partial derivative ∇σϕ(σ, x) is invertible at some particular (σ, x). Lemma 1 below establishes the relation between Assumption 4L and Assumption 4H.

Lemma 1. Under Assumption 5, Assumption 4L is sufficient for Assumption 4H.

In the following, whenever we state a result under Assumption 4H, it also holds if Assumption 4H is replaced by Assumption 4L and Assumption 5.

Note that the function Υ : X → the power set of {1, ..., M}, which maps an x to an equilibria indices set, is fully determined by the structural equations (3.7). Given a function Υ, we can view an equilibrium selection rule π as an element of the class

C(Υ) = { g : ∪_{m=1}^{M} Bm → [0, 1]^M : ∑_{h=1}^{M} gh(x) = 1, and gj(x) = 0 if j ∉ Υ(x), for all x ∈ ∪_{m=1}^{M} Bm },

where gj(x) is the jth component of g(x) ∈ [0, 1]^M.
Proposition 2 below states that under Assumptions 1-3 and 4H, multiplicity of equilibria in the incomplete information game leads to a jump in the conditional choice probabilities Q(1|1, x) at some x, except for equilibrium selection rules whose values "around" the x's at the boundary between regions with different equilibria indices sets lie in a zero-measure set.

Proposition 2. Under Assumptions 1-3 and 4H, if the game fits into Category (C3), the conditional choice probabilities Q(1|1, x) will have a jump at some x ∈ int(X), except for a class of equilibrium selection rules CE1(Υ), where the set

{ (lim_{x′∈C1, x′→x} π(x′), lim_{x′∈C2, x′→x} π(x′)) : π ∈ CE1(Υ) }

has zero measure in

{ (lim_{x′∈C1, x′→x} π(x′), lim_{x′∈C2, x′→x} π(x′)) : π ∈ C(Υ) },

for all x ∈ C̄1 ∩ C̄2 and all disjoint C1, C2 ⊂ Bl, for some l ∈ {1, ..., M}, such that Pr(X ∈ C1) > 0, Pr(X ∈ C2) > 0, C̄1 ∩ C̄2 ≠ ∅ and Υ(C1) ≠ Υ(C2).

The existence of such x's in Proposition 2 is guaranteed by Assumption 4H (or by Assumptions 4L and 5). Consider the numerical example in Section 3.3.2, whose normal-form is in Table 3.2. For some parameter values, the structural equations (3.11) have either one or three solutions depending on the value of x. The example satisfies Assumption 4L, and there are C1, C2 satisfying Assumption 4H with l = 1, Υ(C1) = {1} and Υ(C2) = {1, 2, 3}. For an arbitrary xd ∈ C̄1 ∩ C̄2, equation (3.9) implies that

lim_{x→xd, x∈C1} Q(1|1, x) = σ(1)(1|1, xd),

lim_{x→xd, x∈C2} Q(1|1, x) = lim_{x→xd, x∈C2} π1(x) · σ(1)(1|1, xd)
    + lim_{x→xd, x∈C2} π2(x) · lim_{x→xd, x∈C2} σ(2)(1|1, x)
    + lim_{x→xd, x∈C2} π3(x) · lim_{x→xd, x∈C2} σ(3)(1|1, x).

Hence

lim_{x→xd, x∈C1} Q(1|1, x) − lim_{x→xd, x∈C2} Q(1|1, x)
    = (1 − lim_{x→xd, x∈C2} π1(x)) σ(1)(1|1, xd)
    − lim_{x→xd, x∈C2} π2(x) · lim_{x→xd, x∈C2} σ(2)(1|1, x)
    − lim_{x→xd, x∈C2} π3(x) · lim_{x→xd, x∈C2} σ(3)(1|1, x).

Consider a class of equilibrium selection rules in which (π1, π2, 1 − π1 − π2) are the probabilities of picking each equilibrium when the structural equations have three solutions.
Now, a desired jump will arise at xd as long as

(1 − π1) σ(1)(1|1, xd) − π2 lim_{x→xd, x∈C2} σ(2)(1|1, x) − (1 − π1 − π2) lim_{x→xd, x∈C2} σ(3)(1|1, x) ≠ 0.

Rearranging the terms leads to

(lim_{x→xd, x∈C2} σ(3)(1|1, x) − σ(1)(1|1, xd)) π1 + (lim_{x→xd, x∈C2} σ(3)(1|1, x) − lim_{x→xd, x∈C2} σ(2)(1|1, x)) π2 + σ(1)(1|1, xd) − lim_{x→xd, x∈C2} σ(3)(1|1, x) ≠ 0.

Note that σ(1)(1|1, xd) ≠ lim_{x→xd, x∈C2} σ(3)(1|1, x) for all xd ∈ C̄1 ∩ C̄2.[19] As a result, the conditional choice probabilities Q(1|1, x) will have a jump, except for some artificially chosen (π1, π2) ∈ {(π1, π2) ∈ [0, 1]² : π1 + π2 ≤ 1} satisfying the following restriction:

α π1 + β π2 + γ = 0,    (3.10)

where

α = lim_{x→xd, x∈C2} σ(3)(1|1, x) − σ(1)(1|1, xd) ≠ 0,
β = lim_{x→xd, x∈C2} σ(3)(1|1, x) − lim_{x→xd, x∈C2} σ(2)(1|1, x),
γ = σ(1)(1|1, xd) − lim_{x→xd, x∈C2} σ(3)(1|1, x),

for all xd ∈ C̄1 ∩ C̄2 and all C1, C2 satisfying Assumption 4H with l = 1. Clearly, the set {(π1, π2) ∈ [0, 1]² : π1 + π2 ≤ 1, and (3.10) holds} has zero measure in {(π1, π2) ∈ [0, 1]² : π1 + π2 ≤ 1}. Note that here the equilibrium selection rule does not depend on x within the regions where the structural equations (3.7) have three solutions. As a result, the above "zero measure" argument for the equilibrium selection rule holds for any x that leads to three solutions.

[Footnote 19] Suppose to the contrary that σ(1)(1|1, xd) = lim_{x→xd, x∈C2} σ(3)(1|1, x) for some xd ∈ C̄1 ∩ C̄2. This means lim_{x→xd, x∈C1} q1(x) = lim_{x→xd, x∈C2} q3(x) for some xd ∈ C̄1 ∩ C̄2. Noting that xd ∈ B1 and xd ∈ B̄3, we have q1(x′) = q3(x′) for some x′ ∈ B(xd, r) ∩ B1 ∩ B3, where B(xd, r) is a ball centred at xd with radius r. B(xd, r) ∩ B1 ∩ B3 ≠ ∅ because xd ∈ C̄1 ∩ C̄2, C1, C2 ⊂ B1 (since l = 1 in Assumption 4L) and C2 ⊂ B3. This contradicts Remark 1 below Definition 1.

The main idea of Proposition 2 is that the conditional choice probabilities Q(1|1, x) will have a jump at the boundary between regions of x with distinct equilibria indices sets (especially at the boundary between regions with different numbers of solutions to the structural equations (3.7)), except for some special equilibrium selection rules. However, such boundaries are not the only type of locations where jumps may happen. A jump can also occur in the interior of a region with the same equilibria indices set, if the equilibrium selection rule itself has a jump in x. Definition 3 characterizes such equilibrium selection rules.

Definition 3. We say an equilibrium selection rule π has a jump within an equilibria indices set Υ(D) if there is a subset D ⊂ int(X) with Υ(D) defined, and the following condition is satisfied: there are two disjoint subsets D1, D2 ⊂ D, with Pr(X ∈ D1) > 0, Pr(X ∈ D2) > 0 and D̄1 ∩ D̄2 ≠ ∅, such that for all x ∈ D̄1 ∩ D̄2, we have

| lim_{x′→x, x′∈D1} πh∗(x′) − lim_{x′→x, x′∈D2} πh∗(x′) | = δ > 0, for some h∗ ∈ Υ(D).

Denote by CF(Υ) the class of equilibrium selection rules that have a jump within an equilibria indices set.

The next result states that if the game has multiple equilibria and the equilibrium selection rule has the property specified in Definition 3, the conditional choice probabilities Q(1|1, x) will have a jump at some x, except for some special equilibrium selection rules.

Corollary 1. Under Assumptions 1-3, if the game fits into Category (C3) and the equilibrium selection rule π ∈ CF(Υ) (i.e., the equilibrium selection rule has the property in Definition 3), the conditional choice probabilities Q(1|1, x) will have a jump at some x ∈ int(X), except for a class of equilibrium selection rules CE2(Υ), where the set

{ (lim_{x′∈D1, x′→x} π(x′), lim_{x′∈D2, x′→x} π(x′)) : π ∈ CE2(Υ) }

has zero measure in

{ (lim_{x′∈D1, x′→x} π(x′), lim_{x′∈D2, x′→x} π(x′)) : π ∈ CF(Υ) },

for all x ∈ D̄1 ∩ D̄2 in Definition 3.

So far we have studied games fitting into Category (C1), Sub-category (C2-1) or Category (C3). What remains is Sub-category (C2-2).
Unfortunately, the game in (C2-2) exhibits undesirable properties in the conditional choice probabilities. The following Corollary 2 shows that the game in (C2-2), which is usually regarded as having a unique equilibrium, leads to jump(s) in the conditional choice probabilities. This is because the single equilibrium present in the data jumps from one index to another at some x. This is the main limitation of the discontinuity criterion in testing for multiple equilibria in discrete games with incomplete information.

Corollary 2. Under Assumptions 1-3, if the game fits into Sub-category (C2-2), the conditional choice probabilities Q(1|1, x) will have a jump at some x ∈ int(X).

In sum, we have shown that under Assumptions 1-3 and 4H, the equilibrium behaviour of incomplete information games produces testable implications on the continuity or discontinuity of the conditional choice probabilities Q(1|1, x). Such relations are robust to correlated private information between players.

3.3.2 A Numerical Example

We present a numerical example to illustrate the relationship between the equilibrium behaviour and the presence of a jump in the conditional choice probabilities. Suppose that the game has the normal-form in Table 3.2.

Table 3.2: Normal-form of the numerical example

            y2 = 0              y2 = 1
y1 = 0   0, 0               0, α + βx − u2
y1 = 1   α + βx − u1, 0     α + (β + γ)x − u1, α + (β + γ)x − u2

In this example, the scalar X is the only covariate. Assume the joint distribution of (U1, U2)′ is

(U1, U2)′ ∼ N( (0, 0)′, [1 ρ; ρ 1] ).

The resulting equilibrium characterizing equations are:

σ1 = G1,2(α + βx + γσ2, α + βx + γσ1) / G2(α + βx + γσ1),
σ2 = G1,2(α + βx + γσ2, α + βx + γσ1) / G1(α + βx + γσ2),    (3.11)

where G1,2 is the bivariate CDF of (U1, U2)′, Gk is the marginal CDF of Uk, and σk is player −k's belief that player k chooses action 1, conditional on player −k choosing action 1.
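Before turning to specific parameter values, the one-versus-three solution pattern of a system like (3.11) can be explored numerically. The sketch below uses a simplified univariate logistic analogue of the symmetric fixed point — an assumed stand-in, not the bivariate-probit system above — and counts solutions by sign changes of the fixed-point residual on a grid.

```python
import numpy as np

def logistic(z):
    # Assumed CDF for the analogue; the example above uses a bivariate normal.
    return 1.0 / (1.0 + np.exp(-z))

def count_solutions(a, c, n_grid=1000):
    """Count solutions of sigma = logistic(a + c * sigma) on [0, 1] by
    counting sign changes of the residual over a fine grid. Here a plays
    the role of the covariate index and c the interaction strength."""
    sigma = np.linspace(0.0, 1.0, n_grid)
    resid = logistic(a + c * sigma) - sigma
    signs = np.sign(resid)
    return int(np.sum(signs[:-1] * signs[1:] < 0))

weak = count_solutions(a=-1.0, c=1.5)    # map slope below 1: one solution
strong = count_solutions(a=-3.0, c=6.0)  # steep coordination map: three solutions
```

Varying a (the analogue of moving x) switches the count between one and three, and the boundary between the two regimes is where a jump in the selected belief can occur.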
Game 1 fits into Category (C1) while Game 2 fits into Category (C3). Their parameters differ only in γ. When γ > 0, the game with the normal form in Table 3.2 is a coordination game in which γ determines the strength of interaction. For Game 1, the structural equations (3.11) always have a unique solution, while for Game 2 the structural equations (3.11) have one or three solutions depending on the value of x. Figure 3.1 plots the equilibrium behaviour of Game 2 at x = 0.4 (upper panel) and x = 0.5 (lower panel). In Figure 3.1, the solid line is the best response function of belief σ1 in terms of σ2 at a fixed x. Because of the symmetry of the game, the equilibrium belief σ1 (and σ2) is the intersection of the best response function and the 45 degree line (the dashed line). At x = 0.4, the only equilibrium belief is σ1 = σ2 = 0.456, while at x = 0.5 there are three equilibrium beliefs: 0.460, 0.698 and 0.916.

Figure 3.1: The equilibrium belief(s) σ1 (α = 0.25, β = −3.5, γ = 7, ρ = 0.2). [Two panels plot the best response function of σ1 against σ2 together with the 45 degree line, at x = 0.4 (upper) and x = 0.5 (lower); plot data omitted.]

Table 3.3: Parameters for the games

                    Game 1                     Game 2
(α, β, γ, ρ)′       (0.25, −3.5, 3.5, 0.2)′    (0.25, −3.5, 7, 0.2)′
Support of X        [0.35, 0.55]               [0.35, 0.55]
# of solutions²⁰    1                          1 or 3

Figure 3.2 plots the equilibria correspondence from the covariate value x to the equilibrium belief(s) σ1 for Game 2. When x is smaller than 0.464, the structural equations (3.11) admit a unique solution (the solid line). When x is larger than 0.464, the structural equations (3.11) admit three solutions (the solid, dashed and dot-dashed lines).
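The one-versus-three solution pattern behind Games 1 and 2 can be illustrated numerically. The sketch below is our own stylized simplification, not the thesis's conditional-belief system (3.11): it uses a single symmetric belief equation σ = Φ(a + bσ), where Φ is the standard normal CDF and the hypothetical parameters a and b stand in for α + βx and the interaction strength γ. Fixed points are found by bracketing sign changes of the residual on a grid:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def residual(sigma, a, b):
    """Fixed-point residual of the stylized belief equation sigma = Phi(a + b*sigma)."""
    return norm.cdf(a + b * sigma) - sigma

def solve_beliefs(a, b, n_grid=400):
    """Find all equilibrium beliefs in (0, 1) by bracketing sign changes on a grid."""
    grid = np.linspace(1e-4, 1 - 1e-4, n_grid)
    vals = residual(grid, a, b)
    roots = []
    for i in range(n_grid - 1):
        if vals[i] * vals[i + 1] < 0:
            roots.append(brentq(residual, grid[i], grid[i + 1], args=(a, b)))
    return roots

roots_weak = solve_beliefs(a=-2.0, b=1.0)    # weak interaction: unique equilibrium
roots_strong = solve_beliefs(a=-2.0, b=4.0)  # strong interaction: multiple equilibria
print(len(roots_weak), len(roots_strong))    # prints "1 3"
```

As in the move from Game 1 to Game 2, only the interaction parameter changes: with b = 1 the best response crosses the 45 degree line once, while with b = 4 it crosses three times.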
The functions q1, q2 and q3 are three branches of the correspondence.

We examine the observed conditional choice probability Q1(1|1, x) = Pr(Y1 = 1 | Y2 = 1, X = x). Figure 3.3 plots Q1(1|1, x) as a function of x for Game 1 (upper panel) and Game 2 (lower panel). For Game 1, the conditional choice probability Q1(1|1, x) is continuous on the support of X. For Game 2, if the equilibrium selection rule assigns equal probabilities to the three solutions whenever they are available, Q1(1|1, x) has a jump at x = 0.464.

3.3.3 An Extension: Unobserved Heterogeneity

In this section, we allow for heterogeneity in the payoff functions. Here heterogeneity refers to various types of game environments, determined by a random variable T before the players make choices. The payoff function for player k can be written as

π_k(Y_k, Y_{−k}, X_k, T) − U_k(Y_k).    (3.12)

As in (3.1), we make the normalizations U_k(0) = 0 and U_k(1) = U_k. The departure from (3.1) is that the function π_k now depends on t. The normal form of the game, given t ∈ {1, ..., t̄}, is in Table 3.4. In each observation i, (Y_{1i}, Y_{2i}, X′_{1i}, X′_{2i}) is observed by the econometrician, while T_i is observed by the players but not by the econometrician. U_{ki} is the private information of player k, k = 1, 2. We assume that T_i takes values in {1, ..., t̄} with probabilities Pr(T_i = t) = λ_t, for t = 1, ..., t̄. However, the discrete support of T is not essential. T can be generalized to a continuous random

Figure 3.2: Equilibria correspondence from covariate x to belief σ1 (α = 0.25, β = −3.5, γ = 7, ρ = 0.2). [The plot shows the three branches q1, q2 and q3; plot data omitted.]

Table 3.4: Normal form of the game with T = t

           y2 = 0                      y2 = 1
y1 = 0     0, 0                        0, π2(0, 1, x2, t) − u2
y1 = 1     π1(1, 0, x1, t) − u1, 0     π1(1, 1, x1, t) − u1, π2(1, 1, x2, t) − u2
Figure 3.3: Observed conditional choice probability Pr(Y1 = 1 | Y2 = 1, x). [Upper panel: Game 1 (α = 0.25, β = −3.5, γ = 3.5, ρ = 0.2). Lower panel: Game 2 (α = 0.25, β = −3.5, γ = 7, ρ = 0.2) with equilibrium selection rule π = (1/3, 1/3, 1/3); plot data omitted.]

variable, though we do not cover it in this chapter. The real restriction on the heterogeneity T is part (iv) of the following Assumption 2′.

Assumption 2′. (i) For each realization t of T, the random vector U = (U1, U2) ∈ R² is continuously distributed with a conditional CDF F^t_{1,2}(·), which is twice continuously differentiable.
(ii) The unobservable U is independent of X conditional on T.
(iii) Conditional on T = t ∈ {1, ..., t̄}, the support of (U1, U2) satisfies

inf U1 < inf_{x∈X} min( π1(1, 0, x1, t), π1(1, 0, x1, t) + ∆1(x1, t) ),
inf U2 < inf_{x∈X} min( π2(1, 0, x2, t), π2(1, 0, x2, t) + ∆2(x2, t) ),

where ∆k(xk, t) = πk(1, 1, xk, t) − πk(1, 0, xk, t) for k = 1, 2.
(iv) T is independent of X.

Assumption 2′(i)-(iii) are modified versions of Assumption 2. Assumption 2′(iv) says that the unobserved heterogeneity is independent of the covariates X. An implication is that the support X is the same for all t. Otherwise, heterogeneity depending on the covariates could by itself create jump(s) in the conditional choice probabilities, which would contaminate the relationship between the equilibrium behaviour and the presence of a jump. Fixing T = t, the corresponding game is the same as the one we have analysed before.
The equilibrium characterizing equations become

σ^t_1(1|1, x) = F^t_{1,2}( π1(1,0,x1,t) + σ^t_2(1|1,x)∆1(x1,t), π2(1,0,x2,t) + σ^t_1(1|1,x)∆2(x2,t) ) / F^t_2( π2(1,0,x2,t) + σ^t_1(1|1,x)∆2(x2,t) ),
σ^t_2(1|1, x) = F^t_{1,2}( π1(1,0,x1,t) + σ^t_2(1|1,x)∆1(x1,t), π2(1,0,x2,t) + σ^t_1(1|1,x)∆2(x2,t) ) / F^t_1( π1(1,0,x1,t) + σ^t_2(1|1,x)∆1(x1,t) ),    (3.13)

for t ∈ {1, ..., t̄}, where (σ^t_1(1|1,x), σ^t_2(1|1,x))′ is the pair of equilibrium beliefs for the game with T = t. For each t, write equation (3.13) as

ϕ^t(σ, x) = 0.    (3.14)

The following Definitions 1′ and 2′ and Assumptions 3′ and 4H′ are modified versions of their counterparts in the previous subsection, so we state them without further discussion.

Definition 1′. For each t ∈ {1, ..., t̄}, let M^t be the smallest number such that there exists a family of twice continuously differentiable functions

{q^t_1(·), q^t_2(·), ..., q^t_{M^t}(·)},

where q^t_m : B^t_m → [0, 1]², B^t_m is an open and connected subset of X, and the following conditions are satisfied:
(i) For any (σ, x) ∈ [0, 1]² × B^t_m satisfying ϕ^t(σ, x) = 0, we have σ = q^t_m(x). Furthermore, ϕ^t(q^t_m(x), x) = 0 for all x ∈ B^t_m.
(ii) ∪^{M^t}_{m=1} B̄^t_m = X. (Here B̄^t_m denotes the closure of B^t_m.)

As in Remark 1, without loss of generality we can further assume that for any x ∈ B^t_m ∩ B^t_k and m ≠ k, q^t_m(x) ≠ q^t_k(x).

Assumption 3′. For each t ∈ {1, ..., t̄},
(i) the system of equations (3.14) has at least one solution in σ;
(ii) the constant M^t in Definition 1′ is finite;
(iii) if the system of equations (3.14) admits a unique solution in σ for all x ∈ X, the partial derivative ∇_σ ϕ^t(σ, x) is invertible for all (σ, x) ∈ (0, 1)² × int(X).

If σ ∈ q^t_m(B^t_m), call it equilibrium m under T = t and denote it by σ^t_{(m)}(1|1, x). If σ ∈ q^t_m(B^t_m) for multiple m's, we take the smallest such m as the index. Similar to (3.8), we define Υ^t(x) as the equilibria indices set for a generic value x, conditional on T = t:

Υ^t(x) ≡ {m : ϕ^t(q^t_m(x), x) = 0}.

For T = t, we define the equilibrium selection rule π^t.

Definition 2′.
For every t ∈ {1, ..., t̄}, let an equilibrium selection rule π^t be a measurable function ∪^{M^t}_{m=1} B^t_m → [0, 1]^{M^t} defined as

π^t(x) = (π^t_1(x), π^t_2(x), ..., π^t_{M^t}(x)),

where π^t_m(x) = Pr(Y ∈ q^t_m(B^t_m) | X = x).

In the presence of unobserved heterogeneity T, the observed conditional choice probabilities can be written as

Q(1|1, x) = Σ^{t̄}_{t=1} λ_t Q^t(1|1, x) = Σ^{t̄}_{t=1} λ_t Σ_{h∈Υ^t(x)} π^t_h(x) σ^t_{(h)}(1|1, x),    (3.15)

where σ^t_{(h)}(1|1, x) = q^t_h(x), as defined in Definition 1′. We can see that the heterogeneity adds another layer of mixture to the expression of the conditional choice probabilities. We then state a modified version of Assumption 4H.

Assumption 4H′. For any t ∈ {1, ..., t̄}, if the game fits into Category (C2-2) or (C3), there is an l^t ∈ {1, ..., M^t} such that for some disjoint C^t_1, C^t_2 ⊂ B^t_{l^t} satisfying Pr(X ∈ C^t_1) > 0, Pr(X ∈ C^t_2) > 0 and C̄^t_1 ∩ C̄^t_2 ≠ ∅, we have Υ^t(C^t_1) ≠ Υ^t(C^t_2).

The following proposition establishes the relation from the equilibrium behaviour to the presence of jump(s) in the conditional choice probabilities for incomplete information games with unobserved heterogeneity T. The main idea is that for any fixed t, the relation is the same as in the previous subsection. In addition, the relation survives the linear combination of t-specific components, provided that the jumps in Q^t(1|1, x) = Σ_{h∈Υ^t(x)} π^t_h(x) σ^t_{(h)}(1|1, x) do not cancel out across t at some x.

Proposition 3.
(i) Suppose Assumptions 1′-3′ hold. If for all t ∈ {1, ..., t̄} the game fits into Category (C1) or Sub-category (C2-1), the conditional choice probabilities Q(1|1, x) will be twice continuously differentiable in x for all x ∈ int(X).
(ii) Suppose Assumptions 1′-3′ hold and there exists a nonempty set T_D ⊂ {1, ..., t̄} such that for all t ∈ T_D the game fits into Category (C3). For each t ∈ {1, ..., t̄}, let X^t_D = {x ∈ int(X) : Q^t(1|1, x) has a jump at x}. If X^t_D ≠ ∅ for some t ∈ {1, ..., t̄}, and the following condition COND is satisfied, the conditional choice probabilities Q(1|1, x) will have a jump at some x ∈ int(X).

COND. There is a t and some x_d ∈ X^t_D such that Σ_{t∈{t′ : x_d ∈ X^{t′}_D}} λ_t Q^t(1|1, x) has a jump at x = x_d.

(iii) Suppose Assumptions 1′-3′ hold and there exists a nonempty set T_D ⊂ {1, ..., t̄} such that for all t ∈ T_D the game fits into Sub-category (C2-2). Define T_D and X^t_D as in part (ii). Then the conditional choice probabilities Q(1|1, x) will have a jump at some x ∈ int(X) if the condition COND in (ii) is satisfied.

Consider Proposition 3(ii): for any t ∈ T_D, if Assumption 4H′ holds, we can use Proposition 2 to obtain X^t_D ≠ ∅, except for some special equilibrium selection rules. This can also be achieved by Corollary 1, provided that the equilibrium selection rule π^t satisfies the property in Definition 3.

In sum, we have established the relationship between the equilibrium behaviour and the presence of jump(s) in the conditional choice probabilities when the payoff functions of the game contain heterogeneity unobserved by the econometrician and the heterogeneity is independent of the covariates. The observed conditional choice probabilities are mixtures of the choice probabilities given a fixed value of the heterogeneity.
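The role of condition COND can be seen in a toy numerical check: jumps in the t-specific components can cancel in the λ-weighted mixture, in which case Q(1|1, x) stays continuous even though every Q^t(1|1, x) jumps. The sketch below uses two hypothetical piecewise components (our own illustration, not objects from the thesis):

```python
# Two hypothetical t-specific choice probabilities, both jumping at x_d = 0.5.
x_d = 0.5

def Q1(x):
    """Component for t = 1: jumps up by +0.2 at x_d."""
    return 0.3 + (0.2 if x > x_d else 0.0)

def Q2(x):
    """Component for t = 2: jumps down by -0.2 at x_d."""
    return 0.7 - (0.2 if x > x_d else 0.0)

lam = (0.5, 0.5)  # heterogeneity weights lambda_t

def jump(f, x, eps=1e-9):
    """Right limit minus left limit at x (exact here: components are flat near x)."""
    return f(x + eps) - f(x - eps)

def Q_mix(x):
    """The observed mixture, as in (3.15)."""
    return lam[0] * Q1(x) + lam[1] * Q2(x)

mix_jump = jump(Q_mix, x_d)
print(jump(Q1, x_d), jump(Q2, x_d), mix_jump)
```

Here each component jumps by ±0.2 at x_d, yet the mixture jump is λ1·0.2 + λ2·(−0.2) = 0, so COND fails at x_d and no discontinuity is visible in the observed probabilities.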
Hence, the relation from the equilibrium behaviour to the jump(s) remains in the presence of unobserved heterogeneity.

3.3.4 Testing for Multiple Equilibria via Discontinuity

Propositions 1 to 3 show that in incomplete information games, the equilibrium behaviour of the game produces testable implications on continuity or discontinuity of the conditional choice probabilities. As a result, continuity or discontinuity of the conditional probabilities provides information about the equilibrium behaviour. Under Assumptions 1-3 and 4H, we can transform the problem of testing

H0: The game fits into Category (C1) or Sub-category (C2-1),
H1: The game fits into Category (C3) or Sub-category (C2-2),

into testing

H′0: Q_k(1|1, x) is continuous in x for all x ∈ int(X), for all k = 1, 2,
H′1: Q_k(1|1, x) has jump(s) at some x ∈ int(X) for some k ∈ {1, 2}.

Under Assumptions 1-3 and 4H, if H′0 is not rejected, the data provide no evidence against the null hypothesis that the system of structural equations has a unique solution for all x. The union of Category (C1) and (C2-1) is usually regarded as a subset of games with a unique equilibrium. Moreover, in practice the econometrician sometimes does impose the assumption that the game fits into (C1) or (C2-1). For instance, Aradillas-Lopez (2010) proposed a semi-parametric estimation strategy for incomplete information games with parametric payoff functions, under a "unique equilibrium" assumption that corresponds to the union of (C1) and (C2-1).²¹ In this scenario, testing for the presence of a jump in Q(1|1, x) checks the maintained assumption in Aradillas-Lopez (2010).

Note that none of the propositions and corollaries in this chapter requires independence between the players' private information. In other words, the implications of the equilibrium behaviour for continuity or discontinuity of the conditional choice probabilities are robust to correlated private information.
As a comparison, tests for multiple equilibria based on conditional dependence of the players' actions (see Paula and Tang (2012) and Aguirregabiria and Mira (2013), among others) require the private information to be independent across players (see Assumption 2S below). This is an advantage of the discontinuity criterion.

Assumption 2S (Independent private information). The private information U1 is independent of the private information U2.

We are aware that under Assumptions 1-3 and 4H, the information provided by continuity or discontinuity of the conditional choice probabilities does not fully separate the games with a unique equilibrium from those with multiple equilibria. However, it at least offers partial information about the equilibrium behaviour. The uniqueness or multiplicity of equilibria in incomplete information games is important for identification and estimation, especially when the private information is correlated among players. To the best of my knowledge, related literature such as Aradillas-Lopez (2010), Wan and Xu (2014) and Liu, Vuong, and Xu (2013) all maintain that the game has a unique equilibrium.

Moreover, knowing the presence of jump(s) in the conditional choice probabilities is useful if the econometrician wants to implement Paula and Tang (2012)'s test for multiple equilibria. Under Assumption 2S, they propose to test the following independence restriction:

Pr(Y1 = 1, Y2 = 1 | X = x) = Pr(Y1 = 1 | X = x) Pr(Y2 = 1 | X = x).

Implementation of such a test requires the estimation of three conditional choice probabilities. If any of them has a jump at some x, which is possible as we have shown, the econometrician cannot directly apply standard nonparametric estimation techniques such as kernel or series estimators.

²¹ Aradillas-Lopez (2010) treated "a unique equilibrium" as "a unique solution" to (3.7) (he considered a linear payoff function). See Propositions 2 and 3 and Assumption A3(ii) in his paper.
²⁰ Recall that a DGP in (C2-1) can be viewed as in (C1) if we let σ = q_{m*}(x) be the equilibrium characterizing equation, where m* is the index of the equilibrium that always appears in the data, by the definition of (C2-1).

3.4 An Estimation Problem Due to Jumps

From the last section, we know that even when the payoff functions and the latent distribution are all smooth, the conditional choice probabilities may have a jump. This raises a problem for estimating the conditional choice probabilities when the econometrician does not know a priori whether such jump(s) are present. The estimation of the conditional choice probabilities is usually the first stage for parameter estimation and inference in games with incomplete information (see, for example, Bajari, Hong, Krainer and Nekipelov (2010), Paula and Tang (2012), Wan and Xu (2014), and Aguirregabiria and Mira (2013), among others). If the conditional choice probabilities have a jump, standard nonparametric estimation techniques such as kernel or series estimators do not apply directly. On the other hand, because the conditional probability may not have a jump, the usual jump location estimator does not apply either (to be discussed in this section).

This section reviews a jump location estimator and explains why it does not apply immediately when the presence of a jump in the conditional choice probabilities is unknown. To simplify arguments, we assume Assumption 2S holds in this section.
The conclusion can be generalized to cases without Assumption 2S, as we can use the sub-sample with Y_{−k} = 1.

Under Assumption 2S, the choice of player −k does not appear in the conditioning set of the choice probability of player k, that is,

Q_k(1|1, x) = Q_k(1|x) ≡ Pr(Y_k = 1 | X = x), for k = 1, 2.

Under Assumption 2S, Aguirregabiria and Mira (2013) noticed that the conditional probabilities Pr(Y_i = (1, 1)′ | X_i = x) may be discontinuous at some x and "the econometrician does not know, ex-ante, the number and the location of these discontinuity points, and this complicates the application of smooth nonparametric estimators...". The solution they proposed is to first estimate the location(s) of the jump(s) using the method developed by Müller (1992) and Delgado and Hidalgo (2000). However, this jump location estimator relies on the existence of a jump in the regression function; in other words, the largest possible jump size must be strictly positive. To see this, note that the jump size appears multiplicatively in the asymptotic covariance of the jump location estimator (see equation (3.8) in Müller (1992) and Theorem 2 in Delgado and Hidalgo (2000)). Here we briefly review the jump location estimator proposed by Müller (1992), who considered the following nonparametric regression model for a univariate X:

Y = g(X) + U,

where Y ∈ R, X ∈ R, U ∈ R, E[U|X] = 0 and E[U²|X] = σ². The departure from the standard univariate regression model is that g(x) has a jump at x = τ. In particular,

g(x) = f(x) + ∆·1{x > τ},

where the function f(x) is twice continuously differentiable. The unknown jump location τ is the object of interest and ∆ measures the jump size. Müller (1992) assumed that ∆ > 0 and introduced the local jump

∆(x) = g₊(x) − g₋(x),

where

g₊(x) = lim_{t↓0} g(x + t),  g₋(x) = lim_{t↓0} g(x − t).

In Müller (1992), τ is estimated by

τ̂ = arg max_x ∆̂(x),  ∆̂(x) = ĝ₊(x) − ĝ₋(x),

where ĝ₊(x) and ĝ₋(x) are constructed using one-sided kernels.
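As an illustration, the following simulation (our own sketch; it uses simple one-sided uniform windows as a stand-in for Müller's one-sided boundary kernels) recovers the jump location in a model with a known jump of size ∆ = 1 at τ = 0.5:

```python
import numpy as np

rng = np.random.default_rng(0)
n, h, tau, delta = 2000, 0.05, 0.5, 1.0
X = rng.uniform(0.0, 1.0, n)

def f(x):
    """Smooth part of the regression function g."""
    return np.sin(2 * x)

Y = f(X) + delta * (X > tau) + rng.normal(0.0, 0.1, n)

def one_sided_means(x):
    """Local-constant estimates g_hat_plus(x), g_hat_minus(x) from windows of width h."""
    right = Y[(X > x) & (X <= x + h)]
    left = Y[(X >= x - h) & (X <= x)]
    return right.mean(), left.mean()

grid = np.linspace(h, 1 - h, 400)
jumps = []
for x in grid:
    g_plus, g_minus = one_sided_means(x)
    jumps.append(g_plus - g_minus)          # Delta_hat(x)
tau_hat = grid[int(np.argmax(jumps))]       # tau_hat = argmax_x Delta_hat(x)
print(round(tau_hat, 3))
```

With a jump this large relative to the noise, ∆̂(x) peaks sharply at τ and the argmax lands within roughly one bandwidth of the true location; if ∆ were zero, the argmax would instead wander over pure noise, which is exactly the divergence problem discussed below.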
The asymptotic distribution of τ̂ is obtained from the weak convergence of a local deviation process. Let δ̂(y) = ∆̂(τ + yh) and define the following process, evaluated at the local deviation (nh)^{−1/2}z (n denotes the sample size and h the bandwidth):

ξ_n(z) = nh[ δ̂((nh)^{−1/2}z) − δ̂(0) ].

Theorem 3.1 of Müller (1992) showed that ξ_n(z) weakly converges to ξ(z), where

ξ(z) = −∆z²K₋(0)/2 + Wz,  W ∼ N(0, 2σ² ∫ K₋²(v)dv),

and K₋(·) is the one-sided kernel function. By the construction of τ̂ and ξ(z), we have

τ̂ = τ + (nh)^{−1/2} h Z_n,  Z_n = arg max_z ξ_n(z).

Let Z* = arg max_z ξ(z); we can compute Z* = W/(∆K₋(0)), which gives the identification of Z* if ∆ > 0. Corollary 3.1 of Müller (1992) established the asymptotic distribution of the estimator τ̂ as

(nh)^{1/2}(τ̂ − τ) →_d N( 0, (2σ²/(∆K₋(0))²) ∫ K₋²(v)dv ).    (3.16)

Given that, the asymptotic distribution of the estimated jump size, ∆̂(τ̂), can be derived as (see Corollary 3.2 of Müller (1992))

(nh)^{1/2}(∆̂(τ̂) − ∆) →_d N( 0, 2σ² ∫ K₋²(v)dv ).

We can see that when E[Y|X = x] = g(x) is continuous in x (i.e., ∆(x) = 0 for all x), the estimator τ̂ will diverge. Therefore, the jump location estimator developed by Müller (1992) and Delgado and Hidalgo (2000) does not immediately apply when the presence of jump(s) in the regression function is unknown. In incomplete information games, the conditional choice probabilities may or may not have a jump in x, depending on the equilibrium behaviour. As a result, the econometrician cannot utilize the jump location estimator before knowing whether jump(s) are present. This calls for a test for the presence of jump(s) in the conditional choice probabilities in the first place.

3.5 Testing for the Presence of Jump(s)

In nonparametric estimation, the regression function is usually assumed to be twice continuously differentiable. However, we have shown that even when the payoff functions and the latent distribution are all smooth, the equilibrium behaviour may give rise to jump(s) in the conditional choice probabilities.
Hence the standard smoothness conditions imposed on regression functions are too restrictive for the conditional choice probabilities. There are generally two approaches to this problem. The first is to conduct a testing procedure for the presence of jump(s) in the conditional choice probabilities: if the conditional probabilities are continuous in the covariates, standard smooth nonparametric estimators apply; otherwise, the econometrician may estimate the locations of the jumps (for example, using the method of Müller (1992) and Delgado and Hidalgo (2000)). An alternative approach is to estimate the conditional probability without the smoothness conditions (for example, using wavelet methods). In this section, we discuss the first approach, that is, testing for the presence of a jump in the conditional choice probabilities. Since the presence of a jump is related to the equilibrium behaviour of the game, such a test not only helps the econometrician choose an appropriate estimation approach, but also reveals information about the equilibrium behaviour of the game.

Suppose we are interested in testing

H′0: Q_k(1|1, x) is continuous in x on int(X), for all k = 1, 2,
H′1: Q_k(1|1, x) has jump(s) at some x ∈ int(X), for some k ∈ {1, 2}.

Since there are two players, the null hypothesis contains two sub-hypotheses. Taking this into account, if we focus on the conditional probability for each of the players, the critical value must be adjusted (using a Bonferroni correction, for example). In the following, we focus on testing the presence of a jump in the conditional choice probability for player 1.

The observed data is an i.i.d. sample {(Y_{1i}, Y_{2i}, X′_i)}^N_{i=1}.
Let J = dim(X). When J = 1, for a fixed x, let ĝ₁⁺(x) and ĝ₁⁻(x) be two kernel-type estimators of Q1(1|1, x) using one-sided kernel functions.²² An estimator of the local jump at x can be written as

∆_n(x) = ĝ₁⁺(x) − ĝ₁⁻(x),

where

ĝ⁺(x) = [ Σ^N_{i=1} Y_{1i}Y_{2i}K₊((X_i − x)/h_n) ] / [ Σ^N_{i=1} Y_{2i}K₊((X_i − x)/h_n) ],
ĝ⁻(x) = [ Σ^N_{i=1} Y_{1i}Y_{2i}K₋((X_i − x)/h_n) ] / [ Σ^N_{i=1} Y_{2i}K₋((X_i − x)/h_n) ],

K₊(·) and K₋(·) are one-sided kernel functions, and h_n is the bandwidth. The next step is to construct a test statistic by aggregating the local jumps over the support. The most straightforward way is to use the supremum. To find the critical value,²³ one can use the result from Hamrouni (1999).²⁴ There are other ways to aggregate the local jumps. For instance, Bowman, Pope and Ismail (2006) developed a test based on the sum of squared pointwise jumps for univariate and bivariate regression functions. Alternatively, Müller and Stadtmüller (1999) showed that in univariate regression models, the sum of squared jump sizes can be represented as a coefficient of an asymptotic linear model, with the squared difference in Y as the dependent variable and twice the standard deviation of the error term as the intercept. Hence they developed a test that checks whether the sum of the squared jump sizes equals zero. Gijbels and Goderniaux (2004) identified a discontinuity as the point with the largest derivative. When J > 1, the local jump at a given x must be checked along different directions; see Qiu (1997, 2002). A recent development in the literature related to this testing problem lies in Chernozhukov, Chetverikov and Kato (2012, 2013).

²² Properties of the one-sided kernel functions for J = 1 can be found in Definition 1.1 of Hamrouni (1999).
²³ Here one cannot directly apply the Continuous Mapping Theorem to deal with the supremum, because the underlying empirical process of kernel estimators does not converge weakly.
²⁴ See Theorem 2.8 of Hamrouni (1999) for a complete statement of the limiting distribution.
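The local jump estimator ∆_n(x) and its sup aggregation can be sketched as follows. This is our own illustration with a hypothetical DGP (a built-in jump of 0.3 in Pr(Y1 = 1 | Y2 = 1, x) at x = 0.6) and one-sided uniform kernels standing in for the one-sided kernels of Hamrouni (1999):

```python
import numpy as np

rng = np.random.default_rng(1)
N, h = 40000, 0.04
X = rng.uniform(0.0, 1.0, N)
# Hypothetical DGP: Pr(Y2 = 1) = 0.7 and Pr(Y1 = 1 | Y2 = 1, x) jumps by 0.3 at x = 0.6.
Y2 = (rng.uniform(size=N) < 0.7).astype(int)
q = 0.4 + 0.3 * (X > 0.6)                      # conditional choice prob. given Y2 = 1
p1 = np.where(Y2 == 1, q, 0.5)
Y1 = (rng.uniform(size=N) < p1).astype(int)

def delta_n(x):
    """Delta_n(x) = g_plus(x) - g_minus(x) with one-sided uniform kernels of bandwidth h."""
    Kp = ((X > x) & (X <= x + h)).astype(float)   # right-sided kernel weights
    Km = ((X >= x - h) & (X <= x)).astype(float)  # left-sided kernel weights
    g_plus = (Y1 * Y2 * Kp).sum() / (Y2 * Kp).sum()
    g_minus = (Y1 * Y2 * Km).sum() / (Y2 * Km).sum()
    return g_plus - g_minus

grid = np.linspace(h, 1 - h, 200)
vals = np.array([delta_n(x) for x in grid])
stat = np.abs(vals).max()                      # sup of local jumps over the support
x_hat = grid[int(np.abs(vals).argmax())]
print(round(stat, 3), round(x_hat, 3))
```

The sup statistic peaks near the true discontinuity; turning this into a formal test requires a critical value for the supremum, which is the step the surrounding discussion addresses.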
Chernozhukov, Chetverikov and Kato show that the supremum of an empirical process of kernel-type estimators can be approximated by the supremum of a Gaussian multiplier bootstrap. This technique can be applied to our set-up. The details of constructing a test for the presence of jump(s) in the conditional choice probabilities in incomplete information games are left for future work.

3.6 Conclusions and Remarks

In this chapter, we show that in binary games with incomplete information, the conditional choice probabilities may have a jump with respect to the continuous covariates, even when the payoff functions and the distribution of the latent variables are all smooth. The source of such jumps lies in the equilibrium behaviour of the game. We further analyse the conditions under which the conditional choice probabilities exhibit a jump. The relationship between the equilibrium behaviour and the presence of jump(s) in the conditional choice probabilities is robust to correlated private information and unobserved heterogeneity. As a result, continuity or discontinuity in the conditional choice probabilities reveals information about the equilibrium behaviour.

Testing for the presence of a jump in the conditional choice probabilities also matters if the econometrician wants to estimate the conditional choice probability, which is usually the starting point for parameter estimation and inference in incomplete information games. This chapter points out that when the econometrician does not know whether jump(s) are present in the conditional choice probability, the jump location estimator (of Müller (1992) and Delgado and Hidalgo (2000)) may not be consistent.

Lastly, we briefly discuss the construction of a test for the presence of a jump in the conditional choice probabilities. The test statistic can be constructed as the supremum of the local jumps, where each local jump can be computed as the difference between two kernel-type estimators (using one-sided kernels).
We leave the detailed construction of the test statistic and critical values for future work.

Chapter 4

Efficient Inference in Econometric Models When Identification Can Be Weak

4.1 Introduction

Weak instrumental variables (Weak-IV) have received much attention in econometrics, in particular following Staiger and Stock (1997), who developed an analytical framework for analysing the effect of weak instruments and constructing weak-identification-robust methods of inference, and Dufour (1997), who showed that the usual bounded confidence sets cannot be valid in the case of weak identification. This chapter considers testing the coefficient of an endogenous variable in instrumental variables regression models (linear IV models) when the instruments may be weak. A standard approach to the construction of Weak-IV robust tests is to use null-restricted residuals, which are obtained by imposing the value specified under a null hypothesis for the coefficients of the endogenous regressors. The hypothesis can then be tested by considering the sample covariance between the null-restricted residuals and the instrumental variables. The approach is robust to weak identification problems because, under the null hypothesis, the distribution of the covariance term does not depend asymptotically on the strength of the correlation between the endogenous regressors and the instruments. This idea is behind the Anderson-Rubin (AR) statistic; see Anderson and Rubin (1949) and Staiger and Stock (1997).

Tests based on the AR statistic are efficient if the model is just identified. However, the power of AR-type tests is inferior to that of the usual t- and Wald tests when the model is overidentified and the instruments are strong. This is because, when the model is overidentified, the AR approach tests more restrictions than the dimension of the parameter of interest. To address this issue, several papers suggested alternatives to the AR statistic. Kleibergen (2002, 2007) and Moreira (2001, 2003) proposed
Lagrange Multiplier (LM) and Conditional Likelihood Ratio (CLR)-type statistics. Kleibergen's LM (KLM) tests can be used with the usual χ² critical values. CLR tests require simulations to generate critical values; however, they have been demonstrated to have better power properties in Monte Carlo simulations than KLM or AR tests. Andrews, Moreira and Stock (2006) showed that, for normal linear instrumental variables regression models with homoskedastic errors, KLM and CLR tests are efficient among Weak-IV robust tests when the instrumental variables are strong (Strong-IV). When the instrumental variables are weak, there is no uniformly most powerful (UMP) test except in the just identified case, so efficiency depends on the optimizing criterion considered by the econometrician. Andrews, Moreira and Stock (2006) considered a weighted average power (WAP) at two values of the parameters and derived the optimal average power test. They also numerically demonstrated that in the Weak-IV scenario, the CLR test dominates the AR and KLM tests and attains the power envelope given by the optimal average power test. As a result, they recommended the CLR test to empirical researchers when the instruments may be weak and the model is over-identified. Cattaneo, Crump and Jansson (2012) extended the results of Andrews, Moreira and Stock (2006) to non-normal errors by using an asymptotic framework of Gaussian experiments. Chernozhukov, Hansen and Jansson (2009) showed that all members of the weighted average power likelihood ratio tests are admissible, including the AR test.

This chapter is concerned with a different optimizing criterion, namely the power against alternative hypotheses determined by arbitrarily large deviations from the null hypothesis. In the Weak-IV scenario, the power of any robust test may be far below 1 even for arbitrarily large deviations.
The power of a test for such alternatives is also related to the length of confidence intervals constructed by test inversion.

In this chapter, we focus on an asymptotic experiment following Cattaneo, Crump and Jansson (2012) and Choi and Schick (1996). This asymptotic experiment framework substantially simplifies the analysis by reducing a complex inference problem to one based on a normally distributed vector. It allows one to derive an efficiency bound in the presence of nuisance parameters. In this chapter, we first derive the optimal test for alternatives that are arbitrarily far away from the null, among all rotation-invariant and asymptotically similar tests. To do this, we follow the method of Andrews, Moreira and Stock (2006) and Mills, Moreira and Vilela (2013), with a focus on alternatives determined by arbitrarily large deviations from the null. Then we use the notion of efficiency in Choi and Schick (1996) to obtain a power envelope under Weak-IV in the worst-case scenario with respect to a perturbation of the nuisance parameter. After that, we compare the power of popular Weak-IV robust tests (AR, KLM and CLR tests) when alternatives are determined by arbitrarily large deviations from the null. In particular, we find that the relative performance of the AR test versus the CLR test depends on the degree of endogeneity in the model. For a relatively low degree of endogeneity, the AR test outperforms the CLR test, while for a relatively large degree of endogeneity, the order is reversed.
This result suggests that the CLR test does not dominate the AR test under a different (but reasonable) optimizing criterion from the WAP considered by Andrews, Moreira and Stock (2006). In addition, we propose a new Weak-IV robust test, the Conditional Lagrange Multiplier (CLM) test, which is asymptotically efficient in the Strong-IV case, robust to weak instruments, and exhibits the same power as the AR test for arbitrarily large deviations from the null. Lastly, we extend the investigation to heteroskedastic models. In particular, we find that the generalized likelihood ratio statistic in heteroskedastic models reduces to the AR statistic in the Weak-IV scenario when alternatives are determined by arbitrarily large deviations from the null.

4.1.1 Organization of Chapter 4

The plan of the chapter is as follows. Section 4.2 sets up the asymptotic experiment framework. Section 4.3 describes the optimal rotation-invariant and asymptotically similar test when alternatives are determined by arbitrarily large deviations from the null. Section 4.4 characterizes the power envelope in the worst-case scenario with respect to a perturbation of the nuisance parameter. Section 4.5 compares the power properties of the AR, CLR and KLM tests under Weak-IV when alternatives are determined by arbitrarily large deviations from the null; this section also proposes a new Weak-IV robust test, the CLM test. Section 4.6 extends the framework to heteroskedastic models. The last section concludes.

4.2 An Asymptotic Experiment for Linear IV Models

Consider linear IV models with a single endogenous regressor. The structural equation is

y1 = y2γ + Z2β + u,    (4.1)
and the first stage regression is

y2 = Z1 π1,n + Z2 π2 + v,    (4.2)

where y1, y2 ∈ R^n, Z1 ∈ R^(n×l1), and Z2 ∈ R^(n×l2) are observed variables; u, v ∈ R^n are unobserved error terms; the coefficients γ ∈ R, β, π2 ∈ R^(l2), and π1,n ∈ R^(l1) are unknown parameters, among which γ is the structural parameter of interest. Assumption 1 characterizes the Weak-IV scenario.

Assumption 1. (Weak-IV) π1,n = n^(−1/2) C, where C ∈ R^(l1) is fixed.

We assume that the data are an i.i.d. sample, the instrumental variables are uncorrelated with the unobserved error terms, the errors in the model are homoskedastic, and the instrumental variables have finite second-order moments.

Assumption 2. (a) {(y1i, y2i, Z1i, Z2i), i = 1, ..., n} are i.i.d.
(b) E[(Z1i′, Z2i′)′ (ui, vi)] = 0.
(c) E[[ui², ui vi; ui vi, vi²] | Z1i, Z2i] = [σu², σuv; σuv, σv²] is a finite and positive definite matrix.
(d) E[[Z1i Z1i′, Z1i Z2i′; Z2i Z1i′, Z2i Z2i′]] = [Q11, Q12; Q12′, Q22] = Q is a finite and positive definite matrix.

By Assumption 2(c) and the fact that Z1′M2Z1/n →p Q1·2, we have

(1/√n) [Z1′M2 u; Z1′M2 v] →d N(0, [σu², σuv; σuv, σv²] ⊗ Q1·2),    (4.3)

where Q1·2 = Q11 − Q12 Q22^(−1) Q12′. Note that (4.3) is a characterization of homoskedasticity.

The null hypothesis is H0 : γ = γ0. Let ∆ = γ − γ0. The following two statistics Sn and Tn (and their normalized versions Sn* and Tn*) will be used repeatedly in this chapter. We construct the statistic Sn ∈ R^(l1) as the sample covariance between the null-restricted residuals and the instrumental variables:

Sn = Z1′M2(y1 − y2 γ0)/n = (Z1′M2Z1/n)(∆C/√n) + Z1′M2(u + ∆v)/n,    (4.4)

where M2 = In − Z2(Z2′Z2)^(−1)Z2′ is an orthogonal projection matrix. Note that the second equality is implied by Assumption 1. Let π̂1,n be the OLS estimator of π1,n in the first stage regression, that is,

π̂1,n = (Z1′M2Z1)^(−1) Z1′M2 y2 = (Z1′M2Z1/n)^(−1) (Z1′M2 v/n) + C/√n,    (4.5)

where the second equality comes from Assumption 1.
Define

σ²(∆) = σu² + ∆²σv² + 2∆σuv.

Expressions (4.4) and (4.5) give rise to the asymptotic distribution of √n [Sn′, π̂1,n′]′ in the Weak-IV scenario:

N([∆Q1·2C; C], [σ²(∆)Q1·2, (σuv + ∆σv²)Il1; (σuv + ∆σv²)Il1, σv²Q1·2^(−1)]).

Next we decompose π̂1,n into two parts: the population projection of π̂1,n onto the space of Sn, and the part orthogonal in population to Sn. We take the second part and construct the statistic Tn ∈ R^(l1), which is asymptotically uncorrelated with Sn:

Tn = π̂1,n − ((σuv + ∆σv²)/σ²(∆)) Q1·2^(−1) Sn.

The asymptotic distribution of √n [Sn′, Tn′]′ is

N([∆Q1·2C; ((σu² + ∆σuv)/σ²(∆)) C], [σ²(∆)Q1·2, 0l1; 0l1, ((σu²σv² − σuv²)/σ²(∆)) Q1·2^(−1)]).

We normalize Sn and Tn by letting

Sn* = Q1·2^(−1/2) √n Sn / σ(∆),

and

Tn* = σ(∆) Q1·2^(1/2) √n Tn / (σu²σv² − σuv²)^(1/2).

Since the covariance matrix of the reduced-form errors [ui + ∆vi, vi]′ can be consistently estimated, in this chapter we treat Q1·2, σ(∆) and σu²σv² − σuv² as known. Note that Q1·2 can be consistently estimated by Z1′M2Z1/n. Let Ω denote the 2×2 covariance matrix of the reduced-form errors²⁵ [ui + ∆vi, vi]′. Then σ²(∆) is the upper-left element of Ω, σuv + ∆σv² is the upper-right element of Ω, and σu²σv² − σuv² is the determinant of Ω.

Furthermore, we obtain the following asymptotic experiment:

[Sn*; Tn*] ∼a N([(∆/σ(∆)) Q1·2^(1/2) C; ((σu² + ∆σuv)/(σ(∆)(σu²σv² − σuv²)^(1/2))) Q1·2^(1/2) C], I2l1).    (4.6)

Define N to be a 2l1 × 1 normal random vector with zero mean and identity covariance matrix, and let

[S*; T*] = N + [(∆/σ(∆)) Q1·2^(1/2) C; ((σu² + ∆σuv)/(σ(∆)(σu²σv² − σuv²)^(1/2))) Q1·2^(1/2) C].

Clearly, [S*; T*] is distributed as the right-hand side of (4.6), and we have [Sn*; Tn*] →d [S*; T*].

We are particularly interested in alternatives determined by arbitrarily large deviations from the null, that is, ∆ → ∞. In the following investigation, we first analyse the case ∆ → +∞; the case ∆ → −∞ can be treated analogously.
Consider the asymptotic experiments under the null and the alternatives. Under H0 : ∆ = 0,

[S*; T*] ∼ N([0; (1/√(1−ρ²)) λ], I2l1),    (4.7)

where λ = (1/σv) Q1·2^(1/2) C and ρ = σuv/(σu σv). Under H1 : ∆ → +∞,

[S*; T*] ∼ N([λ; (ρ/√(1−ρ²)) λ], I2l1).    (4.8)

Similarly, under H1 : ∆ → −∞,

[S*; T*] ∼ N(−[λ; (ρ/√(1−ρ²)) λ], I2l1).    (4.9)

²⁵ Here the reduced-form equations are [y1 − γ0 y2; y2] = Z1 [∆π1,n; π1,n] + Z2 [∆π2 + β; π2] + [u + ∆v; v].

Asymptotic experiments (4.7), (4.8) and (4.9) are the building blocks of this chapter. The means of these limiting Gaussian distributions are determined by three parameters: l1, λ and ρ. The parameter l1 is the number of instruments; the norm of λ determines the strength of the instruments; and the parameter ρ measures the degree of endogeneity of the linear IV model.

4.3 The Optimal Rotational Invariant and Asymptotically Similar Test

In this section, we derive the asymptotically optimal test against ∆ → ∞ among all rotational invariant and asymptotically similar tests. Here rotational invariance means that a test is not affected by any orthonormal transformation of the instruments. Asymptotic similarity means that the asymptotic null rejection rate of a test is not affected by π1,n. Similar tests are robust to weak instruments, since the norm of π1,n determines the strength of the instruments. Define the test statistics

Qn = [Qsn, Qstn; Qstn′, Qtn] = [Sn*, Tn*]′ [Sn*, Tn*],

Q = [Qs, Qst; Qst′, Qt] = [S*, T*]′ [S*, T*].

In what follows, we consider test statistics that are functions of Qn, because Andrews, Moreira and Stock (2006) showed that every rotational invariant test can be written as a function of Qn. In addition, they showed that an invariant test is asymptotically similar with significance level α if and only if the asymptotic null rejection rate of the test equals α conditional on the value of Qtn. Therefore, we can restrict our attention to tests that are functions of Qn.
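To make the roles of l1, λ and ρ concrete, the limiting experiments (4.7)-(4.9) can be simulated directly. The sketch below is our own illustration (function and variable names are ours, parameter values arbitrary): it draws [S*, T*] and checks that S*′S* behaves like a central χ² with l1 degrees of freedom under the null, and like a non-central χ² with non-centrality ‖λ‖² under ∆ → +∞.

```python
import numpy as np

def draw_experiment(lam, rho, delta_sign=None, rng=None, size=10_000):
    """Draw [S*, T*] from the limiting experiments (4.7)-(4.9).

    lam: the l1-vector lambda; rho: degree of endogeneity;
    delta_sign: None for H0, +1 for Delta -> +inf, -1 for Delta -> -inf.
    Returns arrays S, T of shape (size, l1)."""
    rng = np.random.default_rng() if rng is None else rng
    lam = np.asarray(lam, dtype=float)
    l1 = lam.size
    if delta_sign is None:                      # H0: mean (0, lam/sqrt(1-rho^2))
        mu_s = np.zeros(l1)
        mu_t = lam / np.sqrt(1 - rho**2)
    else:                                       # H1: mean +-(lam, rho*lam/sqrt(1-rho^2))
        mu_s = delta_sign * lam
        mu_t = delta_sign * rho * lam / np.sqrt(1 - rho**2)
    S = rng.standard_normal((size, l1)) + mu_s
    T = rng.standard_normal((size, l1)) + mu_t
    return S, T

rng = np.random.default_rng(0)
lam = np.array([2.0, 1.0])                      # l1 = 2, ||lam||^2 = 5
S0, _ = draw_experiment(lam, 0.5, None, rng)    # under H0
S1, _ = draw_experiment(lam, 0.5, +1, rng)      # under Delta -> +inf
# Mean of S*'S*: about l1 under H0, about l1 + ||lam||^2 under the alternative.
print((S0**2).sum(axis=1).mean(), (S1**2).sum(axis=1).mean())
```
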
Popular Weak-IV robust tests, such as the AR, KLM and CLR tests, can be written as functions of Qn.

From the asymptotic experiment (4.6), the l1×2 random matrix [S*, T*] is multivariate normal with mean matrix

M = Q1·2^(1/2) C [∆/σ(∆), (σu² + ∆σuv)/(σ(∆)(σu²σv² − σuv²)^(1/2))],

and identity covariance matrix. The 2×2 random matrix Q has a non-central Wishart distribution with a rank-one mean matrix and identity covariance matrix. Therefore we can calculate the density of [Qs, Qst] conditional on Qt, and then use the Neyman-Pearson Lemma to construct the optimal test against H1 : ∆ → +∞ or ∆ → −∞. This method has been used by Andrews, Moreira and Stock (2006) and Mills, Moreira and Vilela (2013). Here we focus in particular on alternatives corresponding to arbitrarily large deviations from the null. The following proposition describes the optimal rotational invariant and asymptotically similar test in the Weak-IV scenario for such alternatives.

Proposition 1. In the linear IV model (4.1) and (4.2), suppose that Assumptions 1 and 2 hold. Consider the testing problem

H0 : ∆ = 0, H1 : ∆ → ∞.

Then the test that rejects H0 when

POIS∞(Qsn, Qstn) = Qsn + (2ρ/√(1−ρ²)) Qstn > κ∞(Qtn)

maximizes the asymptotic power over all tests that are functions of Qn with asymptotic size α, where κ∞(Qtn) is the (1−α)th quantile of the distribution of POIS∞(Qsn, Qstn) conditional on Qtn under H0.

Proof. We first characterize the optimal invariant and similar test based on the asymptotic statistic Q.
If the optimal test statistic and the critical value are continuous in Q, then by the Continuous Mapping Theorem, replacing Q with Qn yields a test that attains the asymptotic power envelope among all invariant and asymptotically similar tests.

The density of (Qs, Qst, Qt) is

fQ(qs, qst, qt) = K1 exp(−tr(M′M)/2) det(q)^((l1−3)/2) exp(−tr(q)/2) × (tr(M′Mq))^(−(l1−2)/4) I_((l1−2)/2)(√(tr(M′Mq))),

where

Iv(x) = (x/2)^v Σ_{j=0}^∞ (x²/4)^j / (j! Γ(v+j+1)),

and K1 is a constant depending only on l1. We calculate the mean matrices under H0 and H1, respectively:

under H0 : ∆ = 0,  M = [0, (1/√(1−ρ²)) λ];
under H1 : ∆ → +∞,  M = [λ, (ρ/√(1−ρ²)) λ].

Therefore, under H0, the density of (Qs, Qst, Qt) is

f0Q(qs, qst, qt) = K1 exp(−‖λ‖²/(2(1−ρ²))) det(q)^((l1−3)/2) exp(−(qs+qt)/2) × (‖λ‖² qt/(1−ρ²))^(−(l1−2)/4) I_((l1−2)/2)(√(‖λ‖² qt/(1−ρ²))).

Under H1 : ∆ → +∞, the density of (Qs, Qst, Qt) becomes

f1Q(qs, qst, qt) = K1 exp(−‖λ‖²/(2(1−ρ²))) det(q)^((l1−3)/2) exp(−(qs+qt)/2) × (‖λ‖² ζ(q))^(−(l1−2)/4) I_((l1−2)/2)(√(‖λ‖² ζ(q))),

where

ζ(q) = qs + (2ρ/√(1−ρ²)) qst + (ρ²/(1−ρ²)) qt.

The random variable Qt has a non-central chi-square distribution, and its densities under H0 and H1 can be written as

f0Qt(qt) = (1/2) exp(−‖λ‖²/(2(1−ρ²))) qt^((l1−2)/2) exp(−qt/2) × (‖λ‖² qt/(1−ρ²))^(−(l1−2)/4) I_((l1−2)/2)(√(‖λ‖² qt/(1−ρ²))),

and

f1Qt(qt) = (1/2) exp(−ρ²‖λ‖²/(2(1−ρ²))) qt^((l1−2)/2) exp(−qt/2) × (‖λ‖² ρ² qt/(1−ρ²))^(−(l1−2)/4) I_((l1−2)/2)(√(‖λ‖² ρ² qt/(1−ρ²))).

Therefore, we can compute the densities of (Qs, Qst) conditional on Qt under H0 and H1, respectively. As a result, the likelihood ratio can be written as

LR(q) = [f1Q(qs, qst, qt)/f1Qt(qt)] / [f0Q(qs, qst, qt)/f0Qt(qt)]
      = exp(−‖λ‖²/2) (‖λ‖² ζ(q))^(−(l1−2)/4) I_((l1−2)/2)(√(‖λ‖² ζ(q))) / [(‖λ‖² ρ² qt/(1−ρ²))^(−(l1−2)/4) I_((l1−2)/2)(√(‖λ‖² ρ² qt/(1−ρ²)))]
      = 2^(−(l1−2)/2) exp(−‖λ‖²/2) Σ_{j=0}^∞ (‖λ‖² ζ(q)/4)^j / (j! Γ(l1/2+j)) / [(‖λ‖² ρ² qt/(1−ρ²))^(−(l1−2)/4) I_((l1−2)/2)(√(‖λ‖² ρ² qt/(1−ρ²)))].    (4.10)

Note that the denominator of (4.10) is a function of qt only, and the numerator is an increasing function of ζ(q).
Therefore, by the Neyman-Pearson Lemma, the optimal test against H1 : ∆ → +∞ rejects H0 when

POIS∞(Qs, Qst) = Qs + (2ρ/√(1−ρ²)) Qst > κ∞(Qt),

where the critical value κ∞(Qt) is the (1−α)th quantile of the distribution of POIS∞(Qs, Qst) conditional on Qt under H0.

Since both POIS∞(·) and κ∞(·) are continuous functions of Q, by the Continuous Mapping Theorem, the test that rejects H0 when

POIS∞(Qsn, Qstn) = Qsn + (2ρ/√(1−ρ²)) Qstn > κ∞(Qtn)

maximizes the asymptotic power over all tests that are functions of Qn with asymptotic size α, where κ∞(Qtn) is the (1−α)th quantile of the distribution of POIS∞(Qsn, Qstn) conditional on Qtn under H0. Exactly the same result can be obtained for H1 : ∆ → −∞.

The optimal test given by Proposition 1 does not depend on the nuisance parameter C, which cannot be consistently estimated in the Weak-IV scenario. However, the optimal test is still infeasible because ρ is unknown and cannot be consistently estimated in linear IV models. The parameter ρ measures the degree of endogeneity in the model.

Mills, Moreira and Vilela (2013) proposed a feasible test for arbitrarily large ∆. Their test rejects H0 when

Qsn + 2(det(Ω))^(−1/2) (γ0 ω22 − ω12) Qstn > κ∞(Qtn),

where κ∞(Qtn) is the (1−α)th quantile of the null distribution of the left-hand side conditional on Qtn, and Ω = [ω11, ω12; ω12, ω22] is the covariance matrix of the reduced-form errors [u + γv, v]′. However, when γ → ∞, γ0 ω22 − ω12 = −∆σv² − σuv, which diverges. Therefore, the behaviour of the test statistic is determined by the second term Qstn.
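The conditional critical value κ∞(Qtn) can be simulated. One convenient representation (our own derivation, not code from the thesis): under H0 and conditional on Qt = qt, splitting S* into its components along and orthogonal to T* gives Qs = Z² + W and Qst = √qt · Z, where Z ∼ N(0,1) and W ∼ χ² with l1 − 1 degrees of freedom, independently. A minimal sketch (names are ours):

```python
import numpy as np

def kappa_inf(qt, rho, l1, alpha=0.05, reps=200_000, seed=0):
    """Monte Carlo (1-alpha) quantile of POIS_inf = Qs + 2*rho/sqrt(1-rho^2)*Qst,
    conditional on Qt = qt, under H0.  Uses the representation
    Qs = Z^2 + W, Qst = sqrt(qt)*Z with Z ~ N(0,1), W ~ chi2_{l1-1}."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(reps)
    w = rng.chisquare(l1 - 1, reps) if l1 > 1 else 0.0
    pois = z**2 + w + 2 * rho / np.sqrt(1 - rho**2) * np.sqrt(qt) * z
    return np.quantile(pois, 1 - alpha)

# With rho = 0 the statistic reduces to Qs ~ chi2_{l1}, so kappa_inf should be
# close to the usual chi-square critical value (about 9.49 for l1 = 4, alpha = .05).
print(kappa_inf(qt=10.0, rho=0.0, l1=4))
```
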
A test based on Qstn exhibits undesirable power properties similar to Kleibergen's K (KLM) test.²⁶ The power performance of the KLM test will be illustrated in Section 4.5.

4.4 The Power Envelope Under an Unknown Nuisance Parameter

In this section, we derive a power envelope in the worst scenario with respect to a perturbation to the nuisance parameter C, in the Weak-IV scenario. Consider a perturbation to the nuisance parameter C, i.e., C = C0 + τ. We focus on the asymptotic power envelope for a joint test of

H0 : ∆ = 0 and τ = 0, against H1 : ∆ → ∞ and τ ≠ 0.

The following proposition characterizes the power envelope in this case.

Proposition 2. In the linear IV model (4.1) and (4.2), suppose that Assumptions 1 and 2 hold. Consider the testing problem

H0 : ∆ = 0 and τ = 0, against H1 : ∆ → ∞ and τ ≠ 0.

Then the asymptotic power envelope in the worst scenario with respect to τ, for tests with asymptotic size α, is

Pr(G* > χ²_{1,1−α}), where G* ∼ χ²₁(C0′Q1·2C0/σv²).

That is, the optimal test statistic in the worst scenario with respect to τ has a non-central chi-square limiting distribution with one degree of freedom and non-centrality parameter C0′Q1·2C0/σv². The critical value χ²_{1,1−α} is the (1−α)th quantile of a central chi-square distribution.

²⁶ Mills, Moreira and Vilela (2013) first computed the optimal test for a fixed γ, and then sent γ to infinity. However, ω12 is a function of γ and also diverges; this is not taken into account in their test statistic.

Proof.
For a fixed (∆, τ), the asymptotic experiment in (4.6) becomes

[S*; T*] ∼ N([(∆/σ(∆)) Q1·2^(1/2)(C0+τ); ((σu²+∆σuv)/(σ(∆)(σu²σv²−σuv²)^(1/2))) Q1·2^(1/2)(C0+τ)], I2l1).

The likelihood ratio statistic for H1 : ∆ ≠ 0, C = C0 + τ is

LR = −(S* − (∆/σ(∆)) Q1·2^(1/2)(C0+τ))′(S* − (∆/σ(∆)) Q1·2^(1/2)(C0+τ)) + S*′S*
   − (T* − ((σu²+∆σuv)/(σ(∆)(σu²σv²−σuv²)^(1/2))) Q1·2^(1/2)(C0+τ))′(T* − ((σu²+∆σuv)/(σ(∆)(σu²σv²−σuv²)^(1/2))) Q1·2^(1/2)(C0+τ))
   + (T* − σu Q1·2^(1/2)C0/(σu²σv²−σuv²)^(1/2))′(T* − σu Q1·2^(1/2)C0/(σu²σv²−σuv²)^(1/2)).    (4.11)

When ∆ → +∞, the expression in (4.11) becomes

LR+∞ = 2W+∞ + K,

where K is a constant and

W+∞ = S*′Q1·2^(1/2)(C0+τ)/σv + (T* − σu Q1·2^(1/2)C0/(σu²σv²−σuv²)^(1/2))′ (Q1·2^(1/2)(σuv(C0+τ) − σuσvC0)/(σv(σu²σv²−σuv²)^(1/2))).

We compute

E[W+∞] = (C0+τ)′Q1·2(C0+τ)/σv² + ((C0+τ)σuv − σuσvC0)′Q1·2((C0+τ)σuv − σuσvC0)/(σv²(σu²σv²−σuv²)),    (4.12)

and Var[W+∞] = E[W+∞]. Thus we have

G+∞ = W+∞²/Var[W+∞] ∼ χ²₁(E[W+∞]).

The asymptotically optimal level-α test against H1 : ∆ → +∞ rejects H0 when

Gn,+∞ = Wn,+∞²/Var[W+∞] > χ²_{1,1−α},

where

Wn,+∞ = Sn*′Q1·2^(1/2)(C0+τ)/σv + (Tn* − σu Q1·2^(1/2)C0/(σu²σv²−σuv²)^(1/2))′ (Q1·2^(1/2)(σuv(C0+τ) − σuσvC0)/(σv(σu²σv²−σuv²)^(1/2))).    (4.13)

The asymptotic distribution of the statistic Gn,+∞ is non-central chi-square with one degree of freedom and non-centrality parameter equal to E[W+∞] in (4.12). To consider the worst case, we minimize the non-centrality parameter (4.12) with respect to τ. The resulting power-minimizing direction, in terms of the perturbation to C0, is

τ = ((σuv − σuσv)/(σuσv)) C0.

Plugging the worst-case τ back into (4.12) yields the non-centrality parameter associated with the optimal test in the worst scenario. The resulting non-centrality parameter is C0′Q1·2C0/σv².
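The minimization step above can be double-checked numerically: minimizing the non-centrality parameter (4.12) over τ should recover the direction τ = ((σuv − σuσv)/(σuσv))C0 and the minimized value C0′Q1·2C0/σv². A sketch with arbitrary illustrative parameter values (all names and numbers are ours):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
l1 = 3
su, sv, suv = 1.0, 1.2, 0.5                 # sigma_u, sigma_v, sigma_uv
A = rng.standard_normal((l1, l1))
Q = A @ A.T + l1 * np.eye(l1)               # a positive definite Q_{1.2}
C0 = np.array([1.0, -0.5, 2.0])

def noncentrality(tau):
    """Non-centrality parameter (4.12) as a function of the perturbation tau."""
    a = C0 + tau
    b = suv * a - su * sv * C0
    det = su**2 * sv**2 - suv**2
    return a @ Q @ a / sv**2 + b @ Q @ b / (sv**2 * det)

res = minimize(noncentrality, x0=np.zeros(l1))
tau_star = (suv - su * sv) / (su * sv) * C0          # closed-form minimizer
worst = C0 @ Q @ C0 / sv**2                          # C0' Q_{1.2} C0 / sigma_v^2
assert np.allclose(res.x, tau_star, atol=1e-3)
assert abs(res.fun - worst) < 1e-4
```
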
Therefore, the power envelope for ∆ → +∞ in the worst scenario with respect to τ is

Pr(G* > χ²_{1,1−α}), where G* ∼ χ²₁(C0′Q1·2C0/σv²).

When ∆ → −∞, the likelihood ratio statistic can be reduced to

W−∞ = −S*′Q1·2^(1/2)(C0+τ)/σv − (T* − σu Q1·2^(1/2)C0/(σu²σv²−σuv²)^(1/2))′ (Q1·2^(1/2)(σuv(C0+τ) + σuσvC0)/(σv(σu²σv²−σuv²)^(1/2))).

This yields

E[W−∞] = (C0+τ)′Q1·2(C0+τ)/σv² + (σuv(C0+τ) + σuσvC0)′Q1·2(σuv(C0+τ) + σuσvC0)/(σv²(σu²σv²−σuv²)),

and Var[W−∞] = E[W−∞]. Consequently, the power-minimizing direction is τ = −((σuv + σuσv)/(σuσv)) C0, and the resulting non-centrality parameter associated with the optimal test for ∆ → −∞ remains C0′Q1·2C0/σv².

By plugging the minimizer τ = ((σuv − σuσv)/(σuσv)) C0 into Wn,+∞ given by (4.13), we obtain the statistic

Wn,+∞* = (σuv/(σuσv²)) Sn*′Q1·2^(1/2)C0 − ((σu²σv²−σuv²)^(1/2)/(σuσv²)) Tn*′Q1·2^(1/2)C0 + C0′Q1·2C0/σv².    (4.14)

The asymptotically optimal test for ∆ → +∞ in the worst scenario with respect to τ rejects H0 when

Gn,+∞* = σv²(Wn,+∞*)²/(C0′Q1·2C0) > χ²_{1,1−α}.    (4.15)

For the alternative H1 : ∆ → −∞, we obtain exactly the same asymptotically optimal test in the worst scenario with respect to τ. Note that the asymptotically optimal test is C0-specific. However, C0 cannot be consistently estimated, so the above asymptotically optimal test is infeasible.

4.5 Power Comparisons of Robust Tests

In this section, we evaluate the power of several popular Weak-IV robust tests when alternatives are determined by arbitrarily large deviations from the null. In particular, we numerically compare the power of the AR and CLR tests in this scenario. Our numerical results suggest that the AR test outperforms the CLR test when the degree of endogeneity is low; the ranking is reversed when the degree of endogeneity is high.
In addition, we propose a new robust test, the Conditional Lagrange Multiplier (CLM) test, which is asymptotically efficient under Strong-IV and exhibits the same power as the AR test under Weak-IV when alternatives are determined by arbitrarily large deviations from the null.

4.5.1 Popular Weak-IV Robust Tests

Anderson-Rubin (AR) test

The rejection rule of the AR test is:

Reject H0 when ARn = Sn*′Sn* > χ²_{l1,1−α}.

By (4.8) and (4.9), under H1 : ∆ → ∞, ARn ∼a χ²_{l1}(‖λ‖²), where ‖λ‖² = C′Q1·2C/σv², χ²_{l1,1−α} is the (1−α)th quantile of a central chi-square distribution with l1 degrees of freedom, and χ²_{l1}(‖λ‖²) is a non-central chi-square distribution with non-centrality parameter ‖λ‖² and l1 degrees of freedom.

Kleibergen's K (KLM) test

Though sometimes referred to as an LM test, this test statistic differs from the usual LM test in that it uses Tn* (which is asymptotically independent of Sn*) instead of the usual π̂1,n to estimate π1,n. The rejection rule of the KLM test is:

Reject H0 when KLMn = (Tn*′Sn*)²/(Tn*′Tn*) > χ²_{1,1−α}.

Note that the denominator and numerator of KLMn are asymptotically independent. Under H1 : ∆ → +∞, KLMn ∼a χ²₁(CKLM). The non-centrality parameter CKLM can be described as

CKLM = ((N + (ρ/√(1−ρ²))λ)′λ)² / ((N + (ρ/√(1−ρ²))λ)′(N + (ρ/√(1−ρ²))λ)),

where N ∼ N(0_{l1×1}, Il1). Under H1 : ∆ → −∞, the non-centrality parameter becomes

CKLM = ((N − (ρ/√(1−ρ²))λ)′λ)² / ((N − (ρ/√(1−ρ²))λ)′(N − (ρ/√(1−ρ²))λ)).

We numerically evaluate the performance of the KLM test in Section 4.5.3.

Conditional Likelihood Ratio (CLR) test

The CLR test was proposed by Moreira (2003) and was recommended by Andrews, Moreira and Stock (2006) because it numerically attains the power envelope obtained by the weighted-average-power approach. The test statistic can be written as

LRn = Sn*′Sn* − λmin,

where λmin is the smallest eigenvalue of Qn.
Since we focus on the model with a single endogenous regressor, LRn can be rewritten as

LRn = (1/2){Sn*′Sn* − Tn*′Tn* + √((Sn*′Sn* − Tn*′Tn*)² + 4(Sn*′Tn*)²)}.

The critical value is the (1−α)th quantile of the conditional null distribution of LRn given Tn*, which can easily be computed by Monte Carlo. Denote this critical value by κLR,α(t) for given Tn* = t. The rejection rule of the CLR test is:

Reject H0 when LRn > κLR,α(t).

The analytical form of the asymptotic power of the CLR test is difficult to obtain; we numerically evaluate the performance of the CLR test in Section 4.5.3.

4.5.2 A New Test: Conditional Lagrange Multiplier (CLM) Test

The classical Lagrange Multiplier (LM) test has a distorted size in the Weak-IV scenario. The distortion is due to the OLS estimator π̂1,n. This problem can be solved by computing critical values conditional on Tn*. In this section, we construct the LM test statistic using the OLS estimator π̂1,n and compute the critical value conditional on Tn*. This gives rise to our Conditional Lagrange Multiplier (CLM) test. The construction ensures that the CLM test is Weak-IV robust. We further analyse the power properties of the CLM test in both the Weak-IV and Strong-IV scenarios.

Let us first construct the LM statistic. Note that the asymptotic variance of √n ((Z1′M2Z1/n) π1,n)′ (AsyVar(√nSn))^(−1) Sn is

π1,n′ Q1·2 (AsyVar(√nSn))^(−1) Q1·2 π1,n,

where AsyVar(√nSn) is the asymptotic variance of √nSn, which can be consistently estimated. Taking this into account, we construct the LM statistic as

LMn = (√n ((Z1′M2Z1/n) π̂1,n)′ (AsyVar(√nSn))^(−1) Sn)² / (π̂1,n′ Q1·2 (AsyVar(√nSn))^(−1) Q1·2 π̂1,n)
= (((Z1′M2Z1/n) π̂1,n)′ (AsyVar(√nSn))^(−1/2) Sn*)² / (π̂1,n′ Q1·2 (AsyVar(√nSn))^(−1) Q1·2 π̂1,n),    (4.16)

where the second line comes from the normalization

Sn* = (AsyVar(√nSn))^(−1/2) √n Sn.

By Assumption 2 and using Z1′M2Z1/n →p Q1·2, we have

LMn − LM →p 0,    (4.17)

where

LM = (π̂1,n′ Q1·2^(1/2) Sn*)² / (π̂1,n′ Q1·2 π̂1,n).

Recall that the relation between π̂1,n and Tn* is

√n π̂1,n = AsyCov(√n π̂1,n, √n Sn)(AsyVar(√n Sn))^(−1/2) Sn* + (AsyVar(√n Tn))^(1/2) Tn*,    (4.18)

where AsyCov(√n π̂1,n, √n Sn) is the asymptotic covariance between √n π̂1,n and √n Sn.

The critical values of the CLM test can be simulated conditional on Tn* under H0, in the same way as those of the CLR test. To simulate critical values, let R be the number of simulations. First, generate S0r* ∼ N(0, Il1) for r = 1, ..., R. Next, given Tn* = t, construct π̂1,n according to (4.18) with Sn* replaced by S0r*, and denote the result by π̂1,n,r(t) for each r = 1, ..., R. Then compute

LMr* = (((Z1′M2Z1/n) π̂1,n,r(t))′ (AsyVar(√nSn))^(−1/2) S0r*)² / (π̂1,n,r(t)′ Q1·2 (AsyVar(√nSn))^(−1) Q1·2 π̂1,n,r(t))

for each r = 1, ..., R. The critical value of the CLM test, κLM,α(t), is given by the (1−α)th empirical quantile of {LMr* : r = 1, ..., R}. The rejection rule of the CLM test is:

Reject H0 when LMn > κLM,α(t).    (4.19)

Now we examine the power of the CLM test under Weak-IV for the alternative H1 : ∆ → +∞. Under Weak-IV, (4.18) becomes

√n π̂1,n = ((σuv + σv²∆)/σ(∆)) Q1·2^(−1/2) Sn* + ((σu²σv² − σuv²)^(1/2)/σ(∆)) Q1·2^(−1/2) Tn*.

Consequently, when ∆ → +∞, we obtain

√n π̂1,n − σv Q1·2^(−1/2) Sn* →p 0,

and

√n π̂1,n,r(t) − σv Q1·2^(−1/2) S0r* →p 0.

Hence,

LMn − Sn*′Sn* →p 0, and LMr* − S0r*′S0r* →p 0.

The same result holds for H1 : ∆ → −∞. Therefore, the power of the CLM test equals that of the AR test for arbitrarily large deviations from H0. However, the CLM test is more efficient than the AR test in the Strong-IV scenario, as we show below.

Assumption 3. (Strong-IV) π1,n = π1, where π1 ∈ R^(l1) and π1 ≠ 0 is fixed.

Proposition 3.
In the linear IV model (4.1) and (4.2), suppose that Assumptions 2 and 3 hold. Then the CLM test given by (4.19) is asymptotically efficient against the local alternative γ = γ0 + ∆/√n.

Proof. We first compute an effective power upper bound for level-α tests in the Strong-IV scenario, following the approach of Choi and Schick (1996). Consider a local perturbation π1,n = π1 + τ/√n and a joint test of

H0 : τ = 0 and ∆ = 0, against H1 : τ ≠ 0 and ∆ ≠ 0.

The asymptotic experiment in the Strong-IV scenario becomes

√n [Sn; Tn − π1] ∼a N([∆Q1·2π1; τ − ∆σuvπ1/σu²], [σu²Q1·2, 0_{l1×l1}; 0_{l1×l1}, ((σu²σv²−σuv²)/σu²) Q1·2^(−1)]).

The asymptotically most powerful unbiased (AMPU) test must be based on the likelihood ratio LRn,

LRn = ∆π1′(√nSn)/σu² + (τ − ∆σuvπ1/σu²)′(((σu²σv²−σuv²)/σu²) Q1·2^(−1))^(−1) √n(Tn − π1)
    ∼a N(AsyVar(LRn), AsyVar(LRn)), where

AsyVar(LRn) = ∆²π1′Q1·2π1/σu² + (τ − ∆σuvπ1/σu²)′(((σu²σv²−σuv²)/σu²) Q1·2^(−1))^(−1)(τ − ∆σuvπ1/σu²).

Thus, for given ∆ and τ, the AMPU level-α test is:

Reject H0 when LRn²/AsyVar(LRn) > χ²_{1,1−α}.

The power of the AMPU test is characterized by the non-centrality parameter AsyVar(LRn), since

LRn²/AsyVar(LRn) ∼a χ²₁(AsyVar(LRn)).

The power-minimizing direction (in terms of the perturbation to π1) is τ = ∆σuvπ1/σu², which leads to a non-centrality parameter equal to ∆²π1′Q1·2π1/σu² for the effective power upper bound in the worst scenario with respect to τ.

Now consider the power of the CLM test under Strong-IV. In the Strong-IV scenario, π̂1,n →p π1. Hence, by (4.17),

LMn = (π1′Q1·2^(1/2)Sn*)²/(π1′Q1·2π1) + op(1) ∼a χ²₁(∆²π1′Q1·2π1/σu²),

and

LMr* ∼a χ²₁(0).

Therefore, the local power of the CLM test in the Strong-IV scenario is Pr(V > χ²_{1,1−α}), where V has a non-central chi-square distribution with one degree of freedom and non-centrality parameter ∆²π1′Q1·2π1/σu². Thus, the CLM test attains the effective power bound for a level-α test under Strong-IV.
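The simulation of CLM critical values described above can be sketched directly in the limiting experiment, treating the variance quantities as known. Under H0 (∆ = 0), the Weak-IV version of (4.18) has coefficients σuv/σu and (σu²σv² − σuv²)^(1/2)/σu. The function below is our own illustration of the procedure (names are ours); the LM ratio is scale-invariant in π̂1,n, so the √n factor drops out:

```python
import numpy as np

def clm_critical_value(t, Q, s_u, s_v, s_uv, alpha=0.05, R=100_000, seed=0):
    """Simulate kappa_{LM,alpha}(t) in the limiting experiment, under H0.

    t: conditioning value of T_n* (l1-vector); Q: the matrix Q_{1.2}.
    Under H0 (Delta = 0), (4.18) reduces to
      sqrt(n)*pi_hat = (s_uv/s_u) Q^{-1/2} S* + (sqrt(s_u^2 s_v^2 - s_uv^2)/s_u) Q^{-1/2} T*."""
    rng = np.random.default_rng(seed)
    l1 = len(t)
    w, V = np.linalg.eigh(Q)
    Q_half = V @ np.diag(np.sqrt(w)) @ V.T          # symmetric square root of Q
    Q_inv_half = V @ np.diag(1.0 / np.sqrt(w)) @ V.T
    d = np.sqrt(s_u**2 * s_v**2 - s_uv**2)
    S = rng.standard_normal((R, l1))                # draws of S*_{0r} under H0
    pi_hat = (s_uv / s_u) * S @ Q_inv_half + (d / s_u) * (Q_inv_half @ t)
    num = np.einsum('ri,ij,rj->r', pi_hat, Q_half, S) ** 2     # (pi' Q^{1/2} S*)^2
    den = np.einsum('ri,ij,rj->r', pi_hat, Q, pi_hat)          # pi' Q pi
    return np.quantile(num / den, 1 - alpha)
```

When s_uv = 0, π̂1,n,r(t) is proportional to Q^(−1/2)t and LMr* is exactly χ²₁, so the simulated critical value should be close to 3.84 at α = 0.05.
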
4.5.3 Power Calculations

In this section, we numerically compare the asymptotic power of the AR, KLM, CLR and CLM tests under Weak-IV when the alternatives are determined by arbitrarily large deviations from the null. We have shown above that in this scenario the CLM test has the same asymptotic power as the AR test. From the asymptotic experiments (4.8) and (4.9), the distribution of [S*′, T*′]′ is summarized by three parameters: the number of instruments l1, the norm of the nuisance parameter λ that determines the strength of the instruments, and the parameter ρ that measures the degree of endogeneity.

Figures 4.1 and 4.2 depict the power of the AR (CLM), KLM and CLR tests, as well as the power envelope derived in Section 4.3 (denoted PO) and the power envelope derived in Section 4.4 (denoted BB), for testing H0 : ∆ = 0 against H1 : ∆ → +∞. Figure 4.1 is for the model with 2 instruments and Figure 4.2 is for the model with 5 instruments. In each graph, the horizontal axis is the magnitude of ‖λ‖ and the vertical axis is the rejection rate.

Figures 4.1 and 4.2 suggest that (i) in the Weak-IV scenario and for arbitrarily large deviations, none of the AR, KLM, CLR and CLM tests attains the power envelopes (either PO or BB); (ii) the KLM test is dominated by the CLR test in these numerical comparisons; and (iii) the relative performance of the AR and CLR tests depends on the degree of endogeneity as measured by |ρ|. Both (i) and (iii) are quite different from the findings of Andrews, Moreira and Stock (2006), which used a weighted average power criterion and found that the CLR test not only numerically dominates the KLM and AR tests, but also attains the power envelope given by the optimal invariant and similar test.

Figures 4.3 to 4.6 provide a closer look at the relative power properties of the AR and CLR tests.
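Power curves of this kind can be computed along the following lines (our own implementation sketch; the thesis's code is not shown, and parameter values are illustrative). The AR power under ∆ → ∞ is available in closed form as Pr(χ²_{l1}(‖λ‖²) > χ²_{l1,1−α}), while the CLR power is simulated from experiment (4.8) with a Monte Carlo conditional critical value:

```python
import numpy as np
from scipy.stats import chi2, ncx2

def ar_power(lam_norm2, l1, alpha=0.05):
    """AR power for Delta -> inf: Pr(chi2_{l1}(||lam||^2) > chi2_{l1,1-alpha})."""
    return ncx2.sf(chi2.ppf(1 - alpha, l1), l1, lam_norm2)

def clr_power(lam, rho, alpha=0.05, reps=2_000, mc=2_000, seed=0):
    """Simulated CLR power under Delta -> +inf, experiment (4.8): draw [S*, T*],
    compute LR, and compare with the Monte Carlo (1-alpha) quantile of the
    null LR distribution conditional on T*."""
    rng = np.random.default_rng(seed)
    l1 = len(lam)
    mu_s = np.asarray(lam, float)
    mu_t = rho / np.sqrt(1 - rho**2) * mu_s

    def lr(S, T):
        qs, qt, qst = (S**2).sum(-1), (T**2).sum(-1), (S * T).sum(-1)
        return 0.5 * (qs - qt + np.sqrt((qs - qt)**2 + 4 * qst**2))

    rej = 0
    for _ in range(reps):
        S = rng.standard_normal(l1) + mu_s
        T = rng.standard_normal(l1) + mu_t
        S0 = rng.standard_normal((mc, l1))           # null draws of S*, T* held fixed
        crit = np.quantile(lr(S0, np.broadcast_to(T, (mc, l1))), 1 - alpha)
        rej += bool(lr(S[None], T[None])[0] > crit)
    return rej / reps

print(ar_power(4.0, 2))                              # AR power at ||lam||^2 = 4, l1 = 2
print(clr_power(np.array([2.0, 0.0]), rho=0.2, reps=500))
```
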
The horizontal axis is again the magnitude of ‖λ‖, while the vertical axis is the rejection rate of the AR test minus that of the CLR test. We clearly see that when the degree of endogeneity |ρ| is relatively low, the AR test has higher power than the CLR test. However, when the degree of endogeneity |ρ| is relatively high, the CLR test has higher power than the AR test. Moreover, the power of the AR test is more likely to exceed that of the CLR test as the number of instruments increases.

Figure 4.1: Power comparisons among robust tests, 2 IVs
Figure 4.2: Power comparisons among robust tests, 5 IVs
Figure 4.3: Relative power of the AR versus CLR tests, 2 IVs
Figure 4.4: Relative power of the AR versus CLR tests, 5 IVs
Figure 4.5: Relative power of the AR versus CLR tests, 10 IVs
Figure 4.6: Relative power of the AR versus CLR tests, 20 IVs

4.6 Heteroskedastic Models

In the previous sections, Assumption 2(c) imposed homoskedasticity on the error terms in model (4.1) and (4.2), which implies the limiting distribution in (4.3). In this section, we allow for heteroskedasticity. We derive the corresponding asymptotic experiments in heteroskedastic models, and then describe the optimal invariant and asymptotically similar test. We also analyse the generalized likelihood ratio statistic in heteroskedastic models when alternatives are arbitrarily far away from the null. To facilitate the analysis, we apply the following normalization to the instruments and the nuisance parameter C. For i = 1, ..., n,

Z̄i = Q1·2^(−1/2) Z̃i, and C̄ = Q1·2^(1/2) C,

where Z̃ = M2Z1 = (Z̃1, Z̃2, ..., Z̃n)′ and Z̄ = (Z̄1, Z̄2, ..., Z̄n)′.
Thus we have

(1/√n) [Z̄′u; Z̄′v] →d N(0, [Ωuu, Ωuv; Ωuv, Ωvv]),    (4.20)

where

Ωuu = E[ui² Z̄iZ̄i′] = Q1·2^(−1/2) E[ui² Z̃iZ̃i′] Q1·2^(−1/2),
Ωvv = E[vi² Z̄iZ̄i′] = Q1·2^(−1/2) E[vi² Z̃iZ̃i′] Q1·2^(−1/2),
Ωuv = E[uivi Z̄iZ̄i′] = Q1·2^(−1/2) E[uivi Z̃iZ̃i′] Q1·2^(−1/2).

Note that the limiting distribution in (4.20) is a characterization of heteroskedasticity.

Similar to the homoskedastic case, we construct S̄n and π̄1,n as

S̄n = Q1·2^(−1/2) Sn = Z̄′(y1 − y2γ0)/n = (Z̄′Z̄/n)(∆C̄/√n) + Z̄′(u + ∆v)/n,

π̄1,n = Q1·2^(1/2) π̂1,n = (Z̄′Z̄)^(−1) Z̄′y2 = (Z̄′Z̄/n)^(−1) (Z̄′v/n) + C̄/√n,

where the third equality in each line comes from Assumption 1.

Note that E[Z̄iZ̄i′] = Il1, and using (4.20), we derive the asymptotic distribution of S̄n and π̄1,n as follows:

√n [S̄n; π̄1,n] ∼a N([∆C̄; C̄], [Ωuu + 2∆Ωuv + ∆²Ωvv, Ωuv + ∆Ωvv; Ωuv + ∆Ωvv, Ωvv]).

Let T̄n = π̄1,n − (Ωuv + ∆Ωvv)(Ωuu + 2∆Ωuv + ∆²Ωvv)^(−1) S̄n. We obtain

√n [S̄n; T̄n] ∼a N([∆C̄; A], [Ωuu + 2∆Ωuv + ∆²Ωvv, 0_{l1×l1}; 0_{l1×l1}, B]),

where

A = (Ωuu + ∆Ωuv)(Ωuu + 2∆Ωuv + ∆²Ωvv)^(−1) C̄,

and

B = Ωvv − (Ωuv + ∆Ωvv)(Ωuu + 2∆Ωuv + ∆²Ωvv)^(−1)(Ωuv + ∆Ωvv).

By letting Sn* = (Ωuu + 2∆Ωuv + ∆²Ωvv)^(−1/2) √n S̄n and Tn* = B^(−1/2) √n T̄n, we have

[Sn*; Tn*] ∼a N([(Ωuu + 2∆Ωuv + ∆²Ωvv)^(−1/2) ∆C̄; B^(−1/2) A], I2l1).

Define a random vector N ∼ N(0_{2l1×1}, I2l1) and let

[S*; T*] = N + [(Ωuu + 2∆Ωuv + ∆²Ωvv)^(−1/2) ∆C̄; B^(−1/2) A].

Since [Sn*; Tn*] →d [S*; T*], the asymptotic experiment for the heteroskedastic model is as follows. Under H0 : ∆ = 0,

[S*; T*] ∼ N([0; (Ωvv − ΩuvΩuu^(−1)Ωuv)^(−1/2) C̄], I2l1).

Under H1 : ∆ → +∞, the asymptotic experiment becomes

[S*; T*] ∼ N([Ωvv^(−1/2) C̄; (Ωuu − ΩuvΩvv^(−1)Ωuv)^(−1/2) ΩuvΩvv^(−1) C̄], I2l1).

Similarly, under H1 : ∆ → −∞,

[S*; T*] ∼ N(−[Ωvv^(−1/2) C̄; (Ωuu − ΩuvΩvv^(−1)Ωuv)^(−1/2) ΩuvΩvv^(−1) C̄], I2l1).
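The matrices Ωuu, Ωuv and Ωvv in (4.20) can be estimated by sample analogues built from residuals and the projected instruments. A sketch (our own; the residuals u_hat and v_hat are assumed to come from preliminary estimation of the two equations):

```python
import numpy as np

def omega_hats(Z1, Z2, u_hat, v_hat):
    """Sample analogues of Omega_uu, Omega_uv, Omega_vv in (4.20), built from
    residuals and the normalized instruments Z_bar = (M2 Z1) Q_{1.2}^{-1/2}."""
    n = Z1.shape[0]
    Zt = Z1 - Z2 @ np.linalg.lstsq(Z2, Z1, rcond=None)[0]    # Z_tilde = M2 Z1
    Q12_hat = Zt.T @ Zt / n                                  # estimates Q_{1.2}
    w, V = np.linalg.eigh(Q12_hat)
    Q_inv_half = V @ np.diag(w**-0.5) @ V.T
    Zbar = Zt @ Q_inv_half                                   # normalized instruments
    O_uu = Zbar.T @ (Zbar * (u_hat**2)[:, None]) / n         # (1/n) sum u_i^2 Zbar_i Zbar_i'
    O_vv = Zbar.T @ (Zbar * (v_hat**2)[:, None]) / n
    O_uv = Zbar.T @ (Zbar * (u_hat * v_hat)[:, None]) / n
    return O_uu, O_uv, O_vv

# Sanity check: with homoskedastic unit-variance errors, Omega_uu is close to I.
rng = np.random.default_rng(0)
Z1, Z2 = rng.standard_normal((5_000, 2)), rng.standard_normal((5_000, 2))
O_uu, _, _ = omega_hats(Z1, Z2, rng.standard_normal(5_000), rng.standard_normal(5_000))
print(O_uu)
```
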
The form of the asymptotic experiment for arbitrarily large deviations is determined by the fact that, when ∆ → +∞,

B^(−1/2) A = P^(1/2)(∆) (Ωuu/∆ + Ωuv) ((Ωuu + 2∆Ωuv + ∆²Ωvv)/∆²)^(−1) C̄ → (Ωuu − ΩuvΩvv^(−1)Ωuv)^(−1/2) ΩuvΩvv^(−1) C̄,

where

P(∆) = (Ωuv/∆ + Ωvv)^(−1) ((Ωuu + 2∆Ωuv + ∆²Ωvv)/∆²) (Ωuu − ΩuvΩvv^(−1)Ωuv)^(−1) (Ωuv/∆ + Ωvv) Ωvv^(−1).

Analogous results can be obtained for ∆ → −∞.

Now let us rewrite the means of the asymptotic statistics S* and T*. Let

D = (Ωuu − ΩuvΩvv^(−1)Ωuv)^(−1/2) ΩuvΩvv^(−1),    (4.21)

and note that

D′D = Ωvv^(−1)Ωuv(Ωuu − ΩuvΩvv^(−1)Ωuv)^(−1)ΩuvΩvv^(−1).

Define the blocks of the inverse covariance matrix by

[Ω^vv, Ω^uv; Ω^uv, Ω^uu] = [Ωvv, Ωuv; Ωuv, Ωuu]^(−1) = [Ωvv^(−1) + Ωvv^(−1)ΩuvFuΩuvΩvv^(−1), −Ωvv^(−1)ΩuvFu; −FuΩuvΩvv^(−1), Fu] = [Fv, −Ωvv^(−1)ΩuvFu; −FuΩuvΩvv^(−1), Fu],

where

Fu = (Ωuu − ΩuvΩvv^(−1)Ωuv)^(−1),  Fv = (Ωvv − ΩuvΩuu^(−1)Ωuv)^(−1).    (4.22)

Using this notation, the asymptotic experiments can be rewritten as follows:

under H0 : ∆ = 0,  [S*; T*] ∼ N([0; Fv^(1/2) C̄], I2l1);    (4.23)
under H1 : ∆ → +∞,  [S*; T*] ∼ N([Ωvv^(−1/2) C̄; DC̄], I2l1);    (4.24)
under H1 : ∆ → −∞,  [S*; T*] ∼ N(−[Ωvv^(−1/2) C̄; DC̄], I2l1),    (4.25)

where D′D = Fv − Ωvv^(−1).

4.6.1 The Optimal Rotational Invariant and Asymptotically Similar Test in Heteroskedastic Models

As in Section 4.3, we can describe the power envelope for rotational invariant and asymptotically similar tests in the heteroskedastic case, when alternatives are determined by arbitrarily large deviations. Let

Qn = [Qsn, Qstn; Qstn′, Qtn] = [Sn*, Tn*]′[Sn*, Tn*],

Q = [Qs, Qst; Qst′, Qt] = [S*, T*]′[S*, T*].

The (infeasible) optimal rotational invariant and asymptotically similar test in this scenario is established by the following proposition.

Proposition 4. In the linear IV model (4.1) and (4.2), suppose that Assumption 1 and Assumptions 2(a), (b), (d) hold, and that the heteroskedasticity is characterized by (4.20).
Consider the testing problem

H0 : ∆ = 0, H1 : ∆ → ∞.

Then the test that rejects H0 when

POISH∞(Qsn, Qstn) = C̄′Ωvv^(−1)C̄ Qsn + 2C̄′Ωvv^(−1/2)DC̄ Qstn > κH∞(Qtn)

maximizes asymptotic power over all asymptotic size-α tests that are functions of Qn, where κH∞(Qtn) is the (1−α)th quantile of the null distribution of POISH∞(Qsn, Qstn) conditional on Qtn.

Proof. Since Qn →d Q, we can first restrict attention to the asymptotic statistic Q and then use the Continuous Mapping Theorem to extend the result to Qn. The random matrix Q has a non-central Wishart distribution. Let f0Q(qs, qst, qt) and f1Q(qs, qst, qt) be the densities of Q under H0 and H1, respectively:

f0Q(qs, qst, qt) = K1 exp(−C̄′FvC̄/2) det(q)^((l1−3)/2) exp(−(qs+qt)/2) × (C̄′FvC̄ qt)^(−(l1−2)/4) I_((l1−2)/2)(√(C̄′FvC̄ qt)),

f1Q(qs, qst, qt) = K1 exp(−C̄′FvC̄/2) det(q)^((l1−3)/2) exp(−(qs+qt)/2) × (ζ(q))^(−(l1−2)/4) I_((l1−2)/2)(√ζ(q)),

where

ζ(q) = C̄′Ωvv^(−1)C̄ qs + 2C̄′Ωvv^(−1/2)DC̄ qst + C̄′(Fv − Ωvv^(−1))C̄ qt.

Under H0, the random variable Qt has a non-central chi-square distribution with l1 degrees of freedom and non-centrality parameter C̄′FvC̄. Under H1, Qt has a non-central chi-square distribution with l1 degrees of freedom and non-centrality parameter C̄′(Fv − Ωvv^(−1))C̄. Let f0Qt(qt) and f1Qt(qt) be the densities of Qt under H0 and H1, respectively:

f0Qt(qt) = K2 exp(−C̄′FvC̄/2) qt^((l1−2)/2) exp(−qt/2) × (C̄′FvC̄ qt)^(−(l1−2)/4) I_((l1−2)/2)(√(C̄′FvC̄ qt)),

f1Qt(qt) = K2 exp(−C̄′(Fv − Ωvv^(−1))C̄/2) qt^((l1−2)/2) exp(−qt/2) × (C̄′(Fv − Ωvv^(−1))C̄ qt)^(−(l1−2)/4) I_((l1−2)/2)(√(C̄′(Fv − Ωvv^(−1))C̄ qt)).

Therefore, the likelihood ratio is

LRH(q) = [f1Q(qs, qst, qt)/f1Qt(qt)] / [f0Q(qs, qst, qt)/f0Qt(qt)]
       = exp(−C̄′Ωvv^(−1)C̄/2) (ζ(q))^(−(l1−2)/4) I_((l1−2)/2)(√ζ(q)) / [(C̄′(Fv − Ωvv^(−1))C̄ qt)^(−(l1−2)/4) I_((l1−2)/2)(√(C̄′(Fv − Ωvv^(−1))C̄ qt))]
       = 2^(−(l1−2)/2) exp(−C̄′Ωvv^(−1)C̄/2) Σ_{j=0}^∞ (ζ(q)/4)^j / (j! Γ(l1/2+j)) / [(C̄′(Fv − Ωvv^(−1))C̄ qt)^(−(l1−2)/4) I_((l1−2)/2)(√(C̄′(Fv − Ωvv^(−1))C̄ qt))].

For a given qt, the denominator of LRH(q) is fixed and the numerator is an increasing function of ζ(q).
Therefore, we can construct the conditional test based on $\zeta(q)$ (we can further drop the term $\bar{C}'(F_v - \Omega_{vv}^{-1})\bar{C}\,q_t$, which is constant after fixing $q_t$). The resulting test statistic is
\[
\mathrm{POIS}_\infty^H(Q_{sn}, Q_{stn}) = \bar{C}'\Omega_{vv}^{-1}\bar{C}\,Q_{sn} + 2\bar{C}'\Omega_{vv}^{-1/2}D\bar{C}\,Q_{stn},
\]
and $H_0$ is rejected when $\mathrm{POIS}_\infty^H(Q_{sn}, Q_{stn}) > \kappa_\infty^H(Q_t)$, where $\kappa_\infty^H(Q_t)$ is the $(1-\alpha)$th quantile of the distribution of $\mathrm{POIS}_\infty^H(Q_{sn}, Q_{stn})$ conditional on $Q_t$ and under $H_0$. Exactly the same result can be obtained for $H_1: \Delta \to -\infty$.

Compared with Proposition 1, we can see that in the homoskedastic case, the optimal test is a function of $Q_n$ and depends only on the parameter $\rho$ that measures the degree of endogeneity. In the heteroskedastic case, the function of $Q_n$ depends on $\bar{C}$, $\Omega_{vv}$ and $F_v$. The expressions for $\Omega_{vv}$ and $F_v$ can be found in (4.20) and (4.22). While the matrix $\Omega_{vv}$ can be estimated consistently, $F_v$ and $\bar{C}$ cannot.

4.6.2 The Generalized Likelihood Ratio (GLR) Statistic in Heteroskedastic Models

In this section, we consider the generalized likelihood ratio (GLR) statistic under heteroskedasticity in the Weak-IV scenario and when alternatives are determined by arbitrarily large deviations from the null. We find that the GLR statistic degenerates to the AR statistic in this case. Consider the testing problem $H_0: \Delta = 0$ against $H_1: \Delta \to +\infty$. Based on the asymptotic experiments (4.23) and (4.24), the GLR statistic in heteroskedastic models can be written as
\[
\mathrm{GLR}_n = S_n^{*\prime} S_n^* - \min_{\bar{C},\, \Omega_{uu},\, \Omega_{uv}} \left[(S_n^* - \mu_S)'(S_n^* - \mu_S) + (T_n^* - \mu_T)'(T_n^* - \mu_T)\right], \quad \text{s.t. } \mu_S = \Omega_{vv}^{-1/2}\bar{C},\ \mu_T = D\bar{C}. \tag{4.26}
\]
Recall that $\Omega_{vv}$ can be treated as known (since it can be estimated consistently), and the matrix $D$ defined in (4.21) depends on $\Omega_{uu}$ and $\Omega_{uv}$, which are unknown. The next proposition states that the GLR statistic under heteroskedasticity reduces to the AR statistic in the Weak-IV scenario and when alternatives are determined by arbitrarily large deviations from the null.

Proposition 5.
In the linear IV model (4.1) and (4.2), suppose that Assumption 1 and Assumption 2(a), (b), (d) hold, and the heteroskedasticity is characterized by (4.20). Consider the testing problem
\[
H_0: \Delta = 0, \qquad H_1: \Delta \to +\infty.
\]
Then the GLR statistic defined by (4.26) equals the AR statistic $S_n^{*\prime} S_n^*$.

Proof. Since for any value $s^*$ of $S_n^*$ we can always set the minimizer $\bar{C} = \Omega_{vv}^{1/2} s^*$, it suffices to show that for an arbitrary value$^{27}$ $s^*$ of $S_n^*$ and $t^*$ of $T_n^*$, there exist some $\Omega_{uu}^*$ and $\Omega_{uv}^*$ such that $t^* = R s^*$ and $\begin{bmatrix} \Omega_{uu}^* & \Omega_{uv}^* \\ \Omega_{uv}^* & \Omega_{vv} \end{bmatrix}$ is positive semi-definite (recall that $\Omega_{vv}$ is known), where
\[
R = D\Omega_{vv}^{1/2} = \left(\Omega_{uu}^* - \Omega_{uv}^*\Omega_{vv}^{-1}\Omega_{uv}^*\right)^{-1/2}\Omega_{uv}^*\Omega_{vv}^{-1/2}. \tag{4.27}
\]
Let $t^* = (t_1^*, \ldots, t_{l_1}^*)'$ and $R = (R_1, \ldots, R_{l_1})'$. Note that $t^* = R s^*$ is equivalent to $t_k^* = R_k' s^*$ for $k = 1, \ldots, l_1$. Therefore, $R$ is determined by
\[
R_1 = (t_1^*/s_1^*, 0, \ldots, 0)', \quad R_2 = (0, t_2^*/s_2^*, 0, \ldots, 0)', \quad \ldots, \quad R_{l_1} = (0, \ldots, 0, t_{l_1}^*/s_{l_1}^*)',
\]
and
\[
R = \mathrm{diag}\left[t_1^*/s_1^*,\, t_2^*/s_2^*,\, \ldots,\, t_{l_1}^*/s_{l_1}^*\right]. \tag{4.28}
\]
The next step is to show that the $R$ satisfying (4.28) can be constructed using some $\Omega_{uu}^*$ and $\Omega_{uv}^*$ such that the resulting matrix $\begin{bmatrix} \Omega_{uu}^* & \Omega_{uv}^* \\ \Omega_{uv}^* & \Omega_{vv} \end{bmatrix}$ is positive semi-definite. By (4.27) and $D'D = F_v - \Omega_{vv}^{-1}$, we have
\[
R'R = \Omega_{vv}^{1/2} F_v \Omega_{vv}^{1/2} - I = \Omega_{vv}^{1/2}\left(\Omega_{uu}^* - \Omega_{uv}^*\Omega_{vv}^{-1}\Omega_{uv}^*\right)^{-1}\Omega_{vv}^{1/2} - I.
\]
Rearranging the terms leads to
\[
\Omega_{uu}^* = \Omega_{uv}^*\Omega_{vv}^{-1}\Omega_{uv}^* + \Omega_{vv}^{1/2}\left(R'R + I\right)^{-1}\Omega_{vv}^{1/2}. \tag{4.29}
\]
This means that for an arbitrary $\Omega_{uv}^* \in \mathbb{R}^{l_1} \times \mathbb{R}^{l_1}$, we can construct an $\Omega_{uu}^*$ as in (4.29). By construction, the resulting $\Omega_{uv}^*$ and $\Omega_{uu}^*$ satisfy $t^* = R s^*$.

$^{27}$Except for $s^* = 0$. Note that we have $\lim_{n \to +\infty} \Pr(S_n^* = 0) = 0$.

To see that $\begin{bmatrix} \Omega_{uu}^* & \Omega_{uv}^* \\ \Omega_{uv}^* & \Omega_{vv} \end{bmatrix}$ is positive semi-definite, take an arbitrary vector $c = (c_1', c_2')'$, where $c_j \in \mathbb{R}^{l_1}$ for $j = 1, 2$, and $c \neq 0$:
\[
\begin{bmatrix} c_1' & c_2' \end{bmatrix}\begin{bmatrix} \Omega_{uu}^* & \Omega_{uv}^* \\ \Omega_{uv}^* & \Omega_{vv} \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \end{bmatrix}
= c_1'\Omega_{uv}^*\Omega_{vv}^{-1}\Omega_{uv}^* c_1 + 2c_1'\Omega_{uv}^* c_2 + c_2'\Omega_{vv} c_2 + c_1'\Omega_{vv}^{1/2}(R'R + I)^{-1}\Omega_{vv}^{1/2} c_1
\]
\[
= \left(\Omega_{vv}^{-1/2}\Omega_{uv}^* c_1 + \Omega_{vv}^{1/2} c_2\right)'\left(\Omega_{vv}^{-1/2}\Omega_{uv}^* c_1 + \Omega_{vv}^{1/2} c_2\right) + c_1'\Omega_{vv}^{1/2}(R'R + I)^{-1}\Omega_{vv}^{1/2} c_1 \geq 0.
\]
The first equality comes from (4.29).
The last inequality comes from the facts that the first term on the right-hand side is $\|\Omega_{vv}^{-1/2}\Omega_{uv}^* c_1 + \Omega_{vv}^{1/2} c_2\|^2$ and that $R'R + I$ is positive semi-definite.

The GLR statistic is calculated by maximizing the likelihood ratio with respect to nuisance parameters unrestricted by $H_0$ or $H_1$, subject to the restrictions imposed by relationships between the means of the asymptotic experiment. Under the null, relationships between the asymptotic means of $S_n^*$ and $T_n^*$ do not provide any restriction on the nuisance parameters. Therefore, after the maximization, the likelihood under the null becomes the AR statistic. Under the alternative hypothesis $H_1: \Delta \to +\infty$, the asymptotic means of $S_n^*$ and $T_n^*$ must satisfy certain relationships, which are very different in homoskedastic and heteroskedastic models. In the homoskedastic case, under $H_1$ the asymptotic means of $S_n^*$ and $T_n^*$ are proportional (related by a scaling factor $\rho/\sqrt{1-\rho^2}$; see (4.8)). This proportionality restriction determines the CLR statistic. In the heteroskedastic case, relationships between the asymptotic means of $S_n^*$ and $T_n^*$ are characterized by the matrix $R$ in (4.27), which depends on the unknown matrices $\Omega_{uu}$ and $\Omega_{uv}$. However, as shown in the proof of Proposition 5, the structure of $R$ is general enough that the unknown matrices $\Omega_{uu}$ and $\Omega_{uv}$ can produce any relationship between the asymptotic means of $S_n^*$ and $T_n^*$. In other words, relationships between the asymptotic means of $S_n^*$ and $T_n^*$ are non-restrictive. As a result, in the heteroskedastic case, the maximized likelihood ratio only contains the term obtained from the likelihood under the null, which is the AR statistic.

4.7 Concluding Remarks

In this chapter, we consider efficient inference for the coefficient on the endogenous variable in linear regression models with weak instruments. We focus on the power of tests when alternatives are determined by arbitrarily large deviations from the null. We derive the power envelope for such alternatives in the Weak-IV scenario.
Then we compare the power properties of popular Weak-IV robust tests, focusing on the AR and CLR tests. We find that their relative performance depends on the degree of endogeneity in the model. This is different from Andrews, Moreira and Stock (2006), who found that the CLR test numerically dominates the AR test when weighted average power is concerned. In addition, we propose the CLM test, which is asymptotically efficient in the Strong-IV scenario, robust to Weak-IV, and exhibits the same power as the AR test for arbitrarily large deviations from the null. We also study the heteroskedastic case, and find that the generalized likelihood ratio statistic under heteroskedasticity reduces to the AR statistic in the Weak-IV scenario and when alternatives are determined by arbitrarily large deviations from the null.

Our analysis can also be extended to a more general minimum distance estimation (MDE) framework (see Newey and McFadden (1994) for a treatment of classical minimum distance estimation). Econometric models such as censored regression with endogenous regressors$^{28}$ and DSGE models$^{29}$ fit into the MDE framework. Magnusson (2010) described the AR, KLM and CLR tests in the context of MDE; however, the question of efficiency remains open. In the MDE framework, one can derive asymptotic experiments analogous to those in this chapter. Based on that, one can compute the power envelope and compare the power properties of weak-IV robust tests. Moreover, the CLM test proposed in this chapter can also be used in the MDE framework.

$^{28}$See Powell (1984, 1986) for a semi-parametric quantile regression estimator for the reduced-form parameters. Estimators for the structural parameters can be constructed following the approach of Amemiya (1978).

$^{29}$See, for example, Hnatkovska, Marmer and Tang (2012).

Bibliography

[1] Aguirregabiria, V. and P. Mira (2013) Identification of games of incomplete information with multiple equilibria and common unobserved heterogeneity. Working paper.

[2] Amemiya, T.
(1978) The estimation of a simultaneous equation generalized probit model. Econometrica 46(5), 1193–1205.

[3] Anderson, T. W. and H. Rubin (1949) Estimation of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics 20, 46–63.

[4] Andrews, D. W. K., M. Moreira and J. Stock (2006) Optimal invariant similar tests for instrumental variables regression. Econometrica 74, 715–752.

[5] Aradillas-Lopez, A. (2010) Semiparametric estimation of a simultaneous game with incomplete information. Journal of Econometrics 157, 409–431.

[6] Bajari, P., H. Hong, J. Krainer and D. Nekipelov (2010) Estimating static models of strategic interactions. Journal of Business and Economic Statistics 28, 469–482.

[7] Berry, S. and P. A. Haile (2014) Identification in differentiated products markets using market level data. Econometrica, forthcoming.

[8] Berry, S. and P. A. Haile (2013) Identification in a class of nonparametric simultaneous equations models. Cowles Foundation Discussion Paper No. 1787.

[9] Berry, S., J. Levinsohn and A. Pakes (1999) Voluntary export restraints on automobiles: evaluating a trade policy. The American Economic Review 89, 400–431.

[10] Bickel, P. and M. Rosenblatt (1973) On some global measures of the deviations of density function estimates. Annals of Statistics 26, 1826–1856.

[11] Billingsley, P. (1986) Probability and Measure. Wiley, New York.

[12] Blumenson, L. E. (1960) A derivation of n-dimensional spherical coordinates. The American Mathematical Monthly 67, 63–66.

[13] Bowman, A. W., A. Pope and B. Ismail (2006) Detecting discontinuities in nonparametric regression curves and surfaces. Statistics and Computing 16, 377–390.

[14] Borkovsky, R. N., P. Ellickson, B. Gordon, V. Aguirregabiria, P. Gardete, P. Grieco, T. Gureckis, T. Ho, L. Mathevet and A. Sweeting (2014) Multiplicity of equilibria and information structures in empirical games: challenges and prospects.
Session at the 9th Triennial Choice Symposium.

[15] Brock, W. and S. N. Durlauf (2001a) Discrete choice with social interactions. Review of Economic Studies 68, 235–260.

[16] Brock, W. and S. N. Durlauf (2001b) Interaction-based models. In: Heckman, J. and E. Leamer (Eds.), Handbook of Econometrics, Vol. 5. Elsevier Science.

[17] Brock, W. and S. N. Durlauf (2007) Identification of binary choice models with social interactions. Journal of Econometrics 140, 52–75.

[18] Cattaneo, M. D., R. Crump and M. Jansson (2012) Optimal inference for instrumental variables regression with non-Gaussian errors. Journal of Econometrics 167, 1–15.

[19] Chernozhukov, V., D. Chetverikov and K. Kato (2012) Gaussian approximation of suprema of empirical processes. Working paper.

[20] Chernozhukov, V., D. Chetverikov and K. Kato (2013) Anti-concentration and honest adaptive confidence bands. Working paper.

[21] Chernozhukov, V., S. Lee and A. Rosen (2013) Intersection bounds: estimation and inference. Econometrica 81, 667–737.

[22] Chu, C. K. and P. E. Cheng (1996) Estimation of jump points and jump values of a density function. Statistica Sinica 6, 79–95.

[23] Choi, S., W. Hall and A. Schick (1996) Asymptotically uniformly most powerful tests in parametric and semiparametric models. Annals of Statistics 24, 841–861.

[24] Dagsvik, J. and B. Jovanovic (1994) Was the Great Depression a low-level equilibrium? European Economic Review 38, 1711–1729.

[25] Delgado, M. A. and J. Hidalgo (2000) Nonparametric inference on structural breaks. Journal of Econometrics 96, 113–144.

[26] Dufour, J.-M. (1997) Some impossibility theorems in econometrics with applications to structural and dynamic models. Econometrica 65, 1365–1387.

[27] Echenique, F. and I. Komunjer (2009) Testing models with multiple equilibria by quantile methods. Econometrica 77, 1281–1297.

[28] Echenique, F. and I. Komunjer (2007) Testing models with multiple equilibria by quantile methods. Caltech HSS Working Paper 1244R.

[29] Ekeland, I., J. J.
Heckman and L. Nesheim (2004) Identification and estimation of hedonic models. Journal of Political Economy 112, S60–S109.

[30] Gijbels, I. and A. C. Goderniaux (2004) Bootstrap test for change-points in nonparametric regression. Journal of Nonparametric Statistics 16, 591–611.

[31] Giné, E. and R. Nickl (2010) Confidence bands in density estimation. Annals of Statistics 38, 1122–1170.

[32] Hall, P. and D. M. Titterington (1992) Edge-preserving and peak-preserving smoothing. Technometrics 34, 429–440.

[33] Hnatkovska, V., V. Marmer and Y. Tang (2012) Comparison of misspecified calibrated models: the minimum distance approach. Journal of Econometrics 169, 131–138.

[34] Horowitz, J. L. (1996) Semiparametric estimation of a regression model with an unknown transformation of the dependent variable. Econometrica 64, 103–137.

[35] Johnston, G. (1982) Probabilities of maximal deviations for nonparametric regression function estimates. Journal of Multivariate Analysis 12, 402–414.

[36] Jovanovic, B. (1989) Observable implications of models with multiple equilibria. Econometrica 57, 1431–1437.

[37] Kleibergen, F. (2002) Pivotal statistics for testing structural parameters in instrumental variables regression. Econometrica 70, 1781–1803.

[38] Kleibergen, F. (2007) Generalizing weak instrument robust IV statistics towards multiple parameters, unrestricted covariance matrices and identification statistics. Journal of Econometrics 139, 181–216.

[39] Kasy, M. (2012) Nonparametric inference on the number of equilibria. Working paper.

[40] Loomis, L. H. and S. Sternberg (1968) Advanced Calculus. Reading, Massachusetts: Addison-Wesley.

[41] Liu, N., Q. Vuong and H. Xu (2013) Rationalization and nonparametric identification of discrete games with correlated types. Working paper.

[42] Magnusson, L. (2010) Inference in limited dependent variable models robust to weak identification. Econometrics Journal 13, 56–79.

[43] Matzkin, R. L.
(2008) Identification in nonparametric simultaneous equations models. Econometrica 76, 945–978.

[44] Mills, B., M. Moreira and L. Vilela (2014) Tests based on t-statistics for IV regression with weak instruments. Journal of Econometrics 182, 351–363.

[45] Moreira, M. J. (2001) Tests with correct size when instruments can be arbitrarily weak. Unpublished manuscript, Department of Economics, University of California, Berkeley.

[46] Moreira, M. J. (2003) A conditional likelihood ratio test for structural models. Econometrica 71, 1027–1048.

[47] Munkres, J. (1999) Topology. Pearson.

[48] Müller, H. G. (1992) Change-points in nonparametric regression analysis. Annals of Statistics 20, 737–761.

[49] Müller, H. G. and U. Stadtmüller (1999) Discontinuous versus smooth regression. Annals of Statistics 27, 299–337.

[50] Newey, W. K. and D. McFadden (1994) Large sample estimation and hypothesis testing. In: Engle, R. F., McFadden, D. L. (Eds.), Handbook of Econometrics, Vol. IV. Elsevier, Amsterdam, Ch. 36, pp. 2111–2245.

[51] Pakes, A. and D. Pollard (1989) Simulation and the asymptotics of optimization estimators. Econometrica 57, 1027–1057.

[52] Paula, A. (2012) Econometric analysis of games with multiple equilibria. Annual Review of Economics 5, 107–131.

[53] Paula, A. and X. Tang (2012) Inference of signs of interaction effects in simultaneous games with incomplete information. Econometrica 80, 143–172.

[54] Powell, J. L. (1984) Least absolute deviations estimation for the censored regression model. Journal of Econometrics 25, 303–325.

[55] Powell, J. L. (1986) Censored regression quantiles. Journal of Econometrics 32, 143–155.

[56] Qiu, P. (1997) Nonparametric estimation of jump surface. Sankhya: The Indian Journal of Statistics 59, 268–294.

[57] Qiu, P. (2002) A nonparametric procedure to detect jumps in regression surfaces.
Journal of Computational and Graphical Statistics 11, 799–822.

[58] Staiger, D. and J. Stock (1997) Instrumental variables regression with weak instruments. Econometrica 65, 557–586.

[59] Sweeting, A. (2009) The strategic timing incentives of commercial radio stations: an empirical analysis using multiple equilibria. Rand Journal of Economics 40, 710–742.

[60] Xiao, R. (2014) Identification and estimation of incomplete information games with multiple equilibria. Working paper.

[61] Xu, H. and Y. Wan (2014) Semiparametric identification and estimation of binary discrete games of incomplete information with correlated private signals. Journal of Econometrics 182, 235–246.

[62] van der Vaart, A. W. and J. A. Wellner (2000) Weak Convergence and Empirical Processes: With Applications to Statistics. Springer.

Appendix A

Appendix for Chapter 2

A.1 Proofs for Section 2.3

Proof of Lemma 1.
(i) Suppose, to the contrary, that there are distinct $y_1, y_2 \in A_m$ with $r(y_1) = r(y_2)$. By $A_m = q_m(B_m)$, there are $v_1, v_2 \in B_m$ such that $y_1 = q_m(v_1)$ and $y_2 = q_m(v_2)$. Then by Definition 1(i) and $r(y_1) = r(y_2)$, we have
\[
v_1 = r(q_m(v_1)) = r(y_1) = r(y_2) = r(q_m(v_2)) = v_2,
\]
so that $y_1 = q_m(v_1) = q_m(v_2) = y_2$, a contradiction.

(ii) For an arbitrary $y \in \bar{\mathcal{Y}}$, by Definition 1(ii), we either have $y \in q_m(B_m) = A_m$ or $y = q_m(\tilde{v})$, where $\tilde{v} = \lim_{n\to\infty} v_n$ for some sequence $v_n \in B_m$. In the first case, the proof is done. In the second case, by continuity of $q_m(\cdot)$, $y = \lim_{n\to\infty} y_n$ for $y_n = q_m(v_n) \in A_m$. That is, $y \in \bar{A}_m$.

(iii) Suppose, to the contrary, that there exists some $y_0 \in A_1 \cap A_2 \neq \emptyset$. That is, $y_0 \in q_1(B_1)$ and $y_0 \in q_2(B_2)$. Also, we have $r(y_0) = v_1 \in B_1$ and $r(y_0) = v_2 \in B_2$, which implies $v_1 = v_2 \in B_1 \cap B_2$. But this contradicts Definition 1(iii).

(iv) By (C1), there exists a unique function $q(v)$ such that for any $(y, v) \in \bar{\mathcal{Y}} \times \mathrm{int}(\mathcal{V})$ satisfying $r(y) = v$, $y = q(v)$. We need to show that $q(v)$ is twice continuously differentiable on $\mathrm{int}(\mathcal{V})$.
By Assumptions D1 to D3 and applying the Implicit Function Theorem (see, for example, Theorem 9.4 in Loomis and Sternberg (1968)) at an arbitrary point $(y_0, v_0) \in \bar{\mathcal{Y}} \times \mathrm{int}(\mathcal{V})$, there are neighbourhoods $A$ of $y_0$ and $B$ of $v_0$ on which $r(y) = v$ uniquely defines $y$ as a function of $v$. That is, there is a function $\xi : B \to A$ such that $r(\xi(v)) = v$ for all $v \in B$; for each $v \in B$, $\xi(v)$ is the unique solution to $r(y) = v$ lying in $A$; and $\xi(v)$ is twice continuously differentiable on $B$. Because $(y_0, v_0)$ is arbitrary and $q(v)$ is the only solution to $r(y) = v$ by (C1), we must have $q(v) = \xi(v)$ for all $v \in \mathrm{int}(\mathcal{V})$. Thus $q(v)$ is twice continuously differentiable on a neighbourhood of every $v \in \mathrm{int}(\mathcal{V})$. This immediately leads to the desired result.

(v) When the model is in (C2) or (C3), there exists some $v_M \in \mathrm{int}(\mathcal{V})$ such that $r(y_1) = r(y_2) = v_M$ for some $y_1 \neq y_2$. Suppose $M = 1$; then there is a function $q : \mathrm{int}(\mathcal{V}) \to \bar{\mathcal{Y}}$ such that $y = q(v)$ for all $(y, v) \in \bar{\mathcal{Y}} \times \mathrm{int}(\mathcal{V})$ satisfying $r(y) = v$. But this means $y_1 = q(v_M)$ and $y_2 = q(v_M)$, which cannot happen. Q.E.D.

Proof of Lemma 2.
Let $\xi \sim U[0,1]$ be independent of $(X, U)$. Given $X = x$, we have
\[
Y = \sum_{m=1}^{M} 1\!\left\{\sum_{k=0}^{m-1} \lambda_k(x, U) \leq \xi < \sum_{k=0}^{m} \lambda_k(x, U)\right\} q_m(U + g(x)),
\]
where we define $\lambda_0(x, u) = 0$, and $q_m(v) = c$ for some constant $c$ for any $v = g(x) + u \notin B_m$. This does not affect the generation of $Y$, because if $v \notin B_m$, we must have $m \notin \mathcal{M}(v)$ and thus $\lambda_m(x, u) = 0$ for all $(x, u)$. Because the sets $A_{sh}$ are mutually exclusive across $s$ and $h$, for any $y \in \cup_{m=1}^{M} \cup_{l \in L_m} A_{ml}$, there is a unique $(s, h)$ such that $y \in A_{sh}$.
For any open set $Q \subset A_{sh}$, observe that
\[
\Pr(Y \in Q \mid X = x) = \Pr\!\left(\sum_{m=1}^{M} 1\!\left\{\sum_{k=0}^{m-1}\lambda_k(x, U) \leq \xi < \sum_{k=0}^{m}\lambda_k(x, U)\right\} q_m(U + g(x)) \in Q \,\Big|\, X = x\right)
\]
\[
= \Pr\!\left(\{q_s(U + g(x)) \in Q\} \cap \left\{\sum_{k=0}^{s-1}\lambda_k(x, u) \leq \xi < \sum_{k=0}^{s}\lambda_k(x, u)\right\}\right)
\]
\[
= \int\!\!\int 1\{q_s(u + g(x)) \in Q\}\, 1\!\left\{\sum_{k=0}^{s-1}\lambda_k(x, u) \leq \upsilon < \sum_{k=0}^{s}\lambda_k(x, u)\right\} f_{U,\xi}(u, \upsilon)\, du\, d\upsilon
\]
\[
= \int_{u \in r(Q) - g(x)} \left[\int 1\!\left\{\sum_{k=0}^{s-1}\lambda_k(x, u) \leq \upsilon < \sum_{k=0}^{s}\lambda_k(x, u)\right\} f_{\xi|U}(\upsilon|u)\, d\upsilon\right] f_U(u)\, du
\]
\[
= \int_{u \in r(Q) - g(x)} \lambda_s(x, u) f_U(u)\, du
= \int_{y \in Q} \lambda_s(x, r(y) - g(x)) f_U(r(y) - g(x)) |J(y)|\, dy.
\]
The second equality comes from the fact that $q_m(u + g(x)) \notin Q$ for any $m \neq s$, as well as the independence between $(U, \xi)$ and $X$. The fourth equality uses the fact that $q^{-1}(\cdot) = r(\cdot)$ is one-to-one over $A_s$ (Definition 1(i)), and thus over $Q \subset A_{sh}$. The fifth equality comes from the independence between $U$ and $\xi$ as well as the uniform distribution of $\xi$. The last equality comes from the facts that $r(y) - g(x)$ is one-to-one in $y$ within $A_s$ (and thus within $Q$), that $\lambda_s(\mathcal{M}(A_{sh}), x, u) f_U(u)$ is non-negative, and Theorem 17.2 of Billingsley (1986), p. 229.

Therefore, for any open set $O \subset \cup_{m=1}^{M} \cup_{l \in L_m} A_{ml}$,
\[
\Pr(Y \in O \mid X = x) = \sum_{m=1}^{M} \sum_{l \in L_m} \int_{y \in O \cap A_{ml}} \lambda_m(x, r(y) - g(x)) f_U(r(y) - g(x)) |J(y)|\, dy
\]
\[
= \int_{y \in O} \sum_{m=1}^{M} \sum_{l \in L_m} 1\{y \in A_{ml}\} \lambda_m(x, r(y) - g(x)) f_U(r(y) - g(x)) |J(y)|\, dy.
\]
The desired result follows. Q.E.D.

Proof of Proposition 1.
If the model is in (C1), that is, $r(\cdot)$ is one-to-one, then by Assumptions D2 to D4 we have
\[
f_{Y|X}(y|x) = f_U(r(y) - g(x)) |J(y)| \quad \text{for } x \in \mathcal{X},\ y \in \mathrm{int}(\mathcal{Y}(x)),
\]
and thus (i) and (ii) follow from Assumptions D2 and D4.

If the model is in (C2), given any $(y, x)$, we have a unique $s^*$ such that $\lambda_{s^*}(x, r(y) - g(x)) = 1$ and $\lambda_s(x, r(y) - g(x)) = 0$ for all $s \neq s^*$. (Note that $s^*$ may depend on $y$ and $x$.)

(i) For any $x \in \mathcal{X}$ and $y \in \left(\cup_{l \in L_{s^*}} A_{s^*l}\right) \cap \mathrm{int}(\mathcal{Y}(x))$, we have
\[
\lambda_{s^*}(x, r(y) - g(x)) = 1, \tag{A.1}
\]
where $s^*$ may depend on $(y, x)$.

Consider any sequence $\{y_n\}$ such that $y_n \in \mathrm{int}(\mathcal{Y}(x))$ and $y_n \to y$. We must have $\lambda_{s^*}(x, r(y_n) - g(x)) = 1$ for sufficiently large $n$, where $s^*$ is the same as the one in (A.1). Suppose not. Then for any $N$, there exists an $n \geq N$ such that $\lambda_{s^*}(x, r(y_n) - g(x)) = 0$.
Choose $N$ large enough that $y_n \in B_\epsilon(y) \subset A_{s^*}$ for arbitrarily small $\epsilon$, where $B_\epsilon(y)$ is an open ball centred at $y$ with radius $\epsilon$. Together with Lemma 2, we would then have $f_{Y|X}(y_n|x) = 0$, which contradicts $y_n \in \mathrm{int}(\mathcal{Y}(x))$. Therefore, by Lemma 2, we have
\[
f_{Y|X}(y|x) = f_U(r(y) - g(x))|J(y)| \quad \text{and} \quad f_{Y|X}(y_n|x) = f_U(r(y_n) - g(x))|J(y_n)|.
\]
By Assumptions D2 and D4, the continuity statement in Proposition 1(i) holds for given $x \in \mathcal{X}$ and $y \in \left(\cup_{l \in L_{s^*}} A_{s^*l}\right) \cap \mathrm{int}(\mathcal{Y}(x))$.

Now suppose $y \in \left(\cap_{k \in K} \bar{A}_{s^*l_k}\right) \cap \mathrm{int}(\mathcal{Y}(x))$, where $K \subset L_{s^*}$ and $\#(K) \geq 2$. Let $O$ be an open set such that $y \in O$ and $O \cap A_{s^*l_k} \neq \emptyset$ for all $k \in K$. Then for any $y' \in O \cap A_{s^*l_k} \cap \mathrm{int}(\mathcal{Y}(x))$, applying the result of the last paragraph leads to
\[
f_{Y|X}(y'|x) = f_U(r(y') - g(x))|J(y')|.
\]
Consider any sequence $\{y_n'\}$ such that $y_n' \in O \cap A_{s^*l_k} \cap \mathrm{int}(\mathcal{Y}(x))$ and $y_n' \to y'$. Assumptions D2 and D4 lead to
\[
\lim_{y_n' \to y',\ y_n' \in O \cap A_{s^*l_k} \cap \mathrm{int}(\mathcal{Y}(x))} f_{Y|X}(y_n'|x) = f_U(r(y') - g(x))|J(y')|.
\]
Thus the continuity statement in Proposition 1(i) holds for any $y \in \left(\cap_{k \in K} \bar{A}_{s^*l_k}\right) \cap \mathrm{int}(\mathcal{Y}(x))$ and some $K \subset L_{s^*}$ with $\#(K) \geq 2$.

Using similar arguments, we can show that the continuity statement in Proposition 1(i) holds for any $y \in \left(\cap_{s \in S} \bar{A}_s\right) \cap \mathrm{int}(\mathcal{Y}(x))$ and some $S \subset \{1, \ldots, M\}$ with $\#(S) \geq 2$.

Note that any $y \in \mathrm{int}(\mathcal{Y}(x))$ must belong to one of the three sets $\cup_{l \in L_{s^*}} A_{s^*l}$, $\cap_{k \in K} \bar{A}_{s^*l_k}$ and $\cap_{s \in S} \bar{A}_s$. Thus the desired result is proven.

(ii) For any $x \in \mathcal{X}$ and $y \in \left(\cup_{l \in L_{s^*}} A_{s^*l}\right) \cap \mathrm{int}(\mathcal{Y}(x))$, we have
\[
\lambda_{s^*}(x, r(y) - g(x)) = 1. \tag{A.2}
\]
Consider any sequence $\{y_n, x_n\}$ such that $y_n \in \mathrm{int}(\mathcal{Y}(x_n))$ and $(y_n, x_n) \to (y, x)$. We must have $\lambda_{s^*}(x_n, r(y_n) - g(x_n)) = 1$ for sufficiently large $n$, where $s^*$ is the same as the one in (A.2). Suppose not. Then for any $N$, there exists an $n \geq N$ such that $\lambda_{s^*}(x_n, r(y_n) - g(x_n)) = 0$. For $(y_n, x_n)$ arbitrarily close to $(y, x)$, $y_n \in B_\epsilon(y) \subset A_{s^*}$ for arbitrarily small $\epsilon$. By Lemma 2, $f_{Y|X}(y_n|x_n) = 0$, which contradicts $y_n \in \mathrm{int}(\mathcal{Y}(x_n))$.

Then by Lemma 2, we have
\[
f_{Y|X}(y|x) = f_U(r(y) - g(x))|J(y)| \quad \text{and} \quad f_{Y|X}(y_n|x_n) = f_U(r(y_n) - g(x_n))|J(y_n)|.
\]
Thus the continuity statement in Proposition 1(ii) follows from Assumptions D2 and D4.

Arguments similar to part (i) apply for $y \in \left(\cap_{k \in K} \bar{A}_{s^*l_k}\right) \cap \mathrm{int}(\mathcal{Y}(x))$, $K \subset L_{s^*}$ with $\#(K) \geq 2$, and for $y \in \left(\cap_{s \in S} \bar{A}_s\right) \cap \mathrm{int}(\mathcal{Y}(x))$, $S \subset \{1, \ldots, M\}$ with $\#(S) \geq 2$. This completes the proof. Q.E.D.

Proof of Proposition 2.
(i) Suppose $A_{sh}$ and $A_{sh'}$ satisfy Assumption D6(i). For any $y \in A_{sh}$ and $y' \in A_{sh'}$, Lemma 2 gives
\[
f_{Y|X}(y|x) = f_U(r(y) - g(x))|J(y)|\lambda_s(x, r(y) - g(x)) \quad \text{and} \quad f_{Y|X}(y'|x) = f_U(r(y') - g(x))|J(y')|\lambda_s(x, r(y') - g(x)).
\]
Let $y_d \in \bar{A}_{sh} \cap \bar{A}_{sh'}$ and $x \in \mathcal{X}_D$. Observe that
\[
\lim_{y \in A_{sh},\ y \to y_d} f_{Y|X}(y|x) = f_U(r(y_d) - g(x))|J(y_d)| \lim_{y \in A_{sh},\ y \to y_d} \lambda_s(x, r(y) - g(x)),
\]
\[
\lim_{y' \in A_{sh'},\ y' \to y_d} f_{Y|X}(y'|x) = f_U(r(y_d) - g(x))|J(y_d)| \lim_{y' \in A_{sh'},\ y' \to y_d} \lambda_s(x, r(y') - g(x)). \tag{A.3}
\]
The two limits
\[
\lim_{y \in A_{sh},\ y \to y_d} \lambda_s(x, r(y) - g(x)) \neq \lim_{y' \in A_{sh'},\ y' \to y_d} \lambda_s(x, r(y') - g(x))
\]
differ, and both are non-zero by Assumption D6(ii). Together with $f_U(r(y_d) - g(x)) > \eta > 0$ (by the fact that $y_d \in \bar{\mathcal{Y}}$ and thus $r(y_d) - g(x) \in \mathcal{U}$) and $|J(y_d)| > \varsigma$ by Assumption D5(i), we have $\lim_{y \in A_{sh}, y \to y_d} f_{Y|X}(y|x) \neq 0$ and $\lim_{y' \in A_{sh'}, y' \to y_d} f_{Y|X}(y'|x) \neq 0$, as well as
\[
\left|\lim_{y \in A_{sh},\ y \to y_d} f_{Y|X}(y|x) - \lim_{y' \in A_{sh'},\ y' \to y_d} f_{Y|X}(y'|x)\right|
= f_U(r(y_d) - g(x))|J(y_d)| \left|\lim_{y \in A_{sh},\ y \to y_d} \lambda_s(x, r(y) - g(x)) - \lim_{y' \in A_{sh'},\ y' \to y_d} \lambda_s(x, r(y') - g(x))\right| > 0.
\]
Therefore, $f_{Y|X}(y|x)$ has a jump at $y_d$ for any $x \in \mathcal{X}_D$.

(ii) immediately follows from (i). Q.E.D.

To prove Corollary 1, let us first state Lemma 3 as follows.

Lemma 3. Suppose that Assumptions D5(ii) and (iii) hold and $M > 1$. Then
(i) for any $m \in \{1, \ldots, M\}$, there exists a $k \in \{1, \ldots, M\}$ with $k \neq m$ such that $B_m \cap B_k \neq \emptyset$;
(ii) for any $m$, there exists a $k \neq m$ such that $r(A_m) \cap r(A_k) \neq \emptyset$.

Proof of Lemma 3.
(i) It suffices to show that for every $m \in \{1, \ldots, M\}$, there exists some $v \in B_m$ such that $r(y) = v$ holds for more than one $y \in \bar{\mathcal{Y}}$. Suppose this is not true. Then we have a $B_m$ such that $r(y) = v$ has a unique solution in $y$ for all $v \in B_m$.
Consider an $l \in \{j \neq m : \bar{B}_m \cap \bar{B}_j \neq \emptyset\}$ and let $v_0$ be an arbitrary point in $\bar{B}_m \cap \bar{B}_l$. By assumption, $B_m \cap B_l = \emptyset$ for all $l \neq m$. Thus $v_0 \in \bar{B}_m \setminus B_m$. For any such $l$, we have two cases:

Case 1. $\lim_{v \in B_m, v \to v_0} q_m(v) - \lim_{v \in B_l, v \to v_0} q_l(v) = 0$ for all $v_0 \in \bar{B}_m \cap \bar{B}_l$. In this case, we can combine $B_m$ and $B_l$ to reduce the number $M$; this contradicts Definition 1.

In particular, let $B_m' = \mathrm{int}\!\left(B_m \cup B_l \cup (\bar{B}_m \cap \bar{B}_l)\right)$ and
\[
q_m'(v) = \begin{cases} q_m(v), & v \in B_m, \\ q_l(v), & v \in B_l, \\ \lim_{w \in B_m,\ w \to v} q_m(w), & v \in \bar{B}_m \cap \bar{B}_l. \end{cases}
\]
By Assumption D5(ii), we know that $\nabla r(y_0)$ is invertible, where $y_0 = \lim_{v \in B_m, v \to v_0} q_m(v) = \lim_{v \in B_l, v \to v_0} q_l(v)$. Then we can apply the Implicit Function Theorem to obtain that there are neighbourhoods $A$ of $y_0$ and $B$ of $v_0$, and a unique function $\varsigma : B \to A$ such that $\varsigma(v_0) = y_0$; for each $v \in B$, $\varsigma(v)$ is the unique solution to $r(y) = v$ lying in $A$; $r(\varsigma(v)) = v$ for $v \in B$; and $\varsigma(v)$ is twice continuously differentiable at $v_0$. Therefore, $q_m'(v)$ must coincide with $\varsigma(v)$ for $v \in B$. Since $v_0$ is an arbitrary point in $\bar{B}_m \cap \bar{B}_l$, $q_m'(v)$ is twice continuously differentiable on $B_m'$.

Case 2. $\left|\lim_{v \in B_m, v \to v_0} q_m(v) - \lim_{v \in B_l, v \to v_0} q_l(v)\right| = \delta > 0$ for some $v_0 \in \bar{B}_m \cap \bar{B}_l$.

In this case, by Assumption D2, $r(y)$ is continuous on $\bar{\mathcal{Y}}$. Hence
\[
r(y_m) = r(y_l) = v_0, \quad \text{where } y_m = \lim_{v \to v_0,\ v \in B_m} q_m(v), \quad y_l = \lim_{v \to v_0,\ v \in B_l} q_l(v).
\]
There must be an $l$ satisfying the following: there is no $l'$ such that every component of $y_{l'} - y_m$ has the same sign as $y_l - y_m$ and $|y_{l'} - y_m| < |y_l - y_m|$. We focus on such an $l$. By Assumption D5(iii), the derivative $\nabla r(y_l)$ is invertible. By applying the Implicit Function Theorem, for any neighbourhood $A$ of $y_l$, there is a neighbourhood $B$ of $v_0$ and a function $\xi : B \to A$ such that $\xi(v_0) = y_l$, $r(\xi(v)) = v$ for $v \in B$, and $\xi(v)$ is continuous at $v_0$. Therefore, for any $\epsilon$, there exists an $\eta$ such that $|v - v_0| < \eta \implies |\xi(v) - y_l| < \epsilon$. Then for any $\epsilon$, all $v' \in B_m \cap B(v_0, \eta) \cap B$ satisfy $r(\xi(v')) = v'$ and $|\xi(v') - y_l| < \epsilon$. Thus $|\xi(v') - y_m| > \delta - \epsilon$ and $r(\xi(v')) = v'$ for all $v' \in B_m \cap B(v_0, \eta) \cap B$.
On the other hand, by continuity of $q_m$, for any $\epsilon'$, there exists an $\eta'$ such that $|v - v_0| < \eta' \implies |q_m(v) - y_m| < \epsilon'$. Therefore, for any $\epsilon'$, all $v'' \in B_m \cap B(v_0, \eta')$ satisfy $r(q_m(v'')) = v''$ and $|q_m(v'') - y_m| < \epsilon'$. By choosing $\epsilon$ and $\epsilon'$ sufficiently small, we have $\xi(v) \neq q_m(v)$ and $r(\xi(v)) = r(q_m(v)) = v$ for all $v \in B_m \cap B(v_0, \eta \wedge \eta') \cap B$. This contradicts the assumption that $r(y) = v$ has a unique solution in $y$ for all $v \in B_m$.

(ii) immediately follows from (i) by letting $A_m = q_m(B_m)$. Q.E.D.

Proof of Corollary 1.
By Assumption D7(i), there exists an $s \in \{1, 2, \ldots, M\}$ such that
\[
\Pr\left(g(X) + U \in r(A_s) \cap \mathcal{S}\right) > 0.
\]
Let $A_{s1} = \{y \in A_s : r(y) \in \mathcal{S}\}$. We have $\mathcal{M}(r(A_{s1})) = \{s\}$. Also, by Lemma 3(ii), there exists some $m \neq s$ such that $r(A_s) \cap r(A_m) \neq \emptyset$. Let $A_{s2} = \{y \in A_s : r(y) \in r(A_m)\}$. We have $\{s, m\} \subset \mathcal{M}(r(A_{s2}))$. Thus $\mathcal{M}(r(A_{s1})) \neq \mathcal{M}(r(A_{s2}))$. Assuming $\bar{A}_{s1} \cap \bar{A}_{s2} \neq \emptyset$ does not lose any generality. Q.E.D.

Proof of Corollary 2.
Under Assumptions D8(i) and D7(ii), there exist some $k \in L_m$ and $C_0 \subset A_{mk}$ such that for all $x \in \mathcal{X}_D$,
\[
0 < \phi_m(\mathcal{M}(r(C_0)), x) = \phi_m(\mathcal{M}(r(A_{mk})), x) < 1.
\]
By Assumption D7(i), there exists some $j \in L_m$ such that $\#(\mathcal{M}(r(A_{mj}))) = 1$. Thus $\phi_m(\mathcal{M}(r(A_{mj})), x) = 1$. There are two cases:

Case 1. $\bar{A}_{mj} \cap \bar{A}_{mk} \neq \emptyset$. The rest of the proof follows that of Proposition 2, with $\mathcal{X}_{D1} = \mathcal{X}_D$.

Case 2. $\bar{A}_{mj} \cap \bar{A}_{mk} = \emptyset$. This means that for all $h$ such that $\bar{A}_{mh} \cap \bar{A}_{mk} \neq \emptyset$, $\#(\mathcal{M}(r(A_{mh}))) > 1$:

Case 2.1. $\phi_m(\mathcal{M}(r(A_{mh})), x) \neq \phi_m(\mathcal{M}(r(A_{mk})), x)$ for some $h$ such that $\bar{A}_{mh} \cap \bar{A}_{mk} \neq \emptyset$, and for a subset $\mathcal{X}_{D1} \subset \mathcal{X}_D$. Then the rest of the proof follows that of Proposition 2.

Case 2.2. $\phi_m(\mathcal{M}(r(A_{mh})), x) = \phi_m(\mathcal{M}(r(A_{mk})), x)$ for all $h$ such that $\bar{A}_{mh} \cap \bar{A}_{mk} \neq \emptyset$, and for almost all $x$. Then we must have $0 < \phi_m(\mathcal{M}(r(A_{mh})), x) < 1$ for all $h$ such that $\bar{A}_{mh} \cap \bar{A}_{mk} \neq \emptyset$. Consequently, we can replace $k$ with an $h$ such that $\bar{A}_{mh} \cap \bar{A}_{mk} \neq \emptyset$ and repeat the procedure from the beginning of the proof. This procedure continues until it ends in either Case 1 or Case 2.1, because $\#(L_m)$ is finite.
Q.E.D.

Proof of Corollary 3.
By Assumption D7(i), without loss of generality, assume $A_1 \cap \mathcal{S} \neq \emptyset$ and let $A_{11} = \{y \in A_1 : \mathcal{M}(r(y)) = \{1\}\}$. Note that we also have $A_1 = \cup_{l \in L_1} \bar{A}_{1l}$.

If the model is in (C3), by Corollary 1 there exists some $A_{12}$ such that $\mathcal{M}(r(A_{12})) \neq \mathcal{M}(r(A_{11}))$. Since $\mathcal{M}(r(A_{12}))$ includes equilibrium 1 by definition, we must have $\#(\mathcal{M}(r(A_{12}))) > 1$. Suppose $\bar{A}_{11} \cap \bar{A}_{12} \neq \emptyset$. (This does not reduce generality, since Category (C3) and Assumption D7(i) imply that there exist $A_{1j}$ and $A_{1k}$ satisfying $\bar{A}_{1j} \cap \bar{A}_{1k} \neq \emptyset$, $\#(\mathcal{M}(r(A_{1j}))) = 1$ and $\#(\mathcal{M}(r(A_{1k}))) > 1$.) Given an arbitrary $x$, for any $y \in \bar{A}_{11} \cap \bar{A}_{12}$, by Lemma 2 we have
\[
\lim_{y' \to y,\ y' \in A_{11}} f_{Y|X}(y'|x) = f_U(r(y) - g(x))|J(y)|\,\phi_1(\mathcal{M}(r(A_{11})), x) = f_U(r(y) - g(x))|J(y)|
\]
and
\[
\lim_{y' \to y,\ y' \in A_{12}} f_{Y|X}(y'|x) = f_U(r(y) - g(x))|J(y)|\,\phi_1(\mathcal{M}(r(A_{12})), x),
\]
where $\phi_1(\mathcal{M}(r(A_{11})), x) = 1$ since $\mathcal{M}(r(A_{11})) = \{1\}$. By Assumption D9, we have the following cases:

Case 1. There is an $\mathcal{X}_{D2} \subset \mathcal{X}$ with $\Pr(X \in \mathcal{X}_{D2}) > 0$ such that $0 < \phi_1(\mathcal{M}(r(A_{12})), x) < 1$ for all $x \in \mathcal{X}_{D2}$. By Assumption D5(i) we have
\[
\left|\lim_{y' \to y,\ y' \in A_{11}} f_{Y|X}(y'|x) - \lim_{y' \to y,\ y' \in A_{12}} f_{Y|X}(y'|x)\right|
= f_U(r(y) - g(x))|J(y)|\left(1 - \phi_1(\mathcal{M}(r(A_{12})), x)\right) \geq \eta\delta\varsigma > 0.
\]
That is, $f_{Y|X}(y|x)$ has a jump at any $y \in \bar{A}_{11} \cap \bar{A}_{12}$ for some $x \in \mathcal{X}_{D2}$. Furthermore, Assumptions D5(i) and D9(ii) ensure that $\lim_{y' \to y,\ y' \in A_{11}} f_{Y|X}(y'|x) \neq 0$ and $\lim_{y' \to y,\ y' \in A_{12}} f_{Y|X}(y'|x) \neq 0$. The desired result is obtained.

Case 2.1. If $\phi_1(\mathcal{M}(r(A_{12})), x) = 1$ for almost all $x$: consider another $A_{1j}$ such that $\bar{A}_{11} \cap \bar{A}_{1j} \neq \emptyset$. By the same argument as in Case 1, $f_{Y|X}(y|x)$ has a jump at $y \in \bar{A}_{11} \cap \bar{A}_{1j}$ if there exists an $\mathcal{X}_{D2}$ such that $0 < \phi_1(\mathcal{M}(r(A_{1j})), x) < 1$ for $x \in \mathcal{X}_{D2}$. Repeating this step yields the desired result if there exists some $l \in L_1$ such that $0 < \phi_1(\mathcal{M}(r(A_{1l})), x) < 1$ for $x \in \mathcal{X}_{D2}$.

Case 2.2. If $\phi_1(\mathcal{M}(r(A_{1l})), x) = 1$ for almost all $x$ and all $l \in L_1$: consider some other $m \neq 1$ with $r(A_m) \cap \mathcal{S} \neq \emptyset$.
By repeating the steps in Case 1 and Case 2.1, the desired result is obtained if there exists some $m$ such that $0 < \phi_m(\mathcal{M}(r(A_{ml})), x) < 1$ for some $l \in L_m$ and for all $x \in \mathcal{X}_{D2}$.

Case 2.3. If $\phi_m(\mathcal{M}(r(A_{ml})), x) = 1$ for almost all $x$, all $l \in L_m$ and all $m$ with $r(A_m) \cap \mathcal{S} \neq \emptyset$: let $\mathcal{T} = \{t : r(A_t) \cap \mathcal{S} \neq \emptyset\}$. Case 2.3 means that for almost all $x$, there exists a unique $t \in \mathcal{T}$ ($t$ may depend on $x$) such that $\phi_t(\mathcal{M}(r(A_{tl})), x) = 1$ for all $l \in L_t$, and $\phi_v(\mathcal{M}(r(A_{vk})), x) = 0$ for any $v \notin \mathcal{T}$ and any $k \in L_v$. By Assumption D9(i), for any $(x, u)$, $g(x) + u \in r(A_t)$ for some $t \in \mathcal{T}$. Thus by Assumption D7(ii), the equilibrium selection rule is degenerate (it always chooses equilibrium $t$ with probability 1) and the definition of (C3) is violated. Hence Case 2.3 cannot happen when there are multiple equilibria. Q.E.D.

Proof of Proposition 3.
(i) By Assumption D10, given any $m$,
\[
\lambda_m(x, u) = \psi_m(g(x) + u) = \psi_m(r(y)) = 1 \text{ or } 0.
\]
For any $y \in \cup_{m=1}^{M} \cup_{l \in L_m} A_{ml}$, by Lemma 2 and Proposition 1(i), we have for all $x \in \mathcal{X}$,
\[
f_{Y|X}(y|x) = f_U(r(y) - g(x)) |J(y)| \psi_m(r(y)),
\]
where $m$ and $l$ satisfy $y \in A_{ml}$. Observe that
\[
f_Y(y) = \int_{\mathcal{X}} f_U(r(y) - g(x)) |J(y)| \psi_m(r(y))\, dF_X(x) = \psi_m(r(y))|J(y)| \int_{\mathcal{X}} f_U(r(y) - g(x))\, dF_X(x)
\]
\[
= \begin{cases} |J(y)| \int_{\mathcal{X}} f_U(r(y) - g(x))\, dF_X(x) & \text{if } \psi_m(r(y)) = 1, \\ 0 & \text{if } \psi_m(r(y)) = 0, \end{cases}
\]
which is continuous as long as $y \in \mathrm{int}(\mathcal{Y}(x))$ by Assumptions D2 and D4. Moreover, the density is
\[
|J(y)| \int_{\mathcal{X}} f_U(r(y) - g(x))\, dF_X(x).
\]
Since the choice of $m$ is arbitrary and $\bar{\mathcal{Y}} = \cup_{m=1}^{M} \cup_{l \in L_m} \bar{A}_{ml}$, the desired result is proven.

(ii) Given the assumptions, the conclusion of Proposition 2(i) holds. By Assumptions D5 and D7, there is a jump at $y_d \in \bar{A}_{mh} \cap \bar{A}_{mk}$ for $x \in \mathcal{X}_D$, where $\#(\mathcal{M}(r(A_{mh}))) = 1$ and $\#(\mathcal{M}(r(A_{mk}))) > 1$.
Thus thejump sizelimy∈Amh,y→ydfY |X(y|x)− limy∈Amk,y→ydfY |X(y|x)at an arbitrary x ∈ XD isfU (r(yd)− g(x)) |J(yd)|[1− limy∈Amk,y→ydφm(M(r(Amk)), x)]= δ > 0.Therefore, the difference in the unconditional density can be written aslimy∈Amh,y→ydfY (y)− limy∈Amk,y→ydfY (y)=∫XD[limy∈Amh,y→ydfY |X(y|x)− limy∈Amk,y→ydfY |X(y|x)]f(x)dx> δ Pr (X ∈ XD) > 0.Also, since limy∈Amh,y→yd fY |X(y|x) 6= 0 for x ∈ XD, we havelimy∈Amh,y→ydfY (y) =∫XfY |X(y|x) ≥∫XDfY |X(y|x) 6= 0.The same is for limy∈Amk,y→yd fY (y).Q.E.D.Proof of Proposition 4.Let UD denote the collection of jump locations of fU (u). If the modelhas a unique equilibrium, by Assumption D10, given any m, λm(x, u) =ψm(g(x) + u) = ψm(r(y)) = 1 or 0.For an arbitrary y ∈ ∪Mm=1 ∪l∈Lm Aml, by Lemma 2,fY |X(y|x) = fU (r(y)− g(x)) |J(y)|ψm(r(y)),130A.2. Conditions in Section 2.4for all x ∈ X\XJL(y), where XJL(y) = {x : r(y) − g(x) ∈ UD}, m and lsatisfy y ∈ Aml. Since Pr(U ∈ U) = 0 and Assumption D11 holds, Pr(X ∈XJL(y)) = 0, for all y. Hence for any arbitrary y ∈ ∪Mm=1 ∪l∈Lm Aml,fY (y) =∫X\XJL(y)fU (r(y)− g(x)) |J(y)|ψm(r(y))dFX(x)+∫XJL(y)fY |X(y|x)dFX(x)= ψm(r(y))|J(y)|∫X\XJL(y)fU (r(y)− g(x)) dFX(x)={|J(y)|∫X\XJL(y)fU (r(y)− g(x)) dFX(x) if ψm(r(y)) = 10 if ψm(r(y)) = 0,which is continuous at as long as y ∈ int(Y(x)) by Assumptions D2 and D4.Moreover, the density is|J(y)|∫XfU (r(y)− g(x)) dFX(x).Since the choice of m is arbitrary and Y¯ = y ∈ ∪Mm=1 ∪l∈Lm A¯ml, the desiredresult follows. Q.E.D.A.2 Conditions in Section 2.4This section spells out five conditions (SL1 to SL5) sufficient for Proposition5 and Proposition 6 in Section 2.4. Note that Conditions SL1 to SL4 areadapted from Chernozhukov et al. (2013).To state those conditions, let us first define the population version of theaforementioned maximizing set Yˆκ2 . 
LetYκ2 = {y ∈ Y : P−j (y)/Pj(y) > κ2, P+j (y)/Pj(y) > κ2, all j ∈ J },where for j ∈ J ,P−j (y) = Pr (yj − wj ≤ Yi,j < yj |Yi,−j = y−j) fY−j (y−j) ,P+j (y) = Pr (yj ≤ Yi,j < yj + wj |Yi,−j = y−j) fY−j (y−j) ,Pj(y) = P+j (y) + P−j (y), wj = (supyYj − infyYj)/H.131A.2. Conditions in Section 2.4Conditions SL1 and SL2 are imposed on the kernel function. For j ∈ J ,recall that ∆Kn,j be a class of measurable functions,∆Kn,j =√pnhJ−1n ϕ(j)n (·, y)σj,n(y): y ∈ Yκ2, (A.4)whereϕ(j)n (t, y) =1pnhJ−1n[K+j(tj−yjpn, t−j−y−jhn)−K−j(tj−yjpn, t−j−y−jhn) ].(A.5)Recall that the covering number N (∆Kn,j , L2(Q), bn,jε) is defined as theminimum number of closed L2 balls of radius bn,jε to cover ∆Kn,j .Condition SL1. (i) ∆Kn,j is uniformly bounded from above by anumber bn,j , and the covering number of ∆Kn,j satisfiessupQN (∆Kn,j , L2(Q), bn,jε) ≤ (aj/ε)vj , for all 0 < ε < 1, (A.6)for some aj > e and vj ≥ 1, where the supermum is taken over all finitelydiscrete probability measures Q.(ii) For any g ∈ ∆Kn,j , there exists a constant σn,j such thatE[g(Y1)2]≤ σ2n,j ≤ b2n,j .Condition SL2. There exist positive constants c2 and C2 such that forall j ∈ J ,c2 ≤ infy∈Yκ2σj,n(y) ≤ supy∈Yκ2σj,n(y) ≤ C2,c2 ≤ σ2n,j ≤ C2.Condition SL3. Suppose that H0 is true, or H1 is true and for anyy ∈ Yκ2 , there exists ε such that f(·) is twice continuously differentiableover a neighbourhood of (yj + ε, y−j) and (yj − ε, y−j) for all j ∈ J . Thenthere are integer n0 and positive constants c3, C3, c4 and C4 such that forall n ≥ n0, there is some t satisfying c3 ≤ t ≤ C3 andsupy∈Yκ2(E[∆(j)n (y)]−∆(j)(y))≤ C4htn,132A.3. Proofs for Section 2.4supy∈Yκ2(E[∆(j)n (y)]−∆(j)(y))≤ C4ptn.Condition SL4. Let Kn = Av(log n ∨ log(abn/σn)), wherebn = maxj∈J bn,j , σn = maxj∈J σn,j , v = maxj∈J vj and a = J maxj∈J aj ,A is a constant. 
Then there are constants C5 and c5 such thatbnK4n/n ≤ C5n−c5 , and a constant M¯ > 0 such that√npnhJ−1n ptn ≤ M¯cS1,n(γn) supj∈J ,y∈Yκ2σj,n(y),√npnhJ−1n htn ≤ M¯cS1,n(γn) supj∈J ,y∈Yκ2σj,n(y).Condition SL5. Let bandwidths be pn = p0×n−γp and hn = h0×n−γh .Assume that the bandwidths satisfy γp + (J − 1)γh < 1, γp + (J + 1)γh > 1and 3γp + (J − 1)γh > 1.The constant t in Condition SL3 differs under H0 and H1. ConditionsSL4 and SL5 place restrictions on the bandwidths pn and hn. Note thatunder Condition SL5,√npnhJ−1n ptn and√npnhJ−1n htn on the left hand sideof Condition SL4 converge to zero. The requirement of C in computing thecritical value is C ≥ M¯C4/((1 − 3n)c2) (see Appendix A.3). Then underCondition SL5, the constant C can be chosen arbitrarily close to zero. Inpractice, we can set C = 0.A.3 Proofs for Section 2.4The organization of this Appendix A.3 is as follows. We first introducesome notations. Then we spell out four high level conditions in the spiritof Chernozhukov et al. (2013), and verify them under Conditions SL1-SL5in Appendix A.2. Next we state and prove three lemmas. Lastly, we provePropositions 5 and 6 using high level conditions as well as the lemmas.NotationsLet cS1,n(α) be the (1 − α)th quantile of the distribution ofsupj∈J ,y∈Yκ2∣∣∣Gˆn,j(y)∣∣∣ given {Yi}ni=1. Also let cS1,n = cS1,n(α) + c′1,n, where133A.3. Proofs for Section 2.4c′1,n = Cc1,n(γn), and c1,n(γn) is the (1− γn)th quantile of the distributionof supj∈J ,y∈Yκ2∣∣∣Gˆn,j(y)∣∣∣.Let Zj,n be an empirical processZj,n = {Zj,n(y) : j ∈ J , y ∈ Yκ2} ,where Zj,n(y) =√npnhJ−1n(∆(j)n −E[∆(j)n (y)])σj,n(y).Zj,n can also be written asZn(g) =1√nn∑i=1(g(Yi)−E [g(Yi)]) , g ∈ ∆Kn,where ∆Kn ≡ ∪Jj=1∆Kn,j defined in (A.4).Let G = {G(g) : g ∈ ∆Kn} be a zero mean Gaussian process with thesame covariance function as that of Zn = {Zn(g) : g ∈ ∆Kn}. That is,E [G(g1)G(g2)]= E [Zn(g1)Zn(g2)]−E [Zn(g1)] E [Zn(g2)]= pnhJ−1n [E [g1(Yi)g2(Yi)]−E [g1(Yi)] E [g2(Yi)]] . 
(A.7)for any g1, g2 ∈ ∆Kn.High level conditionsWe state following four high level conditions analogous to Chernozhukovet al. (2013). These conditions will be used in the proof of Propositions 5and 6.Condition SH1. One can construct a sequence of random variablesW 0n satisfying the following conditions.(i) W 0n =d supg∈∆Kn |G(g)|, where G(g) is a centred Gaussian processwith E[G2(g)]= 1 for any g ∈ ∆Kn, and E[supg∈∆Kn |G(g)|]≤ C1√log n.(ii) For some positive sequences 1n and δ1n bounded by C1n−c1 (forsome constants C1, c1). We havePr(∣∣∣∣∣supg∈∆Kn|Zn(g)| −W0n∣∣∣∣∣> 1n)≤ δ1n. (A.8)134A.3. Proofs for Section 2.4Condition SH2. Let cn be the (1 − α)th quantile of the distributionof supg∈∆Kn |G(g)|, for some positive sequences τn, 2n and δ2n bounded byC1n−c1 , we havePr(cS1,n(α) < cn(α+ τn)− 2n)≤ δ2n,Pr(cS1,n(α) > cn(α− τn) + 2n)≤ δ2n.Condition SH3. For some positive sequences 3n and δ3n bounded byC1n−c1 , we havePr(supj∈J ,y∈Yκ2∣∣∣∣σˆj,n(y)σj,n(y)− 1∣∣∣∣ > 3n)≤ δ3n.For future reference, we state Condition A1 as follows,Condition A1. The alternative hypothesis H1 is true, and for anyy ∈ Yκ2 , there is an ε such that the density function f(·) is twice continuouslydifferentiable over a neighbourhood of (yj+ε, y−j) and (yj−ε, y−j) for everyj ∈ J .Condition SH4. LetBn = supj∈J ,y∈Yκ2√npnhJ−1n E[∆(j)n (y)−∆(j)(y)]σˆj,n(y).If H0 is true, or Condition A1 holds, then for some positive sequence δ4nbounded by C1n−c1 , we havePr(Bn > c′1,n)≤ δ4n.where c′1,n = CcS1,n(γn).Note that under H0, ∆(j)(y) is zero for all j.Verify Conditions SH1-SH4 using Conditions SL1-SL5Proposition A1 below states that Conditions SH1-SH4 hold under Con-ditions SL1-SL5.135A.3. Proofs for Section 2.4Proposition A1. Suppose that Conditions SL1-SL4 hold. 
Then Con-ditions SH1-SH4 hold.Proof of Proposition A1(i) Verify SH1: By Condition SL1, ∆Kn,j is a class of measurablefunctions uniformly bounded by bn,j and its covering number satisfiessupQN (∆Kn,j , L2(Q), bn,jε) ≤ (aj/ε)vj , for all 0 < ε < 1, (A.9)for some vj ≥ 1 and aj ≥ e. Then ∆Kn ≡ ∪Jj=1∆Kn,j is a class of measurablefunctions uniformly bounded by bn = maxj∈J bn,j and its covering numbersatisfiessupQN (∆Kn,, L2(Q), bnε) ≤ (a/ε)v , 0 < ε < 1, (A.10)for some number v ≥ 1, a ≥ e. This is becausesupQN (∆Kn,, L2(Q), bnε) ≤J∑j=1supQN (∆Kn,j , L2(Q), bnε)≤J∑j=1supQN (∆Kn,j , L2(Q), bn,jε)≤ J maxj(aj/ε)vj , 0 < ε < 1.Inequality (A.10) holds by letting v = maxj∈J vj and a = J maxj∈J aj . Inthe following, We will need Theorem 3.1 of Chernozhukov et al.(2013).[Restatement of Theorem 3.1 in Chernozhukov et al.(2013)]Suppose G is a class of functions uniformly bounded by a constant b suchthat there exist constants a ≥ e and v > 1 with supQN(G, L2(Q), b) ≤(a/)v for all 0 <  ≤ 1 . Let σ2 be a constant such that supg∈GVar(g) ≤σ2 ≤ b2. Denote Kn = Av(log n ∨ log(ab/σ)). Define an empirical processGn(g) ≡1√nn∑i=1(g(Xi)−E [g(Xi)]) , g ∈ G.Let Wn = ‖Gn‖G = supg∈G |Gn(g)|, and B = {B(g) : g ∈ G} be a centredtight Gaussian process with a covariance functionE [B(g1)B(g2)] = E [g1(Xi)g2(Xi)]−E [g1(Xi)] E [g2(Xi)] .136A.3. Proofs for Section 2.4Then for any γ ∈ (0, 1), one can construct a random variable such that(i) W0 =d ‖B‖G .(ii)Pr(|Wn −W0| >bKnγ1/2n1/2+σ1/2K3/4nγ1/2n1/2+b1/3σ2/3K2/3nγ1/2n1/2)≤ A(γ +log nn),for some absolute constant A.Apply the above theorem with G = ∆Kn, Gn(g) = Z(g) and B(g) =G(g). We can checkE[G2(g)]= E[Z2(g)]= E[npnhJ−1n(∆(j)n (y)−E[∆(j)n (y)])2/σ2j,n(y)]= 1.Also, we obtainPr(∣∣∣∣∣supg∈∆Kn|Z(g)| −W 0n∣∣∣∣∣> 1n)≤ δ1n,where 1n = bnKnγ1/2n1/2 +σ1/2n K3/4nγ1/2n1/2+ b1/3n σ2/3n K2/3nγ1/2n1/2and δ1n = A(γ + lognn). 
Byletting γ = Cn−c for some C and 0 < c < 1/2, we have 1n and δ1n boundedabove by C1n−c1 for some C1 and c1.To show E[supg∈∆Kn |G(g)|]≤ C1√log n, we need Corollary 2.2.8 ofVan der Vart and Wellner (2000).[Restatement of Corollary 2.2.8 of Van der Vart and Wellner (2000)]Let {Xt : t ∈ T} be a separable sub-Gaussian process with respect to thesemimetric d. Then(i) for every δ > 0E[supd(s,t)≤δ|Xs −Xt|]≤ K∫ δ0√logD(ε, d)dε,for a university constant K, where D(ε, d)is the packing number for T . (Thepacking number is the maximum number of -separated elements in T .)(ii) In particular, for any t0,E supt|Xt| ≤ E |Xt0 |+K∫ ∞0√logD(ε, d)dε.137A.3. Proofs for Section 2.4Applying (ii) of Corollary 2.2.8 to the Gaussian process {G(g) : g ∈∆Kn} with respect to standard deviation semimetric d, we obtainE[supg∈∆KnG(g)]≤ E |G(g0)|+K∫ ∞0√logD(ε, d)dε.By Condition SL1,∫∞0√logD(ε, d) is finite and thus bounded by C log nfor sufficiently large n.(ii) Verify SH2: This follows Chernozhukov (2013) et al.DefineG˜n,j(y) =1√nn∑i=1ξi√pnhJ−1n[ϕ(j)n (Yi, y)−∆(j)n (y)]σj,n(y),and∆Gn,j(y) = Gˆn,j(y)− G˜n,j(y).Given any yn1 ∈ RnJ , defineWˆ (yn1 ) = supj∈J ,y∈Yκ2Gˆn,j(y),W˜ (yn1 ) = supj∈J ,y∈Yκ2G˜n,j(y).Let Sn,1 ⊂ RnJ such that∣∣∣σˆj,n(y)σj,n(y)− 1∣∣∣ < 3n for all j and y ∈ Yκ2 and for allyn1 ∈ Sn,1, the “verify SH3” part below will show that Pr(Sn,1) > 1− δ3n =1− 2/n.Fix any yn1 ∈ Sn,1. Note that∆Gn,j(y) =1√nn∑i=1ξi√pnhJ−1n[ϕ(j)n (yn1 , y)−∆(j)n (y)]σj,n(y)(σj,n(y)σˆj,n(y)− 1)is a zero mean Gaussian process with varianceVar (∆Gn,j(y)) =1nn∑i=1pnhJ−1n[ϕ(j)n (yn1 , y)−∆(j)n (y)]2σ2j,n(y)(σj,n(y)σˆj,n(y)− 1)2=σˆ2j,n(y)σ2j,n(y)(σj,n(y)σˆj,n(y)− 1)2≤ 423n,138A.3. 
Proofs for Section 2.4where the last inequality comes from Condition SH3 and by using 3n ≤ 1/2,which holds for sufficiently large n.In the following, we will need Proposition A.2.7 of Van der Vart andWellner (2000).[Restatement of Proposition A.2.7 of Van der Vart and Wellner (2000)]Let Xt, t ∈ T be a separable zero-mean Gaussian process such that forsome K > σ(X), some v > 0 and some 0 < ε0 ≤ σ(X),N (ε, T, ρ) ≤ (K/ε)v for 0 < ε < ε0.Then there exists a universal constant D such that for all λ ≥ σ2(X)(1 +√V )/ε0,Pr(supt∈TXt ≥ λ)≤(DKλ√V σ2(X))v σ(X)λexp(−λ2/2σ2(X)).Now let 3n < 1/2 and define∆K˜n = ∪Jj=1∆K˜n,j and ∆K˜n,j = {ag : a ∈ (0, 1], g ∈ ∆Kn,j} .By Condition SL1, the covering number of class ∆K˜n satisfies the polynomialbound condition of Proposition A.2.7 with v > 1, ρ = L2(Q), T = ∆K˜n,and ε0 = 1. Also, the uniform covering number of the Gaussian process∆Gn,j(y) with respect to the standard deviation semimetric is bounded bythe uniform covering number of function ∆K˜n. So the Gaussian process∆Gn,j(y) meets the conditions of Proposition A.2.7. Let λ = K1/2n 3n (whereKn in specified in Condition SL4). Since σ2(X) = Var (∆Gn,j(y)) ≤ 423n,we have λ ≥ σ2(X)(1 +√V ). Applying Proposition A.2.7 to the process∆Gn,j(y), we obtainPr(∣∣∣Wˆ (yn1 )− W˜ (yn1 )∣∣∣ ≥ λ)≤ Pr(supj∈J ,y∈Yn,κ2∆Gn,j(y) ≥ λ)≤ exp(v log(K3/2n /Var (∆Gn,j(y)))− 1/2 logKn−Kn23n/2Var (∆Gn,j(y))).Since Var (∆Gn,j(y)) ≤ 423n, the leading term in the bracket is−Kn23n/2Var (∆Gn,j(y)) ≤ −Kn/8.139A.3. Proofs for Section 2.4Recall thatKn ≥ log n and λ = K1/2n 3n (can be bounded above by Cn−c, forsufficiently large n, because 3n = 2√b2nσ2nKn/n∨√σ2nKn/n and ConditionSL4). Therefore, there exist constants C1 and c1 such that λ ≤ C1n−c1 andPr(∣∣∣Wˆ (yn1 )− W˜ (yn1 )∣∣∣ ≥ λ)< C1n−c1 , (A.11)uniformly over yn1 ∈ Sn,1.We need Theorem 3.2 of Chernozhukov et al. (2013).[Restatement of Theorem 3.2 in Chernozhukov et al. 
(2013).Suppose G is a class of measurable functions uniformly bounded by aconstant b such that there exist constants a ≥ e and v > 1 withsupQN(G, L2(Q), b) ≤ (a/)v for all 0 <  ≤ 1. Let σ2 be a constant suchthat supg∈GVar(g) ≤ σ2 ≤ b2. Assume that b2Kn ≤ nσ2, where Knspecified in Theorem 3.1 cited before. Then for any δ > 0, there exists a setSn,0 such that Pr(Sn,0) ≥ 1− 3/n and for any xn1 ∈ Sn,0 one can constructa random variable W0 satisfying the following conditions.(i) W0 =d ‖B‖G , where B has the same property with that in Theorem3.1 restated before.(ii)Pr(∣∣∣W˜ (xn1 )−W0∣∣∣ > ψn + δ)≤ Aγn(δ),where A is an absolute constant andψn =(σ2Kn/n)1/2+(b2σ2K3n/n)1/4,γn(δ) =(b2σ2K3n/n)1/4/δ + 1/n.Applying Theorem 3.2 to W˜ (yn1 ) and B = G(g), we can construct arandom variable W 0 = supg∈∆Kn |G(g)| which satisfies the inequality inTheorem 3.2 (ii). Since b2nK4n/n ≤ C1n−c1 in Condition SL4, there exists aλ such that λ ≤ C1n−c1 andPr(∣∣∣W˜ (yn1 )−W0∣∣∣ ≥ λ)< C1n−c, (A.12)uniformly over yn1 ∈ Sn,2 and Pr(Sn,2) ≥ 1 − 3/n. Combining (A.11) and(A.12) leads toPr(∣∣∣Wˆ (yn1 )−W0∣∣∣ ≥ λ)< C1n−c, (A.13)uniformly over yn1 ∈ Sn = Sn,1 ∩ Sn,2, Pr(Sn) ≥ 1− 5/n and λ ≤ C1n−c1 .140A.3. Proofs for Section 2.4Let cS1,n(α) be the conditional (1 − α)th quantile of the distribution ofsupj∈J ,y∈Yκ2∣∣∣Gˆn,j(y)∣∣∣ given {Yi}ni=1 = yn1 . We obtainPr(supg∈∆Kn|G(g)| ≤ cS1,n(α) + λ)= Pr(W 0 ≤ cS1,n(α) + λ)= Pr(Wˆ (yn1 ) +(W 0 − Wˆ (yn1 ))≤ cS1,n(α) + λ)≥ Pr(Wˆ (yn1 ) ≤ cS1,n(α))− Pr(∣∣∣W 0 − Wˆ (yn1 )∣∣∣ > λ)≥ 1− α− C1n−c1 ,where the first equality comes from W 0 =d supg∈∆Kn |G(g)|, the thirdinequality comes from the fact thatPr(X + Y ≤ c+ c′) ≥ Pr(X ≤ c)− Pr(Y > c′),and the last inequality comes from the definition of cS1,n(α) and (A.13).Recall that cn be the (1−α)th quantile of the distribution of supg∈∆Kn |G(g)|.We havePr(supg∈∆Kn|G(g)| < cn(α+ C1n−c1))= 1− α− C1n−c1 .Therefore,cS1n(α) + λ ≥ cn(α+ C1n−c1),uniformly over yn1 ∈ Sn such that Pr(Sn) ≥ 1− 5/n and λ ≤ C1n−c1 . 
ThusCondition SH2 (i) holds with τn = C1n−c1 , 2n = λ ≤ C1n−c1 and δ2n = 5/n.Condition SH2 (ii) can be proven analogously.(iii) Verify SH3: This follows Chernozhukov et al. (2013)Observe that ∣∣∣∣σˆj,n(y)σj,n(y)− 1∣∣∣∣ ≤∣∣∣∣∣σˆ2j,n(y)σ2j,n(y)− 1∣∣∣∣∣. (A.14)By the expressions of σˆ2j,n(y) and σ2j,n(y) in Chapter 2,σˆ2j,n(y)− σ2j,n(y)σ2j,n(y)= En(ϕ(j)n (Yi, y)σj,n(y))2−E(ϕ(j)n (Yi, y)σj,n(y))2141A.3. Proofs for Section 2.4+(En[ϕ(j)n (Yi, y)σj,n(y)])2−(E[ϕ(j)n (Yi, y)σj,n(y)])2,where En denotes the sample mean. Therefore,∣∣∣∣∣σˆ2j,n(y)σ2j,n(y)− 1∣∣∣∣∣≤ supg∈∆K2n∣∣En[g(Yi)2]−E[g(Y1)2]∣∣+ supg∈∆Kn∣∣∣(En [g(Y1)])2 − (E [g(Y1)])2∣∣∣ , (A.15)where ∆K2n = ∪Jj=1∆K2n,j and ∆K2n ={g2 : g ∈ ∆Kn,j}. By the definitionof σ2n in Condition SL4, g ≤ b2n. Thus for all g ∈ ∆K2n,E[g(Y1)2] ≤ b2nE [g(Y1)] ≤ b2nσ2n,where the last inequality comes from that E [g(Yi)] ≤ σ2n for g ∈ ∆K2n, byCondition SL1(ii) and the definition of σ2n in Condition SL4. In addition,the covering number of K2n also satisfies the polynomial bound (A.10). Inthe following, we will use a form of Talagrand’s Inequality stated and provenby Chernozhukov et al. (2013).[Restatement of Theorem A.4 in Chernozhukov et al. (2013)]Let ξ1,...,ξn be i.i.d. random variables taking values in a measurablespace (S,S). Suppose that G is a non-empty, pointwise measurable classof functions on S uniformly bounded by a constant b such that there existconstants a ≥ e and v > 1 with supQN(G, L2(Q), b) ≤ (a/)v for all0 <  ≤ 1. Let σ2 be a constant such that supg∈G var(g) ≤ σ2 ≤ b2.Ifb2v log(ab/σ) ≤ nσ2, then for all t ≤ nσ2/b2,Pr[supg∈G∣∣∣∣∣n∑i=1(g(ξi)−E [g(ξ1)])∣∣∣∣∣> A√nσ2 {t ∨ v log(ab/σ)}]≤ e−t,where A > 0 is an absolute constant.Let t = log n, b = bn and σ = σn. (Note that t ≤ nσ2n/b2n for large n.)Recall that Kn = Av(log n ∨ log(abn/σn)). 
We haveb2nKn/n ≤ b2nK4n/n ≤ C5n−c5 ≤ c2 ≤ σ2n,where the first and third inequality hold for sufficiently large n, the secondinequality follows from Condition SL4 and the last inequality follows fromCondition SL2. Since Kn ≥ Av log(abn/σn), b2nv log(abn/σn) ≤ σ2n and142A.3. Proofs for Section 2.4Condition SL1 holds, conditions of Theorem A.4 are satisfied. ApplyingTheorem A.4 to supg∈∆K2n∣∣En[g(Yi)2]−E[g(Y1)2]∣∣, we havePr[supg∈∆K2n∣∣En[g(Yi)2]−E[g(Y1)2]∣∣> A√nσ2n {log n ∨ v log(abn/σn)}/n]≤ 1/n.Notice that log n∨v log(abn/σn) ≤ v {log n ∨ log(abn/σn)} ≤ Kn, where thefirst inequality comes from v > 1 and the second from any A ≥ 1. We obtainPr[supg∈∆K2n∣∣En[g(Yi)2]−E[g(Y1)2]∣∣ >√σ2nKn/n]≤ 1/n. (A.16)The second term on the right hand side of (A.15) is bounded above bysupg∈∆Kn∣∣∣(En [g(Yi)])2 − (E [g(Y1)])2∣∣∣≤ 2bn supg∈∆Kn∣∣En[g(Yi)2]−E[g(Y1)2]∣∣ ,where the inequality comes from Condition SL1(ii).Again applying Theorem A.4 to supg∈∆Kn |En [g(Yi)]−E [g(Y1)]|, wehavePr[supg∈∆Kn|En [g(Yi)]−E [g(Y1)]| >√σ2nKn/n]≤ 1/n.ThusPr[supg∈∆Kn∣∣∣(En [g(Y1)])2 − (E [g(Y1)])2∣∣∣ > 2√b2nσ2nKn/n]≤ 1/n. (A.17)Let 3n = 2√b2nσ2nKn/n∨√σ2nKn/n. Then there exist constants C1 and c1such that 3n ≤ C1n−c1 . By letting δ3n = 2/n, Condition SH3 follows.(iv) Verify SH4: By Condition SH3 and SL2, we know that withprobability 1− δ3n. Thus we havecS1,n(γn) supj∈J ,y∈Yκ2σˆj,n(y) ≤ cS1,n(γn)(1 + 3n) supj∈J ,y∈Yκ2σˆj,n,s(y)≤ CcS1,n(γn),where C ≥ (1 + 3n)C2. Combining this with Condition SL4, we have143A.3. Proofs for Section 2.4√npnhJ−1n ptn ≤ M¯CcS1,n(γn). (A.18)Observe thatBn ≤√npnhJ−1n C4ptninfj∈J ,y∈Yκ2 σˆj,n(y)≤√npnhJ−1n C4ptn(1− 3n) infj∈J ,y∈Yκ2 σj,n(y)≤√npnhJ−1n C4ptn(1− 3n)c2≤ C6√npnhJ−1n ptn≤ C6M¯CcS1,n(γn),where the first inequality comes from Condition SL3, the second inequalitycomes from Condition SH3, the third follows from Condition SL2, the fourthholds for any constant C6 satisfying C6 ≥ C4/((1 − 3n)c2), the last comesfrom (A.18). 
Q.E.D.

Lemmas

The following three lemmas will be used in the proofs of Propositions 5 and 6. Lemma A1(i) and Lemma A3 will be used in the proof of Proposition 5. Lemma A1(ii) will be used in the proof of Proposition 6. Lemma A2 will be used in the proof of Lemma A3.

Lemma A1. (i) Suppose that H0 is true, or Condition A1 holds. If Condition SL5 holds,

Pr(Ŷn,κ1 ⊆ Yκ2) ≥ 1 − O(n^{−c0}),

for some c0 > 0.

(ii) Suppose H0 is true, or Condition A1 holds. If Condition SL5 holds and

0 < dn < min(γh, γp, (1 − γp − (J − 1)γh)/2),

then

Pr(Yn,L,c ⊆ Ŷn,κ1) ≥ 1 − O(n^{−c′0}),

for some c′0 > 0.

Proof of Lemma A1.

(i) Define

P̃−j(y) ≡ (1/n) Σ_{i=1}^n 1{yj − wj ≤ Yi,j < yj, |Yi,k − yk| ≤ hn/2, all k ≠ j},
P̃+j(y) ≡ (1/n) Σ_{i=1}^n 1{yj ≤ Yi,j ≤ yj + wj, |Yi,k − yk| ≤ hn/2, all k ≠ j},
P̃j(y) = P̃−j(y) + P̃+j(y), wj = (sup_y Yj − inf_y Yj)/H,

and

Ỹκ1 = {y ∈ Y : P̃−j(y)/P̃j(y) > κ1, P̃+j(y)/P̃j(y) > κ1, all j}.

We know that Ŷn,κ1 ⊆ Ỹκ1 for all n since wj ≥ ŵj. It suffices to show that

Pr(Ỹκ1 ⊆ Yκ2) ≥ 1 − O(n^{−c0}).

By the definitions of Ỹκ1 and Yκ2 and Markov's inequality,

Pr(Ỹκ1 ⊆ Yκ2)
≥ 1 − Σ_{j=1}^J Pr(|P̃−j(y)/P̃j(y) − P−j(y)/Pj(y)| > (κ1 − κ2)) − Σ_{j=1}^J Pr(|P̃+j(y)/P̃j(y) − P+j(y)/Pj(y)| > (κ1 − κ2))
≥ 1 − Σ_{j=1}^J E[(P̃−j(y)/P̃j(y) − P−j(y)/Pj(y))^2]/(κ1 − κ2)^2 − Σ_{j=1}^J E[(P̃+j(y)/P̃j(y) − P+j(y)/Pj(y))^2]/(κ1 − κ2)^2
= 1 − O(p_n^4 + h_n^4 + 1/(n pn hn^{J−1})) under H0, and 1 − O(p_n^2 + h_n^2 + 1/(n pn hn^{J−1})) under Condition A1,
= 1 − O(n^{−c0}).

The last equality follows from Condition SL5, for a properly chosen c0 depending on the rates γh and γp.

Part (ii) can be proven in the same way. We have

Pr(Yn,L,c ⊆ Ŷn,κ1) ≥ 1 − O((p_n^4 + h_n^4 + 1/(n pn hn^{J−1})) n^{2d}) under H0, and 1 − O((p_n^2 + h_n^2 + 1/(n pn hn^{J−1})) n^{2d}) under Condition A1.

The desired result follows from the restriction imposed on d by the condition in part (ii) of Lemma A1. Q.E.D.

The following Condition A2 will be used in Lemma A2.

Condition A2. The alternative hypothesis H1 is true, and the density fY(y) > 0 for all y ∈ Yκ1, where

Yκ1 = {y ∈ Y : P−j(y)/Pj(y) > κ1, P+j(y)/Pj(y) > κ1, all j ∈ J}.

Lemma A2.
Suppose H0 is true or Conditions A1 and A2 hold. IfConditions SL2-SL5 hold, given any yn1 ∈ Sn, where Sn is defined whenverifying Condition SH2 and Pr(Sn) > 1− 5/n, we havePr(∣∣∣∣∣supj∈J ,y∈Yˆn,κ2Gˆn,j(y)− supj∈J ,y∈Yκ2Gˆn,j(y)∣∣∣∣∣> rn)≤ νn,for sufficiently large n, where rn and νn are bounded above by some Cn−c.Proof of Lemma A2.LetYL,n,κ2 = {y ∈ Y :P−j (y)Pj(y)> κ2 + n−d,P+j (y)Pj(y)> κ2 + n−d, all j},YU,n,κ2 = {y ∈ Y :P−j (y)Pj(y)> κ2 − n−d,P+j (y)Pj(y)> κ2 − n−d, all j},where d < min (γp, γh, (1− γp − (J − 1)γh)/2). By similar arguments as inthe proof of Lemma A1, we havePr(YL,n,κ2 ⊂ Yˆn,κ2 ⊂ YU,n,κ2)≥ 1−O(n−k),where k = min (γp, γh, (1− γp − (J − 1)γh)/2)− d. By definition, YL,n,κ2 ⊂Yκ2 ⊂ YU,n,κ2 . Therefore, for some rn → 0 (rn will be specified later),146A.3. Proofs for Section 2.4Pr(∣∣∣∣∣supj∈J ,y∈Yˆn,κ2Gˆn,j(y)− supj∈J ,y∈Yκ2Gˆn,j(y)∣∣∣∣∣> rn)≤ Pr(∣∣∣∣∣supj∈J ,y∈YL,n,κ2Gˆn,j(y)− supj∈J ,y∈YU,n,κ2Gˆn,j(y)∣∣∣∣∣> rn)+O(n−k). (A.19)By (A.11) (with a slight modification of the index set), we havePr(∣∣∣∣∣supj∈J ,y∈YL,n,κ2Gˆn,j(y)− supj∈J ,y∈YL,n,κ2G˜n,j(y)∣∣∣∣∣≥ rn/3)< C1n−c1 , andPr(∣∣∣∣∣supj∈J ,y∈YU,n,κ2Gˆn,j(y)− supj∈J ,y∈YU,n,κ2G˜n,j(y)∣∣∣∣∣≥ rn/3)< C1n−c1 ,uniformly over yn1 ∈ Sn and rn/3 ≤ C1n−c1 . Given yn1 ∈ Sn, we havePr(∣∣∣∣∣supj∈J ,y∈YL,n,κ2Gˆn,j(y)− supj∈J ,y∈YU,n,κ2Gˆn,j(y)∣∣∣∣∣> rn)≤ Pr(∣∣∣∣ maxj,y∈y∈YL,n,κ2G˜n,j(y)− maxj,y∈YU,n,κ2G˜n,j(y)∣∣∣∣ ≥ rn/3)+ Pr(∣∣∣∣∣supj∈J ,y∈YU,n,κ2Gˆn,j(y)− supj∈J ,y∈YU,n,κ2G˜n,j(y)∣∣∣∣∣≥ rn/3)+ Pr(∣∣∣∣∣supj∈J ,y∈YL,n,κ2Gˆn,j(y)− supj∈J ,y∈YL,n,κ2G˜n,j(y)∣∣∣∣∣≥ rn/3)≤ Pr(∣∣∣∣∣supj∈J ,y∈YL,n,κ2G˜n,j(y)− supj∈J ,y∈YU,n,κ2G˜n,j(y)∣∣∣∣∣≥ rn/3)+C1n−c1 . (A.20)By Condition A2, for any y′ ∈ YU,n,κ2\YL,n,κ2 , there exists some y′′ ∈ YL,n,κ2such that |y′ − y′′| < , for any , when n is sufficiently large. HencePr(∣∣∣∣∣supj∈J ,y∈YU,n,κ2G˜n,j(y)− supj∈J ,y∈YL,n,κ2G˜n,j(y)∣∣∣∣∣≥ rn/3)≤ Pr supj∈J ,|y1−y0|<,y1,y0∈YU,n,κ2∣∣∣G˜n,j(y0)− G˜n,j(y1)∣∣∣ ≥ rn/3 .147A.3. 
Proofs for Section 2.4We can write the process {G˜n,j(y) : y ∈ YU,n,κ2 , j ∈ J } as G˜ = {G˜(g) : g ∈∆KU,n}, whereG˜(g) =1√nn∑i=1ξi[g(Yi)−1nn∑i=1g(Yi)], g ∈ ∆KU,n,and KU,n = ∪Jj=1KU,n,j , where∆KU,n,j =√pnhJ−1n ϕ(j)n (·, y)σn,j(y): y ∈ YU,n,κ2.For fixed yn1 , G˜ is a centred Gaussian process and is sub-Gaussian withrespect to standard deviation semimetric σ such that σ(g1, g0) = (E[G˜(g1)−G˜(g2)]2)1/2 . In addition, by the mean square continuous sample path ofG˜, it must have σ(g1, g0) → 0 as |y1 − y0| → 0, where g1, g0 ∈ ∆KU,n,j ,y1, y0 ∈ YU,n,κ2, andgλ : t 7−→√pnhJ−1n ϕ(j)n (t, yλ)σn,j(yλ), λ = 0, 1.Therefore, for every δ > 0, we obtainPr(supj∈J ,|y1−y0|<,y1,y0∈YU,n,κ2∣∣∣G˜n,j(y0)− G˜n,j(y1)∣∣∣ ≥ rn/3)≤ Pr(supσ(g1,g0)<δ,g0,g1∈∆KU,n∣∣∣G˜(g1)− G˜(g0)∣∣∣ ≥ rn/3)≤ 3E[supσ(g1,g0)<δ,g0,g1∈∆KU,n∣∣∣G˜(g1)− G˜(g0)∣∣∣]/rn≤3Krn∫ δ0√logD(ε, σ)dε. (A.21)The second inequality comes from Markov’s inequality and the last fromCorollary 2.2.8 of Van der Vart and Wellner (2000). D(ε, σ) is the packingnumber of the class ∆Kn with radius ε and standard deviation semimetric.By Condition SL1 and the fact that D(ε, σ) < N(ε/2, σ), we havelogD(ε, σ) < K1 log(1/ε) for some constant K1. For sufficiently small δ, weobtain∫ δ0√logD(ε, σ)dε ≤ K2∫ δ0log(1/ε)dε ≤ K2(δ(log(1/δ + 1))),148A.3. Proofs for Section 2.4for some constant K2. Now let δ = 1/n and rn = n−r. We havenc1rn∫ δ0√logD(ε, σ)dε ≤log(n+ 1)n1−c1−r1.The right hand side of the above inequality goes to zero when n → ∞,as long as c1 + r < 1. As a result, 3Krn∫ δ0√logD(ε, σ)dε < C1n−c1 forsufficiently large n.Lemma A2 follows by combining (A.19), (A.20) and (A.21) and by lettingνn equal some some constant times n−(c1∧k). Q.E.D.Lemma A3. 
Suppose Conditions SL1-SL5 hold, there exist numbers cand C satisfyingPr(cˆS1,n(α) ≤ cS1,n(α+ νn)− rn)≤ δ5n,for some positive sequence rn, vn and δ5n bounded above by Cn−c.Proof of Lemma A3.For any yn1 ∈ Sn (recall Pr(Sn) > 1− 5/n),Pr(supj∈J ,y∈Yκ2∣∣∣Gˆn,j(y)∣∣∣ ≤ cˆS1,n(α) + rn)≥ Pr∣∣∣supj∈J ,y∈Yκ2∣∣∣Gˆn,j(y)∣∣∣− supj∈J ,y∈Yˆn,κ2∣∣∣Gˆn,j(y)∣∣∣∣∣∣+ supj∈J ,y∈Yˆn,κ2∣∣∣Gˆn,j(y)∣∣∣ ≤ cˆS1,n(α) + rn≥ Pr(supj∈J ,y∈Yˆn,κ2∣∣∣Gˆn,j(y)∣∣∣ ≤ cˆS1,n(α))−Pr(∣∣∣∣∣supj∈J ,y∈Yκ2∣∣∣Gˆn,j(y)∣∣∣− supj∈J ,y∈Yˆn,κ2∣∣∣Gˆn,j(y)∣∣∣∣∣∣∣∣> rn)≥ 1− α− νn,where the last inequality comes from the definition of cˆS1,n(α) and LemmaA2. Meanwhile, by definition of cS1,n(·),Pr(supj∈J ,y∈Yκ2∣∣∣Gˆn,j(y)∣∣∣ ≤ cS1,n(α+ νn))= 1− α− νn.Thus we obtaincˆS1,n(α) + rn ≥ cS1,n(α+ νn),149A.3. Proofs for Section 2.4for any yn1 ∈ Sn. Lemma A3 follows by letting δ5n = 5/n. Q.E.D.Proofs of Proposition 5 and 6We prove Proposition 5 and 6 using Conditions SL1-SL4 and LemmasA1 and A3.Proof of Proposition 5.Observe thatPr(∆n < c˜S1,n)≥ (1) Pr supj∈J ,y∈Yκ2∣∣∣∣∣∣√npnhJ−1n ∆(j)n (y)σˆj,n(y)∣∣∣∣∣∣< c˜S1,n− µn≥ (2) Pr(supj∈J ,y∈Yκ2|Zj,n(y)|σj,n(y)σˆj,n(y)+Bn < c˜S1,n)− µn≥ (3) Pr(supj∈J ,y∈Yκ2|Zj,n(y)|σj,n(y)σˆj,n(y)+Bn < cS1,n(νn)− rn)− µn − δ5n≥ (4) Pr(supj∈J ,y∈Yκ2|Zj,n(y)|σj,n(y)σˆj,n(y)< cS1,n(α+ νn)− rn)−µn − δ5n − δ4n≥ (5) Pr(supg∈∆Kn |Zn(g)| < (1− 3n)cn(α+ τn + νn)−(1− 3n)rn − (1− 3n)2n − (1− 3n)rn)−µn − δ5n − δ4n − δ2n − δ3n≥ (6) Pr(supg∈∆Kn |G(g)| < (1− 3n)cn(α+ τn + νn)−(1− 3n)rn − 1n − (1− 3n)2n)−µn − δn≥ (7) Pr(supg∈∆Kn|G(g)| < cn(α+ τn + νn))− supw∈RPr(∣∣∣∣∣supg∈∆Kn|G(g)| − w∣∣∣∣∣< n)− µn − δn≥ (8)1− α− τn − νn −AnE∣∣∣∣∣supg∈∆Kn|G(g)|∣∣∣∣∣− µn − δn≥ (9)1− α− Cn−c,where δn = δ1n + δ2n + δ3n + δ4n + δ5n, and n = cn(α+ τn + νn)3n + 1n +150A.3. Proofs for Section 2.4(1− 3n)2n + (1− 3n)rn. Inequality (1) comes from Lemma A1. Inequality(2) comes fromsupj∈J ,y∈Yκ2∣∣∣∣∣∣√npnhJ−1n ∆(j)n (y)σˆj,n(y)∣∣∣∣∣∣≤ supj∈J ,y∈Yκ2|Zj,n(y)|σj,n(y)σˆj,n(y)+Bn,under H0. (This is because ∆n(y) = 0 for all j under H0.) 
Inequality(3) comes from Lemma A3. Inequality (4) follows c1,n ≡ c1,n(α) + c′1,n,Pr(X+Y ≤ c+c′) ≥ Pr(X ≤ c)−Pr(Y > c′) and Condition SH4. Inequality(5) comes from Condition SH2, SH3 and the fact that Pr(X < Y ) ≥ Pr(X <a) − Pr(Y < a). Inequality (6) comes from Condition SH1(ii). Inequality(7) uses the fact thatPr(X < a)−Pr(X < a−b) ≤ Pr(|X−x| < b) for a−b < x < a, a > 0, b > 0.Inequality (8) follows by applying Corollary 2.1 of Chernozhukov et al.(2013)to the second term on the right hand side of inequality (7). Inequality (9)uses Condition SH1(i) together with Markov’s inequality to bound c1,n(α)and supg∈∆Kn |G(g)|. The argument from inequality (4) to the end areadapted from the proof of Proposition 4.1 in Chernozhukov et al.(2013).Q.E.D.Proof of Proposition 6. Recall the (2.19) and letBj,n(y) =√npnhJ−1n(E[∆(j)n (y)]−∆(j)(y))σˆj,n(y).Notice thatsupj∈J ,y∈Yκ2∣∣∣∣∣∣√npnhJ−1n ∆(j)n (y)σˆj,n(y)∣∣∣∣∣∣= supj∈J ,y∈Yκ2∣∣∣∣∣∣|Zj,n(y)|σj,n(y)σˆj,n(y)+Bj,n(y) +√npnhJ−1n ∆(j)(y)σˆj,n(y)∣∣∣∣∣∣.Therefore,Pr(∆n > c˜S1,n)151A.3. Proofs for Section 2.4≥ Pr supj∈J ,y∈Yn,L,c∣∣∣∣∣∣√npnhJ−1n ∆(j)n (y)σˆj,n(y)∣∣∣∣∣∣> c˜S1,n− o(1)= Pr supj∈J ,y∈Yn,L,c∣∣∣∣∣∣|Zj,n(y)|σj,n(y)σˆj,n(y)+Bj,n(y) +√npnhJ−1n ∆(j)(y)σˆj,n(y)∣∣∣∣∣∣> c˜S1,n−o(1)≥ Pr|Zj∗,n(y∗)|σj∗,n(y∗)σˆj∗,n(y∗)+Bj∗,n(y∗) +√npnhJ−1n δσˆj∗,n(y∗)> c˜S1,n− o(1)≥ Pr(|Zj∗,n(y∗)|σj∗,n(y∗)σˆj∗,n(y∗)+Bj∗,n(y∗) +√npnhJ−1n δσˆj∗,n(y∗)> cn(α+ τn) + Ccn(γn + τn)− 6n)− o(1),where (j∗, y∗) ∈ arg supj∈J ,y∈Yn,L,c ∆(j)(y). The first inequality comes fromLemma A1(ii) and the Condition (iii) of Proposition 6. The last inequalitycomes from Condition SH2. By Condition (i) of Proposition 6, ∆(j∗)(y∗) =δ > 0. Then√npnhJ−1n δσˆj∗,n(y∗)= Op(√npnhJ−1n). Meanwhile cn(α+τn) ≤C1 lognα+τnand cn(γn + τn) ≤C1 lognγn+τnby Markov’s inequality and Condition SH1(i).Proposition 6 follows if npnhJ−1n γn/ log n→∞. In particular, when C = 0,Proposition 6 follows if npnhJ−1n / log n→∞. 
Q.E.D.

Appendix B

Appendix for Chapter 3

B.1 A Discussion of the Alternative Equilibrium Notion

To analyze the equilibrium-characterizing equations given by the second notion of equilibrium, we reproduce (3.4) as follows. An equilibrium is a pair of cutoffs in the value of private information (u∗1(x), u∗2(x)) such that the following holds:

pi1(1, 0, x1) + ∆1(x1) Pr(U2i ≤ u∗2(x) | Xi = x, U1i = u∗1(x)) = u∗1(x),
pi2(1, 0, x2) + ∆2(x2) Pr(U1i ≤ u∗1(x) | Xi = x, U2i = u∗2(x)) = u∗2(x).

Moreover, let

G2(u∗2, x, u∗1) = Pr(U2i ≤ u∗2 | Xi = x, U1i = u∗1),
G1(u∗1, x, u∗2) = Pr(U1i ≤ u∗1 | Xi = x, U2i = u∗2),

and

ψ(u∗, x) = [ u∗1 − pi1(1, 0, x1) − ∆1(x1) G2(u∗2, x, u∗1) ; u∗2 − pi2(1, 0, x2) − ∆2(x2) G1(u∗1, x, u∗2) ].

The system of equilibrium-characterizing equations is

ψ(u∗, x) = 0.

This system of equations corresponds to (3.7). Definitions 1, 2 and Assumption 3 can be modified by replacing ϕ and the vector σ with ψ and u∗ respectively. The observed conditional choice probabilities can be expressed as

Qk(1|1, x) = Σ_{m∈Υ(x)} pim(x) Pr(Uki ≤ u∗(m)k(x) | Xi = x, U−ki ≤ u∗(m)−k(x)),

for k = 1, 2. This corresponds to (3.9) if the following Assumption 6 holds. Hence the results of this paper follow under this notion of equilibrium.

Assumption 6. For any (a, b) ≠ (a′, b′) satisfying

Pr(Uki ≤ a | Xi = x, U−ki ≤ b) ∈ (0, 1), and
Pr(Uki ≤ a′ | Xi = x, U−ki ≤ b′) ∈ (0, 1),

for k = 1, 2, we have

Pr(Uki ≤ a | Xi = x, U−ki ≤ b) ≠ Pr(Uki ≤ a′ | Xi = x, U−ki ≤ b′).

Assumption 6 implies that a jump in the equilibrium cutoffs (u∗1, u∗2) must lead to a jump in the conditional belief Pr(Uki ≤ u∗k | Xi = x, U−ki ≤ u∗−k) for k = 1, 2, as long as the latent conditional belief takes a value strictly between 0 and 1.

B.2 Mathematical Proofs

In this section, we present proofs for Propositions 1 to 3, Lemma 1 and Corollaries 1 and 2. The proof of Lemma 1 needs Lemma 2, which is also stated below.

Proof of Proposition 1.

We only need to consider Category (C1), because Sub-category (C2-1) can be viewed as a case in Category (C1).
By (3.9), it suffices to show that the M in Definition 1 equals 1. By (C1), there exists a unique function q(x) such that for any (σ, x) ∈ (0, 1)^2 × int(X) satisfying ϕ(σ, x) = 0, σ = q(x). We need to show that q(x) is twice continuously differentiable on int(X). Under Assumptions 1-3, applying the Implicit Function Theorem at an arbitrary point (σ0, x0) ∈ (0, 1)^2 × int(X), there are neighbourhoods A of σ0 and B of x0 on which (3.7) uniquely defines σ as a function of x. That is, there is a function ξ : B → A such that ϕ(ξ(x), x) = 0 for all x ∈ B; for each x ∈ B, ξ(x) is the unique solution to (3.7) lying in A, and ξ(x) is twice continuously differentiable on B. Because (σ0, x0) is arbitrary and q(x) is the only solution to (3.7) by (C1), we must have q(x) = ξ(x) for all x ∈ int(X). Thus q(x) is twice continuously differentiable on a neighbourhood of every x ∈ int(X). This immediately leads to the desired result. Q.E.D.

Proof of Lemma 1.

Let S = {x ∈ X : (3.7) admits a unique solution in σ}. Without loss of generality, assume that S ∩ B1 ≠ ∅. Denote S ∩ B1 = S1. By construction, Υ(S1) = {1}. By Lemma 2 (stated and proven below), there exists some h ∈ {1, ..., M} such that B1 ∩ Bh ≠ ∅. Denote Ch = B1 ∩ Bh. Without loss of generality, S̄1 ∩ C̄h ≠ ∅. (Otherwise, there will be another k ∈ {1, ..., M} such that B1 ∩ Bk ≠ ∅ and S̄1 ∩ C̄k ≠ ∅, where Ck = B1 ∩ Bk; all we need is that B1\S1 is not empty, which holds by the conclusion of Lemma 2(ii).) Form a subset Dh of Ch such that S̄1 ∩ D̄h ≠ ∅ and Υ(Dh) is well defined. In this case, {1, h} ⊂ Υ(Dh). Therefore, S1 and Dh correspond to C1 and C2 in Assumption 4H, with Υ(S1) = {1} while Υ(Dh) ⊃ {1, h}. Q.E.D.

The proof of Lemma 1 needs the following Lemma 2.

Lemma 2. If the game is in Category (C2) or (C3), then the following holds. (i) The constant M in Definition 1 is greater than or equal to 2.
(ii) Suppose that Assumption 5 holds. Then for any m ∈ {1, ..., M}, there exists a k ∈ {1, ..., M} with k ≠ m such that Bm ∩ Bk ≠ ∅.

Proof of Lemma 2.

(i) When the game is in (C2-2) or (C3), there exists some xM ∈ X such that ϕ(σ1, xM) = ϕ(σ2, xM) = 0 for some σ1 ≠ σ2. If M = 1, there is a function q : X → (0, 1)^2 such that σ = q(x) for all (σ, x) ∈ X × (0, 1)^2 satisfying ϕ(σ, x) = 0. But this means σ1 = q(xM) and σ2 = q(xM), which cannot happen.

(ii) It suffices to show that for every m ∈ {1, ..., M}, there exists some x ∈ Bm such that ϕ(σ, x) = 0 for more than one σ ∈ (0, 1)^2. Suppose this is not true; then we have a Bm such that ϕ(σ, x) = 0 has a unique solution in σ for all x ∈ Bm. Consider an arbitrary l ∈ {j ≠ m : B̄m ∩ B̄j ≠ ∅} and let x0 be an arbitrary point in B̄m ∩ B̄l. By assumption, Bm ∩ Bl = ∅ for all l ≠ m. Thus x0 ∈ B̄m\Bm. For such an l, we have two cases:

Case 1. lim_{x∈Bm, x→x0} qm(x) − lim_{x∈Bl, x→x0} ql(x) = 0 for all x0 ∈ B̄m ∩ B̄l.

Then we can combine Bm and Bl to reduce the number M, which contradicts Definition 1, in which M is the smallest such number. In particular, let B′m = int(Bm ∪ Bl ∪ (B̄m ∩ B̄l)) and

q′m(x) = qm(x) for x ∈ Bm; ql(x) for x ∈ Bl; lim_{x∈Bm, x→x0} qm(x) for x0 ∈ B̄m ∩ B̄l.

Also, by Assumption 5, we know that ∇σϕ(σ0, x0) is invertible, where σ0 = lim_{x∈Bm, x→x0} qm(x) = lim_{x∈Bl, x→x0} ql(x). We can then apply the Implicit Function Theorem to obtain that there are neighbourhoods A of σ0 and B of x0 and a unique function ς : B → A such that ς(x0) = σ0; for each x ∈ B, ς(x) is the unique solution to (3.7) lying in A; ϕ(ς(x), x) = 0 for x ∈ B; and ς(x) is twice continuously differentiable at x0. Therefore, q′m(x) must coincide with ς(x) for x ∈ B, and q′m(x) is twice continuously differentiable.

Case 2.
|lim_{x→x_0, x∈B_m} q_m(x) − lim_{x→x_0, x∈B_l} q_l(x)| = δ > 0 for some x_0 ∈ B̄_m ∩ B̄_l.

By continuity of ϕ : (0, 1)² × X → R², we have

ϕ(σ_m, x_0) = ϕ(σ_l, x_0) = 0, where σ_m = lim_{x→x_0, x∈B_m} q_m(x) and σ_l = lim_{x→x_0, x∈B_l} q_l(x).

There must be an l satisfying the following: there is no l′ such that every component of σ_l′ − σ_m has the same sign as the corresponding component of σ_l − σ_m and |σ_l′ − σ_m| < |σ_l − σ_m|. We focus on such an l. By Assumption 5(ii), the derivative ∇_σ ϕ(σ_l, x_0) is invertible. By the Implicit Function Theorem, for any neighbourhood A of σ_l, there is a neighbourhood B of x_0 and a function ξ : B → A such that ξ(x_0) = σ_l, ϕ(ξ(x), x) = 0 for x ∈ B, and ξ(x) is continuous at x_0. Therefore, for any ε > 0, there exists an η satisfying

|x − x_0| < η ⟹ |ξ(x) − σ_l| < ε.

Then, for any ε, all x′ ∈ B_m ∩ B(x_0, η) ∩ B satisfy ϕ(ξ(x′), x′) = 0 and |ξ(x′) − σ_l| < ε, and thus |ξ(x′) − σ_m| > δ − ε.

On the other hand, by continuity of q_m, for any ε′ > 0, there exists an η′ satisfying

|x − x_0| < η′ ⟹ |q_m(x) − σ_m| < ε′.

By choosing ε and ε′ sufficiently small, we have ξ(x′) ≠ q_m(x′) for all x′ ∈ B_m ∩ B(x_0, η ∧ η′) ∩ B. This contradicts the assumption that ϕ(σ, x) = 0 has a unique solution in σ for all x ∈ B_m. Q.E.D.

Proof of Proposition 2.

By Assumption 4H, there exists an l ∈ {1, ..., M} such that for some C_1, C_2 ⊂ B_l with C_1 ∩ C_2 = ∅ and C̄_1 ∩ C̄_2 ≠ ∅, we have Υ(C_1) ≠ Υ(C_2). Since C_1, C_2 ⊂ B_l, we have l ∈ Υ(C_1) and l ∈ Υ(C_2). Consider an arbitrary x_d ∈ C̄_1 ∩ C̄_2. By (3.9),

lim_{x→x_d, x∈C_1} Q(1|1, x) = ∑_{h∈Υ(C_1)} lim_{x→x_d, x∈C_1} π_h(x) · lim_{x→x_d, x∈C_1} σ^(h)(1|1, x),

lim_{x→x_d, x∈C_2} Q(1|1, x) = ∑_{j∈Υ(C_2)} lim_{x→x_d, x∈C_2} π_j(x) · lim_{x→x_d, x∈C_2} σ^(j)(1|1, x).

Therefore,

lim_{x→x_d, x∈C_1} Q(1|1, x) − lim_{x→x_d, x∈C_2} Q(1|1, x)
  = ∑_{l∈Υ(C_1)∩Υ(C_2)} ( lim_{x→x_d, x∈C_1} π_l(x) − lim_{x→x_d, x∈C_2} π_l(x) ) σ^(l)(1|1, x_d)
  + ∑_{h∈Υ(C_1)\Υ(C_2)} lim_{x→x_d, x∈C_1} π_h(x) · lim_{x→x_d, x∈C_1} σ^(h)(1|1, x)
  − ∑_{j∈Υ(C_2)\Υ(C_1)} lim_{x→x_d, x∈C_2} π_j(x) · lim_{x→x_d, x∈C_2} σ^(j)(1|1, x),    (B.1)

where

σ^(l)(1|1, x_d) ≠ lim_{x→x_d, x∈C_1} σ^(h)(1|1, x),
σ^(l)(1|1, x_d) ≠ lim_{x→x_d, x∈C_2} σ^(j)(1|1, x),

for any l ∈ Υ(C_1) ∩ Υ(C_2), h ∈ Υ(C_1)\Υ(C_2), j ∈ Υ(C_2)\Υ(C_1), and for any x_d ∈ C̄_1 ∩ C̄_2.
Suppose to the contrary that, for example, σ^(l)(1|1, x_d) = lim_{x→x_d, x∈C_2} σ^(h)(1|1, x) for some x_d ∈ C̄_1 ∩ C̄_2. This means lim_{x→x_d, x∈C_1} q_l(x) = lim_{x→x_d, x∈C_2} q_h(x) for some x_d ∈ C̄_1 ∩ C̄_2. Note that x_d ∈ B_l and x_d ∈ B̄_h, so we have q_l(x′) = q_h(x′) for some x′ ∈ B(x_d, r) ∩ B_l ∩ B_h, where B(x_d, r) is a ball centred at x_d with radius r. Here B(x_d, r) ∩ B_l ∩ B_h ≠ ∅ because x_d ∈ C̄_1 ∩ C̄_2, C_1, C_2 ⊂ B_l, and C_2 ⊂ B_h. This contradicts Remark 1 below Definition 1.

Furthermore, observe that the set

{ ( lim_{x→x_d, x∈C_1} π(x), lim_{x→x_d, x∈C_2} π(x) ) : π ∈ C_{E1}(Υ) }

is

{ ( lim_{x→x_d, x∈C_1} π(x), lim_{x→x_d, x∈C_2} π(x) ) : π ∈ C(Υ) and (B.1) = 0 },

which has zero measure in

{ ( lim_{x→x_d, x∈C_1} π(x), lim_{x→x_d, x∈C_2} π(x) ) : π ∈ C(Υ) },

for every x_d ∈ C̄_1 ∩ C̄_2 and for all C_1, C_2, l satisfying Assumption 4H. The desired result follows. Q.E.D.

Proof of Corollary 1.

Consider a subset D satisfying the property in Definition 3 and let x_d be an arbitrary point in D̄_1 ∩ D̄_2. Since Υ(D) is defined, i.e. the equilibrium index set is the same within D, we have x_d ∈ B_h for all h ∈ Υ(D). By (3.9),

lim_{x→x_d, x∈D_1} Q(1|1, x) = ∑_{h∈Υ(D)} lim_{x→x_d, x∈D_1} π_h(x) · lim_{x→x_d, x∈D_1} σ^(h)(1|1, x),

lim_{x→x_d, x∈D_2} Q(1|1, x) = ∑_{h∈Υ(D)} lim_{x→x_d, x∈D_2} π_h(x) · lim_{x→x_d, x∈D_2} σ^(h)(1|1, x),

where lim_{x→x_d, x∈D_1} σ^(h)(1|1, x) = lim_{x→x_d, x∈D_2} σ^(h)(1|1, x) = σ^(h)(1|1, x_d) for all h ∈ Υ(D). Thus

| lim_{x→x_d, x∈D_1} Q(1|1, x) − lim_{x→x_d, x∈D_2} Q(1|1, x) |
  = | ∑_{h∈Υ(D)} ( lim_{x→x_d, x∈D_1} π_h(x) − lim_{x→x_d, x∈D_2} π_h(x) ) σ^(h)(1|1, x_d) |
  = | ∑_{h∈Υ_1(D)} ( lim_{x→x_d, x∈D_1} π_h(x) − lim_{x→x_d, x∈D_2} π_h(x) ) σ^(h)(1|1, x_d) |,    (B.2)

where

Υ_1(D) = { h ∈ Υ(D) : | lim_{x→x_d, x∈D_1} π_h(x) − lim_{x′→x_d, x′∈D_2} π_h(x′) | > 0 }.

Let C_F(Υ) be the subset of C(Υ) that satisfies the property in Definition 3. Furthermore, observe that the set

{ ( lim_{x→x_d, x∈D_1} π(x), lim_{x→x_d, x∈D_2} π(x) ) : π ∈ C_{E2}(Υ) }

is

{ ( lim_{x→x_d, x∈D_1} π(x), lim_{x→x_d, x∈D_2} π(x) ) : π ∈ C_F(Υ) and (B.2) = 0 },

which has zero measure in

{ ( lim_{x→x_d, x∈D_1} π(x), lim_{x→x_d, x∈D_2} π(x) ) : π ∈ C_F(Υ) },

for every x_d ∈ D̄_1 ∩ D̄_2 in Definition 3. The desired result follows.
Q.E.D.

Proof of Corollary 2.

When the game is in sub-category (C2-2), there exist two subsets C_1 and C_2 of X with C̄_1 ∩ C̄_2 ≠ ∅, and indices m*_1 ≠ m*_2, such that π_{m*_1}(x) = 1 for x ∈ C_1 and π_{m*_2}(x) = 1 for x ∈ C_2. For any x_d ∈ C̄_1 ∩ C̄_2, there are two cases:

Case 1. There is an r > 0 such that Υ(B(x_d, r)) is defined, where B(x_d, r) is an open ball centred at x_d with radius r. As the equilibrium index set does not change on B(x_d, r), we have B(x_d, r) ⊂ B_m whenever m ∈ Υ(B(x_d, r)). Therefore,

lim_{x→x_d, x∈C_1∩B(x_d, r)} Q(1|1, x) = lim_{x→x_d, x∈C_1∩B(x_d, r)} σ^(m*_1)(1|1, x) = σ^(m*_1)(1|1, x_d),

where the second equality follows from B(x_d, r) ⊂ B_m for m ∈ Υ(B(x_d, r)). Similarly,

lim_{x→x_d, x∈C_2∩B(x_d, r)} Q(1|1, x) = lim_{x→x_d, x∈C_2∩B(x_d, r)} σ^(m*_2)(1|1, x) = σ^(m*_2)(1|1, x_d).

Since σ^(m*_1)(1|1, x_d) ≠ σ^(m*_2)(1|1, x_d) (otherwise Remark 1 of Definition 1 would be violated), Q(1|1, x) has a jump at x_d.

Case 2. There is no r > 0 such that Υ(B(x_d, r)) is defined; that is, the equilibrium index set changes at x_d. Then there exist disjoint subsets D_1, D_2 such that x_d ∈ D̄_1 ∩ D̄_2, D_1 ⊂ C_1, D_2 ⊂ C_2, Υ(D_1) and Υ(D_2) are defined, and Υ(D_1) ≠ Υ(D_2). We have

lim_{x→x_d, x∈D_1} Q(1|1, x) = lim_{x→x_d, x∈D_1} σ^(m*_1)(1|1, x),

lim_{x→x_d, x∈D_2} Q(1|1, x) = lim_{x→x_d, x∈D_2} σ^(m*_2)(1|1, x).

Since the choice of x_d ∈ C̄_1 ∩ C̄_2 is arbitrary, we must have

lim_{x→x_d, x∈D_1} σ^(m*_1)(1|1, x) ≠ lim_{x→x_d, x∈D_2} σ^(m*_2)(1|1, x)

for some x_d; otherwise B_{m*_1} and B_{m*_2} could be combined, so that M would not be the smallest constant in Definition 1. Therefore, Q(1|1, x) has a jump at x_d. Q.E.D.

Proof of Proposition 3.

(i) By (3.15), we have

Q(1|1, x) = ∑_{t=1}^{t̄} λ_t Q_t(1|1, x).

By Proposition 1, Q_t(1|1, x) is twice continuously differentiable in x for all t ∈ {1, ..., t̄}. Thus Q(1|1, x) is twice continuously differentiable in x.

(ii) For an x_d satisfying the condition COND in Proposition 3(ii), by (3.15),

Q(1|1, x) = ∑_{t_1∈{t′ : x_d∈X_D^{t′}}} λ_{t_1} Q_{t_1}(1|1, x) + ∑_{t_2∈{t′ : x_d∉X_D^{t′}}} λ_{t_2} Q_{t_2}(1|1, x).    (B.3)

By construction, Q_{t_2}(1|1, x) does not have a jump at x = x_d for any t_2 ∈ {t′ : x_d ∉ X_D^{t′}}. As a result, the second summation on the right-hand side of (B.3) does not have a jump at x = x_d. By COND, the first summation on the right-hand side of (B.3) has a jump at x = x_d. Putting the two summations together, Q(1|1, x) must have a jump at x = x_d.

(iii) can be shown in the same way as (ii). Q.E.D.
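The mechanism these proofs turn on — each equilibrium branch σ^(m)(1|1, x) is smooth in x, yet the observed Q(1|1, x) jumps where the equilibrium selection switches — can be illustrated numerically. The sketch below is not drawn from the thesis: the two logistic "equilibrium" branches and the threshold selection rule are stylized assumptions, corresponding to the degenerate-selection case of Corollary 2 where π_{m*}(x) = 1 on each region.

```python
import math

# Two smooth "equilibrium" CCP branches (stylized assumptions, not the
# thesis's model): each is infinitely differentiable in x on [0, 1].
def sigma1(x):
    return 1.0 / (1.0 + math.exp(-(x - 0.3) * 5.0))

def sigma2(x):
    return 1.0 / (1.0 + math.exp(-(x - 0.7) * 5.0))

# Degenerate equilibrium selection: branch 1 is played for x < 0.5,
# branch 2 otherwise, so pi_{m*}(x) = 1 on each side of x_d = 0.5.
def Q(x):
    return sigma1(x) if x < 0.5 else sigma2(x)

# One-sided limits of the observed CCP at the switch point x_d = 0.5.
eps = 1e-8
left = Q(0.5 - eps)    # approaches sigma1(0.5) ≈ 0.7311
right = Q(0.5 + eps)   # approaches sigma2(0.5) ≈ 0.2689
jump = abs(left - right)  # ≈ 0.4621

print(f"Q(x_d-) = {left:.4f}, Q(x_d+) = {right:.4f}, jump = {jump:.4f}")
# Each branch is continuous at 0.5, yet Q jumps because the two branches
# disagree there — the plain-text analogue of Case 1 in Corollary 2.
```

If instead the selection probabilities moved continuously between the two branches across x_d (the combined-limit situation of Case 1 in Lemma 2), the mixture Q would be continuous; the jump is specifically a signature of a switch between distinct branches.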
