FAMILY LABOUR SUPPLY AND
LABOUR FORCE PARTICIPATION DECISIONS

by

JULES J.M. THEEUWES

Licentiate in Commercial and Consular Sciences, University Faculties Saint Ignatius, Antwerp, 1966
Licentiate in Economic Sciences, University of Louvain, 1970

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENT FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in the Department of ECONOMICS

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
August, 1975

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Economics
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada
V6T 1W5

Date

ABSTRACT

The main objective of this study is the empirical estimation of family labour force participation functions. The appropriate estimation procedure for a model involving choice among multiple discrete alternatives requires a statistical technique different from ordinary least squares. In this study I use the binomial and multinomial logit model to estimate parameters affecting the probabilities of choosing a particular labour force alternative.

A theoretical contribution of this thesis to the econometric literature is the development of a procedure which, in the context of the multinomial logit model, allows one to test whether decision making is sequential or simultaneous. This procedure is applied in testing whether the family chooses simultaneously among possible alternatives or whether one partner decides first about participation and the other partner decides conditional upon the first.
Using a Bayesian discrimination technique it is found that the simultaneous decision model is more probable a posteriori than the sequential model.

A substantial portion of the empirical research in this study involves the estimation and comparison of family labour force participation and labour supply decisions. I attempt to discriminate statistically between the hypothesis that the parameters of supply and participation are the same and the hypothesis that they are different, and conclude that the hypothesis of different parameters is more probable a posteriori. In addition, the comparison of the parameters of family labour supply and labour force participation leads to interesting results, e.g., the substitution effect on both participation and supply behaviour of husband and wife. Another use of the estimated labour supply and labour force participation functions involves combining them to form unconditional labour supply functions. It is indicated that unconditional labour supply functions could be useful to evaluate the combined effect on supply and participation of a labour market policy.

TABLE OF CONTENTS

CHAPTER I  INTRODUCTION

CHAPTER II  ECONOMIC THEORY OF LABOUR SUPPLY AND LABOUR FORCE PARTICIPATION
  1. Introduction
  2. Conditional Labour Supply Functions
    2.1 Theoretical Restrictions
    2.2 Functional Form of Demand Functions
  3. Labour Force Participation Functions
    3.1 Shadow Wage
    3.2 Restrictions on Labour Force Participation Functions
    3.3 Derivations of the Multinomial Logit Model
      A. Idiosyncrasies and stochastic specification
      B. The binomial logit model
      C. The multinomial logit model
      D. Restrictions on selection probabilities
    3.4 Two Alternative Family Labour Force Participation Models
  4. Unconditional Labour Supply Models
    4.1 Comparing The Parameters of Labour Supply and Labour Force Participation
    4.2 Unconditional Labour Supply Functions: A Simulation
  5. The Problem of Unobserved Wage Rates
  Appendix to Chapter II: The Relationship Between the Simultaneous and the Sequential Model

CHAPTER III  PREDICTING POTENTIAL MARKET WAGE RATES
  1. Introduction
    1.1 Specification Problems
    1.2 Prediction Problems
  2. The Wage Equations
    2.1 Description of the Sample
    2.2 The Husband’s Wage Equations
    2.3 The Wife’s Wage Equations
  3. Prediction
    3.1 Problems in Selecting A Predictor
    3.2 Test of Structural Difference
    3.3 Prediction Test
    3.4 Choice of A Predictor

CHAPTER IV  FAMILY LABOUR FORCE PARTICIPATION DECISIONS
  1. Introduction
  2. Estimation Problems of the Logit Model
  3. Description of the Sample
  4. The Simultaneous Family Labour Force Participation Model
  5. The Sequential Family Labour Force Participation Model
    5.1 The Husband’s Labour Force Participation Versus The Wife’s Labour Force Participation Decision
    5.2 Labour Force Participation for Wives With Husbands Working Versus Labour Force Participation for Wives With Husbands Not Working
  6. Discrimination Between the Simultaneous and the Sequential Model

CHAPTER V  FAMILY CONDITIONAL LABOUR SUPPLY FUNCTIONS
  1. Introduction
  2. Issues in Labour Supply Estimation
  3. Estimation Results: Labour Supply Functions With the Observed Wage Rate
  4. Estimation Results: Labour Supply Functions With the Predicted Wage Rate

CHAPTER VI  FAMILY UNCONDITIONAL LABOUR SUPPLY FUNCTIONS
  1. Introduction
  2. Comparing the Parameters of Labour Supply and Labour Force Participation
  3. Unconditional Labour Supply Functions

CHAPTER VII  CONCLUSIONS

LIST OF TABLES

I  Goodness of Fit Test on Wage Distribution
II  Husband Wage Rate Equations. Individual Years and Pooled (Arithmetic Specification)
III  Husband Wage Rate Equations. Individual Years and Pooled (Semi-Logarithmic Specification)
IV  Theil’s Decomposition of R² Applied to the Husband’s Pooled Equation
V  F-Statistics for Restrictions Imposed on Occupational Groups, Husband’s Pooled Wage Equations
VI  R² for Restricted Samples. Husband’s Pooled Wage Equations
VII  Wife’s Wage Rate Equation. Individual Years and Pooled (Arithmetic Specification)
VIII  Wife’s Wage Rate Equation. Individual Years and Pooled (Semi-Logarithmic Specification)
IX  Theil’s Decomposition of R² Applied to the Wife’s Pooled Wage Equation
X  F-Statistics for Restrictions Imposed on Occupational Groups, Wife’s Pooled Wage Equation
XI  Goodness of Fit Measurements Comparing Predicted With Observed Wage Rates for Occasional Labour Force Participants
XII  Husband’s and Wife’s Pooled Wage Equation, Semi-Logarithmic Specification, Estimated for Prediction Purposes
XIII  Distribution of Labour Force Participation Choices in The Sample for Individual Years (In Absolute Numbers)
XIV  Description of the Pooled Sample in Terms of Labour Force Participation Choices
XV  Simultaneous Family Labour Force Participation Model, Results of the Multinomial Logit Model
XVI  Differences Between Coefficients in Simultaneous Family Labour Force Participation Model
XVII  Relationship Between Marital Age, Labour Force Participation Choice, and Asset Income in the Sample
XVIII  Simultaneous Family Labour Force Participation Model, Results of Restricted Multinomial Logit Model
XIX  Sequential Family Labour Force Participation Model (Husband Deciding First), Results of Binomial Logit Models
XX  Sequential Family Labour Force Participation Model (Wife Deciding First), Results of Binomial Logit Models
XXI  Description of the Sample Used for Labour Supply Regressions
XXII  Conditional Labour Supply Functions (Using Observed Wage Rates)
XXIII  Labour Supply Elasticities (Corresponding to Labour Supply Functions Using Observed Wage Rates)
XXIV  Distribution of the Elasticities of the Supply Curve (Observed Wage Rate)
XXV  Conditional Labour Supply Functions (Using Predicted Wage Rates)
XXVI  Labour Supply Elasticities (Corresponding to Labour Supply Functions Using Predicted Wage Rates)
XXVII  Distribution of the Elasticities of the Supply Curve (Predicted Wage Rate)
XXVIII  Unconditional Labour Supply Functions, Comparison of the Tobin and Cragg Model
XXIX  Comparison Among the Wage and Income Coefficients in the Tobin and Cragg Model
XXX  Probabilities of Labour Force Participation as a Function of the Husband’s Wage Rate
XXXI  Probabilities of Labour Force Participation as a Function of the Wife’s Wage Rate
XXXII  Probabilities of Labour Force Participation as a Function of the Level of Asset Income
XXXIII  Conditional and Unconditional Labour Supply Functions For the Husband
XXXIV  Conditional and Unconditional Labour Supply Functions For the Wife
XXXV  Conditional and Unconditional “Income-Hours Worked” Functions

ACKNOWLEDGMENT

This thesis would not have been possible without the support of a great number of people. For their advice and encouragement throughout the course of this study, I wish to thank the members of my dissertation committee: Ernst R. Berndt, John G. Cragg, Terence J. Wales. I am especially grateful to Ernst R. Berndt for limitless hours of discussion on all aspects of the dissertation. In addition the technical advice of Keith Wales on aspects of computer programming has been most helpful.

I gratefully acknowledge the financial support from Canada Council 1971-1975, and the computer time from the Department of Economics at the University of British Columbia. I also wish to thank Ernst R. Berndt and Terence J. Wales for employment as research assistant.
Thanks are also due to Sharron King for typing the final manuscript.

Finally I wish to acknowledge with special thanks my wife Magda who has supported and encouraged me throughout my graduate studies.

CHAPTER I

INTRODUCTION

The number of hours supplied by an individual in the labour market can be viewed as the result of two sequential decisions. First there is the decision whether to participate in the labour force. Second, given that the individual decides to enter the labour force, he/she then decides the actual number of hours to work. The first decision is a decision at the extensive margin, the second at the intensive margin. Aspects of the first decision are usually investigated in “labour force participation studies”. The determinants of the choice at the intensive margin are the subject of “labour supply studies”. With a few exceptions both sets of studies have developed in an unrelated way.

Most labour force participation studies are aimed at either explaining determinants of the substantial increase in labour force participation by married women in the post war period (Cain [1966], Mahoney [1961], Mincer [1962], Sweet [1973]); or at investigating cyclical behaviour of labour force participation rates for various age and sex groups (Barth [1968], Mincer [1966], Officer and Anderson [1969], Wachter [1972, 1974], Fair [1971], Cragg [1973]). An exhaustive treatment of labour force participation can be found in the voluminous study by Bowen and Finegan [1969].

Labour supply studies, on the other hand, seem to have been motivated mainly by the need to predict the disincentive effects of various personal income tax schemes such as progressive income tax (Break [1957], Kosters [1963], Wales [1973]), or a negative income tax proposal (Boskin [1967], Cain and Watts [1973], Green and Tella [1969]).
More recent supply studies treat labour supply or leisure demand as a part of a system of consumer demand functions and are mainly interested in estimating parameters of the underlying utility function (Gussman [1972], Wales and Woodland [1974a, 1974b], Ashenfelter and Heckman [1974]).

A first objective of this thesis is to estimate and compare parameters of both the labour force participation decisions and the labour supply decisions for the same cross-section sample of families. A common theoretical framework is developed for both kinds of decisions (in Chapter II) and then labour force participation and labour supply functions are empirically estimated in Chapters IV and V respectively. The decision unit studied in this thesis is the family. This contrasts with most of the previous studies which have concentrated on choice at the individual level.

The empirical investigation of labour force participation decisions attempts to explain the determinants of the choice of the individual or family between a finite number of distinct alternatives. For an individual the alternatives could be the choice to be in or to be out of the labour force. For a two-person family one can see a choice between four alternatives: both husband and wife working, husband only working, wife only working, and neither working. The special nature of the dependent variable in labour force participation decisions requires an appropriate empirical technique. In this thesis I use (the multinomial extension of) the logit model developed by Theil [1969], Cragg and Uhler [1970, 1971], Cragg and Baxter [1970], and McFadden [1974]. A logit model allows me to explain the probability that a particular alternative will be chosen by a family. This probability is defined as a function of a set of independent variables.

Such a special statistical technique is not required for the empirical study of labour supply decisions. In this case the dependent variable, say annual hours of work, varies continuously within a wide range.
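As a rough illustration of what a logit model of this kind computes, the sketch below evaluates choice probabilities for the four family alternatives listed above. It assumes the standard multinomial logit form, $P_j = \exp(x'\beta_j) / \sum_k \exp(x'\beta_k)$, with the coefficients of the last alternative normalised to zero. The function name, the regressors and every coefficient value are hypothetical and are not estimates from this thesis.

```python
import math

# The four family labour force alternatives named in the text.
ALTERNATIVES = ["both work", "husband only", "wife only", "neither works"]


def choice_probabilities(x, betas):
    """Return multinomial logit probabilities P_j = exp(x'b_j) / sum_k exp(x'b_k).

    x     : regressors for one family (hypothetical: constant, wages, income)
    betas : one coefficient list per alternative, each as long as x
    """
    scores = [sum(b * xi for b, xi in zip(beta, x)) for beta in betas]
    top = max(scores)  # subtract the maximum score for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical illustration: constant, husband's wage, wife's wage, asset income.
x_family = [1.0, 4.0, 2.5, 0.5]
betas_demo = [
    [0.2, 0.30, 0.40, -0.6],    # both work
    [0.5, 0.35, -0.10, -0.3],   # husband only
    [-1.0, -0.20, 0.45, -0.4],  # wife only
    [0.0, 0.0, 0.0, 0.0],       # neither works (normalised to zero)
]

if __name__ == "__main__":
    for name, p in zip(ALTERNATIVES, choice_probabilities(x_family, betas_demo)):
        print(f"{name}: {p:.3f}")
```

By construction the four probabilities lie between zero and one and sum to one, a property that a least squares regression on a 0-1 dependent variable does not guarantee.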
The usual regression analysis will presumably lead to satisfactory results for the study of labour supply.

The empirical study of labour force participation choices is furthermore complicated by the fact that one does not observe the (potential) market wage rate for non-labour force participants. Economic theory predicts the importance of the market wage rate as a determinant of labour force participation choice. Thus one should include a wage rate in specifying the labour force participation function. A wage rate is observed for individuals in the labour market but clearly not for those out of the labour force. In an attempt to circumvent this problem I try to predict the potential market wage for non-labour force participants using a wage equation. This wage equation defines the wage rate as a function of a set of observed socio-demographic variables and is estimated over the sample of labour force participants. This prediction procedure and the problems associated with it are discussed in Chapter III.

Both labour force participation decisions and labour supply decisions are functions of wage rates and income variables. From an aggregative viewpoint therefore a change in “the market wage rate” or “income” will have two kinds of effects: (i) a number of individuals will enter or leave the market, (ii) individuals already in the market will adjust their supply of hours. In order to evaluate the total effect of, say, an economic policy such as a negative income tax, which changes both “income” and “wage rate”, it would seem desirable to measure the combined response in a given population at the internal and external margin. Douglas [1934], in one of the earliest empirical labour supply studies, combined the wage elasticities of hours supplied with the wage elasticity of labour force participation into an estimate of the “most probable elasticity of the short time supply of labour”.
More recent studies (Hall [1973], Boskin [1973], Kalachek and Raines [1970]) combine the response at the internal and external margin using an expected value formula, i.e., the product of the probability of choosing a particular alternative, given the wage rate or income, times the predicted number of hours worked if that particular alternative is chosen, again given the wage rate or income. This “expected value” will give a more accurate idea of the total aggregate effect on hours worked of an economic policy which changes the “wage rate” and/or “income” than the traditional approach of looking only at labour supply functions. This is especially true if labour force participation decisions are sensitive to wage rate and/or income changes. These matters will be discussed in Chapter VI of this thesis.

CHAPTER II

ECONOMIC THEORY OF LABOUR SUPPLY AND LABOUR FORCE PARTICIPATION

1. Introduction

Ever since Lord Robbins’ [1930] seminal contribution, economists have tended to treat the choice between leisure and work as an application of the utility maximization paradigm.¹ Within this framework a labour supply function is defined as dependent upon prices and income. At the same time certain restrictions, imposed by the utility maximization assumption on the parameters of the supply function, are derived. I will show below that labour force participation decisions can also be discussed in this framework.

Static utility maximization thus leads to important theoretical predictions that are very useful in guiding empirical research. This is the basic reason why I develop a theoretical model for the labour force participation and labour supply choices of a family, based on the assumption of static utility maximization. But one should be well aware of the shortcomings of this assumption, particularly in the case of labour supply and labour force participation decisions.
The utility maximization paradigm is only valid if the family is free to choose any labour force participation alternative or labour supply pattern it desires, within its time constraint. This is not necessarily true in reality. There exist important social and institutional constraints on the labour market which severely restrict a family’s choice set. For instance total hours of work in a particular industry is frequently the result of a collective agreement defining the standard work week and regulating overtime work. Such institutional arrangements could constrain the individual’s choice of working hours (unless both choices happen to be in agreement).² Part time work and multiple job holding are possibilities which could sometimes offset this social constraint, depending on how easily an individual can find them in his/her labour market. In any case, one might expect that, for at least part of the sample, the observed labour force participation or labour supply choice does not correspond to what a family would have chosen without the social constraints.

A second shortcoming of the static utility maximization assumption is its neglect of dynamic considerations in labour force participation and labour supply decisions of a family. In a static framework wages and (non-employment) income are treated as exogenously given. In a dynamic context both variables are seen as the result of investments in human and non-human capital and thus become endogenous variables. The basic assumption of dynamic utility theory is that a family plans its labour force participation, labour supply, and (human and non-human) capital accumulation paths over its lifetime, maximizing an intertemporal utility function subject to a lifetime wealth constraint (Hicks [1958], Tintner [1938a, 1938b, 1939], Lluch and Morishima [1973]). Consequently labour force participation, labour supply, wage rates and income are determined simultaneously.
Intertemporal relationships studied in dynamic utility theory are certainly relevant for the study of family choice behaviour. However, the theoretical predictions that can be derived in this framework depend substantially on (sometimes restrictive) assumptions about the functional form of the intertemporal utility function, on the relationship between the market rate of interest and the subjective rate of time preference and on the existence of perfect capital markets (for both human and non-human capital).³ Although dynamic utility theory is not explicitly incorporated in this thesis, I will sometimes rely on it in cases where its possible implications are helpful to explain certain empirical results.

Static utility maximization also neglects the influence of uncertainty and of search and information costs on labour force participation and labour supply behaviour. For instance, even if an individual desires to supply a positive amount of hours, search and information costs might offset the expected benefits of joining the labour force.⁴ Again I will occasionally supplement the complications of static utility theory with explanations derived from other paradigms if this is helpful in understanding empirical findings.

The role of the theoretical model in this thesis is essentially to provide a structure to organize the empirical investigations and to use as a reference in explaining the results. This pragmatic approach avoids the necessity of specifying an all embracing theoretical model without, however, losing completely the benefits of some form of theoretical guidance.

The starting point for the theoretical model then is the assumption that the family maximizes a utility function defined over the husband’s leisure, the wife’s leisure and “all other consumption goods”.
It should be noted that the existence of a family utility function depends on some very restrictive conditions such as non-jointness in consumption, independence of preferences and an optimal rule for reallocation of income (Samuelson [1956]). However, external consumption effects are the essence of family life and so the conditions are presumably not fulfilled. The existence of a family utility function is nevertheless usually accepted in the study of family labour supply (e.g. Ashenfelter and Heckman [1974], Diewert [1971], Wales and Woodland [1974a, 1974b]) and I will follow this procedure. However, in the case of family labour force participation decisions I will suggest an alternative model in contrast with the model derived from the family utility assumption (Section 3.4 of this Chapter).

To formalize the basic theoretical model, the family is assumed to maximize:

(1.1)  $U(C, L_m, L_f)$ with respect to $C > 0$, $L_m > 0$, $L_f > 0$

(1.2)  subject to: $pC + w_m L_m + w_f L_f \le (w_m + w_f)H + A'$

or (dividing both sides by $p$)

(1.2')  subject to: $C + \omega_m L_m + \omega_f L_f \le (\omega_m + \omega_f)H + A = J$, and

(1.3)  subject to: $H - L_m \ge 0$

(1.4)  subject to: $H - L_f \ge 0$

The subscripts “m” and “f” indicate respectively husband and wife and

$C$: consumption goods, a composite commodity,
$p$: a price index for the composite commodity $C$,
$L_i$: leisure time, $i = m, f$,
$w_i$: money wage, $i = m, f$,
$\omega_i$: real wage (defined as $w_i/p$), $i = m, f$,
$H$: total amount of time available in the period under consideration,
$A'$: non-employment income in money terms,
$A$: non-employment income in real terms,
$J$: “whole income” (Becker [1966]).

I ignore savings (which can only be treated adequately in a dynamic framework) and assume non-satiation for commodity and leisure consumption. Consequently (1.2') will become an equality.

The time constraints (1.3) and (1.4) are crucial in considering the labour force participation problem.
If $R_i$ ($i = m, f$) denotes worktime then the Lagrangian can be written as:

(1.5)  $\mathcal{L}(C, L_m, L_f, R_m, R_f) = U(C, L_m, L_f) + \lambda[J - C - \omega_m L_m - \omega_f L_f] + \mu_m[H - L_m - R_m] + \mu_f[H - L_f - R_f]$

Applying the Kuhn-Tucker conditions to (1.5) will lead to two distinct solutions,

(i) interior solution, i.e.,

(1.6)  $R_i > 0, \quad \mu_i = 0, \quad \dfrac{U_{L_i}}{U_C} = \omega_i, \quad i = m, f$

(ii) corner solution, i.e.,

(1.7)  $R_i = 0, \quad \mu_i \ge 0, \quad \dfrac{U_{L_i}}{U_C} = \omega_i + \dfrac{\mu_i}{\lambda} \ge \omega_i, \quad i = m, f.$

The interior solution (1.6) is the usual point of departure for the labour supply studies. Hereafter I will call these studies “conditional” labour supply studies because the samples over which they are estimated are usually restricted to labour force participants (Section 2). The corner solution (1.7) leads to labour force participation studies estimating the probability of a positive supply of hours (Section 3). I will define the function that combines the choices at the internal and external margin as an unconditional labour supply function since it is estimated over the whole sample of both participants and non-participants. Some alternative methods of defining this function are discussed in Section 4.

2. Conditional Labour Supply Functions

2.1 Theoretical Restrictions

To highlight the theoretical developments in conditional labour supply studies I will discuss the case where both husband and wife are working. This simplifies, with a few alterations, to the case where only one of them is working.

If $U$ is a well behaved utility function, the first order conditions of the Lagrangian (1.5) can be solved uniquely to obtain a set of demand functions⁵

(1.8.1)  $C = C(\omega_m, \omega_f, J)$

(1.8.2)  $L_m = L_m(\omega_m, \omega_f, J)$

(1.8.3)  $L_f = L_f(\omega_m, \omega_f, J)$

Furthermore, a set of restrictions on the income and price coefficients can be derived.
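The interior and corner solutions above can be made concrete with a small numerical sketch. It assumes a Cobb-Douglas (log-linear) utility function, $U = \alpha \ln C + \beta_m \ln L_m + \beta_f \ln L_f$, which is chosen purely for illustration and is not the functional form used in this thesis; the function name and all parameter values are hypothetical. Each of the four participation regimes is solved in closed form, regimes that violate a time constraint are discarded, and the feasible regime with the highest utility is kept.

```python
import math
from itertools import product


def family_choice(omega_m, omega_f, A, H=1.0, alpha=0.4, beta_m=0.3, beta_f=0.3):
    """Maximise alpha*ln(C) + beta_m*ln(L_m) + beta_f*ln(L_f) subject to (1.2')-(1.4).

    Illustrative Cobb-Douglas case only. Returns the chosen regime, consumption,
    leisure, and the shadow wage U_L / U_C of each partner.
    """
    wage = {"m": omega_m, "f": omega_f}
    beta = {"m": beta_m, "f": beta_f}
    best = None
    for works_m, works_f in product([True, False], repeat=2):
        works = {"m": works_m, "f": works_f}
        # A non-participant consumes L_i = H, so only workers' time enters the budget.
        income = A + sum(wage[i] * H for i in "mf" if works[i])
        shares = alpha + sum(beta[i] for i in "mf" if works[i])
        C = alpha / shares * income
        L = {i: beta[i] / shares * income / wage[i] if works[i] else H for i in "mf"}
        if C <= 0 or any(L[i] > H for i in "mf"):
            continue  # infeasible: violates C > 0 or a time constraint H - L_i >= 0
        utility = alpha * math.log(C) + sum(beta[i] * math.log(L[i]) for i in "mf")
        if best is None or utility > best["utility"]:
            shadow = {i: beta[i] * C / (alpha * L[i]) for i in "mf"}  # U_L / U_C
            best = {"works": works, "C": C, "L": L,
                    "shadow_wage": shadow, "utility": utility}
    return best


if __name__ == "__main__":
    print(family_choice(10.0, 10.0, 1.0)["works"])  # equal wages for both partners
    print(family_choice(10.0, 2.0, 1.0)["works"])   # low market wage for the wife
```

In this sketch a participant's shadow wage equals the market wage, as in (1.6), while a non-participant's shadow wage is at least as large as the market wage, as in (1.7).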
Usually these restrictions are either imposed on a system such as (1.8) (e.g., Cournot and Engel aggregation) or they are tested for as parametric restrictions (e.g., symmetry, homogeneity and the sign of the compensated own substitution effect).

More specifically in the case of demand for leisure one usually tests for at least one of the following restrictions at the sample points:

(i) negative sign of compensated own substitution effect

(1.9.1)  $\left[\dfrac{\partial L_i}{\partial \omega_i}\right]_c < 0, \quad i = m, f$

(ii) symmetry of compensated cross substitution effect

(1.9.2)  $\left[\dfrac{\partial L_m}{\partial \omega_f}\right]_c = \left[\dfrac{\partial L_f}{\partial \omega_m}\right]_c$

where the subscript $c$ indicates the utility compensated term of the Slutsky equation.

Neither the slope of the demand curve for leisure, nor the gross or net substitutability or complementarity between the husband’s and wife’s leisure are predictable from pure demand theory.

2.2 Functional Form of Demand Equations

The functional form of the demand equations (1.8.1 to 1.8.3) is constrained only in a general way by pure theory: one should be able to integrate them “backwards” into a “sensible” utility function. This requirement usually excludes demand functions which are linear in the parameters of prices and income.⁶

Two basic methods have been utilized in empirical studies of conditional labour supply functions. One approach confines itself to functional forms for demand equations that are compatible with utility theory. This method may involve nonlinear estimation techniques for a system of equations.⁷ Another approach attempts to approximate the parameters of the demand equations with functions that are linear in the parameters. Its relationship to utility theory is somewhat pragmatic.
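However, this method has been used extensively, primarily because of its econometric simplicity.⁸ For the same reason I adopt this procedure to estimate conditional labour supply functions (Chapter V).

Demand relations which are linear in the parameters can be obtained from the first order conditions of (1.5). Totally differentiating these first order conditions and solving for the demand vector will give

(1.10.1)  $dL_i = \left\{ \left[\dfrac{\partial L_i}{\partial \omega_i}\right]_c + (H - L_i)\left[\dfrac{\partial L_i}{\partial A}\right] \right\} d\omega_i + \left\{ \left[\dfrac{\partial L_i}{\partial \omega_j}\right]_c + (H - L_j)\left[\dfrac{\partial L_i}{\partial A}\right] \right\} d\omega_j + \left\{ \dfrac{\partial L_i}{\partial A} \right\} dA$

or alternatively:

(1.10.2)  $dL_i = \left\{ \left[\dfrac{\partial L_i}{\partial \omega_i}\right]_c - L_i\left[\dfrac{\partial L_i}{\partial J}\right] \right\} d\omega_i + \left\{ \left[\dfrac{\partial L_i}{\partial \omega_j}\right]_c - L_j\left[\dfrac{\partial L_i}{\partial J}\right] \right\} d\omega_j + \left\{ \dfrac{\partial L_i}{\partial J} \right\} dJ$

The procedure is the same for $dL_j$ as left hand side variable. One can also solve for the labour supply vector:

(1.10.3)  $dR_i = \left\{ \left[\dfrac{\partial R_i}{\partial \omega_i}\right]_c - R_i\left[\dfrac{\partial L_i}{\partial A}\right] \right\} d\omega_i + \left\{ \left[\dfrac{\partial R_i}{\partial \omega_j}\right]_c - R_j\left[\dfrac{\partial L_i}{\partial A}\right] \right\} d\omega_j + \left\{ \dfrac{\partial R_i}{\partial A} \right\} dA$

Assuming that the expressions between curly brackets are constant then one obtains, upon integration, the following linear demand or supply functions:

(1.11.1)  $L_i = a_1 + a_2 \omega_i + a_3 \omega_j + a_4 A$

(1.11.2)  $L_i = a_1' + a_2' \omega_i + a_3' \omega_j + a_4' J$

(1.11.3)  $R_i = b_1 + b_2 \omega_i + b_3 \omega_j + b_4 A$

This functional form was used by Kosters [1963], Cohen, et al. [1970]. One can generalize this procedure assuming that the expressions between curly brackets in (1.10.1) to (1.10.3) depend on the respective wage or income level. To do this I rewrite (1.10.3) as follows:

(1.12)  $dR_i = A\, d\omega_i + B\, d\omega_j + C\, dA$

In (1.12) $A$ is the slope of the supply curve, the sign of $B$ indicates gross complementarity (if positive) or gross substitutability (if negative) and $C$ is the income effect.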
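To make the roles of the three terms in (1.12) concrete, the following sketch evaluates a linear supply function of the form (1.11.3) at one sample point and reports the own wage slope, the cross wage term and the income effect. It also applies the usual Slutsky decomposition, in which the compensated own wage effect on hours is the slope minus the income effect times hours worked. Since $R_i = H - L_i$, restriction (1.9.1) on leisure corresponds to a positive compensated own wage effect on hours. The function name and all coefficient values are hypothetical, not estimates from this thesis.

```python
def linear_supply_effects(b, omega_i, omega_j, asset_income):
    """Evaluate the linear supply function (1.11.3) and the terms of (1.12).

    b = (b1, b2, b3, b4): hypothetical coefficients of
    R_i = b1 + b2*omega_i + b3*omega_j + b4*A.
    """
    b1, b2, b3, b4 = b
    hours = b1 + b2 * omega_i + b3 * omega_j + b4 * asset_income
    slope, cross, income = b2, b3, b4          # A, B and C of (1.12)
    compensated_own = slope - income * hours   # Slutsky: A - C * R_i
    return {
        "hours": hours,
        "slope": slope,
        "cross": cross,
        "income_effect": income,
        "own_wage_elasticity": slope * omega_i / hours,
        "compensated_own": compensated_own,
        # (1.9.1) restated for hours: the compensated own effect must be positive.
        "satisfies_1_9_1": compensated_own > 0,
    }


if __name__ == "__main__":
    # Hypothetical annual-hours example: wages in dollars per hour, income in dollars.
    result = linear_supply_effects((1500.0, 50.0, -20.0, -0.05), 4.0, 3.0, 1000.0)
    for key, value in result.items():
        print(key, value)
```

With a negative income effect on hours, the compensated effect exceeds the uncompensated slope, so the restriction can hold even when the observed supply curve is nearly flat.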
Furthermore, $A - CR_i$ is the compensated substitution effect while net complementarity or substitutability is identified as $B - CR_j$.

In order to account, in a simple way, for possible nonlinearities in the wage and income terms I introduce the following assumptions:

(1.13.1)  $A = a_1 + a_2 \omega_i + a_3 \omega_i^2$

(1.13.2)  $B = b_1 + b_2 \omega_j + b_3 \omega_j^2$

(1.13.3)  $C = c_1 + c_2 A + c_3 A^2$

Substituting (1.13.1) to (1.13.3) into (1.12) and taking the integral will result in the “polynomial” supply curve:

(1.14)  $R_i = \text{constant} + (a_1 \omega_i + a_2 \omega_i^2 + a_3 \omega_i^3) + (b_1 \omega_j + b_2 \omega_j^2 + b_3 \omega_j^3) + (c_1 A + c_2 A^2 + c_3 A^3)$

where the factors 1/2 and 1/3 arising from the integration are absorbed into the coefficients. It is readily seen that $A$, $B$, and $C$ as defined in (1.12) are equal to the partial derivatives of (1.14) with respect to, respectively, $\omega_i$, $\omega_j$ and $A$. This relationship can then be used to test for restrictions (1.9.1) and (1.9.2).

Clearly, still higher order polynomials could be used in defining $A$, $B$, and $C$. In practice however, quadratic expressions are the most commonly used forms for (1.14) (e.g., Rosen and Welch [1971], Berndt and Wales [1974b]). In estimating labour supply functions (Chapter V) I have found that occasionally a third order polynomial term would be significant but not a higher order. For a reason to be explained below (Section 4 of this Chapter) I use the logarithm of hours supplied as a dependent variable. This form can be derived by assuming that the right hand side in equations (1.13.1) to (1.13.3) is multiplied by $R_i$.

In the actual estimation of labour supply functions I also include a number of socio-demographic variables (e.g., age, education, experience) as independent variables in order to control (partly) for taste variations with respect to labour supply in the cross-section of families (see Chapter V).

3. Labour Force Participation Functions

3.1 The Shadow Wage

If either the husband or the wife (or both) is not found participating in the labour force then a corner solution condition such as (1.7) must hold. This implies that at that level of family income the market wage $\omega_i$
is smaller than the “shadow wage” or “home wage” ($U_{L_i}/U_C$).

This is as much information as one can hope to get out of non-linear programming theory. An attractive feature of the recent family production models⁹ might be their usefulness in explaining inequality (1.7) somewhat further. The household is assumed to consist of a consumption sector and a production sector. Utility maximization takes place in two stages. In the first stage the household is seen as minimizing the cost of producing household commodities given the household technology, factor prices and initial endowments of time. This results in a household cost function. In the second stage the household maximizes a utility function defined over the household commodities and subject to the cost function. The final solution of this two stage procedure yields equilibrium values for quantities consumed and for household shadow prices, e.g., the shadow price of time. To explain the latter, one must concentrate on the equilibrium conditions of the household production sector. In equilibrium a factor price must be the same in all sectors of production and must be equal to the value of the marginal product. The value of the marginal product is the product of the commodity price (an “internal” concept in this framework) and the marginal physical product (a given technological fact).

The inequality (1.7) is thus obtained, either because the family values the household commodity greatly (high implicit commodity price) or because it is very efficient in producing these commodities (high marginal physical productivity).
In either case the shadow wage should empirically reveal itself as a function of the consumption and production of certain household commodities. For instance, the importance of young children for the labour force participation of their mother is a well investigated example that can be explained in these terms.

3.2 Restrictions on Labour Force Participation Functions

In the interior solution case an important restriction on the coefficients of the demand curves is the negativity of the compensated own substitution effect. For the labour force participation case a similar result can be shown, using the weak axiom of revealed preference which is implied by a utility maximization program.

Suppose a family is constrained to move along the same indifference curve. Suppose furthermore that in a first situation it faces the price vector $v^1 = [p, w_f, w_m]$ and in response to this, chooses consumption vector $x^1 = [C, H, L_m]$ where $H$ is the total amount of time for the period. In a second situation it faces $v^2 = [p, w_f^*, w_m]$ and chooses $x^2 = [C^*, L_f^*, L_m^*]$, where $L_f^* < H$ (on the same indifference curve).

The family will minimize the expenditure for a given level of utility and so the following inequalities are implied by utility maximization theory:

(1.15)  $v^2 x^1 \ge v^1 x^1$, and $v^1 x^2 \ge v^2 x^2$, or

(1.16.1)  $(w_f^* - w_f)\, H \ge 0$

(1.16.2)  $(w_f - w_f^*)\, L_f^* \ge 0$

Adding (1.16.1) and (1.16.2) together:

(1.16.3)  $(w_f^* - w_f)(H - L_f^*) \ge 0$

or

(1.16.4)  $\Delta w_f\, \Delta R_f \ge 0$

where $\Delta$ is a difference indicator.

Everything else (especially other family income) remaining constant, a sufficient increase (decrease) in the own wage rate should induce the consumer to join (leave) the labour force. This relationship was also shown geometrically by Ben-Porath [1973].

3.3 Derivation of the Multinomial Logit Model

A. Idiosyncrasies and stochastic specification

The corner solution condition (1.7) implies that the individual is not participating because at the observed level of family income his shadow wage is greater than (or equal to) his market wage.
There is, however, no way to determine how large this gap is. Furthermore, unless the individuals are completely homogeneous this gap will differ among families due to idiosyncrasies with respect to labour force participation patterns. Household production theory may suggest variables that seek to explain the systematic variation in the shadow wage (e.g., children). However, an unexplainable portion will remain, partly because some variables remain unmeasured and partly because the shadow wage is itself an unobservable (determined by utility considerations). In a cross-section sample one might observe two individuals with the same income, same market wage, same socio-demographic variables (identifying the systematic part of the shadow wage), but one choosing to be in, the other to be out of the labour market. Consequently, if one were to use a linear probability model (i.e., with 0-1 dependent variable) to explain labour force participation decisions, then the error term might play an undesirable role in such an equation. The error term will “explain” a substantial proportion of the observed variation in choice if the idiosyncratic elements (i.e., the unobserved “taste” variations) are important.¹⁰

The above discussion explains intuitively the unsatisfactory error structure encountered in using the conventional regression model for discrete choices. This corresponds to the technical shortcomings of the error term in the linear probability model. In spite of these shortcomings the linear probability model has been used extensively in the analysis of labour force participation decisions (Cain [1966], Mahoney [1961], Boskin [1973]). A more satisfactory model to explain an individual’s choice among distinct alternatives must, however, make explicit the effects of an individual’s idiosyncratic preferences.

B. The binomial logit model

I start with a simple case. Assume that the sample of families is split up so that only those households in which the husband is working are chosen.
I am then only interested in the labour force participation of the wife. Utility maximization suggests that her labour force participation will be a function of her market wage, her shadow wage and other family income. Socio-demographic variables could be added in order to capture systematic variations in the unobservable shadow wage or in “taste for labour force participation”. Let $g$ stand for the function representing the systematic part in the explanation of labour force participation choice. Because of the importance of the unobservable idiosyncrasies for discrete choices, a factor representing these idiosyncrasies has to be added in the explanation. To do this one could theorize that a random variable is drawn from an assumed distribution (frequently the normal or logistic). The outcome of the common part $g$ and the idiosyncratic part $\varepsilon$ will then indicate whether the individual will participate.

More formally one can assume that there exists a “latent” variable $q$, which is the sum of $g$ and $\varepsilon$. In this simple case the variable $q$ can best be understood as the “desired” amount of supply. If $q < 0$ the individual is not in the labour market, and vice versa if $q > 0$.

Now we would like to predict whether individual $i$, facing $g_i$, will be in or out of the labour force. Assume $\varepsilon$ is distributed following a logistic function.¹² Then

(1.17.1) $\Pr(q_i < 0) = \Pr(\varepsilon_i < -g_i) = \dfrac{1}{1 + e^{g_i}}$

(1.17.2) $\Pr(q_i > 0) = 1 - \Pr(q_i < 0) = \dfrac{1}{1 + e^{-g_i}}$

It is easily seen that

(1.18) $\ln \dfrac{\Pr(q_i < 0)}{\Pr(q_i > 0)} = \ln \dfrac{1 + e^{-g_i}}{1 + e^{g_i}} = -g_i$

So that we end up with a simple relation between the logarithm of the odds of non-participation over participation (or vice versa) and the function $g$.

Once $g$ is estimated (see Chapter IV for a discussion of estimation problems), model (1.18) will show the (logarithm of the) odds that a family, facing the given market wage and income variables and with the observed socio-demographic characteristics, will choose a consumption vector $[C, L_m, L_f]$ instead of $[C^*, L_m^*, H]$, i.e., it determines a probabilistic rule that can be used to split up the sample into two regimes. Extension to more than two regimes will now be straightforward.

C. The multinomial logit model

Assume that a population of families are all maximizing the basic model (1.1) to (1.4). Because of family idiosyncrasies, they will have different opinions and tastes with respect to the labour force participation of the husband and/or the wife. Therefore this general utility maximization problem will specialize over the population into four “regimes”:

(1.19.1) $\max\,[U(C, L_m, L_f):\ C + L_m\omega_m + L_f\omega_f \leq H(\omega_m + \omega_f) + A;\ C > 0,\ H - L_m > 0,\ H - L_f > 0]$

(1.19.2) $\max\,[U(C, L_m):\ C + L_m\omega_m \leq H\omega_m + A;\ C > 0,\ H - L_m > 0]$

(1.19.3) $\max\,[U(C, L_f):\ C + L_f\omega_f \leq H\omega_f + A;\ C > 0,\ H - L_f > 0]$

(1.19.4) $\max\,[U(C):\ C \leq A;\ C > 0]$

So again what is needed is a probabilistic rule that will split the sample into four regimes characterized by the following alternative vectors: $(C, R_m, R_f)$, $(C, R_m)$, $(C, R_f)$, $(C)$. Define these vectors as respectively $a_1$, $a_2$, $a_3$, $a_4$ and $X$ as their collective set, i.e., $a_i \in X$.

As in the previous discussion I will theorize that the $a_i$ chosen by a particular family is a systematic function of the market wage, income variables and of socio-demographic variables capturing systematic variations in shadow wage and “taste”. It is also, in a random way, a function of a term capturing the familial idiosyncrasies with respect to labour force participation.
More specifically:

(1.20) $z(a_i) = g_i(\omega_m, \omega_f, A, T) + \varepsilon_i$

where $T$ stands for the included socio-demographic variables. $p_X(a_i)$, $a_i \in X$, is then a probabilistic rule that will split the sample up into the four regimes, allocating each family to its most probable regime,¹³ i.e.,

(1.21) $p_X(a_i) = \Pr[z(a_i) > z(a_j);\ a_j \in X,\ j \neq i]$

Furthermore if

(1.22) $\Pr[z(a_1), \ldots, z(a_4)] = \Pr[z(a_1)] \cdots \Pr[z(a_4)]$

then

(1.23) $p_X(a_i) = \displaystyle\int_{-\infty}^{\infty} \Pr[z(a_i) = t] \prod_{a_j \in X - a_i} \Pr[z(a_j) < t]\, dt$

I make the following distributional assumption (which amounts to assuming that $\varepsilon$ in (1.20) is distributed with the Weibull distribution¹⁴):

(1.24) $\Pr[z(a_i) = t] = g_i e^{g_i t}$ if $t < 0$, and $= 0$ if $t > 0$.

Then substituting (1.24) into (1.23) will yield:¹⁵

(1.25.1) $p_X(a_i) = \displaystyle\int_{-\infty}^{0} g_i e^{g_i t} \prod_{j \neq i} \left[ \int_{-\infty}^{t} g_j e^{g_j \tau}\, d\tau \right] dt = \int_{-\infty}^{0} g_i e^{g_i t} \prod_{j \neq i} e^{g_j t}\, dt = g_i \int_{-\infty}^{0} \exp\Big\{ \sum_j g_j t \Big\}\, dt$

(1.25.2) $p_X(a_i) = g_i \Big[ \sum_j g_j \Big]^{-1}$

If the functional form $\exp(Z'\beta_i)$ is used for $g_i$, with $Z$ corresponding to $(\omega_m, \omega_f, A, T)$ and $\beta_i$ a set of coefficients to be estimated for each alternative $a_i \in X$, then (1.25) will be identical with the multinomial logit model (Cragg and Uhler [1970, 1971], McFadden [1974], Theil [1969]), which is an estimable function:

(1.26) $p_X(a_i) = \exp(Z'\beta_i) \Big[ \sum_{j=1}^{4} \exp(Z'\beta_j) \Big]^{-1}$

Note that now (compare with (1.18)), (1.26) implies

(1.27) $\ln \dfrac{p_X(a_i)}{p_X(a_j)} = Z'(\beta_i - \beta_j)$

Consequently,

(1.28) $\dfrac{d \ln [p_X(a_i)/p_X(a_j)]}{dZ_k} = \beta_{ik} - \beta_{jk}$

It should be noted that identification of the parameters of model (1.26) requires a normalization rule, say

(1.29) $\beta_j = 0$

Therefore (see (1.28)) $\beta_{ik}$ can be interpreted as the marginal change in the logarithm of the odds of alternative $i$ over the “normalized” alternative $j$. Using (1.27) we can derive¹⁶

(1.30) $\dfrac{d \ln p_X(a_i)}{dZ_k} = \beta_{ik} - \sum_{j=1}^{4} p_X(a_j)\, \beta_{jk}$

i.e., the change in the logarithm of the probability of an alternative due to a change in $Z_k$ depends on the outcome of a comparison between the changes in the odds of all the alternatives. So whereas the odds are a monotonic function of the independent variables, as in equation (1.27), this is not necessarily true for the probability of an alternative (see equation (1.30)).

D. Restrictions on the selection probabilities

The model expressed in (1.26) is the one that I propose to estimate.
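As an illustration of (1.26), (1.27) and (1.30), the following is a minimal Python sketch; it is not the estimation procedure of Chapter IV. The function names, the three-element vector `z` and all coefficient values are hypothetical and chosen only for illustration, with the last alternative normalized to zero as in (1.29).

```python
import math

def mnl_probabilities(z, betas):
    """Selection probabilities as in (1.26): exp(Z'b_i) / sum_j exp(Z'b_j)."""
    scores = [sum(b * x for b, x in zip(beta, z)) for beta in betas]
    top = max(scores)  # subtracting the maximum only improves numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def log_odds(z, betas, i, j):
    """Log odds of alternative i over j as in (1.27): Z'(b_i - b_j)."""
    return sum((bi - bj) * x for bi, bj, x in zip(betas[i], betas[j], z))

def dlogp_dz(z, betas, i, k):
    """Derivative of ln p_i with respect to Z_k as in (1.30)."""
    p = mnl_probabilities(z, betas)
    return betas[i][k] - sum(pj * beta[k] for pj, beta in zip(p, betas))

# Hypothetical inputs: Z = (constant, wage, other income) and four alternatives;
# the fourth alternative carries the normalization beta = 0.
z = [1.0, 2.5, 0.8]
betas = [
    [0.2, 0.9, -0.5],
    [0.4, 0.6, -0.2],
    [-0.3, 0.5, -0.4],
    [0.0, 0.0, 0.0],
]

p = mnl_probabilities(z, betas)
print("probabilities:", p)
print("log odds, computed two ways:", math.log(p[0] / p[3]), log_odds(z, betas, 0, 3))
print("d ln p_1 / d Z_2:", dlogp_dz(z, betas, 0, 1))
```

The log odds depend only on the coefficient difference of the two alternatives compared, whereas the derivative of a single log probability involves the coefficients of all four alternatives weighted by their probabilities, which is why its sign is not fixed in advance.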
A discussion of the econometrics of (1.26) is deferred to Chapter IV where the model is used empirically. Before leaving the subject, however, I would like to mention an important theoretical restriction on the selection probabilities of the logit model.

Define $X$ as a set of more than two alternatives and $Y$ as a subset of $X$ consisting of $a_1$ and $a_2$ only. Then the axiom of independence of irrelevant alternatives assumes:

(1.31) $\dfrac{p_X(a_1)}{p_X(a_2)} = \dfrac{p_Y(a_1)}{p_Y(a_2)}$

i.e., the odds of $a_1$ being chosen over $a_2$ in the multiple choice situation $X$, where both $a_1$ and $a_2$ are available, equal the odds of the binary choice of $a_1$ over $a_2$. If this axiom holds and if $p_Y(a_2) \neq 0$ or $1$, then it can be proven¹⁷ that $p_X(a_i)$ can be written as (1.26), i.e., as the multinomial logit model. This axiom thus underlies the multinomial logit model.

The axiom can easily be violated in reality. To take Debreu’s [1960] example, let $X$ consist of:

($a_1$) a recording of the Debussy Quartet by the C quartet,
($a_2$) a recording of the 8th Symphony by Beethoven by the B orchestra conducted by F,
($a_3$) a recording of the 8th Symphony by Beethoven by the B orchestra conducted by K.

The following binary choice probabilities are observed:

$p(a_1 \mid a_1, a_2) = 3/5$, $p(a_1 \mid a_1, a_3) = 3/5$, $p(a_2 \mid a_2, a_3) = 1/2$.

Then this axiom would predict for the multiple choice situation:

$p(a_1 \mid a_1, a_2, a_3) = 3/7$

Thus, in the binary choice situation the individual would rather have Debussy, but in the multiple choice situation he would prefer Beethoven, which is a counter-intuitive result. This example suggests therefore that application of the logit model should be limited to situations where the assumption that the alternatives are distinct and weighted independently is plausible (i.e., the alternatives cannot be aggregated).
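The arithmetic behind the 3/7 figure in Debreu's example can be checked with a short Python sketch. It assumes, as the axiom does, that each alternative carries a fixed weight whose ratios equal the binary odds; the helper `odds` and the variable names are hypothetical.

```python
from fractions import Fraction

def odds(p):
    """Odds implied by a binary choice probability p."""
    return p / (1 - p)

# Binary choice probabilities quoted in the example.
p_a1_over_a2 = Fraction(3, 5)
p_a2_over_a3 = Fraction(1, 2)

# Weights relative to a2, as implied by the independence axiom.
weights = {
    "a1": odds(p_a1_over_a2),      # a1 : a2 = 3 : 2
    "a2": Fraction(1),
    "a3": 1 / odds(p_a2_over_a3),  # a3 : a2 = 1 : 1
}
total = sum(weights.values())
p_x = {name: w / total for name, w in weights.items()}

# The implied binary probability of a1 against a3 reproduces the observed 3/5.
implied_a1_over_a3 = weights["a1"] / (weights["a1"] + weights["a3"])

for name, prob in p_x.items():
    print(name, prob)
print("a1 against a3:", implied_a1_over_a3)
print("Beethoven (a2 or a3):", p_x["a2"] + p_x["a3"])
```

The two Beethoven recordings together receive 4/7, so adding a near-duplicate alternative pulls probability away from Debussy, which is the counter-intuitive implication discussed in the text.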
The proposed labour force participation model (see next section) presumably fulfils this requirement (for further discussion, see Section 4 in Chapter IV).

3.4 Two Alternative Family Labour Force Participation Models

A first model follows from the assumption of the existence of a family utility function such as the one defined in (1.1), i.e., $U(C, L_m, L_f)$. If the family maximizes this function subject to their budget and time constraints (equations (1.2) to (1.4)) then they will choose between the following four labour force participation alternatives (see above, equations (1.19.1) to (1.19.4)):

(1.32.1) $[C_1, L_m, L_f]$, i.e., both husband and wife working,
(1.32.2) $[C_2, L_m, H]$, i.e., husband only working,
(1.32.3) $[C_3, H, L_f]$, i.e., wife only working,
(1.32.4) $[C_4, H, H]$, i.e., none working.

In this model the family solves its utility maximizing program, considering all four alternatives simultaneously, and chooses one. I call this the simultaneous model. In Chapter IV I use the multinomial logit model (1.26) to estimate the probability of a family choosing any of these four alternatives as a function of wages, income and socio-demographic variables.

A second model follows from the following a priori considerations. For a variety of sociological reasons the husband appears always to be in the labour force unless he is either handicapped, retired, or in school. The husband does not seem to have much choice regarding labour force participation. On the other hand a social constraint in this respect does not appear to be as strong for the wife. Previous economic discussions on “primary” and “secondary” income earners in a family already hinted at this distinction caused by sociological constraints. (See Mincer [1966] for a summary of this literature.) Although they are a bit vague, these considerations suggest a model in which the labour force participation “decision” of the husband is studied first.
The wife’s labour force participation choice is then estimated conditional upon her husband’s “decision”. I thus introduce a type of lexicographic ordering in the labour force participation decisions of the family. I call this the sequential model. In Figure 1 the simultaneous model and the sequential model are contrasted.

The sequential model is estimated using the binomial logit model (1.17.2). It is clearly possible to reverse the order of decision and to develop a sequential model in which the wife decides first and the husband’s decision is conditional upon hers. I will also estimate this model (although it may seem to be somewhat unrealistic, a priori).

FIGURE 1
Sequential Model and Simultaneous Model
[Diagram not reproduced. Recoverable labels: Family; In / Out; Labour Force Participation of Wife (In / Out); outcomes Both Working, Husband Only, Wife Only, None Working.]

It is of considerable interest to contrast the estimates of the simultaneous and sequential models and to compare the values of their respective likelihood functions. In terms of the logit specification the sequential model is identical with the simultaneous model only under a very restrictive condition on the parameters of the labour force participation decision. (See Appendix at the end of this Chapter.) In Chapter IV I estimate and compare both the simultaneous and the sequential models.

4. Unconditional Labour Supply Functions

If changes in the market “wage rate” and/or in “income” have an effect on both the labour force participation and labour supply decisions of a sample of families, then it may be of interest to obtain an idea of the combined effect of the changes at the internal and external margin. For instance, the introduction of a negative income tax policy will change both the “wage rate” and “income” variables of certain families and therefore their labour force participation and labour supply behaviour.
In what follows I will discuss two alternative ways of defining unconditional labour supply functions which can be used to summarize the total outcome of such a negative income tax policy.

The two alternative ways of defining an unconditional labour supply function follow from the hypotheses that the parameters of labour supply and labour force participation choice are either the same or different. In Chapter VI I will compare these two hypotheses. Furthermore I will combine the probabilities of the labour force participation choices of a family (estimated in Chapter IV) with the labour supply predictions derived from the conditional labour supply functions (estimated in Chapter V). This procedure (as will be seen below) assumes that the parameters of labour force participation and labour supply decisions are different.

4.1 Comparing the Parameters of Labour Supply and Labour Force Participation

In order to test the hypothesis that the parameters of labour force participation and labour supply are the same, I compare the estimation results using Tobin’s [1958] limited dependent variable model with the results obtained with a variant of the limited dependent variable model developed by Cragg [1971].¹⁸ Tobin’s model restricts the parameters of labour force participation and labour supply to be the same, whereas this is not the case for the Cragg model.

Tobin’s model can be used if one assumes that labour force participation is simply truncated labour supply, i.e., one argues that the labour force participation problem only exists because one does not observe the “negative” supply of hours that people might desire to offer. To formalize this, assume that a desired labour supply variable for individual $t$ is defined by

(1.33) $R_t^* = Z_t'\gamma + \varepsilon_t$

whereby $\varepsilon_t$ is independent and normally distributed with mean zero and variance $\sigma^2$, $Z_t$ is a set of independent variables and $\gamma$ a vector of coefficients. When the desired labour supply is negative, the variable that is actually observed, $R_t$, is zero. When $R_t^*$ is positive then $R_t$ is equal to $R_t^*$. Using the probit model, the probability that $R_t$ is zero is then defined as:

(1.34.1) $\Pr(R_t = 0 \mid Z_t) = C(-Z_t'\gamma/\sigma)$

where $C(\cdot)$ designates the cumulative unit normal distribution.¹⁹ The density for positive values of hours supplied is:

(1.34.2) $\Pr(R_t \mid Z_t) = (2\pi)^{-1/2}\, \sigma^{-1} \exp\{-(R_t - Z_t'\gamma)^2 / 2\sigma^2\}$

The Tobin model thus only provides one set of parameters, $\gamma$, restricted to be the same for both labour force participation and labour supply.

On the other hand it can be argued that for various reasons the restriction imposed by the Tobin model is not realistic. For instance, search and information costs might inhibit smooth transfer into the labour force even if a positive $R_t^*$ is desired (Uhler and Kunin [1972]). Institutional constraints, e.g., the standard work week, and limitations from the demand side, e.g., non-availability of part time jobs, might inhibit positive labour supply unless the individual desires to supply at least a certain amount. The existence of these “discontinuities” in going from zero to positive labour supply implies that the “continuous” model (1.33) is incorrect and that the parameters of labour force participation and labour supply choice are different. It would be more correct to separate the labour force participation and labour supply decisions.

To do this I assume that a decision first has to be taken whether to participate. If one decides to participate, a decision is then taken on how many hours to supply. The first decision might be represented by a probit model; the second by a standard regression model. However, the dependent variable in the labour supply case, i.e., $R_t$, can only take non-negative values. This non-negativity could be guaranteed by truncating the distribution of $R_t$ at zero. The model then becomes:

(1.35.1) $\Pr(R_t = 0 \mid Z_{1t}, Z_{2t}) = C(-Z_{1t}'\beta)$

(1.35.2) $\Pr(R_t \mid Z_{1t}, Z_{2t}) = (2\pi)^{-1/2}\, \sigma^{-1} \exp\{-(R_t - Z_{2t}'\gamma)^2 / 2\sigma^2\}\, \dfrac{C(Z_{1t}'\beta)}{C(Z_{2t}'\gamma/\sigma)}$ for $R_t > 0$

where $Z_{1t}$ and $Z_{2t}$ are vectors of independent variables for individual $t$ and $\beta$ and $\gamma$ are vectors of coefficients.
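The per-observation likelihood contributions in (1.34.1) to (1.35.2) can be written out as a short Python sketch. This is only an illustration under stated assumptions, not the computer program used in Chapter VI: the function names are hypothetical, `supply_index` stands for the supply index Z'γ (Z₂'γ in the second model), `part_index` stands for the participation index Z₁'β, and the numerical values are invented.

```python
import math

def norm_cdf(x):
    """Cumulative unit normal distribution, C(.) in the text."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    """Unit normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def tobin_loglik(hours, supply_index, sigma):
    """Log of (1.34.1) when hours are zero, log of (1.34.2) otherwise."""
    if hours <= 0:
        return math.log(norm_cdf(-supply_index / sigma))
    return math.log(norm_pdf((hours - supply_index) / sigma) / sigma)

def cragg_loglik(hours, part_index, supply_index, sigma):
    """Log of (1.35.1) when hours are zero, log of (1.35.2) otherwise."""
    if hours <= 0:
        return math.log(norm_cdf(-part_index))
    density = norm_pdf((hours - supply_index) / sigma) / sigma
    # Truncation at zero divides by C(Z2'gamma/sigma); participation
    # contributes the factor C(Z1'beta).
    return math.log(density * norm_cdf(part_index) / norm_cdf(supply_index / sigma))

# Hypothetical observation: 30 hours supplied, supply index 25, sigma 10.
print(tobin_loglik(30.0, 25.0, 10.0))
print(cragg_loglik(30.0, 2.5, 25.0, 10.0))  # participation index equal to 25/10
print(cragg_loglik(30.0, 1.0, 25.0, 10.0))  # a freely chosen participation index
```

When the participation index is set equal to the supply index divided by sigma, the two cumulative normal terms cancel and both functions return the same value; any other participation index lets participation and supply respond differently to the same variables.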
Tobin’s model (1.34.1) and (1.34.2) is a particular form of (1.35.1) and (1.35.2) if $Z_{1t} = Z_{2t}$ and $\beta = \gamma/\sigma$.

In Chapter VI, I estimate and compare Tobin’s and Cragg’s models for the husband and for the wife, and test for differences in the parameters of labour supply and labour force participation.

4.2 Unconditional Labour Supply Functions: A Simulation

In Chapter IV, I estimate the probabilities of a family choosing among its labour force participation alternatives given a set of independent variables $Z_{1t}$. This is done using either the simultaneous or the sequential model. In Chapter V, I estimate labour supply functions corresponding to the labour force participation alternatives. For the alternative “both working” I estimate a supply function separately for the husband and for the wife; for the alternatives “husband only working” and “wife only working”, I estimate a supply function for respectively the husband and the wife.

Again, the dependent variable labour supply can only take non-negative values. To assure this restriction one can either use a truncated regression model (such as equation (1.35.2)) or one can assume (as I will do) that the logarithm of $R_t$ is normally distributed. The general form for the labour supply functions is then:

(1.36) $\ln R_{it} = Z_t'\gamma_i + \varepsilon_{it}$, $t = 1, \ldots, T$; $i = 1, \ldots, M$

where $\varepsilon_{it}$ is normally and independently distributed with mean zero and variance $\sigma_i^2$, and $R_{it}$ is the amount supplied (by either husband or wife) when alternative $a_i$ is selected.

In order to derive unconditional supply functions define $P(a_{it} \mid \omega_t)$ as the probability that family $t$ will choose labour force participation alternative $a_i$ given wage rate $\omega_t$.
Also define $E(R_t \mid \omega_t, a_{it})$ as the expected amount of hours supplied given $\omega_t$ and given that alternative $a_i$ is chosen, i.e.,

(1.37) $E(R_t \mid \omega_t, a_{it}) = \exp\{Z_t'\gamma_i + \tfrac{1}{2}\sigma_i^2\}$ ²⁰

Then an unconditional labour supply function for either husband or wife can be determined as an expected value, i.e.,

(1.38) $E(R_t) = \sum_{i=1}^{4} P(a_{it} \mid \omega_t)\, E(R_t \mid \omega_t, a_{it})$

In the second part of Chapter VI I use definition (1.38) to simulate unconditional labour supply functions for the sample of families under investigation. I will also simulate the effects of changes in “income” by redefining (1.38) with income instead of wage rate as the independent variable.

5. The Problem of Unobserved Wage Rates

The demand and/or supply functions (equations (1.8.1) to (1.8.3)), derived in the case of an interior solution to the utility maximization problem, are defined as functions of the market wages and income.²¹ These variables are observed for labour force participants so that they can be used in the supply equations. There exists, however, a serious problem for the estimation of labour force participation functions.

Utility maximization suggests that, given non-employment income, the labour force participation decision depends on the difference between the shadow wage and the market wage (equation (1.7)). The shadow wage is determined by household technology and by the preference structure for household commodities (see Section 3.1 above). The shadow wage is unobservable but I assume that it is functionally related to certain socio-demographic variables, e.g., age, education, age of children, etc. (see Section 3.3.A above).²² The market wage is unobserved for non-participants. Since the market wage is, however, observed for labour force participants, it seems appropriate to use this additional information on part of the population to predict the missing information for the other part of the population. This procedure will be “correct” only if both participants and non-participants are structurally similar.
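The imputation idea just described can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the wage equation estimated later in the thesis: it uses a single regressor, the variable names and all numbers are hypothetical, and it simply assumes that the relationship fitted on participants carries over to non-participants.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical participants: years of schooling and observed hourly wage.
schooling_participants = [8, 12, 16]
wage_participants = [2.1, 3.0, 3.9]

intercept, slope = fit_line(schooling_participants, wage_participants)

# Hypothetical non-participants: schooling is observed, the wage is not.
schooling_nonparticipants = [10, 14]
predicted_wages = [intercept + slope * s for s in schooling_nonparticipants]
print(intercept, slope, predicted_wages)
```

If the two groups differ structurally, predictions of this kind are biased however well the line fits the participants.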
I discuss the rather difficult issue of choosing a satisfactory predictor in Chapter III.

The labour force participation decision in Chapter IV will then be defined as a function of the predicted market wage rate(s),²³ non-employment income, and a set of socio-demographic variables.

For reasons of comparability between labour force participation and labour supply decisions I also use predicted wage rates in the specification of the supply functions in Chapter V. In the same Chapter I also estimate supply functions using the observed wage rate and compare the supply and income elasticities for both specifications.

The data used in all empirical applications of this thesis are from the “Panel Study of Income Dynamics” [1972] from the Institute for Social Research, Survey Research Center at the University of Michigan. These data are described in Appendix A. The variables which are used in this thesis are defined in Appendix B.

FOOTNOTES TO CHAPTER II

1 To cite a few examples: Henderson and Quandt [1971, 29], Green [1971, 71-74], also Becker [1965].

2 In the literature on multiple jobholding (e.g., Bronfenbrenner and Mossin [1967], Moses [1962], Penman [1966]) a distinction is made between individuals who become overemployed and those who become underemployed because of the standard work week constraint.

3 See Weiss [1972], Heckman [1972] for a discussion of these problems in the context of lifetime labour supply. Smith [1972] discusses possible predictions of dynamic utility theory for lifetime wage and labour supply profiles.

4 More specifically, see Uhler and Kunin [1972]. Also see Block and Heineke [1973] for a treatment of labour supply using a von Neumann-Morgenstern approach.

5 Supply functions can easily be obtained using $H - R_i - L_i = 0$, $i = m, f$.

6 See Goldberger [1967, 101-104].

7 See Wales [1973], Wales and Woodland [1974a, 1974b], Ashenfelter and Heckman [1974].

8 This pragmatic approach was developed by Kosters [1963].
It is used by almost all the authors in a recent volume of labour supply studies edited by Cain and Watts [1973]. See also Cohen, Lerman, and Rea [1970].

9 See for instance, the models in the volume “New Approaches to Fertility”, Journal of Political Economy, 81, 2, Part II, March/April, 1973. For a critical evaluation of this approach see Pollak and Wachter [1975].

10 See McFadden [1974, 307-312].

11 Goldberger [1964, 248-251]. One can, however, conveniently approach the linear probability model as a linear discriminant function. See Ladd [1966]. A technique to circumvent the problems of the linear probability model is to use “discretised” independent variables. See Bowen and Finegan [1969], Cohen, Lerman, and Rea [1970]. If the variables were initially continuous then this procedure entails a loss of information.

12 It is also frequently assumed that $\varepsilon$ is unit normally distributed, which then leads to the well known probit model. See Buse [1972] for a discussion of these and other models. The extension of the probit model for choices among more than two alternatives, although feasible, is less practical computationally than the extension of the logit model.

13 The way in which I derive the multinomial logit model is completely analogous to Luce and Suppes [1965, Chapter V] and McFadden [1974]. I avoid calling $z(a)$ a “random utility function” as they do because of the potential confusion in nomenclature.

14 Cfr McFadden [1974, 111].

15 Cfr Theorem 32 in Luce and Suppes [1965, Chapter V].

16 Cfr Theil [1969, 254].

17 Cfr McFadden [1974, 109-110], also Luce [1959, Chapter I].

18 Both Tobin’s and Cragg’s models use the probit specification to express the probability of an alternative. Their models are developed for two alternative choices only. It is possible to formulate an extension to the multinomial logit model that allows for a test on whether the parameters of labour supply and labour force participation are the same. See the footnote on page 832 in Cragg’s [1971] article.
I choose, however, to use the limited dependent variable models because computer programs were readily available for these tests.

19 E.g., $C(z) = \displaystyle\int_{-\infty}^{z} (2\pi)^{-1/2} \exp\{-t^2/2\}\, dt$.

20 Let $R$ be the vector of observations on the dependent variable (hours supplied in this case) and let $y = \ln R$ be the corresponding vector in the log of this variable. If it is furthermore assumed that

(1) $y_t = Z_t'\beta + \mu_t$, $t = 1, \ldots, T$

with $\mu_t$ normally and independently distributed with mean zero and variance $\sigma^2$, then it follows that $y$ is normally distributed with mean $Z\beta$ and variance-covariance $\sigma^2 I$, and $R$ is lognormally distributed with the same parameters. The moments of the lognormal distribution are given by the following formula (e.g., see Press [1972, 139-140]):

(2) $E(R_t^k) = \exp\{k Z_t'\beta + \tfrac{1}{2} k^2 \sigma^2\}$

so that the mean of $R_t$ would be

(3) $E(R_t) = \exp\{Z_t'\beta + \tfrac{1}{2}\sigma^2\}$

In practice, however, $\beta$ and $\sigma^2$ are unknown and in that case the distribution of $y$ depends on the distribution of the estimators of $\beta$ and $\sigma^2$ (e.g., see Raiffa and Schlaifer [1961, Chapter 13]). For reasons of convenience I replace, throughout this thesis, the theoretical $\beta$ vector in (3) with the $b$ vector of least squares coefficient estimates and the theoretical $\sigma^2$ value with the least squares variance estimator.

21 It would be more correct to say that labour supply is a function of income and of market wages corrected for the marginal tax rate. See, e.g., Wales [1973], Wales and Woodland [1974a]. I will introduce the effect of income tax considerations on labour supply in Chapter V.

22 Gronau [1973], Heckman [1974] have attempted to estimate the shadow wage as a function of similar socio-demographic variables.

23 Again it would be theoretically more appropriate to correct the wage rate for the marginal tax rate.
However, since the predicted wage rate, which I have to use in this case, is only a crude indicator (see Chapter III, Section 3.3) of the potential wage offer that a non-labour force participant might receive, I do not pursue it further. In the case where I use the observed wage I introduce the marginal tax rate adjustment (see Chapter V).

APPENDIX TO CHAPTER II

The Relationship Between the Simultaneous and the Sequential Model

To illustrate the relationship between the sequential and the simultaneous labour force participation model for a family I first introduce some new notation. I use the indices A, B, C, D for respectively the alternatives: both working, husband only working, wife only working and none working in the simultaneous model. For instance, the probability of “both working”, using the logit specification (1.26), will be

(A.1) $p(A) = \dfrac{e^{Z'\beta_A}}{e^{Z'\beta_A} + e^{Z'\beta_B} + e^{Z'\beta_C} + e^{Z'\beta_D}}$

For the sequential model, $p(1)$ and $p(2)$ denote the probability of the husband being respectively in or out of the labour force. Further, $p(11 \mid 1)$ and $p(12 \mid 1)$ denote respectively the probability of the wife being in or out of the labour force, given that her husband is in the labour force. By analogy I define $p(21 \mid 2)$ and $p(22 \mid 2)$ when the husband is out of the labour force.
Therefore the probability of “both working” in the sequential model, again using the logit specification, will be

(A.2) $p(11 \mid 1)\, p(1) = \dfrac{e^{Z'\beta_{11}}\, e^{Z'\beta_1}}{(e^{Z'\beta_{11}} + e^{Z'\beta_{12}})(e^{Z'\beta_1} + e^{Z'\beta_2})}$

(A.2') $p(11 \mid 1)\, p(1) = \dfrac{e^{Z'(\beta_{11} + \beta_1)}}{e^{Z'(\beta_{11} + \beta_1)} + e^{Z'(\beta_{11} + \beta_2)} + e^{Z'(\beta_{12} + \beta_1)} + e^{Z'(\beta_{12} + \beta_2)}}$

Now I investigate under what conditions the following equalities hold:

(A.3.1) $p(A) = p(11 \mid 1)\, p(1)$
(A.3.2) $p(B) = p(12 \mid 1)\, p(1)$
(A.3.3) $p(C) = p(21 \mid 2)\, p(2)$
(A.3.4) $p(D) = p(22 \mid 2)\, p(2)$

To do this I rewrite the logit specification in (A.2') using the fact that in the binomial logit model the coefficients for the two alternatives are equal to each other except for the sign (see (1.17.1) and (1.17.2)), i.e.,

(A.4.1) $\beta_{12} = -\beta_{11}$
(A.4.2) $\beta_2 = -\beta_1$

Thus (A.2') becomes

(A.5.1) $p(11 \mid 1)\, p(1) = \dfrac{e^{Z'(\beta_{11} + \beta_1)}}{e^{Z'(\beta_{11} + \beta_1)} + e^{Z'(\beta_{11} - \beta_1)} + e^{Z'(-\beta_{11} + \beta_1)} + e^{Z'(-\beta_{11} - \beta_1)}}$

Similarly for the other three probabilities of the sequential model:

(A.5.2) $p(12 \mid 1)\, p(1) = \dfrac{e^{Z'(-\beta_{11} + \beta_1)}}{e^{Z'(\beta_{11} + \beta_1)} + e^{Z'(\beta_{11} - \beta_1)} + e^{Z'(-\beta_{11} + \beta_1)} + e^{Z'(-\beta_{11} - \beta_1)}}$

(A.5.3) $p(21 \mid 2)\, p(2) = \dfrac{e^{Z'(\beta_{21} - \beta_1)}}{e^{Z'(\beta_{21} + \beta_1)} + e^{Z'(\beta_{21} - \beta_1)} + e^{Z'(-\beta_{21} + \beta_1)} + e^{Z'(-\beta_{21} - \beta_1)}}$

(A.5.4) $p(22 \mid 2)\, p(2) = \dfrac{e^{Z'(-\beta_{21} - \beta_1)}}{e^{Z'(\beta_{21} + \beta_1)} + e^{Z'(\beta_{21} - \beta_1)} + e^{Z'(-\beta_{21} + \beta_1)} + e^{Z'(-\beta_{21} - \beta_1)}}$

Note that I use the equality $\beta_{22} = -\beta_{21}$ in (A.5.3) and (A.5.4). If one writes out the logit specifications for $p(A)$, $p(B)$, $p(C)$, and $p(D)$ similar to (A.1) it will easily be seen that a sufficient condition for (A.3.1) to (A.3.4) to hold is that

(A.6) $\beta_{11} = \beta_{21}$

Note that (A.6) does not follow from the independence of irrelevant alternatives axiom (1.31). (A.6) is a much more restrictive assumption. Using (A.6) it follows that

(A.7.1) $Z'\beta_A = Z'(\beta_{11} + \beta_1)$
(A.7.2) $Z'\beta_B = Z'(-\beta_{11} + \beta_1)$
(A.7.3) $Z'\beta_C = Z'(\beta_{11} - \beta_1)$
(A.7.4) $Z'\beta_D = Z'(-\beta_{11} - \beta_1)$

(A.6) implies that the parameters of the labour force participation choice of the wife are the same whether her husband works or does not work. It also implies that the other sequential model, where the wife decides first about her labour force participation and then her husband, is the same as the sequential model with the husband first and the wife second. This demonstrates analytically that the sequential model and the simultaneous model provide different hypotheses about family labour force participation behaviour.

CHAPTER III

PREDICTING POTENTIAL MARKET WAGE RATES

1. Introduction

In order to predict the potential market wage obtained by non-labour force participants if they were to enter the labour force, I use a least squares prediction technique.¹ This technique exploits the empirical relationship existing in the sample of labour force participants between the observed wage rate and certain observed socio-demographic characteristics. I assume that the same empirical relationship holds for the sample of non-participants in such a way that I can use the relevant socio-demographic characteristics observed for this population to predict the unobserved wage rate.

The assumption that the same systematic relationship holds for both the sample of participants and non-participants is crucial in order to justify this prediction technique and therefore requires some attention. Before discussing the issue of structural differences I elaborate first on the choice of “relevant” socio-demographic variables to be included in the wage equation.

1.1 Specification Problems

A central feature of some recent studies of wage and income differentials is the concept of rate of return on human capital investment.² In general terms human capital theory predicts that costs incurred because of investments (e.g., in institutional schooling or in on-the-job training, etc.) should be compensated for by relatively higher wage rates after the investment. Therefore variables measuring “schooling” and “experience” become important variables in wage regressions. But very frequently other variables such as sex, race, union membership, occupation are also included (Hill [1959], Adams [1958], Hall [1973], Berndt and Wales [1974a]).
In what follows I will do the same and will not restrict the specification of the wage equations to only those independent variables suggested by “human capital investment” theory, but will also test for the significance of a set of other variables which I think to be relevant in explaining variations in the observed wage rate.

I estimate wage equations separately for husbands and wives. The specification, however, differs between these two equations because the amount of information available in the sample is much smaller for the wives than for the husbands (from whom the interview was taken). These informational gaps could cause estimation problems. If the variables missing in the specification of the wife’s wage equation are significantly related to the dependent variable and are furthermore correlated with at least one of the included variables, then the coefficients of the included variables will be biased.³ Clearly, some of the coefficients of the husband’s equations can also be biased as one is never sure of the exact specification. However, I presume that the danger of biased coefficients is greater for the wife’s regression as there is less information available. Some indication of the bias in the estimated coefficients of the wife’s wage equation, caused by the lack of information, can be obtained by comparison with the husband’s coefficients. This comparison is clearly only reliable if it can be assumed that the husband’s and wife’s equations are structurally the same.

1.2 Prediction Problems

As mentioned above, the appropriateness of using a least squares predictor for the unobserved wage offer depends crucially on the assumption that the sample of non-participants is structurally the same as the sample of participants. Two potential situations could invalidate this assumption:

(A) The corner solution inequality (equation (1.7), Chapter II) indicates that for non-participants the shadow wage is greater than the market wage.
The shadow wage was furthermore found to be equal to the value of the marginal product in home production, which in turn is a function of household productivity and of tastes for household commodities (Chapter II, Section 3.1). If it could be assumed, for instance, that the value of the marginal product does not differ between the sample of participants and non-participants, then it would follow from the corner solution inequality that participants are, on average, higher market wage earners than non-participants. There is no firm theoretical expectation for this to hold; it is, however, empirically possible. If it were true, the two samples could be structurally different and thus invalidate the proposed prediction technique.

(B) A second reason for structural difference can be deduced from the fact that the market wage is affected by experience. Therefore individuals who specialize in home production for a substantial length of time (which seems to be true for a portion of the non-participants at any point in time) are presumably depreciating their market skills. If this is empirically true, then the wage equation for participants would tend to overpredict the potential wage offer of the non-participants to this extent.

Because structural differences between the samples of participants and non-participants are empirically possible, the prediction technique might not be appropriate. This important matter will be discussed further in Section 3 of this Chapter. First, I will describe the process of obtaining predictors for the husband's wage (Section 2.2) and for the wife's wage (Section 2.3).

2. The Wage Equations

2.1 Description of the Sample

The sample used for the wage equation is a subset of the original sample as described in Appendix A. This subset is obtained by restricting the sample to (i) families with both husband and wife present, either of whom must have worked at least once in the sample period (1967-71); (ii) families where husband and wife were the
This is introduced to make the education variable,observed only in 1967 and 1971, useful over the whole sample period;(iii) observations where none of the variables, to be used in the wageequations, is missing.4The numerical importance of each restriction clearly dependson the order in Which they are introduced. In the order mentioned abovethey cause, respectively 29, 45,5 and 4 percent to be dropped. Thisleaves 22 percent or l123families observed for each of the five yearsbetween 1967 and 1971. If all the observations for which the husband isfound working are pooled over all five years, the total sample size is5076. The similar pooled sample size for the wives is 2880 observations.Most empirical wage and income distribution studies6 assertthat income or wages tend to be lognormally distributed. In order to determine whether the observed wage distribution for husbands and wivesfits the lognormal distribution better than the normal distribution, Icalculate a x2 and a Kolmogorov-Smirnov goodness of fit test statistic8comparing the observed wage distribution with a theoretically expectedlognornial or normal distribution.Table I summarizes these tests for the husbands’ and wives’wage distribution (using five intervals only for the wage distribution).Neither the x2 test nor the Kolmogorov-Smirnov test allows me to acceptthe null hypothesis that the husbands’ and wives’ wage distribution iseither normally or lognormally distributed. 
However, comparing the values of the two test statistics for the normal and the lognormal indicates that the observed wage distributions fit the lognormal better than the normal (certainly for the wives' wage distribution).

TABLE I - GOODNESS OF FIT TEST ON WAGE DISTRIBUTIONS

                               HUSBAND                               WIFE
                                   EXPECTED                              EXPECTED
INTERVAL(2)           OBSERVED   NORMAL   LOGNORMAL         OBSERVED   NORMAL   LOGNORMAL

σ²(3)                            2.4      1.4                          1.8      1.5

(0.0-2.5)             1809       1733     2046              2144       1709     2089
[2.5-5.0)             2467       2024     2076              638        1031     645
[5.0-7.5)             589        1091     645               63         137      110
[7.5-10.0)            125        213      198               17         3        25
[10.0-∞)              86         15       111               18         0        11

χ²                               very     163.5                        very     32.5
                                 large                                 large
degrees of freedom
Kolmogorov-Smirnov(1)            .10      .05                          .15      .02
Sample size                      5076     5076                         2880     2880

(1) accept H0 at .05 significance level.⁹
(2) an open interval is denoted "(", a closed "[".
(3) μ indicates the value of the mean and σ² the value of the variance used in calculating the expected frequency distribution. These values are derived from the observed distribution. The values of μ legible in the source are, in printed order, 2.5, 2.1 and 1.2; the fourth value and the degrees of freedom entries are not legible.

It should be emphasized that even though the observed wage rate tends to be lognormally rather than normally distributed, this is not sufficient information to conclude that the basic assumptions of the linear regression model⁷ are more closely approximated with a semi-logarithmic model such as

(2.1.1)    $w_t = e^{X_t \beta_1} e^{\varepsilon_{1t}}$

than with an arithmetic model such as

(2.1.2)    $w_t = X_t \beta_2 + \varepsilon_{2t}$

Whether (2.1.1) is more appropriate than (2.1.2) depends on the conditional distribution of the dependent variable given the set of independent variables. The results on the distribution of the observed wage rate suggest that it would be of interest to estimate and compare the wage equations using both the semi-logarithmic and arithmetic specification.

2.2 The Husband's Wage Equations

The regression results on the husband's wage equation are presented in Table II for the arithmetic wage rate as dependent variable and in Table III for the semi-logarithmic wage rate.
The definition of each variable is given in Appendix B.

TABLE II - HUSBAND WAGE RATE EQUATIONS, INDIVIDUAL YEARS AND POOLED (ARITHMETIC SPECIFICATION)(1)

                   1967        1968        1969        1970        1971        POOL

CONSTANT           -2.75       -2.40       -2.84       -2.49       -3.91       -2.99
                   (.81)       (.82)       (.85)       (.95)       (1.20)      (.40)
GRADE 9/12         .51         .35         .32         .34         .24(n.s.)   .35
                   (.16)       (.16)       (.16)       (.17)       (.21)       (.08)
TECH               .80         .38(n.s.)   .53         .51         .41(n.s.)   .54
                   (.23)       (.23)       (.23)       (.25)       (.30)       (.11)
COLL               1.01        .86         .74         1.12        .86         .92
                   (.24)       (.23)       (.23)       (.25)       (.30)       (.11)
BA                 2.02        1.82        1.96        2.03        1.87        1.94
                   (.32)       (.31)       (.30)       (.33)       (.38)       (.14)
PHD                2.78        2.68        2.95        3.30        3.31        3.00
                   (.38)       (.37)       (.36)       (.38)       (.45)       (.17)
ACHIEVE            .01(n.s.)   .05         .03(n.s.)   .02(n.s.)   .04(n.s.)   .03
                   (.02)       (.02)       (.02)       (.02)       (.03)       (.01)
RACE               -.28(*)     -.25(n.s.)  -.21(n.s.)  -.22(n.s.)  -.08(n.s.)  -.20
                   (.17)       (.16)       (.16)       (.17)       (.21)       (.08)
IQ                 .10         .09         .09         .08         .15         .10
                   (.03)       (.03)       (.03)       (.03)       (.04)       (.01)
CATH               .14(n.s.)   .12(n.s.)   .26(*)      .07(n.s.)   .16(n.s.)   .14
                   (.15)       (.81)       (.15)       (.16)       (.20)       (.07)
JEW                -.04(n.s.)  -.38(n.s.)  .86         .45         .07(n.s.)   .36(*)
                   (.40)       (.39)       (.39)       (1.03)      (.51)       (.19)
AGE                .12         .10         .14         .14         .16         .14
                   (.03)       (.03)       (.03)       (.04)       (.05)       (.02)
AGE SQ             -.0013      -.0010      -.0015      -.0015      -.0018      -.0015
                   (.0004)     (.0004)     (.0004)     (.0004)     (.0005)     (.0002)
RISKAVOID          .10         .11         .11         .13         .17         .13
                   (.04)       (.04)       (.04)       (.04)       (.05)       (.02)
URBAN              .30         .29         .16(n.s.)   .28         .45         .29
                   (.12)       (.12)       (.12)       (.13)       (.16)       (.06)
SOUTH              -.24(*)     -.36        -.33        -.53        -.42        -.37
                   (.14)       (.13)       (.13)       (.14)       (.17)       (.06)
4-9 YRS ON JOB     .05(n.s.)   .31         .23(n.s.)   .05(n.s.)   .33(*)      .21
                   (.16)       (.14)       (.14)       (.16)       (.19)       (.07)
10-19 YRS ON JOB   .35         .53         .40         .39         .65         .45
                   (.16)       (.16)       (.16)       (.18)       (.22)       (.08)
>20 YRS ON JOB     .41         .67         .45         .56         .73         .57
                   (.21)       (.19)       (.19)       (.20)       (.24)       (.09)
MANAG              1.29        1.84        1.41        1.77        1.42        1.55
                   (.28)       (.26)       (.25)       (.26)       (.32)       (.12)
PROF               1.10        1.20        1.27        1.27        .71         1.09
                   (.20)       (.30)       (.29)       (.30)       (.35)       (.14)
SKILL              .95         1.03        .88         .66         .87         .91
                   (.22)       (.21)       (.21)       (.27)       (.27)       (.10)
CLERK              .52         1.15        .83         .94         .47         .70
                   (.26)       (.25)       (.25)       (.23)       (.32)       (.12)
SEMISKILL          .42(*)      .81         .41(*)      .48         .57         .51
                   (.24)       (.23)       (.22)       (.23)       (.29)       (.11)
UNSKILL            .48(*)      .58         .28(n.s.)   .41(n.s.)   .14(n.s.)   .35
                   (.26)       (.25)       (.25)       (.27)       (.32)       (.12)
SECOND JOB         -.20(n.s.)  -.46        -.33        -.52        -.46        -.38
                   (.14)       (.15)       (.16)       (.18)       (.20)       (.07)
UNION              .63         .57         .66         .61         .51         .60
                   (.15)       (.14)       (.14)       (.15)       (.12)       (.06)

# OBS(2)           1039        1034        1023        992         988         5076
R²(3)              .354        .379        .397        .409        .325        .366
adjusted R²(4)     .338        .363        .381        .393        .307        .363
SE(5)              1.8         1.8         1.8         1.9         2.3         1.9
ȳ(6)               3.17        3.36        3.49        3.62        3.74        3.47
σ_y(7)             2.22        2.21        2.24        2.43        2.72        2.37

(1) The numbers in brackets are the standard errors. The coefficients are all significant at the 5% level unless followed by (*), which indicates significance at the 10% level. If a coefficient is followed by (n.s.) this denotes "not significant". Variables are explained in Appendix B.
(2) # OBS: number of observations
(3) R²: coefficient of multiple correlation
(4) adjusted R²
(5) SE: standard error of the regression
(6) Mean of dependent variable
(7) Standard deviation of dependent variable

TABLE III - HUSBAND WAGE RATE EQUATIONS, INDIVIDUAL YEARS AND POOLED (SEMI-LOGARITHMIC SPECIFICATION)(1)

                   1967        1968        1969        1970        1971        POOL

CONSTANT           -1.30       -.75        -.93        -.86        -1.15       -1.07
                   (.20)       (.19)       (.22)       (.21)       (.23)       (.09)
GRADE 9/12         .19         .16         .16         .13         .11         .15
                   (.04)       (.04)       (.04)       (.04)       (.04)       (.02)
TECH               .25         .15         .19         .18         .14         .18
                   (.06)       (.05)       (.06)       (.06)       (.06)       (.03)
COLL               .28         .24         .24         .26         .22         .25
                   (.06)       (.05)       (.06)       (.06)       (.06)       (.03)
BA                 .54         .48         .52         .49         .43         .49
                   (.08)       (.07)       (.08)       (.07)       (.07)       (.03)
PHD                .60         .58         .67         .64         .69         .63
                   (.09)       (.09)       (.09)       (.09)       (.09)       (.04)
ACHIEVE            .01         .01         .01         .01         .01         .01
                   (.005)      (.005)      (.006)      (.005)      (.006)      (.002)
RACE               -.17        -.11        -.14        -.12        -.07(*)     -.12
                   (.04)       (.04)       (.04)       (.04)       (.04)       (.02)
IQ                 .03         .02         .02         .02         .03         .03
                   (.007)      (.007)      (.008)      (.007)      (.008)      (.003)
CATH               .06(n.s.)   .03(n.s.)   .05(n.s.)   .01(n.s.)   .05(n.s.)   .04
                   (.04)       (.03)       (.04)       (.03)       (.04)       (.02)
JEW                -.006(n.s.) .10(n.s.)   .18(*)      .04(n.s.)   -.09(n.s.)  .05(n.s.)
                   (.10)       (.09)       (.10)       (.10)       (.10)       (.04)
AGE                .05         .03         .04         .04         .05         .04
                   (.008)      (.008)      (.009)      (.008)      (.009)      (.003)
AGE SQ             -.0006      -.0004      -.0005      -.0005      -.0006      -.0005
                   (.0001)     (.0001)     (.0001)     (.0001)     (.0001)     (.00004)
RISKAVOID          .05         .05         .05         .05         .05         .05
                   (.01)       (.01)       (.01)       (.01)       (.01)       (.004)
URBAN              .12         .11         .07         .13         .15         .12
                   (.03)       (.03)       (.03)       (.03)       (.03)       (.01)
SOUTH              -.09        -.12        -.13        -.16        -.10        -.12
                   (.03)       (.03)       (.03)       (.03)       (.03)       (.01)
4-9 YRS ON JOB     .09         .10         .09         .03(n.s.)   .10         .09
                   (.04)       (.03)       (.04)       (.04)       (.04)       (.02)
10-19 YRS ON JOB   .15         .15         .09         .10         .17         .13
                   (.04)       (.04)       (.04)       (.04)       (.04)       (.02)
>20 YRS ON JOB     .21         .43         .09(*)      .14         .17         .15
                   (.05)       (.07)       (.05)       (.04)       (.05)       (.02)
MANAG              .57         .56         .57         .63         .63         .60
                   (.07)       (.06)       (.06)       (.06)       (.06)       (.03)
PROF               .53         .43         .50         .49         .42         .48
                   (.07)       (.07)       (.07)       (.07)       (.07)       (.03)
SKILL              .46         .38         .48         .46         .50         .46
                   (.05)       (.05)       (.05)       (.05)       (.05)       (.02)
CLERK              .36         .35         .46         .33         .40         .39
                   (.06)       (.06)       (.06)       (.06)       (.06)       (.03)
SEMISKILL          .29         .30         .33         .31         .40         .33
                   (.06)       (.05)       (.06)       (.05)       (.06)       (.02)
UNSKILL            .32         .18         .27         .27         .21         .26
                   (.06)       (.06)       (.06)       (.06)       (.06)       (.03)
SECOND JOB         -.06        -.12        -.08(*)     [...]       -.09        -.09
                   (.04)       (.03)       (.04)       (.04)       (.04)       (.02)
UNION              .29         .26         .27         .24         .22         .26
                   (.04)       (.03)       (.03)       (.03)       (.04)       (.02)

# OBS              1039        1034        1023        992         988         5076
R²                 .531        .523        .484        .528        .509        .510
adjusted R²        .518        .510        .470        .515        .496        .507
SE                 .45         .40         .45         .42         .43         .43
ȳ                  .967        1.05        1.08        1.11        1.14        1.07
σ_y                .64         .57         .61         .61         .61         .61

(1) Same comments as Table II

In order to obtain a notion of the importance of certain groups of variables I use Theil's decomposition of the multiple correlation coefficient. The incremental contribution of each independent variable¹⁰ is then consolidated into groups of selected independent variables. This is shown in Table IV. There it can be seen that "education" and "occupation" contribute most to the explanation of the wage rate. It can be seen that the higher R² in the semi-logarithmic wage rate regression is explained, for an important part, by the more efficient (i.e., relatively smaller standard errors) estimation of the occupational variables.

TABLE IV - THEIL'S DECOMPOSITION OF R²(*) APPLIED TO THE HUSBAND'S POOLED WAGE EQUATIONS

                        ARITHMETIC SPECIFICATION     LOGARITHMIC SPECIFICATION
                        ABSOLUTE     RELATIVE        ABSOLUTE     RELATIVE

(1) CONSTANT            .0069        2.              .0133        3.
(2) EDUCATION           .0746        20.             .0688        13.
(3) AGE                 .0174        5.              .0306        6.  }
(4) YRS ON JOB          .0104        4.              .0146        3.  } 9
(5) OCCUPATION          .0469        13.             .1501        29.
(6) UNION               .0101        3.              .0276        5.
(7) SOCIO-ECON          .0260        7.              .0444        9.
TOTAL                   .1923        54.             .3494        68.
MULTICOLLINEARITY
EFFECT                  .1735        46.             .1604        32.
R²                      .3658        100.            .5098        100.

(*) cfr. footnote (10) or Theil [1971, 181]. Each class is defined as the sum of the marginal contribution of the variables mentioned.
(2) education = grade 9/12 + tech + coll + ba + phd
(3) age = age + age sq
(4) yrs on job = 4-9 yrs on job + 10-19 yrs on job + >20 yrs on job
(5) occupation = manag + prof + skill + clerk + semiskill + unskill
(7) socio-econ = achieve + race + iq + cath + jew + riskavoid + urban + south + second + union
(1), (6) correspond to the same variable

I now discuss the independent variables in more detail. This discussion relates to the empirical wage regression literature. It is somewhat outside the main line of argument of this study. Therefore the rest of this section can be omitted without loss of continuity.

The coefficients of the education variables correspond to the predictions of human capital investment theory and can be interpreted as returns on investment in schooling. Such an interpretation is fully justified if education is specified continuously in terms of years of schooling. When the semi-logarithmic wage equation is used, one can interpret the coefficients of a YRS SCHOOL variable¹¹ as measuring the internal rates of return (Mincer [1970, 1974]). I use the same specification as in Table III, but substitute the YRS SCHOOL and YRS SCHOOL squared variables for the education dummies. This gives the following partial results
This givesthe following partial results(2.2) ln w = -1.13 - .0119 YRS SCHOOL + .0026(YRS SCHOOL)2(.006) (.0003)- 60These parameter estimates imply internal rates of returngoing from .4 percent for individuals having finished only fivegrades to 8.2 percent for Ph.D.’s.12Recent human capital studies (e.g., Taubrnan and Wales [1973],Griliches and Mason [1972]) have stressed the importance of controllingfor ability on the estimation of educational coefficients. In thesewage equations I try to control for ability using the IQ variable.When I leave out the IQ variable each education coefficient would ingeneral, increase with a factor equal to at least one standard error.This implies an increasing bias for higher educational levels. Thispositive bias is caused by the positive effect of ability on boththe wage rate and the educational level.13 A similar bias is presumablyalso present for the wife’s equation, because no measure of her abilityis available (see Section 2.3).Another human capital investment variable is post-schoolinginvestment. This is usually measured by age. Theoretical humancapital investment models predict concave earnings profiles (Ben Porath[1967, 1970], Rosen [1972]).14 The significance of the squared ageterm supports this prediction. The age-wage profile reaches itsmaximum at about 45 years for both the pooled arithmetic and semilogarithmic specification (respectively, 45.3 and 44.5). For theindividual years, however, the arithmetic specification tends topredict a turning point at a later age than the logarithmic specification. The human capital literature also predicts different age-wage- 61 -profiles for different educational levels (Becker [1964]).If one introduces age-education interaction terms into theregression one observes only a significant interaction for the BA andthe Ph.D. level with AGE and AGESQ. Only for husbands with a Ph.D. isthere evidence that the wage profile peaks at a much later age (around50 years) than other husbands. 
The partial results of the regressions, specified as in Tables II and III but including the age-education interaction terms, are

(2.3.1)    $\text{WAGE} = \underset{(.4)}{-1.9} + \underset{(.02)}{.093}\,\text{AGE} - \underset{(.0002)}{.001}\,\text{AGESQ} + \underset{(.07)}{.301}\,\text{AGE}\cdot\text{BA} - \underset{(.0007)}{.0032}\,\text{AGESQ}\cdot\text{BA} + \underset{(.06)}{.465}\,\text{AGE}\cdot\text{PHD} - \underset{(.0006)}{.004}\,\text{AGESQ}\cdot\text{PHD}$

(2.3.2)    $\ln \text{WAGE} = \underset{(.09)}{-.95} + \underset{(.003)}{.040}\,\text{AGE} - \underset{(.00005)}{.0005}\,\text{AGESQ} + \underset{(.01)}{.044}\,\text{AGE}\cdot\text{BA} - \underset{(.0002)}{.0004}\,\text{AGESQ}\cdot\text{BA} + \underset{(.015)}{.0602}\,\text{AGE}\cdot\text{PHD} - \underset{(.0001)}{.0005}\,\text{AGESQ}\cdot\text{PHD}$

Other variables capturing returns on post-educational human investment are the experience variables measuring time on the present job. A significant increase in the wage rate is obtained for a husband having been on the same job more than five years and more than ten years. The increase for having more than twenty years experience at the same job is not significantly different from the increase already obtained after ten years. If I test the null hypothesis that these increments are the same, the calculated F-statistic equals 1.67 (arithmetic specification) or .22 (semi-logarithmic), whereas the critical value at the 5 percent level is 3.84.

A very important set of variables explaining the wage rate are the occupational dummies (especially in the semi-logarithmic specification). Although the occupational wage differences appear to correspond to a social status ordering (e.g., see Duncan et al. [1961]), it is interesting to test whether these differences are statistically significant. In Table V I bring together the results on the F-statistics for several restrictions imposed on the occupational dummy variables.
The restrictions are introduced by re-estimating the pooled wage equation as specified in Tables II and III but with restricted occupational groups consolidated into one occupational level.

TABLE V - F-STATISTICS FOR RESTRICTIONS IMPOSED ON OCCUPATIONAL GROUPS - HUSBAND'S POOLED WAGE EQUATIONS

                             Calculated F-Statistics
Restriction                  Arithmetic        Semi-logarithmic
                             Specification     Specification

PROF = SKILL                 1.94              .38
CLERK = SKILL                4.17              10.25
PROF = CLERK = SKILL         4.31              6.58
CLERK = SEMISKILL            2.78              4.91
SEMISKILL = UNSKILL          2.50              11.44

Critical value for F (1, 5076) = 3.84 (5 percent) or 6.64 (1 percent), for F (2, 5076) = 2.99 (5 percent) or 4.60 (1 percent).

The strong confirmation of the SKILL = PROF restriction is surprising.

Finally, the SECOND JOB variable merits some further explanation. The dependent variable is defined as the logarithm of the ratio of labour income over hours worked. The marginal wage, however, is not constant over the whole range of hours because it is an aggregate of income earned on standard time, overtime and on moonlighting jobs. Therefore the negative coefficient of the SECOND JOB variable indicates that moonlighters have lower average wages than non-moonlighters.

A test to determine whether the county unemployment rate (in the form of a set of dummies) had an influence on the male wage rate (in the years 1968 to 1971) does not lead to significant results. Another test for the years 1970-71 to study the effect of the industry (also in the form of dummies) on the wage rate gives a significant negative coefficient in the case of the agricultural industry only.

Guided by the goodness of fit tests discussed in the previous section one would expect that if the wage rate tends to be lognormally distributed then the arithmetic specification should be less adequate in explaining the right hand tail of the wage distribution.
A supporting indication of this is found in comparing the fit (i.e., the R²) of the arithmetic and semi-logarithmic specification using different truncation points of the right tail. Doing this, the R² increases substantially in the arithmetic case but remains virtually constant in the semi-logarithmic case, as shown in Table VI.

TABLE VI - R² FOR RESTRICTED SAMPLES - HUSBAND'S POOLED WAGE EQUATIONS

Restrictions on                              Arithmetic       Semi-logarithmic    # OBS
Pooled Sample                                Specification    Specification

All Observations                             .37              .51                 5076
Observations With Wage Rate < $25 Only       .45              .52                 5069
Observations With Wage Rate < $20 Only       .48              .52                 5063
Observations With Wage Rate < $15 Only       .50              .52                 5052

2.3 The Wife's Wage Equations

The results on the wage equations for the wife are presented in Table VII for the arithmetic specification and in Table VIII for the semi-logarithmic equation. Table IX summarizes the results on the incremental contributions of the independent variables. As can be seen, "education" and "occupation" (especially in the semi-logarithmic specification) contribute most to the fit.

TABLE VII - WIFE'S WAGE RATE EQUATIONS, INDIVIDUAL YEARS AND POOLED (ARITHMETIC SPECIFICATION)(1)

                   1967        1968        1969        1970        1971        POOL

CONSTANT           -1.24(*)    -.22(n.s.)  -.23(n.s.)  .49(n.s.)   -1.82(n.s.) -.76
                   (.71)       (.57)       (.64)       (.75)       (1.35)      (.36)
GRADE 12           .07(n.s.)   .20(n.s.)   .38(*)      .21(n.s.)   .24(n.s.)   .21
                   (.20)       (.15)       (.16)       (.17)       (.28)       (.09)
TECH               .27(n.s.)   .28(n.s.)   .19(n.s.)   .05(n.s.)   .70(*)      .31
                   (.28)       (.20)       (.22)       (.23)       (.39)       (.12)
COLL               .35(n.s.)   .42(*)      .57         .74         .88         .62
                   (.31)       (.21)       (.24)       (.2)        (.42)       (.13)
BA                 .74(*)      1.04        1.45        1.65        1.77        1.29
                   (.39)       (.27)       (.31)       (.32)       (.55)       (.16)
PHD                1.76        .90         1.39        1.00        3.29        1.64
                   (.56)       (.41)       (.47)       (.46)       (.78)       (.29)
ACHIEVEH           -.006(n.s.) .03(n.s.)   .005(n.s.)  .01(n.s.)   .05(n.s.)   .02(n.s.)
                   (.03)       (.02)       (.02)       (.02)       (.04)       (.01)
CATH               .47         .37         .18(n.s.)   .33         .25(n.s.)   .30
                   [...]       (.14)       (.15)       (.16)       (.26)       (.08)
AGE                .10         .03(n.s.)   .06         .02(n.s.)   .12         .07
                   (.03)       (.03)       (.03)       (.03)       (.06)       (.02)
AGE SQ             -.001       -.0001(n.s.) -.0006(*)  -.001(n.s.) -.0014      -.0007
                   (.0004)     (.0003)     (.0003)     (.0004)     (.0007)     (.0002)
RISKAVOIDH         .12         .08         .06(n.s.)   .02(n.s.)   -.02(n.s.)  .06
                   (.04)       (.03)       (.04)       (.04)       (.06)       (.02)
URBAN              -.007(n.s.) .07(n.s.)   .005(n.s.)  .22(*)      .41         .15
                   (.14)       (.09)       (.11)       (.11)       (.19)       (.06)
SOUTH              -.16(n.s.)  -.18(*)     -.29        -.26        -.22(n.s.)  -.22
                   (.14)       (.10)       (.11)       (.12)       (.20)       (.06)
PROF               1.42        1.37        1.67        1.55        1.77        1.59
                   (.28)       (.20)       (.23)       (.22)       (.39)       (.12)
MANAG              2.80        .21(n.s.)   1.02(*)     .68(n.s.)   .32(n.s.)   1.01
                   (.53)       (.48)       (.54)       (.46)       (.22)       (.25)
CLERK              .59         .67         .52         .52         .63         .57
                   (.18)       (.13)       (.14)       (.15)       (.25)       (.07)
SKILL              .43(n.s.)   .72(n.s.)   .55(n.s.)   .52(n.s.)   .09(n.s.)   .50
                   (.57)       (.45)       (.44)       (.38)       (1.10)      (.25)
SEMISKILL          .37(*)      .44         .44         .54         .67         .50
                   (.20)       (.14)       (.15)       (.17)       (.27)       (.08)

# OBS              532         580         611         583         574         2880
R²                 .277        .338        .328        .328        .220        .248
adjusted R²        .253        .318        .309        .307        .196        .254
SE                 1.47        1.12        1.28        1.30        2.17        1.52
ȳ                  1.90        1.97        2.10        2.16        2.28        2.09
σ_y                1.70        1.36        1.53        1.56        2.42        1.76

(1) Same comments as Table II. The variable names ending with the letter "H" refer to variables measured for the husband. An entry shown as [...] is not legible in the source.

TABLE VIII - WIFE'S WAGE RATE EQUATIONS, INDIVIDUAL YEARS AND POOLED (SEMI-LOGARITHMIC SPECIFICATION)(1)

                   1967        1968        1969        1970        1971        POOL

CONSTANT           -1.12       -.83        -.78        -.46(n.s.)  -1.24       -.96
                   (.26)       (.25)       (.27)       (.31)       (.33)       (.12)
GRADE 12           .11(n.s.)   .14         .14         .16         .20         .15
                   (.07)       (.06)       (.07)       (.07)       (.07)       (.03)
TECH               .15(n.s.)   .23         .11(n.s.)   .12(n.s.)   .34         .19
                   (.11)       (.09)       (.09)       (.09)       (.10)       (.04)
COLL               .24         .27         .19(*)      .35         .37         .29
                   (.11)       (.10)       (.10)       (.10)       (.10)       (.05)
BA                 .44         .38         .48         .54         .36         .43
                   (.14)       (.12)       (.14)       (.13)       (.14)       (.06)
PHD                .67         .42         .46         .45         .67         .52
                   (.21)       (.18)       (.20)       (.20)       (.19)       (.09)
ACHIEVEH           -.003(n.s.) .02         .005(n.s.)  .01(n.s.)   .02         .01
                   (.009)      (.008)      (.008)      (.009)      (.009)      (.004)
CATH               .16         .15         .03(n.s.)   .13(*)      .10(n.s.)   .11
                   (.07)       (.06)       (.06)       (.07)       (.06)       (.03)
AGE                .04         .02(n.s.)   .03         .008(n.s.)  .04         .03
                   (.01)       (.01)       (.01)       (.01)       (.01)       (.006)
AGE SQ             -.0005      -.0002(n.s.) -.0003     -.0001      -.0005      -.0003
                   (.0002)     (.0001)     (.0001)     (.0002)     (.0002)     (.0001)
RISKAVOIDH         .05         .05         .04         .03(*)      .05         .04
                   (.02)       (.01)       (.02)       (.02)       (.02)       (.007)
URBAN              .06(n.s.)   .09         .05(n.s.)   .14         .11         .10
                   (.05)       (.04)       (.05)       (.04)       (.05)       (.02)
SOUTH(2)           -.14        -.11        -.11        -.13        -.11
                   (.05)       (.05)       (.05)       (.05)       (.05)       (.02)
PROF               .73         .68         .79         .72         .71         .73
                   (.11)       (.09)       (.10)       (.09)       (.10)       (.04)
MANAG              .77         .09(n.s.)   .58         .41         .15(n.s.)   .39
                   (.20)       (.21)       (.23)       (.19)       (.18)       (.09)
CLERK              .48         .43         .39         .39         .38         .41
                   (.07)       (.06)       (.06)       (.06)       (.06)       (.03)
SKILL              .49         .46         .38         .49         .08(n.s.)   .41
                   (.21)       (.21)       (.19)       (.15)       (.27)       (.09)
SEMISKILL          .42         .39         .38         .44         .35         .40
                   (.07)       (.06)       (.06)       (.07)       (.07)       (.03)

# OBS              532         580         611         583         574         2880
R²                 .391        .399        .344        .370        .382        .367
adjusted R²        .371        .381        .325        .352        .363        .363
SE                 .548        .495        .544        .535        .535        .532
ȳ                  .403        .487        .534        .563        .590        .517
σ_y                .692        .628        .662        .663        .670        .666

(1) Same comments as Table VII
(2) Only five of the six coefficients in this row are legible in the source; they are listed in printed order, so their assignment to columns is uncertain.

TABLE IX - THEIL'S DECOMPOSITION OF R² APPLIED TO THE WIFE'S POOLED WAGE EQUATIONS(1)

                        ARITHMETIC                   SEMI-LOGARITHMIC
                        ABSOLUTE     RELATIVE        ABSOLUTE     RELATIVE

(1) CONSTANT            .0012        1               .0131        4
(2) EDUCATION           .0356        14              .0384        10
(3) AGE                 .0086        3               .0123        3
(4) OCCUPATION          .0723        28              .1642        45
(5) SOCIO-ECON          .0114        4               .0248        7
TOTAL                   .1291        50              .2528        69
MULTICOLLINEARITY
EFFECT                  .1293        50              .1140        31
R²                      .2584        100             .3668        100

(1) Comments are the same as Table IV, except that variables not defined in Tables VII and VIII are deleted from the definitions.

An important difference between the husband's equation and the wife's equation should be stressed again. Because the interviews were conducted with the husband, some variables are not observed for the wife. Of these missing variables especially IQ, YRS ON JOB, SECOND JOB, and UNION are important. ACHIEVE and RISKAVOID are also missing, but the husband's variables could be used as proxies, assuming they reflect household, rather than individual, behaviour. RACE, IQ, and JEW, observed for the husband, are not significant in the wife's equation.
The effect of these missing variables is to bias the coefficients of the included variables; the bias in the coefficients of the education variables caused by a missing ability (IQ) variable is an obvious example.

With this possible bias in mind, I now discuss the estimation results in more detail. Again this discussion is somewhat outside the main line of argument of this study and the rest of this section can be omitted without loss of continuity.

The education coefficients yield results expected from human capital theory. Internal rates of return can be deduced from the following specification

(2.4)    $\ln \text{WAGE} = -.71 - \underset{(.02)}{.0544}\,\text{YRS SCHOOL} + \underset{(.0008)}{.0042}\,(\text{YRS SCHOOL})^2$

For the wife the rate of return ranges from -3 percent for individuals having completed only five grades to 9.7 percent for Ph.D.'s.¹⁵ It is somewhat hazardous to compare the wife's rates of return with the husband's, because of the missing IQ variable. Inclusion of the latter in the husband's equation reduces his rate of return (see Footnote 12).

The only other human capital element in the wife's wage equation is the set of AGE variables. The turning points predicted by the pooled equations are respectively 50.2 years (arithmetic) and 52.3 years (semi-logarithmic), which is later than for the husband. This can be expected if women enter the labour force later than men, e.g., because of childbearing and taking care of pre-school children. The turning points predicted in the individual year equations are, however, quite erratic and even insignificant in 1968 and 1970. That the age variable is less firmly established for the wife's equation should come as no surprise. The sample of married women is certainly heterogeneous with respect to post-educational human capital investment. As will be seen, the sample contains a much higher proportion of occasional labour force participants than the sample of husbands.

Evidence on differences in wage profiles for varying educational levels is also very dubious.
In this sample the AGE PHD interaction terms are significant only in the arithmetic specification.¹⁶ None of the other age-education interactions is significant in either specification.

The reversed order of magnitude of the coefficients of PROF and MANAG for the wife compared with the husband might cause some concern. This result could be explained by noting that the occupational classification is probably different for males than for females. Furthermore, a different set of restrictions, imposed on the occupational variables, holds for the wives (see Table X).

TABLE X - F-STATISTICS FOR RESTRICTIONS IMPOSED ON OCCUPATIONAL GROUPS - WIFE'S POOLED WAGE EQUATIONS

                                        Calculated F-Statistic
Restriction                             Arithmetic        Semi-logarithmic
                                        Specification     Specification

SKILL = PROF                            16.64             11.46
SKILL = CLERK                           .09               0.0
SKILL = CLERK = SEMISKILL               .35               .07
MANAG = SKILL = CLERK = SEMISKILL       1.39              .07

The 5 percent critical value for F (1, 2880) = 3.84, for F (2, 2880) = 2.99, and for F (3, 2880) = 2.60.

Therefore one can conclude that for the wives the occupational classification can, without loss of information, be compressed into only three occupational classes: professional workers, workers with some skills, and unskilled workers.

Of the remaining socio-demographic variables, attention should be drawn to the insignificance of the RACE dummy (significant in the husband's equation). The JEW dummy is already weak in the husband's equation, so the fact that it does not pass the critical level in the wife's equation is not surprising. Other variables that were insignificant were dummy variables indicating the county's unemployment levels (tested for in a 1968-1971 pool) and dummy variables for the industry (tested for in a 1970-71 pool).

Comparisons were also made on the fit of the arithmetic and semi-logarithmic specification if the dependent wage variable is truncated at the right side. There are only four wives with a wage higher than $15.
If these four are deleted, the R² for the arithmetic specification increases from .26 to .32 while the R² of the semi-logarithmic specification remains unchanged at .37.

3. Prediction

3.1 Problems in Selecting a Predictor

The objective of this Chapter is to find a suitable predictor for the unobserved market wage of non-participants. This task presents several difficulties and potentially insoluble issues.

(i) The missing variable problem: it was indicated above that an important flaw in the wife's equation is the bias caused by omitted variables. Yet another missing variables problem applies to both the husband and wife equations. Certain variables included in the wage regressions discussed in the previous section are observed only if the individual is working. These variables are, in the case of the husband, the experience and the occupation dummies, the second job and the union dummy; in the case of the wife, only the occupation dummies. The problem then becomes what to use as a predictor. Two options are readily apparent: (A) to use the equations discussed in the previous section but delete the unobserved variables; (B) to use a new equation obtained from regressing wage on only those variables that are observed for non-participants.

If the true model corresponds to the husband and wife equations presented in Tables II, III, VII, and VIII, then these least squares estimates can be used to obtain optimal (BLUE) predictors.¹⁷ If, however, the least squares estimates corresponding to option (A) or (B) are used, then the predictor will presumably no longer be unbiased. Bias is caused in option (A) by deleting relevant variables. The least squares estimates in option (B) could also be biased because of the missing variables. Correlation between the included and excluded variables could possibly reduce the bias in this case.
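The comparison between options (A) and (B) can be made explicit with the standard omitted-variable result. The derivation below is a sketch in notation introduced here for illustration only (it is not the notation of this study): $X_1$ collects the variables observed for non-participants, $X_2$ the variables observed only for workers, and $x_1$, $x_2$ are the corresponding values for an individual whose wage is to be predicted.

```latex
\begin{align*}
\text{True model:}\quad
  & w = X_1\beta_1 + X_2\beta_2 + \varepsilon,
    \qquad E[\varepsilon \mid X_1, X_2] = 0 \\[4pt]
\text{Option (A):}\quad
  & \hat{w}_A = x_1'\hat\beta_1 ,
    \qquad E[\hat\beta_1] = \beta_1 \\
  & E[\hat{w}_A] - E[w] = -\,x_2'\beta_2 \\[4pt]
\text{Option (B):}\quad
  & \tilde\beta_1 = (X_1'X_1)^{-1}X_1'w ,
    \qquad E[\tilde\beta_1] = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\,\beta_2 \\
  & E[\hat{w}_B] - E[w] = \bigl(x_1'(X_1'X_1)^{-1}X_1'X_2 - x_2'\bigr)\beta_2
\end{align*}
```

Under option (B) the included variables act as proxies for the excluded ones through the term $(X_1'X_1)^{-1}X_1'X_2$. The more closely $x_1'(X_1'X_1)^{-1}X_1'X_2$ approximates $x_2'$, the smaller the prediction bias, which is the sense in which correlation between included and excluded variables can reduce it.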
In order to obtain some idea of the comparative predictive performance of either of these two biased predictors, I compare their respective predictions in a test in Section 3.3.

(ii) The sample problem: an even more severe problem in predicting unobserved market wages follows from an intuitively reasonable, though hardly provable, consideration. Suppose that the sample used to estimate the wage regressions is different from the sample for which the potential wage rate must be predicted. If these two populations are indeed structurally different, then it is not reliable to use the regression estimates derived in one population to predict the other population.

Arguments to explain the structural difference between the sample of working individuals (over which wage regressions are estimated) and the sample of non-labour force participants (for whom I need to predict the potential wage offer) were already mentioned above (see Section 1, this Chapter).

3.2 Test of Structural Difference

To obtain some notion of the existence and extent of these structural differences and of the resulting severity of the prediction bias, I proceed by splitting all the observations on either the husbands or the wives up into three groups: (A) a group consisting of individuals who were in the labour force all five years; (B) a group consisting of individuals who were in the labour force at least one year but not all five years; and finally (C) the ones that were never in the labour force.

It is reasonable to view these three groups as being "ordered" in terms of structural differences, i.e., A is "closer" to B than to C. If this is true, then I can argue that if I find that A is structurally different from B, then A is different from C and also A and B pooled are different from C.
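One standard way to test whether two such groups share the same regression coefficients is a Chow-type F test, which compares the residual sum of squares of a pooled regression with that of two separate regressions. The sketch below is a minimal illustration on synthetic data, not a reproduction of the estimates in this study; the function names, the data-generating values and the small pure-Python least squares routine are all assumptions made for the example.

```python
import random


def ols_ssr(rows, y):
    """Residual sum of squares from least squares of y on the columns of rows."""
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def chow_f(rows_a, y_a, rows_b, y_b):
    """Chow F statistic and its degrees of freedom (k, n - 2k)."""
    k = len(rows_a[0])
    n = len(rows_a) + len(rows_b)
    ssr_pooled = ols_ssr(rows_a + rows_b, y_a + y_b)
    ssr_separate = ols_ssr(rows_a, y_a) + ols_ssr(rows_b, y_b)
    f = ((ssr_pooled - ssr_separate) / k) / (ssr_separate / (n - 2 * k))
    return f, (k, n - 2 * k)


def make_group(n, intercept, slope):
    """Synthetic group: y = intercept + slope * x + noise (illustrative only)."""
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    rows = [[1.0, x] for x in xs]
    y = [intercept + slope * x + random.gauss(0.0, 0.5) for x in xs]
    return rows, y


random.seed(0)
rows_a, y_a = make_group(200, 1.0, 0.5)   # stands in for group A
rows_b, y_b = make_group(100, 1.0, 1.5)   # stands in for group B, different slope
f_stat, dof = chow_f(rows_a, y_a, rows_b, y_b)
print(f"F{dof} = {f_stat:.1f}")
```

The statistic is compared with the critical value of the F distribution with (k, n - 2k) degrees of freedom, where k is the number of coefficients in the regression; a value above the critical value rejects the hypothesis that the two groups share the same coefficients.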
The ultimate objective of these tests is to assess the appropriateness of predicting potential market wage rates for non-participants using information on participants.

A first indication on structural differences between sample A and B can be found from a Chow test¹⁸ on the null hypothesis that A and B belong to the same sample. The equations for husband and wife are specified as discussed in the previous section. The null hypothesis tested is that sample B is structurally equal to sample A. In the husband's case there are 504 individuals out of 5076 who did not work all five years but worked at least once. The calculated F-statistic for the arithmetic specification is 2.14, for the semi-logarithmic specification 5.14. The theoretical value of F (27, 5076) is smaller than 1.53 (5 percent significance level) or 1.81 (1 percent significance level). Consequently the null hypothesis is rejected.

Similarly, for the wife I test that the sample of 1140 wives who did not work all five years but worked at least once is structurally the same as the sample of wives that worked all five years. The calculated F-statistics are 4.09 (arithmetic case) and 6.0 (semi-logarithmic case). The critical value for F (18, 2880) is smaller than 1.65 (5 percent level) or 2.01 (1 percent level). Again the null hypothesis that sample A and B are the same must be rejected.

It is not possible to test whether sample C is different from A and B, as the wage rate is not observed for C. But in view of the definition of respectively A, B, and C, and taking account of the fact that A and B are already structurally different, one certainly should be aware of the probability that C could be structurally different from A and B.

3.3 Prediction Test

To investigate the matter of prediction bias further, I present the results of using the least squares coefficients estimated over sample A to predict the observed wages of sample B.
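Comparisons of predicted with observed values of this kind are commonly summarized by the mean error, the root mean square error, Theil's inequality coefficient and the decomposition of the mean square error into bias, variance and covariance proportions. The sketch below assumes the definitions in Theil [1961] and defines the error as predicted minus observed; both are assumptions, since the sign convention and exact formulas used for the reported measures are not restated here. The function name and the four data points are purely illustrative.

```python
import math


def theil_measures(predicted, observed):
    """Summary measures comparing predictions with observations."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(observed) / n
    mse = sum((p - a) ** 2 for p, a in zip(predicted, observed)) / n
    # Population (divide-by-n) standard deviations, so the decomposition is exact.
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted) / n)
    sd_a = math.sqrt(sum((a - mean_a) ** 2 for a in observed) / n)
    cov = sum((p - mean_p) * (a - mean_a)
              for p, a in zip(predicted, observed)) / n
    r = cov / (sd_p * sd_a)
    return {
        "mean_error": mean_p - mean_a,
        "rmse": math.sqrt(mse),
        "correlation": r,
        "inequality": math.sqrt(mse) / (
            math.sqrt(sum(p * p for p in predicted) / n)
            + math.sqrt(sum(a * a for a in observed) / n)),
        "bias": (mean_p - mean_a) ** 2 / mse,
        "variance": (sd_p - sd_a) ** 2 / mse,
        "covariance": 2.0 * (1.0 - r) * sd_p * sd_a / mse,
    }


# Illustrative values only, not data from the sample.
measures = theil_measures([2.0, 3.0, 4.0, 5.0], [2.5, 2.5, 4.5, 6.0])
for name, value in measures.items():
    print(f"{name:12s} {value: .3f}")
```

The three proportions sum to one by construction, so a large bias proportion points to a systematic over- or underprediction, while a large covariance proportion indicates that most of the error is unsystematic.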
I am interested in the following issues:

(i) How good are the predictions from sample A onto sample B, judged by some conventional criteria? This will possibly give an idea of the prediction bias when predicting potential wage offers for non-participants (as I will do below).

(ii) How does the predictive power compare for the semi-logarithmic versus the arithmetic specification? Also, how do predictions compare using the least squares coefficients of a fully specified equation versus a partially specified one? With “full” I mean the specifications used in Section 2 of this Chapter; with “partial” I mean a regression including only those variables observed for non-participants. As was seen in Section 3.1, both methods usually lead to biased predictors.

The prediction results are summarized in Table XI. First of all, the predictions are less than fully satisfactory. Maybe they are still acceptable for the husband, but they are very poor for the wife. The possible implication of this result is compounded by the fact that I must predict a potential wage rate for wives in many more cases than for the husband. There is not much difference in terms of predictive ability between the arithmetic and the semi-logarithmic specification, except that the semi-logarithmic predictor does a bit better (smaller mean squared error and smaller inequality coefficient) in the husband’s case. Neither is there much difference between the full and the partial specification.

Together these tests do not provide great confidence in the reliability of the predictors. They do indicate, however, that any least squares predictor will probably be unsatisfactory, certainly for the wife’s potential wage rate.

TABLE XI - GOODNESS OF FIT MEASUREMENTS COMPARING PREDICTED WITH OBSERVED
WAGE RATES FOR OCCASIONAL LABOUR FORCE PARTICIPANTS

                                   HUSBAND                            WIFE
                            FULL          PARTIAL             FULL          PARTIAL
                        A.S.**   S.S.   A.S.    S.S.      A.S.    S.S.    A.S.    S.S.
# OBS                    504     504    504     504       1140    1140    1140    1140
# POSITIVE ERROR         344     300    171     169        572     556     348     321
% POSITIVE ERROR         68%     60%    33%     34%        50%     49%     31%     28%
CORRELATION COEFFICIENT  .48     .48    .49     .48       -.02    -.02    -.01    -.01
ROOT MEAN SQUARE ERROR  1.81    1.74   1.71    1.71       33.8    33.8    33.8    33.8
MEAN ERROR               .58     .44   -.35    -.31       1.25    1.25     .77     .71
REGRESSION COEFFICIENT   .9     1.3     .8      .8        -1.3    -1.7     -.6     -.4
THEIL'S INEQ.            .33     .31    .26     .27        .94     .95     .93     .93
BIAS (*)                 .14     .06    .04     .03        .001    .001    .00     .00
VARIANCE (*)             .24     .48    .20     .21        .97     .97     .95     .95
COVARIANCE (*)           .62     .46    .76     .76        .029    .029    .05     .05
AVERAGE WAGE            2.7      .75   2.7      .75       1.9      .4     1.0      .4

(*) Cfr. Theil [1961] pages 31-42.
(**) A.S. = arithmetic specification, S.S. = semi-logarithmic specification

3.4 Choice of a Predictor

Given the results of the previous sections, the final choice of which predictor to use must be somewhat subjective. I have chosen a predictor based on the coefficients of a semi-logarithmic regression estimated over the population of participants and specified to contain only those variables that are observed for non-participants, i.e., what was called above the “partial” specification. I choose the semi-logarithmic form because it relates better to the observed wage distribution than the arithmetic form (Section 2.1 above). I select the “partial” specification because, if the included and excluded variables are correlated, this specification might capture more of the variation in the dependent variable than the “full” specification (which uses the results obtained in Section 2 above but deletes the unobserved variables).

The regressions used for prediction of the husband’s and of the wife’s wage rate are given in Table XII. Note that some variables which were not found significant when the specification of Table IV or IX was used turn out to be significant in the “partial” specification of Table XII, e.g., KIDS, IQ H.

Using the regressions of Table XII I will “predict” wage rates both for non-participants and participants.
This predicted wage variable is used in the labour force participation functions of Chapter IV and the labour supply functions of Chapter V. It can be interpreted as an indicator of the “permanent wage rate”, i.e., as the long run level of the rental income from the individual’s human capital.

TABLE XII - HUSBAND’S AND WIFE’S POOLED WAGE EQUATIONS
(SEMI-LOGARITHMIC SPECIFICATION ESTIMATED FOR PREDICTION PURPOSES)

                 HUSBAND               WIFE
CONSTANT         -.91    (.1)          -.87    (.14)
GRADE 12          .18    (.02)          .19    (.03)
TECH              .19    (.03)          .23    (.05)
COLL              .24    (.03)          .42    (.05)
BA                .51    (.03)          .76    (.06)
PHD               .66    (.04)          .89    (.09)
ACHIEVE H         .007   (.003)         .02    (.004)
RACE             -.14    (.02)         -.1     (.03)
IQ H              .02    (.003)         .02    (.006)
CATH              .07    (.02)          .16    (.03)
AGE               .06    (.004)         .03    (.006)
AGE SQ           -.0007  (.00004)      -.0003  (.0001)
RISK H            .06    (.004)         .06    (.007)
URBAN             .19    (.01)          .10    (.02)
SOUTH            -.18    (.02)         -.11    (.02)
KIDS             -.009   (.003)        -.01    (.005)
# OBS            5076                  2880
R2                .39                   .28
SE                .48                   .56
1-               1.07                   .52

(1) same comments as Table II

In order to simplify the prediction of the arithmetic wage rate using the semi-logarithmic regression I assume that the predicted arithmetic wage rate is lognormally distributed with mean $X\hat{\beta}$ and variance covariance matrix $s^2 I$, where $X$ is the matrix of observed variables, $\hat{\beta}$ is the vector of estimated coefficients in Table XII and $s^2$ is the squared standard error of the regressions in the same table. The prediction formula for any individual t is then²⁰

(2.5)    predicted wage (t) = $\exp\{X_t \hat{\beta} + \tfrac{1}{2} s^2\}$

FOOTNOTES TO CHAPTER III

1 In the preliminary investigations of labour force participation, I have used another prediction method. This method consisted of restricting the sample to those families where at least one wage rate is observed in the sample period (1967-71). Then I predicted a potential wage rate choosing the observed wage rate in a year of participation as close as possible to the year of non-participation (I also adjusted the observed wage rate for a 4 percent annual growth rate). This method,
however, excludes from the sample an obviously interesting group in terms of labour force participation behaviour, viz., those husbands or wives that did not participate at all in the five year sample period. Because of this bias I have less confidence in the results of these models and therefore have dropped this method after some experimentation.

2 See, e.g., Becker [1964], Mincer [1970], Ben-Porath [1967, 1970], Rosen [1972] for some of the theoretical foundations of this approach. See, e.g., Johnson and Stafford [1974], Mincer [1974] for some recent empirical applications.

3 See Theil [1971, 548-549].

4 The reason for most of the incomplete observations was that the county unemployment rate was missing. This information was to be obtained, not in the interview, but from state employment offices which frequently failed to answer.

5 Half of these are split-offs, i.e., new families formed during the interviewing period. Cfr. Appendix A.

6 Cfr. Mincer [1970] for a good summary in this respect.

7 Cfr. Theil [1971, 110-111], Assumption 3.3.

8 The $\chi^2$ test statistic is based on the difference between observed and expected frequency in each interval. The Kolmogorov-Smirnov test utilizes the relative cumulative frequency of a theoretical and observed distribution and is based on the greatest divergence between these two. Cfr. Kendall and Stuart [1973, 436-482] Volume 2, Chapter 30.

9 This significance, however, disappears if finer class intervals are introduced.

10 Cfr. Theil, Chapter 4, especially page 181. Define $R^2$ as the multiple correlation coefficient of the regression containing all independent variables, and $R_h^2$ as the multiple correlation coefficient corresponding to the same equation specified without the hth variable. Then the incremental contribution of the hth variable is defined as $R^2 - R_h^2$ and the following equality holds

    $R^2 - R_h^2 = \dfrac{(1 - R^2)\, t_h^2}{n - k}$

$t_h$ being the t-ratio of the hth variable. The multicollinearity effect is defined as $R^2 - \sum_h (R^2 - R_h^2)$.

11 Cfr. Appendix B for coding of the YRS SCHOOL variable.

12 The formula for the internal rate of return derived from (2.2) is r = -.0119 + .0052t so that for

    0 - 5 grades          t = 3     r = .0037
    6 - 8 grades          t = 7     r = .0245
    9 - 11 grades         t = 10    r = .0401
    highschool            t = 12    r = .0505
    technical training    t = 13    r = .0557
    college dropouts      t = 14    r = .0609
    BA                    t = 16    r = .0713
    PHD                   t = 18    r = .0817

See Mincer [1974, 51-59]. His rates of return explaining the logarithm of earnings, however, are decreasing for increasing schooling levels, even controlling for weeks worked. His regression, however, includes fewer variables. If ability is left out of the regression the internal rates of return, on average, increase .4 percent to .5 percent for each educational level. The result that r is increasing with increasing education brings about the problem of when to stop education. Presumably an individual’s finite lifespan makes him decide to stop investing and reap the benefits.

13 In its simplest form (assuming all other variables kept constant), Theil’s [1971, 549] missing variable formula for this case is

    $b_{w,ed} = \beta_{w,ed \cdot iq} + \beta_{w,iq \cdot ed}\, b_{iq,ed}$

14 Note that these models divert from the classic paradigm of maximizing a (lifetime) utility criterion. Decision makers are assumed to maximize lifetime disposable income. This simplifies the optimization problem. Heckman [1972] has tried to carry out the classic utility maximization procedure in this context but argues that this model is only tractable if one assumes specific functional forms for the utility and production functions.

15 Similar to footnote 12 the formula can be derived from (2.4) and is equal to r = -.0544 + .0084t so that for

    0 - 5 grades          t = 3     r = -.0292
    6 - 8 grades          t = 7     r = .0044
    9 - 11 grades         t = 10    r = .0296
    highschool            t = 12    r = .0464
    technical training    t = 13    r = .0548
    college dropouts      t = 14    r = .0632
    BA                    t = 16    r = .0800
    PHD                   t = 18    r = .0968

16    WAGE = -.64 + .0642 AGE - .0007 AGE² + .2244 AGE·PHD - .0022 AGE²·PHD
             (-1.8)  (.02)       (.0002)      (.08)           (.0008)

   ln WAGE = -.94 + .0307 AGE - .0003 AGE² + .0278 AGE·PHD - .0003 AGE²·PHD
             (.13)  (.006)      (.0001)      (.03)           (.0003)

17 Cfr. Theil [1971, 125], Theorem 3.5.

18 See Chow [1960] or Fisher [1970]. A Chow test, however, requires homoskedastic disturbances in order to be valid. Johnston [1972] suggests that testing for homoskedasticity can be done by applying the standard test for homogeneous variances to the dependent variable if one has plentiful cross-section data. (See Johnston [1972, 218].) This test statistic for homogeneous variance will be distributed approximately as $\chi^2(m - 1)$ under the hypothesis of homogeneous variances. (m is the number of intervals used to split up the dependent variable.) Applying this test statistic to the arithmetic wage rate gives a calculated $\chi^2$ value of 46.2 for the husbands and 91.9 for the wives (m = 16). The theoretical $\chi^2(15)$ value at the .5 percent level is 32.8, so that the null hypothesis of homoskedasticity cannot be accepted on the basis of this test.

19 Similar to Friedman’s [1957] definition of permanent income. See especially page 21.

20 See Footnote 20, Chapter II.

CHAPTER IV

FAMILY LABOUR FORCE PARTICIPATION DECISIONS

1. Introduction

Two alternative models of family labour force participation decisions will be considered in this thesis. (See Chapter II, Section 3.4 above.) The first is the simultaneous decision model, in which a family is described as choosing among four possible labour force participation alternatives: (i) both husband and wife working, (ii) husband only working, (iii) wife only working, and (iv) none working. This model is estimated using the multinomial logit technique (see Chapter II, Section 3.3.C above).

The second family labour force participation model is the sequential decision model. In this model the labour force participation decisions are taken sequentially: first the husband decides whether to join the labour force; then the wife decides conditional upon her husband’s choice. I will also briefly mention the other possible sequential model, where the wife decides first and her husband’s choice is conditional upon hers. This second sequential model seems to be less realistic a priori. As will be seen, the statistical results confirm this a priori expectation. I use the binomial logit technique (see Chapter II, Section 3.3.B above) to estimate the sequential model.

Logit models establish the probability of choosing a particular alternative as a function of a set of independent variables. As was found in Chapter II (Section 3.1), labour force participation choices depend on the difference between the shadow wage and the market wage, given the level of family income. To capture the unobservable shadow wage concept, which is related to “tastes” for household commodities and to “productivity” in the household (see Chapter II, Section 3.1), I introduce a set of socio-demographic variables, e.g., education, race, children, etc. The own market wage is approximated with an estimate of the “permanent” wage rate (see Chapter III).
Utility theory predicts a positive relationship between the change in the own wage rate and the decision to participate, keeping utility constant (see Chapter II, Section 3.2). The latter is achieved by controlling for the income effect. I measure the income effect with an asset income variable (see Appendix B) and then compare the coefficients on the own wage rate and the asset income variable. Another element of family income, however, is the labour income that the other partner can earn. This effect will be approximated with the “permanent” wage variable of that partner.

In what follows I discuss first some technical issues related to the estimation of logit models (Section 2). Then I estimate and discuss the results of the simultaneous labour force participation model in Section 3 and of the sequential model(s) in Section 4. Section 5 concludes this chapter with a comparison of both family labour force participation models.

2. Estimation Problems of the Logit Model

The binomial logit model originated in biometric studies but has already been used extensively in economics, e.g., studies of the acquisition of durable goods, the choice of transportation modes, etc.¹ The multinomial logit model was developed by Theil [1969] and its statistical properties were extensively discussed by McFadden [1974]. The multinomial logit model has been used in economics to study corporate choice among long term financing instruments (Baxter and Cragg [1970]), the demand for cars (Cragg and Uhler [1970]), the structure of asset portfolios of households (Cragg and Uhler [1971]), the choice of transportation modes (McFadden [1974]) and occupational choice (Hall and Kasten [1974]).

The logit models may be estimated by maximum likelihood. (See, e.g., Theil [1970] for another estimation method.) Assuming that the choice for each family is independent of the choice of other families, the likelihood for the sample becomes

(3.1)    $\ell = \prod_{t=1}^{T_1} p_X(a_{1t}) \cdots \prod_{t=1}^{T_M} p_X(a_{Mt})$

where $X$ is a set of M possible alternatives $a_i$ (i = 1, ..., M), $p_X(a_{it})$ is the probability of family t (t = 1, ..., T) choosing alternative $a_i$ and $T_i$ is the number of families in the sample choosing alternative i so that $\sum_{i=1}^{M} T_i = T$, the sample size. If I substitute the logit specification discussed in Chapter II (equation (1.26)):

(3.2)    $p_X(a_{it}) = \dfrac{e^{Z_t \beta_i}}{\sum_{j=1}^{M} e^{Z_t \beta_j}}$

into equation (3.1) and differentiate the logarithm of (3.1) with respect to the $\beta$’s, then I obtain nonlinear normal equations. Since the matrix of second partials of the log-likelihood function is the negative of a weighted moment matrix of the independent variables, it is negative semi-definite and thus the log-likelihood L is concave in $\beta$.³ It has been shown that the nonsingularity of the Hessian depends, analogous to the least squares model, on a full rank condition for the matrix of independent variables.⁴ This condition will be satisfied provided the independent variables are not collinear. If the Hessian is non-singular then a vector maximizing the logarithm of (3.1) will be unique, provided a maximum exists. It has been shown⁵ that the probability that a maximum likelihood estimator exists and that it is consistent and asymptotically normally distributed approaches one (under some fairly weak conditions) as the sample size approaches infinity.⁶

If L, the logarithm of (3.1), is strictly concave in $\beta$, non-linear gradient methods will yield the maximum likelihood estimates if the model is well specified (i.e., full rank condition and not all observations concentrated in one alternative). To estimate the binomial logit models I use two computer programs: CSP and PROLO. For the multinomial model I use CSP and THEIL.⁷ All programs are based on the standard Gauss-Newton iterative method.

In terms of use of computer time the more recent CSP program usually converged substantially faster than THEIL and only slightly faster than PROLO.
PROLO and THEIL on the other hand provide the user with results not available in CSP, such as goodness of fit statistics (see below) and the asymptotic variance covariance matrix. The latter is particularly useful to test the significance of differences between the coefficients belonging to different alternatives in the multinomial logit model. THEIL also offers the possibility of restricting the coefficients in and across the alternatives.

THEIL, however, is very time-consuming and experimenting with various specifications using this program is prohibitive.⁸ My research strategy therefore is to derive a satisfactory model by means of the CSP program and to re-estimate the “final” model using THEIL. The numerical estimates of the coefficients and asymptotic standard errors were sufficiently close (up to the third digit) in both programs. This provides some confidence that the estimates of both programs are numerically reliable.

The goodness of fit statistics that will be given for the logit models in the next two sections are all transformations of a likelihood ratio. The likelihood ratio under consideration compares the logarithm of the likelihood at its maximum for the given logit model with the logarithm of the likelihood of the same model with all coefficients except the intercept constrained to be zero. The logarithm of the likelihood of this constrained model can simply be written as

(3.3)    $L_{constr} = \sum_{j=1}^{M} f_j \ln (f_j / T)$

where M is the number of alternatives, $f_j$ is the number of occurrences of alternative j and T is the sample size. If L is the maximum of the logarithm of the likelihood of the (unconstrained) model, then a likelihood ratio test statistic can be defined for the logit model as

(3.4)    $\Lambda = -2 (L_{constr} - L)$

which is asymptotically distributed as $\chi^2$ with degrees of freedom equal to the total number of parameters in the model minus the number of intercepts (one for each alternative other than the normalizing alternative). This difference is equal to the number of restrictions imposed.
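As a rough numerical check on (3.3) and (3.4), the following minimal sketch recomputes the constrained log-likelihood and the likelihood ratio statistic from the pooled frequencies of the four alternatives and from the maximized log-likelihood reported for the simultaneous model in Table XV. It also evaluates the pseudo R2 measures of (3.5) and (3.6). The function names are illustrative assumptions and are not part of CSP, PROLO or THEIL; the value zero for Lmax is likewise an assumption, taken from reading “maximum possible probability” as a probability of one.

```python
import math

def constrained_loglik(freqs):
    """Equation (3.3): sum over alternatives of f_j * ln(f_j / T)."""
    total = sum(freqs)
    return sum(f * math.log(f / total) for f in freqs if f > 0)

def likelihood_ratio(l_constr, l_model):
    """Equation (3.4): -2 * (L_constr - L)."""
    return -2.0 * (l_constr - l_model)

def pseudo_r2(lr, n_obs):
    """Equation (3.5): 1 - exp(-LR / T)."""
    return 1.0 - math.exp(-lr / n_obs)

def proportional_pseudo_r2(lr, l_constr, n_obs, l_max=0.0):
    """Equation (3.6); l_max = 0 is an assumption (log of probability one)."""
    return pseudo_r2(lr, n_obs) / (1.0 - math.exp(2.0 * (l_constr - l_max) / n_obs))

# Pooled counts for "both working", "husband only", "wife only", "none working"
# and the maximized log-likelihood, both as reported in the text.
freqs = [2425, 2573, 81, 336]
l_model = -4179.0
n_obs = sum(freqs)

l_constr = constrained_loglik(freqs)
lr = likelihood_ratio(l_constr, l_model)
r2 = pseudo_r2(lr, n_obs)
prop_r2 = proportional_pseudo_r2(lr, l_constr, n_obs)

print(round(l_constr, 1), round(lr, 1), round(r2, 2), round(prop_r2, 2))
```

Under these assumptions the statistic should come out close to the 1916.13 reported for the simultaneous model, with pseudo R2 values near .30 and .35; small discrepancies are to be expected because the reported log-likelihood is rounded.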
A pseudo R2 can be defined as

(3.5)    pseudo $R^2 = 1 - \exp(-\Lambda / T)$

and a proportional pseudo R2 as

(3.6)    prop pseudo $R^2 = \dfrac{1 - \exp(-\Lambda / T)}{1 - \exp\{2 (L_{constr} - L_{max}) / T\}}$

whereby $L_{max}$ is the logarithm of the maximum possible probability. The proportional pseudo R2 is then the ratio of variation accounted for over the total explainable variation. It is identical to the conventional R2 in multiple regression.⁹

3. Description of the Sample

The sample used for the logit models in this Chapter is again drawn from the original Survey Research Center sample described in Appendix A. Observations were obtained by restricting the sample to:

(i) households with both husband and wife present during each of the five sample years;

(ii) households where husband and wife are the same individuals during the sample period. This restriction allows me to use variables observed in only one year for all five years, e.g., education is only observed in the first and last sample year (1967);

(iii) households for which the husband is married only once. This restriction allows me to use the variable “age of husband at first marriage” to determine the length of the present marriage relationship;

(iv) households consisting of immediate family members only, i.e., nuclear families with husband, wife and children. I thereby avoid possible influences on the labour force participation decisions of husband and wife created by the presence of other working adults in the family. The influence of working children on the labour force participation decisions of their parents will be discussed;

(v) households for which all variables to be used in the empirical investigation are present.

If the restrictions are introduced in the above mentioned order they cause respectively 25, 31, 7, 7, and 9 percent to be discarded from the original sample of 5062 families, leaving 21 percent or 1083 households. At an early stage of the investigation I decided to estimate labour force participation functions for the pooled sample only.
As each of the 1083 families is observed five times, the pooled sample has 5415 sample points. The decision to use only the pooled sample is motivated primarily by the fact that the alternative “wife only” occurs only a few times in each individual year (Cfr. Table XIII). As a consequence, the estimated parameters for this alternative in the individual years might depend too much on the particular characteristics of the few families involved.

TABLE XIII

DISTRIBUTION OF LABOUR FORCE PARTICIPATION CHOICES IN THE SAMPLE
FOR INDIVIDUAL YEARS (IN ABSOLUTE NUMBERS)

Year     Both Working    Husband Only Working    Wife Only Working    None
1967         453                564                     11              55
1968         481                528                     15              59
1969         514                492                     15              62
1970         485                502                     19              77
1971         492                487                     21              83
POOL        2425               2573                     81             336

As can be seen from Table XIII there are some substantial changes in the distribution of the families’ labour force participation choices from year to year. Most of the variation takes place between the alternatives “both working” and “husband only working”, which is, of course, the change in participation of the wife. The alternatives “wife only working” and “none working” grow in importance over the sample period, which can be explained by the importance of age as a determining factor for their occurrence.

I have also counted how many families choose the same alternative all five years. The results show that respectively 284, 308, 7 and 43 families (out of 1083) remain in the “both working”, “husband only”, “wife only” and “none working” alternative all five years. This amounts to respectively 58 percent, 59 percent, 43 percent, and 63 percent of the pooled sample points listed on the bottom line of Table XIII.

It should be noted that the “none working” alternative is not the same as retirement.
Although the status of the husband in the families choosing the “none working” alternative was almost always “retired” (except for a few husbands being unemployed or in school), there were a substantial number of “retired” husbands in the labour force too.¹⁰ Out of the 5415 observations there were 539 husbands whose status was “retired”, 139 of whom were in the labour force. My impression is that although for most of the husbands retirement seems to be a permanent decision, it is not impossible for them to join the labour force again.

The scanty evidence presented in the previous paragraphs indicates that there is presumably sufficient variation in the labour force participation choices of the individual families over the sample period to give some confidence to the results of the empirical investigation explaining these choices as functions of certain exogenous variables.

TABLE XIV

DESCRIPTION OF THE POOLED SAMPLE IN TERMS OF LABOUR FORCE PARTICIPATION CHOICES

Variables¹          Both Working    Husband Only    Wife Only     None          All
# OBS               2425            2573            81            336           5415

Mean of
YRS SCHOOL H        11.1            10.7            8.0           8.9           10.8
YRS SCHOOL W        11.4            10.9            9.3           8.8           10.9
MARRIAGE            17.3            18.8            32.1          41.9          19.7
WAGE H (PRED)²      3.95            3.97            2.75          2.63          3.86
WAGE W (PRED)²      2.26            2.16            1.92          1.74          2.18
ASSET Y             1093.           1401.           1075.         1949.         1292.

Number of Observations
With Dummy Variable = 1
LIMIT H             231 (9%)        268 (10%)       64 (79%)      170 (51%)     733 (14%)
LIMIT W             21 (.9%)        56 (2%)         3 (4%)        24 (7%)       104 (2%)
KID 6               921 (38%)       1256 (49%)      15 (19%)      18 (5%)       2210 (41%)
SOUTH               957 (39%)       883 (34%)       35 (43%)      103 (31%)     1978 (37%)
URBAN               1506 (62%)      1559 (61%)      30 (37%)      201 (60%)     3296 (61%)
RACE                513 (21%)       451 (18%)       16 (20%)      30 (9%)       1010 (19%)
MORTG               1392 (57%)      1382 (54%)      16 (20%)      50 (15%)      2840 (52%)
RESERVE             1286 (53%)      1366 (53%)      36 (44%)      240 (71%)     2928 (54%)

1 Variables are explained in Appendix B
2 Using the regression predictor

Table XIV describes the variables for each labour force participation alternative.
A few patterns are apparent: families in the alternative “none working” are on average older (in terms of the age of marriage variable) and possess more wealth (see ASSET Y, RESERVE). As was seen above, most of these families are in the retirement phase of their life cycle. The families in the alternative “wife only” are mostly found in this alternative because of a handicapped husband (in almost 80 percent of the cases). The families in the alternatives “both working” and “husband only” are generally better educated, younger and potentially higher wage earners than the other cases. It appears that a handicap for the wife and the presence of pre-school aged children are the crucial distinctions between these two alternatives.

4. The Simultaneous Family Labour Force Participation Model

The estimation results for the multinomial logit model for the family’s labour force participation decisions are presented in Table XV. The estimates are obtained by normalizing on the alternative “none working” and are therefore interpreted as the first derivatives of the logarithms of the odds of each alternative over the alternative “none working” (see equation (1.28), Chapter II). Furthermore, differences in the coefficients between the columns for each variable are the first derivative of the logarithm of the odds of one alternative over the other. These column differences are presented in Table XVI. Asymptotic “standard errors” for these differences are calculated based on the asymptotic variance covariance matrix V.

The existence of significant differences between the various alternatives would seem to be an indication that the alternatives should not be aggregated and thus that the axiom of irrelevant alternatives (see Chapter II, Section 3.3.D) is valid for this model.

Education for husband and wife has a different effect on the labour force participation choice. More education seems to increase the odds of retirement for husbands while the reverse is true for wives (Table XV).
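The relation between the columns of Table XV and the entries of Table XVI can be illustrated with a minimal sketch. It takes the YRS SCHOOL H coefficients for “both working” and “husband only” from Table XV, forms their difference, and computes an asymptotic standard error from two variances and a covariance. The covariance used below is a purely hypothetical placeholder, since the matrix V is not reproduced in the text, and the function name is an assumption made for illustration only.

```python
import math

def log_odds_difference(b_i, b_j, var_i, var_j, cov_ij):
    """Return b_i - b_j and its asymptotic standard error.

    b_i, b_j are the coefficients of one variable for alternatives i and j,
    both normalized on the same base alternative. var_i, var_j and cov_ij
    are the matching elements of the asymptotic variance covariance matrix.
    """
    diff = b_i - b_j
    se = math.sqrt(var_i + var_j - 2.0 * cov_ij)
    return diff, se

# YRS SCHOOL H coefficients and standard errors as printed in Table XV.
b_both, se_both = -0.12, 0.03
b_husband, se_husband = -0.15, 0.03

# Hypothetical covariance: the text does not report the elements of V.
cov_hypothetical = 0.0008

diff, se = log_odds_difference(
    b_both, b_husband, se_both ** 2, se_husband ** 2, cov_hypothetical
)

# exp(diff) is the factor by which the odds of "both working" over
# "husband only" change for one additional year of the husband's schooling.
odds_factor = math.exp(diff)

print(round(diff, 4), round(se, 4), round(odds_factor, 4))
```

With the rounded Table XV values the difference is about .03, in line with the .0314 shown in Table XVI. The standard error depends on the covariance term, which is why the full matrix V, and not only the printed standard errors, is needed to test such differences.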
In choosing between the “both working” and “husband only” alternatives, the former seems to be preferred by more highly educated couples (Table XVI).

If the husband is handicapped then the odds increase substantially in favour of the “none working” alternative (Table XV) or in favour of the “wife only” choice (Table XVI), with the latter choice being more probable than the “none working” choice (Table XV). The LIMIT H variable seems therefore to be crucial in determining the labour force participation of the husband. (This is confirmed in the sequential model below.) If the wife is handicapped then the probability of the “both working” choice becomes smaller compared with the probability of either “none working” (Table XV) or “husband only” (Table XVI). It does not have any significant influence on the odds of the “wife only” choice (both tables).

TABLE XV

SIMULTANEOUS FAMILY LABOUR FORCE PARTICIPATION MODEL.
RESULTS OF MULTINOMIAL LOGIT MODEL.¹

Variable²            Both Working ($\beta_A$)       Husband Only ($\beta_B$)       Wife Only ($\beta_C$)
CONSTANT             3.40 (.64)                     3.73 (.63)                     -.52 (1.06) (n.s.)
YRS SCHOOL H         -.12 (.03)                     -.15 (.03)                     -.09 (.05)
YRS SCHOOL W         .15 (.03)                      .08 (.03)                      .03 (.05) (n.s.)
LIMIT H              -1.50 (.17)                    -1.60 (.16)                    1.38 (.30)
LIMIT W              -1.13 (.39)                    -.26 (.34)                     -.83 (.65) (n.s.)
MARRIAGE             -.13 (.01)                     -.10 (.01)                     -.06 (.02)
KID 6                -.42 (.30) (**)                .55 (.30) (*)                  .23 (.46) (n.s.)
SOUTH                .78 (.18)                      .61 (.18)                      .16 (.30) (n.s.)
URBAN                -.78 (.18)                     -.83 (.18)                     -1.17 (.30)
RACE                 -.02 (.27) (n.s.)              -.42 (.28) (**)                .02 (.43) (n.s.)
WAGE H (PRED)        .83 (.16)                      1.10 (.16)                     -.12 (.25)
WAGE W (PRED)        .001 (.22) (n.s.)              -.41 (.22) (*)                 1.22 (.29)
ASSET Y              -.00009 (.00003)               -.00002 (.00002)               -.0001 (.00008) (**)
MORTG                .77 (.19)                      .52 (.19)                      -.25 (.33)
RESERVE              -.25 (.20) (n.s.)              -.31 (.19) (**)                -.79 (.32)

NUMBER PARTICIPATING 2425                           2573                           81
NUMBER OF OBSERVATIONS 5415

Log likelihood at maximum: -4179.0
Likelihood ratio test ($\chi^2$ with 42 df.): 1916.13
Pseudo R2 = .30
Proportional pseudo R2 = .35

1 Each coefficient is significant at the 5% level (t-test) except when followed by
  (*) significant at 10% level
  (**) significant at 20% level
  (n.s.) not significant
2 Variables are defined in Appendix B
3 Numbers in parentheses are asymptotic standard errors

TABLE XVI¹

DIFFERENCES BETWEEN COEFFICIENTS IN SIMULTANEOUS FAMILY
LABOUR FORCE PARTICIPATION MODEL

Variable             Both Working vs                Both Working vs                Husband Only vs
                     Husband Only                   Wife Only                      Wife Only
CONSTANT             -.3385 (.2093) (**)            3.9148 (.9463)                 4.2533 (.9456)
YRS SCHOOL H         .0314 (.0111)                  -.0256 (.0418) (n.s.)          -.0496 (.0415) (n.s.)
YRS SCHOOL W         .0566 (.0141)                  .1113 (.0487)                  .0558 (.0485) (n.s.)
LIMIT H              .1040 (.1010) (n.s.)           -2.8829 (.2686)                -2.9869 (.2673)
LIMIT W              -.8758 (.2695)                 -.3086 (.6622) (n.s.)          .5672 (.6376) (n.s.)
MARRIAGE             -.0270 (.0037)                 -.0739 (.0146)                 -.0469 (.0145)
KID 6                -.9692 (.0746)                 -.6550 (.3767) (*)             .3143 (.3760) (n.s.)
SOUTH                .1685 (.0692)                  .6143 (.289)                   .4458 (.2859) (**)
URBAN                .0538 (.0689) (n.s.)           .3905 (.2826) (**)             .3367 (.2819) (n.s.)
RACE                 .3972 (.0887)                  -.0414 (.3815) (n.s.)          -.4386 (.3807) (n.s.)
WAGE H (PRED)        -.2612 (.0389)                 .9524 (.2314)                  1.2136 (.2316)
WAGE W (PRED)        .4113 (.0626)                  -1.221 (.2427)                 -1.6324 (.2447)
ASSET Y              -.000073 (.000016)             .000015 (.000075) (n.s.)       .000088 (.000075) (n.s.)
MORTG                .2446 (.063)                   1.0211 (.2987)                 .7766 (.2986)
RESERVE              .0631 (.0674) (n.s.)           .5414 (.2971) (*)              .4784 (.2963) (**)

1 Same comments as Table XV

As is to be expected, the odds of the “none working” choice over all other choices increase as the couple grows older (Table XV). This is clearly the effect of retirement. Also the odds of the “wife only” alternative over the two alternatives where the husband works increase with years married (Table XVI). Presumably the husband, being on average older than his wife, retires before her. Another reason might be that the possible handicap for the husband is age-related. “Both working” becomes less probable than “husband only” in later stages of the life cycle.
I suspect this is a mixture of two reinforcing effects: a “life cycle effect”, i.e., “both working” is more probable for younger couples in general; and an “age cohort effect”, i.e., the younger generation tends to go out working together more than the older generation. Given the limited timespan of the sample it is impossible to disentangle the two effects.

The fact that the couple has a pre-school aged child is an important factor, especially in the choice between “both working” and “husband only” (Table XVI). It is thus a determinant in the labour force participation decision of the wife. (This will be confirmed in the sequential model below.)

The odds of “none working” versus the other alternatives seem to increase for an urban environment and to decrease for a family living in the south (Table XV). Living in the south also increases the choice “both working” as compared with the alternatives where either only the husband or only the wife works (Table XVI).

If the couple is non-white then the probability of finding them both in the labour force instead of only the husband will be higher than if the couple were white (Table XVI). Again “RACE” is mostly a determinant of the wife’s labour force participation decision (as will also be seen in the sequential model).

I now turn to the economic variables. The (permanent) wage rate of the husband increases the odds of those alternatives where he is found working, i.e., “both working” and “husband only”, compared with the alternatives where he is not in the labour force, i.e., “wife only” and “none working” (see both Tables). Because the income effect is very small or non-significant, this seems to conform to the theoretical expectation of a positive relation between wage changes and labour force participation, keeping utility constant (see Chapter II, Section 3.2). The same positive relation between the (permanent) wage rate and labour force participation choice usually holds for the wife as well.
The wife’s (permanent) wage increases the odds of choosing “wife only” over “none working” (Table XV), of choosing “both working” over “husband only” and of choosing “wife only” over “husband only” (Table XVI).

It is also interesting to observe the effect of a change in one partner’s wage rate on the other partner’s labour force participation choice. The first column of Table XVI shows the effect of the husband’s wage rate on his wife’s labour force participation decision and the second column shows the reverse. In both cases there is a negative effect indicating some sort of substitution effect, i.e., the higher the wage of one partner, the lower the probability of finding the other partner in the labour force.

The level of asset income increases the odds of having nobody working compared with having both working (Table XV). A possible explanation of this result could be that couples where both are working are usually young couples (see above for the effect of years married) who are just beginning in terms of non-human capital accumulation. The couples where nobody is working are generally older (again see above for the effect of years married), are usually retired and presumably possess a stock of non-human capital which they have accumulated over their lifetime.

The influence of asset income on the choice between “both working” and “husband only” (see Table XVI) can be explained in the same life cycle framework. As the couples where only the husband works are usually older than the “both working” couples (see the effect of years married again), these couples presumably have accumulated some assets which eventually “enable” the wife to leave the labour force.

This life cycle explanation, which relates the assumed accumulation of non-human capital over the life cycle to the shift from “both working” to “husband only” to “none working” as the couple gets older, is also explored in Table XVII.
In this table, I present the average asset income as well as the frequency of the four possible labour force choices for consecutive age of marriage intervals.

TABLE XVII
RELATIONSHIP BETWEEN MARITAL AGE, LABOUR FORCE PARTICIPATION
CHOICE, AND ASSET INCOME IN THE SAMPLE

                                                      Percentage
Age of Marriage   Average      Standard     Both      Husband   Wife              Total
Interval          ASSET Y $    Deviation    Working   Only      Only     None     = 100%

(0 - 10]¹            617         1232         51        48        1        0       1457
(10 - 20]           1123         2176         48        51       1/2      1/2      1665
(20 - 30]           1757         3519         46        48        1        5       1289
(30 - 40]           1926         2891         42        47        3        8        611
(40 - 50]           1901         2332         12        31       10       47        311
(50 - )             2398         2768          0        17        0       83         82
Total                                        2425      2573       81      336      5415

¹ “(” = open interval, “]” = closed interval.

Table XVII confirms that for this sample average asset income is higher for older than for younger couples (but note the large standard deviation relative to the mean in each age group) and that “both working” occurs most for couples married less than 10 years and “none working” for couples who were married more than 50 years. All other age groups choose “husband only” more frequently. Again it should be stressed that Table XVII reflects “life cycle” effects as well as “age cohort” effects.

If the family has a mortgage debt then the odds of the “both working” alternative increase compared with all other alternatives (Tables XV and XVI). Furthermore, it also increases the odds of “husband only” as compared with “wife only” and “none working” (Tables XV and XVI). Thus mortgage debt encourages labour force participation for husband and wife. The influence on the labour force participation of the husband seems to be more pronounced (this will be confirmed in the sequential model). Note, however, that one should not exclude the possibility of a causal effect running in the other direction, i.e., families have a mortgage debt because they are both working.

Having some savings (RESERVE) seems to be most important in the choice between “wife only” versus “none working”.
If the family has savings the probability increases that the wife will not join the labour force in this case (Table XV).

I have also tested the influence of a dummy variable indicating whether the children in the household had any income (mostly labour income) and have found that it has no significant influence on the labour force participation choice of their parents. This independence between the labour force participation decisions of the parents and children was also established by Bowen and Finegan [1969]. Dummy variables indicating the unemployment level in the county where the family lives also were found to have no significant effect. Dummy variables indicating religious preference (Jewish, Catholic) were not significant.

As can be seen from Table XV most coefficients are not significant (in terms of asymptotic t-ratios). This implies only that the odds of the particular alternative over the alternative “none working” remain unchanged when the variable in question changes. It does not imply that the odds of that particular alternative over another alternative will remain unchanged. For instance, YRS SCHOOL W has no influence on the odds of “wife only” over “none working” (Table XV) but has an effect on the odds of “both working” over “wife only” (Table XVI). If one is only interested in the odds of all the alternatives over “none working” then it is possible to find a more parsimonious specification for that particular multinomial logit model. To do this I re-estimate the logit model as in Table XV but restrict the coefficients of the non-significant variables to zero.¹² This leads to the result presented in Table XVIII.

The restricted model is very similar to the unrestricted version in terms of coefficient values. It is worthwhile testing the null hypothesis that the restrictions are valid.
This can be done using the likelihood ratio test, whose statistic is asymptotically χ² distributed with degrees of freedom equal to the number of restrictions. The calculated statistic is -2(-4183 + 4179) = 8, with degrees of freedom equal to 45 - 31 = 14. The critical value of χ²(14) at 5 percent is 23.7, suggesting that one cannot reject the null hypothesis that the restrictions are valid. Therefore the more parsimonious model of Table XVIII is equal in informational content to Table XV in terms of comparing all alternatives with the “none working” alternative.

TABLE XVIII
SIMULTANEOUS FAMILY LABOUR FORCE PARTICIPATION MODEL.
RESULTS OF RESTRICTED MULTINOMIAL LOGIT MODEL¹

                     Both                 Husband              Wife
Variable²            Working              Only                 Only
                     (β_A)                (β_B)                (β_C)

CONSTANT             3.54 (.53)           3.89 (.53)           -.10 (.61)
YRS SCHOOL H         -.12 (.03)           -.15 (.02)           -.11 (.04)
YRS SCHOOL W         .13 (.03)            .08 (.03)
LIMIT H              -1.46 (.16)          -1.56 (.16)          1.40 (.31)
LIMIT W              -.88 (.26)
MARRIAGE             -.14 (.009)          -.11 (.008)          -.06 (.01)
KID 6                -.50 (.25)           .48 (*) (.25)
SOUTH                .73 (.16)            .56 (.16)
URBAN                -.77 (.17)           -.82 (.17)           -1.19 (.28)
RACE                                      -.39 (.07)
WAGE H (PRED)        .82 (.11)            1.08 (.11)
WAGE W (PRED)                             -.42 (.06)           1.08 (.17)
ASSET Y              -.00007 (.00002)
MORTG                .87 (.17)            .63 (.16)
RESERVE                                                        -.64 (.28)

NUMBER
PARTICIPATING        2425                 2573                 81

NUMBER OF OBSERVATIONS                     5415
Log likelihood at maximum                  -4183.0
Likelihood ratio test (χ² with 28 d.f.)    1908.2
Pseudo R²                                  .30
Proportional Pseudo R²                     .35

¹ Each coefficient is significant at the 5 percent level (t-test) except when followed by (*), which means significant at the 10 percent level only. Asymptotic standard errors are in parentheses.
² Variables are defined in Appendix B.

5. The Sequential Family Labour Force Participation Model(s)

The basic assumption of the sequential family labour force participation model is that one partner decides first about his labour force participation and the other partner chooses conditionally upon the labour force participation choice of the first partner.
The sequential model in which the husband first decides his participation and his wife decides conditional upon his choice would seem to be, a priori, an acceptable description of family labour force participation behaviour. In this sequential model I first use a binomial logit model to estimate the coefficients of the labour force participation choice for all the husbands in the sample. In a second stage I estimate two binomial logit models for the labour force participation choice of the wives: one for the sample of wives where the husband is in the labour force and another for the sample of wives where the husband is not participating.

If one combines the probabilities of labour force participation choice of husband and wife one evidently arrives again at the four alternatives discussed in the simultaneous model. For instance, the probability of the husband participating times the probability of the wife participating (given that the husband participates) gives the probability of “both working” in the sequential model (see also Figure 1, Chapter II).

An important objective of this Chapter is to compare the simultaneous model and the sequential model as alternative models to explain family labour force participation behaviour. To be consistent I will therefore use the same independent variables in both models. As was discussed in the Appendix to Chapter II, the differences between the simultaneous model and the sequential model hinge on the difference between the coefficients of the two binomial logit models for the wives (Equation (A.6) of the Appendix to Chapter II). It will therefore be important to compare estimates of these coefficients.

Besides the sequential model in which the husband decides first it is also possible to assume a sequential model in which the wife decides first.
Although the model would seem to be a priori a less realistic description of family choice behaviour, I have estimated this model in order to be complete.

The results on the sequential model in which the husband decides first are given in Table XIX and the results on the other sequential model are in Table XX.

I will first discuss Table XX: the coefficients of the logit model for the wife’s labour force participation choice estimated over the whole sample (Table XX, Column 1) are almost identical to the coefficients for the logit model estimated over the sample of wives with working husbands only (Table XIX, Column 2). The only exception is the coefficient of YRS SCHOOL H which is not significant in Table XX but is significant in Table XIX. Because of the similarity between these two columns and to avoid repetition I will only discuss the second column of Table XIX.

TABLE XIX
SEQUENTIAL FAMILY LABOUR FORCE PARTICIPATION MODEL (HUSBAND
DECIDING FIRST). RESULTS OF BINOMIAL LOGIT MODELS¹

                     Husband’s Labour         Wife’s Labour Force      Wife’s Labour Force
                     Force Participation      Participation Choice.    Participation Choice.
Variable             Choice                   Husband Working          Husband Not Working

CONSTANT             3.65 (.56)               -.32 (**) (.21)          -1.73 (**) (1.10)
YRS SCHOOL H         -.11 (.02)               .03 (.01)                -.09 (.05)
YRS SCHOOL W         .10 (.03)                .06 (.01)                .11 (*) (.06)
LIMIT H              -1.86 (.14)              .13 (.10)                1.22 (.36)
LIMIT W              -.38 (n.s.) (.32)        -.83 (.26)               -1.95 (.74)
MARRIAGE             -.10 (.009)              -.03 (.004)              -.04 (.02)
KID 6                .009 (n.s.) (.25)        -.97 (.07)               .72 (n.s.) (.59)
SOUTH                .72 (.17)                .17 (.07)                .55 (**) (.34)
URBAN                -.56 (.16)               .06 (n.s.) (.07)         -1.31 (.37)
RACE                 -.20 (n.s.) (.24)        .38 (.08)                -.44 (n.s.) (.37)
WAGE H (PRED)        1.08 (.14)               -.26 (.04)               .02 (n.s.) (.31)
WAGE W (PRED)        -.59 (.18)               .40 (.06)                1.00 (.40)
ASSET Y              -.00003 (**) (.00002)    -.00007 (.00002)         -.00016 (**) (.00010)
MORTG                .71 (.17)                .24 (.06)                -.25 (n.s.) (.39)
RESERVE              -.07 (n.s.) (.17)        .05 (n.s.) (.06)         -.60 (**) (.39)

NUMBER
PARTICIPATING        4998                     2425                     81
NUMBER OF
OBSERVATIONS         5415                     4998                     417
Log likelihood at
maximum              -746.55                  -3284.2                  -154.0
Likelihood ratio
(χ² with 14 d.f.)    1446                     356                      103
Pseudo R²            .23                      .07                      .22
Proportional
Pseudo R²            .56                      .09                      .35

¹ Same comments as Table XV.

TABLE XX

                     Wife’s Labour            Husband’s Labour Force   Husband’s Labour Force
                     Force Participation      Participation Choice.    Participation Choice.
Variable             Choice                   Wife Working³            Wife Not Working

CONSTANT             -.21 (n.s.) (.20)        3.97 (1.21)              3.74 (.67)
YRS SCHOOL H         .01 (n.s.) (.01)         -.06 (n.s.) (.05)        -.14 (.03)
YRS SCHOOL W         .07 (.01)                .18 (.06)                .05 (n.s.) (.03)
LIMIT H              .10 (n.s.) (.09)         -2.97 (.33)              -1.61 (.17)
LIMIT W              -.89 (.24)               -1.24 (*) (.72)          -.12 (n.s.) (.35)
MARRIAGE             -.04 (.003)              -.09 (.02)               -.10 (.01)
KID 6                -1.00 (.07)              -.65 (**) (.45)          .60 (*) (.32)
SOUTH                .21 (.07)                .82 (.36)                .69 (.20)
URBAN                -.05 (n.s.) (.07)        .16 (n.s.) (.34)         -.74 (.19)
RACE                 .43 (.09)                -.003 (n.s.) (.004)      .54 (*) (.31)
WAGE H (PRED)        -.20 (.04)               1.23 (.29)               .91 (.18)
WAGE W (PRED)        .41 (.06)                -1.52 (.27)              .05 (n.s.) (.27)
ASSET Y              -.00007 (.00002)         .00005 (n.s.) (.0001)    -.00003 (n.s.) (.00002)
MORTG                .27 (.06)                1.12 (.34)               .42 (.21)
RESERVE              .01 (n.s.) (.07)         .32 (n.s.) (.35)         -.32 (**) (.22)

NUMBER
PARTICIPATING        2506                     81                       2573
NUMBER OF
OBSERVATIONS         5415                     2506                     2909
Log likelihood
at maximum           -3495.6                  -187.23                  -517.87
Likelihood ratio
test (χ² with
14 d.f.)             485.5                    540.88                   1046.34
Pseudo R²            .09                      .19                      .30
Proportional
Pseudo R²            .11                      .78                      .59

¹ Same comments as Table XV.

The second and third columns of Table XX present the results on the two “conditional” binomial logit models for husbands with respectively working and non-working wives. The estimation results for husbands with working wives (Table XX, Column 2) are, however, dubious.
When I use the PROLO computer program to estimate this model the logarithm of the likelihood becomes smaller at each iteration and eventually becomes smaller than the smallest possible number in FORTRAN programs (after three iterations). This indicates that the log likelihood might approach -∞ if further iterations were possible. This would certainly happen in a sample where all decision units choose one particular alternative and the other alternative was not chosen at all. I would suggest that the fact that in the sample in question 2425 out of 2506 husbands (i.e., 97%) choose to participate comes dangerously close to the situation where all would participate. The CSP program, however, does converge and these results are given in the second column of Table XX. However, because of this crucial difference in numerical results¹³ I do not have much confidence in the values of these coefficients. These estimation problems also cast some doubt on the appropriateness of the sequential model in which the wife decides first. Because of this I will concentrate my further discussion on the sequential model shown in Table XIX.

In discussing Table XIX I first compare the coefficients of the husband’s labour force participation choice with those of the wife’s labour force participation choice, i.e., compare the coefficients in Columns 1 and 2 in Table XIX (remembering that the latter are similar to the coefficients in the first column of Table XX). Next I will compare the coefficients of the logit model for the wives whose husbands are working with the coefficients for the wives whose husbands are not working, i.e., compare Columns 2 and 3 in Table XIX.

5.1 The Husband’s Labour Force Participation Versus The Wife’s Labour Force Participation Decision

The probability of participation decreases for more educated men but it increases for more educated women. Bowen and Finegan [1969] and Cohen et al. [1970] found the same result for women but they established a positive education effect for men.
However, they restricted their sample to prime-age males whereas my sample has no such age constraint. The different result is therefore presumably due to the dominating effect of earlier retirement for more educated men in my sample. The educational level of the husband has no effect on the wife’s labour force participation (see Table XX, Column 1) whereas the wife’s education influences the husband’s labour force participation positively.

A handicap reduces the probability of labour force participation to a considerable extent for the husband. The same is true for the wife but to a much smaller degree. Surprisingly enough, there is no effect of a handicapped husband on the wife’s labour force participation and vice versa.

The negative effect of years married is also more considerable for the husband than for the wife. The presence of pre-school aged children reduces the probability of participation for married women considerably. This result has been documented in numerous other studies (Cain [1966], Bowen and Finegan [1969], Cohen et al. [1970]). It is again confirmed here together with the finding that the presence of pre-school aged children does not affect the husband’s labour force participation.

The positive effect of a non-white dummy variable on the probability of participation for the wives has also been frequently established (Bowen and Finegan [1969], Cohen et al. [1970], and especially Cain [1966]). This is also confirmed here, together with the result that it is not a relevant variable for the husbands. This was also the case in Cohen et al. [1970]; however, Bowen and Finegan [1969] found a significant racial difference in the “prime age males” group.

Ever since Mincer’s [1962] contribution, participation studies have been interested in wage and income effects. Income effects are quite well researched, but most of these studies do not include a wage variable (Mincer [1962], Bowen and Finegan [1969], Cohen et al. [1970], Cain [1966]).
Recently Boskin [1973] estimated linear probability models for the labour force participation of men and women including among the independent variables a predicted wage variable similar to the one used in this thesis. In his study these wage variables were not significant except in one case: the negative effect of the husband’s predicted wage on his wife’s labour force participation equation. Therefore, the significance of the wage variables for the labour force participation model for both husbands and wives appears to be a new result in the study of labour force participation behaviour.

The own wage is again positively related to the probability of participation as was found already in the simultaneous model. Given the very small income effect, this positive wage effect corroborates the theoretically expected effect (Chapter II, Section 3.2). As in the simultaneous model the effect of the partner’s wage is negative for both husbands and wives.

The level of asset income decreases the probability of participation for the wives, as was already found in previous studies (Cain [1966], Mahoney [1961], Bowen and Finegan [1969], Cohen et al. [1970]). For “prime age males” both Bowen and Finegan [1969] and Cohen et al. [1970] found a significant negative income effect, which is not confirmed in this study.

A mortgage dummy variable increases the probability of participation of husband and wife, but more so for the husband. This result was already suggested in the simultaneous model.

Most previous labour force participation studies have concentrated on the participation decisions of married women. This is understandable considering the observed substantial growth in their post-war participation rates. Because of its relative importance in the field of labour force participation studies I have tried to investigate the wife’s labour force participation somewhat further. I have done this mostly by adding or changing variables to the basic specification used in Table XIX, Column 2.
These additional results are thus for married women with a working husband.

First of all, a dummy variable for the occurrence of a birth in the sample year has, as expected, a negative effect on the probability of participation.¹⁴ I have investigated the effect of children further by splitting the KID 6 variable into a set of three dummy variables indicating whether the family has a child of less than or equal to two years; between two and four years; between four and six years. I find that the probability of participating decreases the most for the two to four age group and that this negative influence is substantially smaller for the four to six age group.¹⁵ In another specification I replace the KID 6 variable with a variable counting the number of children in the household. This variable also has a significant negative coefficient. In the same equation a variable counting the number of children in high school entered positively but was only significant at the 20 percent level.¹⁶

The MARRIAGE variable indicates decreasing probabilities of participation for the wife as the couple gets older. To explore this age-labour force participation relation somewhat further, I replace MARRIAGE with a set of dummy variables for the (30 - 40], (40 - 50], and (50 - 100] age cohorts of women. The results indicate that the logarithm of the odds becomes increasingly smaller for older age groups. A similar result was found by Cohen et al. [1970].
A quadratic specification of the wife’s age variable exhibits a peak in the probability of labour force participation at age 24.¹⁷

I also have found that a series of dummy variables indicating intervals of the unemployment rate in the county where the family is living are insignificant¹⁸ for the participation decisions of the wives.

5.2 Labour Force Participation for Wives with Husbands Working versus Labour Force Participation for Wives with Husbands Not Working

As was proved above (see Appendix to Chapter II) a sufficient condition for the probabilities predicted by the sequential model to be equal to the probabilities predicted by the simultaneous model is that the vector of coefficients of the logit model for wives with working husbands (Table XIX, Column 2) is equal to the coefficient vector for the wives with non-working husbands (Table XIX, Column 3). Comparing Columns 2 and 3 in Table XIX, it is quickly seen that the coefficient vectors differ in many ways. There is a noticeable difference in the value of almost all the coefficients.

The condition that the two coefficient vectors should be equal in order for the probabilities of the simultaneous and sequential model (husband deciding first) to be the same can also be tested statistically. To do this I use the likelihood ratio procedure to test the null hypothesis that the coefficient vectors are the same against the alternative hypothesis that they are different. The logarithm of the maximum likelihood under the null hypothesis is given in Table XX (Column 1). The logarithm of the maximum likelihood under the alternative hypothesis is the sum of the logarithms of the maximum likelihoods given in respectively Columns 2 and 3 of Table XIX. The computed value of the likelihood ratio test is 114.8.
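The statistic can be reproduced from the log likelihoods reported in Tables XIX and XX. The following is a minimal sketch in Python; the helper name `lr_statistic` is an assumption introduced here for illustration and is not part of the PROLO, THEIL or CSP programs.

```python
def lr_statistic(loglik_null, loglik_alt):
    """Likelihood ratio statistic: -2 * (log L under H0 - log L under H1)."""
    return -2.0 * (loglik_null - loglik_alt)


# H0: a single coefficient vector for all wives (Table XX, Column 1).
loglik_pooled = -3495.6

# H1: separate coefficient vectors for wives with working and with
# non-working husbands (Table XIX, Columns 2 and 3). The two samples
# are disjoint, so the log likelihoods add.
loglik_split = -3284.2 + -154.0

stat = lr_statistic(loglik_pooled, loglik_split)
print(round(stat, 1))
```

The same helper applied to the restricted and unrestricted simultaneous models, `lr_statistic(-4183.0, -4179.0)`, gives the value 8 used in the test of the restrictions behind Table XVIII.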
The critical χ²(15) value is 32.8 at the 0.5 percent level. The null hypothesis is therefore decisively rejected.

The same procedure can be used to test the equality of the probabilities of the simultaneous model and the sequential model where the wife decides first. In this case the calculated test statistic is 82.9. Again the null hypothesis of equality of the two models must be rejected.

6. Choosing Between the Simultaneous and Sequential Models

Since it is difficult to choose between the simultaneous and sequential model on a priori grounds, it may be of interest to discriminate between them a posteriori. A Bayesian procedure can be helpful in this respect. Let $h_1$ denote the simultaneous model and $h_2$ the sequential model. Let $\beta_1$ and $\beta_2$ be the parameter vectors corresponding to each model. Also let $p(h_1)$ and $p(h_2)$ be the prior probabilities that either the simultaneous or the sequential model holds. $p(\beta_1 \mid h_1)$ and $p(\beta_2 \mid h_2)$ are the prior distributions for the parameters of each model. The joint posterior distribution of the models and their parameters, given the data $Z$, is

(3.7)   $p(\beta_i, h_i \mid Z) \propto L(Z \mid \beta_i, h_i)\, p(\beta_i \mid h_i)\, p(h_i)$ ,   for $i = 1, 2$

where $L$ is the likelihood function.

If one uses prior distributions of the form

(3.8)   $p(\beta_i \mid h_i)\, p(h_i) \propto C_i$ ,   for $i = 1, 2$

where $C_i$ is a constant, then the (approximate) posterior probability for either model can be derived as²⁰

(3.9)   $p(h_i \mid Z) \approx (2\pi)^{K_i/2}\, |V_i|^{-1/2} \exp\{ g(\hat{\beta}_i \mid h_i, Z) \}\, C_i$ ,   $i = 1, 2$

where $K_i$ is the number of parameters in $\beta_i$, $V_i$ is the asymptotic variance covariance matrix and $g$ is the logarithm of the likelihood function. Both $V_i$ and $g$ are evaluated at the maximum likelihood estimates of $\beta_i$, i.e., $\hat{\beta}_i$.

Note that if instead of (3.8) the following prior distribution would be used

(3.10)   $p(\beta_i \mid h_i)\, p(h_i) \propto (2\pi)^{-K_i/2}\, |V_i|^{1/2}$ ,   $i = 1, 2$

then (3.9) would reduce to

(3.11)   $p(h_i \mid Z) \approx \exp\{ g(\hat{\beta}_i \mid h_i, Z) \}$

Discrimination would then be done on the basis of the maximum values of the logarithms of the likelihood function.
However, (3.10) is unacceptable as a representation of prior knowledge since it embodies knowledge of the likelihood function at its maximum.

Before I apply equation (3.9) to discriminate between the simultaneous and sequential models, I must specify the form of the likelihood function in the sequential model. (The likelihood function for the simultaneous model is defined in equation (3.1) above.) Using the same notation as in the Appendix to Chapter II the probabilities of a family choosing respectively “both working”, “husband only”, “wife only”, and “none working” are written in the sequential model as $P(11 \mid 1)\,P(1)$, $P(12 \mid 1)\,P(1)$, $P(21 \mid 2)\,P(2)$, $P(22 \mid 2)\,P(2)$ (see A.3.1 to A.3.4 in the Appendix to Chapter II). The likelihood function for the sequential model using this notation is then

(3.12)   $\mathcal{L} = \prod_{t=1}^{f_1} P_t(11 \mid 1)\, P_t(1) \; \prod_{t=1}^{f_2} P_t(12 \mid 1)\, P_t(1) \; \prod_{t=1}^{f_3} P_t(21 \mid 2)\, P_t(2) \; \prod_{t=1}^{f_4} P_t(22 \mid 2)\, P_t(2)$

where the subscript $t$ indicates the $t$th family and $f_1$, $f_2$, $f_3$, and $f_4$ are the number of families in the sample choosing respectively the “both working”, “husband only”, “wife only”, and “none working” alternative ($\sum_{i=1}^{4} f_i = T$). Upon rearranging, (3.12) becomes

(3.13)   $\mathcal{L} = \prod_{t=1}^{f_1+f_2} P_t(1) \; \prod_{t=1}^{f_3+f_4} P_t(2) \; \prod_{t=1}^{f_1} P_t(11 \mid 1) \; \prod_{t=1}^{f_2} P_t(12 \mid 1) \; \prod_{t=1}^{f_3} P_t(21 \mid 2) \; \prod_{t=1}^{f_4} P_t(22 \mid 2)$

i.e., the product of the likelihood functions of the binomial models for the husband’s labour force participation decision, and for the labour force participation choice of the wives whose husband is in or out of the labour force. Therefore by extension the (approximate) posterior probability of the sequential model will be equal to the product of the (approximate) posterior probabilities of the same binomial logit models.

The logarithm of the (approximate) posterior probability for the simultaneous model is equal to a constant plus -4146.5. The log of the (approximate) posterior probability for the sequential model in which the husband decides first is equal to a constant plus -4166.75. Therefore a posteriori the simultaneous model is more probable than the sequential model.
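The additivity implied by (3.13) and the size of the posterior gap can be illustrated with the reported figures. The sketch below is in Python and rests on one assumption that the text does not state explicitly: the additive constants in the two log posterior probabilities are taken to be equal, so that they cancel in the difference. All variable names are illustrative.

```python
import math

# Log likelihoods at the maximum of the three binomial logit models of
# the sequential model with the husband deciding first (Table XIX).
loglik_husband = -746.55
loglik_wife_husband_working = -3284.2
loglik_wife_husband_not_working = -154.0

# By (3.13) the sequential likelihood is a product of the three binomial
# likelihoods, so its logarithm is the sum of their logarithms.
loglik_sequential = (
    loglik_husband
    + loglik_wife_husband_working
    + loglik_wife_husband_not_working
)

# Logarithms of the approximate posterior probabilities, each reported
# up to an additive constant.
log_post_simultaneous = -4146.5
log_post_sequential = -4166.75

# ASSUMPTION: the two additive constants are equal and cancel here.
log_posterior_odds = log_post_simultaneous - log_post_sequential
posterior_odds = math.exp(log_posterior_odds)

print(round(loglik_sequential, 2))
print(round(log_posterior_odds, 2))
```

Under that assumption the log posterior odds in favour of the simultaneous model are 20.25, which corresponds to posterior odds of the order of 10⁸ to one.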
As was mentioned above, the estimation of the sequential model in which the wife decides first did not converge when I used the PROLO program. This would indicate that the logarithm of the (approximate) posterior probability might approach -∞. In this case this model would be highly improbable. If I use the results obtained from estimating this model with CSP then I can develop a discrimination procedure using the maximum values of the logarithms of the likelihood function (equation (3.11)). This implies, however, that an unacceptable prior (3.10) is used. Unfortunately CSP does not provide information on $|V_i|$ so that I cannot calculate (3.9). The logarithm of the likelihood at the maximum for the three models is respectively, simultaneous model: -4179; sequential model (husband decides first): -4184.75; sequential model (wife decides first): -4200.70. On the basis of this criterion the simultaneous model is most probable and the sequential model in which the wife decides first is least probable a posteriori.

FOOTNOTES TO CHAPTER IV

1 See Buse [1972] for a discussion of the origins of the logit model and for references to applications in economics.

2 Cfr. McFadden [1974, 115], Equation 17. Note, however, that I assume there are no repetitions.

3 Cfr. McFadden [1974, 115], Equation 20.

4 Cfr. McFadden [1974, 111], Axiom 5.

5 Cfr. McFadden [1974, 116-120].

6 Cragg and Uhler [1971] mention a non-existence problem that might arise if the same alternative were chosen in all observations. See their footnote on page 343.

7 CSP (Cross Section Processor) Version 3 of June 1972 is a package of computer programs designed to carry out regression analysis on large bodies of cross-section data. It was written by M.G. Kohn with contributions by N.A. Barr (London School of Economics), Z.I. Brody (Hebrew University and Massachusetts Institute of Technology), R.E. Hall (Massachusetts Institute of Technology), and M.D. Hurd (Stanford).
One of the programs in this package allows for binomial and multinomial logit analysis. PROLO and THEIL are part of a set of programs designed to execute multiple probit and logit analyses and extensions to them. These programs were written by J. Cragg [1968].

8 It took 29’49” computer time on an IBM 370-168 computer to estimate the simultaneous model presented in Table XVI.

9 This summary statistic was developed by Cragg and used in Cragg and Uhler [1970, 1971] and Cragg and Baxter [1970].

10 The information on the status of the wife is not available.

11 To test for the effect of the county unemployment rate I had to restrict the sample to the years 1968 to 1971 as the unemployment variable was only available in those years.

12 The THEIL program offers the possibility of restrictions within and across alternatives.

13 This kind of difference between PROLO and CSP and for that matter between THEIL and CSP occurred only in this particular case. Usually numerical differences between the programs were negligible (results were equal up to the third digit).

14 The coefficient and asymptotic standard error for the BIRTH variable were respectively -.38 and .11.

15 The coefficients and asymptotic standard errors are respectively -.53 (.07), -.78 (.10), -.34 (.11) for the (0 - 2], (2 - 4], and (4 - 6] age groups.

16 The coefficient of the number of children variable was -.08 (asymptotic standard error: .02) and for the number of children in high school variable it was .11 (asymptotic standard error: .07).

17 The coefficients and asymptotic standard errors for the age groups are respectively -.16 (.08), -.35 (.10), -.88 (.12) for the (30 - 40], (40 - 50], and (50 - 100] age brackets. The coefficients and asymptotic standard errors for the AGE and AGESQ variables are respectively .0459 (.021) and -.00095 (.00024).
This quadratic form has a maximum at 24.2 years.

18 To test this the sample must be restricted to the years 1968 to 1971 and to those families for whom the county unemployment rate was in fact observed. Sample size is then 3768 observations.

19 Discussion of this Bayesian discrimination technique can be found in Cragg [1971].

20 See Cragg [1971, 834-835]. The (approximate) posterior probability for the logit models is calculated in the PROLO and THEIL programs.

CHAPTER V
FAMILY CONDITIONAL LABOUR SUPPLY FUNCTIONS

1. Introduction

Once a family has decided on its labour force alternative its next step is to decide conditionally on the number of hours. More specifically, if the family chooses the “both working” alternative it will have to determine the labour supply of the husband and the wife. Similarly, if it chooses “husband only” or “wife only” it will have to determine the hours to be supplied by respectively the husband and the wife.

Probabilities for the participation decisions were established in the previous Chapter. In this Chapter I will investigate the labour supply functions corresponding to the labour force participation alternative chosen.

2. Issues in Labour Supply Estimation

As discussed in Chapter II, Section 2.2, two approaches to the estimation of labour supply functions exist in the economic literature. One approach is based on a rigorous application of the results of utility maximization theory. The strength of this approach is most apparent for a system of demand and supply equations as it enables the researcher to either impose or test for restrictions derived from utility maximization theory. This kind of approach was used for labour supply estimation with micro-data by Wales [1973], Wales and Woodland [1974a, 1974b], Ashenfelter and Heckman [1974], and with macro-data by Gussman [1972].
This method, however, may involve non-linear estimation techniques for a system of equations. A weakness of this approach lies in its treatment of “taste differences”, which are assumed to be constant over the population.

Another approach to the estimation of labour supply functions attempts to approximate the parameters of the supply decision using functions that are linear in the coefficients. Its relationship to utility maximization theory is somewhat pragmatic (see Chapter II, Section 2.2). However, this method has been used extensively in the estimation of labour supply functions, primarily because of its econometric simplicity.¹ This method controls for “taste variations” in the sample by inserting socio-demographic variables in the regression.

The specific functional form that I use (see Chapter II, Section 2.2) is linear in the coefficients. It contains polynomial expressions in the wage and income variables so that supply and income elasticities are not restricted to being constants.²

The dependent variable in the labour supply equation, i.e., annual hours of work, is by definition non-negative. If the family in its labour force participation choice decides to supply a non-negative amount of hours on the labour market, then this restriction should be imposed on the family’s conditional labour supply function. Cragg [1971] suggested two methods of imposing the non-negativity restriction: (i) truncating the distribution of the dependent variable at zero, (ii) using a semi-logarithmic specification for the supply function. The former involves an iterative estimation technique³ whereas the latter can be estimated with the usual regression techniques. For convenience I therefore choose the semi-logarithmic specification.

The parameters of the labour supply decisions can be determined either from a supply function with “hours worked” as a dependent variable or from a demand for leisure equation with “total number of hours minus hours worked” as a dependent variable.
I choose to estimate supply functions because their interpretation is more straightforward and it simplifies the calculation of the unconditional supply functions in the next Chapter.

A fundamental objection to estimating supply functions at all is that the dependent variable “hours worked” is not in an individual’s decision set but is instead determined by institutional constraints (standard work week) and by demand conditions (layoffs, spells of unemployment). In empirical investigations these problems are sometimes avoided by choosing a sample for whom these constraints are presumably non-binding, e.g., self-employed (Break [1957], Wales [1973]), individuals stating that they felt no constraint on their choice of hours (Wales and Woodland [1974a]). Frequently, however, these objections are either neglected or it is argued (Friedman [1962]) that overtime, moonlighting and part time jobs are readily available and that there exists sufficient room for individual choice. Whether the latter is true is clearly an empirical question. But even if it is accepted a further problem may arise if the marginal wage rate changes between these different time allocations, e.g., between normal time and overtime, and if only an average wage rate (i.e., total labour income divided by total hours of work) is observed, as is the case in my sample.⁴ In this study as in most other labour supply studies (e.g., see Cain and Watts [1973]) the existence of possible institutional constraints on the labour supply decisions of some families in the sample may therefore affect the parameters of the estimated equations.⁵

For labour force participants an (average) wage rate is observed. Thus, this observed rate instead of the predicted wage rate (Chapter III) could be used. Since I will combine the labour force participation decision and the labour supply decision into an unconditional labour supply function, which is defined as a function of the wage rate (Chapter VI), consistency is desirable.
I will therefore estimate labour supply functions with the predicted wage rates among the independent variables (see Section 4 of this Chapter) and will use these estimates in calculating the unconditional supply functions.

Prediction tests (Chapter III, Section 3.3) indicate that the predicted wage rate is at best only a rough indicator of the observed wage rate. It would seem desirable to compare the supply estimates using the predicted wage rates with estimates obtained from observed wage rates. I will therefore also estimate the supply functions using the observed wage rates (see Section 3 of this Chapter).

The observed wage rate, however, is an average wage rate (i.e., income divided by hours), whereas the theoretically relevant variable is the marginal wage rate. The marginal wage rate can be different from the average wage rate, e.g., because of overtime premiums, multiple job holding with different wage rates, non-proportional income taxes, etc. The sample does not usually provide enough information to define the marginal wage rate. Therefore, most labour supply studies (e.g., Kosters [1963], Rosen and Welch [1971]) use the average wage rate. Some recent studies try to approximate the marginal wage rate, at least partially, by taking the effect of non-proportional income taxes into account (e.g., Diewert [1971], Wales [1973], Hall [1973]). In this thesis I have also applied this partial adjustment to the observed average wage rate using the same method as Wales and Woodland [1974a]. This method consists of correcting the average wage rate for the marginal tax rate. In order to determine the marginal tax rate, I assume joint filing and determine from the 1967 to 1971 United States federal income tax tables the tax bracket⁶ and hence the marginal tax rate for a family with that level of federal income tax. 
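The tax correction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bracket schedule below is hypothetical (it is not the 1967 to 1971 federal schedule), the function names are invented for the example, and the net wage is assumed to be the average wage multiplied by one minus the marginal tax rate.

```python
from bisect import bisect_right

# Hypothetical joint-filing schedule: (lower bound of taxable income, marginal rate).
# Illustrative numbers only, NOT the 1967-1971 United States federal tables.
BRACKETS = [(0.0, 0.14), (1000.0, 0.15), (2000.0, 0.16), (3000.0, 0.17), (4000.0, 0.19)]


def tax_at_lower_bounds(brackets):
    """Tax owed at the lower bound of each bracket."""
    owed = [0.0]
    for (lower, rate), (next_lower, _) in zip(brackets, brackets[1:]):
        owed.append(owed[-1] + rate * (next_lower - lower))
    return owed


def marginal_tax_rate(tax_paid, brackets=BRACKETS):
    """Marginal rate of the bracket implied by the observed tax payment."""
    owed = tax_at_lower_bounds(brackets)
    index = max(bisect_right(owed, tax_paid) - 1, 0)
    return brackets[index][1]


def net_wage(labour_income, hours, tax_paid, brackets=BRACKETS):
    """Average wage corrected for the marginal tax rate (assumed form: w * (1 - t))."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    average_wage = labour_income / hours
    return average_wage * (1.0 - marginal_tax_rate(tax_paid, brackets))


# Example family: 6000 dollars of labour income, 2000 hours, 300 dollars of tax paid.
print(net_wage(labour_income=6000.0, hours=2000.0, tax_paid=300.0))
```

The bracket is located from the tax actually paid, mirroring the description in the text; a version keyed on taxable income would work the same way with a different lookup list.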
It should be stressed again that this “corrected” average wage rate will still be different from the relevant marginal wage rate because the adjustment is only for federal taxes (neglecting state income taxes) and it does not take account of, e.g., overtime premiums, multiple wage rates, etc.⁷

There has been some discussion in the empirical labour supply literature concerning the income variable to be used to measure the income effect (e.g., see Kosters [1963]). Transfer income is sometimes related to supply in an unsatisfactory way, e.g., unemployment benefits, income from pension plans. This creates spurious correlation between income and supply. Non-labour income such as rents, interest, dividends, etc. is usually non-zero for only a small portion of the sample. Using a definition similar to the one used by Hall [1973], I define a full rental income variable as the sum of taxable asset income plus 12 percent of the value of the family car(s) plus 6 percent of the net (i.e., corrected for the outstanding mortgage debt) value of the family house(s). It is of interest to note that only 5 percent (295 out of 5415) of the sample had zero rental income defined in this way. (The same income variable was used in the labour force participation models of the previous Chapter.)

The sample over which I estimate the labour supply functions using the predicted wage rate is identical to the sample used for the labour force participation models and is described in Section 3 of Chapter IV. I also estimate the labour supply functions using the corrected observed wage rate over the same sample, except that I have excluded observations with a zero wage rate.⁸ This reduced sample is summarized in Table XXI⁹ in terms of the variables used in the labour supply regressions.

TABLE XXI
DESCRIPTION OF SAMPLE USED FOR LABOUR SUPPLY REGRESSIONS

                            Both Working                          Husband Only
                      Husband             Wife
# OBS                 2409                2409                2567

Average of:
YRS SCHOOL (*)        11.13 (3.3)**       11.35 (2.4)         10.75 (3.7)
ACHIEVE               9.18 (2.5)                              9.13 (2.6)
AGE (*)               40.13 (11.6)        37.58 (10.9)        42.05 (12.2)
KIDS                  2.77 (2.2)          2.78 (2.18)         3.07 (2.2)
WAGE NET (*)          2.74 (1.4)          1.72 (1.2)          3.13 (1.7)
ASSET Y               1039 (1816)         1093 (1816)         1405 (2878)

# of Families With Dummy Variable = 1:
RACE                  511 (21%)           511 (21%)           451 (18%)
JEW                   77 (3%)             77 (3%)             84 (3%)
LIMIT (*)             225 (9%)            20 (.8%)            265 (10%)
KID Y                 607 (25%)           607 (25%)           530 (21%)
URBAN                 1503 (62%)          1503 (62%)          1556 (61%)
SOUTH                 950 (39%)           950 (39%)           880 (34%)
EXP                   1807 (75%)                              1956 (76%)
SELF                  304 (13%)                               444 (17%)
CLERK W                                   913 (38%)
UNSKILL W                                 553 (23%)
UNION                 717 (30%)                               832 (32%)
MORTG                 1386 (57%)          1386 (57%)          1378 (54%)
RESERVE               1278 (53%)          1278 (53%)          1363 (53%)
IRREC                 117 (5%)            117 (5%)            141 (5%)

Hours Supplied, Average of:
LOG HOURS (*)         7.65 (.44)          6.75 (1.08)         7.65 (.50)
HOURS                 2227.0 (647.7)      1227.18 (739.8)     2276.13 (687.5)

* These variables are either for the husbands or the wives depending on the column in which they appear.
** Numbers between brackets are standard deviations. If these numbers are followed by % then they denote relative frequencies.

3. Estimation Results: Labour Supply Functions With Observed Wage Rates

The results of the supply equations for the “both working” and “husband only” alternatives are presented in Table XXII. The supply curve for the “wife only” alternative is as follows (standard errors in brackets below the coefficients):

\[
\text{LOG HOURS} = \underset{(.31)}{6.31} - \underset{(.28)}{.62}\,\text{URBAN} + \underset{(.26)}{.69}\,\text{WAGE W} - \underset{(.03)}{.08}\,(\text{WAGE W})^2 - \underset{(.06)}{.13}\,\text{ASSET Y} \tag{4.1}
\]

# OBS = 80; R² = .13; S E = 1.09

Whereas the number of years of schooling increases the probability that the husband will not be in the labour force (see Tables XV and XIX),¹⁰ there is a tendency for more educated husbands to work more hours than less educated husbands, conditional on their being in the labour force (Table XXII). This would suggest the hypothesis that more educated husbands are in the labour force for a shorter period of their life cycle but work on average longer hours when they are in the labour force.

The effect of education on the labour supply of the wives is also positive and seems to be more pronounced than for the husband (Table XXII). 
Furthermore, more educated women tend to have a higher probability of joining the labour force (Tables XV and XIX). Therefore more highly educated women probably spend more of their lifetime in the labour force than less educated women and work longer hours on average while they are in it. The positive effect of education on labour supply was also established by Cohen et al. [1970] and Hill [1973].

TABLE XXII
CONDITIONAL LABOUR SUPPLY FUNCTIONS (USING OBSERVED WAGE RATES)¹

                            Both Working                          Husband Only
Variables             Husband             Wife
CONSTANT              6.58 (.11)          4.98 (.28)          6.29 (.10)
YRS SCHOOL²           .007 (.003)         .024 (.009)         .01 (.002)
ACHIEVE               .01 (.003)
RACE                  -.07 (.02)          .17 (.06)           -.08 (.02)
JEW                   .13 (.05)           -.26 (.13)
LIMIT²                -.18 (.03)          -.87 (.23)          -.19 (.03)
KIDS                                      -.06 (.01)          -.01 (.004)
KID Y                 -.06 (.02)
URBAN                                     .18 (.05)           .07 (.02)
SOUTH                                     .08** (.05)
AGE²                  .049 (.005)         .084 (.01)          .074 (.004)
AGESQ²                -.0006 (.0001)      -.0009 (.0002)      -.001 (.0001)
EXP                   .24 (.02)                               .22 (.02)
SELF                  .17 (.03)                               .16 (.03)
CLERK W                                   -.20 (.05)
UNSKILL W                                 -.45 (.06)
UNION                                                         -.06 (.02)
MORTG                 .04 (.02)           .14 (.04)           .05 (.02)
RESERVE               .05 (.02)           .18 (.05)           .08 (.02)
WINDFALL Y            -.08 (.04)
WAGE H NET⁷           -.06 (.02)          -.17 (.02)          -.12 (.01)
(WAGE H NET)²         -.003 (.001)                            .004 (.001)
(WAGE H NET)³                             .0005 (.0002)
WAGE W NET            -.05 (.01)          .32 (.06)
(WAGE W NET)²         .003 (.001)         -.09 (.01)
(WAGE W NET)³                             .004 (.0006)
ASSET Y³              .06 (.02)           -.025 (.01)         .01 (.003)
(ASSET Y)²            -.008 (.003)
(ASSET Y)³            .0003 (.0001)

# OBS                 2409                2409                2567
R²                    .24                 .12                 .36
R² (adjusted)         .23                 .11                 .35
S E⁴                  .39                 1.01                .40
μ⁵                    7.65                6.75                7.65
σ⁶                    .44                 1.08                .50

Notes to Table XXII:
1 Each variable is significant at the 5 percent level (t-test).
2 These variables are either for the husbands or the wives depending on the column in which they appear.
3 The ASSET Y variable is divided by 1,000.
4 S E: standard error of the estimate.
5 μ: mean of dependent variable.
6 σ: standard deviation of dependent variable.
7 ² and ³ denote exponential powers.

There is no significant effect of the racial dummy variable on the probability of labour force participation for the husband (Tables XV and XIX). 
The negative effect of the same variable on the labour supply of the husband is, however, significant (Table XXII). Non-white families are more likely to choose the “both working” alternative over the “husband only” alternative (Table XVI). Being non-white also increases the odds of participation for the wife (Table XIX) and the average number of hours she supplies (Table XXII). Similar differences in the labour supply of white and non-white men and women were found by Ashenfelter and Heckman [1973] and Cain [1966]. Combining these results, it would seem that non-white married men work fewer hours than white men (or are constrained to do so) but that non-white families probably attempt to offset this negative effect, as non-white wives join the labour force more frequently and work more hours than white wives.

A dummy variable indicating that the husband belongs to the Jewish religion has no effect on the labour force participation decisions within the family (Chapter IV, Section 4). The effect of this variable on the labour supply of the husband and the wife who are both working is significant. A Jewish husband will work relatively more, and his wife relatively less, than their counterparts in non-Jewish couples. This religious variable is not significant for the husband whose wife is not in the labour force. A dummy variable indicating that the husband was Catholic has an insignificant effect on both labour force participation and labour supply.

A handicap diminishes both the probability of participating for husband and wife (Table XIX) and the average number of hours supplied if they are in the labour force (Table XXII).

The number of children has a negative influence on the labour supply of the wives who are working together with their husbands and on the supply of the husbands working alone (Table XXII). A KID 6 dummy variable has similar effects on the labour supply. 
I choose the KIDS variable in the specification of Table XXII because this variable gives a slightly better fit than the KID 6 variable (in terms of R²). The KID 6 variable also decreases to a substantial degree the odds of the wife participating (Table XIX).

A variable indicating whether some children in the family have any income (mostly labour income) has no significant effect on the labour force participation decision of the family (see Chapter IV, Section 4). The same variable, however, negatively affects the labour supply of the husband in families where both partners are working.

A significant quadratic form for the AGE variable was found in all three cases of Table XXII. Similar results were found by Cohen et al. [1970] and Smith [1972]. The number of hours supplied over the lifetime tends to peak around 41.2 years for the husband and at 46.6 years for the wife when they are both working. When only the husband is working his supply peaks around 36.9 years.¹¹ Measuring the life cycle hours profile with the MARRIAGE variable instead of AGE would give similar results. The AGE variable, however, fits better in terms of R² and is comparable with previously established results in labour supply studies.

Table XXII shows that self-employed men and men on the same job for more than 3 years tend to work more hours. Female clerks and unskilled female workers (who together account for 60% of all the working wives) tend to work fewer hours than the other occupational groups.

It was established earlier that the existence of a mortgage debt increases the odds of choosing the “both working” alternative and the “husband only” alternative as compared to the other alternatives. However, it increases especially the odds of the “both working” choice (see Tables XV and XVI). 
In terms of labour supply it is similarly the wife’s hours which are increased the most.

The RESERVE dummy variable was only relevant in the labour force participation choice of the family between the “wife only” and “none working” alternatives (Table XV). It is hard to explain the positive influence of having reserves on the supply of labour. Reserves can either be caused by labour supply, i.e., hard-working couples can save more, or they can cause increased labour supply, i.e., couples work longer hours in order to accumulate savings.

Before discussing the wage and income effects I will make a general comparison, in terms of the socio-demographic variables, between the three labour supply functions represented in Table XXII. The differences between the coefficients of the husbands’ labour supply functions and those of the wives’ labour supply function are substantial. On the other hand there is a remarkable similarity among the coefficients of the socio-demographic variables in the two supply curves for the husbands (Table XXII, Columns 1 and 3). Exceptions to this general tendency are the set of age variables and some variables which are significant in one but not in the other supply function, e.g., ACHIEVE, JEW, KIDS, KID Y, URBAN, UNION, and WINDFALL Y. There is a substantial difference in the wage and income coefficients, but this difference is not reflected in the supply and income elasticities for both samples of husbands (see below).

In discussing the effects of the WAGE and ASSET Y variables I am interested in identifying the income and the (compensated) substitution effects and in checking for the non-negativity of the latter. The sum of the income and (compensated) substitution effect is the slope of the supply curve. 
In the “both working” case I am furthermore interested in identifying whether the husband’s and the wife’s labour supply are net complements or net substitutes.¹²

As discussed above in Chapter II, Section 2.2, the functional form chosen for the supply function is of the following general form (see Chapter II, equation (1.14)):

\[
\ln R = \text{constant} + (a_1 v_i + a_2 v_i^2 + a_3 v_i^3) + (b_1 v_j + b_2 v_j^2 + b_3 v_j^3) + (c_1 A + c_2 A^2 + c_3 A^3), \qquad i, j = m, f,\ i \neq j \tag{4.2}
\]

Let A, B, C be defined as the partial derivatives of (4.2) with respect to, respectively, \(v_i\), \(v_j\) and A. Then it is readily seen (see also Chapter II, Section 2.2) that A is the slope of the supply curve for i, that B indicates gross complementarity or substitutability between the supply of i and j, and that C is the income effect for i’s labour supply. Furthermore (A - CR) is the compensated substitution effect (which should be positive), while net complementarity or substitutability is identified by (B - CR). Because of the semi-logarithmic form of (4.2) the corresponding elasticities are easily calculated by multiplying A, B, C, (A - CR) and (B - CR) by the appropriate wage or income variable.

These elasticities are tabulated in Tables XXIII and XXIV. They are calculated using the point estimates of the wage and income coefficients in, respectively, Table XXII and equation (4.1). The supply curve for men is negatively sloped in both the “both working” and “husband only” alternatives. The curve tends to be more elastic (less steep) in the latter alternative. In the “both working” case the supply curve for the wives is upward sloping up to a net wage rate of $1.93, slopes negatively from that wage up to a net wage of $14.29, and has a positive slope again thereafter. When only the wife is working the supply curve is almost always positively sloped. For this case, the labour supply curve is also much more elastic than for the other wives.

Income effects are usually small except in the “wife only” case. Only for women is leisure usually a normal good. 
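The elasticity definitions given above can be sketched in Python as follows. This is a minimal illustration, not the computation behind Tables XXIII and XXIV: the function names are invented for the example, the evaluation point (wage, asset income, hours) is hypothetical, and hours are assumed to be measured in thousands so that R is in units compatible with ASSET Y (which is divided by 1,000). Only the coefficients are taken from the text, namely those of equation (4.1).

```python
def poly_slope(coefs, x):
    """Derivative of c1*x + c2*x**2 + c3*x**3 with respect to x."""
    c1, c2, c3 = coefs
    return c1 + 2.0 * c2 * x + 3.0 * c3 * x ** 2


def supply_elasticities(a, b, c, own_wage, other_wage, asset_income, hours):
    """Elasticities implied by a semi-logarithmic supply function like (4.2).

    a, b, c hold the (linear, squared, cubed) coefficients on the own wage,
    the partner's wage and asset income. `hours` plays the role of R in the
    text; its units are an assumption of this sketch (thousands of hours).
    """
    A = poly_slope(a, own_wage)       # slope of the supply curve
    B = poly_slope(b, other_wage)     # gross cross effect
    C = poly_slope(c, asset_income)   # income effect
    return {
        "supply": A * own_wage,
        "gross_cross": B * other_wage,
        "income": C * asset_income,
        "compensated_substitution": (A - C * hours) * own_wage,
        "net_cross": (B - C * hours) * other_wage,
    }


# Coefficients of equation (4.1); the evaluation point itself is hypothetical.
wife_only = supply_elasticities(
    a=(0.69, -0.08, 0.0),
    b=(0.0, 0.0, 0.0),      # no partner wage appears in (4.1)
    c=(-0.13, 0.0, 0.0),
    own_wage=1.86,          # dollars per hour (illustrative)
    other_wage=0.0,
    asset_income=1.08,      # thousands of dollars (illustrative)
    hours=1.2,              # thousands of hours (illustrative)
)
for name, value in wife_only.items():
    print(f"{name}: {value:.2f}")
```

At this illustrative point the signs agree with the “wife only” pattern described in the text: a positive supply elasticity, a negative income elasticity, and a positive compensated substitution elasticity.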
The compensated substitution effect is positive for most of the wives in the sample but for almost none of the husbands. In the “both working” case the husband’s and the wife’s working hours are usually both gross and net “substitutes”, i.e., if one partner’s wage goes up the other partner will work less. This result follows from the husband’s equation as well as from the wife’s

TABLE XXIII
LABOUR SUPPLY ELASTICITIES (CORRESPONDING TO LABOUR SUPPLY FUNCTIONS USING OBSERVED WAGE RATES)¹

                                        Elasticity      Average of          Number and % of
                                        Defined At      Elasticities for    Sample Points With
                                        The Means       Sample Points       Positive Elasticity
BOTH WORKING - HUSBAND
supply elasticity                       -.20            -.22                0
income elasticity                       .05             .03                 2343 (97%)
compensated substitution elasticity     -.49            -.50                0
gross cross-elasticity                  -.06            -.05                14 (.6%)
net cross-elasticity                    -.16            -.15                11 (.5%)

BOTH WORKING - WIFE
supply elasticity                       .05             -.03                1720 (71%)
income elasticity                       -.03            -.03                0
compensated substitution elasticity     .00005          .03                 1920 (80%)
gross cross-elasticity                  -.45            -.41                7 (.3%)
net cross-elasticity                    -.11            -.26                11 (.5%)

HUSBAND ONLY
supply elasticity                       -.26            -.27                1 (.03%)
income elasticity                       .013            .013                2567 (100%)
compensated substitution elasticity     -.33            -.34                1 (.03%)

WIFE ONLY
supply elasticity                       .73             .48                 78 (98%)
income elasticity                       -.14            -.14                0
compensated substitution elasticity     1.02            .78                 79 (99%)

1 The definitions of the various elasticities are given in the text.

TABLE XXIV
DISTRIBUTION OF THE ELASTICITIES OF THE SUPPLY CURVE (OBSERVED WAGE RATE)
# of Sample Points % of Sample Points
e<-1 -1<