ANALYSIS OF VARIANCE ESTIMATORS FOR THE SEASONAL ADJUSTMENT OF ECONOMIC TIME SERIES

by Walter Erwin Diewert

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ARTS in the Department of Mathematics

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
December, 1964

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Mathematics
The University of British Columbia, Vancouver 8, Canada
Date: January 25, 1965

ABSTRACT

The purpose of this thesis is to develop a valid statistical procedure for the estimation of the seasonal component of an economic time series when the seasonal component is suspected to be partly additive and partly multiplicative to the trend. The proposed procedure is based on a three-way classification analysis of variance model, where the first classification is used to represent the long term trend of the series, the second classification is used to represent any regular trend or cycle within the long term trend, and the third classification is used to represent the seasonal. The interaction term between the long term trend and the seasonal may be used to represent any long term change in the nature of the seasonal.
However, as the standard analysis of variance significance tests assume independently distributed residuals, it is necessary to develop a test for independence of residuals against the very likely alternative of first order (positive) serial correlation. This is done by calculating the mean and variance of the Durbin-Watson d statistic for the three-way classification analysis of variance model. A numerical example is given to illustrate the procedure.

ACKNOWLEDGMENTS

I should like to acknowledge the patient and understanding assistance of Dr. Stanley W. Nash in the preparation of this thesis. I should also like to acknowledge the financial support of the National Research Council of Canada.

TABLE OF CONTENTS

Abstract
Acknowledgments
Chapter
I METHODS FOR SEASONAL ADJUSTMENT
   1.1 Introduction
   1.2 Analysis of variance estimators (two-way classification)
   1.3 A method due to Wald
   1.4 The difference from moving average method
   1.5 Outline of a method using three-way classification analysis of variance estimators
II ANALYSIS OF VARIANCE ESTIMATORS (THREE-WAY CLASSIFICATION, FIXED EFFECTS MODEL)
III A TEST FOR INDEPENDENCE OF REGRESSION RESIDUALS
   3.1 The Durbin-Watson d statistic
   3.2 The moments of the d statistic
IV CALCULATION OF THE MEAN AND VARIANCE OF d IN THE THREE-WAY CASE
   4.1 Introduction
   4.2 Calculation of the matrix X
   4.3 Calculation of the matrix M
   4.4 Calculation of the mean of d
   4.5 Calculation of the variance of d
V A NUMERICAL EXAMPLE AND CONCLUSIONS
Bibliography

LIST OF TABLES

Table
I Table of the a's
II Table of the e's
III Canadian Gross National Product by Quarters, 1947-1958
IV The Sample m's
V Analysis of Variance Table

LIST OF FIGURES

Figure
1.
The Matrix $X$
2. The Matrix $(X^TX)^{-1}X^T$
3. The Matrix $M_0 = X(X^TX)^{-1}X^T$
4. The Matrix MA

CHAPTER I

METHODS FOR SEASONAL ADJUSTMENT

1.1 Introduction

We suppose that we are given a time series which has a strictly periodic or seasonal component; for example, an economic time series $y_{ij}$, where $i = 1, 2, \dots, S$ and $j = 1, 2, \dots, T$. That is, we are given T seasonal observations for each of S years, where T is usually 4 (for quarterly data) or 12 (for monthly data). A problem of some practical importance is the estimation of the seasonal component so that the time series may be seasonally adjusted.

The purpose of this study is to find a valid statistical procedure for the estimation of the seasonal component of an economic time series when it is suspected that the nature of the seasonal is slowly changing over time. In the course of the study, some results of independent interest dealing with testing analysis of variance residuals (in a three-way classification) for independence are derived.

The remaining sections of this chapter summarize some of the methods presently used to estimate seasonals. With the perspective gained by reviewing the present methods, the rest of this thesis may be more succinctly summarized in the final section of this chapter.

1.2 Analysis of Variance Estimators (Two-Way Classification)

One method of estimating seasonal components, developed by R. L. Anderson [1], is to use a two-way classification analysis of variance model. We suppose that the series $y_{ij}$ may be represented as follows:

1.2.1   $$y_{ij} = u_{..} + u_{i.} + u_{.j} + \varepsilon_{ij}, \qquad i = 1, 2, \dots, S;\; j = 1, 2, \dots, T,$$

where $\{\varepsilon_{ij}\}$ is a series of independent normal variates with mean zero and variance $\sigma^2$,
$u_{i.}$ = the ith year effect, where $\sum_{i=1}^{S} u_{i.} = 0$,
$u_{.j}$ = the jth seasonal effect, where $\sum_{j=1}^{T} u_{.j} = 0$,
$u_{..}$ = the overall mean.
Under the above assumptions, the least squares estimates for the u's (which are the maximum likelihood and minimum variance linear estimators as well) are given by

1.2.1
$$m_{..} = \frac{1}{ST}\sum_{i=1}^{S}\sum_{j=1}^{T} y_{ij} = y_{..}$$
$$m_{i.} = \frac{1}{T}\sum_{j=1}^{T} y_{ij} - y_{..} = y_{i.} - y_{..}, \qquad i = 1, 2, \dots, S$$
$$m_{.j} = \frac{1}{S}\sum_{i=1}^{S} y_{ij} - y_{..} = y_{.j} - y_{..}, \qquad j = 1, 2, \dots, T$$

However, it frequently happens in dealing with economic time series that the residuals $\{\varepsilon_{ij}\}$ are not independent; for example, they may follow a low order autoregressive scheme. (That is, if we renumber the residual series $\varepsilon_{11}, \varepsilon_{12}, \dots, \varepsilon_{1T}, \varepsilon_{21}, \varepsilon_{22}, \dots, \varepsilon_{S1}, \varepsilon_{S2}, \dots, \varepsilon_{ST}$ as $\varepsilon_1, \varepsilon_2, \dots, \varepsilon_N$ where N = ST, then the residual series $\{\varepsilon_t\}$ may follow a first order Markov scheme given by

1.2.2   $$\varepsilon_t = \rho\,\varepsilon_{t-1} + \eta_t$$

where $\{\eta_t\}$ is a series of independent normal variates with mean zero and constant variance, and $\rho$ is a positive parameter less than one. We say $\rho$ is positive because this is the most likely alternative as far as most economic time series are concerned.) Hence it is advisable to test the residuals for independence against alternatives of the form given by 1.2.2. (If the residual series does in fact follow the form given by 1.2.2, then the series is said to be autocorrelated or serially correlated.)

Durbin and Watson [ ] and [10] have derived a test for autocorrelation of the two-way classification analysis of variance residuals which is uniformly most powerful against first order (one-sided) Markov process alternatives, which is the most probable alternative from the viewpoint of economic theory. Suppose that the sample residuals $z_{ij}$ are defined by

1.2.3   $$y_{ij} = m_{..} + m_{i.} + m_{.j} + z_{ij}$$

i = 1, 2, ..., S; j = 1, 2, . .
., T

Then we renumber the sequence of residuals $z_{11}, z_{12}, \dots, z_{ST}$ as $z_1, z_2, \dots, z_{ST}$ and calculate the Durbin-Watson d statistic, defined by

1.2.4   $$d = \frac{\sum_{t=2}^{ST} (z_t - z_{t-1})^2}{\sum_{t=1}^{ST} z_t^2}$$

Under the null hypothesis of normally distributed independent residuals $\varepsilon_{ij}$, the mean and variance of d are given by

1.2.5   $$E[d] = 2\left(1 + \frac{1}{T} - \frac{1}{S(T-1)}\right)$$

together with an expression for Var[d] whose legible part reads

$$\operatorname{Var}[d] = \frac{4\left[ST^2 - T^2 - 3ST + 3T + 4 + \dots (7S - 12) - \dots\right]}{(S-1)(T-1)^2(ST - T - S + 3)} + \frac{\left[\dots + \dots (6 - 5S) + \dots\right]}{(S-1)(T-1)^2(ST - T - S + 3)}$$

We may use a normal approximation to the true distribution of d under the null hypothesis, with mean and variance given by 1.2.5, to test whether we can accept the null hypothesis of independence of residuals. If the null hypothesis is rejected, then all we can say about the estimators given by 1.2.1 is that they are unbiased provided the residuals $\varepsilon_{ij}$ are distributed with mean zero and finite second moments. (See, for example, [14], chapter 12.) If the null hypothesis is not rejected, then we may use the standard analysis of variance procedure to get confidence intervals for our sample estimates 1.2.1.

1.3 A Method Due to Wald

Wald assumes that the time series may be represented by an additive model given by

1.3.1   $$y_t = m_t + c_t + s_t + \varepsilon_t, \qquad t = 1, 2, \dots, ST$$

where
$m_t$ = the long term trend effect
$c_t$ = the business cycle effect
$s_t$ = the seasonal effect
$\varepsilon_t$ = the residual, which is assumed to be independently distributed with zero mean.

Wald assumes that a T "month" moving average (say T = 12 for convenience) given by

1.3.2   $$\bar m_t = \sum_{i=-6}^{6} w_i\, y_{t+i}, \qquad w_i = \tfrac{1}{12} \text{ for } i = -5, \dots, 5, \quad w_i = \tfrac{1}{24} \text{ for } i = -6 \text{ and } 6,$$

takes out the seasonal part without affecting the trend or cycle; that is, $\bar m_t = m_t + c_t + \eta_t$, where $\eta_t$ is a new residual with mean zero.
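The claim made for the moving average 1.3.2 can be checked on artificial data. The following is a minimal sketch in Python, not part of the original procedure: the function names, the linear trend and the seasonal pattern are all hypothetical, chosen only so that the seasonal is strictly periodic and sums to zero over a year.

```python
def centred_ma_weights(period=12):
    """Weights w_{-6}, ..., w_{6} of 1.3.2: 1/24 at the two ends, 1/12 inside."""
    half = period // 2
    weights = [1.0 / period] * (2 * half + 1)
    weights[0] = weights[-1] = 1.0 / (2 * period)
    return weights


def centred_moving_average(y, period=12):
    """Return {t: smoothed value} for every 0-based t that has a full window."""
    half = period // 2
    w = centred_ma_weights(period)
    return {
        t: sum(w[i + half] * y[t + i] for i in range(-half, half + 1))
        for t in range(half, len(y) - half)
    }


# Hypothetical data: a linear trend plus a seasonal pattern that sums to zero.
seasonal = [3, -1, 4, -2, 0, 5, -6, 2, -3, 1, -4, 1]
trend = [10.0 + 0.5 * t for t in range(48)]
y = [trend[t] + seasonal[t % 12] for t in range(48)]

smoothed = centred_moving_average(y)
# Largest gap between the moving average and the underlying trend.
max_error = max(abs(smoothed[t] - trend[t]) for t in smoothed)
print(max_error)
```

In this idealized case the moving average returns the trend up to floating point rounding, because the weights sum to one, are symmetric, and give every month a total weight of 1/12. With real data the trend is not exactly linear and the equality holds only approximately.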
Finally, Wald assumes that the seasonal variation is the product of two functions, one which is strictly periodic with a period of 12 months, and another which is not periodic but changes its value only slowly with time.

The numerical procedure used by Wald to derive estimators for the seasonal component is as follows:

(a) First re-index the series $y_t$, t = 1, 2, ..., ST as $y_{ij}$, i = 1, 2, ..., S; j = 1, 2, ..., T, where $y_{ij}$ is the value of the original series in the ith year for which we have observations and in month j.

(b) Form the centred 12 month moving averages, which are assumed to be free of the seasonal component:

$$\bar m_{ij} = \frac{1}{24}\, y_{i,j-6} + \frac{1}{12} \sum_{k=-5}^{5} y_{i,j+k} + \frac{1}{24}\, y_{i,j+6}$$

(c) Now an overall seasonal effect $a_j$ is obtained by averaging the differences $y_{ij} - \bar m_{ij}$ over all years for each month j = 1, 2, ..., 12.

(d) The values $a_j$ are "corrected" in order to make their sum zero:

$$a_j' = a_j - |a_j|\, \frac{\sum_{i=1}^{12} a_i}{\sum_{i=1}^{12} |a_i|}$$

(e) Finally, the fact that the seasonal fluctuations may be slowly but systematically changing over time is now taken into account by the final estimate of the seasonal fluctuations, $s_{ij}$, given by

$$s_{ij} = \dots$$

(f) In order to eliminate the seasonal, calculate $y''_{ij} = y_{ij} - s_{ij}$.

However, there are some salient drawbacks to the above procedure. One drawback is the relative complexity of the estimates, but the main drawback is the fact that there is a different estimator for the seasonal component for every observation. Hence we cannot perform any statistical inference, because there is only one observation for each parameter to be estimated. Also, we cannot test whether the residuals follow the distribution postulated by the model. However, Wald's method illustrates the difficulty of obtaining a statistical procedure for dealing with the problem of slowly time trending seasonal adjustments.
1.4 The Difference from Moving Average Method

It is supposed "that to a sufficient degree of accuracy the observed series $\{y_t\}$ satisfies the additive model

1.4.1   $$y_t = m_t + s_t + \mu_t, \qquad t = 1, 2, \dots, ST$$

where $m_t$ is the trend, regarded as a smooth deterministic function of time, $s_t$ is the seasonal component, regarded as strictly periodic with a period of one year, and $\mu_t$ is a random disturbance, regarded as generated by a stationary process." Stationarity of the residual series means that $E[\mu_t] = 0$ and $E[\mu_t \mu_{t+s}] = f(s)$ for all t.

Durbin also states, "In many situations, particularly in economics, the phenomena under investigation are multiplicative rather than additive in their effects, and in these cases, $y_t$ would be taken as the logarithm of the observed value." [8] In this latter case, our model would be $y_t = m_t s_t e^{\mu_t}$, which implies $\log y_t = \log m_t + \log s_t + \mu_t$, which means that again the basic additive model 1.4.1 may be used.

The numerical procedure used to derive estimators for the trend and the seasonal in the additive model given by

1.4.3   $$y_t = m_t + s_t + \mu_t, \qquad t = 12(i-1) + j;\; j = 1, 2, \dots, 12;\; i = 1, \dots, S$$

is as follows:

(a) Form the centred 12 month moving averages $\bar m_t$ given by

$$\bar m_t = \frac{1}{24}\, y_{t-6} + \frac{1}{12} \sum_{i=-5}^{5} y_{t+i} + \frac{1}{24}\, y_{t+6}, \qquad t = 7, 8, \dots, 12(S-1) + 6,$$

which are free of the seasonal under the assumption that the moving average removes the seasonal.

(b) Form the deviations from trend given by $x_t = y_t - \bar m_t$.

(c) An overall seasonal effect $a_j$ is obtained by averaging over all years for each month as follows:

$$a_j = \frac{1}{S-1} \sum_{i=2}^{S} x_{12(i-1)+j}, \qquad j = 1, 2, \dots, 6$$
$$a_j = \frac{1}{S-1} \sum_{i=1}^{S-1} x_{12(i-1)+j}, \qquad j = 7, 8, \dots, 12$$

(d) Finally, the values $a_j$ are "corrected" in order to make their sum zero. Define the seasonal adjustments $s_j$ by

$$s_j = a_j - \frac{1}{12} \sum_{i=1}^{12} a_i$$

Durbin in his article [8] goes on to show that the estimates $s_j$ above are the same as the analysis of variance estimators $m_{.j}$ given in section 1.2, except for end corrections.
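Steps (a) to (d) above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `seasonal_by_moving_average`, the 0-based time index, and the artificial series (a linear trend plus a fixed zero-sum monthly pattern, with no random disturbance) are hypothetical and are not taken from Durbin's article.

```python
def seasonal_by_moving_average(y, period=12):
    """Steps (a)-(d) of section 1.4 for a series y covering at least two whole years."""
    half = period // 2
    # (a) centred moving averages, defined wherever a full window is available
    trend = {}
    for t in range(half, len(y) - half):
        inner = sum(y[t + i] for i in range(-half + 1, half))
        trend[t] = (0.5 * y[t - half] + inner + 0.5 * y[t + half]) / period
    # (b) deviations from trend
    x = {t: y[t] - trend[t] for t in trend}
    # (c) average the deviations month by month over the S - 1 years available
    a = []
    for j in range(period):
        deviations = [x[t] for t in x if t % period == j]
        a.append(sum(deviations) / len(deviations))
    # (d) correct the averages so that the seasonal adjustments sum to zero
    correction = sum(a) / period
    return [a_j - correction for a_j in a]


# Hypothetical data: five years of monthly observations without noise.
pattern = [3, -1, 4, -2, 0, 5, -6, 2, -3, 1, -4, 1]
series = [20.0 + 0.3 * t + pattern[t % 12] for t in range(60)]
estimates = seasonal_by_moving_average(series)
print([round(s, 6) for s in estimates])
```

With this noise-free series the estimates reproduce the monthly pattern up to rounding, since the moving average passes the linear trend unchanged and removes the zero-sum seasonal. The first six and last six observations have no moving average, which is why each monthly average in step (c) uses S - 1 years.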
Durbin also computes the variance of the estimators $s_j$ under the null hypothesis of independently distributed identical normal residuals with zero mean and variance $\sigma^2$; for example,

$$\operatorname{Var}[s_j] = \frac{11}{12(S-1)} + \frac{401}{10368\,(S-1)^2}$$

On the whole, the difference from moving average method for estimating seasonal adjustments is similar to the analysis of variance method outlined in section 1.2.

1.5 Outline of a Method Using Three-Way Classification Analysis of Variance Estimators

The procedures outlined in sections 1.2 and 1.4 above enable us to deal with situations where the seasonal is approximately an additive constant independent of the trend, or, if we take logarithms, we can use the same models to deal with situations where the seasonal is a constant multiplicative factor times the trend. However, these models are inadequate under some circumstances. The data for the numerical example in Chapter V indicate that for some series, the seasonal is partly additive and partly multiplicative to the trend, though perhaps not significantly so. Wald's procedure outlined in section 1.3 is intended to deal with a situation of this type but, as was pointed out, the procedure is essentially an empirical one, not a rigorous statistical procedure. It is the object of this thesis to derive a reasonable statistical procedure to deal with time series which have a seasonal component which is partly additive and partly multiplicative to the trend.

The proposed method is based on a three-way classification analysis of variance model. We assume that there are T seasons and the season constitutes one classification. Assume that we are given data for $N/T$ years, or N observations in all. Now suppose that from theoretical or empirical considerations, it is suspected that the seasonal changes its structure significantly every S years.
For example, if we are dealing with a gross national product series of a country experiencing fairly rapid growth, then we would expect the structure of the seasonal to change significantly in a manner closely related to significant long term changes in the country's productive capacity. One possible rule of thumb method for determining S explicitly is given in Chapter V.

Now we re-index our observations $y_t$, t = 1, 2, ..., N as $y_{111}, y_{112}, \dots, y_{11T}, y_{121}, \dots, y_{RST}$, or as

1.5.1   $$y_{ijk}, \qquad i = 1, 2, \dots, R;\; j = 1, 2, \dots, S;\; k = 1, 2, \dots, T,$$

where N = RST and where
i is the "group" (of S years) variable
j is the "year" variable within each group
k is the seasonal variable (usually T = 4 or 12).

Now we assume the following model holds:

1.5.2   $$y_{ijk} = u_{...} + u_{i..} + u_{.j.} + u_{..k} + u_{ij.} + u_{i.k} + u_{.jk} + \varepsilon_{ijk}$$

where
$u_{...}$ represents an overall mean effect
$u_{i..}$ represents the ith "group" or long term trend effect
$u_{.j.}$ represents the jth "year" effect within each group of S years
$u_{..k}$ represents the overall kth seasonal effect
$u_{ij.}$ is the interaction effect between the ith group and the jth year within the group
$u_{.jk}$ is the interaction effect between the jth year and the kth seasonal
$u_{i.k}$ is the interaction effect between the ith group and the kth seasonal
and finally, $\varepsilon_{ijk}$ is assumed to be a normal variable, with zero mean and variance $\sigma^2$, distributed independently of all other $\varepsilon_{lmn}$ ($i \neq l$ or $j \neq m$ or $k \neq n$).

Under the above assumptions, we may use the analysis of variance estimators $m_{...}$, etc., to estimate the u's. Also, we will want to test whether certain groups of u's are significantly different from zero. In particular, as we are interested mainly in estimating seasonals, we will want to test
(a) whether the $u_{..k}$ differ significantly from zero, which they almost certainly will with seasonal economic data,
(b) whether the $u_{.jk}$
differ significantly from zero, which they should not, as we do not expect the structure of the seasonal to shift significantly over a short period of years, and finally
(c) whether the $u_{i.k}$ differ significantly from zero. If they do differ significantly, then the seasonal adjustment is partly additive and partly multiplicative to the trend. If they do not differ significantly from zero, then we may safely conclude that a simple additive model, such as that outlined in section 1.2, will adequately represent the seasonal.

As we are going to use a three-way classification analysis of variance model, the mathematics of this model is briefly reviewed in Chapter II. Furthermore, since confidence intervals for the u's and tests of hypotheses are based on the assumption of independent residuals, and, as was mentioned before, interdependence of economic data is quite common, it is important to have a valid test for independence of residuals. In Chapter III, Durbin and Watson's results in this area are reviewed. In Chapter IV, the mean and variance of the Durbin-Watson d statistic are calculated for the present case of a three-way classification analysis of variance model, so that we may test the hypothesis of independence of residuals. Finally, in Chapter V, a numerical example is given, along with some concluding remarks.

CHAPTER II

ANALYSIS OF VARIANCE ESTIMATORS (THREE-WAY CLASSIFICATION, FIXED EFFECTS MODEL)

It is assumed that we have a set of random variables $y_{ijk}$ such that

2.1   $$y_{ijk} = u_{...} + u_{i..} + u_{.j.} + u_{..k} + u_{ij.} + u_{i.k} + u_{.jk} + \varepsilon_{ijk}$$

where i = 1, 2, ..., R; j = 1, 2, ..., S; k = 1, 2, ..., T and the u's satisfy the following constraints:

$$0 = \sum_i u_{i..} = \sum_j u_{.j.} = \sum_k u_{..k} = \sum_i u_{ij.} = \sum_j u_{ij.}$$
$$0 = \sum_i u_{i.k} = \sum_k u_{i.k} = \sum_j u_{.jk} = \sum_k u_{.jk}$$

Also the u's are constants and the $\varepsilon_{ijk}$ are independent identically distributed normal variables with zero expectation and variance $\sigma^2$.
Define the following sample statistics:

$$y_{...} = \frac{1}{RST} \sum_{i=1}^{R} \sum_{j=1}^{S} \sum_{k=1}^{T} y_{ijk}$$
$$y_{i..} = \frac{1}{ST} \sum_{j=1}^{S} \sum_{k=1}^{T} y_{ijk}; \text{ similarly for } y_{.j.} \text{ and } y_{..k}$$
$$y_{ij.} = \frac{1}{T} \sum_{k=1}^{T} y_{ijk}; \text{ similarly for } y_{i.k} \text{ and } y_{.jk}$$

Now define:

$$m_{...} = y_{...}$$
$$m_{i..} = y_{i..} - y_{...}; \text{ similarly for } m_{.j.} \text{ and } m_{..k}$$
$$m_{ij.} = y_{ij.} - y_{i..} - y_{.j.} + y_{...}; \text{ similarly for } m_{i.k} \text{ and } m_{.jk}$$

We are interested in finding the least squares (or equivalently, the maximum likelihood) estimators for the u's in equation 2.1, so we define the following sums of squares, each taken over all i, j, k:

2.2
$$S = \sum_{i,j,k} (y_{ijk} - u_{...} - u_{i..} - u_{.j.} - u_{..k} - u_{ij.} - u_{i.k} - u_{.jk})^2$$
$$S_{...} = \sum_{i,j,k} (y_{ijk} - m_{...} - m_{i..} - m_{.j.} - m_{..k} - m_{ij.} - m_{i.k} - m_{.jk})^2$$
$$S_{..0}(u_{ij.}) = \sum_{i,j,k} (m_{ij.} - u_{ij.})^2; \text{ similarly for } S_{.0.}(u_{i.k}) \text{ and } S_{0..}(u_{.jk})$$
$$S_{.00}(u_{i..}) = \sum_{i,j,k} (m_{i..} - u_{i..})^2; \text{ similarly for } S_{0.0}(u_{.j.}) \text{ and } S_{00.}(u_{..k})$$
$$S_{000}(u_{...}) = \sum_{i,j,k} (m_{...} - u_{...})^2$$

Now using the easily verified relations

$$\sum_{i,j,k} m_{i..} = \sum_{i,j,k} m_{.j.} = \sum_{i,j,k} m_{..k} = 0 \quad \text{and} \quad \sum_{i,j,k} m_{ij.} = \sum_{i,j,k} m_{i.k} = \sum_{i,j,k} m_{.jk} = 0,$$

it is easy to verify that the following decomposition for S holds:

2.3   $$S = S_{...} + S_{..0}(u_{ij.}) + S_{.0.}(u_{i.k}) + S_{0..}(u_{.jk}) + S_{.00}(u_{i..}) + S_{0.0}(u_{.j.}) + S_{00.}(u_{..k}) + S_{000}(u_{...})$$

The least squares estimates for the u's are obtained by minimizing the sum of squares S with respect to the u's. In the decomposition 2.3, the term $S_{...}$ does not involve the u's, and every other term is a sum of squares which vanishes when the u's are set equal to the corresponding m's (for example, $u_{ij.} = m_{ij.}$ and $u_{i..} = m_{i..}$). Hence the m's are the least squares estimators for the corresponding u's.
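The definitions of the m's above translate directly into code. The sketch below, in plain Python, computes the m's and the residuals left after fitting all of them; the function name, the dictionary keys and the artificial data are hypothetical choices made for illustration and are not the Canadian series of Chapter V.

```python
from itertools import product


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def three_way_estimators(y, R, S, T):
    """Return the m's of Chapter II and the residuals z for data y[i][j][k]."""
    rR, rS, rT = range(R), range(S), range(T)
    g = mean(y[i][j][k] for i, j, k in product(rR, rS, rT))          # y_...
    yi = [mean(y[i][j][k] for j in rS for k in rT) for i in rR]       # y_i..
    yj = [mean(y[i][j][k] for i in rR for k in rT) for j in rS]       # y_.j.
    yk = [mean(y[i][j][k] for i in rR for j in rS) for k in rT]       # y_..k
    yij = [[mean(y[i][j][k] for k in rT) for j in rS] for i in rR]    # y_ij.
    yik = [[mean(y[i][j][k] for j in rS) for k in rT] for i in rR]    # y_i.k
    yjk = [[mean(y[i][j][k] for i in rR) for k in rT] for j in rS]    # y_.jk
    m = {
        "overall": g,
        "group": [yi[i] - g for i in rR],
        "year": [yj[j] - g for j in rS],
        "season": [yk[k] - g for k in rT],
        "group_year": [[yij[i][j] - yi[i] - yj[j] + g for j in rS] for i in rR],
        "group_season": [[yik[i][k] - yi[i] - yk[k] + g for k in rT] for i in rR],
        "year_season": [[yjk[j][k] - yj[j] - yk[k] + g for k in rT] for j in rS],
    }
    z = {}
    for i, j, k in product(rR, rS, rT):
        fitted = (g + m["group"][i] + m["year"][j] + m["season"][k]
                  + m["group_year"][i][j] + m["group_season"][i][k]
                  + m["year_season"][j][k])
        z[i, j, k] = y[i][j][k] - fitted
    return m, z


# Hypothetical data: R = 2 groups of S = 3 years, T = 4 quarters.
R, S, T = 2, 3, 4
y = [[[100 + 10 * i + 2 * j + [3, -1, -4, 2][k] + 1.5 * i * k + (i * j * k) % 3
       for k in range(T)] for j in range(S)] for i in range(R)]
m, z = three_way_estimators(y, R, S, T)
print(m["season"])
print(max(abs(v) for v in z.values()))
```

By construction each group of m's sums to zero over any of its indices, which mirrors the constraints placed on the u's. The residuals carry only the part of the data that no main effect or two-factor interaction can absorb, so they vanish identically when the data contain no three-factor interaction.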
Furthermore, under our present assumptions, the following statistics are independently distributed according to (central) chi-squared distributions (if the corresponding parameters, i.e. the u's, are all zero) with the degrees of freedom listed [13, Ch. 10]:

2.4
Statistic                              Degrees of freedom
$S_{...}/\sigma^2$                     (R-1)(S-1)(T-1)
$S_{..0}(u_{ij.})/\sigma^2$            (R-1)(S-1)
$S_{.0.}(u_{i.k})/\sigma^2$            (R-1)(T-1)
$S_{0..}(u_{.jk})/\sigma^2$            (S-1)(T-1)
$S_{.00}(u_{i..})/\sigma^2$            R-1
$S_{0.0}(u_{.j.})/\sigma^2$            S-1
$S_{00.}(u_{..k})/\sigma^2$            T-1
$S_{000}(u_{...})/\sigma^2$            1

From the above, we can set up confidence regions for the simultaneous estimation of several u's using an F statistic, and we may also test whether the hypothesis that a certain group of u's has all u's in it equal to zero is tenable at a given level of significance.

Finally, it should be noted that we may define the regression residuals $z_{ijk}$ by the following equation:

2.5   $$y_{ijk} = m_{...} + m_{i..} + m_{.j.} + m_{..k} + m_{ij.} + m_{i.k} + m_{.jk} + z_{ijk}$$

or in matrix notation, we may rewrite 2.5 as

2.6   $$Y = Xm + z$$

where X is a matrix which has all elements equal to either 0, -1 or +1. The matrix X and the vector m will be explicitly written out in Chapter IV. Hence, as equation 2.6 shows that we are dealing with a linear regression problem, we turn to Durbin and Watson's work on testing regression residuals for first order autocorrelation.

CHAPTER III

A TEST FOR INDEPENDENCE OF REGRESSION RESIDUALS

3.1 The Durbin-Watson d Statistic

The results of this section are due to Durbin and Watson. Assume we have a regression model given by

3.1.1   $$y_i = u_0 + \sum_{j=1}^{k-1} x_{ij} u_j + \varepsilon_i, \qquad i = 1, 2, \dots, N$$

or, written in matrix form, 3.1.1 becomes

3.1.2   $$Y = Xu + \varepsilon$$

where the $x_{ij}$ are assumed to be fixed variables, independent of the $\varepsilon_i$, and the $\varepsilon_i$ are assumed to be independent identically distributed normal variables with mean zero and variance $\sigma^2$.
The least squares estimate m of the vector u is given by

3.1.3   $$m = (X^TX)^{-1}X^TY$$

We may define the vector of residuals z by

3.1.4   $$Y = Xm + z$$

or

$$z = Y - Xm = Y - X(X^TX)^{-1}X^TY = (I - X(X^TX)^{-1}X^T)Y = (I - X(X^TX)^{-1}X^T)(Xu + \varepsilon) = Xu - Xu + (I - X(X^TX)^{-1}X^T)\varepsilon = M\varepsilon, \text{ say.}$$

The matrix $M = I - X(X^TX)^{-1}X^T$ defined above is a projection matrix; that is, $M^T = M$ and $M^2 = M$, as a short calculation shows. Hence the sample vector of residuals z is closely related to the true vector of residuals $\varepsilon$.

The statistic chosen by Durbin and Watson to test against first order autocorrelation of the residuals from a regression is d, defined by

3.1.5   $$d = \frac{\sum_{i=2}^{N} (z_i - z_{i-1})^2}{\sum_{i=1}^{N} z_i^2} = \frac{z^TAz}{z^Tz} \text{ in matrix notation,}$$

where A is an N×N matrix of the form

$$A = \begin{pmatrix} 1 & -1 & 0 & \cdots & 0 & 0 \\ -1 & 2 & -1 & \cdots & 0 & 0 \\ 0 & -1 & 2 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 2 & -1 \\ 0 & 0 & 0 & \cdots & -1 & 1 \end{pmatrix}$$

However, using 3.1.4 we may rewrite d as

3.1.6   $$d = \frac{z^TAz}{z^Tz} = \frac{\varepsilon^TM^TAM\varepsilon}{\varepsilon^TM^TM\varepsilon} = \frac{\varepsilon^TM^TAM\varepsilon}{\varepsilon^TM\varepsilon}$$

Because the matrices $M^TAM$ and M commute, they may be simultaneously reduced to diagonal form by the same orthogonal transformation. (See for example [3:56].) Since M is a projection matrix and has rank N-k, we have

3.1.7   $$d = \frac{\sum_{i=1}^{N-k} \nu_i \xi_i^2}{\sum_{i=1}^{N-k} \xi_i^2}$$

where the $\xi_i$ are the transformed residuals and $\nu_1, \nu_2, \dots, \nu_{N-k}$ are the characteristic roots of $M^TAM$ other than k zeros. Note that the characteristic roots of $M^TAM$ are the same as the characteristic roots of $MM^TA = M^2A = MA$. (See for example [3:96].)

3.2 Calculation of the Moments of the d Statistic

We would like to calculate the moments of d under the null hypothesis that $\varepsilon$ is a vector of independent normal variables, each with zero expectation and constant variance $\sigma^2$, which, by the results of the previous section, is equivalent to working out the moments of

$$d = \frac{\sum_{i=1}^{N-k} \nu_i \xi_i^2}{\sum_{i=1}^{N-k} \xi_i^2}$$

under the null hypothesis that the $\xi_i$ are independently distributed normal variables with mean zero and variance $\sigma^2$. A very useful general method noted by Dixon may be used to evaluate the moments of d.
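As a check on the two forms of d in 3.1.5, the ratio of sums and the quadratic form in A can be compared on a short vector of residuals. The sketch below is in plain Python; the helper names and the residual values are hypothetical, and the vector is assumed to have at least two elements.

```python
def durbin_watson(z):
    """d as the ratio of sums in 3.1.5."""
    numerator = sum((z[t] - z[t - 1]) ** 2 for t in range(1, len(z)))
    denominator = sum(v * v for v in z)
    return numerator / denominator


def difference_matrix(n):
    """The n x n matrix A of 3.1.5 (n >= 2): 1, 2, ..., 2, 1 on the diagonal
    and -1 on the two adjacent diagonals."""
    a = [[0.0] * n for _ in range(n)]
    for t in range(n):
        a[t][t] = 1.0 if t in (0, n - 1) else 2.0
        if t + 1 < n:
            a[t][t + 1] = a[t + 1][t] = -1.0
    return a


def quadratic_form(a, z):
    """z^T A z written out as a double sum."""
    n = len(z)
    return sum(z[r] * a[r][c] * z[c] for r in range(n) for c in range(n))


z = [0.5, -1.0, 0.25, 1.5, -0.75, -0.5]   # hypothetical residuals
d_direct = durbin_watson(z)
d_matrix = quadratic_form(difference_matrix(len(z)), z) / sum(v * v for v in z)
print(d_direct, d_matrix)
```

The two values agree up to rounding. Since the characteristic roots of A lie between 0 and 4, d always lies in that range; values well below 2 point towards positive serial correlation, which is the alternative of interest here.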
We define

3.2.1   $$C = \frac{1}{\sigma^2} \sum_{i=1}^{N-k} \nu_i \xi_i^2 \quad \text{and} \quad V = \frac{1}{\sigma^2} \sum_{i=1}^{N-k} \xi_i^2$$

The moment generating function of C and V is

$$\varphi(t_1, t_2) = E\left[e^{Ct_1 + Vt_2}\right]$$

Upon noting that d may be written as C/V, the expectation of d is obtained by using the formula

3.2.2   $$E[d] = E\left[\frac{C}{V}\right] = E\left[\int_{-\infty}^{0} C e^{Vt_2}\, dt_2\right] = \int_{-\infty}^{0} E\left[C e^{Vt_2}\right] dt_2 = \int_{-\infty}^{0} \left[\frac{\partial \varphi(t_1, t_2)}{\partial t_1}\right]_{t_1 = 0} dt_2$$

Similarly,

3.2.3   $$E[d^h] = \int_{-\infty}^{0} \cdots \int_{-\infty}^{0} \left[\frac{\partial^h \varphi(t_1, t_{21} + t_{22} + \dots + t_{2h})}{\partial t_1^h}\right]_{t_1 = 0} dt_{21}\, dt_{22} \cdots dt_{2h}$$

We use formula 3.2.2 to evaluate the mean of d. First the moment generating function of C and V is calculated:

$$\varphi(t_1, t_2) = E\left[e^{Ct_1 + Vt_2}\right] = (2\pi\sigma^2)^{-\frac{N-k}{2}} \int \cdots \int \exp\left[Ct_1 + Vt_2 - \sum_{i=1}^{N-k} \frac{\xi_i^2}{2\sigma^2}\right] d\xi_1 \cdots d\xi_{N-k}$$
$$= (2\pi\sigma^2)^{-\frac{N-k}{2}} \int \cdots \int \exp\left[-\tfrac{1}{2}\, \xi^T B\, \xi\right] d\xi_1 \cdots d\xi_{N-k} = \sigma^{-(N-k)} \det[B]^{-\frac{1}{2}} = \prod_{i=1}^{N-k} (1 - 2t_2 - 2\nu_i t_1)^{-\frac{1}{2}}$$

where B is the diagonal matrix with elements $(1 - 2t_2 - 2\nu_i t_1)/\sigma^2$, and where the penultimate equality above follows from an integral formula given in [2, section 2.3] or in [4, p. 118]. Then

$$\frac{\partial \varphi(t_1, t_2)}{\partial t_1} = \prod_{i=1}^{N-k} (1 - 2t_2 - 2\nu_i t_1)^{-\frac{1}{2}} \sum_{j=1}^{N-k} \frac{\nu_j}{1 - 2t_2 - 2\nu_j t_1}$$

and

$$\left[\frac{\partial \varphi(t_1, t_2)}{\partial t_1}\right]_{t_1 = 0} = (1 - 2t_2)^{-\frac{N-k}{2}} \sum_{j=1}^{N-k} \frac{\nu_j}{1 - 2t_2}$$

Finally, upon making the substitution $x = 1 - 2t_2$, we find

$$\int_{-\infty}^{0} \left[\frac{\partial \varphi}{\partial t_1}\right]_{t_1 = 0} dt_2 = \frac{1}{2} \int_{1}^{\infty} x^{-\frac{N-k}{2} - 1} \sum_{j=1}^{N-k} \nu_j \, dx = \frac{1}{N-k} \sum_{j=1}^{N-k} \nu_j$$

3.2.4   So $$E[d] = \frac{\sum_{i=1}^{N-k} \nu_i}{N-k} = \bar\nu$$

where N equals the number of observations, k equals the number of regressors including a constant term if there is one, and the $\nu_i$ are the eigenvalues of $M^TAM$ except for k zero eigenvalues. We may similarly use formula 3.2.3 to evaluate $E[d^2]$ and then calculate Var[d], which we find to be

3.2.5   $$\operatorname{Var}[d] = \frac{2 \sum_{i=1}^{N-k} (\nu_i - \bar\nu)^2}{(N-k)(N-k+2)}$$

where $\bar\nu = \frac{1}{N-k} \sum_{i=1}^{N-k} \nu_i$
= $E[d]$.

Hence we may use the mean and variance given by 3.2.4 and 3.2.5 to give a normal approximation to the distribution of d under the null hypothesis of independently distributed normal regression residuals.

We now turn our attention to the problem of evaluating the eigenvalue sums appearing in formulas 3.2.4 and 3.2.5 under the condition that our regression model is the three-way classification, fixed effects, analysis of variance model outlined in Chapter II.

CHAPTER IV

CALCULATION OF THE MEAN AND VARIANCE OF d IN THE THREE-WAY CASE

4.1 Introduction

Recall that our model is, in matrix notation,

4.1.1   $$Y = Xu + \varepsilon$$

and the least squares estimate for the vector u is $m = (X^TX)^{-1}X^TY$. To test whether the residuals are independently distributed, we wish to calculate the mean and variance of

$$d = \frac{\varepsilon^TM^TAM\varepsilon}{\varepsilon^TM\varepsilon}$$

where A is given by 3.1.5 and $M = I - X(X^TX)^{-1}X^T$. The mean and variance of d under the null hypothesis of independent residuals were given in the previous chapter by 3.2.4 and 3.2.5. Now we wish to evaluate 3.2.4 and 3.2.5 explicitly in the three-way classification analysis of variance model. Note that

4.1.2   $$E[d] = \frac{\sum_{i=1}^{N-k} \nu_i}{N-k} = \frac{\operatorname{tr} M^TAM}{N-k} = \frac{\operatorname{tr} M^2A}{N-k} = \frac{\operatorname{tr} MA}{N-k}$$

where tr stands for the trace of a matrix; i.e., the sum of the diagonal elements of that matrix. Hence to calculate the expectation of d, all we have to do is calculate the diagonal elements of the matrix MA. However, in order to calculate MA, we must first calculate the matrix M and hence the matrix X.

4.2 Calculation of the Matrix X

We re-index our observations $y_1, y_2, \dots, y_N$, where N = RST and R, S, T ≥ 2, as $y_{111}, y_{112}, \dots, y_{11T}, \dots, y_{RST}$. The regression equation is given by

4.2.1   $$y_{ijk} = m_{...} + m_{i..} + m_{.j.} + m_{..k} + m_{ij.} + m_{i.k} + m_{.jk} + z_{ijk}$$

where the m's are defined as in Chapter II. Because groups of the m's are adjusted to sum to zero, not all of the m's are independent.
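Formula 4.1.2 can also be evaluated numerically without writing out X. The sketch below, in plain Python, rests on one observation that follows from the definitions of Chapter II: the sum of the seven m's for observation (i, j, k) equals $y_{ij.} + y_{i.k} + y_{.jk} - y_{i..} - y_{.j.} - y_{..k} + y_{...}$, so the weight that each observation receives in each fitted value is known in closed form. The function names are hypothetical, and the divisor N - k is taken to be the residual degrees of freedom (R-1)(S-1)(T-1) of 2.4.

```python
from itertools import product


def fitted_weight(a, b, R, S, T):
    """Weight of observation b in the fitted value for observation a, i.e. the
    (a, b) element of I - M, read off from the averages of Chapter II."""
    (i, j, k), (p, q, r) = a, b
    same_i, same_j, same_k = i == p, j == q, k == r
    return ((same_i and same_j) / T + (same_i and same_k) / S
            + (same_j and same_k) / R
            - same_i / (S * T) - same_j / (R * T) - same_k / (R * S)
            + 1.0 / (R * S * T))


def expected_d(R, S, T):
    """E[d] = tr(MA) / (N - k), with observations ordered so that k varies fastest."""
    cells = list(product(range(R), range(S), range(T)))
    n = len(cells)
    trace_ma = 0.0
    for t, cell in enumerate(cells):
        a_tt = 1.0 if t in (0, n - 1) else 2.0
        trace_ma += (1.0 - fitted_weight(cell, cell, R, S, T)) * a_tt
        if t + 1 < n:
            # A equals -1 on both adjacent diagonals and M is symmetric, so each
            # neighbouring pair contributes 2 * M[t][t+1] * (-1) to the trace.
            m_off = -fitted_weight(cell, cells[t + 1], R, S, T)
            trace_ma += 2.0 * m_off * (-1.0)
    return trace_ma / ((R - 1) * (S - 1) * (T - 1))


print(expected_d(2, 2, 2))
print(expected_d(3, 4, 4))   # hypothetical layout: 3 groups of 4 years, quarterly data
```

For R = S = T = 2 only one residual degree of freedom remains, d is then a fixed number, and a direct hand calculation gives 2.5, which the function reproduces. Only the main diagonal and the two adjacent diagonals of M enter the trace, which is the simplification exploited in section 4.4.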
In fact there are

4.2.2   $$k = 1 + (R-1) + (S-1) + (T-1) + (R-1)(S-1) + (R-1)(T-1) + (S-1)(T-1) = RS + RT + ST - R - S - T + 1 = RST - (R-1)(S-1)(T-1)$$

independent m's. In matrix notation, equation 4.2.1 becomes

4.2.3   $$Y = Xm + z$$

The matrix X in the above equation is written out in Figure 1.

4.3 Calculation of the Matrix M

In order to calculate $M = I - X(X^TX)^{-1}X^T$ we have to first calculate $M_0 = X(X^TX)^{-1}X^T$. However, it would be extremely difficult to invert the matrix $(X^TX)$ for arbitrary positive integral R, S, and T, so a labour saving device is resorted to. Recall that by standard least squares theory

$$m = (X^TX)^{-1}X^TY$$

As we already know what the m's are from the analysis presented in Chapter II, we may write out the matrix $(X^TX)^{-1}X^T$. This is done in Figure 2.

Now having calculated the matrix X and the matrix $(X^TX)^{-1}X^T$, we may calculate the matrix

$$M_0 = X(X^TX)^{-1}X^T$$

The result of this computation is shown in Figure 3. From $M_0$, the matrix $M = I - M_0$ may be readily evaluated. As a check on our results, we note that because M is a projection matrix and hence has all nonzero eigenvalues equal to one, the trace of M should equal RST - k, where k is defined by 4.2.2. It was found that the latter equality did in fact hold true.

Figure 1. The Matrix X (partitioned layout; the rows correspond to the observations $y_{111}, \dots, y_{RST}$ and the columns to the independent m's)
The following notation is used in Figure 1:
I = a T×1 matrix of 1's
$J_i$ = a matrix with 1's down the ith column and zeros elsewhere
C = a matrix with one more row than the number of columns
1 = a matrix with all elements equal to 1
0 = a matrix with all elements equal to zero

Figure 2. The Matrix $(X^TX)^{-1}X^T$

The matrix $(X^TX)^{-1}X^T$ has k = RST - (R-1)(S-1)(T-1) rows and RST columns. It is written as a partitioned matrix $[B_{ij}]$ below, with the number of rows in each matrix element of the partitioned $(X^TX)^{-1}X^T$ matrix indicated to the left of the row. The number of columns in each partitioned matrix element $B_{ij}$ is ST.

No. of rows     Corresponding parameters   Col. 1    Col. 2    ...   Col. R-1     Col. R
1               m_{...}                    B_{10}    B_{10}    ...   B_{10}       B_{10}
R-1             m_{i..}                    B_{21}    B_{22}    ...   B_{2,R-1}    B_{2R}
S-1             m_{.j.}                    B_{30}    B_{30}    ...   B_{30}       B_{30}
T-1             m_{..k}                    B_{40}    B_{40}    ...   B_{40}       B_{40}
(R-1)(S-1)      m_{ij.}                    B_{51}    B_{52}    ...   B_{5,R-1}    B_{5R}
(R-1)(T-1)      m_{i.k}                    B_{61}    B_{62}    ...   B_{6,R-1}    B_{6R}
(S-1)(T-1)      m_{.jk}                    B_{70}    B_{70}    ...   B_{70}       B_{70}

Each of the matrix elements $B_{ij}$ may be expressed in terms of the matrices I, 1 and 0 defined in Figure 1 and the matrix $I_{01}$, which is a (T-1)×T matrix with 1's down the main diagonal and zeros elsewhere. The number of columns which each I, 1, and $I_{01}$ has is in all cases equal to T. The number of rows in each matrix element of $B_{ij}$ is indicated below.

Figure 2 (continued). Entries of the blocks $B_{10}$, $B_{2j}$, $B_{30}$ and $B_{40}$ (rows corresponding to $m_{...}$, $m_{i..}$, $m_{.j.}$ and $m_{..k}$), laid out by partitioned columns Col. 1, Col. 2, ..., Col. S-1, Col. S.
[Entries of $B_{5j}$ (parameters $m_{ij\cdot}$, S-1 rows in each of the R-1 partitioned rows), listed by column block Col. 1, ..., Col. S. The entries are not legible in the source.]

The above is valid for j = 1, 2, ..., R-1. Note that only the jth partitioned row is different from the other R-2 partitioned rows.

Figure 2 (continued)

[Entries of $B_{6j}$ (parameters $m_{i\cdot k}$, T-1 rows in each of the R-1 partitioned rows) and of $B_{70}$ (parameters $m_{\cdot jk}$, T-1 rows in each of the S-1 partitioned rows). The entries are not legible in the source.]

For $B_{6j}$ the above is again valid for j = 1, 2, ..., R-1, and only the jth partitioned row is different from the other R-2 partitioned rows.

Figure 3. The Matrix $M_0 = X(X^TX)^{-1}X^T$

$$M_0 = \begin{bmatrix} C_1 & C_2 & C_2 & \cdots & C_2 \\ C_2 & C_1 & C_2 & \cdots & C_2 \\ C_2 & C_2 & C_1 & \cdots & C_2 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ C_2 & C_2 & C_2 & \cdots & C_1 \end{bmatrix}$$

with R block rows and R block columns, where each $C_i$, i = 1, 2, is an ST x ST matrix. The matrices $C_i$ are defined by
$$C_i = \begin{bmatrix} c^i_1 & c^i_2 & \cdots & c^i_2 \\ c^i_2 & c^i_1 & \cdots & c^i_2 \\ \vdots & \vdots & \ddots & \vdots \\ c^i_2 & c^i_2 & \cdots & c^i_1 \end{bmatrix}$$

with S block rows and S block columns, where each $c^i_j$, i, j = 1, 2, is a T x T matrix defined by

Figure 3 (continued)

$c^1_1 = \left(\frac{1}{T} - \frac{1}{ST} - \frac{1}{RT} + \frac{1}{RST}\right)\mathbf{1} + \left(\frac{1}{S} + \frac{1}{R} - \frac{1}{RS}\right)I$

$c^1_2 = \left(\frac{1}{RST} - \frac{1}{ST}\right)\mathbf{1} + \left(\frac{1}{S} - \frac{1}{RS}\right)I$

$c^2_1 = \left(\frac{1}{RST} - \frac{1}{RT}\right)\mathbf{1} + \left(\frac{1}{R} - \frac{1}{RS}\right)I$

$c^2_2 = \frac{1}{RST}\mathbf{1} - \frac{1}{RS}I$

where I = the T x T identity matrix and $\mathbf{1}$ = the T x T matrix with all elements equal to 1.

4.4 Calculation of the Mean of d

As A has non-zero elements only down the main diagonal and the diagonals immediately adjacent to the main diagonal, the evaluation of the diagonal elements of MA was not too difficult. It was found that

4.4.1  [the expanded expression for tr MA is not legible in the source]

Now recall that by 3.2.4

4.4.2  $E[d] = \frac{\mathrm{tr}\,MA}{N-k} = \frac{\mathrm{tr}\,MA}{RST-k}$  (where k is defined by 4.2.2)

or, substituting 4.4.1 and simplifying somewhat, we get

4.4.3  $E[d] = 2\left\{1 + \frac{1}{T} - \frac{1}{RS(T-1)} + \frac{R+S-RS}{RST(S-1)(T-1)}\right\}$

which is similar in form to the result for the two-way classification listed in Section 1.2.

4.5 Calculation of the Variance of d

We have from 3.2.5

4.5.1  $\mathrm{Var}[d] = \frac{2\sum_{i=1}^{N-k}(\nu_i - \bar\nu)^2}{(N-k)(N-k+2)}$

Hence we must evaluate

4.5.2  $\sum_{i=1}^{N-k}(\nu_i - \bar\nu)^2 = \sum_{i=1}^{N-k}\nu_i^2 - (N-k)\bar\nu^2$

Now it is easy to show that the trace of a matrix is invariant under an orthogonal transformation; that is, if H is an orthogonal matrix, then (see for example [3:96])

$\mathrm{tr}\,MA = \mathrm{tr}\,(H^TMAH)$

As M is a projection matrix we also have

$\mathrm{tr}\,MA = \mathrm{tr}\,M^2A = \mathrm{tr}\,M^TMA = \mathrm{tr}\,MAM^T$

As $MAM^T$ is a symmetric matrix, there exists an orthogonal H such that

$HMAM^TH^T = \mathrm{diag}[\nu_i], \qquad (HMAM^TH^T)^2 = \mathrm{diag}[\nu_i^2]$

Hence

4.5.3  $\sum_{i=1}^{N-k}\nu_i^2 = \mathrm{tr}\,(HMAM^TH^T)^2 = \mathrm{tr}\,HMAMAM^TH^T = \mathrm{tr}\,H^THMAMAM^T = \mathrm{tr}\,MAMAM^T = \mathrm{tr}\,M^TMAMA = \mathrm{tr}\,MAMA = \mathrm{tr}\,(MA)^2$

Let the columns of M be $m_1, m_2, \ldots, m_N$, where N = RST. Then a small calculation shows that

$MA = 2M - [m_1\ m_1\ m_2\ \cdots\ m_{N-1}] - [m_2\ m_3\ \cdots\ m_N\ m_N]$

Using the above equation, we may write out the matrix MA explicitly. This is done in Figure 4, where the matrix MA is written as an R x R partitioned matrix $[A_{ij}]$, where each $A_{ij}$ is an ST x ST matrix. Because of the relatively simple form of M, there are only 10 distinct $A_{ij}$ (assuming $R \ge 4$). The $A_{ij}$ are written in terms of distinct ST x ST matrices $A_1, A_2, \ldots, A_{10}$ in Figure 4. Each matrix $A_i$ may be written as an S x S partitioned matrix $[a^i_{jk}]$, where each $a^i_{jk}$ is a T x T matrix. Again (assuming $S \ge 4$) there are at most 10 distinct $a^i_{jk}$ for each i, and these are written as $a^i_1, a^i_2, \ldots, a^i_{10}$; i = 1, 2, ..., 10. The positions of the $a^i_j$ in their respective $A_i$ are also indicated in Figure 4.

However, not all of the $a^i_j$ are distinct. Table I shows the $a^i_j$ in terms of 27 distinct nonzero matrix blocks $\alpha_k$, k = 1, 2, ..., 27. Table II defines the $\alpha_k$ in terms of linear combinations of:

I = the T x T identity matrix,
K = a T x T matrix with 1's in the diagonal immediately below the main diagonal and zeros everywhere else,
$K^T$ = the transpose of K,
$J_i$ = a T x T matrix with 1's in the ith column and zeros everywhere else, and
$K_{ij}$ = a T x T matrix with zeros everywhere except for a 1 in the i, jth position.

Now we may calculate the trace of $(MA)^2$. Assuming that $R \ge 5$, we may multiply the matrix MA, as shown partitioned in Figure 4, by itself and take the sum of the diagonal elements of $(MA)^2$. Upon doing this and collecting terms, we find equation 4.5.4.

Figure 4. The Matrix MA

[Figure 4: MA as an R x R partitioned matrix whose block rows are labelled 1st, 2nd, 3rd, 4th, ..., (R-2)nd, (R-1)st, Rth, with blocks drawn from $A_1, \ldots, A_{10}$; followed by "The Matrices $A_i$", each shown as an S x S partitioned matrix of blocks $a^i_1, \ldots, a^i_{10}$. The block positions are not legible in the source.]

Table I: Table of the $a^i_j$

[Table I: rows j = 1, ..., 10, columns $A_1, \ldots, A_{10}$, entries $\alpha_k$ or 0. The entries are not legible in the source.]
(In Table I, 0 is the T x T matrix with all elements equal to zero.)

Table II: Table of the $\alpha$'s

[Table II: each $\alpha_k$, k = 1, ..., 27, written as a linear combination of I, K, $K^T$, $J_i$ and $K_{ij}$ with coefficients depending on R, S and T. The coefficients are not legible in the source.]

4.5.4  $\mathrm{tr}(MA)^2 = \mathrm{tr}\,[A_1^2 + A_?^2 + (R-2)A_6^2 + (R-3)(R-4)A_3^2 + 2(A_2A_5 + A_9A_{10} + A_4A_?) + 2(R-3)(A_3A_? + A_?A_9 + A_?A_?)]$

(subscripts shown as ? are illegible in the source).

Separate calculations show that formula 4.5.4 is also valid for R = 4 and R = 3. As the trace of a sum of matrices is equal to the sum of the traces of the matrices, all we need to do is evaluate the trace of the 10 separate matrix products $A_iA_j$ occurring in equation 4.5.4. The trace of $A_iA_j$ is found by making use of the partitioning of each $A_i$ into S x S matrices, defined in Figure 4, and is found to be (if $S \ge 5$)

4.5.5  [expression for $\mathrm{tr}\,A_iA_j$ as the trace of a sum of products $a^i_m a^j_n$ with coefficients depending on S; not legible in the source]

Separate computations show that the formula given by 4.5.5 is also valid for S = 3 and S = 4. Now, using Table I, we express the $a^i_m a^j_n$ terms in 4.5.5 in terms of $\alpha_m\alpha_n$ products. Then we use equation 4.5.5 ten times in equation 4.5.4 to express $\mathrm{tr}(MA)^2$ in terms of $\alpha_m\alpha_n$ products. Upon collecting terms, the result is

4.5.6  [expression for $\mathrm{tr}(MA)^2$ as the trace of a linear combination of $\alpha_m\alpha_n$ products with coefficients depending on R and S; not legible in the source]
Upon evaluating the trace of the 31 $\alpha_i\alpha_j$ products, substituting the results into 4.5.6 and collecting terms, we find that for $R \ge 3$, $S \ge 3$, $T \ge 2$,

4.5.7  [closed-form expression for $\mathrm{tr}(MA)^2$ in terms of R, S and T; not legible in the source]

The above formula, together with 4.5.1, 4.5.2, and 4.5.3, gives an explicit expression for the variance of d for the three-way classification, which is given by

4.5.8  $\mathrm{Var}[d] = \frac{2\left[\sum_{i=1}^{N-k}\nu_i^2 - (N-k)\bar\nu^2\right]}{(N-k)(N-k+2)} = \frac{2\,\mathrm{tr}(MA)^2}{(R-1)(S-1)(T-1)\,[(R-1)(S-1)(T-1)+2]} - \frac{8\left\{1 + \frac{1}{T} - \frac{1}{RS(T-1)} + \frac{R+S-RS}{RST(S-1)(T-1)}\right\}^2}{(R-1)(S-1)(T-1)+2}$

with $\mathrm{tr}(MA)^2$ given by 4.5.7.

CHAPTER V

A NUMERICAL EXAMPLE AND CONCLUSIONS

The example chosen was a time series giving Canadian gross national product by quarters for the years 1947 to 1958 inclusive. The data were rounded to two significant figures and are listed in Table III.

The parameter S was chosen in the following way. We start out with S = 3 and then test whether the mean of the first 3 years' GNP differs significantly from the mean of the second 3 years' GNP. If not, then we would try S = 4 and repeat the procedure. However, if the means did differ significantly from each other, then we would set S = 3. The idea behind this procedure is to choose S so that we can be sure that the trend of the series has changed significantly from one period to the next. Then we can test whether the seasonals are also changing significantly as the trend changes.
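The selection rule for S described above amounts to a pooled two-sample t interval for the difference of two group means. The following is a minimal sketch, assuming the yearly GNP totals of Table III expressed in billions of dollars for 1947-1949 and 1950-1952, and a tabulated two-sided 95% critical value of 2.78 for 4 degrees of freedom. The function name `pooled_t_interval` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_interval(group1, group2, t_crit):
    """Confidence limits for mean(group2) - mean(group1), assuming equal
    population variances (pooled two-sample t procedure)."""
    n1, n2 = len(group1), len(group2)
    diff = mean(group2) - mean(group1)
    pooled_var = ((n1 - 1) * variance(group1)
                  + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return diff - t_crit * std_err, diff + t_crit * std_err

# Yearly totals in billions of dollars (assumed rounding of Table III).
group1 = [13, 15, 16]   # 1947, 1948, 1949
group2 = [18, 21, 23]   # 1950, 1951, 1952
T_CRIT_4DF = 2.78       # assumed tabulated value, 95% two-sided, 4 d.f.

low, high = pooled_t_interval(group1, group2, T_CRIT_4DF)
# The means differ significantly when the interval excludes zero.
means_differ = low > 0 or high < 0
```

If the interval had included zero, the same function could be applied again with four years per group, following the rule of trying S = 4 next.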
Table III: Canadian Gross National Product by Quarters, 1947-1958

| Year | (i, j) | Quarter I (k=1) | Quarter II (k=2) | Quarter III (k=3) | Quarter IV (k=4) | Year total |
|---|---|---|---|---|---|---|
| 1947 | (1, 1) | 27 | 31 | 39 | 34 | 131 |
| 1948 | (1, 2) | 31 | 34 | 47 | 39 | 151 |
| 1949 | (1, 3) | 35 | 39 | 49 | 41 | 164 |
| 1950 | (2, 1) | 38 | 42 | 54 | 47 | 181 |
| 1951 | (2, 2) | 45 | 51 | 63 | 53 | 212 |
| 1952 | (2, 3) | 51 | 57 | 72 | 60 | 240 |
| 1953 | (3, 1) | 55 | 60 | 74 | 61 | 250 |
| 1954 | (3, 2) | 55 | 61 | 69 | 63 | 248 |
| 1955 | (3, 3) | 58 | 66 | 78 | 69 | 271 |
| 1956 | (4, 1) | 65 | 72 | 88 | 77 | 302 |
| 1957 | (4, 2) | 71 | 77 | 88 | 78 | 314 |
| 1958 | (4, 3) | 72 | 81 | 92 | 84 | 329 |

Each quarterly entry is the observation $y_{ijk}$ for group i, year j within the group, and quarter k.

Notes:
1. The data have been rounded to two significant figures.
2. The data are expressed in units of 100 million dollars.
3. The data for the years 1947 to 1957 inclusive were taken from [7]. The data for 1958 were taken from [6].

Using the yearly data for the years 1947 to 1949 as group 1 and the yearly data for the years 1950 to 1952 as group 2 (yearly totals in billions of dollars), we define the sample means $\bar x_1$ and $\bar x_2$ by

$\bar x_1 = \frac{13 + 15 + 16}{3} = 14.7$

$\bar x_2 = \frac{18 + 21 + 23}{3} = 20.7$

$\bar x_2 - \bar x_1 = 6.0$

Using the standard t-test procedure (see for example [11:184]) we find that 95% confidence limits for the difference between the population means, $\mu_1$ and $\mu_2$, are given by

$6 - (2.78)(1.7) \le \mu_2 - \mu_1 \le 6 + (2.78)(1.7)$

or

$1.3 \le \mu_2 - \mu_1 \le 10.7$

Hence we conclude that the means are significantly different from each other and set S = 3. As we have observations for only 12 years, this fixes R at R = 4. Of course, as we are given quarterly data, T = 4.

Using the formulae given in Chapter II, the sample m's were calculated and they are tabulated in Table IV. Then, using the formula

5.1  $z_{ijk} = y_{ijk} - m_{\cdot\cdot\cdot} - m_{i\cdot\cdot} - m_{\cdot j\cdot} - m_{\cdot\cdot k} - m_{ij\cdot} - m_{i\cdot k} - m_{\cdot jk}$,  i = 1, 2, 3, 4; j = 1, 2, 3; k = 1, 2, 3, 4,

the residuals $z_{ijk}$ were calculated.
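The residual calculation in 5.1 can be sketched directly from the quarterly figures of Table III. This is an illustration under stated assumptions rather than the thesis's own computation: it uses unrounded cell means instead of the rounded m's of Table IV, so the resulting sums of squares need not agree with the hand-computed values to the last digit. The function names `three_way_residuals` and `durbin_watson` are hypothetical.

```python
import numpy as np

# Table III, in time order, reshaped to (group i, year j, quarter k).
gnp = np.array([
    27, 31, 39, 34,  31, 34, 47, 39,  35, 39, 49, 41,
    38, 42, 54, 47,  45, 51, 63, 53,  51, 57, 72, 60,
    55, 60, 74, 61,  55, 61, 69, 63,  58, 66, 78, 69,
    65, 72, 88, 77,  71, 77, 88, 78,  72, 81, 92, 84,
], dtype=float).reshape(4, 3, 4)

def three_way_residuals(y):
    """Residuals of the three-way model with all main effects and all
    two-factor interactions; algebraically equal to formula 5.1 when the
    m's are the usual least squares estimates."""
    grand = y.mean()
    y_i = y.mean(axis=(1, 2), keepdims=True)
    y_j = y.mean(axis=(0, 2), keepdims=True)
    y_k = y.mean(axis=(0, 1), keepdims=True)
    y_ij = y.mean(axis=2, keepdims=True)
    y_ik = y.mean(axis=1, keepdims=True)
    y_jk = y.mean(axis=0, keepdims=True)
    return y - y_ij - y_ik - y_jk + y_i + y_j + y_k - grand

def durbin_watson(z):
    """Sum of squared first differences over sum of squares, in time order."""
    series = z.ravel()   # C order: quarter varies fastest, then year, then group
    return np.sum(np.diff(series) ** 2) / np.sum(series ** 2)

z = three_way_residuals(gnp)
residual_ss = np.sum(z ** 2)
d = durbin_watson(z)
```

Flattening the array in its default order reproduces the chronological ordering of the series, which is what the first differences in the d statistic require.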
Then the residuals were squared and summed. The resulting sum was equal to 56.69. Also the sum of the first differences squared was calculated and was found to be equal to 148.45. From these two statistics, we may calculate the Durbin-Watson d statistic:

5.2  $d = \frac{z^TAz}{z^Tz} = \frac{148.45}{56.69} = 2.62$

Table IV: The Sample m's

| Parameter | Values |
|---|---|
| $m_{\cdot\cdot\cdot}$ | 58.2 |
| $m_{i\cdot\cdot}$, i = 1, ..., 4 | -21.0, -5.5, +5.9, +20.6 |
| $m_{\cdot j\cdot}$, j = 1, 2, 3 | -4.2, -0.3, +4.5 |
| $m_{\cdot\cdot k}$, k = 1, ..., 4 | -7.9, -2.3, +9.6, +0.6 |
| $m_{1j\cdot}$, j = 1, 2, 3 | -0.3, +0.8, -0.7 |
| $m_{2j\cdot}$ | -3.2, +0.6, +2.8 |
| $m_{3j\cdot}$ | +2.6, -1.7, -1.0 |
| $m_{4j\cdot}$ | +0.9, 0, -1.0 |
| $m_{\cdot 1k}$, k = 1, ..., 4 | +0.1, -0.5, +0.1, +0.1 |
| $m_{\cdot 2k}$ | +0.6, 0, -0.4, -0.1 |
| $m_{\cdot 3k}$ | -0.8, +0.3, +0.4, [illegible] |
| $m_{1\cdot k}$, k = 1, ..., 4 | +1.7, -0.3, -1.8, +0.2 |
| $m_{2\cdot k}$ | +0.1, -0.4, -0.3, 0 |
| $m_{3\cdot k}$ | -0.2, +0.5, [illegible], -0.4 |
| $m_{4\cdot k}$ | -1.6, +0.2, +0.9, +1.5 |

Now we may use the results of sections 4.4 and 4.5 to test the null hypothesis that the residuals $z_{ijk}$ are independent normal variables. Using formula 4.4.3 we find

5.3  $E[d] = 2\left\{1 + \frac{1}{T} - \frac{1}{RS(T-1)} + \frac{R+S-RS}{RST(S-1)(T-1)}\right\} = 2.44$

Using formula 4.5.8, we find that

5.4  $\mathrm{Var}[d] = 0.774$

Using a normal approximation to the distribution of d, with mean and variance given by 5.3 and 5.4 above, we find that the 95% confidence limits for d include the sample d value given by 5.2. Hence we may accept the null hypothesis that the residuals are independently distributed normal variates, and we may use the standard F tests to test the m's for significance. The results of the F tests are tabulated in Table V.

It was found (at a level of significance $\alpha = .05$) that the $\mu_{i\cdot\cdot}$, $\mu_{\cdot j\cdot}$, $\mu_{\cdot\cdot k}$ all differed significantly from zero. The $\mu_{\cdot j\cdot}$ differed significantly from zero because the series had a regular trend within the "group" classification; that is, it was generally found that $y_{i1k} < y_{i2k} < y_{i3k}$, hence giving rise to significant $\mu_{\cdot j\cdot}$'s. This trend within the group-of-years classification seems also to be responsible for the $\mu_{ij\cdot}$'s differing significantly from zero. As expected, the $\mu_{\cdot jk}$'s did not differ significantly from zero. Finally, it was also found that the $\mu_{i\cdot k}$'s did not differ significantly from zero, which was rather surprising since the data seemed to indicate that the seasonal fluctuations increased as the GNP grew, but percentagewise not as quickly as the actual percentage increase in the GNP from the first years to the last years. However, if we had taken S = 4 or 5, the $\mu_{i\cdot k}$'s might well have differed significantly from zero.

Table V: Analysis of Variance Table

| Statistic (hypothesis) | Observed value | d.f. | Mean square | F ratio | d.f. of F | F ($\alpha$ = .05) | Conclusion |
|---|---|---|---|---|---|---|---|
| $S_{\cdot\cdot\cdot}$ (residual) | 56.7 | 18 | 3.15 | | | | |
| $\mu_{i\cdot\cdot} = 0$ | 11,164.8 | 3 | 3721 | 3721/3.15 = 1,181 | 3, 18 | 3.16 | Significant |
| $\mu_{\cdot j\cdot} = 0$ | 607.7 | 2 | 303.8 | 303.8/3.15 = 96.46 | 2, 18 | 3.55 | Significant |
| $\mu_{\cdot\cdot k} = 0$ | 1920 | 3 | 640 | 640/3.15 = 203.2 | 3, 18 | 3.16 | Significant |
| $\mu_{ij\cdot} = 0$ | 385.44 | 6 | 64.24 | 64.24/3.15 = 20.39 | 6, 18 | 2.66 | Significant |
| $\mu_{\cdot jk} = 0$ | 6.96 | 6 | 1.16 | 3.15/1.16 = 2.71 | 18, 6 | 3.90 | Not significant |
| $\mu_{i\cdot k} = 0$ | 37.89 | 9 | 4.21 | 4.21/3.15 = 1.37 | 9, 18 | 2.46 | Not significant |

The conclusion drawn from the numerical example is that, for these data, the simpler two-way analysis of variance additive model outlined in section 1.2 would give roughly as good results as those obtained by using the three-way analysis of variance model. However, the three-way model still has some merit as an aid in deciding what type of simpler model to use.
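The moments quoted in 5.3 and 5.4 can be cross-checked numerically from 4.4.2 and 4.5.1 through 4.5.3 without the block algebra of Chapter IV. The sketch below rests on one assumption that is not stated in the text: for a balanced layout with no three-factor interaction, the residual projection M equals the Kronecker product of three centering matrices. The function names are hypothetical.

```python
import numpy as np

def centering(n):
    """n x n matrix I - (1/n) * ones, which removes the mean."""
    return np.eye(n) - np.ones((n, n)) / n

def dw_matrix(n):
    """Durbin-Watson A: 2 on the diagonal (1 at both ends), -1 beside it."""
    A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    A[0, 0] = A[-1, -1] = 1
    return A

def d_moments(R, S, T):
    """Mean and variance of d from tr(MA) and tr((MA)^2)."""
    # Assumption: M is the Kronecker product of centering matrices.
    M = np.kron(centering(R), np.kron(centering(S), centering(T)))
    MA = M @ dw_matrix(R * S * T)
    dof = (R - 1) * (S - 1) * (T - 1)                 # N - k
    mean = np.trace(MA) / dof                         # 4.4.2
    sum_sq = np.trace(MA @ MA)                        # 4.5.3
    var = 2 * (sum_sq - dof * mean ** 2) / (dof * (dof + 2))  # 4.5.1, 4.5.2
    return mean, var

def mean_closed_form(R, S, T):
    """Equation 4.4.3."""
    return 2 * (1 + 1 / T - 1 / (R * S * (T - 1))
                + (R + S - R * S) / (R * S * T * (S - 1) * (T - 1)))

mean_d, var_d = d_moments(4, 3, 4)
```

For R = 4, S = 3, T = 4 the closed form 4.4.3 and the numerical trace agree at about 2.41, which is close to, though not identical with, the 2.44 quoted in 5.3. The sketch does not settle whether that gap comes from the original hand computation or from the scan.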
The overall conclusion is that the procedure outlined in section 1.5 is a reasonably satisfactory one for estimating seasonal adjustments when it is suspected that the seasonals are partly additive and partly multiplicative to the trend.

BIBLIOGRAPHY

[1] R. L. Anderson, "Tests of Significance in Time Series Analysis," in Statistical Inference in Dynamic Economic Models, edited by T. C. Koopmans (pp. 352-355), J. Wiley and Sons, New York, 1950.
[2] T. W. Anderson, Introduction to Multivariate Statistical Analysis, J. Wiley and Sons, New York, 1958.
[3] R. Bellman, Introduction to Matrix Analysis, McGraw Hill, New York, 1960.
[4] H. Cramer, Mathematical Methods of Statistics, Princeton University Press, Princeton, 1946.
[5] W. J. Dixon, "Further Contributions to the Problem of Serial Correlation," Annals of Mathematical Statistics, Vol. 15 (1944), pp. 119-144.
[6] Dominion Bureau of Statistics, National Accounts Division, National Accounts, Income and Expenditure, Fourth Quarter and Preliminary Annual 1961, Queen's Printer, Ottawa, 1962.
[7] Dominion Bureau of Statistics, Research and Development Division, National Accounts, Income and Expenditure by Quarters, 1947-1957, Catalogue Number 13-511 Occasional, Queen's Printer, Ottawa, 1959.
[8] J. Durbin, "Trend Elimination for the Purpose of Estimating Seasonal and Periodic Components of Time Series," in Proceedings of the Symposium on Time Series Analysis, edited by M. Rosenblatt, J. Wiley and Sons, New York, 1963, pp. 3-16.
[9] J. Durbin and G. S. Watson, "Testing for Serial Correlation in Least Squares Regression I," Biometrika, Vol. 37 (1950), pp. 409-428.
[10] J. Durbin and G. S. Watson, "Testing for Serial Correlation in Least Squares Regression II," Biometrika, Vol. 38 (1951), pp. 159-178.
[11] E. S. Keeping, Introduction to Statistical Inference, D. Van Nostrand Company, New York, 1962.
[12] G. Tintner, Econometrics, John Wiley and Sons, New York, 1952.
[13] S. S. Wilks, Mathematical Statistics, John Wiley and Sons, New York, 1962.
[14] H. Wold, with L. Jureen, Demand Analysis, John Wiley and Sons, New York, 1953.
Analysis of variance estimators for the seasonal adjustment of economic time series Diewart, Walter Erwin 1964
Item Metadata
Title | Analysis of variance estimators for the seasonal adjustment of economic time series |
Creator |
Diewart, Walter Erwin |
Publisher | University of British Columbia |
Date Issued | 1964 |
Description | The purpose of this thesis is to develop a valid statistical procedure for the estimation of the seasonal component of an economic time series when the seasonal component is suspected to be partly additive and partly multiplicative to the trend. The proposed procedure is based on a three-way classification analysis of variance model, where the first classification is used to represent the long term trend of the series, the second classification is used to represent any regular trend or cycle within the long term trend, and the third classification is used to represent the seasonal. The interaction term between the long term trend and the seasonal may be used to represent any long term change in the nature of the seasonal. However, as the standard analysis of variance significance tests assume independently distributed residuals, it is necessary to develop a test for independence of residuals against the very likely alternative of first order (positive) serial correlation. This is done by calculating the mean and variance of the Durbin-Watson d statistic for the three-way classification analysis of variance model. A numerical example is given to illustrate the procedure. |
Subject |
Time-series analysis Mathematical statistics |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2011-09-15 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
IsShownAt | 10.14288/1.0080618 |
URI | http://hdl.handle.net/2429/37390 |
Degree |
Master of Arts - MA |
Program |
Mathematics |
Affiliation |
Science, Faculty of Mathematics, Department of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |