BIAS IN LEAST SQUARES REGRESSION

by

DOUGLAS HAROLD WILLIAMS
B.Sc., Simon Fraser University, 1970

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in the Department of FORESTRY

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
January, 1972

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at The University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
Vancouver 8, Canada

ABSTRACT

Much of the data analysed by least squares regression methods violates the assumption that the independent variables are known without error. Also, it has been demonstrated that parameter estimates based on minimum residual sums of squares have a high probability of being unsatisfactory if the independent variables are not orthogonal. Both situations are examined jointly by Monte Carlo simulation, and bias in the least squares estimates of the regression coefficients and of the error sums of squares is demonstrated. Techniques for regression under these conditions are reviewed, but the literature does not present a practical algorithm for either case.

TABLE OF CONTENTS

INTRODUCTION

Chapter
ONE    THE LINEAR MODEL
         The Classical Model
         The Least Squares Solution
TWO    MONTE CARLO STUDIES
         The Simulation Algorithm
         Construction of Vectors of a Given Correlation
         A Single Variable Model: Study 1
         A Two-Variable Model: Study 2
         Discussion of Simulation Results
THREE  REGRESSION PROCEDURES WHEN σ1 ≠ 0 AND PREDICTOR VECTORS ARE NOT ORTHOGONAL

CONCLUSION

LITERATURE CITED

LIST OF TABLES

1.  Simulation results for a single variable model: β = -0.25
2.  Simulation results for a single variable model: β = 0.0
3.  Simulation results for a single variable model: β = 0.25
4.  Simulation results for a single variable model: β = 0.5
5.  Simulation results for a single variable model: β = 0.75
6.  Simulation results for a two variable model: correlation (X1, X2) = 0.0
7.  Simulation results for a two variable model: correlation (X1, X2) = 0.1
8.  Simulation results for a two variable model: correlation (X1, X2) = 0.2
9.  Simulation results for a two variable model: correlation (X1, X2) = 0.3
10. Simulation results for a two variable model: correlation (X1, X2) = 0.4
11. Simulation results for a two variable model: correlation (X1, X2) = 0.5
12. Simulation results for a two variable model: correlation (X1, X2) = 0.6
13. Simulation results for a two variable model: correlation (X1, X2) = 0.7
14. Simulation results for a two variable model: correlation (X1, X2) = 0.8
15. Simulation results for a two variable model: correlation (X1, X2) = 0.9
LIST OF FIGURES

1.  The Trend of the Standard Deviation of t = (B - β)/S(B) with β and σ1
2.  The Effect of Non-orthogonal Predictor Variables and σ1 ≠ 0 on the Standard Deviation of t = (B - β)/S(B)

ACKNOWLEDGMENT

The author wishes to express his gratitude to Dr. A. Kozak, who suggested the problem and under whose direction this study was undertaken. Drs. W. G. Warren and A. Kozak, and Mr. G. G. Young, are gratefully acknowledged for their help, useful criticism, and review of the thesis.

BIAS IN LEAST SQUARES REGRESSION

INTRODUCTION

Multiple regression methods employing the least squares principle are used throughout the sciences for identification of the function relating a set of 'independent' variables to a single responding 'dependent' variable. The widespread utility of these methods necessitates examination of the validity of the technique under conditions violating or straining the assumptions of least squares regression theory.

In particular, much of the forestry data analysed by regression methods does not fulfill the assumptions of regression theory. The independent variable may only be an estimate, such as stand age in natural 'even aged' stands. Another common regression situation in forestry is the volume equation,

    V = a D^b H^c,

where V = tree volume, D = tree diameter at breast height, and H = tree height. The linear form is

    log V = log a + b log D + c log H.

The independent variables H and D have errors of estimate and are highly correlated.

Some aspects of this problem have been discussed by Wald (1940), Bartlett (1949), Acton (1959), Kendall and Stuart (1961), and Cox (1968). However, these authors are concerned largely with developing improved parameter estimates. Kendall (1951), in a discussion of regression and functional relationship, examined the problem of tests of significance in addition to parameter estimation. Turnbull (1968) presents a series of Monte Carlo simulations of a single variable model, and demonstrates empirically the seriousness of error of estimate of the independent variable.

This paper examines two common experimental situations: the case where the independent variables contain errors, and the case where the independent variables are not orthogonal. Monte Carlo simulation experiments are used to provide empirical data demonstrating the trend and magnitude of the effects arising from these situations.
CHAPTER ONE

THE LINEAR MODEL

1.1 The Classical Model

The general linear model for relating a response variable Y to a controlled variable X is

    Y = Xβ + e                                                          (1)

where Y is an (n x 1) vector of observations, X is an (n x p) matrix of known form, β is a (p x 1) vector of parameters, and e is an (n x 1) vector of errors. A number of assumptions are made about this model:

  i)   e ~ N(0, σ² I)
  ii)  Y ~ N(Xβ, σ² I); this implies that cov(Y_i, Y_j) = 0 for all i ≠ j
  iii) the independent variable X is known without error.

1.2 The Least Squares Solution

The least squares estimate of β is the vector b which minimizes the error sum of squares, e^T e. From equation (1),

    e = Y - Xβ                                                          (2)

and

    e^T e = (Y - Xβ)^T (Y - Xβ)
          = Y^T Y - β^T X^T Y - Y^T Xβ + β^T X^T Xβ
          = Y^T Y - 2 β^T X^T Y + β^T X^T Xβ.                           (3)

The value of b which, when substituted for β in (3), minimizes e^T e is found by the classical derivative technique. Equation (3) is differentiated with respect to β, and the resulting matrix is equated to zero, β being replaced by b. The resulting system is generally referred to as the 'normal' equations,

    (X^T X) b = X^T Y.                                                  (4)

If the system of equations (4) consists of p independent equations in p unknowns, then a unique solution for b can be obtained:

    b = (X^T X)^{-1} X^T Y.

The solution b is the estimate of β which minimizes the error sum of squares e^T e irrespective of the distribution properties of the errors. However, the assumption that the errors e are normally distributed is required in order to make tests of significance which depend on assumptions of normality, such as t- or F-tests.

Another interesting property of the solution b is that when e ~ N(0, σ²), b is the maximum likelihood estimate of β. The likelihood function, L, for the n-tuple of observations Y = [Y1, Y2, ..., Yn] is

    L(e^T e) = Π_{i=1}^{n} (1 / (σ (2π)^{1/2})) exp(-e_i² / 2σ²)
             = (1 / (σ^n (2π)^{n/2})) exp(-(e^T e) / 2σ²).

As the sum of squares enters the exponent with a negative sign, minimizing e^T e maximizes the exponential expression and hence the likelihood function. This property provides justification for the least squares procedure in the common situation where errors are normally distributed.
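As a concrete illustration of the normal equations, the short sketch below (our addition, assuming Python with numpy; it is not from the thesis) generates data from a known linear model and solves (X^T X)b = X^T Y directly:

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative design: an intercept column plus one predictor at 9 levels.
    X = np.column_stack([np.ones(9), np.arange(6.0, 15.0)])
    beta = np.array([10.0, 0.5])                   # true parameter vector
    Y = X @ beta + rng.normal(0.0, 1.0, size=9)    # e ~ N(0, sigma^2 I), sigma = 1

    # Solve the normal equations (X^T X) b = X^T Y rather than inverting X^T X.
    b = np.linalg.solve(X.T @ X, X.T @ Y)

    e = Y - X @ b
    sse = e @ e                                    # the minimized error sum of squares e^T e
    print(b, sse)

Solving the linear system is numerically preferable to forming (X^T X)^{-1} explicitly, though the two are algebraically identical.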
CHAPTER TWO

MONTE CARLO STUDIES

2.1 The Simulation Algorithm

The Monte Carlo studies in this investigation consist of generating data from a linear equation with known parameters and normal random errors of known standard deviations. The independent variables were constructed to a predetermined correlation and corrected sum of squares by a technique described below. The dependent variable 'observations', y, were generated as

    y_i = β0 + β1 X_{1i} + β2 X_{2i} + ... + βp X_{pi} + e_{2i},

where:

  i)   β0, β1, ..., βp are the parameter values of the model,
  ii)  X_{1i}, X_{2i}, ..., X_{pi} are the ith levels of the independent variables, measured without error,
  iii) the intercorrelation matrix of the independent variables is C,
  iv)  e_{2i} is the error of estimate of the ith observation and is N(0, σ2²).

The dependent variable, therefore, is generated in a manner in agreement with regression theory. The error of estimate is associated with the dependent variable and is independent of the X variables, and the X variables themselves are assumed to be error free.

Next, the observed independent variables, x, were generated such that

    x = X + e1,   where e1 ~ N(0, σ1²).

Each set of observations y was fitted using the usual least squares procedure. The procedure was repeated 1000 times for a number of combinations of σ1 and C, and the means of selected statistics were taken as 'expectation' values.

It should be pointed out that the same set of standard normal random deviates was used for all combinations of C and σ1.
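A minimal sketch of one replicate of this scheme for a single predictor (our restatement, assuming Python with numpy; the thesis's own programs and deviate stream are not reproduced):

    import numpy as np

    def one_replicate(beta0, beta1, X, sigma1, sigma2, rng):
        """One replicate of the Section 2.1 scheme: y is generated from the
        error-free levels X, and the observed predictor x carries its own
        error e1.  Names are illustrative."""
        y = beta0 + beta1 * X + rng.normal(0.0, sigma2, size=X.shape)  # e2 ~ N(0, sigma2^2)
        x = X + rng.normal(0.0, sigma1, size=X.shape)                  # e1 ~ N(0, sigma1^2)
        return x, y

    rng = np.random.default_rng(1)
    X = np.arange(6.0, 15.0)       # nine fixed levels, as in Study 1 below
    x, y = one_replicate(10.0, 0.5, X, sigma1=2.0, sigma2=1.0, rng=rng)

Fitting y on the error-contaminated x, rather than on X, is what produces the biases examined below.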
2.2 Construction of Vectors of a Given Correlation

The intercorrelation of the independent regression variables was a controlled parameter for studies involving multiple regression models. An algorithm for construction of variables of a given correlation is described below.

Let the unknown random vector be X = [X1, X2, ..., Xp], written as a column matrix. Then the desired column vector of mean values is

    μ = E(X) = [E(X1), E(X2), ..., E(Xp)]^T

and the desired intercorrelation matrix is

    C = | 1    c12  ...  c1p |
        | c21  1    ...  c2p |
        | ...               |
        | cp1  cp2  ...  1  |.

A covariance matrix, V, may be obtained from the correlation matrix and the desired standard deviations of each independent variable by way of the relationship

    c_ij = v_ij / (σ_i σ_j),   or   v_ij = c_ij σ_i σ_j.

A random standard normal deviate generator is used to produce observations of a p-dimensional normal variable Y. Let C be the resultant covariance matrix of Y, and m its mean vector. That is,

    Y ~ N_p(m, C).

It is a property of multivariate normal distributions that if A is any p x p non-singular matrix, then

    Z = AY ~ N_p(Am, ACA^T).

Without immediately determining A, assume that

    ACA^T = I_p,

where I_p is a p x p identity matrix. If V^{1/2} is the square root of the desired covariance matrix, then by the above property,

    (V^{1/2})^T Z ~ N((V^{1/2})^T Am, (V^{1/2})^T I V^{1/2}) = N((V^{1/2})^T Am, V).

The user-specified mean vector is formed by subtracting the mean vector of Z from each observation and adding the user-specified mean vector μ.

We have not yet determined the matrix A such that ACA^T = I. Premultiplying both sides by A^T, we have

    A^T A C A^T = A^T I = A^T.

This implies that

    (A^T A)^{-1} A^T = C A^T,   and that   (A^T A)^{-1} = C.

Therefore

    A^T A = C^{-1}   and   A = (C^{-1})^{1/2}.

Hence, to determine the desired random vector X, we need only compute the two matrices (C^{-1})^{1/2} and (V^{1/2})^T and apply them to the starting random variable Y:

    X = (V^{1/2})^T (C^{-1})^{1/2} (Y - m) + μ   ~   N(μ, V).

Much of this algorithm is incorporated into the computer program *NORMAL written by J. Halm of The University of British Columbia Computing Centre.

It should be pointed out that 'correlation' is a property of random variables. In this study we use 'correlation' in association with fixed predictor vectors for want of a better term.
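The construction can be sketched in a few lines (our translation into Python with numpy; the symmetric square roots are taken by eigendecomposition, one of several valid choices, and S plays the role of the resultant covariance matrix of the generator output):

    import numpy as np

    def sqrtm_sym(M):
        """Symmetric square root of a positive definite matrix."""
        w, Q = np.linalg.eigh(M)
        return Q @ np.diag(np.sqrt(w)) @ Q.T

    def correlated_predictors(mu, sigma, C, n, rng):
        V = C * np.outer(sigma, sigma)          # v_ij = c_ij * sigma_i * sigma_j
        Y = rng.standard_normal((n, len(mu)))   # raw generator output, one row per observation
        m = Y.mean(axis=0)
        S = np.cov(Y, rowvar=False)             # resultant covariance matrix of Y
        A = np.linalg.inv(sqrtm_sym(S))         # A = (S^{-1})^{1/2}, so that A S A^T = I
        return (Y - m) @ A @ sqrtm_sym(V) + mu  # rows have mean mu and covariance V exactly

    rng = np.random.default_rng(2)
    mu = np.array([10.0, 10.0])
    sigma = np.array([2.0, 2.0])
    C = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
    X = correlated_predictors(mu, sigma, C, n=10, rng=rng)

Because the whitening uses the sample moments of Y, the constructed vectors reproduce the requested means, correlations, and corrected sums of squares in the sample exactly, which is what the simulation design requires.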
2.3 A Single Variable Model: Study 1

In the first study, the straight line model used by Turnbull (1968) was simulated:

    Y = β0 + β1 (X + e1) + e2,

where β0 = 10., e1 ~ N(0, σ1²), and e2 ~ N(0, σ2²).

Observations were generated for 9 levels of X (X = 6, 7, ..., 13, 14), and the standard deviation of the error of Y was held constant over the study, σ2 = 1. The slope β1 and the standard deviation of the error of X, σ1, were varied over the study in a factorial arrangement:

    β1 = -.25, 0, .25, .5, .75
    σ1 = 0, 1, 2, 3, 4.

The results of the 25 experiments in Study 1 are given in Tables 1-5. The tabled variables are

    B1     , the least squares estimate of β1;
    S(B1)  , the standard deviation of B1;
    T(B1)  , the statistic (B1 - β1)/S(B1);
    ST(B1) , the standard deviation of the statistic T(B1);
    CHI    , the statistic (n-2) S²_{y.x} / σ2², or SSRES/σ2²;
    V(CHI) , the variance of the statistic CHI.

The table entries are the means of 1000 observations.
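The sketch below re-creates one design cell of Study 1 (our code and seed, assuming numpy; with a different deviate stream it reproduces the trends, not the exact digits, of Tables 1-5):

    import numpy as np

    def study1_cell(beta1, sigma1, beta0=10.0, sigma2=1.0, reps=1000, seed=3):
        rng = np.random.default_rng(seed)
        X = np.arange(6.0, 15.0)                     # the 9 levels of X
        n = len(X)
        b1 = np.empty(reps); t = np.empty(reps); chi = np.empty(reps)
        for r in range(reps):
            y = beta0 + beta1 * X + rng.normal(0.0, sigma2, n)
            x = X + rng.normal(0.0, sigma1, n)       # predictor observed with error
            M = np.column_stack([np.ones(n), x])
            b = np.linalg.solve(M.T @ M, M.T @ y)    # least squares fit
            e = y - M @ b
            s2 = e @ e / (n - 2)                     # residual mean square
            sb1 = np.sqrt(s2 / ((x - x.mean()) ** 2).sum())
            b1[r] = b[1]
            t[r] = (b[1] - beta1) / sb1              # the statistic T(B1)
            chi[r] = e @ e / sigma2 ** 2             # the statistic CHI = SSRES / sigma2^2
        return b1.mean(), b1.std(), t.mean(), t.std(), chi.mean(), chi.var()

    # One cell comparable to Table 4: beta1 = 0.5, sigma1 = 2.
    print(study1_cell(beta1=0.5, sigma1=2.0))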
TABLE 1. SIMULATION RESULTS FOR A SINGLE VARIABLE MODEL
MODEL: Y = 10. - 0.25(X+E1) + E2;  LEVELS OF X = 9;  REPETITIONS = 1000;  SIGMA2 = 1

  SIGMA1:      0        1        2        3        4
  B1        -0.25    -0.23    -0.17    -0.12    -0.09
  S(B1)      0.13     0.12     0.12     0.10     0.09
  T(B1)     -0.02     0.22     0.77     1.43     2.10
  ST(B1)     1.21     1.17     1.19     1.27     1.40
  CHI        7.04     7.44     8.26     8.96     9.41
  V(CHI)    13.75    14.97    18.75    22.52    25.03

TABLE 2. AS TABLE 1, BUT MODEL: Y = 10. + 0.00(X+E1) + E2

  SIGMA1:      0        1        2        3        4
  B1        -0.00    -0.00    -0.00     0.00     0.00
  S(B1)      0.13     0.12     0.11     0.09     0.08
  T(B1)     -0.02    -0.00     0.01     0.00     0.01
  ST(B1)     1.21     1.19     1.17     1.16     1.16
  CHI        7.04     7.05     7.07     7.09     7.10
  V(CHI)    13.75    13.88    14.01    14.26    14.54

TABLE 3. AS TABLE 1, BUT MODEL: Y = 10. + 0.25(X+E1) + E2

  SIGMA1:      0        1        2        3        4
  B1         0.25     0.23     0.17     0.12     0.09
  S(B1)      0.13     0.12     0.12     0.10     0.09
  T(B1)     -0.02    -0.23    -0.78    -1.45    -2.13
  ST(B1)     1.21     1.23     1.24     1.31     1.43
  CHI        7.04     7.44     8.27     8.98     9.45
  V(CHI)    13.75    16.09    20.04    23.59    26.09

TABLE 4. AS TABLE 1, BUT MODEL: Y = 10. + 0.50(X+E1) + E2

  SIGMA1:      0        1        2        3        4
  B1         0.50     0.45     0.35     0.24     0.17
  S(B1)      0.13     0.13     0.14     0.13     0.12
  T(B1)     -0.02    -0.43    -1.31    -2.25    -3.18
  ST(B1)     1.21     1.27     1.32     1.46     1.67
  CHI        7.04     8.60    11.86    14.66    16.47
  V(CHI)    13.75    22.06    40.9?    58.28    68.61

TABLE 5. AS TABLE 1, BUT MODEL: Y = 10. + 0.75(X+E1) + E2

  SIGMA1:      0        1        2        3        4
  B1         0.75     0.63     0.52     0.37     0.26
  S(B1)      0.13     0.15     0.17     0.17     0.16
  T(B1)     -0.02    -0.58    -1.59    -2.60    -3.58
  ST(B1)     1.21     1.28     1.33     1.47     1.67
  CHI        7.04    10.55    17.85    24.10    28.14
  V(CHI)    13.75    33.29    89.42   141.79   167.96

It can be seen from Tables 1-5 that there is a trend to underestimation of |β1| by B1 as σ1 increases. Also, this bias is proportional to the value of the parameter β1. If we define bias as the absolute distance d, then for the family of models simulated, at σ1 = 4, d = .66 β1.

The t statistic behaves as would be expected from the trend of B1. However, the variance of T(B1) shows a departure from Student's t distribution. The expected variance of a random variable distributed as Student's t with n-2 degrees of freedom is (n-2)/(n-4) = 1.4 (when n = 9). When σ1 = 0, the experimental value is close to its expectation (ST(B1)² = (1.21)² = 1.46), but it increases as σ1 increases. The parameter value of β1 also appears to have a proportional effect on the standard deviation of the t statistic, at least for β1 = 0, .25, .5; Figure 1 shows that the trend of ST(B1) for β1 = .75 is almost identical to the trend when β1 = .5.

The statistic CHI, the ratio of the residual sum of squares to σ2², should have a chi square distribution with mean n-2 and variance 2(n-2). When σ1 ≠ 0, the observed values of CHI and of its variance V(CHI) are greater than their expectations. The bias of both statistics is directly proportional to the absolute value of the estimate. We will return to a discussion of these effects after consideration of a multiple regression model.

(Figure 1. The Trend of the Standard Deviation of t = (B - β)/S(B) with β and σ1. Separate curves for β = -0.25, 0.00, 0.25, 0.50, and 0.75 are plotted against σ1.)

2.4 A Two-Variable Model: Study 2

In the second study, a two-variable linear model was simulated,

    Y = β0 + β1 (X1 + e11) + β2 (X2 + e12) + e2,

where

    β = [10, .5, .5]^T,
    (e11, e12) ~ N2(0, σ1² I),
    e2 ~ N(0, σ2²),   σ2 = 1,

for 10 levels of X = [X1 X2]. The covariance matrix V of the predictor errors was varied over the study, V = k I2, k = 0, 1, 2, 3, 4, in a factorial arrangement with the correlation of the predictor vectors of X, ρ(X1, X2) = j(.1), j = 0, 1, ..., 9. The regression coefficients β, and σ2, were held constant over the study.

The results of the 50 experiments of Study 2 are given in Tables 6-15. Tables 6-15 are of the same form as those of Study 1 except that data are present for two regression coefficients.
TABLE 6. SIMULATION RESULTS FOR A TWO VARIABLE MODEL
MODEL: Y = 10. + .5(X1+E11) + .5(X2+E12) + E2;  LEVELS OF X = 10;
REPETITIONS = 1000;  CORRELATION X1,X2 = 0.0;  SIGMA2 = 1

  SIGMA1:      0        1        2        3        4
  B1         0.50     0.45     0.33     0.22     0.14
  S(B1)      0.13     0.15     0.17     0.16     0.14
  T(B1)     -0.03    -0.41    -1.19    -2.03    -2.87
  ST(B1)     1.12     1.14     1.19     1.31     1.50
  B2         0.50     0.45     0.33     0.22     0.14
  S(B2)      0.13     0.15     0.16     0.16     0.14
  T(B2)     -0.05    -0.46    -1.23    -2.07    -2.89
  ST(B2)     1.23     1.24     1.29     1.44     1.63
  CHI        7.07    10.14    16.17    20.71    23.28
  V(CHI)    14.85    29.75    67.82    98.25   112.23

TABLE 7. AS TABLE 6, BUT CORRELATION X1,X2 = 0.1

  SIGMA1:      0        1        2        3        4
  B1         0.50     0.46     0.34     0.23     0.15
  S(B1)      0.13     0.15     0.17     0.16     0.14
  T(B1)     -0.02    -0.37    -1.10    -1.91    -2.71
  ST(B1)     1.13     1.13     1.18     1.29     1.46
  B2         0.50     0.45     0.34     0.23     0.16
  S(B2)      0.13     0.15     0.17     0.16     0.14
  T(B2)     -0.05    -0.42    -1.15    -1.94    -2.74
  ST(B2)     1.23     1.24     1.29     1.42     1.59
  CHI        7.07    10.18    16.49    21.47    24.42
  V(CHI)    14.85    29.92    70.47   105.59   122.82

TABLE 8. AS TABLE 6, BUT CORRELATION X1,X2 = 0.2

  SIGMA1:      0        1        2        3        4
  B1         0.50     0.46     0.35     0.24     0.16
  S(B1)      0.13     0.15     0.17     0.16     0.15
  T(B1)     -0.02    -0.34    -1.03    -1.80    -2.58
  ST(B1)     1.15     1.14     1.18     1.28     1.44
  B2         0.50     0.45     0.35     0.24     0.17
  S(B2)      0.13     0.15     0.17     0.16     0.15
  T(B2)     -0.05    -0.40    -1.08    -1.84    -2.60
  ST(B2)     1.25     1.25     1.29     1.40     1.56
  CHI        7.07    10.21    16.76    22.17    25.50
  V(CHI)    14.85    30.08    72.92   112.66   133.46

TABLE 9. AS TABLE 6, BUT CORRELATION X1,X2 = 0.3

  SIGMA1:      0        1        2        3        4
  B1         0.50     0.46     0.36     0.25     0.18
  S(B1)      0.13     0.15     0.17     0.17     0.15
  T(B1)     -0.01    -0.32    -0.97    -1.71    -2.46
  ST(B1)     1.18     1.16     1.19     1.28     1.42
  B2         0.50     0.46     0.36     0.25     0.18
  S(B2)      0.13     0.15     0.17     0.16     0.15
  T(B2)     -0.05    -0.37    -1.02    -1.74    -2.48
  ST(B2)     1.29     1.27     1.30     1.40     1.53
  CHI        7.07    10.24    17.01    22.81    26.52
  V(CHI)    14.85    30.21    75.23   119.46   144.12

TABLE 10. AS TABLE 6, BUT CORRELATION X1,X2 = 0.4

  SIGMA1:      0        1        2        3        4
  B1         0.50     0.47     0.37     0.26     0.18
  S(B1)      0.13     0.15     0.17     0.17     0.15
  T(B1)     -0.01    -0.30    -0.91    -1.62    -2.35
  ST(B1)     1.24     1.20     1.21     1.28     1.41
  B2         0.50     0.46     0.37     0.26     0.19
  S(B2)      0.13     0.15     0.17     0.17     0.15
  T(B2)     -0.05    -0.35    -0.96    -1.66    -2.37
  ST(B2)     1.34     1.31     1.32     1.40     1.51
  CHI        7.07    10.27    17.23    23.39    27.48
  V(CHI)    14.85    30.34    77.43   126.01   154.78

TABLE 11. AS TABLE 6, BUT CORRELATION X1,X2 = 0.5

  SIGMA1:      0        1        2        3        4
  B1         0.50     0.47     0.37     0.27     0.19
  S(B1)      0.13     0.15     0.17     0.17     0.15
  T(B1)     -0.00    -0.29    -0.87    -1.56    -2.26
  ST(B1)     1.28     1.22     1.23     1.30     1.41
  B2         0.50     0.46     0.37     0.27     0.19
  S(B2)      0.13     0.15     0.17     0.17     0.15
  T(B2)     -0.05    -0.34    -0.92    -1.59    -2.28
  ST(B2)     1.37     1.32     1.33     1.41     1.51
  CHI        7.07    10.30    17.43    23.97    28.42
  V(CHI)    14.81    30.35    79.30   132.60   165.90

TABLE 12. AS TABLE 6, BUT CORRELATION X1,X2 = 0.6

  SIGMA1:      0        1        2        3        4
  B1         0.50     0.47     0.38     0.28     0.20
  S(B1)      0.13     0.15     0.17     0.17     0.16
  T(B1)      0.01    -0.26    -0.82    -1.48    -2.16
  ST(B1)     1.45     1.33     1.29     1.33     1.40
  B2         0.50     0.47     0.38     0.28     0.20
  S(B2)      0.13     0.15     0.17     0.17     0.16
  T(B2)     -0.06    -0.32    -0.87    -1.51    -2.18
  ST(B2)     1.53     1.43     1.39     1.43     1.48
  CHI        7.07    10.33    17.59    24.43    29.25
  V(CHI)    14.85    30.61    81.55   138.36   176.02

TABLE 13. AS TABLE 6, BUT CORRELATION X1,X2 = 0.7

  SIGMA1:      0        1        2        3        4
  B1         0.50     0.47     0.38     0.29     0.21
  S(B1)      0.13     0.15     0.18     0.17     0.16
  T(B1)      0.02    -0.25    -0.78    -1.41    -2.09
  ST(B1)     1.64     1.45     1.36     1.36     1.39
  B2         0.50     0.47     0.38     0.29     0.21
  S(B2)      0.13     0.15     0.17     0.17     0.16
  T(B2)     -0.07    -0.31    -0.83    -1.45    -2.10
  ST(B2)     1.72     1.55     1.46     1.48     1.45
  CHI        7.07    10.36    17.74    24.89    30.07
  V(CHI)    14.85    30.76    83.38   144.14   186.60

TABLE 14. AS TABLE 6, BUT CORRELATION X1,X2 = 0.8

  SIGMA1:      0        1        2        3        4
  B1         0.50     0.47     0.39     0.29     0.21
  S(B1)      0.13     0.15     0.18     0.17     0.16
  T(B1)      0.04    -0.23    -0.75    -1.36    -2.02
  ST(B1)     1.98     1.65     1.47     1.38     1.40
  B2         0.49     0.47     0.39     0.29     0.22
  S(B2)      0.13     0.15     0.17     0.17     0.16
  T(B2)     -0.08    -0.30    -0.80    -1.39    -2.02
  ST(B2)     2.05     1.73     1.55     1.48     1.46
  CHI        7.07    10.39    17.87    25.31    30.97
  V(CHI)    14.85    30.85    84.98   149.64   197.20

TABLE 15. AS TABLE 6, BUT CORRELATION X1,X2 = 0.9

  SIGMA1:      0        1        2        3        4
  B1         0.51     0.48     0.39     0.30     0.22
  S(B1)      0.13     0.15     0.18     0.18     0.16
  T(B1)      0.07    -0.21    -0.72    -1.32    -1.95
  ST(B1)     2.77     2.02     1.58     1.41     1.42
  B2         0.49     0.47     0.39     0.30     0.22
  S(B2)      0.13     0.15     0.18     0.18     0.16
  T(B2)     -0.11    -0.29    -0.76    -1.32    -1.94
  ST(B2)     2.82     2.07     1.65     1.50     1.48
  CHI        7.07    10.41    17.96    25.69    31.60
  V(CHI)    14.85    31.24    86.20   155.00   207.67
From Tables 6-15 we see that the trends due to σ1 which were noted for the single variable model are generally unchanged by the addition of another variable. However, detailed comparison of the two cases does not seem justified, as the method of selection of the independent variables is different: for the single variable case the x variables are set at x = (6, 7, 8, ..., 13, 14), while for the two variable case the x variables were generated randomly with a mean of 10 and a standard deviation of 2. Consequently, the results for the two variable case are stated below without comparison to the single variable case.

The regression coefficient B1 (or B2) underestimates β1 (or β2) as σ1 increases.

The statistic (B_i - β_i)/S(B_i) behaves as expected considering the effects of σ1 ≠ 0 on the estimation of β, increasing in absolute value as σ1 increases. The rate of increase of (B_i - β_i)/S(B_i) is inversely affected by raising the level of correlation of X1, X2.

The standard deviation of this statistic, ST(B1), increases proportional to σ1. Raising the correlation of the two independent variables causes an increase in the value of ST(B1) for σ1 > 0. However, the rate at which ST(B1) increases decreases for larger correlations (Figure 2).

The statistic CHI and its variance V(CHI) both show overestimation with increasing σ1 in the two variable model.

(Figure 2. The Effect of Non-orthogonal Predictor Variables and σ1 ≠ 0 on the Standard Deviation of t = (B - β)/S(B).)

2.5 Discussion of Simulation Results

The fact that b will be a biased estimate of β when the variance of error associated with the independent variables is greater than zero is not well known by many users of regression techniques. An algebraic explanation of this result can be obtained by inserting the error of the independent variables into the normal equations.

Let x be the n x p matrix of observed independent variables with an associated random matrix of errors e1,

    e1 ~ N_p(0, σ1² I).                                                 (5)

The true experimental variable is X, such that

    x = X + e1                                                          (6)

and

    x^T x = (X + e1)^T (X + e1) = X^T X + X^T e1 + e1^T X + e1^T e1.
The normal equations expressed in terms of the independent variables observed with error are

    (x^T x) b = x^T Y.                                                  (7)

Using equation (6) and solving for b, equation (7) can be written

    b = (X^T X + X^T e1 + e1^T X + e1^T e1)^{-1} (X^T Y + e1^T Y).      (8)

Taking expected values of both sides, we have

    E(b) = E[(X^T X + X^T e1 + e1^T X + e1^T e1)^{-1} X^T Y],           (9)

as E(e1^T Y) = 0 by the distributive properties of e1.

From (9) we can see that the bias of b will be small if (X^T e1 + e1^T X + e1^T e1) is small relative to X^T X. Turnbull (1968), working with a single variable model, observed that the bias in b as an estimate of β increases as the ratio of σ1² to the corrected sum of squares increases. The bias of the estimate of β is minimized if the range of the independent variables is maximized.

The practical significance of this result is apparent in the forestry problem of 'even aged' stands mentioned in the introduction. If σ1 is one or two years and the range of stand age is large, the bias in the estimate of β would tend to be trivial. Similarly, the bias due to σ1 > 0 in the estimate of the regression coefficient corresponding to the variable tree height will be small if the range of tree heights is large.
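For the single variable case the size of this effect can be anticipated with the standard errors-in-variables attenuation approximation (a textbook result added here for illustration; it is not the thesis's own computation). With fixed levels X and error variance σ1², the expected corrected sum of squares of the observed x is Sxx + (n-1)σ1², so the slope estimate shrinks roughly by the factor Sxx / (Sxx + (n-1)σ1²):

    import numpy as np

    X = np.arange(6.0, 15.0)             # the nine fixed levels of Study 1
    Sxx = ((X - X.mean()) ** 2).sum()    # corrected sum of squares of X (= 60)
    n = len(X)

    beta1 = 0.5
    for sigma1 in (0, 1, 2, 3, 4):
        # Error in x inflates the denominator sum of squares, attenuating B1.
        approx = beta1 * Sxx / (Sxx + (n - 1) * sigma1 ** 2)
        print(sigma1, round(approx, 2))  # 0.5, 0.44, 0.33, 0.23, 0.16

These values track the B1 row of Table 4 (0.50, 0.45, 0.35, 0.24, 0.17) reasonably well.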
T a k i n g e x p e c t a t i o n s of ( 1 4 ) ,  T  E(SSE) = E [ 3 e ^ e ] L 3 ] + 2 E [ e 2 e i 3 ] + E [ e 2 e 2 ]  = E[B eJ B] + E [ e 2 e 2 ] T  when  and  ei  (15)  are independent.  From e q u a t i o n ( 1 5 ) , the e x p e c t e d v a l u e o f t h e e r r o r sum o f s q u a r e s when t h e model has e r r o r o f i n d e p e n d e n t v a r i a b l e s i s an e x p r e s s i o n  36  of three terms. of  2 (n-p-ljo^.  2 (n-p-ljo^.  The f i r s t term i s the c h i square d i s t r i b u t e d estimate T T The second term 6 ^ -^3 biases the SSE as an estimate of e  E  Note that the i n f l a t i o n of the e r r o r sums of squares i s  d i r e c t l y p r o p o r t i o n a l to the magnitude of the 3 parameters and the 2 diagonal of the covariance matrix a.. I of the random vector e... A x  T t h i r d b i a s term, Ze^e^ and  p  X  f u r t h e r i n f l a t e s the e r r o r sums of squares when  are not independent. These r e l a t i o n s h i p s are c l e a r l y r e f l e c t e d i n the data of the  s i m u l a t i o n experiment.  The experiments of both studies (1) and (2)  demonstrate the e f f e c t of i n c r e a s i n g ( -j^p) CT  o n  t n e  s t a t i s t i c labelled  CHI, the sums of squares e r r o r , and the experiment of Study 2 show the e f f e c t of v a r y i n g the s i z e of the 3^ parameters. A p r a c t i c a l value of the s i m u l a t i o n r e s u l t s i s that they demonstrate the r a p i d increase i n over-estimation of the true e r r o r sum of squares of Y as a^I increases.  The observations that the d i s t r i -  b u t i o n of the s t a t i s t i c CHI no longer follows that of the c h i square i s a l s o explained by equation (15). The s t a t i s t i c i s no longer the sum of the squares of a standard normally d i s t r i b u t e d random v a r i a b l e . The reason f o r the departure of the s t a t i s t i c Bl - 3  X  S(3 ) 1  from students' t d i s t r i b u t i o n can be seen from the above conclusions. The t d i s t r i b u t i o n i s a r a t i o of random v a r i a b l e s , the denominator d e r i v i n g from the c h i squared estimate of SSE.  37  It has been demonstrated that when the p r e d i c t o r vectors are not orthogonal and 3^ > 0, the error sum of squares i s i n f l a t e d prop o r t i o n a l to both  and the c o r r e l a t i o n of X^, X^.  The  correlation  e f f e c t s of Study 2 are as i n t e r e s t i n g as the X-error e f f e c t s but are presented here without discussion.  A precise mathematical explanation  of the bias conditions associated with non-orthogonal p r e d i c t i o n vectors i s given by Hoerl and Kennard (1970). From the standpoint of a p p l i c a t i o n i t i s important  to point out  that regression with non-orthogonal predictor vectors i s f a r from unusual.  In f o r e s t r y , the l i n e a r form of the volume equation discussed  earlier, log V = log a + b l o g D + c log H, contains two correlated independent v a r i a b l e s . Presumably, a l e a s t squares f i t would r e s u l t i n overestimation of the variance,  2 o. 0  CHAPTER THREE REGRESSION PROCEDURES WHEN a  J 0, AND  ±  PREDICTOR VECTORS ARE NOT ORTHOGONAL  Kendall and Stuart (1961) discuss the problem of "both v a r i a b l e s subject to e r r o r . "  The two variables X and Y are assumed  linearly related,  Y = 3 but X and Y cannot be observed.  
Acton (1959) presents a maximum likelihood estimate of the regression coefficient of a straight line when X and Y are in error. The maximum likelihood expression for β involves the standard deviations of the errors of X and Y, σ_ex and σ_ey respectively, and the correlation of the errors, ρ(e_x, e_y).

A likelihood function for the multiple variable case is discussed by Clutton-Brock (1967); an iterative fitting process is also demonstrated.

Techniques for fitting linear models when the independent variables are correlated are not well developed. Kendall (1957) suggests applying principal component analysis to the independent variables and then using the orthogonal variables specified by the first few principal components as predictor vectors. However, Cox (1968) criticizes this technique, arguing that there is no logical reason why the dependent variable should not be closely tied to the least important principal component.

Hoerl and Kennard (1970) propose an estimation procedure, termed Ridge Regression, for the situation where the predictor variables are non-orthogonal. Estimation is based on

    (X^T X + k I_p),   k > 0,

where X^T X is in the form of a correlation matrix. Addition of the small positive quantity k to each diagonal element causes the system

    (X^T X + k I) B = X^T Y

to act more like an orthogonal system. A procedure for improving the estimation of β is given.
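A minimal ridge sketch (ours, assuming numpy; X is standardized so that X^T X/n is in correlation form, and the choice of k, which is the crux of the Hoerl-Kennard procedure, is simply taken as an argument):

    import numpy as np

    def ridge_coefficients(X, y, k):
        """Solve (X^T X + k I) b = X^T y with X^T X in correlation form."""
        Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # standardized predictors
        ys = y - y.mean()
        n, p = Xs.shape
        G = Xs.T @ Xs / n + k * np.eye(p)          # correlation matrix plus k on the diagonal
        return np.linalg.solve(G, Xs.T @ ys / n)   # standardized ridge coefficients

With k = 0 this reduces to ordinary least squares on the standardized variables; a small k > 0 stabilizes the system when the predictor correlation matrix is nearly singular.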
This summary of regression procedures when σ1 ≠ 0 or C ≠ I shows that the literature is not well developed on this subject. Practical algorithms for fitting linear models under these non-standard conditions do not yet exist.

CONCLUSION

In this paper we examine the ramifications of using least squares regression methods when the assumption that the independent variables are known without error is violated and when the predictor vectors are non-orthogonal.

Violation of the assumption that the X variables are known without error affects the estimate of the vector of regression coefficients, β. As the variance of the error of the independent variables, σ1², increases, there is a distinct trend of decrease in |B|, the estimate of |β|. For a two variable model with β1 = β2 = .5, the estimate decreased from B1 = .5 to B1 = .14 as σ1 increased from 0 to 4. Turnbull (1968) demonstrates that the relative bias of the estimate of β is proportional to the ratio of the error variance to the corrected sum of squares of the corresponding predictor variable.

The ratio of the residual sum of squares to the variance of error of the y observations does not display the elementary characteristics of a chi square statistic when σ1 > 0. From the expression for the expected sum of squares error when σ1 > 0,

    E[SSE] = β^T E[e1^T e1] β + E[e2^T e2],

the bias term is a product of the sums of squares of the errors of the independent variables and their corresponding β terms. The T statistic is also affected by the inflation of SSE.

A further small deviation from the expected sampling distribution of these statistics (for σ1 > 0) with increasing correlation is due to lack of precision in computing β^T E[e1^T e1] β as the system becomes less orthogonal.

From the above conclusions, two generalizations about models with more than two variables can be made:

  a)  Overestimation of the true error sum of squares of y is proportional to the error of the independent variables; increasing the number of variables will increase the bias of this estimate.

  b)  Addition of non-orthogonal predictor vectors will increase the number of non-zero off-diagonal elements of the correlation matrix. The inflation of the error sum of squares results from this lack of orthogonality and would presumably increase.

Further simulation studies of models with more than two non-orthogonal predictor vectors might demonstrate the practical upper limits of correlation of predictor variables for a given computer precision. In general, the effects of correlated independent variables deserve further study through both simulation and algebraic analysis.

The literature does not present a practical algorithm for fitting linear models under these non-standard conditions. The most fruitful approach to the problem of error in the independent variables appears to be maximum likelihood estimation of β.
The effects of non-orthogonal predictor variables are less generally appreciated. Transformations to an orthogonal subset of predictor vectors by principal components methods, and the recent technique of ridge regression, are possible solutions.

LITERATURE CITED

Acton, F. S. 1959. Analysis of Straight Line Data. Wiley, New York.

Bartlett, M. S. 1949. Fitting a straight line when both variables are subject to error. Biometrics 5:207-212.

Carlson, F. D., E. Sobel and G. S. Watson. 1966. Linear relationships between variables affected by errors. Biometrics 22:252-267.

Clutton-Brock, M. 1967. Likelihood distributions for estimating functions when both variables are subject to error. Technometrics 9:261-269.

Cox, D. R. 1968. Notes on some aspects of regression analysis. J. R. Statist. Soc. A 131:265-279.

Hoerl, A. E. and R. W. Kennard. 1970. Ridge regression: biased estimation for non-orthogonal problems. Technometrics 12:55-67.

Kendall, M. G. 1951. Regression, structure, and functional relationship. Biometrics 7:11-25.

Kendall, M. G. 1957. A Course in Multivariate Analysis. Griffin, London.

Kendall, M. G. and A. Stuart. 1961. The Advanced Theory of Statistics. Vol. 2. Hafner, New York.

Turnbull, K. J. 1968. Monte Carlo studies of several regression models. Presented at the meeting of the Society of American Foresters, Division of Forest Mensuration, Philadelphia, Oct. 2, 1968.

Wald, A. 1940. The fitting of straight lines if both variables are subject to error. Ann. Math. Statist. 11:284.
