UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Fitting spline functions by the method of least squares Smith, John Terry 1967

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
831-UBC_1967_A8 S55.pdf [ 2.64MB ]
Metadata
JSON: 831-1.0080581.json
JSON-LD: 831-1.0080581-ld.json
RDF/XML (Pretty): 831-1.0080581-rdf.xml
RDF/JSON: 831-1.0080581-rdf.json
Turtle: 831-1.0080581-turtle.txt
N-Triples: 831-1.0080581-rdf-ntriples.txt
Original Record: 831-1.0080581-source.json
Full Text
831-1.0080581-fulltext.txt
Citation
831-1.0080581.ris

Full Text

FITTING•SPLINE FUNCTIONS BY THE METHOD OF' LEAST SQUARES  by JOHN TERRY SMITH B.Sc,  A c a d i a U n i v e r s i t y , 1962.  B.A., Oxford U n i v e r s i t y ,  1965.  A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE  REQUIREMENTS FOR THE DEGREE OF  MASTER OF ARTS  i n the Department ' of Mathematics  We accept t h i s t h e s i s as conforming to the r e q u i r e d  standard;  THE UNIVERSITY OF BRITISH COLUMBIA • August, 1967.  In p r e s e n t i n g t h i s t h e s i s  in p a r t i a l f u l f i l m e n t of the requirements  f o r an advanced degree at the U n i v e r s i t y of B r i t i s h Columbia,  I agree  t h a t the L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and Study. thesis  I f u r t h e r agree t h a t p e r m i s s i o n  f o r e x t e n s i v e c o p y i n g of  this  f o r s c h o l a r l y purposes may be g r a n t e d by the Head of my  Department or by hl'S r e p r e s e n t a t i v e s . or p u b l i c a t i o n o f t h i s t h e s i s  i s understood t h a t  copying  f o r f i n a n c i a l g a i n s h a l l not be a l l o w e d  w i t h o u t my w r i t t e n p e r m i s s i o n .  Department of The U n i v e r s i t y of B r i t i s h Vancouver 8, Canada  It  Columbia  ii.  ABSTRACT  A s p l i n e f u n c t i o n o f degree  5 ,5 ,...,5 Q  1  isa  r  degree a t most (? » ° ) • + 0  r  T  n  e  k  C  k _ 1  k  w i t h knots  f u n c t i o n which i s a p o l y n o m i a l o f  i n each o f the i n t e r v a l s  (-<»  (g^^),...,  Gauss-Mar k o f f Theorem can be used t o e s t i m a t e  by l e a s t squares the c o e f f i c i e n t s o f a s p l i n e f u n c t i o n o f g i v e n degree and knots. E s t i m a t i n g a s p l i n e f u n c t i o n o f known knots w i t h o u t f u l l knowledge o f the degree e n t a i l s an e x t e n s i o n o f the GaussMarkoff technique.  The e s t i m a t i o n o f the degree when the k n o t s  are a l s o unknown has a p o s s i b l e s o l u t i o n i n a method  employing  f i n i t e differences.  ,  The technique o f m i n i m i z i n g sums o f squared  residuals  forms the b a s i s f o r a method o f e s t i m a t i n g the k n o t s o f a s p l i n e f u n c t i o n o f g i v e n degree.  E s t i m a t e s f o r the knots may a l s o  be o b t a i n e d by a method o f s u c c e s s i v e a p p r o x i m a t i o n , p r o v i d e d a d d i t i o n a l i n f o r m a t i o n about the s p l i n e f u n c t i o n i s known.  i i i .  T A B L E OF C O N T E N T S  Page  INTRODUCTION  1  CHAPTER I  4  1.  The G a u s s - M a r k o f f  2.  Spline  Theorem  4  Functions  6  ,/  CHAPTER I I  11  1.  A p p l y i n g the  2.  Rank o f  the  Gauss-Markoff  Theorem  11  Design Matrix  1-J5 22  CHAPTER I I I 1.  Estimating  the  Degree:  2.  Concerning  Orthogonal  3. -  Estimating  the  Knots  , 2 2  Known  24  Polynomials  Degree:  Knots  27  Unknown  32  CHAPTER I V 1.  Estimating  the  Knots:  2.  Non-Linear Estimation Approximation  BIBLIOGRAPHY  Degree Known by  32  •  Successive 36 j  /  40  iv.  ACKNOWLEDGEMENTS  I am glad of t h i s opportunity to acknowledge a debt of gratitude to Dr. Stanley W. Nash f o r suggesting the central problem of the thesis and f o r g i v i n g generously time and assistance during the research work on i t .  of h i s  I would  l i k e also to extend thanks to Dr. J . R. Hugh Dempster f o r h i s advice during the writing and f o r reading the proofs; Wayne Welsh who also read the proofs;  to  and to Mr. A. G. Fowler  and the University of B r i t i s h Columbia Computing Centre f o r guidance i n the use of the computer f a c i l i t i e s . My thanks also to the National Research Council f o r . their f i n a n c i a l assistance while the work was i n progress.  INTRODUCTION  The method of l e a s t squares curve f i t t i n g has become a highly developed one f o r a wide range of functions which are l i n e a r i n the parameters.  Straight l i n e , hyperplane,  and  polynomial regressions have received special a t t e n t i o n and have found wide applications i n a l l d i s c i p l i n e s which attempt  to  formulate functional models to explain observed phenomena. The t h e o r e t i c a l basis f o r the general method i s the Gauss-Markoff Theorem, described i n Chapter I, which ensures the existence of and indicates the method f o r obtaining minimum variance l i n e a r unbiased estimates f o r the parameters of functions which are l i n e a r i n them.  To date, a p p l i c a t i o n s to  planar curves have been concerned l a r g e l y with i n f i n i t e l y * d i f f e r e n t i a b l e curves f i t t e d to a scatter of data points. Any attempt to f i t a connected sequence of d i s t i n c t curves to one set of data has had to r e l y on a d d i t i o n a l techniques. A  type of function developed l a r g e l y by  I. J . Schoenberg i n the context of i n t e r p o l a t i o n theory allows the f i t t i n g of segmented polynomial curves to be achieved solely by the Gauss-Markoff Theorem.  This function, also  introduced i n Chapter I, i s c a l l e d a spline f u n c t i o n and c o n s i s t s of a number of d i s t i n c t polynomial arcs joined together with a s p e c i f i e d order of continuous d i f f e r e n t i a b i l i t y at points  2.  whose abscissae are c a l l e d knots.  The spline f u n c t i o n has a  unique representation which i s l i n e a r i n i t s c o e f f i c i e n t s ;  this  allows the Gauss-Markoff Theorem to be applied, i n one operation, to i t s estimation. Spline functions can be c l a s s i f i e d i n t o f a m i l i e s by their degree, which i s c l o s e l y connected with t h e i r order of d i f f e r e n t i a b i l i t y , or by their knots.  The c o e f f i c i e n t s of a  spline function of given degree and knots can be estimated as e a s i l y as can the c o e f f i c i e n t s of a polynomial, provided the conditions are known which ensure that the equations to be solved a c t u a l l y have a unique solution.  Chapter I I j u s t i f i e s  the use of the Gauss-Markoff Theorem f o r spline functions and answers the question concerning the s o l v a b i l i t y of the equatipns for the estimates. The question then a r i s e s as to what procedure to follow when the degree of a spline function i s not s p e c i f i e d by the physical s i t u a t i o n and so must be estimated.  In  Chapter I I I the case of given knots i s considered f i r s t ;  the  problem i s solved e s s e n t i a l l y by repeated a p p l i c a t i o n of the Gaull-Markoff Theorem u n t i l the degree i s found which gives the best f i t .  To estimate the degree when no information about  the knots i s a v a i l a b l e , the suggestion i s put forward that one use a method involving f i n i t e d i f f e r e n c e s . The knots of a spline function cannot be estimated d i r e c t l y by the Gauss-Markoff Theorem because the known representations f o r spline functions are not l i n e a r i n the  3.  knots.  Chapter IV discusses two possible approaches to the  problem of knot estimation.  The f i r s t , dealing with the case  of given degree, evolves out of known techniques f o r estimating the Join points of segmented curves.  The second  approach  depends on the use of a successive approximation technique f o r functions which are not l i n e a r i n t h e i r parameters. Questions which might lead to further research  are  noted as they arise i n the discussion. A separate numbering system i s used f o r theorems, d e f i n i t i o n s and equations i n each chapter.  4.  CHAPTER  I  In t h i s chapter we s h a l l state the Gauss-Markoff Theorem and discuss some well-known a p p l i c a t i o n s .  The  d e f i n i t i o n of spline function, as i t i s to be used throughout  .  t h i s paper, w i l l be given, along with some of i t s relevant properties.  1.  The Gauss-Markoff Theorem The common uses of l e a s t square f i t t i n g  techniques  deal with straight l i n e s , hyperplanes, polynomial, curves, and, i n general, any curves represented by functions which are l i n e a r i n the parameters.  Each of these examples i s an a p p l i c a t i o n  of the Gauss-Markoff Theorem [10, pp. 1 7 9 - 1 8 2 ] , 1.1  Theorem Let  (Gauss-Markoff);  y_ be an n x l column vector of observations and l e t  y_ = X£ + £ , where a  ss i s a random  qxl column vector of unknown parameters,  matrix of known weights with rank minimum of n  and q .  r e a l constant and I Let  a  be any  m  Suppose that  and variance-covariance matrix  a'|J  n x l column vector, X  £ is  i s an nxq  no greater than the z_ has expectation  a I , where  a  0  i s an unknown  i s the nxn i d e n t i t y matrix. qxl column vector of r e a l numbers.  Then  i s a l i n e a r parametric function. I f there e x i s t s an unbiased estimate of  a'>3',  that i s ,  5  i f there exists an has expectation  nxl  column vector  such that  u'y_  a'£ , then a'i  s  a'(X'X)"X'^  i s the l i n e a r unbiased estimate of o a'(X'X)"a  u  a'£  with minimum variance  i n the class of a l l l i n e a r unbiased estimates of  2  Furthermore, the minimum of the sum of squared r e s i d u a l s S (£) = (y_ - *£)'(£ - X£)  is  2  has expectation  (n-m)a  S  2  = v_'(I - X(X'X)~X')y_  .  We s h a l l omit the proof of the theorem. (x'x)~  that X'X  .  Note  denotes a generalized inverse of the matrix  (For any matrix  A  there e x i s t s a generalized inverse  A" , not n e c e s s a r i l y unique, with the properties: and trace  which  (A""A) = rank A .  singular, then  A~  If  i s unique and  A  i s square and A~ = A"  1  AA"A  = A ;  non-  , the true inverse*  (See [.8, pp. 24-26] for a b r i e f discussion of generalized inverses.) The more common a p p l i c a t i o n s of t h i s theorem y i e l d matrices  X  which are of f u l l rank, i n which case  (X'X)~ o (X'X)""  and the minimum variance l i n e a r  1  estimate of  j3  unbiased  i s given by  |with expectation  (X'xj^x'y.  ,  p , and variance-covariance matrix  a (X'X)" 2  1  6.  The matrix  X  appearing i n the Gauss-Markoff Theorem  s h a l l be referred to throughout as a design matrix. In f i t t i n g a straight l i n e to a set of data points (x ,y ) , i  i = l,2,...,n , the vector  i  observed ordinates, and the matrix  X  y_  i s a column of  followed by a column c o n s i s t i n g of the The vector  consists of the  n  n  ones  observed abscissae.  f3 has two elements representing the y - i n t e r c e p t  and the slope of the l i n e , r e s p e c t i v e l y . The example of the hyperplane i s a g e n e r a l i z a t i o n of the straight l i n e model obtained by adding columns of a d d i t i o n a l observations to the components In the  X £  matrix and increasing the number of vector correspondingly.  The  i s adapted s t i l l further to the case of a polynomial by defining the components of the Hi  =  * i '*  i = l*2,...,n;  X  situation curve f i t  matrix to be  j = 0,1, . . . , q .  In Chapter I I we consider a f u r t h e r a p p l i c a t i o n of the Gauss-Markoff Theorem to the f i t t i n g by l e a s t squares of a class of functions which, i n some ways, extend and improve upon the f i t t i n g properties of polynomials.  These functions,  c a l l e d spline functions, are described i n the f o l l o w i n g section.  2.  Spline Functions I t i s i n t u i t i v e l y appealing that a broken-line poly-  gonal graph can give a better l e a s t squares f i t than can straight l i n e . situation arises.  one  For curves of higher degree a s i m i l a r One might speculate, f o r example, as to  whether a sequence of parabolic arcs w i l l better f i t a set of data points than one unbroken polynomial curve of, say, f i f t h degree.  These considerations might lead one to the question  of whether d i f f e r e n t i a b i l i t y conditions can be imposed a t th© points where the polynomial segments are joined  together.  The concept of the spline function allows us to make precise some of the properties of curves formed by p i e c i n g together polynomial arcs.  The following d e f i n i t i o n i s one  of those used by Schoenberg [16, p. 110]. 2.1  Definition:  (2.2)  Let  •••<?.  < ? 2  -l  < ?  0  < §  l "* <  < §  v *" <  be given i n f i n i t e sequence of r e a l numbers such that lim 5  = +o9 , i f  negative integer.  v -» + oo , and l e t  k  A spline function  be a given non-  S (x) of degree  having knots (2.2), i s a function of the c l a s s real  x  such that i n each i n t e r v a l  (? '? v  a polynomial of degree not exceeding  k ,  lc  v + 1  )  C i - t  k _ 1  for a l l  reduces to  k .  In the following discussion we w i l l be l a r g e l y concerned with those spline functions which have only a f i n i t e number of knots. functions  In t h i s case, we w i l l r e f e r to spline  S (x) , with knots fc  polynomials of degree a t most and  (§ ,+«) r  as well as In  ? <? <-"<§ 0  k  1  > which reduce to  i n the i n t e r v a l s (-<*,t* )  (§ *S V  r  0  V + 1  ) ,  v = 0,1,...,r-l .  I t can be r e a d i l y v e r i f i e d that a spline function of  8.  degree zero i s a step function with steps at the knots. S i m i l a r l y , a spline function of f i r s t degree i s a continuous piecewise-linear function whose slope i s discontinuous  at the  knots. Several equivalent representations tions may  be found i n the l i t e r a t u r e ;  13, 14, 15, 16, 5].  of spline func-  see, f o r example,  [12,  The one most suited to a p p l i c a t i o n of the  Gauss-Markoff Theorem i s a modification of the notation of Schoenberg [16, p. 352] and G r e v i l l e [5, p.  54]. k x  F i r s t , consider the truncated power function defined by k if x > 0 , if This function may of degree  k  +  x < 0 .  be thought of as a rudimentary spline f u n c t i o n  with knot  §Q = 0 .  I f we apply the d i f f e r e n t -  iation rule, d_k dx +  _ ~  x  k K  x  +  k-1 k  we see that the f i r s t and that the of  ki  k  k-1  derivatives of  derivative,  at the knot  §  Q  are continuous  +  ki x_j_ , has a Jump d i s c o n t i n u i t y  = 0 .  a spline function of degree  x  G r e v i l l e [5, p. 54-55] shows that k  with knots  §Q<? <*"<§ 1  r  has  the unique representation: (2.3)  where  '  P (x) fc  S (x) k  = P (x)/  lc v=0 + /  (x -  §  )  k  ,  i s a polynomial of degree at most  k .  This i s  9.  the representation we s h a l l use i n Chapter I I to apply  the  Gauss-Markoff Theorem to spline functions. Spline functions behave i n some ways l i k e polynomials> but i n other ways, t h e i r behaviour i s quite d i f f e r e n t .  A  summary of some of the properties of spline functions i n d i c a t e s the major s i m i l a r i t i e s and differences. mentioned by Schoenberg i n The operations  Most of these are  [16].  of d i f f e r e n t i a t i o n and i n t e g r a t i o n  respectively decrease and increase the degree of a spline function.  The family of spline functions of degree  knots (2.2)  i s closed with respect to forming l i n e a r combin-  ations of i t s elements with constant  coefficients.  k  and  These  properties are shared by the family of polynomials of degree i n f a c t , a spline function of degree a polynomial of degree  k  with  k  no knots i s  k .  Spline functions d i f f e r from polynomials i n t h e i r behaviour under the operation of f i n i t e d i f f e r e n c i n g .  It is  well known that the f i r s t order difference of a polynomial of degree  p  i s a polynomial of degree  p-1  difference of a spline function of degree another spline function of degree [12,  p. 75].  .  The  m  order  k , however, i s  k , as shown by Schoenberg  This f a c t should be kept i n mind when an attempt  i s made to determine the unknown degree of a spline f u n c t i o n of l e a s t squares f i t by the method of f i n i t e differences described by Kendall and Stuart [7,  p. 384],  the l i n e a r operation of d i f f e r e n c i n g applied (2.3)  In other words, k+1  r e s u l t s i n a spline function of the form  times to  10.  the polynomial part having been "differenced  out".  Although the problems and demands of i n t e r p o l a t i o n methods are i n many ways d i s t i n c t from those of l e a s t squares f i t t i n g , the two  techniques share some common ground.  following remark by G r e v i l l e [4, by spline functions  p, 53]  The  concerning i n t e r p o l a t i o n  indicates t h e i r usefulness i n that area  suggests s i m i l a r advantages for l e a s t squares f i t t i n g : By employing polynomials of r e l a t i v e l y , low degree, one can often avoid'the marked undulatory behaviour that commonly a r i s e s from f i t t i n g a single polynomial exactly to a large number of empirical observations. On the other hand, much greater smoothness i s obtained than with the t r a d i t i o n a l piecewise i n t e r p o l a t i o n procedures which give r i s e to d i s c o n t i n u i t i e s i n the f i r s t derivative of the i n t e r p o l a t i n g function.  The  reader i s referred to Theorems 1 and  2  and  Corollary 1 of Schoenberg's paper [16] f o r a summary of  the  s p e c i f i c properties of spline functions which give r i s e to their power as i n t e r p o l a t i n g functions. properties may  Other kinds of  be found i n [12, 1, 13, 14, 15,  We now  16,  proceed to consider the d e t a i l s of  the Gauss-Markoff Theorem to spline  functions.  5]. applying  and  11  CHAPTER I I  The central problem of the thesis concerns f i t t i n g a spline function to a given set of data points.  This can  be effected by the Gauss-Markoff Theorem provided the conditions of the theorem are s a t i s f i e d .  We consider f i r s t the  a p p l i c a b i l i t y of the theorem and secondly, discuss the rank of the design matrix. 1.  Applying the Gauss-Markoff Theorem Let us begin by formulating the problem i n the  language of the theorem.  Suppose we are given a set  {(x^y^)} , i = 0,1,...,n , of points i n the plane representing  n+>l  p a i r s of observations.  may assume that (SJL) that  To s i m p l i f y matters we  .  XQ<X^<««*<X  * i = 0,1,...,r , of r+1  Suppose f u r t h e r that a set  fixe'd r e a l numbers i s given, and  x <? <? <---<? <x . 0  0  1  r  n  We hypothesize the following l i n e a r model: (1.1) where  y  ±  S (x) fc  ?0>5i*.»«>?  n  = S (x ) + u k  ±  ±  ,  f o r i = 0,1,...,n ,  Is a spline function of degree > and  k  with knots  (u^) , i = 0,1,...,n , i s a sequence of  uncorrelated random v a r i a b l e s , each with zero expectation and the same variance  a  c  .  Chapter I , we may write  Using the representation (2,3) i n  12. k  for  r  i = 0,1,...,n .  In matrix notation t h i s set of equations may be written i n the form y_ = Pa + Sc + u where a  y_ i s the  i s the  (n+l)*l  (k+l)xl  coefficients;  £  column vector of unknown polynomial i s the  (n+l)x(n+l)  constant;  P  (r+l)xl  E(u) = 0  variables f o r which the  column vector of y-observations;  and E(uu') = a I , I  i d e n t i t y matrix,  i s the  and a  x  x-observations;  [ S ^ = ( x - 5j) ]  matrix  and S  (J-0,...,k) of i s the  of spline terms i n  +  i  being  , an unknown  (n+l)x(k+l) matrix [ P i j = ^  successive powers of the (n+l)x(r+l)  column vector of random  the x-observations. Denote by X by  £  the (k+r+2)xl  the (n+l)x(k+r+2) column vector  matrix  [a'jc']' .  [P|S] , and The observ-  ational equations ( l . l ) may now be written as i n Chapter I , Section 1:  (1.2)  £ = X£ + u . To f i x these ideas i n mind consider the following  i l l u s t r a t i o n , where  n =  J,  = 3 i t and £ = 6 & . 1  model i n t h i s case w i l l be  k =  2,  r =  1,  x  ±  =  i+1, i = 0,1,...,7,  Ttye matrix equation f o r the l i n e a r  13.  y  0  y  l  y y  k  y  5  y  6  y  7  1  1  0  0  1  2  4  0  0  a  l  3  9  o  0  a  2  1  4  16  1/4  0  c  O  1  5  25  9/V  0  c  l  1  6  36,.  25/4 •  0  1  7  49 .' 49/4''  1  8  64.^ 8 1 / 4 / 25/4  1  2  3  y  1  .  'o u  '0 a  +  1/4  u  l  U  2  U  3  u  4  U  5  u  6  U  7  •  Returning to the general case with observational equations (1.2), l e t us assume f o r the moment that least  k+r+1  and the rank of X  i s k+r+2 .  of the Gauss-Markoff Theorem are s a t i s f i e d since l i n e a r i n the unknown c o e f f i c i e n t s  n  i s at  The conditions s  a^ and C j .  k  (  x  )  is  Hence,  the minimum variance l i n e a r unbiased estimate of J3 i s given by | = (X'X)-Vv. , with variance-covariance matrix  p —1 a (X'X)~ .  I f we wish to obtain estimates f o r the polynomial c o e f f i c i e n t s and the spline c o e f f i c i e n t s separately, we can write  $ as •  A  «  a  £A  P'P  P'S ~  S'P  s's  »  P'jr  1  -  I  14.  Letting A - I - PfP'P)" ?' 1  and B = S'S - S P ( P ' P ) " P ' S = S'AS /  ,  1  and applying the partioned inversion r u l e , we obtain m  A  -(p'pr^-p'SB"  (p'pr^i+p'SB'Vpfp'p)"" ] 1  a A £  -B^S'PCP'P)"  1  Y  '  1  B-  A  1  ( ' )-lp'y_ + ( p / p ) - l p / - l / p ( p / p j - l ' P  P  S B  -B~ S'P(P'P)" P'v 1  1  (p/pj-lp/^.  i  +  s  p  zf  (p'pJ-VsB^S'y.  B^S'fc  _ (p'p)- p'SB" S A^ 1  1  /  B'VA^  Thus f a r i n the discussion we have assumed the nons i n g u l a r i t y of the  (k+r+2)x(k+r+2) matrix  X'X .  In the  next section we s h a l l examine the conditions under which t h i s assumption i s j u s t i f i e d and s h a l l show that these conditions do not greatly r e s t r i c t the scope of actual a p p l i c a t i o n s .  A far  more serious l i m i t a t i o n of the above procedure, however, i s the f a c t that i t requires a knowledge of the degree of the s p l i n e function to be f i t t e d , as wei^L as the number and l o c a t i o n of the knots.  Chapters I I I and IV attempt to throw l i g h t on  15.  these two problems.  2.  The Rank of the Design Matrix In t h i s section we discuss the design matrix  X  which appears In equation (1.2). Let let  k, r , and n  x <x <...<x Q  1  n  be f i x e d non-negative Integers and  and ?Q<§ <* • • - be given r e a l numbers. 1  r  Denote by X(k,r,n) the (n+l)x(k+r+2)  matrix  >  m  1  x  1  0  x  0  • • •• 0 < O-5o + ( 0"?l'+ x  \  4  V  4  x  x  • • • ( 0"?r'+ x  • • • i ( i-?o'+ < l-5l>£>x  x  <  Assuming that conditions on the x^ f u l l rank  )  • • < i-5 )+  x  ( n"?o)+ x  ( n-5l>+ x  x  r  • • •  ( n"5 )+ x  r  n >^ k+r+1 , we s h a l l look f o r and the §j under which  X(k,r,n) has  k+r+2 . We s h a l l need the following theorem proved by .  Schoenberg and Whitney [1>, p. 249]. 2.1, Theorem; x <x <...<x 0  1  r  D=  Let k  be a non-negative integer.  , and y <y <"'<y Q  —  1  (det U ^ y ^ ]  , then  r  ±  =  0  3  l  j—0 i l  •i i f and only i f the i n e q u a l i t i e s  s  t  3  _  r  <i »  \  >0  If  16.  x -k-l v v < y  for v = 0 , 1 , . . . , r  ,  < x  v  ,'  hold.  The proof w i l l he omitted. Recall that we are dealing with the set df observational equations ( 1 . 1 ) or, equivalently, ( 1 . 2 ) , where now S (x)  = X(k,r,n)£  k  ,  or k S  k( s) x  The polynomial  =  p  •  s  a  ±  0  . k i s j? ^ s" j^+ x  C  X  a  v b  k  ,  §  0  ( ) = x  k  +  S  a  i  where the y ' s are k+1  i  x  m  s  e  for s = 0 , 1 , . . •  written uniquely as  f i x e d r e a l numbers chosen i n any way  v  such that y <y <---<y <x 0  1  k  0  ,  and y  k  < ?  0  .  r  This follows from the well-known f a c t that given  k+1  numbers, we can express any polynomial of degree  k  i n terms of them. k<  =  x )  E  . Y (x-y ) J , v  V=0  y  y  k+l k+2  =  =  k+r+l  0 *  ?  ?  ?  1 >  r  for a l l x > y  ., K  I f we now l e t y  uniquely  But we may also write k  P  fixed  r  3  17.  S (x) k  maybe written as k+r+1 S (x) = E Y (*-y )+ k  k  where  Y  q  ,, =  v  J  C  f°  r  v  choose any subsequence of „  < „  0  a  XQJX.^,  k+r+2  *.. ,x  ,  k  Prom the ascendR  , arbitrarily  elements and relabel them as  <*  x  a  for a l l x > y  = 0,1,...,r .  ing sequence of given numbers '  x  >  v  l  a  k+r+l  If we now apply Theorem 2 . 1 , we see that the f i r s t k+1  inequalities i n the statement of the theorem are auto-  matically satisfied, i . e . , for v - o,l,...,k ,  y <x„ , v and the remaining r+1  x  „  < 1? i  °1  x  a  <  <  inequalities may be expressed as  x  «  °k+2  ?r x„ <  'W+l  r  Hence det [(x  - yjO  i = 0  ,l,...,k+r+ll > 3=0,1,...,k+r+1  i f and only i f the inequalities > a, X  hold.  < ?  i  < x  c a aj  f o r  i+k+l  i = 0,l,...,r  0  18.  This implies, however, that the  (n+l)x(k+r+2)  matrix Y(k,r,n) - [ ( X j - y i ) + ] , l , . . . , n j=0,l,...,k+r+l l a 0  has f u l l rank i f and only i f , f o r some sequence x„ <x„ <•'...<x„ 0 l °k+r+l a  chosen from  a  x ,x ,...,x Q  1  n  x„ <5.,<x„ °i °i+k+l  , the i n e q u a l i t i e s , f o r i = 0,1,...,r  ,  1  hold. Note that matrix  Y(k,r,n)  can be written as the p a r t i t i o n e d  [Q(k,n)|S(k,r,n)] where Q(k,n) = [ x - y j ) ] k  ±  , .. J=0,l,...,k  >•  :  l s a 0  1 > #  f n  and  S(k,r,n) - [ (  X j L  - Sj)J]  1 = 0 J L V  .,.,  N  •  j=0,l,...,r F i n a l l y , we show that there i s a non-singular!.matrix B  of order  k+r+2  such that  1  Y(k,r,n) = X(k,r,n)B  .  Having done t h i s , we can immediately conclude that the conditions necessary and s u f f i c i e n t for  X(k,r,n)  to have f u l l rank are  p r e c i s e l y those which ensure the same f o r Y(k,r,n) .  19.  Consider the two ways of w r i t i n g the polynomial P (x) k  :  P (x) = k  K  *  i  E a.x  1=0  ,  1  or P (x)  =  k  =  E v=0  S v=0  =  Y (x-y  )  v  v  K  v  Y E (-D (|)x -V , j=0 k  J  v  J  v  E [(-i) ( ) E J  k  J  j=0  v=0  v  Y y^]x k  J  v  v  v  Equating c o e f f i c i e n t s f o r the two forms, we see that k = (-l) .- ( ) E  a  k  Define the  J  k  (k+l)x(k+l)  Y y^  ,  d  v  matrix  W  for j = 0,l,...,k  .  to be  W = [w, ] where w  = (-l) " ( )yJj" Y k  jv  J  k  J  t = 0,1,. ..,1s. ;  v = 0,1,. ..,k .  We then see that a = Wx , which implies that  ,  Q(k,n)v. = P(k,n)a = P(k,n)Wy_ where  P(k,n)  i s the  (n+l)x(k+l)  matrix  expression must be true f o r a r b i t r a r y Q(k,n) = P(k,ii)W . Furthermore,  we can write  , [x^] .  Since t h i s  y_ i t follows that  20.  W  [Q(k,n)|S(k,r,n)] = [P(k,n)|S(k,r,n)]  • «  0 where  I  r  +  i a the i d e n t i t y matrix of order  1  r+1  r+1 , or,  Y(k,r,n) = X(k,r,n)B . I t remains only to' show that non-singular. the  We note f i r s t that  (k+l)x(k+l)  where  v  = y  i J  W  W , and hence  B , is  i s row-equivalent to  matrix  ,  k _ i  i , J = 0,1,...,k ,  since every element i n the a same non-zero constant  row of W  (-1) (*) .  But  i s m u l t i p l i e d by the • detV  i s a Vandermpnde  determinant of the form detV = ±~fr p,q=0 p>q  (y --y ) ,  s  which i s non-zero as long as the y ' s are d i s t i n c t . v  V  Hence,  i s non-singular and so i s W . Our r e s u l t s may be summarized i n the following  2.2 if,  Theorem:  The matrix  f o r some sequence  X(k,r,n)  x <x <»««<x °0 l k+r+l a  x  0  , x  l  , ,  *  , , x  x  hold.  n ' a  < 1  t  n  ?i  e  < x  * a  n  e  has f u l l rank i f any only  <  l  u  i + k + 1  a  l  i* ,  chosen from  a  l  e  s  •  for i = 0,l,...,r ,  21  Part of the proof of t h i s theorem i s based on an argument by Schoenberg and Whitney  [13,  p.  257].  To ensure that the design matrix has f u l l rank, one need only v e r i f y that the knots are located c o r r e c t l y among the  data abscissae.  These r e s t r i c t i o n s on the knots are not  highly stringent, p a r t i c u l a r l y i f the number of data points i s very large compared with the  k+r+1 ;  they merely require that  knots, i n some sense, are not too c l o s e l y squeezed together  at either end of the i n t e r v a l  (x,~,x ) . * u n  These conditions  might well be put into more useful form by c h a r a c t e r i s i n g the positions of the knots f o r which the design matrix f a i l s to have f u l l rank; not  however, an attempt to answer t h i s question w i l l  be made here.  i  f  22.  CHAPTER I I I  Thus f a r , we have assumed complete knowledge of the degree of the spline function to be f i t t e d and of the number and p o s i t i o n of the knots. ' I t i s not u n l i k e l y that s i t u a t i o n s w i l l arise i n applications where one or the other or both of these facts i s specified by the physical s i t u a t i o n .  Such  circumstances are by no means assured, however, and so we ' must turn to the problems of estimating the degree and the knots. In t h i s chapter we deal with problems of estimating the unknown degree of a l e a s t squares spline function,  first,  when f u l l knowledge of the knots i s a v a i l a b l e , and secondly, when neither the p o s i t i o n nor the number of knots i s known. This second s i t u a t i o n presents considerable d i f f i c u l t y and has not been completely solved. When both the degree and the knots must be estimated, the greater hope f o r success i s offered when the degree i s estimated f i r s t . 1.•  Estimating the Degree:  Knots Known  I t i s not u n r e a l i s t i c to expect that, i n a whole range of spline f i t t i n g problems, the number and l o c a t i o n of the knots w i l l be known, even though the degree may have t a be estimated.  One i n t e r p r e t a t i o n of t h i s s i t u a t i o n i s the  case In which d i f f e r e n t e f f e c t s on the dependent v a r i a b l e come  23. i n t o play i n d i f f e r e n t "regimes" of the data, the "cross-over" points being well-known constants. For  example, consider density as a function of  temperature f o r some given substance.  The three regimes  correspond to the s o l i d , l i q u i d , and gaseous states of the substance;  the melting point and b o i l i n g point may be chosen  as knots.  Chemists are f a m i l i a r with cases where d i f f e r e n t  physical mechanisms r e l a t e density to termperature i n each of the  three states. An analogous s i t u a t i o n a r i s e s i n Economics.  One  could argue that the forces which a f f e c t the shape of the Gross National Product growth curve i n peacetime;' are d i f f e r e n t from those which take e f f e c t i n wartime.  Here again one f i n d s a  s i t u a t i o n where the knots may be assumed to be known, namely, the  times corresponding to the outbreak and termination of war. In l i g h t of these considerations we are perhaps  j u s t i f i e d i n spending some time on a method f o r estimating the unknown degree of a spline function when complete knowledge of i t s knots i s a v a i l a b l e . Suppose we are given a set of r+1  n+1  data points and  knots and we wish to f i t a spline function of unknown  degree  k...  The method f o r estimating  k  i s s i m i l a r to the  technique applied to polynomials [7,pp. 92-95]. Choose a p o s i t i v e integer  w  to be the highest  degree reasonable upon inspection of the data.  At worst,  w  24.  w i l l be one- less than the number of data points.. Denote by the sum of squared residuals f o r the spline function D  and by  R ^ - ^  the quantity  u  to teat the n u l l hypothesis that than  s  ( ) x  u - 1  i  s  given by  F  l,n-u-r-l  •  T  n  e  R  u  S (x) , u  variance r a t i o used  S ( x ) f i t s the data no b e t t e r u  • u R /(n-u-r-l) D  Beginning with ratio . P  1 n  u  „  r -  =  u  *  u = w , and comparing each successive variance i y  f  o  r  u  = w,w,-l,w-2,...,  with the c o r r e s -  ponding tabled value of the P - d i s t r i b u t i o n with n-u-r-1 value  degrees of freedom, we terminate the process when a i\  function S  ic 1  1 and  ^ *  of u  i s found f o r which the corresponding spline  S£(x) gives a s i g n i f i c a n t l y b e t t e r f i t than '  f n en  u  m  b  e  i s then the estimate of the degree.  r  In the process, of course, the required c o e f f i c i e n t s are estimated as w e l l , producing the desired spline function. i  2.  Concerning Orthogonal  Polynomials.  At t h i s stage one might w e l l ask whether the method of orthogonal polynomials [7>PP» 95-99] may be adapted f o r use with spline functions.  Before attempting to answer t h i s , we  should r e c a l l that the estimation of the degree of a polynomial i s achieved, e s s e n t i a l l y , by t e s t i n g f o r s i g n i f i c a n c e the c o e f f i c i e n t of the highest power of the independent at each stage.  variable  This process i s f a c i l i t a t e d by the trans-  formation to orthogonal polynomials which allows the polynomial c o e f f i c i e n t s to remain unchanged from stage to stage.  Another  advantage of the method of orthogonal polynomials i s that i t  25  avoids the computational conditioning of the  d i f f i c u l t y r e s u l t i n g from the  X'X  matrix f o r polynomials  greater than or equal to  ill-  of degree  6 .  The argument for using orthogonal polynomials  for  spline functions i s not as strong as i t i s f o r polynomials. In the f i r s t place, the i l l - c o n d i t i o n i n g problem i s not so critical.  Secondly, estimation of the degree i s not equivalent  to testing the c o e f f i c i e n t of the term of highest power at each stage, since each of the spline part c o e f f i c i e n t s accompanies only terms to the highest power appearing i n the f u n c t i o n . I f a spline function of degree  k  and knots  5 <V••<«,. , 0  could be represented equivalently as  (2.1)  S (x) = k  k  '  V  11  S b.x 1=0 1  +  +  k  k c  A j  £ £ h=0 j=0  h)(x  "  so that spline terms of a l l degrees up to then we might be j u s t i f i e d i n attempting orthogonal polynomials.  k  would appear,  to use the method of  However, i t can be shown that such  an equivalent representation i s not p o s s i b l e . 2.2 knots  Theorem;  Por a s p l i n e function  5Q ?I * ' *.<§!."» <  <  S (x)' of order fc  representation  t n e  \ ± +./ *• ,„ - .§ , )xk* S.(x) = ' E ' a.x t .c,(x V • i=0 j=0 J J K  1  1  /  +  \ k  with  26 i s unique i n the sense that i t contains no terms of the form  (* - ? )+  w i t h  h  r  = 0,1,...,k-1  ; and the C j are unique  for the given spline function. Proof:  Suppose the spline function has an a l t e r n a t i v e .  representation of the form  (2.1).  In the i n t e r v a l  -«»<X<5Q , the spline part does not appear i n e i t h e r representation and so we must have  i = 0,1,...,k .  a^ = b^ f o r  In the i n t e r v a l  which implies that  c  n Q  ^ =0  _< x < §^ ,  %Q  f o r h = 0,...,k-l , and  c(*> =c 0 0 *. c  c  To perform the induction, suppose that, f o r a l l s =  0,1,...,v-1  , we have  0  c ^= n  Q  , for h -  0,1,...,k-1  (k) and  c  g  '- c  g  .  Then i n the i n t e r v a l  we must have E c (x - s ) = k  j=0  J  E  E  h=0 j=0  3  §  < x <§  v  +  1  c ( ) ( x -3 ) . h  J  v  h  ?  But, taking i n t o account the assumed conditions on the ci ^ n  c o e f f i c i e n t s , we see that c-fx -  ) = k  ?  v  v  E  h=0  cv ( ( x - 5v ) n )  As i n the f i r s t step, i t follows that h=0,l,...,k-l , and c^ =c^ .  .  n  c^ ^=0, n  for  This completes the  induction step and the proof. The question of the existence of orthogonal sets of spline functions remains an open one.  r  27.  j.  Estimating  the Degree:  Knots Unknown  When neither the l o c a t i o n nor the number of knots of a l e a s t squares spline function i s known, the introduced  representation  i n Chapter I i s of no use i n estimating  I t i s even questionable  the degree.  whether there i s always a spline function  of unique degree which gives the best l e a s t squares f i t when the knots are allowed to vary a r b i t r a r i l y .  A method employing  f i n i t e differences has been explored, however, and,  f o r the case  bf one p a r t i c u l a r spline function, the r e s u l t s suggest that a c a r e f u l refinement of the method might lead to a means of., accurately determining the unknown degree. The method was  suggested by a comment by T.N.E. G r e v i l l e  [4, pp. 220-221] concerning the behaviour of a c t u a r i a l data / under f i n i t e d i f f e r e n c i n g .  The a r t i c l e containing the comment  deals with i n t e r p o l a t i o n using functions which have since come to be regarded as spline functions.*  G r e v i l l e states that  columns of f i n i t e differences of a c t u a r i a l data often d i s p l a y the following properties:  the f i r s t column or two  ences, while containing marked i r r e g u l a r i t i e s , trend;  of  differ-  show a d i s c e r n i b l e  then follows a column with no trend but with an o v e r a l l ^  tendency toward either p o s i t i v e or negative elements;  following  t h i s i s a column with seemingly haphazard values, no pattern i n signs, but with about equal numbers of plus and minus signs; succeeding columns display an increasing tendency toward large numerical values and r e g u l a r l y a l t e r n a t i n g signs.  The  column  * G r e v i l l e ' s a r t i c l e was published i n 1944. The seminal a r t i c l e on spline functions [12] was written i n 1946 by Schoenberg who drew upon Greville»s work i n g e n e r a l i s i n g and making rigorous the concept of spline function.  28. with no trend and a predominance of plus or "minus signs i s the one which corresponds to the degree of the f u n c t i o n to be > fitted. In order to test the effectiveness of the procedure recommended by G r e v i l l e , sets of 50 data points generated i n the following way.  1  1  were  The x ' s were chosen i n i  ascending order from the i n t e r v a l 0.08  (x ,y )  [-2,2]  i n steps of roughly  The y^'s were computed using the formula  (3.1)  y = ( x + 1.82) ±  ±  k  + 5.4(  Xl  + 0.l9) - 13.9(x - 0.91) k  ±  + 8 . 5 ^ - 1.57) + u , f o r i = 1 , 2 , . . . ,50 . k  ±  In other words,  y^ represents the sum of the value a t x^  of a spline function of degree 0.91*  and 1.57  k  with knots a t -1.82, -0.19,  » and a randomly generated value  u^ of a  normally d i s t r i b u t e d random variable with mean zero and variance a  .  An IBM 7044 computer was used to compute a table of eight  divided differences f o r each set of data points.  Each d i f f e r -  ence column was tested f o r randomness using a sign t e s t and a • run test (for runs of plus and minus signs), and each was t e s t e d for trend.  For each column, the percentage excess of the  positive numerical values (the difference between the t o t a l absolute value of the p o s i t i v e elements and that of the negative elements expressed as a percentage of the t o t a l absolute value of the p o s i t i v e elements) was calculated.  Difference tables  were analysed f o r each of th'e error variances ;  a . - 0.10, 2  0.01,  and 0* 001- > f o r l e v e l s of s i g n i f i c a n c e f o r the sign, run and  29.';'; trend tests of  0.05  and  0.01  , and f o r degrees  To be f a i r to G r e v i l l e , we comment [4,  p. 228]  k = 1,...,5 •  should note h i s f u r t h e r  \  that "as i s usually the case with arm-  chair rules of procedure, the difference table c r i t e r i o n  ...  i s not always clear cut i n i t s p r a c t i c a l a p p l i c a t i o n " .  A  complete summary of the analyses of many difference  tables  showed a disagreement with G r e v i l l e ' s c r i t e r i o n oh two major counts.  F i r s t of a l l , for the difference column corresponding  to the correct degree, the sign t e s t indicated that there was... no predominance of one  sign.  Secondly, i n the columns  following the one associated with the correct degree, a trend was  indicated, contradicting the claim that here the values  should alternate i n sign.  I t should be pointed  data to which G r e v i l l e referred was  out that the  5  of a s p e c i a l nature,  consisting of such quantities as m o r t a l i t y rates which imply an underlying  sample of thousands of observations.  .  Further-  more, the s t a t i s t i c used, to detect trend f o r the computer  p •p  p  analyses was 6 /s , where 6 i s the mean square difference of the elements of a difference column, ,n  o  m-l m-l  .;/••.•'  ••  '  successive  •  m-  p and  s  i s t h e i r variance 8 2  =  =^i?i  C r i t i c a l values for- 6 / s 2  sensitive, tests f o r -.  -  ••  •..  :  €$6hd  i  ( d  V  a ) 2  were found i n  2  [4,  p. 25O].  More  might have y i e l d e d r e s u l t s which agree •  •  ..-  more c l o s e l y with G r e v i l l e ' s c r i t e r i o n .  v.  ....  .  •  ......  30.  Further study of the computer analyses f a i l e d to explain conclusively the divergence from G r e v i l l e ' s c r i t e r i o n or to y i e l d an a l t e r n a t i v e c r i t e r i o n which would c o n s i s t e n t l y select the correct degree.  However, c e r t a i n consistencies were  noted; • these we s h a l l summarize b r i e f l y . The percentage p o s i t i v e excess was  close to 100  for the f i r s t column of differences, maintaining almost  this  l e v e l up to the 'degree column', or else the one preceding i t , where i t decreased by at l e a s t one-half;  i n subsequent columns  i t showed a general tendency to o s c i l l a t e but to increase slowly and i r r e g u l a r l y i n absolute value.  In almost a l l  cases examined, the columns including and f o l l o w i n g the 'degree column' displayed consistent r e s u l t s f o r the sign, run and trend tests:  each column indicated a roughly equal d i s t r i b u t e .  ion of plus and minus signs, a preponderance of runs almost to the extent of a l t e r n a t i n g signs, and a s i g n i f i c a n t trend. the case where the degree, k 2 variance  a  to be  0.001  was taken to be  5  In  and the e r r o r  , these r e s u l t s f o r the sign, run  and trend tests occurred with high frequency f o r the 'degree column'. . For larger values of the error variance, the post• degree column' phenomenon was  observed to begin a t e a r l i e r  columns, perhaps i n d i c a t i n g an increased contamination, In some sense, of the lower order d i f f e r e n c e s . No claim is.made that these observations support sound conclusions about the b/ehaviour of general least-squares spline functions under f i n i t e d i f f e r e n c i n g .  I t i s conjectured^  however, that some sort of consistent behaviour does occur which  31.  may b e i n f l u e n c e d , b u t n o t o b s c u r e d ,  by v a r i a t i o n s i nt h e  d e g r e e , t h e number a n d l o c a t i o n o f t h e k n o t s , a n d , b e l o w upper bound, t h e s i z e o f t h e e r r o r v a r i a n c e .  some  The r e s u l t s  d e s c r i b e d a b o v e , i t i s h o p e d , may r a i s e f u r t h e r q u e s t i o n s about t h e f i n i t e d i f f e r e n c e method f o r d e t e r m i n i n g  the degree,  a n d p r o v i d e some h e l p I n f o r m u l a t i o n \ o f a s e n s i t i v e  criterion.  CHAPTER  IV  We now turn to the problem of estimating the unknown knots of a spline function.  In the l i t e r a t u r e of the  l a s t ten years one can f i n d a growing i n t e r e s t i n the problem. ; of estimating segmented regression curves and t h e i r j o i n points.  We s h a l l examine some of the techniques developed  to date i n an e f f o r t to propose a method f o r estimating the knots of a spline function of known degree;  when the degree  i s unknown t h i s approach becomes v i r t u a l l y unmanageable. In Section 2 a d i f f e r e n t sort of procedure i s suggested and an example of i t s a p p l i c a t i o n i s discussed.  1.  Estimating the Knots:  Degree Known  There are two cases here which w i l l be discussed separately:  f i r s t , the case f o r which the number of knots i s  known, and secondly, the one which requires the number of knots to be estimated, as well as t h e i r l o c a t i o n . When the number of knots i s known, f u l l use can be made of the spline representation formula (2.3) of Chapter I. However, i t i s clear that t h i s function i s not l i n e a r i n the knots, and so the Gauss-Markoff Theorem cannot be used to estimate them.  While Quandtf [9], Robison'[11], and Hudson [6]  do not have the advantage of the spline representation,,a b r i e f  33.  survey of t h e i r techniques may  c i a r i f y the s p e c i f i c problems  one encounters, and suggest a general procedure. Quandt and Robison both use the maximum l i k e l i h o o d approach with i t s inherent r e s t r i c t i o n to normally data.  Quandt derives a formula f o r the maximum l i k e l i h o o d as  a function of data points to  distributed  t , where one regression l i n e i s f i t t e d to the ( x , y ) , . . . , ( x , y ^ ) , and another l i n e i s f i t t e d 1  1  y  ( t4.i>yt+i)>"'>( n n) * without requiring the l i n e s to x  x  intersect.  ,y  One then computes t h e i r maximum l i k e l i h o o d f o r a l l  possible values of  t  and selects that value corresponding to  the maximum maximorum. Robison recommends a maximum l i k e l i h o o d procedure f o r estimating the i n t e r s e c t i o n point of two polynomial curves, suggesting an e x p l i c i t formula to be used when the j o i n p o i n t i s known to l i e between two given data abscissae.  A search;  for. the maximum maximorum of the l i k e l i h o o d i s to be  carried  out, as i n Quandt's procedure, when i t i s not known between which two data abscissae the j o i n point l i e s .  Robison imposes  no r e s t r i c t i o n s on the order of d i f f e r e n t i a b i l i t y at the point where the two polynomials ,  intersect. .-  \  .  .  '  An analogous, but more exhaustive, and more general treatment i s g i v e n by Hudson, who appeals to least-squares !  techniques rather than those of maximum l i k e l i h o o d .  Limiting  h i s discussion, to the case of one j o i n , he i s o l a t e s three types of joins:  (a)  interval ( x ^ x ^ )  one at /a point i n some unknown open where the slope i s discontinuous;  '  3*.  (b)  a j o i n at one of the  be continuous;  and  (c)  where the slope may or may one at a point i n  the slope i s continuous.  ( ±> ±+i) x  x  not  where  To estimate a j o i n of type ( a ) ,  Hudson proves i t i s s u f f i c i e n t to minimize separately the two sums of squared residuals f o r each of the curve segments, imposing the continuity condition at the j o i n . (b), one selects the  x^  Por a j o i n of type  which corresponds to the minimum of  the minimized o v e r a l l sums of squared r e s i d u a l s , the c o n t i n u i t y constraint being d i f f e r e n t f o r each.  A process of successive  approximation, not e x p l i c i t l y described, i s recommended f o r a j o i n of type ( c ) . .'  Hudson also outlines an estimation procedure  f o r a j o i n of unknown type involving a selected combination of the three above procedures. Returning to the problem of knot estimation, f o r spline functions ( s t i l l assuming that the number of knots i s known), we can.see that the spline representation has i t s own  restrict-  ions .on continuity and d i f f e r e n t i a b i l i t y b u i l t i n t o the formula. Hence, Hudson's procedures f o r joins of type (a) , and type (b) with the slope discontinuous are of no use to us, except f o r ' spline functions of degree zero (step functions).  ,  Although i t  i s desirable to avoid the r e s t r i c t i o n to normally d i s t r i b u t e d data imposed by the use of maximum l i k e l i h o o d techniques, the methods of Quandt and Robison are i n s t r u c t i v e i n that they explore the d e t a i l s of a procedure of estimation which c o n s i s t s , e s s e n t i a l l y , of optimizing a quantity* under c e r t a i n r e s t r i c t i o n s , by a f i n i t e number of t r i a l s /  35.  Let us confine ourselves to the estimation of one knot, the procedure f o r several knots being much the same i n theory, but computationally more d i f f i c u l t .  One distinguishes  o  between knots which coincide with the data abscissae and those which l i e s t r i c t l y i n some i n t e r v a l  ( i> j. i) • x  x  +  '  I h e  f  i  r  s  t  type of knot may be handled,with comparative ease, p a r t i c u l a r l y i f one has the help of a computer.  In t h i s case one computes  a sequence of sums of squared r e s i d u a l s , one f o r each possible knot  x^  i n the sequence  extreme value  x  n  .  x ,...,x Q  n  , excluding, of course, the  (Note that a knot at  x  Q  i s equivalent  to no knot at a l l , since the spline function over the i n t e r v a l (x ,x ) Q  x^  n  w i l l then become a polynomial.)  The knot i s then the  f o r which the sum of squared residuals i s a minimum.  When  several knots of t h i s type are to be estimated, the choice of t h e i r possible values w i l l be r e s t r i c t e d by the conditions of Theorem 2.2, Chapter I I so that the design matrix f o r the l i n e a r model w i l l have f u l l rank. When the unknown knot i s s t r i c t l y contained between; , 1  two of the adjacent data abscissae, the problem becomes more difficult.  Such a knot corresponds to Hudson's j o i n of type  (c), f o r which he recommends:some form of successive approximation.  Suppose, to begin with, that one knows the i n t e r v a l  i n which the knot l i e s .  One could calculate values of the sum  of squared residuals as a function of the knot equally spaced values of  §  in  (  x i  *  x i + 1  5 , taking  ) •> with  A?  to the maximum permissible e r r o r i n the estimate of  equal  § .  The  r e s u l t i n g table of values would r e a d i l y d i s p l a y the value of  36  5  at which the minimum of the r e s i d u a l sum of squares  achieved.  was  I f one does not know between which two data  abscissae the knot l i e s , one proceeds i n a manner analogous to the one which l e a t t r i b u t e d to Robison above. When i t i s not known how many knots one must estimate, the computational d i f f i c u l t i e s expand tremendously; compute sums of squared residuals f o r a l l possible and numbers of knots (up to number of data points and  r = n-k , where  k , the degree).  n+1  one must combinations i s the  Before attempting  such an undertaking, one should f i r s t answer the question of whether a better f i t i s achieved by adding a d d i t i o n a l knots. I f the answer i s yes, then one avoids considerable labour by assuming the largest possible number of knots,  r = n-k .  The question remains open f o r disucssion. The comments i n t h i s section apply to spline functions for which the degree i s known.  Such knowledge of the degree  w i l l be available i n applications where the p h y s i c a l p r o p e r t i e s of the model specify either the degree of the polynomials between the knots, or else the order of d i f f e r e n t i a b i l i t y a t the joins.  2.  Non-Linear Estimation by Successive Approximation As was remarked e a r l i e r , the Gauss-Markoff  Theorem  cannot be applied to the problem of estimating the knots because a spline function i s not l i n e a r i n these parameters.  The  technique of non-linear least-squares f i t t i n g by successive approximation, however, does produce estimates f o r the  37 c o e f f i c i e n t s of some types of functions not l i n e a r i n the coefficients.  The procedure  i s described i n [3, pp. 267-269].  The success of t h i s technique i s by no means always assured as we s h a l l see In the case of a p a r t i c u l a r example. An exploratory t e s t of the method was performed on a known spline function.  The function (3«l) of Chapter I I I  was used, as before, to generate sets of f i f t y data points ( i x  , y  i)  o  n t  n  e i  n  t  e  r  v  a  l  [-2,2] , t h i s time with no r e s i d u a l  error component being added to the y^'s .  The successive k = 1,...,5 ,  approximation procedure was applied f o r degrees and f o r three sets of i n i t i a l s  estimates of the eight parameters  (four c o e f f i c i e n t s and four knots).  The maximum number of  i t e r a t i o n s was set at 20 and the maximum r e l a t i v e error permitted of the f i n a l estimates was 0.01  .  The values of  the i n i t i a l estimates are summarized i n the f o l l o w i n g t a b l e . . - true values  ;[-,'. i n i t i a l estimates I  • •  :S: ~ ./f:  1.0  •' 1.0  coefficients :  5.4  '  -13.9 -8.5 knots  *  ;  ..  II  ' , 1.0  ' 1.0  1.5 ,  III  2.0  6.2  1.0  2.0  1.0 '••,  2.5  10.0 •  -14.7  -i.82  -1.00  -1.72  -1.72  -0.19  0.00  -0.39  -0.39  0.91  1.00  0.71  0.71  1.57  1.99  1.47  1.47  \  The estimation method'failed to y i e l d r e s u l t s f o r Sets I and I I of i n i t i a l estimates.  Set I I I f a i l e d to y i e l d  58.:  f i n a l estimates f o r degrees 2,  4, and 5> out f o r degrees 1 and  3 the f i n a l estimates f o r the c o e f f i c i e n t s and the knots were exact a f t e r f i v e and s i x . I t e r a t i o n s , r e s p e c t i v e l y . In a l l cases where the method f a i l e d , the reason f o r the f a i l u r e was the s i n g u l a r i t y of the matrix  A'A /where  A  represents the matrix of p a r t i a l derivatives of the f u n c t i o n  S  *  W  - j i3 c  ( x  " J>+ 5  with respect to each of the parameters evaluated at each of the points Intermediate estimate the maximum  fj  A'A  x ,x ,... ,x^ 1  x^ , one whole column  to be singular.  2  of a knot  i = 1,2,..,,50 , of the matrix  c^,...,c^, f ^ , . . . , ? ^ ,  A  Q  .  Whenever an  was obtained which exceeded [-kCj(x  i  -  contained only zeros causing  At t h i s point the procedure broke down.  In case I, the procedure could not get past the f i r s t because i t turned out that none of the x^'s exceeded  iteration 1.99  .  In case I I , i t i s conjectured that the large error i n the i n i t i a l estimates of the c o e f f i c i e n t s was responsible f o r an intermediate estimate of some knot exceeding  2.00  .  It is  not at a l l clear why the i n i t i a l estimates I I I y i e l d e d r e s u l t s for  some values of the degree and not f o r others. This example was introduced p r i m a r i l y to e s t a b l i s h  that the method described can, i n c e r t a i n l i m i t e d instances a t l e a s t , estimate a spline f u n c t i o n of known degree with a given number of unknown knots.  Besides t h i s , one would l i k e to  formulate some questions which would lead to a deeper examination of  the e f f e c t i v e n e s s of the method f o r general s p l i n e f u n c t i o n s ;  39.  an analytic study, i f possible, combined with the methods of computer simulation.  Questions that a r i s e immediately  concern  s e n s i t i v i t y of the method to variations i n the degree, the number and l o c a t i o n of th© knots, the maximum deeired r e l a t i v e error i n the f i n a l estimates, the p r e c i s i o n of the i n i t i a l estimates, and the variance,of the r e s i d u a l error.  Answers to  these questions w i l l provide a means of assessing the scope of the method i n p r a c t i c a l a p p l i c a t i o n s . I f the method were discovered to have s u f f i c i e n t f l e x i b i l i t y , one would then be interested i n how  to obtain  i n i t i a l estimates of the parameters with the required degree of accuracy.  The further problem of the degree of the spline  function being unknown could be solved, f o r the time being at l e a s t , by computing the goodness of f i t f o r a l l reasonable degrees, and selecting the degree which gives the best f i t . The unknown number of knots s i t u a t i o n might be handled i n a similar way using a s l i g h t l y d i f f e r e n t c r i t e r i o n i n place of goodness of f i t , because of the v a r i a t i o n , at each stage i n the number of parameters being estimated. This non-linear successive approximation method f o r spline functions, i f i t works, has an added advantage.  A  very rapid computer subroutine c a l l e d LQF has been w r i t t e n which accommodates the method.  One version of t h i s routine  i s available on f i l e a t the U n i v e r s i t y of B r i t i s h Columbia Computing Centre.  40.  BIBLIOGRAPHY  [1]  H. B. C u r r y and I . J . Schoenberg, On P o l y a F r e q u e n c y F u n c t i o n s I V : The S p l i n e F u n c t i o n s and T h e i r L i m i t s , B u l l e t i n o f t h e A m e r i c a n M a t h e m a t i c a l S o c i e t y , 53, (1947), A b s t r a c t 380t, p. 1114.  [2]  Edwin L. Crow, F r a n c e s A. D a v i s , and M a r g a r e t W. M a x f i e l d , . Statistics York,  [3]  Manual, Dover P u b l i c a t i o n s , I n c . , New  I960.  N. R. Draper and H. S m i t h , A p p l i e d R e g r e s s i o n A n a l y s i s , 1966.  John W i l e y and Sons, I n c . , New Y o r k , [4]  Thomas N. E. G r e v i l l e ,  The G e n e r a l  Theory o f O s c u l a t o r y  I n t e r p o l a t i o n , the Transactions S o c i e t y o f America, [5]  Thomas N. E. G r e v i l l e ,  45 (1944),  of the A c t u a r i a l pp.  202-265.  Numerical Procedures f o r I n t e r - ,  p o l a t l o n by S p l i n e F u n c t i o n s , The SIAM J o u r n a l o f N u m e r i c a l A n a l y s i s , (B) [6]  1 (1964),  53-68.  Derek J . Hudson, F i t t i n g Segmented Curves Whose J o i n P o i n t s Have t o be E s t i m a t e d , Statistical Association,  [7]  pp.  J o u r n a l of the American  61 (1966),  pp.  1097-1129.  M a u r i c e G. K e n d a l l and A l a n S t u a r t , The Advanced T h e o r y o f S t a t i s t i c s , Volume 3, C h a r l e s G r i f f i n and Co. L t d . , . ,  London,  1966.  [8]  R. L. P l a c k e t t , P r i n c i p l e s o f R e g r e s s i o n A n a l y s i s , O x f o r d : A t t h e C l a r e n d o n P r e s s , i960.  [9]  R i c h a r d E. Quandt, The E s t i m a t i o n o f a L i n e a r R e g r e s s i o n System Obeying Two S e p a r a t e Regimes, J o u r n a l o f t h e A m e r i c a n S t a t i s t i c a l A s s o c i a t i o n , 53 (1958),  pp. 873-880. [10] C. R. Rao, L i n e a r S t a t i s t i c a l I n f e r e n c e and i t s A p p l i c a t i o n s j , John W i l e y and Son's, I n c . , New Y o r k , 1965. ;  41.  [11]  D. E. Robison, Estimates f o r the Points of I n t e r s e c t i o n of Two Polynomial Regressions, Journal o f the • American S t a t i s t i c a l Association, £9 (1964), pp. 214-224.  [12]  I. J . Schoenberg, Contribution to the Problem of Approximation of Equidistant Data by A n a l y t i c  Functions,  Quarterly of Applied Mathematics, 4 (1946), Part A,  pp. 45-99. [13]  I. J. Schoenberg and Anne Whitney, On Polya Frequency Function I I I :  The P o s i t i v i t y of T r a n s l a t i o n Determ-  inants with an Application to the I n t e r p o l a t i o n Problem by Spline Curves, Transactions  of the American  Mathematical Society, J_4 (1953), PP. 246-259[14] I. J . Schoenberg, Spline Functions, Convex Curves and Mechanical Quadrature, B u l l e t i n of the American Mathematical Society, 64 (1958), pp. 324-357. [15)  I. J. Schoenberg, Spline Interpolation and Best Quadrature Formulae, B u l l e t i n of the American Mathematical Society, 70 (1964  [16]  , pp. 145-148.  I. J . Schoenberg, "On Interpolation by Spline 'and  Functions  i t s Minimal Properties", International Series  Numerical Mathematics, 5; On Approximation Theory, Birkhauser  Verlag, Basel and S t u t t g a r t , 1964.  

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0080581/manifest

Comment

Related Items