UBC Theses and Dissertations

Analysis of ordered categorical data Chang, Janis 1988

Full Text

ANALYSIS OF ORDERED CATEGORICAL DATA

By Janis Chang
B.Sc. (Biochemistry), University of British Columbia, 1985

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE
in
THE FACULTY OF GRADUATE STUDIES
DEPARTMENT OF STATISTICS

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
October 1988
© Janis Chang, 1988

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Statistics
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1W5

Abstract

Methods of testing for a location shift between two populations in a longitudinal study are investigated when the data of interest are ordered, categorical and non-linear. A non-standard analysis involving modelling of data over time with transition probability matrices is discussed. Next, the relative efficiencies of statistics more frequently used for the analysis of such categorical data at a single time point are examined.
The Wilcoxon rank sum, McCullagh, and 2 sample t statistics are compared for the analysis of such cross sectional data using simulation and efficacy calculations. Simulation techniques are then utilized in comparing the stratified Wilcoxon, McCullagh and chi squared-type statistics in their efficiencies at detecting a location shift when the data are examined over two time points. The distribution of a chi squared-type statistic based on the simple contingency table constructed by merely noting whether a subject improved, stayed the same or deteriorated is derived. Applications of these methods and results to a data set of Multiple Sclerosis patients, some of whom were treated with interferon and some of whom received a placebo, are provided throughout the thesis and our findings are summarized in the last Chapter.

Table of Contents

Abstract
List of Tables
List of Figures
Acknowledgement
1 Introduction
2 Markov Analysis
    2.1 Tests of Order and Stationarity
        2.1.1 Tests of Order of Markov Chain
        2.1.2 Tests of Stationarity
    2.2 Modelling of the Tridiagonals
        2.2.1 Modelling Incorporating V^2
        2.2.2 Modelling Without V^2
3 Analysis At One Time Point
    3.1 Simulation
    3.2 Efficacy Calculations
        3.2.1 Efficacy of the Wilcoxon
        3.2.2 Efficacy of the T test
    3.3 Comparison of the Efficacies
4 Analysis at Two Time Points
    4.1 Description of the Simulation
        4.1.1 Stratified Wilcoxon
        4.1.2 Chi-squared Statistic
    4.2 Simulation Results
    4.3 Application of Tests to the MS Data
5 Conclusions
A Sample Transition Matrices
Bibliography

List of Tables

2.1 Comparison of Treatment and Control Groups
2.2 Tests of 2nd Order vs. 1st Order Markov Process
2.3 Tests of 1st Order vs. Independence
2.4 Results of Test for Stationarity
2.5 Treatment vs. Control (general tridiagonal modelling)
2.6 Fit of General Tridiagonal Models
2.7 Treatment vs. Control (general tridiagonal - no P^2)
2.8 Fit of General Model - no P^2
2.9 Comparison of Specific and General Models (treatment)
2.10 Comparison of Specific and General Models
4.1 Results of McCullagh Analysis

List of Figures

3.1 Simulation of Data at One Time Point
4.1 Simulation Run 1 (β = 0.1, p = 0.8)
4.2 Simulation Run 2 (β = 0.1, p = 0.6)
4.3 Simulation Run 3 (β = 0.3, p = 0.8)
4.4 Simulation Run 4 (β = 0.3, p = 0.6)

Acknowledgement

I would like to thank my supervisor, Dr. N.E. Heckman, for her continual guidance, support and many invaluable suggestions. Also, I am grateful to Dr. A.J. Petkau for advice and careful reading of this work. In addition, the encouragement of Dr. J. Berkowitz and expert computing advice of Peter Schumacher are gratefully acknowledged. I also wish to thank the MS Clinic at the UBC Hospital for their support. This research was funded by the Multiple Sclerosis Society of Canada through the UBC MS Clinic.

Chapter 1

Introduction

One hundred Multiple Sclerosis patients from the patient population of the MS Clinic at the UBC Hospital participated in a randomized double-blind clinical trial to determine the effectiveness of treatment with interferon. Multiple Sclerosis is a progressive disease which attacks the nervous system and often results in loss of vision, motor coordination and/or sensory perception. The severity of the symptoms varies among patients. The subjects in this study are chronic progressive, that is, their condition deteriorates progressively over time.
In this trial, fifty patients were assigned randomly to each of the control and treatment groups. Subjects in the control group were given injections of a placebo and those in the treatment group were given injections of interferon for six months. The patients were monitored during the six months of treatment and for eighteen months of follow-up subsequent to the termination of the treatment.

No standard quantitative method of measuring the level of the disease exists. In this study, measurements of symptoms such as mobility or numbness were used to assess the severity of the subject's condition and this information was used to produce the Kurtzke extended disability status scale (EDSS). The Kurtzke EDSS, referred to here as the Kurtzke score, was chosen as the means of tracing the subjects' conditions over time. The Kurtzke score is ordered and categorical, taking on values of 0 (normal) to 10 (dead) in increments of 0.5. It is also nonlinear, so that, for example, a change in score of 1 to 2 is not as severe as a change of 5 to 6. The nonlinearity and categorical nature of the scores make it inappropriate to treat them as continuous variables in the assessment of the extent of the disease.
In particular, any statistic which requires the assumption of normally distributed observations (such as the 2 sample t statistic) is not suitable for analyzing this type of data. On the other hand, standard categorical data analysis could be done ignoring the ordinal nature of the data. This type of analysis is not appropriate here as information on the degree of improvement or deterioration of the patient's condition would be lost.

One method of analysis, described in Chapter 2, is to consider the categories as states and the movements of the subjects from category to category over time as transitions from state to state. The treatment and control groups were modelled by different transition probability matrices which were compared to determine the effect of the interferon injections. Models assuming stationarity, the Markov property and other restrictions on the transition probabilities were fit to the data to determine whether a simple matrix could be used to describe the transitions of the patients.

In Chapter 3, the Wilcoxon statistic, McCullagh model and 2 sample t test were compared to assess their relative efficiency in detecting a location shift between the two groups when the scores are assumed to have an underlying continuous distribution.
The McCullagh model, a modification of the logistic regression model, incorporates the ordinal nature of the scores. In this analysis the data at only one time point was used. The comparison of the tests was based on asymptotic efficacy calculations and simulations.

In Chapter 4, the stratified Wilcoxon, McCullagh and chi squared-type statistics were compared using simulation when analyzing the data between two time points. The chi squared-type statistic calculated was the usual analysis involving contingency tables where the cells were the number of patients in treatment and control who improved, stayed the same or worsened. The distribution of this statistic under the hypothesis of no difference between the groups is discussed.

Chapter 2

Markov Analysis

Since patients in this study moved from state to state over time, the movements from one state to another were modelled for each group using transition probability matrices. Comparison of the two groups then involved analyzing the matrices estimated for each group. This approach does not require the assumption of a parametric form for the transition probabilities; however, some models assuming a specific form for the probabilities were fit to the data to determine if the number of parameters required to describe the data could be reduced.

Initially, the data was analyzed in its original form.
Transition probability matrices were produced for each group (treatment and control) from 0 months to each of the other time points (1, 3, 6, 9, 12, 18, 24 months). Most observations were on or near the diagonals, indicating that the patients' scores did not change much from one time period to the next. As the dimensions of the matrices were 21 x 21 and there were only fifty patients in each group, many cells were empty. For this reason, the Kurtzke scores were grouped into five categories, chosen to ensure at least four people in each category at 0 months. The categories were scores of 0-4, 4.5-5.5, 6.0, 6.5, and 7-10, with 0-4 becoming state 1, 4.5-5.5 becoming state 2, etc. Transition matrices of the patients from 0 months to 24 months are displayed in Appendix A. All further calculations in this chapter are based upon this collapsed data.

In Section 2.1, likelihood ratio statistics were used to determine if a second order Markov structure fit appreciably better than a first order model. A test for stationarity was also applied to the data. Some modelling of the data in the tridiagonal positions is discussed in Section 2.2.

2.1 Tests of Order and Stationarity

The following tests were based on those described by Bhat [1].
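
The collapsing of Kurtzke scores into five states described earlier in this chapter, and the tabulation of transition counts on which the tests of this section rely, can be sketched as follows. This is a minimal illustration and not the code used in the thesis: the function names and the six example patients are hypothetical, and only the five score ranges are taken from the text.

```python
def kurtzke_state(score):
    """Map a Kurtzke score (0 to 10 in steps of 0.5) to a collapsed state 1..5."""
    doubled = score * 2
    if doubled != int(doubled) or not 0 <= score <= 10:
        raise ValueError(f"not a valid Kurtzke score: {score}")
    if score <= 4.0:
        return 1  # scores 0-4
    if score <= 5.5:
        return 2  # scores 4.5-5.5
    if score == 6.0:
        return 3
    if score == 6.5:
        return 4
    return 5      # scores 7-10


def transition_counts(scores_start, scores_end, n_states=5):
    """Count moves from the state at the first time point to the state at the second."""
    counts = [[0] * n_states for _ in range(n_states)]
    for start, end in zip(scores_start, scores_end):
        counts[kurtzke_state(start) - 1][kurtzke_state(end) - 1] += 1
    return counts


def transition_estimates(counts):
    """Row proportions n_ij / n_i. (rows with no observations are left at zero)."""
    estimates = []
    for row in counts:
        total = sum(row)
        estimates.append([c / total if total else 0.0 for c in row])
    return estimates


# Hypothetical scores for six patients at 0 and 24 months (not thesis data).
baseline = [3.0, 4.5, 6.0, 6.0, 6.5, 8.0]
followup = [3.5, 6.0, 6.0, 6.5, 6.5, 8.0]
counts = transition_counts(baseline, followup)
estimates = transition_estimates(counts)
```

With only fifty patients per group, many of the 25 cells of such a matrix can still be empty, which is why the degrees of freedom of the tests below are adjusted for zero entries.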
Some notation used in the remainder of this chapter is:

z = number of states
t = time points studied, t = 0, 1, ..., 7 corresponding to the times 0, 1, 3, 6, 9, 12, 18, 24 months
X_t = Kurtzke score at time t.

Using the collapsed data, transition probability matrices were calculated for transitions from 0 months to each of the other time points studied. Maximum likelihood estimates for $P^T$ and $P^C$, the transition matrices of the treatment and control groups, were computed, as well as for $P$, the transition matrix under $H_0 : P^T = P^C$. A likelihood ratio statistic was computed separately for each time point t = 1, ..., 7 to test whether the two groups' transition probabilities were similar.

Under the null hypothesis that the two groups can be modelled using the same transition probability matrix, it can be shown [6] that the log likelihood function $\ln L(P)$ is:

\[ \ln L(P) = B + \sum_{i=1}^{z} \sum_{j=1}^{z} \left( n^T_{ij} + n^C_{ij} \right) \ln p_{ij} \]

where

$B$ = a term independent of the $p_{ij}$'s
$n^T_{ij}$ = number of observations where $X_t = j$ and $X_0 = i$ in the treatment group
$n^C_{ij}$ = number of observations in the control group where $X_t = j$ and $X_0 = i$
$p_{ij} = \Pr[X_t = j \mid X_0 = i]$.

Under the alternate hypothesis, $H_1 : P^T \neq P^C$, the log likelihood function is:

\[ \ln L(P^C, P^T) = B + \sum_{i=1}^{z} \sum_{j=1}^{z} \left( n^T_{ij} \ln p^T_{ij} + n^C_{ij} \ln p^C_{ij} \right) \]

where

$p^T_{ij} = \Pr[X_t = j \mid X_0 = i$ in treatment group$]$
$p^C_{ij} = \Pr[X_t = j \mid X_0 = i$ in control group$]$.

The log likelihood statistic, $G^2$, is then

\[ G^2 = -2 \left[ \ln L(\hat{P}) - \ln L(\hat{P}^C, \hat{P}^T) \right] = 2 \sum_{i=1}^{z} \sum_{j=1}^{z} \left[ \left( n^T_{ij} \ln \hat{p}^T_{ij} + n^C_{ij} \ln \hat{p}^C_{ij} \right) - \left( n^T_{ij} + n^C_{ij} \right) \ln \hat{p}_{ij} \right] \]

where all the probabilities were estimated using maximum likelihood techniques:

$\ln L(\hat{P})$ = the log likelihood at $\hat{P}$
$\hat{p}_{ij} = (n^T_{ij} + n^C_{ij}) / (n^T_{i\cdot} + n^C_{i\cdot})$
$\hat{p}^T_{ij} = n^T_{ij} / n^T_{i\cdot}$
$\hat{p}^C_{ij} = n^C_{ij} / n^C_{i\cdot}$

with $n^T_{i\cdot} = \sum_j n^T_{ij}$ and $n^C_{i\cdot} = \sum_j n^C_{ij}$ denoting the row totals.

Under the null hypothesis, $G^2$ has an asymptotic chi-squared distribution with degrees of freedom equal to $z(z-1) - y_p$, where $y_p$ is the number of zero entries in $\hat{P}$.

Pearson's chi squared statistic, $X^2$, was also calculated:

\[ X^2 = \sum_{i=1}^{z} \sum_{j=1}^{z} \left[ \frac{(n^T_{ij} - m^T_{ij})^2}{m^T_{ij}} + \frac{(n^C_{ij} - m^C_{ij})^2}{m^C_{ij}} \right] \qquad (2.3) \]

where $m^T_{ij} = n^T_{i\cdot} \hat{p}_{ij}$ and $m^C_{ij} = n^C_{i\cdot} \hat{p}_{ij}$ are the expected numbers of observations in the treatment and control groups who moved from state i to state j between time 0 and time t. Results from the seven tests are given in Table 2.1 and the transition matrices between time 0 and 24 months are displayed in Appendix A. The p-values reported were based on the $\chi^2$ approximation.

The p-values were smallest (although not significant) at six and nine months, which is when the treatment was discontinued. This suggests that the treatment may have had an effect at this time, which wore off by eighteen months. Since the statistics calculated only measure absolute differences, it was not possible to determine from them whether the treatment group did better or worse than the control group.
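
The two-group comparison above can be sketched in a few lines. The following is an illustrative implementation under stated assumptions, not the thesis's own program: the function name and the small 2 x 2 example counts are hypothetical, cells whose pooled estimate is zero are skipped (they contribute nothing to either statistic), and the degrees of freedom follow the rule quoted in the text, z(z-1) minus the number of zero entries in the pooled estimate.

```python
import math


def two_group_statistics(n_t, n_c):
    """Return (G2, X2, df) comparing two z x z matrices of transition counts."""
    z = len(n_t)
    g2 = 0.0
    x2 = 0.0
    zero_entries = 0
    for i in range(z):
        total_t = sum(n_t[i])
        total_c = sum(n_c[i])
        for j in range(z):
            pooled = n_t[i][j] + n_c[i][j]
            if pooled == 0:
                zero_entries += 1  # pooled estimate of p_ij is zero
                continue
            p_hat = pooled / (total_t + total_c)
            for n, total in ((n_t[i][j], total_t), (n_c[i][j], total_c)):
                expected = total * p_hat  # m_ij under the null hypothesis
                if expected == 0:
                    continue              # this group has no subjects in row i
                if n > 0:
                    g2 += 2.0 * n * math.log(n / expected)
                x2 += (n - expected) ** 2 / expected
    df = z * (z - 1) - zero_entries
    return g2, x2, df


# Hypothetical counts for a two-state example (not thesis data).
treatment = [[8, 2], [3, 7]]
control = [[5, 5], [3, 7]]
g2, x2, df = two_group_statistics(treatment, control)
```

The term `n * log(n / expected)` is the same quantity as $n_{ij} \ln(\hat{p}^T_{ij} / \hat{p}_{ij})$ in the expression for $G^2$, since the expected count is the row total times the pooled estimate. A p-value would then be read from a chi-squared distribution with `df` degrees of freedom.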
No obvious trend could be seen from examination of the estimated transition matrices, but previous work [7] indicated that in fact the treatment group regressed (relative to the controls) from 0 to 6 months.

2.1.1 Tests of Order of Markov Chain

Second order and first order Markov chain models were fit to the data and the resultant estimates were compared. Each group was tested separately. A likelihood ratio statistic was used to test $H_0$: chain is 1st order Markov vs. $H_1$: chain is 2nd order Markov. This test was carried out separately for every three consecutive time points in the study, i.e. at 0-1-3 months, 1-3-6 months, etc. If the data appeared to act as a 2nd order Markov chain then that would imply that a patient's score would depend on both his score at the previous time point and the time point before that. In this problem, the $G^2$ statistic becomes:

\[ G^2 = 2 \sum_{i=1}^{z} \sum_{j=1}^{z} \sum_{k=1}^{z} n^t_{ijk} \ln \left( \frac{n^t_{ijk} \times n^t_{\cdot j \cdot}}{n^t_{ij\cdot} \times n^t_{\cdot jk}} \right) \]

where

$n^t_{ijk}$ = number of observations in which $X_{t-2} = i$, $X_{t-1} = j$, $X_t = k$
$n^t_{\cdot jk}$ = number of observations in which $X_{t-1} = j$, $X_t = k$
$n^t_{\cdot j \cdot} = \sum_{k=1}^{z} n^t_{\cdot jk}$
$n^t_{ij\cdot} = \sum_{k=1}^{z} n^t_{ijk}$
$t = 2, \ldots, 7$.

$G^2$ has an approximate chi squared distribution with $\sum_{j=1}^{z} (z - r_j - 1) \times (z - c_j - 1)$ degrees of freedom, where z is the number of categories and $c_j$ and $r_j$ are the number of zero rows and columns respectively in the two dimensional transition matrix consisting of the transition probabilities $p_{ijk}$ ($i = 1, \ldots, z$, j is fixed and $k = 1, \ldots, z$).

The results from these tests are in Table 2.2. Only the statistic for the treatment group in the time interval six to twelve months was significant at an α level of 0.05, which indicated that modelling with a 1st order Markov chain was reasonable for most of the time periods. The counts in these matrices were very low, so the statistics calculated may be misleading.

Similar tests were applied to the 1st order Markov transition matrices to determine whether the final response depended on the previous score (1st order Markov) or not (independence). The appropriate $G^2$ statistic becomes:

\[ G^2 = 2 \sum_{i=1}^{z} \sum_{j=1}^{z} n^t_{ij} \ln \left( \frac{n^t_{ij} \times n^t_{\cdot\cdot}}{n^t_{i\cdot} \times n^t_{\cdot j}} \right) \qquad (2.4) \]

where

$n^t_{ij}$ = number of observations in which $X_{t-1} = i$, $X_t = j$
$n^t_{i\cdot} = \sum_{j=1}^{z} n^t_{ij}$, $n^t_{\cdot j} = \sum_{i=1}^{z} n^t_{ij}$, $n^t_{\cdot\cdot} = \sum_{i=1}^{z} \sum_{j=1}^{z} n^t_{ij}$.

The statistics calculated using (2.4) are in Table 2.3. All of the statistics were very large in comparison with the degrees of freedom at all time points, which meant that the reduction from a 1st order Markov chain model to an independence model was not reasonable.

2.1.2 Tests of Stationarity

Assuming the data had the Markov property, another likelihood ratio statistic was computed, testing for stationarity. We assume that the same transition matrix could be used to describe the patients' movements between all of the time intervals measured, independent of the actual real-time length of that time interval. A transition matrix was estimated using data from all time points.
This was compared to the seven matrices estimated with the counts for every two consecutive time points in the study (0-1 month, 1-3 months, etc.). The $G^2$ statistic here takes on the form:

\[ G^2 = 2 \sum_{i=1}^{z} \sum_{j=1}^{z} \sum_{t=1}^{7} n^t_{ij} \ln \left( \frac{n^t_{ij} \times n_{i\cdot}}{n^t_{i\cdot} \times n_{ij}} \right) \]

where

$n^t_{ij}$ = number of observations in which $X_{t-1} = i$, $X_t = j$
$n^t_{i\cdot} = \sum_{j=1}^{z} n^t_{ij}$
$n_{ij} = \sum_{t=1}^{7} n^t_{ij}$
$n_{i\cdot} = \sum_{j=1}^{z} n_{ij}$.

The results are in Table 2.4. The statistics for both groups were nonsignificant (at α = 0.05), which implied that this model was not unreasonable for the time period studied.

Table 2.1: Comparison of Treatment and Control Groups

Time Period   G^2         p value   X^2         p value   degrees of
(months)      statistic   (G^2)     statistic   (X^2)     freedom
0-1           12.386      0.19      10.718      0.30       9
0-3           15.086      0.13      12.745      0.24      10
0-6           22.050      0.08      17.387      0.24      14
0-9           17.985      0.08      14.931      0.19      11
0-12          17.502      0.13      14.239      0.29      12
0-18          12.621      0.32      10.683      0.47      11
0-24          14.072      0.44      11.825      0.62      14

Table 2.2: Tests of 2nd Order vs. 1st Order Markov Process

Group       Time Period   G^2         degrees of   p-value
            (months)      statistic   freedom
Treatment   0-1-3         25.400      19           0.15
            1-3-6         22.141      16           0.14
            3-6-9         20.006      15           0.17
            6-9-12        21.161      11           0.03
            9-12-18        3.589      12           0.99
            12-18-24      29.795      24           0.19
Control     0-1-3         18.429      12           0.10
            1-3-6          5.997      12           0.92
            3-6-9         10.104       9           0.34
            6-9-12        19.101      13           0.12
            9-12-18       11.920      11           0.37
            12-18-24      11.251      17           0.84

2.2 Modelling of the Tridiagonals

Specific modelling of the tridiagonals was done as most transitions were made to neighbouring Kurtzke scores.
In this section, all transitions made to non-neighbouring states were ignored and transition probabilities not in a tridiagonal position were assumed to be zero; that is, $p_{ij} = 0$ if $|i - j| \geq 2$. In the following calculations, the transition matrices were assumed to be stationary and first order Markov. The observations on the tridiagonals were modelled first using a general form where each row had different entries, then using a more specific form which was suggested by the data.

In this section, the uneven spacing of the time points will be taken into account. The data at the one month time point were omitted so that the remaining time points (0, 3, 6, 9, 12, 18, 24) were separated by intervals which were multiples of three months. This allowed modelling of transitions over a three month period using the matrix V. Thus, transitions over a six month period could be modelled using V^2. To adjust for this omission, redefine t = 0, 1, ..., 6 corresponding to 0, 3, 6, 9, 12, 18, and 24 months.

Subsection 2.2.1 discusses modelling the tridiagonals using the matrices V and V^2 to describe transitions occurring over three and six month periods respectively. In Section 2.2.2, models are fitted assuming that the same transition matrix, P, can be used to model both three and six month periods.
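
To illustrate why a six month period is modelled by the square of the three month matrix, the sketch below squares a tridiagonal matrix. It is only an illustration: the entries of `three_month` are made-up values chosen so that each row sums to one, not estimates from the thesis, and `mat_mult` is a hypothetical helper.

```python
def mat_mult(a, b):
    """Plain matrix product of two square matrices stored as lists of rows."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


# Hypothetical three month tridiagonal transition matrix for five states.
three_month = [
    [0.8, 0.2, 0.0, 0.0, 0.0],
    [0.3, 0.5, 0.2, 0.0, 0.0],
    [0.0, 0.1, 0.8, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.8, 0.1],
    [0.0, 0.0, 0.0, 0.1, 0.9],
]

# Two consecutive three month steps give the six month transition matrix.
six_month = mat_mult(three_month, three_month)

for row in six_month:
    print([round(p, 3) for p in row])
```

The squared matrix is no longer tridiagonal: a subject can move two states in six months by making two one-state moves, so entries with $|i - j| = 2$ become positive while entries with $|i - j| \geq 3$ stay at zero.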
The entire time interval and that during treatment (0, 3, 6 months) and after treatment (6, 9, 12, 18, 24 months) were modelled separately so that the estimates could be compared.

Some other observations were omitted from the analysis because the transitions were larger than those allowed by the model. In this case, data from the subject in question was used until the violation occurred. A subject with a missing observation was dealt with similarly. Approximately 25% of the subjects in each group had a missing value at some time point after six months. Only six patients in the treatment group and three in the control group had transitions which were larger than those allowed by the model.

2.2.1 Modelling Incorporating V^2

Transition matrices for the treatment ($V^T$) and control ($V^C$) groups were estimated and compared to a matrix estimated under the null hypothesis $V^C = V^T$. The data were first modelled using the general matrix, V:

\[ V = \begin{pmatrix}
p_{11} & 1 - p_{11} & 0 & 0 & 0 \\
p_{21} & p_{22} & 1 - p_{21} - p_{22} & 0 & 0 \\
0 & p_{32} & p_{33} & 1 - p_{32} - p_{33} & 0 \\
0 & 0 & p_{43} & p_{44} & 1 - p_{43} - p_{44} \\
0 & 0 & 0 & 1 - p_{55} & p_{55}
\end{pmatrix} \]

The likelihood function L(V) was as follows:

\[ L(V) = K \times \prod_{i=1}^{z} \prod_{j=1}^{z} p_{ij}^{\,n_{ij}} \left( p^{(2)}_{ij} \right)^{N_{ij}} \]

where $p_{ij}$ was redefined as:

$p_{ij} = \Pr[X_t = j \mid X_{t-1} = i]$ for t = 1, 2, 3, 4 (the three month intervals)
$n_{ij} = \sum_{t=1}^{4} n^t_{ij}$

and

$p^{(2)}_{ij} = \Pr[X_t = j \mid X_{t-1} = i]$ for t = 5, 6 (the six month intervals), that is, the (i, j) entry of $V^2$
$N_{ij} = \sum_{t=5}^{6} n^t_{ij}$
$K$ = a constant independent of the $p_{ij}$'s.

To calculate the m.l.e.'s, it was necessary to solve eight nonlinear equations.
This was accomplished using Powell's method, described in [5]. The $G^2$ statistic calculated to compare the two groups was:

\[ G^2 = 2 \sum_{i=1}^{z} \sum_{j=1}^{z} \left[ n^T_{ij} \ln \frac{\hat{p}^T_{ij}}{\hat{p}_{ij}} + N^T_{ij} \ln \frac{\hat{p}^{T(2)}_{ij}}{\hat{p}^{(2)}_{ij}} + n^C_{ij} \ln \frac{\hat{p}^C_{ij}}{\hat{p}_{ij}} + N^C_{ij} \ln \frac{\hat{p}^{C(2)}_{ij}}{\hat{p}^{(2)}_{ij}} \right] \qquad (2.5) \]

where $n^T_{ij}$, $n^C_{ij}$, $N^T_{ij}$, $N^C_{ij}$ were defined to correspond to counts in the treatment and control groups in the obvious way (lower case for three month intervals, upper case for six month intervals), and a superscript (2) denotes the corresponding entry of the squared estimated matrix, which describes six month transitions. Results of the test are in Table 2.5.

In the time period during the treatment, $G^2$ was large relative to the degrees of freedom. The low p value (calculated using the $\chi^2$ approximation) indicated that, during this time period, the two groups' transition probability matrices were different. In the follow up period after treatment, and over the entire two year period, there was no evidence that the two groups behaved differently. $G^2$ statistics were calculated to determine how well the models fit the data (see Table 2.6). These statistics indicated that the fit of the model was reasonable at all of the time intervals modelled.

2.2.2 Modelling Without V^2

Since the "stationarity" test seemed to indicate that all consecutive intervals in the study could be modelled using the same matrix, the above analysis was repeated under this assumption with the data gathered at one month again omitted. The period during treatment did not include any six month intervals, so the estimates are the same as in the previous section. $G^2$, the log likelihood statistic comparing the two groups, was similar to that calculated in (2.3).
The results, displayed in Table 2.7, show that the groups could be modelled reasonably according to the same transition probability matrix after treatment. Over the two year period, the statistics indicated that the transition probability matrices for the two groups were different, due probably to the differences which were detected in the first six months of the study (see Table 2.5). $G^2$ statistics for the period after treatment and the whole time studied are displayed in Table 2.8. These show that the predictions from the models agreed reasonably well with the actual data.

After further examination of the data it was noted that most of the diagonal elements were similar. A pattern among the off diagonal elements was noticed also. To determine whether the data could be modelled with four parameters instead of the eight used above, the previous calculations were repeated, this time using a transition probability matrix suggested by the data, namely:

\[ \begin{pmatrix}
p_{11} & 1 - p_{11} & 0 & 0 & 0 \\
p_{21} & p_{22} & 1 - p_{21} - p_{22} & 0 & 0 \\
0 & p_{32} & p_{11} & 1 - p_{32} - p_{11} & 0 \\
0 & 0 & 1 - p_{32} - p_{11} & p_{11} & p_{32} \\
0 & 0 & 0 & 1 - p_{11} & p_{11}
\end{pmatrix} \]

The estimates computed for the specific and general tridiagonal models were compared for each group separately using $G^2$:

\[ G^2 = 2 \sum_{i=1}^{z} \sum_{j=1}^{z} n_{ij} \ln \left( \frac{\hat{p}^G_{ij}}{\hat{p}^S_{ij}} \right) \]

where

$n_{ij}$ = number of observed transitions from state i to state j over the intervals modelled
$\hat{p}^G_{ij}$ = m.l.e. of $p_{ij}$ under the hypothesis of a general tridiagonal matrix
$\hat{p}^S_{ij}$ = m.l.e. of $p_{ij}$ under the hypothesis of the specific tridiagonal matrix above.
The results are shown in Tables 2.9 and 2.10 and estimates for the specific and general matrices are provided in the Appendix. The statistics for the control group indicate that the general model does not fit significantly better (at an α level of 0.05) than the specific model over the two years studied. The specific matrix estimated for the treatment group, however, was only reasonable for the follow up period. As it was of interest whether the treatment and control groups could be modelled using the same matrix after treatment, this matrix was compared to that for the control group (over the two year period) and they were found to be significantly different. This difference may be due to the fact that the specific matrix for the control group does not fit the data particularly well. A chi-squared statistic for the goodness of fit was 6.056 (p = 0.195).

From an examination of the matrices, it appears that in the control group, those patients in states other than two were more likely to stay in that state. If they were in state two, however, they were more likely to move to state one. This was also true for patients in the treatment group after six months. During administration of the drug, treatment subjects were more likely to move to a higher state (i.e. to become sicker) than the control group.
All of these analyses indicate that the progress of the disease in the treatment group was not the same as in the control group in the first six months of the study. In the follow up period, there was no evidence to indicate that either of the groups was worse off than the other and over the entire time period studied there seemed to be little difference between the two. A model assuming the transition probability matrix was first order Markov and stationary seemed to fit the data reasonably well. The modelling of the tridiagonal elements revealed that a specific model in which all diagonal elements were the same except for the second appeared to fit the data. From the matrices estimated using this model, it seems that patients with scores of 4.5 to 5.5 are more likely to move to an adjacent score than patients in any of the other categories used. However, it should be noted that the collapsed categories used in this analysis were somewhat arbitrary and may produce misleading results. As well, many of the cells in the transition matrices tested were zero, which could produce statistics that do not reflect the data accurately.

Table 2.3: Tests of 1st Order vs. Independence

Group        Time Period (months)   G^2 statistic   degrees of freedom
Treatment    0 -> 1                 52.025          16
             1 -> 3                 42.851          16
             3 -> 6                 53.937          16
             6 -> 9                 46.017          16
             9 -> 12                59.955          16
             12 -> 18               51.898          16
             18 -> 24               49.019          16
Control      0 -> 1                 58.278          16
             1 -> 3                 44.748          16
             3 -> 6                 55.247          16
             6 -> 9                 45.811          16
             9 -> 12                52.629          16
             12 -> 18               51.747          16
             18 -> 24               63.797          16

Table 2.4: Results of Test for Stationarity

Group        G^2       df    p value
treatment    88.921    84    0.34
control      65.198    78    0.85

Table 2.5: Treatment vs. Control (general tridiagonal modelling)

Time (months)        G^2       p-value
0-3-6                19.250    0.01
6-9-12-18-24         4.283     0.83
0-3-6-9-12-18-24     11.415    0.18

Table 2.6: Fit of General Tridiagonal Models

Time (in months)     Treatment Group   degrees of freedom   P value   Control Group   degrees of freedom   P value
0-3-6                1.22              8                    0.99      3.73            8                    0.88
6-9-12-18-24         14.75             24                   0.93      25.91           24                   0.36
0-3-6-9-12-18-24     43.76             40                   0.32      42.80           40                   0.35

Table 2.7: Treatment vs. Control (general tridiagonal - no P2)

Time (in months)     G^2      p-value
6-9-12-18-24         3.456    0.484
0-3-6-9-12-18-24     9.642    0.047

Table 2.8: Fit of General Model - no P2

Time (in months)     Treatment Group   degrees of freedom   P value   Control Group   degrees of freedom   P value
6-9-12-18-24         16.74             8                    0.86      27.66           24                   0.27
0-3-6-9-12-18-24     44.08             40                   0.30      43.55           40                   0.32

Table 2.9: Comparison of Specific and General Models (treatment)

Time (in months)     Treatment G^2   degrees of freedom   p-value
0-3-6                11.815          4                    0.018
6-9-12-18-24         3.919           4                    0.417
0-3-6-9-12-18-24     14.7211         4                    0.005
Table 2.10: Comparison of Specific and General Models (control)

Time (in months)     Control G^2   degrees of freedom   p-value
0-3-6                6.531         3                    0.088
6-9-12-18-24         6.017         4                    0.197
0-3-6-9-12-18-24     5.565         4                    0.234

Chapter 3
Analysis At One Time Point

A typical analysis involves comparing the scores of the patients in the treatment and control group at one time point. Commonly used for this are the 2 sample t statistic (although inappropriate) and the Wilcoxon rank sum statistic. Here, these statistics were compared to a McCullagh model statistic described in [4]. The McCullagh model can be used to analyze data with ordered, categorical responses. The form used in this chapter is the proportional odds model:

\[
\log\left[\frac{\gamma_j(x)}{1-\gamma_j(x)}\right] = \frac{\theta_j - \beta^{T}x}{\kappa} \qquad (3.6)
\]

where

$j$ = 1, 2, ..., $z$
$z$ = total number of categories
$x$ = vector of covariates
$\gamma_j(x)$ = probability of being in category $j$ or lower given covariate $x$
$\theta_j$ = cutpoint $j$, the (unknown) point separating the categories $j$ and $j+1$
$\beta$ = a vector of unknown parameters
$\kappa$ = a scale parameter.

The relative efficiencies of the three tests were compared using simulation and efficacy calculations. Data for the treatment and control group were simulated by first generating normal data with different location parameters. This continuous data was used to create ordered categorical data according to a chosen set of cutpoints.
Tests for differences in the two groups were then carried out using the 2 sample t, Wilcoxon rank sum and McCullagh statistic. The 2 sample t test was also calculated on the uncategorized data so that a test which did not lose information due to the categorization could be compared to the others. These results are given in Section 3.1. Efficacy calculations were also made to determine how different the tests were asymptotically. The efficacies of the Wilcoxon rank sum test are calculated in Section 3.2.1, and for the t test calculated on the categorical data in Section 3.2.2. These efficacies are compared to the t test calculated on the underlying continuous data in Section 3.3.

3.1 Simulation

A simulation was run to compare the powers of the Wilcoxon rank sum statistic, both 2 sample t tests and McCullagh model estimates. The control and treatment group were simulated using $N(0,1)$ and $N(\Delta,1)$ distributions respectively to produce underlying continuous responses. These responses were then categorized with cutpoints chosen so that the probability of being in any category was 0.2 if the data came from a $N(0,1)$ distribution. Fifty observations (the same number as were in the Multiple Sclerosis data) were generated for each group. The simulations involved one thousand replications at each value of $\Delta$. The $\Delta$ values used were those from 0 to 1 in increments of 0.1.
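The data-generating step just described can be sketched as follows. This is a minimal illustration, not the program used in the thesis: the function names are hypothetical, and the random number generator and seed are arbitrary choices.

```python
import bisect
import random
from statistics import NormalDist


def equal_probability_cutpoints(z):
    """Cutpoints giving z equally likely categories under N(0, 1)."""
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf(j / z) for j in range(1, z)]


def categorize(value, cutpoints):
    """Return the category (1..z) of a continuous value given sorted cutpoints."""
    return bisect.bisect_right(cutpoints, value) + 1


def simulate_group(size, shift, cutpoints, rng):
    """Draw `size` N(shift, 1) responses and return their categories."""
    return [categorize(rng.gauss(shift, 1.0), cutpoints) for _ in range(size)]


rng = random.Random(1988)                      # arbitrary seed
cutpoints = equal_probability_cutpoints(5)     # five categories, probability 0.2 each
control = simulate_group(50, 0.0, cutpoints, rng)
treatment = simulate_group(50, 0.5, cutpoints, rng)   # Delta = 0.5 as an example
```

With five categories the cutpoints are roughly -0.84, -0.25, 0.25 and 0.84, the 20th, 40th, 60th and 80th percentiles of the standard normal distribution.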
The data were generated without conditioning on the number of observations in each category as this was the form of the Multiple Sclerosis data. Power curves were calculated for a one sided 0.05 level test of the null hypothesis $H_0: \Delta = 0$ vs. the alternate hypothesis $H_1: \Delta > 0$. The results of the simulation appear in Figure 3.1. As expected, the 2 sample t test on the underlying continuous data was the most powerful because no information was lost through collapsing the data.

Figure 3.1: Simulation of Data at One Time Point

The Wilcoxon rank sum statistic and the 2 sample t test on the categorized data were very similar at all points. For the results summarized in Figure 3.1, $\kappa$, the scale parameter for the McCullagh analysis, was set to one. The covariate $x$ was an indicator variable distinguishing the treatment from the control observations. In this case the test based on the McCullagh statistic was as powerful as the other tests on the categorized data. If $\kappa$ was included as a parameter depending on covariate $x$, it was not significantly different from 1 about 92% of the time. However, including this parameter decreased the power of the McCullagh statistic for values of $\Delta$ larger than 1.5.
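A power curve of the kind described can be approximated by Monte Carlo. The sketch below covers only the 2 sample t test on the uncategorized data. It is an illustration under stated assumptions: the helper names are hypothetical, the seed is arbitrary, and the critical value is taken from the standard normal distribution rather than the t distribution, which is a simplification for samples of fifty per group.

```python
import math
import random
from statistics import NormalDist


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance


def pooled_t(treatment, control):
    """Two sample t statistic with a pooled standard deviation."""
    n, m = len(treatment), len(control)
    mean_t, var_t = mean_and_variance(treatment)
    mean_c, var_c = mean_and_variance(control)
    pooled = ((n - 1) * var_t + (m - 1) * var_c) / (n + m - 2)
    return (mean_t - mean_c) / math.sqrt(pooled * (1.0 / n + 1.0 / m))


def estimate_power(delta, n_per_group=50, reps=1000, alpha=0.05, seed=0):
    """Fraction of replications in which H0: Delta = 0 is rejected (one sided)."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha)   # normal approximation (assumption)
    rejections = 0
    for _ in range(reps):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treatment = [rng.gauss(delta, 1.0) for _ in range(n_per_group)]
        if pooled_t(treatment, control) > critical:
            rejections += 1
    return rejections / reps
```

Calling `estimate_power(d / 10)` for `d` in `range(11)` would trace out a curve over the same grid of $\Delta$ values as the simulation; the tests on categorized data would need their own statistics in place of `pooled_t`.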
Differences in the simulated power between any of the models were not very large. The maximum difference was approximately 0.05 between the 2 sample t test calculated using the continuous data and the McCullagh model at $\Delta = 0.7$, where the power of the t test is about 0.95. Given that the standard error of this difference was approximately 0.02, the t test was significantly more powerful than the McCullagh model at this value of $\Delta$.

3.2 Efficacy Calculations

Another way of comparing the tests is to calculate their asymptotic relative efficiencies. The method of calculation used is that in Lehmann [3].

Let $V_N$, $V'_N$ be two sequences of statistics based on $N$ observations. Assume the distributions of both $V_N$ and $V'_N$ depend on a real valued parameter $\theta$. Let the null hypothesis be $\theta = \theta_0$ and the alternate hypothesis be $\theta > \theta_0$ as in the simulation. Define $\beta_N$ to be the power of the test which rejects $H_0$ if

\[
\frac{V_N - \mu(\theta_0)}{\sigma_N(\theta_0)} > c_N \qquad (3.7)
\]

and $\beta'_N$ to be the power of the test based on $V'_N$ which rejects $H_0$ if

\[
\frac{V'_N - \mu'(\theta_0)}{\sigma'_N(\theta_0)} > c'_N \qquad (3.8)
\]

where $\mu(\theta_0)$, $\sigma_N(\theta_0)$ and $\mu'(\theta_0)$, $\sigma'_N(\theta_0)$ are normalizing constants which may be the expectation and standard deviation of $V_N$ and $V'_N$ respectively, and $c_N$ and $c'_N$ are sequences of critical values.
Assume $\theta_N$ is a sequence of alternatives, converging to $\theta_0$ in such a way that $\theta_N = \theta_0 + \Delta/\sqrt{N}$. For most commonly used tests, this condition is sufficient to ensure that $\beta_N(\theta_N) \to \beta$, $0 < \beta < 1$. Define $N'$ to be the sample size required for $V'_{N'}$ to achieve the same limiting power as $V_N$ against the same sequence of alternatives $\theta_N$. If $c_N, c'_{N'} \to z_\alpha$ (the $(1-\alpha)$th quantile of the standard normal distribution) as $N \to \infty$, then the Pitman efficiency of $V_N$ relative to $V'_N$ is $\lim_{N\to\infty}(N'/N)$.

Suppose that whenever $\theta_N = \theta_0 + \Delta/\sqrt{N}$,

\[
\beta_N(\theta_N) \to \Phi(c\Delta - z_\alpha) \quad\text{and}\quad \beta'_N(\theta_N) \to \Phi(c'\Delta - z_\alpha).
\]

Then $c$ and $c'$ are called the efficacies of the tests based on $V_N$ and $V'_N$, and the Pitman efficiency of $V_N$ relative to $V'_N$ is $(c/c')^2$.

3.2.1 Efficacy of the Wilcoxon

Suppose the ordered categorical variable takes on values $g_1 < g_2 < \cdots < g_z$, where $z$ is the number of categories. In this analysis, observations tied in a category were assigned midranks, so the Wilcoxon rank sum statistic (with ties) was:

\[
W = T_1\frac{N_1+1}{2} + T_2\left(N_1 + \frac{N_2+1}{2}\right) + \cdots + T_z\left(\sum_{i=1}^{z-1} N_i + \frac{N_z+1}{2}\right) \qquad (3.9)
\]

where

$T_j$ = number of observations in the treatment group in category $j$
$C_j$ = number of observations in the control group in category $j$
and $N_j = T_j + C_j$.

Let

$q_j$ = Pr[in category $j$ | in control group]
$p_j$ = Pr[in category $j$ | in treatment group]
$N$ = total number of observations.
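Because every observation in a category receives the same midrank, the statistic in (3.9) depends only on the category counts. A minimal sketch, with a hypothetical function name:

```python
def wilcoxon_from_counts(treatment_counts, control_counts):
    """Sum of treatment midranks, computed from per-category counts.

    Category j holds N_j = T_j + C_j tied observations, which share the
    midrank (number of observations in lower categories) + (N_j + 1) / 2.
    """
    if len(treatment_counts) != len(control_counts):
        raise ValueError("both groups need counts for the same categories")
    w = 0.0
    below = 0  # observations in all lower categories
    for t_j, c_j in zip(treatment_counts, control_counts):
        n_j = t_j + c_j
        w += t_j * (below + (n_j + 1) / 2)
        below += n_j
    return w
```

For example, with treatment counts (2, 1) and control counts (1, 2) the two categories have midranks 2 and 5, so the function returns 2(2) + 1(5) = 9.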
The test based on the Wilcoxon involved normalizing $W$ as follows:

\[
\frac{W - E_0(W)}{\sqrt{\mathrm{var}_0^*(W)}}
\]

where $E_0(W)$ is the expectation of the Wilcoxon under $H_0: q_j = p_j\ \forall j$, and $\mathrm{var}_0^*(W)$ is the variance of the Wilcoxon under $H_0$ conditional on the number of tied observations in each category.

The power of the test at a specific alternative could have been calculated either unconditionally or conditionally on the tied observations. In this case, unconditional power was used as it simplified the efficacy calculations of the t test and eliminated the need to generate a fixed number of observations in each category for the simulations. The unconditional power of the test using the Wilcoxon statistic was:

\[
\beta_N(p,\delta) = \Pr\left[\frac{W - E_0(W)}{\sqrt{\mathrm{var}_0^*(W)}} > z_\alpha\right]
\]

where

\[
\mathrm{var}_0^*(W) = \frac{mn(N+1)}{12} - \frac{mn\sum_{j=1}^{z}(N_j^3 - N_j)}{12N(N-1)}
\]
\[
E_0(W) = \frac{n(N+1)}{2}
\]
\[
\delta = \sqrt{N}(q - p)
\]

$m$ = total number of observations in the control group
$n$ = total number of observations in the treatment group
$N$ = total number of observations in the study = $n + m$.

Theorem 3.2.1 Suppose $(T_1, T_2, \ldots, T_z) \sim$ multinomial$(n, p_1, p_2, \ldots, p_z)$ independent of $(C_1, C_2, \ldots, C_z) \sim$ multinomial$(m, q_1, q_2, \ldots, q_z)$ and $N \to \infty$ in such a way that $n/N \to a$, $m/N \to b$, $0 < a < 1$. Then, as $N \to \infty$,

\[
\beta_N(p,\delta) \to \Phi(f(p,\delta) - z_\alpha) \qquad (3.10)
\]

where

\[
f(p,\delta) = \frac{\sqrt{ab}\left[\sum_{j=1}^{z-1}\delta_j + \sum_{j=1}^{z-1}\sum_{i=j+1}^{z-1} p_i\delta_j - \sum_{j=1}^{z-1}\sum_{i=1}^{j-1} p_i\delta_j\right]}{\sqrt{\sum_{j=1}^{z-1} r_j^2 p_j - \sum_{i=1}^{z-1}\sum_{j=1}^{z-1} r_i r_j p_i p_j}}
\]
\[
r_j = 1 + \sum_{i=j+1}^{z-1} p_i - \sum_{i=1}^{j-1} p_i
\]

defining $\sum_{i=l+1}^{l} p_i$ to be 0 for any integer $l$.

In the simulation run of Section 3.1 $p_1 = p_2 = \cdots = p_z$
and the underlying continuous distribution was known. The following corollaries state the efficacies under those conditions and the proof of Theorem 3.2.1 follows.

Corollary 3.2.1 Assume that the multinomial vectors $T$ and $C$ are generated from $X_{T_1}, X_{T_2}, \ldots, X_{T_n} \sim$ iid $N(\Delta, 1)$ and $X_{C_1}, X_{C_2}, \ldots, X_{C_m} \sim$ iid $N(0, 1)$ in the usual way. That is, an observation is in category $i$ if and only if the $X$ value is between cutpoints $\theta_{i-1}$ and $\theta_i$, where $\theta_0 = -\infty$ and $\theta_z = \infty$. Then under the assumptions of Theorem 3.2.1, and with $\Delta = \delta/\sqrt{N}$,

\[
\beta_N(\Delta) \to \Phi(c\cdot\delta - z_\alpha)
\]

where

\[
c = \frac{\sqrt{ab}\left[\sum_{j=1}^{z-1} d_j + \sum_{j=1}^{z-1}\sum_{i=j+1}^{z-1} p_i d_j - \sum_{j=1}^{z-1}\sum_{i=1}^{j-1} p_i d_j\right]}{\sqrt{\sum_{j=1}^{z-1} r_j^2 p_j - \sum_{i=1}^{z-1}\sum_{j=1}^{z-1} r_i r_j p_i p_j}}
\]

$d_j = \phi(\theta_j) - \phi(\theta_{j-1})$ for $j = 1, \ldots, z-1$
$\phi(\theta_j)$ = probability density function of the standard normal distribution at $\theta_j$.

Corollary 3.2.2 In the case where $X_{T_1} - \Delta, X_{T_2} - \Delta, \ldots, X_{T_n} - \Delta$ and $X_{C_1}, X_{C_2}, \ldots, X_{C_m}$ are iid observations with a continuous density, $h$, and $p_1 = p_2 = \cdots = p_z$ then, under the assumptions of Theorem 3.2.1, the efficacy of the Wilcoxon becomes:

\[
\frac{\sqrt{ab}\sum_{j=1}^{z-1}(z-j)e_j}{\sqrt{(z^2-1)/12}}
\]

where

$\Delta = \delta/\sqrt{N}$
$e_j = h(\theta_j) - h(\theta_{j-1})$, $j = 1, 2, \ldots, z-1$
$\theta_j$ = cutpoint $j$.

The above results do not depend upon the scores associated with the categories.

Proof of Theorem 3.2.1:

\[
\beta_N(p,\delta) = \Pr\left[\frac{W - E_0(W)}{\sqrt{\mathrm{var}_0^*(W)}} > z_\alpha\right]
= \Pr\left[\frac{W - E(W)}{\sqrt{mnN}} > z_\alpha\sqrt{\frac{\mathrm{var}_0^*(W)}{mnN}} - \frac{E(W) - E_0(W)}{\sqrt{mnN}}\right]
\]

It will be shown that if $N \to \infty$ in such a way that $n/N \to a$, $m/N \to b$, then

1. $\dfrac{W - E(W)}{\sqrt{mnN}} \to N(0, \sigma_W^2)$ in distribution, where
\[
\sigma_W^2 = \frac{1}{4}\left[\sum_{j=1}^{z-1} r_j^2 p_j - \sum_{i=1}^{z-1}\sum_{j=1}^{z-1} r_i r_j p_i p_j\right], \qquad r_j = 1 + \sum_{i=j+1}^{z-1} p_i - \sum_{i=1}^{j-1} p_i
\]

2. $\dfrac{\mathrm{var}_0^*(W)}{mnN\sigma_W^2} \to 1$

3. $\dfrac{E(W) - E_0(W)}{\sqrt{mnN\sigma_W^2}} \to f(p,\delta)$.

1.
To simplify the calculations, substitute $m - \sum_{i=1}^{z-1} C_i$ for $C_z$ and $n - \sum_{i=1}^{z-1} T_i$ for $T_z$ in the expression for the Wilcoxon (3.9) to obtain:

\[
\frac{W - E(W)}{\sqrt{mnN}} = \frac{1}{2\sqrt{mnN}}\left[n\sum_{i=1}^{z-1}(C_i - mq_i) - m\sum_{j=1}^{z-1}(T_j - np_j) + \sum_{j=1}^{z-1}\sum_{i=1}^{j-1}(T_jC_i - mnp_jq_i) - \sum_{j=1}^{z-1}\sum_{i=j+1}^{z-1}(T_jC_i - mnp_jq_i)\right]
\]

Let

\[
C_j^* = \frac{C_j - mq_j}{\sqrt{m}}, \qquad T_i^* = \frac{T_i - np_i}{\sqrt{n}}.
\]

Then,

\[
\frac{W - E(W)}{\sqrt{mnN}} = \frac{1}{2}\left[\sqrt{\frac{n}{N}}\sum_{i=1}^{z-1} r_iC_i^* - \sqrt{\frac{m}{N}}\sum_{j=1}^{z-1} s_jT_j^*\right] + \frac{1}{2\sqrt{N}}\left[\sum_{j=1}^{z-1}\sum_{i=1}^{j-1} T_j^*C_i^* - \sum_{j=1}^{z-1}\sum_{i=j+1}^{z-1} T_j^*C_i^*\right].
\]

Since $T_j^*$, $C_j^*$ converge in distribution to normal random variables with mean 0 and variance $p_j(1 - p_j)$, as $N \to \infty$,

\[
\frac{1}{2\sqrt{N}}\left[\sum_{j=1}^{z-1}\sum_{i=1}^{j-1} T_j^*C_i^* - \sum_{j=1}^{z-1}\sum_{i=j+1}^{z-1} T_j^*C_i^*\right] \to 0 \text{ in probability.}
\]

Thus as $N \to \infty$, $\dfrac{W - E(W)}{\sqrt{mnN}}$ has the same limiting distribution as

\[
\frac{1}{2}\left[\sqrt{a}\sum_{i=1}^{z-1} r_iC_i^* - \sqrt{b}\sum_{j=1}^{z-1} s_jT_j^*\right]
\]

where

\[
r_i = 1 + \sum_{j=i+1}^{z-1} p_j - \sum_{j=1}^{i-1} p_j, \qquad s_i = 1 + \sum_{j=i+1}^{z-1} q_j - \sum_{j=1}^{i-1} q_j
\]

defining $\sum_{i=l+1}^{l} q_i = 0$ and $\sum_{i=l+1}^{l} p_i = 0$. Since $\mathrm{cov}(C_i^*, C_j^*) = -q_iq_j$ and $\mathrm{cov}(T_i^*, T_j^*) = -p_ip_j$, and $q_j - p_j = \delta_j/\sqrt{N} \to 0$, $\dfrac{W - E(W)}{\sqrt{mnN}}$ converges to a normally distributed random variable with mean 0 and variance

\[
\sigma_W^2 = \frac{1}{4}\left[\sum_{j=1}^{z-1} r_j^2 p_j - \sum_{i=1}^{z-1}\sum_{j=1}^{z-1} r_i r_j p_i p_j\right].
\]

2. Rewriting the ratio as

\[
\frac{\mathrm{var}_0^*(W)}{mnN\sigma_W^2} = \frac{\mathrm{var}_0^*(W)}{\mathrm{var}_0(W)}\cdot\frac{\mathrm{var}_0(W)}{mnN\sigma_W^2}
\]

it will be shown that both ratios on the right hand side converge to 1. For the first ratio,

\[
\mathrm{var}_0(W) = E_0[\mathrm{var}_0^*(W)] + \mathrm{var}_0[E_0^*(W)] = E_0[\mathrm{var}_0^*(W)]
\]

since $E_0^*(W)$ is a constant. So

\[
\frac{\mathrm{var}_0^*(W)}{mnN\sigma_W^2} = \frac{\mathrm{var}_0^*(W)}{E_0[\mathrm{var}_0^*(W)]}\cdot\frac{\mathrm{var}_0(W)}{mnN\sigma_W^2}.
\]

But

\[
\frac{1}{N^3}\mathrm{var}_0^*(W) = \frac{mn(N+1)}{12N^3} - \frac{mn\sum_{j=1}^{z}(N_j^3 - N_j)}{12N^4(N-1)} \to \frac{ab}{12}\left[1 - \sum_{j=1}^{z}(ap_j + bq_j)^3\right]
\]

as $N \to \infty$, and

\[
\frac{ab}{12}\left[1 - \sum_{j=1}^{z}(ap_j + bq_j)^3\right] = \frac{ab}{12}\left[1 - \sum_{j=1}^{z} p_j^3\right]
\]

under $H_0$. Thus, $\mathrm{var}_0^*(W)/N^3$ converges to a constant, say $y$, and is bounded and positive. Hence, $E_0[\mathrm{var}_0^*(W)/N^3]$ also converges to $y$, and so

\[
\frac{\mathrm{var}_0^*(W)}{E_0[\mathrm{var}_0^*(W)]} \to 1.
\]

Now we need to prove that

\[
\frac{\mathrm{var}_0(W)}{mnN\sigma_W^2} \to 1.
\]

Recalling that under $H_0$, $p_j = q_j\ \forall j$, and so $r_j = s_j\ \forall j$, it can be shown that

\[
\mathrm{var}_0\left(\frac{W - E(W)}{\sqrt{mnN}}\right) \to \frac{1}{4}\left[\sum_{j=1}^{z-1} r_j^2 p_j - \sum_{i=1}^{z-1}\sum_{j=1}^{z-1} r_i r_j p_i p_j\right] = \sigma_W^2.
\]

So

\[
\frac{\mathrm{var}_0(W)}{mnN\sigma_W^2} \to 1
\]

as $N \to \infty$.

3. Finally, using parts 1 and 2, the efficacy term can be shown to converge to a constant:

\[
\frac{E(W) - E_0(W)}{\sqrt{mnN\sigma_W^2}} \to f(p,\delta).
\]

So now we have that

\[
\beta_N(p,\delta) \to \Pr\left[N(0,1) < f(p,\delta) - z_\alpha\right].
\]

Proof of Corollary 3.2.1. Note that,

\[
p_j = \Phi(\theta_j - \Delta) - \Phi(\theta_{j-1} - \Delta).
\]

Hence, as $N \to \infty$,

\[
\delta_j = \sqrt{N}(q_j - p_j) \to \delta\left[\phi(\theta_j) - \phi(\theta_{j-1})\right] = \delta d_j.
\]

Substitution of this expression into (3.10) yields the result of Corollary 3.2.1.

Proof of Corollary 3.2.2. From Theorem 3.2.1, the efficacy term is:

\[
\frac{\sqrt{ab}\left[\sum_{j=1}^{z-1}\delta_j + \sum_{j=1}^{z-1}\sum_{i=j+1}^{z-1} p_i\delta_j - \sum_{j=1}^{z-1}\sum_{i=1}^{j-1} p_i\delta_j\right]}{\sqrt{\sum_{j=1}^{z-1} r_j^2 p_j - \sum_{i=1}^{z-1}\sum_{j=1}^{z-1} r_i r_j p_i p_j}}.
\]

For the case $p_j = 1/z\ \forall j$, the numerator is

\[
\sqrt{ab}\left[\sum_{j=1}^{z-1}\delta_j + \frac{1}{z}\sum_{j=1}^{z-1}(z-1-j)\delta_j - \frac{1}{z}\sum_{j=1}^{z-1}(j-1)\delta_j\right]
= \sqrt{ab}\left[\sum_{j=1}^{z-1}\delta_j + \frac{1}{z}\sum_{j=1}^{z-1}(z-2j)\delta_j\right]
= \sqrt{ab}\,\frac{2}{z}\sum_{j=1}^{z-1}(z-j)\,\delta e_j
\]

since it can easily be seen from the above argument that

\[
\delta_j \to \sqrt{N}\Delta\left[h(\theta_j) - h(\theta_{j-1})\right] = \delta e_j
\]

for any continuous density $h$. Note that when $p_j = 1/z$,

\[
\sum_{j=1}^{z-1} r_jp_j = \frac{z-1}{z}, \qquad \sum_{j=1}^{z-1} r_j^2p_j = \frac{2(z-1)(2z-1)}{3z^2}.
\]

Substitution of these expressions into the denominator of the efficacy term produces:

\[
\sqrt{\sum_{j=1}^{z-1} r_j^2 p_j - \sum_{i=1}^{z-1}\sum_{j=1}^{z-1} r_i r_j p_i p_j} = \sqrt{\frac{z^2-1}{3z^2}}
\]
and it can easily be seen that the efficacy term is:

\[
\frac{\sqrt{ab}\sum_{j=1}^{z-1}(z-j)e_j}{\sqrt{(z^2-1)/12}}
\]

3.2.2 Efficacy of the T test

Let the response variable be as in Section 3.2.1. Then the 2 sample t statistic can be written as:

\[
T = \frac{\sum_{j=1}^{z-1}(g_j - g_z)(mT_j - nC_j)}{nmS\sqrt{1/m + 1/n}}
\]

where $S$ = pooled standard deviation of the 2 groups (treatment and control).

As $N \to \infty$ (with $n/N \to a$, $m/N \to b$),

\[
S^2 \to a\left[\sum_{j=1}^{z-1}(g_j - g_z)^2p_j - \sum_{j=1}^{z-1}\sum_{i=1}^{z-1}(g_j - g_z)(g_i - g_z)p_ip_j\right] + b\left[\sum_{j=1}^{z-1}(g_j - g_z)^2q_j - \sum_{j=1}^{z-1}\sum_{i=1}^{z-1}(g_j - g_z)(g_i - g_z)q_iq_j\right]
\]

and, under $H_0$,

\[
S^2 \to \sigma_t^2 = \sum_{j=1}^{z-1}(g_j - g_z)^2p_j - \sum_{j=1}^{z-1}\sum_{i=1}^{z-1}(g_j - g_z)(g_i - g_z)p_ip_j.
\]

The power of the 2 sample t test, $\Pi_N(p)$, is

\[
\Pi_N(p) = \Pr_{p,\delta}[T > t_\alpha].
\]

Theorem 3.2.2 Under the assumptions of Theorem 3.2.1

\[
\Pi_N(p) \to \Phi\left[w(p,\delta) - z_\alpha\right]
\]

where

\[
w(p,\delta) = \frac{\sqrt{ab}\sum_{j=1}^{z-1}(g_z - g_j)\delta_j}{\sqrt{\sum_{j=1}^{z-1}(g_j - g_z)^2p_j - \sum_{j=1}^{z-1}\sum_{i=1}^{z-1}(g_j - g_z)(g_i - g_z)p_ip_j}}.
\]

Corollary 3.2.3 If $X_T$ and $X_C$ are as in Section 3.2.1 then, as $N \to \infty$,

\[
\Pi_N(\Delta) \to \Phi(c\cdot\delta - z_\alpha)
\]

where $d_j$ and $\delta$ are as in the previous section and

\[
c = \frac{\sqrt{ab}\sum_{j=1}^{z-1}(g_z - g_j)d_j}{\sqrt{\sum_{j=1}^{z-1}(g_j - g_z)^2p_j - \sum_{j=1}^{z-1}\sum_{i=1}^{z-1}(g_j - g_z)(g_i - g_z)p_ip_j}}
\]

where $\Delta = \delta/\sqrt{N}$ and $n/N \to a$, $m/N \to b$.

Corollary 3.2.4 When $p_1 = p_2 = \cdots = p_z = 1/z$, $(X_{T_1} - \Delta, X_{T_2} - \Delta, \ldots, X_{T_n} - \Delta)$ and $(X_{C_1}, X_{C_2}, \ldots, X_{C_m})$ are as in Corollary 3.2.2 and $g_i \propto i$, the efficacy term of the 2 sample t test becomes

\[
\frac{\sqrt{ab}\sum_{j=1}^{z-1}(z-j)e_j}{\sqrt{(z^2-1)/12}}
\]

where $e_j$ is defined as previously.

Proof of Theorem 3.2.2:

\[
\Pi_N(p) = \Pr_{p,\delta}[T > t_\alpha]
\]

Let $\gamma = \sum_{j=1}^{z-1}(g_j - g_z)(mT_j - nC_j)$. Then

\[
\Pi_N(p) = \Pr_{p,\delta}\left[\frac{\gamma}{mnS\sqrt{1/m + 1/n}} > t_\alpha\right]
= \Pr_{p,\delta}\left[\frac{\gamma - E(\gamma)}{\sqrt{\mathrm{var}(\gamma)}} > \frac{mnt_\alpha S\sqrt{1/m + 1/n}}{\sqrt{\mathrm{var}(\gamma)}} - \frac{E(\gamma)}{\sqrt{\mathrm{var}(\gamma)}}\right]
\]

In order to reduce this expression to the form in Theorem 3.2.2 the following will be shown:

1.
$\dfrac{\gamma - E(\gamma)}{\sqrt{\mathrm{var}(\gamma)}} \to N(0,1)$ in distribution

2. $\mathrm{var}\left(\dfrac{\gamma}{\sqrt{mnN}}\right) \to \sigma_t^2$

3. $\dfrac{mnt_\alpha S\sqrt{1/m + 1/n}}{\sqrt{\mathrm{var}(\gamma)}} - \dfrac{E(\gamma)}{\sqrt{\mathrm{var}(\gamma)}} \to z_\alpha - w(p,\delta)$.

Part 1 is true by the Central Limit Theorem. To prove part 2, note that:

\[
\mathrm{var}(\gamma) = \mathrm{var}\left[\sum_{j=1}^{z-1}(g_j - g_z)(mT_j - nC_j)\right]
= mn\sum_{j=1}^{z-1}(g_j - g_z)^2\left[mp_j + nq_j\right] - mn\sum_{j=1}^{z-1}\sum_{i=1}^{z-1}(g_j - g_z)(g_i - g_z)\left[mp_ip_j + nq_iq_j\right].
\]

As $N \to \infty$, $p_j \to q_j$ and hence,

\[
\mathrm{var}\left(\frac{\gamma}{\sqrt{mnN}}\right) \to \sum_{j=1}^{z-1}(g_j - g_z)^2p_j - \sum_{j=1}^{z-1}\sum_{i=1}^{z-1}(g_j - g_z)(g_i - g_z)p_ip_j = \sigma_t^2.
\]

Now

\[
\frac{mnt_\alpha S\sqrt{1/m + 1/n}}{\sqrt{\mathrm{var}(\gamma)}} = \frac{t_\alpha S\sqrt{mnN}}{\sqrt{\mathrm{var}(\gamma)}} = \frac{t_\alpha S}{\sqrt{\mathrm{var}\left(\gamma/\sqrt{mnN}\right)}} \to z_\alpha.
\]

The efficacy term can then be found to converge to a constant:

\[
\frac{E(\gamma)}{\sqrt{\mathrm{var}(\gamma)}} \to \frac{\sqrt{ab}\sum_{j=1}^{z-1}(g_z - g_j)\delta_j}{\sqrt{\sum_{j=1}^{z-1}(g_j - g_z)^2p_j - \sum_{j=1}^{z-1}\sum_{i=1}^{z-1}(g_j - g_z)(g_i - g_z)p_ip_j}}.
\]

3.3 Comparison of the Efficacies

The 2 sample t test is often used in analyzing ordered categorical data although it is not appropriate. For this reason, the relative efficiency of the Wilcoxon and the t test is of some interest. Under conditions listed below, as the number of observations becomes large, the efficacy terms for the Wilcoxon and the t tests applied to such data converge to the same expression:

\[
\frac{\sqrt{ab}\sum_{j=1}^{z-1}(z-j)e_j}{\sqrt{(z^2-1)/12}} \qquad (3.11)
\]

Sufficient conditions for (3.11) to hold are:

1. the probabilities of the responses are equal, i.e. $p_1 = p_2 = \cdots = p_z$,
2. the scores for the categories are proportional to the category number, and
3. the scores are generated by a continuous distribution.

The efficacy for the t test applied to the continuous data is calculated in Lehmann [3]. It is

\[
c = \frac{\sqrt{ab}}{\sigma}
\]

where
$\sigma$ is the standard deviation of the distribution of the data
$a = n/N$, the proportion of the sample in the treatment group
$b = m/N$, the proportion of the sample in the control group.

The efficiency of the Wilcoxon applied to the categorical data relative to the t test applied to the underlying continuous data was of interest as this indicated the decrease in power due to the categorization of the continuous data. Under the condition of equal $p_j$'s and data generated from a continuous distribution, the Pitman efficiency of the Wilcoxon to the t test was:

\[
e_{wilc,\,t(uncut)} = \frac{12\sigma^2\left(\sum_{j=1}^{z-1}(z-j)e_j\right)^2}{z^2-1}
\]

So, for example, if the control group's scores were uniformly distributed on $[0,1]$ and the treatment group's scores were uniformly distributed on $[\Delta, 1+\Delta]$ then the efficiency of the t(uncut) test to the Wilcoxon (on scores categorized as in the simulation) is:

\[
e = \frac{z^2-1}{(z-1)^2}
\]

(Although the uniform density is not continuous at 0 and 1, Corollary 3.2.2 can be modified for this case.) It is easily seen from this that as $z$, the number of categories, becomes large, $e \to 1$, which is what would be expected. The simulation in Section 3.1 was run under conditions 1 and 2 with the normal as its underlying continuous distribution.
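For the normal case with equally likely categories, the efficiency can be evaluated numerically. The sketch below assumes the efficacy of Corollary 3.2.2 for the Wilcoxon and $\sqrt{ab}/\sigma$ with $\sigma = 1$ for the t test on the continuous data. It also uses the fact that $\sum_{j=1}^{z-1}(z-j)e_j$ telescopes to $\sum_{j=1}^{z-1}h(\theta_j)$ when the density vanishes at $\theta_0 = -\infty$. The function name is hypothetical.

```python
from statistics import NormalDist


def wilcoxon_vs_t_efficiency(z):
    """Pitman efficiency of the Wilcoxon on z equiprobable categories
    relative to the t test on the underlying N(0, 1) data.

    Uses 12 * sigma^2 * (sum of densities at the cutpoints)^2 / (z^2 - 1)
    with sigma = 1.
    """
    standard_normal = NormalDist()
    cutpoints = [standard_normal.inv_cdf(j / z) for j in range(1, z)]
    density_sum = sum(standard_normal.pdf(c) for c in cutpoints)
    return 12.0 * density_sum ** 2 / (z * z - 1)


efficiency_five = wilcoxon_vs_t_efficiency(5)   # five categories, as in Section 3.1
```

With five categories this evaluates to roughly 0.89. As the number of categories grows, the value approaches $3/\pi \approx 0.955$, the familiar efficiency of the Wilcoxon relative to the t test for continuous normal data.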
Although there were only 100 observations in each run, the curves in Figure 3.1 are very close. Using the efficacy terms calculated, the asymptotic Pitman efficiency of the Wilcoxon with respect to the t test on the continuous data was calculated and found to be 0.89. This means that the t test requires about 89% of the observations needed by the Wilcoxon in order to attain the same limiting power under the same alternate hypotheses. These calculations seem to agree with the simulation run in Section 3.1.

Chapter 4
Analysis at Two Time Points

Most patients in the Multiple Sclerosis trials were examined at all of the time points 0, 1, 3, 6, 9, 12, 18, 24 months during the study. A better indication of the differences between the treatment and control group could be gained by analyzing how the scores of the patients change between two time points. The powers of some statistics frequently used to analyze ordered data over time were compared using simulation. These statistics were the stratified Wilcoxon, the McCullagh, and a chi-squared statistic. Section 4.1 contains a description of the simulation runs. The Wilcoxon and the chi-squared statistic are described more fully in Sections 4.1.1 and 4.1.2.
In the latter, the distribution of the chi-squared statistic is derived under the hypothesis of no difference between the treatment and control groups. In Section 4.2, the results of the simulation are discussed.

4.1 Description of the Simulation

An initial and final score were generated for each subject simulating his condition at two time points. The values of the underlying continuous random variables giving rise to these scores were denoted by $(R^T, Y^T)$ or $(R^C, Y^C)$ for those in the treatment or control group. These data were categorized using the same cutpoints as in the previous chapter to produce scores. The distribution of these pairs was:

\[
\begin{pmatrix} R^T \\ Y^T \end{pmatrix} \sim BVN\left[\begin{pmatrix} 0 \\ \beta + \Delta \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right]
\]

and

\[
\begin{pmatrix} R^C \\ Y^C \end{pmatrix} \sim BVN\left[\begin{pmatrix} 0 \\ \beta \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right]
\]

The parameters $\beta$ and $\Delta$ represented the changes in the scores due to the progression of the disease and the treatment respectively. Power curves were computed for a level $\alpha = 0.05$ test of the null hypothesis $\Delta = 0$ vs. the alternate hypothesis $\Delta \neq 0$. A simulation run generated two groups (fifty observations/group) at $\Delta$ values of 0 to 1.2 in increments of 0.1. Each curve was calculated using one thousand simulation runs.

The stratified Wilcoxon, chi-squared and McCullagh model statistics were calculated for each set of data generated.
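Pairs with the bivariate normal structure described above can be drawn with the standard construction $Y = \mu + \rho R + \sqrt{1-\rho^2}\,E$, where $R$ and $E$ are independent standard normal variables. The sketch below is an illustration only: the function names are hypothetical, and the values of `beta`, `delta`, `rho` and the seed are arbitrary inputs, not the settings used in the thesis.

```python
import math
import random


def simulate_pair(final_mean, rho, rng):
    """Draw (initial, final) with unit variances, correlation rho,
    initial mean 0 and final mean `final_mean`."""
    initial = rng.gauss(0.0, 1.0)
    noise = rng.gauss(0.0, 1.0)
    final = final_mean + rho * initial + math.sqrt(1.0 - rho * rho) * noise
    return initial, final


def simulate_groups(beta, delta, rho, n_per_group=50, seed=0):
    """Return (treatment, control) lists of (initial, final) pairs."""
    rng = random.Random(seed)
    treatment = [simulate_pair(beta + delta, rho, rng) for _ in range(n_per_group)]
    control = [simulate_pair(beta, rho, rng) for _ in range(n_per_group)]
    return treatment, control
```

Each continuous value would then be mapped to a category with the same cutpoints as in the one time point simulation before the test statistics are computed.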
T h e M c C u l l a g h model statistic was computed  u s i n g t h e p l u m s o f t w a r e i n a w a y s i m i l a r t o t h a t i n C h a p t e r 2, e x c e p t t h a t t h e c o variate x h a d m u l t i p l e components. One component was a variable indicating whether the observation was f r o m the treatment o r c o n t r o l group. T h e other components were i n d i c a t o r v a r i a b l e s f o r e a c h p o s s i b l e i n i t i a l score. T h e s c a l e p a r a m e t e r w a s s e t t o one. Since, i n t h e s i m u l a t i o n , t h e continuous observations were k n o w n , t h e continuous response v a r i a b l e w a s regressed using a n i n d i c a t o r f o r t h e t r e a t m e n t / c o n t r o l groups a n d the i n i t i a l values for covariates. T h e n u l l hypothesis H  0  : A = 0 was rejected i f the  e s t i m a t e f o r t h e t r e a t m e n t / c o n t r o l v a r i a b l e w a s s i g n i f i c a n t l y different f r o m zero. T h i s s t a t i s t i c w a s i n c l u d e d s o t h a t t h e d e c r e a s e i n p o w e r d u e t o t h e loss o f i n f o r m a t i o n f r o m categorization could be determined.  4.1.1  Stratified W i l c o x o n  T h e s t r a t i f i e d W i l c o x o n r a n k s u m s t a t i s t i c , d e s c r i b e d i n [3] ( p . 1 3 2 ) , is a n e x t e n s i o n of t h a t e m p l o y e d i n t h e previous chapter. T h e subjects i n t h e t w o groups were  first  s t r a t i f i e d a c c o r d i n g t o t h e i r i n i t i a l score. W i t h i n e a c h s t r a t a , s, t h e W i l c o x o n s t a t i s t i c  Chapter  W  s  Analysis  4.  at Two Time  38  Points  w a s c o m p u t e d , as i n C h a p t e r 3, u s i n g t h e f i n a l scores as t h e b a s i s f o r t h e r a n k i n g  p r o c e d u r e a n d s u m m i n g the t r e a t m e n t ranks. 
Each of the statistics, $W_s$, was normalized by subtraction of its mean to produce $W_s^*$:

\[ W_s^* = W_s - E(W_s) \]

where

\[ E(W_s) = \frac{T_s (N_s + 1)}{2} \]

$T_s$ = number of observations in the treatment group with initial score $s$
$N_s$ = total number of observations with initial score $s$.

The overall Wilcoxon, $W^*$, was then calculated as:

\[ W^* = \sum_{s=1}^{z} \frac{W_s^*}{N_s + 1}. \]

The variance of $W^*$ was:

\[ [se(W^*)]^2 = \sum_{i=1}^{z} \left[ \frac{se(W_i^*)}{N_i + 1} \right]^2 \]

where

$z$ = number of strata
$se(W_i^*)$ = standard deviation of $W_i^*$ calculated in the usual way.

The null hypothesis was then rejected or accepted using the normal approximation:

\[ \frac{W^*}{se(W^*)} \sim N(0,1). \]

4.1.2 Chi-squared Statistic

A natural test of the null hypothesis that a subject's progress is independent of treatment received is based on a simple contingency table analysis. The resulting statistic, referred to as $\chi^2_{calc}$, is based on the number of subjects whose condition improved, deteriorated or remained the same over the two time points. To calculate $\chi^2_{calc}$, the data were collapsed into a 3 x 2 contingency table:

          Treatment   Control   Totals
          T^-         C^-       n^-
          T^0         C^0       n^0
          T^+         C^+       n^+
Totals    n           m         N

where

\[ T^- = \sum_{i=1}^{z-1} \sum_{j=i+1}^{z} T_{ij}, \qquad C^- = \sum_{i=1}^{z-1} \sum_{j=i+1}^{z} C_{ij} \]
\[ T^0 = \sum_{i=1}^{z} T_{ii}, \qquad C^0 = \sum_{i=1}^{z} C_{ii} \]
\[ T^+ = \sum_{i=2}^{z} \sum_{j=1}^{i-1} T_{ij}, \qquad C^+ = \sum_{i=2}^{z} \sum_{j=1}^{i-1} C_{ij} \]

$T_{ij}$ = the number of observations in the treatment group with initial score $i$ and final score $j$
$C_{ij}$ = the number of observations in the control group with initial score $i$ and final score $j$
$n^-, n^0, n^+$ = the row totals, $n^- = T^- + C^-$, $n^0 = T^0 + C^0$, $n^+ = T^+ + C^+$
$n$ = number of observations in the treatment group
$m$ = number of observations in the control group
$N$ = total number of observations in the study.

The statistic was calculated as:

\[
\chi^2_{calc} = \frac{(T^- - nn^-/N)^2}{nn^-/N} + \frac{(C^- - mn^-/N)^2}{mn^-/N}
+ \frac{(T^0 - nn^0/N)^2}{nn^0/N} + \frac{(C^0 - mn^0/N)^2}{mn^0/N}
+ \frac{(T^+ - nn^+/N)^2}{nn^+/N} + \frac{(C^+ - mn^+/N)^2}{mn^+/N}.
\]

The statistic, $\chi^2_{calc}$, is usually assumed to have a limiting $\chi^2$ distribution with two degrees of freedom. It was not clear if this approximation was valid, so the limiting null distribution of $\chi^2_{calc}$ was derived.
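As an illustration of the collapsing step and of the statistic itself, the following is a minimal Python sketch. The function names `collapse` and `chi_squared_calc`, the nested-list layout `counts[i][j]` (initial score `i`, final score `j`, zero-based) and the small 2 x 2 example tables are assumptions made for illustration; they are not taken from the study data. Skipping an empty row of the collapsed table is also a convention assumed here.

```python
def collapse(counts):
    """Collapse a z x z table counts[i][j] (i = initial score, j = final score)
    into the totals (plus, zero, minus) used by the chi-squared statistic."""
    z = len(counts)
    plus = sum(counts[i][j] for i in range(z) for j in range(i))          # j < i
    zero = sum(counts[i][i] for i in range(z))                            # j = i
    minus = sum(counts[i][j] for i in range(z) for j in range(i + 1, z))  # j > i
    return plus, zero, minus


def chi_squared_calc(treatment, control):
    """Pearson chi-squared statistic for the collapsed 3 x 2 table."""
    t = collapse(treatment)
    c = collapse(control)
    n, m = sum(t), sum(c)          # group sizes
    total = n + m                  # N in the text
    stat = 0.0
    for t_k, c_k in zip(t, c):
        row_total = t_k + c_k      # n^+, n^0 or n^-
        if row_total == 0:
            continue               # assumed convention: an empty row adds nothing
        expected_t = n * row_total / total
        expected_c = m * row_total / total
        stat += (t_k - expected_t) ** 2 / expected_t
        stat += (c_k - expected_c) ** 2 / expected_c
    return stat


# Hypothetical two-category example (not study data).
treatment = [[5, 0], [10, 5]]   # collapses to plus=10, zero=10, minus=0
control = [[5, 10], [0, 5]]     # collapses to plus=0,  zero=10, minus=10
print(chi_squared_calc(treatment, control))  # 20.0
```

In the example each group has 20 subjects, so the expected counts are 5, 10 and 5 in both columns, and the four non-zero terms each contribute 5.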
Two methods of sampling which produce data in a fashion similar to that in the Multiple Sclerosis study were considered, and led to distinct limiting null distributions for $\chi^2_{calc}$. A description of each of these sampling schemes and the distribution of $\chi^2_{calc}$ is preceded by some notation used in this analysis:

$p_{ij}$ = Pr[final score = $j$ | initial score = $i$, in treatment group]
$q_{ij}$ = Pr[final score = $j$ | initial score = $i$, in control group]
$p^u_{ij}$ = Pr[final score = $j$, initial score = $i$, in treatment group]
$q^u_{ij}$ = Pr[final score = $j$, initial score = $i$, in control group]
$n_i$ = number of treatment observations in stratum $i$
$m_i$ = number of control observations in stratum $i$

Unconditional Sampling

In unconditional sampling, subjects are randomly assigned to groups and then are stratified according to the chosen covariate. In this case, the numbers of observations in each stratum are random. This type of sampling is appropriate when it is not important if strata are empty, or there are so many subjects that the experimenter is assured that it is unlikely that any of the strata will be empty. Suppose that

\[ (T_{11}, T_{12}, \ldots, T_{zz}) \sim \text{multinomial}(n, p^u_{11}, p^u_{12}, \ldots, p^u_{zz}), \]

independent of

\[ (C_{11}, C_{12}, \ldots, C_{zz}) \sim \text{multinomial}(m, q^u_{11}, q^u_{12}, \ldots, q^u_{zz}) \]

under unconditional sampling.
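If we define $T$ to be a vector with the components $(T^+, T^-, T^0)$ and $C$ to be $(C^+, C^-, C^0)$ then $T$ and $C$ are independent with distributions:

\[ T \sim \text{multinomial}(n, p^+, p^-, p^0) \]
\[ C \sim \text{multinomial}(m, q^+, q^-, q^0) \]

where

\[ p^+ = \sum_{i=2}^{z} \sum_{j=1}^{i-1} p^u_{ij}, \qquad p^- = \sum_{i=1}^{z-1} \sum_{j=i+1}^{z} p^u_{ij}, \qquad p^0 = \sum_{i=1}^{z} p^u_{ii} \]
\[ q^+ = \sum_{i=2}^{z} \sum_{j=1}^{i-1} q^u_{ij}, \qquad q^- = \sum_{i=1}^{z-1} \sum_{j=i+1}^{z} q^u_{ij}, \qquad q^0 = \sum_{i=1}^{z} q^u_{ii}. \]

In this case, the usual asymptotic theory for independent multinomials holds and under the hypothesis $p^+ = q^+$, $p^0 = q^0$, $p^- = q^-$ (and thus also under the more restrictive null hypothesis $p^u_{ij} = q^u_{ij}\ \forall\, i,j$), $\chi^2_{calc}$ is asymptotically distributed as a $\chi^2$ with two degrees of freedom.

Conditional Sampling

In conditional sampling, the number of subjects in each stratum is fixed at the beginning of the experiment. This type of sampling could be used when it is expensive to allow many subjects to take part. It ensures that there will be observations in each stratum, although the total number of subjects may be relatively small. This type of sampling is common, so it is natural to calculate the distribution of $\chi^2_{calc}$ conditional on $n_i$ and $m_i$. In this case, the distribution of the counts becomes

\[ (T_{i1}, T_{i2}, \ldots, T_{iz}) \sim \text{multinomial}(n_i, p_{i1}, p_{i2}, \ldots, p_{iz}) \]

and

\[ (C_{i1}, C_{i2}, \ldots, C_{iz}) \sim \text{multinomial}(m_i, q_{i1}, q_{i2}, \ldots, q_{iz}) \]

where $i = 1, 2, \ldots, z$ and all multinomials are independent. Some assumptions made to facilitate computations were that $n_i = m_i\ \forall\, i$ and that $N \to \infty$ in such a way that $n_i/N \to a_i$. Note that $\chi^2_{calc}$ can be rewritten as:

\[ \chi^2_{calc} = K_1^2 + K_2^2 + K_3^2 \]

where

\[ K_1 = \sqrt{\frac{N^3}{mnn^+}} \left[ \frac{1}{\sqrt{N}} \left( T^+ - \frac{nn^+}{N} \right) \right] \]
\[ K_2 = \sqrt{\frac{N^3}{mnn^0}} \left[ \frac{1}{\sqrt{N}} \left( T^0 - \frac{nn^0}{N} \right) \right] \]
\[ K_3 = \sqrt{\frac{N^3}{mnn^-}} \left[ \frac{1}{\sqrt{N}} \left( T^- - \frac{nn^-}{N} \right) \right]. \]

Theorem 4.1.1 If $n_i = m_i\ \forall\, i$, then, under $H_0: p_{ij} = q_{ij}\ \forall\, i,j$,

\[ (K_1, K_2, K_3) \xrightarrow{d} MVN(0, \Sigma) \]

as $N \to \infty$ and $n_i/N \to a_i$, with

\[ \Sigma_{11} = 1 - \frac{\sum_{i=2}^{z} \sum_{j=1}^{i-1} \sum_{k=1}^{i-1} a_i p_{ij} p_{ik}}{\sum_{i=2}^{z} \sum_{j=1}^{i-1} a_i p_{ij}} \]
\[ \Sigma_{22} = 1 - \frac{\sum_{j=1}^{z} a_j p_{jj}^2}{\sum_{j=1}^{z} a_j p_{jj}} \]
\[ \Sigma_{33} = 1 - \frac{\sum_{i=1}^{z-1} \sum_{j=i+1}^{z} \sum_{k=i+1}^{z} a_i p_{ij} p_{ik}}{\sum_{i=1}^{z-1} \sum_{j=i+1}^{z} a_i p_{ij}} \]
\[ \Sigma_{12} = - \frac{\sum_{i=2}^{z} \sum_{j=1}^{i-1} a_i p_{ii} p_{ij}}{\left[ \left( \sum_{i=2}^{z} \sum_{j=1}^{i-1} a_i p_{ij} \right) \left( \sum_{j=1}^{z} a_j p_{jj} \right) \right]^{1/2}} \]
\[ \Sigma_{13} = - \frac{\sum_{i=2}^{z-1} \sum_{j=1}^{i-1} \sum_{k=i+1}^{z} a_i p_{ij} p_{ik}}{\left[ \left( \sum_{i=2}^{z} \sum_{j=1}^{i-1} a_i p_{ij} \right) \left( \sum_{i=1}^{z-1} \sum_{j=i+1}^{z} a_i p_{ij} \right) \right]^{1/2}} \]

and

\[ \Sigma_{23} = - \frac{\sum_{i=1}^{z-1} \sum_{j=i+1}^{z} a_i p_{ii} p_{ij}}{\left[ \left( \sum_{i=1}^{z-1} \sum_{j=i+1}^{z} a_i p_{ij} \right) \left( \sum_{j=1}^{z} a_j p_{jj} \right) \right]^{1/2}}. \]

Thus,

\[ \chi^2_{calc} \xrightarrow{d} \lambda_1 Z_1^2 + \lambda_2 Z_2^2 + \lambda_3 Z_3^2 \]

where $Z_1, Z_2, Z_3$ are iid $N(0,1)$ and $\lambda_1, \lambda_2, \lambda_3$ are the eigenvalues of $\Sigma$.

Proof of Theorem 4.1.1. Let

\[ X_1 = \frac{1}{\sqrt{N}} \left( T^+ - \frac{nn^+}{N} \right), \qquad X_2 = \frac{1}{\sqrt{N}} \left( T^0 - \frac{nn^0}{N} \right), \qquad X_3 = \frac{1}{\sqrt{N}} \left( T^- - \frac{nn^-}{N} \right). \]

To determine the limiting distribution of $\chi^2_{calc}$, first find the limiting distributions of $X_1, X_2, X_3$. If

\[ T^*_{ij} = \frac{T_{ij} - n_i p_{ij}}{\sqrt{n_i}} \qquad \text{and} \qquad C^*_{ij} = \frac{C_{ij} - m_i q_{ij}}{\sqrt{m_i}} \]

then, since $n^+ = T^+ + C^+$ and $N = n + m$,

\[
\begin{aligned}
X_1 &= \frac{1}{\sqrt{N}} \left[ \sum_{i=2}^{z} \sum_{j=1}^{i-1} T_{ij} - \frac{n}{N} \sum_{i=2}^{z} \sum_{j=1}^{i-1} (T_{ij} + C_{ij}) \right] \\
&= \frac{1}{\sqrt{N}} \sum_{i=2}^{z} \sum_{j=1}^{i-1} \left[ \frac{m}{N} T_{ij} - \frac{n}{N} C_{ij} \right] \\
&= \frac{m}{N} \sum_{i=2}^{z} \sum_{j=1}^{i-1} \sqrt{\frac{n_i}{N}}\, T^*_{ij} - \frac{n}{N} \sum_{i=2}^{z} \sum_{j=1}^{i-1} \sqrt{\frac{m_i}{N}}\, C^*_{ij} + \frac{1}{\sqrt{N}} \sum_{i=2}^{z} \sum_{j=1}^{i-1} \left[ \frac{m}{N} n_i p_{ij} - \frac{n}{N} m_i q_{ij} \right].
\end{aligned}
\]

Therefore,

\[ X_1 = \frac{m}{N} \sum_{i=2}^{z} \sum_{j=1}^{i-1} \sqrt{\frac{n_i}{N}}\, T^*_{ij} - \frac{n}{N} \sum_{i=2}^{z} \sum_{j=1}^{i-1} \sqrt{\frac{m_i}{N}}\, C^*_{ij} \]

under $H_0$, since $n_i = m_i$ (so that $n = m$) and $p_{ij} = q_{ij}$.

Under $H_0$, $T^*_{ij}$ and $C^*_{ij}$ converge to normal random variables with mean zero, and thus $X_1$ is asymptotically normal with mean zero and variance:

\[
\begin{aligned}
var(X_1) &\approx var \left( \frac{1}{2} \sum_{i=2}^{z} \sum_{j=1}^{i-1} \sqrt{a_i}\, (T^*_{ij} - C^*_{ij}) \right) \\
&= \frac{1}{2} \left[ \sum_{i=2}^{z} \sum_{j=1}^{i-1} a_i p_{ij} (1 - p_{ij}) - \sum_{i=2}^{z} \sum_{j=1}^{i-1} \sum_{k=1, k \neq j}^{i-1} a_i p_{ij} p_{ik} \right] \\
&= \frac{1}{2} \left[ \sum_{i=2}^{z} \sum_{j=1}^{i-1} a_i p_{ij} - \sum_{i=2}^{z} \sum_{j=1}^{i-1} \sum_{k=1}^{i-1} a_i p_{ij} p_{ik} \right].
\end{aligned}
\]

Under $H_0$,

\[ \frac{n^+}{N} = \sum_{i=2}^{z} \sum_{j=1}^{i-1} \left( \frac{T_{ij}}{N} + \frac{C_{ij}}{N} \right) \to \sum_{i=2}^{z} \sum_{j=1}^{i-1} (a_i p_{ij} + a_i q_{ij}) = 2 \sum_{i=2}^{z} \sum_{j=1}^{i-1} a_i p_{ij}, \]

so that $N^3/(mnn^+) = 4N/n^+ \to 2 / \sum_{i=2}^{z} \sum_{j=1}^{i-1} a_i p_{ij}$, and thus $K_1$ is asymptotically normal with mean zero and variance

\[ \frac{\sum_{i=2}^{z} \sum_{j=1}^{i-1} a_i p_{ij} - \sum_{i=2}^{z} \sum_{j=1}^{i-1} \sum_{k=1}^{i-1} a_i p_{ij} p_{ik}}{\sum_{i=2}^{z} \sum_{j=1}^{i-1} a_i p_{ij}}. \]

Similar calculations show that $K_2$ and $K_3$ are asymptotically normal. The variance/covariance calculations are straightforward. Now $(K_1, K_2, K_3)$ is asymptotically $MVN(0, \Sigma)$. If $\Sigma$ is a positive semidefinite matrix, there exist a diagonal matrix $D$, with the eigenvalues $(\lambda_1, \lambda_2, \lambda_3)$ of $\Sigma$ as its elements, and a matrix $P$, such that $P^T P = I$ and $P \Sigma P^T = D$. Let $Y = PK$, then $Y \sim N(0, D)$ and

\[ K^T K = Y^T Y \sim \lambda_1 Z_1^2 + \lambda_2 Z_2^2 + \lambda_3 Z_3^2. \]

In most cases, the eigenvalues of $\Sigma$ are difficult to find explicitly. However, one simple case is considered in the following:

Corollary 4.1.1 Suppose that $n_1 = n_2 = \ldots = n_z$ and $p_{ij} = 1/z\ \forall\, i,j$. Under the null hypothesis,

\[ \chi^2_{calc} \xrightarrow{d} Z_1^2 + \left( \frac{2}{3} - \frac{1}{3z} \right) Z_2^2 \]

where $Z_1, Z_2$ are iid $N(0,1)$. Thus, $\chi^2_{calc}$ does not have a limiting chi-squared distribution with two degrees of freedom in this case.

Proof of Corollary: Using Theorem 4.1.1 with $a_i = 1/(2z)$ and $p_{ij} = 1/z$, the asymptotic covariance matrix of $(K_1, K_2, K_3)$ is:

\[
\Sigma = \begin{pmatrix}
\frac{z+1}{3z} & -\frac{1}{z}\sqrt{\frac{z-1}{2}} & -\frac{z-2}{3z} \\
-\frac{1}{z}\sqrt{\frac{z-1}{2}} & \frac{z-1}{z} & -\frac{1}{z}\sqrt{\frac{z-1}{2}} \\
-\frac{z-2}{3z} & -\frac{1}{z}\sqrt{\frac{z-1}{2}} & \frac{z+1}{3z}
\end{pmatrix}.
\]

The eigenvalues for this covariance matrix are $\lambda_1 = 0$, $\lambda_2 = 1$, $\lambda_3 = \frac{2}{3} - \frac{1}{3z}$.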
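The covariance matrix of Theorem 4.1.1 and the eigenvalues quoted in Corollary 4.1.1 can be checked numerically. The sketch below is only an illustration: the function names, the choice $z = 5$ and the three candidate eigenvectors are assumptions introduced here (the eigenvectors were worked out by hand for the equal-probability case and do not appear in the text). It builds $\Sigma$ from stratum weights $a_i$ and probabilities $p_{ij}$, then confirms that each candidate pair satisfies $\Sigma v = \lambda v$.

```python
from math import sqrt


def limiting_covariance(a, p):
    """Sigma of Theorem 4.1.1 for stratum weights a[i] and transition
    probabilities p[i][j] (zero-based; i = initial score, j = final score)."""
    z = len(a)
    # Per-stratum probability of falling in the '+', '0' and '-' rows.
    plus = [sum(p[i][:i]) for i in range(z)]        # final score below initial
    zero = [p[i][i] for i in range(z)]              # unchanged
    minus = [sum(p[i][i + 1:]) for i in range(z)]   # final score above initial
    groups = (plus, zero, minus)
    # Weighted totals such as sum_i a_i * P_i^+ (the denominators in the theorem).
    totals = [sum(a_i * g_i for a_i, g_i in zip(a, g)) for g in groups]
    sigma = []
    for r in range(3):
        row = []
        for c in range(3):
            cross = sum(a_i * x * y
                        for a_i, x, y in zip(a, groups[r], groups[c]))
            diagonal = 1.0 if r == c else 0.0
            row.append(diagonal - cross / sqrt(totals[r] * totals[c]))
        sigma.append(row)
    return sigma


def eigen_residual(sigma, vector, value):
    """Largest absolute entry of sigma * vector - value * vector."""
    product = [sum(s * v for s, v in zip(row, vector)) for row in sigma]
    return max(abs(x - value * v) for x, v in zip(product, vector))


# Setting of Corollary 4.1.1 with an assumed z = 5:
# equal strata (a_i = 1/(2z), since n_i = m_i) and p_ij = 1/z.
z = 5
a = [1 / (2 * z)] * z
p = [[1 / z] * z for _ in range(z)]
sigma = limiting_covariance(a, p)

s = sqrt((z - 1) / 2)
candidates = [
    ((s, 1.0, s), 0.0),                       # lambda_1 = 0
    ((1.0, -2 * s, 1.0), 1.0),                # lambda_2 = 1
    ((1.0, 0.0, -1.0), 2 / 3 - 1 / (3 * z)),  # lambda_3 = 2/3 - 1/(3z)
]
residuals = [eigen_residual(sigma, v, lam) for v, lam in candidates]
print(residuals)  # each entry should be at rounding-error level
```

Because the limiting distribution is $Z_1^2 + (2/3 - 1/(3z)) Z_2^2$, its mean is $5/3 - 1/(3z)$ rather than the value 2 of a $\chi^2$ with two degrees of freedom.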
4.2 Simulation Results

The power curves computed by the simulations appear in Figures 4.1 to 4.4. In all of the figures, the most powerful test was the one based on the regression parameter. This result was expected, as no information was lost to categorization in the calculation of this statistic. The next most powerful was the McCullagh statistic, which had a similar power curve to the Wilcoxon. In all cases both of these statistics were much more powerful than the chi-squared statistic. Because the chi-squared statistic did not take into account the initial score or the size of the difference, this was not surprising.

The powers of the tests increased with increasing $\rho$. This probably occurred because increasing $\rho$ would decrease the variation in differences between the initial and final scores at fixed values of $\beta$ and $\Delta$. Changes in $\beta$ did not affect the power curves of the Wilcoxon, McCullagh or regression statistics, but increasing it did decrease the power of the chi-squared statistic. As $\beta$ became large, most patients' scores increased, and the difference between the treatment and control groups was in the sizes of these increases.
Since the Wilcoxon and the McCullagh statistics compared the treatment and control groups based on the differences of the final scores given the initial scores, an increase in $\beta$ would not be expected to change the power of these statistics. However, very large changes in $\beta$ would result in all patients moving to the largest score, in which case none of the tests would detect any difference between the groups. The chi-squared statistic computed here was unconditional, as the initial numbers of observations in the strata were not fixed. It only recorded whether the patients' scores increased, decreased or remained the same, so its power was expected to decrease as $\beta$ increased.

Figure 4.1: Simulation Run 1 ($\beta = 0.1$, $\rho = 0.8$) (axis label: delta)

Figure 4.2: Simulation Run 2 ($\beta = 0.1$, $\rho = 0.6$)

Figure 4.4: Simulation Run 4 ($\beta = 0.3$, $\rho = 0.6$) (axis label: delta)

Table 4.1: Results of McCullagh Analysis

Time          Estimate of treatment   Standard   z-score
(in months)   parameter               error
0-1           -0.82                   0.418      -1.95
1-3           -0.40                   0.403      -1.00
3-6           -0.17                   0.424      -0.40
6-9            0.98                   0.466       2.10
9-12           0.07                   0.51        0.14
12-18          0.31                   0.452       0.69
18-24         -0.16                   0.48       -0.33

4.3 Application of Tests to the MS Data

The stratified Wilcoxon and $\chi^2$ tests have been calculated on the MS data set previously [2].
Both statistics were calculated on the data between 0 months and each of the other times the patients were observed. The results of the test based on the Wilcoxon statistic showed that the treatment group regressed ($p \approx 0.05$) relative to the control group in the time periods 0-1 months and 0-3 months. The remaining statistics calculated were nonsignificant. However, the change in sign in those time periods longer than six months indicates that patients in the control group may have fared less well than those in the treatment group in the follow-up period. Since the $\chi^2$ is less powerful than the Wilcoxon, it is not surprising that none of the tests based on it were significant.

The data were fit to a McCullagh model. Unlike the previous analysis, the collapsed scores were used here since the data were sparse. The covariates in this analysis were indicator variables for each of the possible initial scores and one for treatment/control group. The data were modelled between consecutive time points rather than from baseline to the other scores. The results are shown in Table 4.1.

The variable for the treatment group parameter was set up so that a negative value implied that the control group regressed with respect to the treatment group.
Collapsing the data has produced results which are contradictory to the previous Wilcoxon analysis in the 0-1 month time period.

The Wilcoxon and $\chi^2$ analyses agree with the analysis using the Markov techniques in that all three show that patients in the treatment group may not have progressed at the same rate as those in the control group during the first six months of the study. In the case of the Markov analysis, it is not possible to determine which group improved relative to the other.

Chapter 5

Conclusions

As the data set was small, the number of observations which had any particular Kurtzke score was low. For this reason, the data were collapsed into five categories chosen only to ensure at least four observations in each category at zero months.

Results from the Markov analysis indicate that, during the administration of interferon, the treatment group regressed relative to the controls. After this period, subjects in the treatment group appeared to return to a state in which their transition probabilities were not significantly different from the controls. Models which did not incorporate information about the length of time intervals between observations fit the data just as well as those that did. That is, a model which assumed that transitions between any two consecutive time points could be modelled using the same transition matrix fit as well as a model which allowed a different matrix for each time interval.
Modelling with a specific tridiagonal matrix with all diagonal elements equal, except for the second, proved to be reasonable for the control group over the entire time period. The treatment group could be modelled reasonably by this form only in the eighteen-month follow-up period.

The major differences between the two groups were in those patients that started at Kurtzke scores in the range 4.5 to 5.5. Control patients with initial scores in this range fared better than treatment patients during the period when the interferon was administered.

Statistics commonly used to compare the groups at one time point were examined using simulation techniques. If data for the control group were generated using an underlying $N(0,1)$ distribution and those for the treatment group by a $N(\Delta,1)$ distribution, and the probability of being in each category was equal for $\Delta = 0$, then the Wilcoxon, McCullagh and two-sample t test were found to have similar relative efficiencies. Theoretical asymptotic calculations yielded expressions for Pitman efficiencies of these statistics for general shift models. Under the conditions of the simulations, the efficacies of the Wilcoxon and two-sample t test calculated on the categorized data were found to be equal. The Pitman efficiency of these two statistics with respect to the t test calculated on the underlying continuous data was 0.89.
A comparison of statistics used to compare the two groups' progression of disease between two time points was then carried out. Simulation results showed that when the categorical data were generated using a bivariate normal distribution, with equal initial probabilities of being in any category, the Wilcoxon and McCullagh statistics were much more efficient than the chi-squared statistic. The asymptotic distribution of the chi-squared statistic was derived under the hypothesis of no treatment effect. It was determined that the chi-squared statistic did not necessarily have a limiting $\chi^2$ distribution with two degrees of freedom if, in the sample of patients, the initial number of subjects in each category was fixed. However, if the number was not fixed, the chi-squared statistic did have an asymptotic $\chi^2$ distribution with two degrees of freedom.

Appendix A

Sample Transition Matrices

Transition Matrix for Control Group from 0 to 24 months

21    7    1    0    0
 8    7    3    0    0
 0    2   67   29    1
 0    0   23   49    7
 0    0    1    5    8

Transition Matrix for Treatment Group from 0 to 24 months

23    6    1    0    0
 4    2    7    0    0
 1    3   59   20    1
 0    0   20   40   14
 0    0    0   12   16
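The counts above can be turned into estimated transition probabilities. The following Python sketch assumes one particular estimator, which the appendix does not state: counts outside the tridiagonal band are dropped and each row is divided by its remaining total. The function name `tridiagonal_estimate` is hypothetical. Under that assumption, the control and treatment counts reproduce, to two decimal places, the 0 months to 24 months general matrices listed in this appendix.

```python
def tridiagonal_estimate(counts):
    """Row-normalised transition estimate restricted to the tridiagonal band.

    Assumption (not stated in the text): counts more than one category away
    from the diagonal are discarded before each row is normalised.
    """
    estimate = []
    for i, row in enumerate(counts):
        band = [c if abs(i - j) <= 1 else 0 for j, c in enumerate(row)]
        total = sum(band)
        estimate.append([c / total for c in band])
    return estimate


# Transition counts from 0 to 24 months, as listed in this appendix.
control_counts = [
    [21, 7, 1, 0, 0],
    [8, 7, 3, 0, 0],
    [0, 2, 67, 29, 1],
    [0, 0, 23, 49, 7],
    [0, 0, 1, 5, 8],
]
treatment_counts = [
    [23, 6, 1, 0, 0],
    [4, 2, 7, 0, 0],
    [1, 3, 59, 20, 1],
    [0, 0, 20, 40, 14],
    [0, 0, 0, 12, 16],
]

control_estimate = tridiagonal_estimate(control_counts)
treatment_estimate = tridiagonal_estimate(treatment_counts)

for row in control_estimate:
    print("  ".join(f"{x:.2f}" for x in row))
```

For example, the first control row becomes 21/28 = 0.75 and 7/28 = 0.25 once the single count two categories from the diagonal is dropped.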
Control Group - General Tridiagonal Modelling

0 months to 6 months

0.67  0.33  0.00  0.00  0.00
0.57  0.29  0.14  0.00  0.00
0.00  0.03  0.65  0.32  0.00
0.00  0.00  0.35  0.59  0.06
0.00  0.00  0.00  1.00  0.00

6 months to 24 months

0.81  0.19  0.00  0.00  0.00
0.36  0.45  0.18  0.00  0.00
0.00  0.02  0.71  0.28  0.00
0.00  0.00  0.24  0.64  0.11
0.00  0.00  0.00  0.20  0.80

0 months to 24 months

0.75  0.25  0.00  0.00  0.00
0.44  0.39  0.17  0.00  0.00
0.00  0.02  0.68  0.30  0.00
0.00  0.00  0.29  0.62  0.09
0.00  0.00  0.00  0.38  0.62

Treatment Group - General Modelling

0 months to 6 months

0.75  0.25  0.00  0.00  0.00
0.13  0.13  0.75  0.00  0.00
0.00  0.07  0.57  0.37  0.00
0.00  0.00  0.35  0.32  0.32
0.00  0.00  0.00  0.58  0.42

6 months to 24 months

0.82  0.18  0.00  0.00  0.00
0.60  0.20  0.20  0.00  0.00
0.00  0.02  0.81  0.17  0.00
0.00  0.00  0.21  0.70  0.09
0.00  0.00  0.00  0.31  0.69

0 months to 24 months

0.79  0.21  0.00  0.00  0.00
0.31  0.15  0.54  0.00  0.00
0.00  0.04  0.72  0.24  0.00
0.00  0.00  0.27  0.54  0.19
0.00  0.00  0.00  0.43  0.57

Control Group - Specific Tridiagonal Modelling

0 months to 6 months

0.61  0.39  0.00  0.00  0.00
0.57  0.29  0.14  0.00  0.00
0.00  0.04  0.61  0.35  0.00
0.00  0.00  0.35  0.61  0.04
0.00  0.00  0.00  0.39  0.61

6 months to 24 months

0.71  0.29  0.00  0.00  0.00
0.36  0.45  0.18  0.00  0.00
0.00  0.05  0.71  0.24  0.00
0.00  0.00  0.24  0.71  0.05
0.00  0.00  0.00  0.29  0.71

0 months to 24 months

0.67  0.33  0.00  0.00  0.00
0.44  0.39  0.17  0.00  0.00
0.00  0.05  0.67  0.29  0.00
0.00  0.00  0.29  0.67  0.05
0.00  0.00  0.00  0.33  0.67
Treatment Group - Tridiagonal Modelling

0 months to 6 months

0.48  0.52  0.00  0.00  0.00
0.13  0.13  0.75  0.00  0.00
0.00  0.18  0.48  0.33  0.00
0.00  0.00  0.33  0.48  0.18
0.00  0.00  0.00  0.52  0.48

6 months to 24 months

0.76  0.24  0.00  0.00  0.00
0.60  0.20  0.20  0.00  0.00
0.00  0.05  0.76  0.19  0.00
0.00  0.00  0.19  0.76  0.05
0.00  0.00  0.00  0.24  0.76

0 months to 24 months

0.65  0.35  0.00  0.00  0.00
0.31  0.15  0.54  0.00  0.00
0.00  0.11  0.65  0.25  0.00
0.00  0.00  0.25  0.65  0.11
0.00  0.00  0.00  0.35  0.65

Bibliography

[1] Bhat, Narayan U. (1984). Elements of Applied Stochastic Processes, 2nd edition. John Wiley & Sons.
[2] Kastrukoff, L., et al. (1988). Unpublished manuscript. UBC Hospital.
[3] Lehmann, E. L. (1975). Nonparametrics: Statistical Methods Based on Ranks. Holden-Day Inc., San Francisco, California.
[4] McCullagh, P. (1980). Regression Models for Ordinal Data. J.R. Statist. Soc. B, 42, No. 2, pp. 109-142.
[5] Walsh, G. R. (1975). Methods of Optimization. John Wiley & Sons.
[6] Whittle, P. (1955). Some Distribution and Moment Formulae for the Markov Chain. J.R. Statist. Soc. B, 17, pp. 235-242.
[7] The Statistical Advisory Service (1988). Multiple Sclerosis Dataset. Unpublished manuscript, University of Manitoba.
