Time-Varying Exposure Subject to Misclassification
Bias Characterization and Adjustment

by

Eric Cormier

B.Sc. (Honors), The University of Victoria, 2009

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
in
The Faculty of Graduate Studies
(Statistics)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)
August 2010

© Eric Cormier 2010

Abstract

Measurement error occurs frequently in observational studies investigating the relationship between exposure variables and a clinical outcome. Error-prone observations on the explanatory variable may lead to biased estimation and loss of power in detecting the impact of an exposure variable. When the exposure variable is time-varying, the impact of misclassification is complicated and significant. This increases uncertainty in assessing the consequences of ignoring measurement error associated with observed data, and brings difficulties to adjustment for misclassification.

In this study we considered situations in which the exposure is time-varying and nondifferential misclassification occurs independently over time. We determined how misclassification biases the exposure-outcome relationship through probabilistic arguments and then characterized the effect of misclassification as the model parameters vary. We show that misclassification of time-varying exposure measurements has a complicated effect when estimating the exposure-disease relationship. In particular the bias toward the null seen in the static case is not observed.

After misclassification had been characterized we developed a means to adjust for misclassification by recreating, with greatest likelihood, the exposure path of each subject. Our adjustment uses hidden Markov chain theory to quickly and efficiently reduce the number of misclassified states and reduce the effect of misclassification on estimating the disease-exposure relationship.

The method we propose makes use of only the observed misclassified exposure data and no validation data needs to be obtained.
This is achieved by estimating switching probabilities and misclassification probabilities from the observed data. When these estimates are obtained the effect of misclassification can be determined through the characterization of the effect of misclassification presented previously. We can also directly adjust for misclassification by recreating the most likely exposure path using the Viterbi algorithm.

The methods developed in this dissertation allow the effect of misclassification on estimating the exposure-disease relationship to be determined. The adjustment accounts for misclassification by reducing the number of misclassified states and allows the exposure-disease relationship to be estimated significantly more accurately. It does this without the use of validation data and is easy to implement in existing statistical software.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgments
Dedication
1 Introduction
  1.1 Problem Formulation
2 Bias Determination
  2.1 Bias Calculation
3 Bias Characterization
  3.1 Bias Characterization for Linear Outcome Model
    3.1.1 Effect of the Exposure Switching Probability
    3.1.2 Effect of the Sensitivity and Specificity
  3.2 Bias Characterization for Linear Outcome Model with Lag Term
    3.2.1 Effect of the Exposure Switching Probability
    3.2.2 Effect of the Sensitivity and Specificity
  3.3 The Effect of the Number of Exposure Measurements
4 Discrete-Time Hidden Markov Adjustment
  4.1 Discrete-Time Hidden Markov Process
  4.2 Inference for Discrete-Time Hidden Markov Process
  4.3 Recreating the Path Through True Exposure States
  4.4 Adjusting for Misclassification: True Exposure Path Recreation
    4.4.1 Data Simulation
    4.4.2 Simulation Results
5 Continuous-Time Hidden Markov Adjustment
  5.1 Continuous-Time Markov Process
  5.2 Maximum Likelihood Estimation
  5.3 Continuous Time Hidden Markov Process
  5.4 Inference for Continuous-Time Hidden Markov Process
  5.5 Recreating the Exposure Path for Continuous Time Hidden Markov Processes
  5.6 Adjusting for Misclassification: True Exposure Path Recreation in Continuous Time
    5.6.1 Data Simulation
    5.6.2 Simulation Results
6 Conclusion and Future Work
Bibliography
Appendix
A R Code for Bias Determination

List of Tables

4.1 Simulation results for reducing misclassification using discrete adjustment
4.2 Simulation results for discrete adjustment with no lag term
4.3 Simulation results for discrete adjustment with misspecified no lag term
4.4 Simulation results for discrete adjustment with lag term
4.5 Simulation results for discrete adjustment with misspecified lag term
5.1 Simulation results of continuous time hidden Markov parameters: Case 1
5.2 Simulation results for continuous recreation: Case 1
5.3 Simulation results for continuous time adjustment with no lag term
5.4 Simulation results of continuous time hidden Markov parameters: Case 2
5.5 Simulation results for continuous recreation: Case 2
5.6 Simulation results for continuous time adjustment with misspecified no lag term
5.7 Simulation results of continuous time hidden Markov parameters: Case 3
5.8 Simulation results for continuous recreation: Case 3
5.9 Simulation results for continuous time adjustment with lag term
5.10 Simulation results of continuous time hidden Markov parameters: Case 4
5.11 Simulation results for continuous recreation: Case 4
5.12 Simulation results for continuous time adjustment with misspecified lag term

List of Figures

3.1 Coefficient Magnitudes
3.2 Common Switching Probability when (SN = 0.8, SP = 0.95)
3.3 Switching Probabilities for $\beta_4$ when (SN = 0.8, SP = 0.95)
3.4 Switching Probabilities for $\beta_3$ when (SN = 0.8, SP = 0.95)
3.5 Switching Probabilities for $\beta_{34}$ when (SN = 0.8, SP = 0.95)
3.6 Switching Probabilities for Bias when (SN = 0.8, SP = 0.95)
3.7 Switching Probabilities for $\beta_4$ when (SN = 0.95, SP = 0.8)
3.8 Switching Probabilities for $\beta_4$ when (SN = 0.9, SP = 0.9)
3.9 Switching Probabilities for $\beta_3$ when (SN = 0.95, SP = 0.8)
3.10 Switching Probabilities for $\beta_3$ when (SN = 0.9, SP = 0.9)
3.11 Effect of SN and SP on Bias when $\phi = 0.2$
3.12 Effect of SN and SP on Bias when $\phi = 0.5$
3.13 Effect of SN and SP on Determination of $\beta_4$ when $\phi = 0.2$
3.14 Effect of SN and SP on Determination of $\beta_4$ when $\phi = 0.8$
3.15 Effect of SN and SP on Determination of $\beta_3$ when $\phi = 0.2$
3.16 Effect of SN and SP on Determination of $\beta_3$ when $\phi = 0.8$
3.17 Effect of SN and SP on Determination of $\beta_{34}$ when $\phi = 0.2$
3.18 Coefficient Magnitudes for Lagged Model
3.19 Effect of Switching Probabilities on $\beta_4$ for Lagged Model
3.20 Effect of Switching Probabilities on $\beta_3$ for Lagged Model
3.21 Effect of Switching Probabilities on $\beta_2$ for Lagged Model
3.22 Effect of Switching Probabilities on Bias for Lagged Model
3.23 Effect of SN and SP on Determination of $\beta_3$ for Lagged Model when $\phi = 0.2$
3.24 Effect of SN and SP on Determination of $\beta_3$ for Lagged Model when $\phi = 0.5$
3.25 Effect of SN and SP on Determination of $\beta_3$ for Lagged Model when $\phi = 0.8$
3.26 Effect of SN and SP on Determination of $\beta_4$ for Lagged Model when $\phi = 0.2$
3.27 Effect of SN and SP on Determination of $\beta_4$ for Lagged Model when $\phi = 0.5$
3.28 Effect of SN and SP on Determination of $\beta_4$ for Lagged Model when $\phi = 0.8$
3.29 Effect of SN and SP on Determination of Model Bias for Lagged Model when $\phi = 0.2$
3.30 Effect of SN and SP on Determination of Model Bias for Lagged Model when $\phi = 0.5$
3.31 Effect of SN and SP on Determination of $\beta_2$ for Lagged Model when $\phi = 0.2$
3.32 Effect of the Number of Exposure Measurements on Bias, $\beta_n$ and $\beta_{n-1}$

Acknowledgments

I would like to express my supreme gratitude to my supervisor Professor Paul Gustafson and my co-supervisor Dr. Nhu Le whose support and suggestions have been an immeasurable help throughout this entire process.

I would also like to thank Professors John Petkau, Lang Wu and Ruben Zamar for their support throughout my masters program.

I am also grateful to my fellow graduate students Corrine and Sky whose collaboration throughout my masters program has been very beneficial.

Eric Cormier
The University of British Columbia
August 2010

To My Family and Friends Who Enrich My Life Every Single Day.

Chapter 1
Introduction

In many epidemiological and clinical studies we wish to model a health-related outcome, $Y$, dependent on an explanatory variable corresponding to some exposure status, $X$, and certain measured potential confounders $Z$. Sometimes the measured exposure status, denoted by $X^*$, is an imperfect surrogate for the actual exposure $X$. This is known as exposure misclassification and it is very important to account for in these studies. Carroll, Ruppert, Stefanski and Crainiceanu (2006) found that measurement error in the explanatory variable:

• causes bias in parameter estimation for statistical models;
• masks the features of the data;
• leads to a profound loss of power for detecting relationships between variables.

An example of this is when a prescription is dispensed to a patient but the medication is not taken.
If the exposure measure is taken from the prescription records, then the data would record the patient as exposed to the treatment when actually no exposure occurred. This will cause our estimates to be biased and is a serious problem in many studies. Hence the goal of adjustment for mis-measurement is to achieve roughly unbiased estimates to reveal the relationship between $Y$ and $X$ indirectly, based on the measurements of $Y$, $X^*$ and perhaps other correctly recorded covariates $Z$. In binary contexts the degree of misclassification is determined by the sensitivity and specificity, $(SN_i, SP_i)$ respectively:

$$SN_i = \Pr(X^* = 1 \mid X = 1, Y = i), \qquad SP_i = \Pr(X^* = 0 \mid X = 0, Y = i), \tag{1.1}$$

for $i = 0, 1$.

In simple contexts it is reasonable to assume the actual exposure and the recorded exposure are both binary and $X^*$ depends on $(X, Y, Z)$ only through $X$, which is known as nondifferential misclassification. Then

$$SN = \Pr(X^* = 1 \mid X = 1), \qquad SP = \Pr(X^* = 0 \mid X = 0). \tag{1.2}$$

In this situation it is known that exposure misclassification biases the estimated exposure-disease association toward the null. That is, the attenuation factor

$$AF = \frac{X^* \text{ coefficient in the } (Y \mid X^*, Z) \text{ regression}}{X \text{ coefficient in the } (Y \mid X, Z) \text{ regression}} \tag{1.3}$$

satisfies $0 < AF < 1$ (Gustafson 2004). Therefore there is a tendency to report an artificially weak association between the exposure and response when measurement error on the exposure is ignored. This bias increases when either the sensitivity or specificity of the classification decreases or if the correlation between $X$ and $Z$ increases. Furthermore, when the exposure prevalence approaches 0 or 1, measurement error induces serious and unstable attenuation toward the null. It is also known that in these contexts adjusting for exposure misclassification has little to no effect on the 'nearer to null' endpoint of the interval estimate for the coefficient of $X$. That is, misclassification adjustment will not strengthen the evidence for the existence of an exposure-disease association (Gustafson 2004).
The simple contexts defined above for modeling misclassification bias are very restrictive and most medical studies do not fall into this category. Greenland and Gustafson (2006) found that no general conclusion could be made regarding the direction of estimated association when the nondifferential misclassification requirement on a binary exposure is not satisfied, or when the exposure variable is polychotomous. Further discussion of measurement error in continuous or polychotomous exposure variables, or when misclassification is differential, is found in Gustafson (2004) and Carroll et al. (2006).

To account for measurement error in all the situations described above, complete information on the outcome variable $Y$, true exposure $X$ and surrogate exposure $X^*$ is needed for a small proportion of the data (validation sample). The true exposure status for the majority of study subjects (main study) remains unobservable or cannot be precisely measured.

1.1 Problem Formulation

In this thesis we restrict our attention to misclassification on a time-varying binary exposure variable with no other measured covariates and we assume no measurement error arising in the outcome of interest $Y$. When the actual binary exposure status is time-varying, the rules describing how the bias affects results in the static case no longer hold.

We assume that the actual binary exposure status across time $X_1, X_2, X_3, \ldots$ is a Markov chain with switching probabilities

$$P(X_i = 1 \mid X_{i-1} = 0) = \phi_1, \qquad P(X_i = 0 \mid X_{i-1} = 1) = \phi_2, \tag{1.4}$$

and that these time-varying exposures are misclassified ($X_j \to X^*_j$) independently over time. This misclassification is assumed to be nondifferential and can be characterized by $(SN, SP)$.
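
The exposure and misclassification mechanism just described can be illustrated with a small simulation. This is only a sketch and not code from the thesis: the function name `simulate_exposure` and its arguments are hypothetical, and starting the chain from its stationary distribution is an assumption borrowed from the later chapters.

```python
import numpy as np


def simulate_exposure(n, phi1, phi2, sn, sp, rng):
    """Simulate a true binary exposure path and its misclassified version.

    phi1 = P(X_i = 1 | X_{i-1} = 0), phi2 = P(X_i = 0 | X_{i-1} = 1),
    sn = P(X* = 1 | X = 1), sp = P(X* = 0 | X = 0).
    """
    x = np.empty(n, dtype=int)
    # Assumption: the chain starts in its stationary distribution.
    x[0] = rng.random() < phi1 / (phi1 + phi2)
    for i in range(1, n):
        if x[i - 1] == 0:
            x[i] = rng.random() < phi1    # switch from unexposed to exposed
        else:
            x[i] = rng.random() >= phi2   # remain exposed unless a switch occurs
    # Misclassify each time point independently and nondifferentially.
    u = rng.random(n)
    x_star = np.where(x == 1, u < sn, u >= sp).astype(int)
    return x, x_star
```

For example, `x, x_star = simulate_exposure(10, 0.2, 0.2, 0.8, 0.95, np.random.default_rng(2010))` draws one true path and its recorded version; with `sn = sp = 1` the two paths coincide.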
To characterize the effect of exposure misclassification we consider a linear outcome model of the form

$$E(Y_i \mid X_1, \ldots, X_i) = \mu + \alpha X_i, \tag{1.5}$$

with intercept $\mu$ and exposure coefficient $\alpha$, and a linear outcome model including lag terms such as

$$E(Y_i \mid X_1, \ldots, X_i) = \mu + \gamma X_{i-1} + \alpha X_i, \tag{1.6}$$

where $\gamma$ is the coefficient of the lagged exposure.

Even in this simple case, when very strong assumptions are made about the dependence of exposure status over time, misclassification of the time-varying exposure will have a significant effect. This effect will be different from the attenuation toward the null seen in the static case.

In this thesis we characterize and adjust for the effect of time-varying exposure misclassification and thereby increase the accuracy of the estimates and allow for correct inference to be made about the effect of time-varying exposure. This adjustment is obtained without the use of a validation study to determine the misclassification model, $(X^*_{1:n} \mid X_{1:n})$, and is easily implemented in statistical software.

The thesis is organized as follows. Chapter 2 provides general results on how the effect of misclassification was determined. Chapter 3 characterizes the effect of misclassification for specific examples and depicts general trends that result. Chapters 4 and 5 describe how to adjust for misclassification using Markov chain theory and display results from simulation studies. Chapter 6 provides overall conclusions and remarks on further research.

Chapter 2
Bias Determination for Time-Varying Exposure Misclassification

Let the outcome variable $Y$ be dependent on exposure variable $X$. If $Y$ is modeled on the misclassified exposure variable $X^*$ then bias results. In the simple case this bias and how misclassification affects the exposure-outcome relationship can be computed exactly.

2.1 Bias Calculation

Assume the actual binary exposure status across time is $X_{1:n} = (X_1, X_2, X_3, \ldots, X_n)$ and that these time-varying exposures are misclassified ($X_j \to X^*_j$) as $X^*_{1:n} = (X^*_1, X^*_2, X^*_3, \ldots, X^*_n)$. Assume this misclassification occurs independently over time and is nondifferential.
Then for a linear outcome model

$$E(Y_n \mid X_{1:n}) = X_{1:n}\,\alpha, \tag{2.1}$$

where $\alpha$ denotes the vector of coefficients of the true outcome model, the outcome variable $Y$ depends on $X^*_{1:n}$ through the relationship

$$E(Y_n \mid X^*_{1:n}) = E\{E(Y_n \mid X_{1:n}, X^*_{1:n}) \mid X^*_{1:n}\} = E\{(X_{1:n}\,\alpha) \mid X^*_{1:n}\} = E\{X_{1:n} \mid X^*_{1:n}\}\,\alpha.$$

The joint distribution of $X$ and $X^*$ is calculated by

$$\Pr(X_{1:n}, X^*_{1:n}) = \left[\Pr(x_1) \prod_{i=2}^{n} \Pr(x_i \mid x_1, \ldots, x_{i-1})\right] \prod_{i=1}^{n} \Pr(x^*_i \mid x_i), \tag{2.2}$$

where

$$\Pr(x^*_i \mid x_i) = \begin{cases} SN^{x^*_i}\,(1 - SN)^{1 - x^*_i} & \text{if } x_i = 1, \\ (1 - SP)^{x^*_i}\,SP^{1 - x^*_i} & \text{if } x_i = 0, \end{cases} \tag{2.3}$$

and $\Pr(x_i \mid x_1, \ldots, x_{i-1})$ is determined by the switching probabilities.

In this thesis we are interested in the special case where the exposure is governed by a Markov chain, so that $\Pr(x_i \mid x_1, \ldots, x_{i-1}) = \Pr(x_i \mid x_{i-1})$, and $\Pr(x_1)$ is taken to be the stationary probability distribution of the ergodic Markov chain.

To determine the effect of misclassification, the joint probability distribution is used to tabulate $E\{X_{1:n} \mid X^*_{1:n}\}$ for all $2^n$ possible values of $X^*_{1:n}$. The $2^n$ resulting values of $E(Y_n \mid X^*_{1:n})$ are then expressed via the binary expansion with $2^n$ coefficients. This one-to-one correspondence determines the relationship between $Y$ and $X^*$,

$$E(Y_n \mid X^*_{1:n}) = \beta_n X^*_n + \beta_{n-1} X^*_{n-1} + \cdots + \beta_{12 \cdots n}(X^*_1 X^*_2 \cdots X^*_n). \tag{2.4}$$

This calculation allows us to determine the coefficients $\beta$ and the bias that results from using misclassified time-varying exposure measurements, and how sensitivity, specificity and the Markov switching probabilities affect these quantities. Code for this calculation is presented in Appendix A.

Chapter 3
Bias Characterization for Time-Varying Exposure Misclassification

When examining a time-varying exposure that is subject to misclassification the normal rules of attenuation toward the null shown for the static case in Gustafson (2004) do not apply. This is true even in the most simplistic case. Assume that the actual binary exposure status across time is $X_1, X_2, X_3, \ldots$ and that these time-varying exposures are misclassified ($X_j \to X^*_j$) independently over time. This misclassification is assumed to be nondifferential and can be characterized by $(SN, SP)$.
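
The exact calculation described in Section 2.1 can be sketched in a few lines. The thesis's own implementation is the R code of Appendix A, which is not reproduced here; the Python below is an independent sketch under the same Markov and independent nondifferential misclassification assumptions, and the names `joint_prob` and `misclassified_coefficients` are hypothetical.

```python
import itertools

import numpy as np


def joint_prob(x, x_star, phi1, phi2, sn, sp):
    """Pr(X_{1:n} = x, X*_{1:n} = x_star), following equations (2.2)-(2.3)."""
    pi1 = phi1 / (phi1 + phi2)           # stationary probability of exposure
    p = pi1 if x[0] == 1 else 1.0 - pi1
    for prev, cur in zip(x[:-1], x[1:]):  # Markov transitions
        if prev == 0:
            p *= phi1 if cur == 1 else 1.0 - phi1
        else:
            p *= phi2 if cur == 0 else 1.0 - phi2
    for xi, xs in zip(x, x_star):         # independent misclassification
        if xi == 1:
            p *= sn if xs == 1 else 1.0 - sn
        else:
            p *= (1.0 - sp) if xs == 1 else sp
    return p


def misclassified_coefficients(weights, phi1, phi2, sn, sp):
    """Coefficients of E(Y_n | X*_{1:n}) in the binary expansion.

    weights[i] is the coefficient of X_{i+1} in the true linear outcome model
    (assumed here to have no intercept). Returns a dict mapping a tuple of
    1-based indices to its coefficient; the empty tuple is the intercept.
    """
    n = len(weights)
    paths = list(itertools.product((0, 1), repeat=n))
    subsets = [s for r in range(n + 1)
               for s in itertools.combinations(range(n), r)]
    true_means = np.array([np.dot(weights, x) for x in paths])
    design = np.zeros((len(paths), len(subsets)))
    target = np.zeros(len(paths))
    for row, x_star in enumerate(paths):
        probs = np.array([joint_prob(x, x_star, phi1, phi2, sn, sp)
                          for x in paths])
        # E(Y_n | X* = x_star) as a probability-weighted average over true paths.
        target[row] = np.dot(probs, true_means) / probs.sum()
        for col, s in enumerate(subsets):
            # Product of the x*_i in subset s (1 for the empty subset).
            design[row, col] = all(x_star[i] == 1 for i in s)
    beta = np.linalg.solve(design, target)
    return {tuple(i + 1 for i in s): b for s, b in zip(subsets, beta)}
```

For instance, an outcome that depends only on the fourth of four exposures with coefficient one corresponds to `weights = [0, 0, 0, 1]`, and `misclassified_coefficients([0, 0, 0, 1], 0.2, 0.2, 0.8, 0.95)[(4,)]` returns the coefficient of $X^*_4$. The enumeration over all $2^n$ paths is chosen for transparency and is only practical for small $n$.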
To characterize the effect of exposure misclassification we consider a linear outcome model and a linear outcome model including a lag term. Even in this simple case misclassification of the time-varying exposure will have a significant effect. This is shown for specific examples in the next two sections.

3.1 Bias Characterization for Linear Outcome Model

To show the effect of time-varying exposure misclassification we consider a specific example. Let the binary exposure $X_1, X_2, X_3, \ldots$ be a Markov chain with switching probabilities

$$P(X_i = 1 \mid X_{i-1} = 0) = \phi_1, \qquad P(X_i = 0 \mid X_{i-1} = 1) = \phi_2, \tag{3.1}$$

for $i = 2, 3, \ldots, n$. We assume that the Markov chain is in its stationary distribution, which implies that

$$P(X_1 = 0) = \frac{\phi_2}{\phi_1 + \phi_2}, \qquad P(X_1 = 1) = \frac{\phi_1}{\phi_1 + \phi_2}. \tag{3.2}$$

We further assume that the outcome variable only depends on the current exposure variable, with coefficient $\alpha = 1$. Then at the fourth exposure observation we have

$$E(Y_4 \mid X_1, \ldots, X_4) = X_4. \tag{3.3}$$

The choice of $\alpha = 1$ is made without loss of generality and all results hold for arbitrary $\alpha$. The choice of the fourth observation is also arbitrary and was made for computational convenience. Results hold for $n$ exposure observations.

Equation (3.3) implies that the relationship of the outcome variable dependent on the misclassified exposure will have the form

$$E(Y_4 \mid X^*_1, \ldots, X^*_4) = \beta_4 X^*_4 + \beta_3 X^*_3 + \cdots + \beta_{1234}(X^*_1 X^*_2 X^*_3 X^*_4), \tag{3.4}$$

where if $(SN = 1, SP = 1)$ then $X^*_i = X_i$, $\beta_4 = 1$ and $\beta_{(-4)} = \vec{0}$, with $\beta_{(-4)}$ denoting the vector of all coefficients other than $\beta_4$.

In the linear outcome model the largest coefficients are $\beta_4$, $\beta_3$ and $\beta_{34}$. All other coefficients are estimated to be close to zero.
Figure 3.1 displays the coefficient magnitudes when $(SN = 0.8, SP = 0.95)$ and common switching probability $\phi_1 = \phi_2 = \phi = 0.2$.

Figure 3.1: Coefficient magnitudes, when $(SN = 0.8, SP = 0.95)$ and $\phi = 0.2$, showing $\beta_4$, $\beta_3$ and $\beta_{34}$ to be the largest coefficients. [Plot of the value of each misclassification coefficient $\beta_j$, from $\beta_0$ through $\beta_{1234}$.]

3.1.1 Effect of the Exposure Switching Probability

To determine how the switching probabilities of the exposure Markov chain affect the misclassification coefficients and model bias, we calculate these quantities for fixed $(SN, SP)$ as $\phi_1$ and $\phi_2$ vary. We define model bias as the sum of the absolute differences between the coefficients (main effects and all interaction terms) of the linear outcome model when misclassification is present ($\beta_j$) and when misclassification is not present ($\alpha_j$),

$$\text{Bias} = \sum_i |\beta_i - \alpha_i|. \tag{3.5}$$

In all calculations $\Pr(X_1)$ is taken to be the stationary probability distribution of the ergodic Markov chain defined in equation (3.2).

If the two exposure switching probabilities are equal ($\phi_1 = \phi_2 = \phi$), Figure 3.2 shows how the coefficient determination changes with $\phi$ when $(SN = 0.8, SP = 0.95)$. From Figure 3.2 we can see that $\beta_4$ is determined to be closest to the coefficient in the correctly classified exposure model when $\phi \approx 0.3$. The other misclassification coefficients are closest to the coefficients in the correctly classified exposure model at the point where the bias is minimized, $\phi = 0.5$.

When $\phi = 0.5$ the next state is not affected by the previous state so there is no dependence in the exposure Markov chain. This means that for this model only the last exposure is important and we are essentially in the static case.
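
The static-case value can be checked by hand. As a sketch, assume $\phi = 0.5$, so that $X_4$ is independent of the earlier exposures with $\Pr(X_4 = 1) = 0.5$, together with $(SN = 0.8, SP = 0.95)$ and a true coefficient of one. Then $E(Y_4 \mid X^*_{1:4})$ depends only on $X^*_4$, and by Bayes' rule

$$\beta_4 = \Pr(X_4 = 1 \mid X^*_4 = 1) - \Pr(X_4 = 1 \mid X^*_4 = 0) = \frac{0.5 \cdot 0.8}{0.5 \cdot 0.8 + 0.5 \cdot 0.05} - \frac{0.5 \cdot 0.2}{0.5 \cdot 0.2 + 0.5 \cdot 0.95} = \frac{0.4}{0.425} - \frac{0.1}{0.575} \approx 0.941 - 0.174 = 0.767,$$

which is below the true coefficient of one, while every coefficient involving $X^*_1$, $X^*_2$ or $X^*_3$ is zero.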
This causes the familiar attenuation toward the null to occur. This is reflected by $\beta_{(-4)} \approx \vec{0}$ and $\beta_4 = 0.77$.

When both switching probabilities $(\phi_1, \phi_2)$ are allowed to vary independently, similar effects are shown. Figure 3.3 shows a determination surface for the misclassification coefficient $\beta_4$ when both switching probabilities vary and $(SN = 0.8, SP = 0.95)$. We can see that $\beta_4$ increases as switching probability 1, $\phi_1$, increases until $\phi_1 \approx 0.3$ and then decreases after that. We can also see that $\beta_4$ remains almost constant as $\phi_2$ varies except when $\phi_2$ gets very small. When $\phi_2$ is small it has a strong negative effect on the determination of $\beta_4$. Figure 3.4 shows that the magnitude of $\beta_3$ remains close to zero unless $\phi_2$ is small. The magnitude of $\beta_3$ becomes greatest when $\phi_1 \approx 0.3$ and $\phi_2$ is low. The magnitude of $\beta_{34}$ also becomes greatest when $\phi_1 \approx 0.3$ and $\phi_2$ is low. This is shown in Figure 3.5. This implies that as $\phi_2$ approaches its lower limit the effect of exposure misclassification becomes greatest. This is reflected in Figure 3.6, where we can see that the bias increases significantly when $\phi_2$ is close to zero. The bias is most apparent when both switching probabilities approach zero or one.

Figure 3.2: Effect of switching probability, $\phi$, on determination of $\beta_4$, $\beta_3$, $\beta_{34}$ and overall Bias for time-varying exposure misclassification when $(SN = 0.8, SP = 0.95)$. [Four panels plotting Bias, $\beta_4$, $\beta_3$ and $\beta_{34}$ against switching probability $\phi$.]

Figure 3.3: Effect of switching probabilities, $(\phi_1, \phi_2)$, on determination of $\beta_4$ for time-varying exposure misclassification when $(SN = 0.8, SP = 0.95)$.
Figure 3.4: Effect of switching probabilities, $(\phi_1, \phi_2)$, on determination of $\beta_3$ for time-varying exposure misclassification when $(SN = 0.8, SP = 0.95)$.

Figure 3.5: Effect of switching probabilities, $(\phi_1, \phi_2)$, on determination of $\beta_{34}$ for time-varying exposure misclassification when $(SN = 0.8, SP = 0.95)$.

Figure 3.6: Effect of switching probabilities, $(\phi_1, \phi_2)$, on bias for time-varying exposure misclassification when $(SN = 0.8, SP = 0.95)$.

When the values of $(SN, SP)$ change, the determination surfaces of the misclassification coefficients differ with respect to the switching probabilities. The shape of the determination surface for bias remains mostly unaffected by a change in $(SN, SP)$, but the determination surfaces for $\beta_4$, $\beta_3$ and $\beta_{34}$ shift depending on the relative magnitude of $(SN, SP)$. When SN is small relative to SP, such as $(SN = 0.8, SP = 0.95)$, Figure 3.3 shows that the maximum of $\beta_4$ occurs around $\phi_1 \approx 0.3$. When SN is large relative to SP, such as $(SN = 0.95, SP = 0.8)$, the maximum of $\beta_4$ is shifted right and occurs around $\phi_1 \approx 0.7$, as shown in Figure 3.7. When SN and SP are approximately equal then the determination surface for $\beta_4$ is roughly symmetric and the maximum occurs when $\phi_1 \approx 0.5$, as shown in Figure 3.8.
Similar asymmetric behavior is displayed by $\beta_3$ and $\beta_{34}$, as shown by Figure 3.9 and Figure 3.10. This asymmetric behavior is believed to occur because by changing $(SN, SP)$ we are just relabeling the states that are measured with precision. When SN is smaller than SP, the unexposed state is measured with more precision and therefore misclassification has the least effect when $\phi_1$ is low. When SN is larger than SP, misclassification has the least effect when $\phi_1$ is high.

Figure 3.7: Effect of switching probabilities, $(\phi_1, \phi_2)$, on determination of $\beta_4$ for time-varying exposure misclassification when $(SN = 0.95, SP = 0.8)$.

Figure 3.8: Effect of switching probabilities, $(\phi_1, \phi_2)$, on determination of $\beta_4$ for time-varying exposure misclassification when $(SN = 0.9, SP = 0.9)$.

Figure 3.9: Effect of switching probabilities, $(\phi_1, \phi_2)$, on determination of $\beta_3$ for time-varying exposure misclassification when $(SN = 0.95, SP = 0.8)$.

Figure 3.10: Effect of switching probabilities, $(\phi_1, \phi_2)$, on determination of $\beta_3$ for time-varying exposure misclassification when $(SN = 0.9, SP = 0.9)$.
3.1.2 Effect of the Sensitivity and Specificity

To determine how SN and SP affect the impact of misclassification, for fixed switching probability $\phi$, determination surfaces are created as SN and SP vary. Figure 3.11 shows that when $\phi = 0.2$ the effect of misclassification is greatest at low SN and high SP. A similar surface is produced when $\phi = 0.8$. This shows that when the switching probability is not around 0.5, bias does not behave in a linear fashion with respect to $(SN, SP)$ and high sensitivity is necessary for bias minimization. Figure 3.12 shows that when $\phi = 0.5$ bias does behave in a linear way, decreasing as either SN or SP increases.

Figure 3.11: Effect of sensitivity and specificity on bias for time-varying exposure misclassification when $\phi = 0.2$.

Figure 3.12: Effect of sensitivity and specificity on bias for time-varying exposure misclassification when $\phi = 0.5$.

Sensitivity and specificity have a linear effect on the determination of $\beta_4$. When $\phi = 0.2$, high values of specificity cause misclassification to have the least impact on the determination. When $\phi = 0.5$, sensitivity and specificity have an equal effect on the determination of $\beta_4$, and when $\phi = 0.8$, high values of sensitivity cause misclassification to have the least impact on the determination of $\beta_4$. This is shown in Figure 3.13 and Figure 3.14.

Figure 3.13: Effect of sensitivity and specificity on determination of $\beta_4$ for time-varying exposure misclassification when $\phi = 0.2$.
Figure 3.14: Effect of sensitivity and specificity on determination of $\beta_4$ for time-varying exposure misclassification when $\phi = 0.8$.

The determination of $\beta_3$ is affected by SN and SP in a non-linear way. Figure 3.15 shows that when $\phi = 0.2$, $\beta_3$ is positive and approaches zero as SN approaches one. Figure 3.16 shows that when $\phi = 0.8$, $\beta_3$ is negative and also approaches zero as SN approaches one. When $\phi = 0.5$, $\beta_3$ is approximately zero and $(SN, SP)$ have no effect on the determination.

Figure 3.15: Effect of sensitivity and specificity on determination of $\beta_3$ for time-varying exposure misclassification when $\phi = 0.2$.

Figure 3.16: Effect of sensitivity and specificity on determination of $\beta_3$ for time-varying exposure misclassification when $\phi = 0.8$.

The determination surface for $\beta_{34}$ formed as $(SN, SP)$ vary is hyperbolic and the value of $\beta_{34}$ cannot easily be predicted based on values of $(SN, SP)$. This is seen in Figure 3.17. When $\phi = 0.5$, $\beta_{34}$ is approximately zero and $(SN, SP)$ have no effect on its determination.

When determining the effect of sensitivity and specificity, symmetric behavior as $\phi$ moves away from 0.5 is observed. The switching probabilities and $(SN, SP)$ interact in a reciprocal way.
This can be explained by the symmetry obtained by simply relabeling the exposed and unexposed states.

Figure 3.17: Effect of sensitivity and specificity on determination of $\beta_{34}$ for time-varying exposure misclassification when $\phi = 0.2$. [Surface plot; axes: Sensitivity, Specificity, $\beta_{34}$.]

3.2 Bias Characterization for Linear Outcome Model with Lag Term

To determine the effect of misclassification on a linear outcome model with a lag term we consider a specific example. We again consider the binary exposure $X_1, X_2, X_3, \ldots$ to be a Markov chain with switching probabilities $\phi_1$ and $\phi_2$, and independent nondifferential misclassification determined by (SN, SP). We further assume that the outcome variable depends on the current exposure with coefficient one and on the previous (lagged) exposure with coefficient 0.5. Therefore at the fourth exposure observation we have

$$E(Y_4 \mid X_1, \ldots, X_4) = X_4 + 0.5 X_3. \tag{3.6}$$

This relationship implies that the outcome variable, expressed in terms of the misclassified exposure, will have the form

$$E(Y_4 \mid X^*_1, \ldots, X^*_4) = \beta_4 X^*_4 + \beta_3 X^*_3 + \ldots + \beta_{1234} (X^*_1 X^*_2 X^*_3 X^*_4), \tag{3.7}$$

where if (SN = 1, SP = 1) then $X^*_i = X_i$, $\beta_4 = 1$, $\beta_3 = 0.5$, and $\beta_j = 0$ otherwise. In this case we study the behavior of the main effects $\beta_4$, $\beta_3$ and $\beta_2$. Figure 3.18 shows that when (SN = 0.8, SP = 0.95) and $\phi = 0.2$ most of the other coefficients are estimated to be close to zero, with the exception of some interaction terms. Figure 3.18 also shows that under these conditions the value of $\beta_3$ is overestimated. This means that when lag terms are included, misclassification can cause the effect of previous exposure to be overestimated.
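For a chain of only four measurements, the coefficients in (3.7) can be obtained exactly by enumerating every true and observed exposure path. The Python sketch below is illustrative only and is not the code used to produce the figures in this chapter. The function name `misclassification_coefficients` and its arguments are hypothetical. It assumes that $\phi_1$ is the probability of switching from unexposed to exposed, that $\phi_2$ is the probability of switching back, and that $X_1$ is drawn from the stationary distribution.

```python
import itertools


def misclassification_coefficients(phi1, phi2, sn, sp, n=4, lag_effect=0.5):
    """Exact coefficients of the saturated model for E(Y_n | X*_1..X*_n).

    Assumes E(Y_n | X) = X_n + lag_effect * X_{n-1}, a two-state Markov
    exposure started in its stationary distribution, and independent
    nondifferential misclassification with sensitivity sn, specificity sp.
    Returns a dict keyed by index tuples, e.g. (3, 4) for beta_34.
    """
    patterns = list(itertools.product([0, 1], repeat=n))
    p1 = phi1 / (phi1 + phi2)  # stationary Pr(X = 1)

    def prob_true(x):
        # Probability of a true exposure path under the Markov chain.
        p = p1 if x[0] == 1 else 1.0 - p1
        for prev, cur in zip(x, x[1:]):
            switch = phi1 if prev == 0 else phi2
            p *= switch if cur != prev else 1.0 - switch
        return p

    def prob_obs(xs, x):
        # Probability of the observed path xs given the true path x.
        p = 1.0
        for obs, true in zip(xs, x):
            if true == 1:
                p *= sn if obs == 1 else 1.0 - sn
            else:
                p *= sp if obs == 0 else 1.0 - sp
        return p

    def cond_mean(xs):
        # E(Y_n | X* = xs), averaging over all true paths.
        num = den = 0.0
        for x in patterns:
            w = prob_true(x) * prob_obs(xs, x)
            num += w * (x[-1] + lag_effect * x[-2])
            den += w
        return num / den

    indices = range(1, n + 1)
    beta = {}
    for r in range(n + 1):
        for s in itertools.combinations(indices, r):
            # Moebius inversion over subsets t of s:
            # beta_s = sum_t (-1)^(|s|-|t|) E(Y_n | X*_i = 1 exactly for i in t)
            total = 0.0
            for k in range(r + 1):
                for t in itertools.combinations(s, k):
                    xs = tuple(1 if i in t else 0 for i in indices)
                    total += (-1) ** (r - k) * cond_mean(xs)
            beta[s] = total
    return beta


# Setting loosely matching Figure 3.18, assuming phi1 = phi2 = 0.2.
coefs = misclassification_coefficients(0.2, 0.2, sn=0.8, sp=0.95)
print(coefs[(4,)], coefs[(3,)], coefs[(2,)])
```

With perfect classification (SN = SP = 1) the sketch returns $\beta_4 = 1$, $\beta_3 = 0.5$ and zero for every other coefficient, which agrees with the statement following (3.7).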
Figure 3.18: Coefficient magnitudes when (SN = 0.8, SP = 0.95) and $\phi = 0.2$. [Plot of the value of each misclassification coefficient $\beta_j$, for $\beta_0, \beta_1, \ldots, \beta_{1234}$.]

3.2.1 Effect of the Exposure Switching Probability

To determine how the Markov switching probabilities affect the determination of the misclassification coefficients and model bias, we again calculate these quantities for fixed (SN, SP) as $\phi_1$ and $\phi_2$ vary. This characterization reveals many similarities between linear outcome models with lagged terms and linear outcome models without lagged terms. Figure 3.19 shows that when lagged terms are included, the determination of the leading coefficient ($\beta_4$) behaves much as it does when no lagged term is present. When (SN = 0.8, SP = 0.95) the familiar determination surface is seen, with a sharp increase to the maximum at $\phi_1 \approx 0.3$ and then a slow descent as $\phi_1$ increases. This shape is also observed for the determination surface of $\beta_3$, except that $\phi_2$ has a greater effect. As $\phi_2$ gets small, the value of $\beta_3$ increases rapidly, causing its effect to be overestimated. This can be seen in Figure 3.20. From Figure 3.3, Figure 3.19 and Figure 3.20 we can see that the misclassification coefficients for exposure measurements that are present in the correctly specified linear outcome model behave similarly regardless of whether lagged terms are present.

The last misclassification coefficient examined was $\beta_2$. The random variable $X_2$ is not in the linear outcome model, so if the exposure status is correctly classified then $\beta_2$ should be zero.
We can see from Figure 3.21 that the determination of $\beta_2$ behaves similarly to the determination of $\beta_3$ in the linear outcome model without a lagged term. The determinations of these coefficients correspond because both are coefficients for the first random variable that is not included in the true linear outcome model.

The effect of switching probabilities on linear outcome models with a lagged term has shown many similarities with the effect of switching probabilities on linear outcome models without lagged terms. Figure 3.22 shows that this correspondence also holds for the determination of bias.

Figure 3.19: Effect of switching probabilities, $(\phi_1, \phi_2)$, on determination of $\beta_4$ for time-varying exposure misclassification when (SN = 0.8, SP = 0.95). [Surface plot; axes: Switching Probability 1, Switching Probability 2, $\beta_4$.]

Figure 3.20: Effect of switching probabilities, $(\phi_1, \phi_2)$, on determination of $\beta_3$ for time-varying exposure misclassification when (SN = 0.8, SP = 0.95). [Surface plot; axes: Switching Probability 1, Switching Probability 2, $\beta_3$.]

Figure 3.21: Effect of switching probabilities, $(\phi_1, \phi_2)$, on determination of $\beta_2$ for time-varying exposure misclassification when (SN = 0.8, SP = 0.95). [Surface plot; axes: Switching Probability 1, Switching Probability 2, $\beta_2$.]
Figure 3.22: Effect of switching probabilities, $(\phi_1, \phi_2)$, on bias for time-varying exposure misclassification when (SN = 0.8, SP = 0.95). [Surface plot; axes: Switching Probability 1, Switching Probability 2, Bias.]

3.2.2 Effect of the Sensitivity and Specificity

To determine how SN and SP affect the impact of misclassification in a linear outcome model with a lag term, determination surfaces are created as SN and SP vary and $\phi$ is fixed. The coefficient of most interest in the outcome model with a lag term is $\beta_3$. Figure 3.24 and Figure 3.25 show that when $\phi$ is large, $\beta_3$ behaves similarly to $\beta_4$ in the linear model with no lag term. When $\phi$ is small something quite different occurs. When misclassification is present, random variables that are in the true model almost always have their effect on the outcome variable underestimated. This is not the case for $\beta_3$ when $\phi$ is small. Figure 3.23 shows that, depending on sensitivity and specificity, $\beta_3$ can be either underestimated or overestimated. This can cause a serious problem in analysis, because when misclassification is present it is unclear whether the lagged exposure effect is truly more important or truly less important than estimated.

Figure 3.23: Effect of sensitivity and specificity on determination of $\beta_3$ for time-varying exposure misclassification with lagged term when $\phi = 0.2$. [Surface plot; axes: Sensitivity, Specificity, $\beta_3$.]

Figure 3.24: Effect of sensitivity and specificity on determination of $\beta_3$ for time-varying exposure misclassification with lagged term when $\phi = 0.5$. [Surface plot; axes: Sensitivity, Specificity, $\beta_3$.]
Figure 3.25: Effect of sensitivity and specificity on determination of $\beta_3$ for time-varying exposure misclassification with lagged term when $\phi = 0.8$. [Surface plot; axes: Sensitivity, Specificity, $\beta_3$.]

The determination surfaces for $\beta_4$ and bias behave the same as in the previous model, as shown in Figure 3.26, Figure 3.27, Figure 3.28, Figure 3.29 and Figure 3.30. For $\beta_2$ we can see from Figure 3.31 that the determination surface behaves similarly to the determination of $\beta_3$ in the linear outcome model without a lagged term. The determinations of these coefficients correspond because both are coefficients for the closest exposure term that is not included in the true linear outcome model.

Figure 3.26: Effect of sensitivity and specificity on determination of $\beta_4$ for time-varying exposure misclassification with lagged term when $\phi = 0.2$. [Surface plot; axes: Sensitivity, Specificity, $\beta_4$.]

Figure 3.27: Effect of sensitivity and specificity on determination of $\beta_4$ for time-varying exposure misclassification with lagged term when $\phi = 0.5$. [Surface plot; axes: Sensitivity, Specificity, $\beta_4$.]

Figure 3.28: Effect of sensitivity and specificity on determination of $\beta_4$ for time-varying exposure misclassification with lagged term when $\phi = 0.8$. [Surface plot; axes: Sensitivity, Specificity, $\beta_4$.]
Figure 3.29: Effect of sensitivity and specificity on determination of bias for time-varying exposure misclassification with lagged term when $\phi = 0.2$. [Surface plot; axes: Sensitivity, Specificity, Bias.]

Figure 3.30: Effect of sensitivity and specificity on determination of bias for time-varying exposure misclassification with lagged term when $\phi = 0.5$. [Surface plot; axes: Sensitivity, Specificity, Bias.]

Figure 3.31: Effect of sensitivity and specificity on determination of $\beta_2$ for time-varying exposure misclassification with lagged term when $\phi = 0.2$. [Surface plot; axes: Sensitivity, Specificity, $\beta_2$.]

3.3 The Effect of the Number of Exposure Measurements

In the previous two sections we have arbitrarily chosen to study the effect of time-varying misclassification at the fourth exposure measurement. This choice was made for computational efficiency, not for any reason pertaining to misclassification. The results derived for the fourth time point will also hold, approximately, for any other number of time points greater than two, as long as the exposure Markov chain is in its stationary distribution. This is because when the Markov chain is stationary the misclassified Markov chain will also be stationary, since SN and SP are constant over time. The results are not exactly equivalent because as the number of time points increases so does the number of misclassification coefficients. This causes $\beta_j$ to vary slightly, but for all intents and purposes the results are the same.
In our analysis we have chosen $\Pr(X_1)$ to be the Markov chain's stationary distribution, so the results do not depend on the number of exposure measurements that are taken. This can be seen in Figure 3.32, where we examine the relationship

$$E(Y_n \mid X^*_{1:n}) = \beta_n X^*_n + \beta_{n-1} X^*_{n-1} + \ldots + \beta_{12 \ldots n} (X^*_1 X^*_2 \cdots X^*_n). \tag{3.8}$$

The lines for both $\beta_n$ and $\beta_{n-1}$ are horizontal, indicating effectively no change with respect to the number of exposure measurements. The curve for bias increases, but this is believed to be solely the result of the exponentially increasing number of misclassification coefficients. If the exposure Markov chain is far from its stationary distribution then the number of exposure measurements might have an impact on the effect of misclassification, and this should be investigated.

Figure 3.32: Effect of the number of exposure measurements on bias, $\beta_n$ and $\beta_{n-1}$ for linear outcome model when (SN = 0.8, SP = 0.95) and $\phi = 0.2$. [Line plot; x-axis: Number of Exposure Measurements; y-axis: Coefficient Estimation; lines: $\beta_n$, $\beta_{n-1}$, Bias.]

Chapter 4
Adjustment for Misclassification Using Discrete-Time Hidden Markov Process

Time-varying misclassification, where the underlying exposure is assumed to be a Markov chain, is an example of a hidden Markov chain. A hidden Markov model is a statistical model in which the system being modeled is assumed to be a Markov process with unobserved states. A hidden Markov model, like the models analyzed via the Kalman filter, can be regarded as one of the simplest dynamic Bayesian networks.

In a regular Markov model, the state is directly visible to the observer, and therefore the state transition probabilities are the only parameters. In a hidden Markov model, the state is not directly visible, but output, dependent on the state, is visible.
Each state gives rise to a probability distribution over the possible output symbols. Therefore the sequence of symbols generated by a hidden Markov model gives some information about the sequence of states.

When time-varying misclassification is present, the possible output symbols are the misclassified exposure measurements, while the states of the hidden Markov chain are the true, correctly classified, exposure statuses. By modeling time-varying misclassification using hidden Markov chains, estimates can be obtained for the Markov transition probabilities as well as the misclassification probabilities (SN, SP). After estimates for the transition probabilities and misclassification probabilities are obtained, the most likely path through the underlying states can be recreated using the Viterbi algorithm. This recreated path is an estimate of the underlying exposure status, and it can be used to adjust for misclassification by taking the estimated path as the time-varying exposure status in an exposure-disease model.

4.1 Discrete-Time Hidden Markov Process

We denote the observed sequence as $\{X^*_i\}$, for $i = 1, \ldots, n$, and the hidden Markov chain as $\{X_i\}$, for $i = 1, \ldots, n$. The history of the hidden process up to time $i$ is denoted by

$$X_{1:i} = (X_1, \ldots, X_i), \tag{4.1}$$

where $i = 1, \ldots, n$. We define $X^*_{1:i}$ similarly for the observed process.

The hidden Markov chain has $m$ states denoted by $0, 1, \ldots, m-1$, and the underlying Markov chain has transition probability matrix denoted by $\Pi$, where the $(j,k)$th element is

$$\pi_{jk} = \Pr(X_{i+1} = k \mid X_i = j) \quad \text{for } i = 1, \ldots, n; \; j, k = 0, \ldots, m-1. \tag{4.2}$$

In this analysis the Markov chain is assumed to be homogeneous, which means that for each $j$ and $k$, $\pi_{jk}$ is constant over time. The Markov chain can be stationary or non-stationary. A Markov chain is said to be stationary if the marginal distribution is the same over time, i.e. for each $j$, $\delta_{ij} = \Pr(X_i = j)$ is constant for all $i$.
The stationary marginal distribution is denoted by $\delta_s = (\delta_1, \ldots, \delta_m)$.

4.2 Inference for Discrete-Time Hidden Markov Process

To conduct inference for hidden Markov chains we must maximize the likelihood function. To do this the EM algorithm is used, since we only know the observations and not the sequence of states producing them (Durbin 1998). For the EM algorithm the complete data likelihood, $L_c$, is

$$L_c = \Pr(X^*_1 = x^*_1, \ldots, X^*_n = x^*_n, X_1 = x_1, \ldots, X_n = x_n). \tag{4.3}$$

This can be shown to be

$$L_c = \Pr(X^*_1 = x^*_1 \mid X_1 = x_1) \Pr(X_1 = x_1) \prod_{i=2}^{n} \Pr(X^*_i = x^*_i \mid X_i = x_i) \Pr(X_i = x_i \mid X_{i-1} = x_{i-1}),$$

and hence, substituting model parameters, we get

$$L_c = \delta_{1,x_1} \pi_{x_1 x_2} \pi_{x_2 x_3} \cdots \pi_{x_{n-1} x_n} \prod_{i=1}^{n} \Pr(X^*_i = x^*_i \mid X_i = x_i), \tag{4.4}$$

so

$$\log L_c = \log \delta_{1,x_1} + \sum_{i=2}^{n} \log \pi_{x_{i-1} x_i} + \sum_{i=1}^{n} \log \Pr(X^*_i = x^*_i \mid X_i = x_i). \tag{4.5}$$

Hence the complete data log-likelihood splits into three terms: the first relates to the parameters of the marginal distribution of the Markov chain, the second to the transition probabilities, and the third to the distribution parameters of the observed random variable (MacDonald and Zucchini 1997).

When the hidden Markov chain is assumed to be non-stationary, the complete data likelihood has a neat structure, in that $\delta$ only occurs in the first term, $\Pi$ only occurs in the second term, and the parameters associated with the observed probabilities only occur in the third term. Hence, the likelihood can easily be maximized by maximizing each term individually. In this situation, the parameters estimated using the EM algorithm will be the exact maximum likelihood estimates.

When the hidden Markov chain is assumed to be stationary, $\delta = \delta \Pi$, and then the first two terms of $\log L_c$ together determine the transition probabilities $\Pi$. This raises more complicated numerical problems, as the first term is effectively a constraint. This is dealt with in a slightly ad hoc manner by effectively disregarding the first term, which is assumed to be relatively small.
In the M-step, the transition matrix is determined by the second term, and then $\delta$ is estimated using the relation $\delta = \delta \Pi$ (Harte 2010). Both these methods give maximum likelihood estimates for the transition probabilities and the misclassification probabilities (SN, SP).

4.3 Recreating the Path Through True Exposure States

To adjust for time-varying misclassification we want to estimate the true time-varying exposure states of each subject. To do this we can predict the most likely sequence of true Markov exposure states, given the observed misclassified states, using the Viterbi algorithm. The purpose of the Viterbi algorithm is to globally decode the underlying hidden Markov state at each time point. It does this by determining the sequence of states $(k^*_1, \ldots, k^*_n)$ which maximizes the joint distribution of the hidden states given the entire observed process,

$$(k^*_1, \ldots, k^*_n) = \operatorname*{argmax}_{k_1, \ldots, k_n \in \{1, 2, \ldots, m\}} \Pr(X_1 = k_1, \ldots, X_n = k_n \mid X^*_{1:n} = x^*_{1:n}).$$

The algorithm has been taken from Zucchini (2005).

In contrast, determining the a posteriori most probable state at a single time $i$ is referred to as local decoding,

$$k^*_i = \operatorname*{argmax}_{k \in \{1, 2, \ldots, m\}} \Pr(X_i = k \mid X^*_{1:n} = x^*_{1:n}).$$

Once the sequence of states $(k^*_1, \ldots, k^*_n)$ which maximizes the joint distribution of the hidden states is determined, it can be used as an estimate of the true time-varying exposure status for each subject. With this estimated exposure Markov chain, inference can be done using a plug-in method: the estimate is used as the time-varying exposure status, and the exposure-outcome relationship is determined from this estimated Markov chain.

4.4 Adjusting for Misclassification: True Exposure Path Recreation

In order to demonstrate the performance of adjusting for time-varying misclassification using the most likely path through true exposure states, we conduct a simulation study under four cases.
In each case the underlying time-varying exposure and misclassification are generated under the same conditions. The exposure-outcome model differs in each case, and the effectiveness of the adjustment is evaluated.

The R package 'HiddenMarkov' contains functions for the analysis of discrete-time hidden Markov models, Markov modulated GLMs and the Markov modulated Poisson process. It includes functions for simulation, parameter estimation, and the Viterbi algorithm. The package is currently under development, and it is designed for a single long Markov chain rather than a series of longitudinal data. This means that transition probabilities and misclassification probabilities cannot be accurately estimated when only short hidden Markov chains are available. In the time-varying exposure case this is the type of data we are interested in, so the transition probabilities and misclassification probabilities need to be estimated by different means. The package can still accurately calculate the most likely path through the hidden states using the Viterbi algorithm when estimates for the transition probabilities and misclassification probabilities are available. This means that when accurate estimates of these probabilities exist, such as from a validation study, the most likely path through the true exposure states can be determined. This allows misclassification to be adjusted for by recreating the most likely true exposure path for each subject. The effectiveness of this adjustment is shown through the simulation study in the next section.

4.4.1 Data Simulation

To demonstrate the performance of misclassification adjustment, 500 Monte Carlo samples were simulated as follows:

1. The total number of subjects is $N = 1{,}000$ and the number of exposure measurements on each subject is $n = 10$.
2.
For each subject $i$, generate an exposure Markov chain $X_{i1}, X_{i2}, \ldots, X_{in}$, where the first exposure measurement $X_{i1}$ is generated from the stationary distribution of the Markov chain. The Markov chain is defined by its transition probabilities:
   - $(\phi_1 = 0.1, \phi_2 = 0.3)$
   - $\Pr(X_{i1} = x) = 0.25^x (1 - 0.25)^{1-x}$ for $x = 0, 1$.
3. Generate the exposure outcome model in two cases:
   - $Y_{ij} \sim N(X_{ij}, 0.1)$ for $i = 1, \ldots, N$ and $j = 1, \ldots, n$.
   - $Y_{ij} \sim N(X_{ij} + 0.5 X_{i,j-1}, 0.1)$ for $i = 1, \ldots, N$ and $j = 2, \ldots, n$.
4. Misclassify the Markov chain to obtain $X^*_{i1}, X^*_{i2}, \ldots, X^*_{in}$ with (SN = 0.85, SP = 0.95).
5. Recreate the most likely exposure path $X^{est}_{i1}, X^{est}_{i2}, \ldots, X^{est}_{in}$ using the Viterbi algorithm with estimates $(SN_0 = 0.85, SP_0 = 0.95, \phi_{10} = 0.1, \phi_{20} = 0.3)$.
6. Consider two different exposure outcome models:
   - $E(Y_{ij} \mid X_{i,1:j}) = \beta_1 X_{ij}$
   - $E(Y_{ij} \mid X_{i,1:j}) = \beta_1 X_{ij} + \beta_2 X_{i,j-1}$
7. Fit each exposure outcome model using

$$L(\beta) = \prod_{i=1}^{N} \prod_{j=1}^{n} f(Y_{ij} \mid d_{i,1:j}), \tag{4.6}$$

where $d_{i,1:j} = X_{i,1:j}$, $X^*_{i,1:j}$, or $X^{est}_{i,1:j}$ when estimating $\beta_{true}$, $\beta_{misclassified}$, or $\beta_{estimate}$ respectively.

4.4.2 Simulation Results

The simulation can be broken into four cases. Cases are determined by the form of the linear outcome model (lagged term or no lagged term) and by whether the model has been correctly specified. In each case the coefficient estimates for the exposure outcome model are presented based on the true data, the misclassified data and the estimated data. The simulation standard deviation over the 500 Monte Carlo samples and the Monte Carlo 95% confidence interval are also presented in the tables below. The percentages of misclassified states, before and after the most likely path recreation, are presented in Table 4.1.

                                  % of States   Std. Dev.   95% CI
Misclassified                     6.88          0.26        (6.86, 6.90)
Misclassified after Recreation    6.00          0.29        (5.97, 6.03)

Table 4.1: Comparison of the percentage of misclassified states before and after Viterbi path recreation when a sample of 1,000 subjects was taken with 10 observations per subject.
(SN = 0.85, SP = 0.95), $(\phi_1 = 0.1, \phi_2 = 0.3)$.

Case 1.
True Model: $E(Y_{ij} \mid X_{i,1:j}) = X_{ij}$
Assumed Model: $E(Y_{ij} \mid X_{i,1:j}) = \beta X_{ij}$

Coefficient              Estimate   Std. Dev.   95% CI
$\beta_{true}$           1.000      0.0008      (0.9999, 1.0001)
$\beta_{estimate}$       0.823      0.0035      (0.8227, 0.8233)
$\beta_{misclassified}$  0.762      0.0033      (0.7617, 0.7623)

Table 4.2: Simulation results for misclassification adjustment in discrete time for linear outcome model with no lag term when (SN = 0.85, SP = 0.95), $(\phi_1 = 0.1, \phi_2 = 0.3)$ and a sample of 1,000 subjects was taken with 10 observations each.

Case 2.
True Model: $E(Y_{ij} \mid X_{i,1:j}) = X_{ij} + 0.5 X_{i,j-1}$
Assumed Model: $E(Y_{ij} \mid X_{i,1:j}) = \beta X_{ij}$

Coefficient              Estimate   Std. Dev.   95% CI
$\beta_{true}$           1.269      0.0018      (1.2688, 1.2692)
$\beta_{estimate}$       1.095      0.0047      (1.0946, 1.0954)
$\beta_{misclassified}$  0.987      0.0047      (0.9866, 0.9874)

Table 4.3: Simulation results for misclassification adjustment in discrete time for a misspecified linear outcome model with no lag term when (SN = 0.85, SP = 0.95), $(\phi_1 = 0.1, \phi_2 = 0.3)$ and a sample of 1,000 subjects was taken with 10 observations each.

Case 3.
True Model: $E(Y_{ij} \mid X_{i,1:j}) = X_{ij} + 0.5 X_{i,j-1}$
Assumed Model: $E(Y_{ij} \mid X_{i,1:j}) = \beta_1 X_{ij} + \beta_2 X_{i,j-1}$

Coefficient                Estimate   Std. Dev.   95% CI
$\beta_{1,true}$           1.000      0.0010      (0.9999, 1.0001)
$\beta_{1,estimate}$       0.791      0.0049      (0.7906, 0.7914)
$\beta_{1,misclassified}$  0.806      0.0031      (0.8057, 0.8063)
$\beta_{2,true}$           0.499      0.0010      (0.4989, 0.4991)
$\beta_{2,estimate}$       0.472      0.0051      (0.4716, 0.4724)
$\beta_{2,misclassified}$  0.539      0.0031      (0.5387, 0.5393)

Table 4.4: Simulation results for misclassification adjustment in discrete time for linear outcome model with lagged term when (SN = 0.85, SP = 0.95), $(\phi_1 = 0.1, \phi_2 = 0.3)$ and a sample of 1,000 subjects was taken with 10 observations each.

Case 4.
True Model: $E(Y_{ij} \mid X_{i,1:j}) = X_{ij}$
Assumed Model: $E(Y_{ij} \mid X_{i,1:j}) = \beta_1 X_{ij} + \beta_2 X_{i,j-1}$

Coefficient                Estimate   Std. Dev.   95% CI
$\beta_{1,true}$           0.999      0.0010      (0.998, 1.000)
$\beta_{1,estimate}$       0.768      0.0045      (0.7676, 0.7684)
$\beta_{1,misclassified}$  0.712      0.0032      (0.7117, 0.7123)
$\beta_{2,true}$           0.000      0.0010      (-0.0001, 0.0001)
$\beta_{2,estimate}$       0.073      0.0046      (0.0726, 0.0734)
$\beta_{2,misclassified}$  0.195      0.0029      (0.1947, 0.1953)

Table 4.5: Simulation results for misclassification adjustment in discrete time for linear outcome model with misspecified lagged term when (SN = 0.85, SP = 0.95), $(\phi_1 = 0.1, \phi_2 = 0.3)$ and a sample of 1,000 subjects was taken with 10 observations each.

The results above show that recreation of the true exposure path using the Viterbi algorithm, with the true values of (SN, SP) and $(\phi_1, \phi_2)$, reduces the number of misclassified states and positively adjusts for misclassification. The coefficient estimates of the exposure outcome model are significantly closer to those of the true exposure outcome model. The adjustment is least effective when we fit an exposure outcome model with a lag term that matches the data generating model (Case 3). Although there are significantly fewer misclassified states, this is not reflected in the coefficient estimates. The adjustment does, however, correct the estimate of $\beta_2$ so that an artificially strong association between the outcome and the lagged exposure term is no longer observed. The adjustment is most effective when a misspecified model including extra lagged terms is fit. This is very beneficial because most model selection procedures start with a saturated or partially saturated model and then remove covariates that are not significant. We see that in Case 4, when the exposure status $X_{n-1}$ had no association with the outcome variable, $\beta_{2,misclassified}$ was still large, indicating a relationship between $X_{n-1}$ and $Y$ that did not exist. When the most likely exposure path was recreated, the effect of $X_{n-1}$ dropped dramatically and $\beta_{2,estimate}$ was close to zero, indicating no association.
This is very helpful for model selection and allows artificial associations that result from misclassification to be minimized.

Chapter 5
Adjustment for Misclassification Using Continuous-Time Hidden Markov Process

Continuous-time Markov chains have found wide application in the medical and social sciences, especially in studies whose data record the life history of exposures for individuals. In this chapter we consider continuous-time Markov processes with panel data. Panel data consist of the states occupied by the individuals under study at a sequence of discrete time points, with no information available about the timing of events between observation times. One of the most useful properties of continuous-time Markov chains is their ability to model multi-state processes under this type of panel data.

In practice most time-varying exposures will occur according to a continuous-time process. Modeling time-varying exposure by a continuous-time Markov process allows transitions between exposure states to happen at any time, including between observations. Observation times of the process are arbitrary and no longer need to be equally spaced. Because continuous-time Markov chains can model panel data, exact transition times do not need to be observed. These advantages make continuous-time Markov chains a very important extension of discrete-time Markov processes. All the theory for hidden Markov chains in discrete time can be extended to continuous-time hidden Markov chains, making continuous-time hidden Markov processes a very powerful tool for modeling and adjusting for misclassification. By modeling time-varying misclassification using a continuous-time hidden Markov chain, estimates can be obtained for the Markov transition probabilities as well as the misclassification probabilities (SN, SP). This can be done using only the observed data, so a validation sample does not need to be obtained. This estimation can be
extended to allow transition intensities and misclassification probabilities to depend on accurately measured covariates. Because the misclassification probabilities can depend on measured covariates, differential misclassification can be modeled by letting outcome status be a covariate. In this thesis we only consider time-varying exposure and misclassification that do not depend on covariates.

When estimates for the transition probabilities and misclassification probabilities are obtained, the most probable path through the underlying states can be recreated using the Viterbi algorithm. This recreated path is an estimate of the true underlying exposure status, and it can be used to adjust for misclassification by taking the estimated Markov chain as the time-varying exposure status of each subject.

5.1 Continuous-Time Markov Process

Suppose individuals move independently among $m$ states according to a continuous-time Markov process. Let $X(t)$ be the state occupied at time $t$ by a randomly chosen individual. For $0 \le s \le t$, let $P(s,t)$ be the $m \times m$ transition probability matrix with entries

$$p_{ij}(s,t) = \Pr(X(t) = j \mid X(s) = i), \tag{5.1}$$

for $i, j = 0, 1, \ldots, m-1$. This process can be specified in terms of the transition intensities,

$$q_{ij}(t) = \lim_{\Delta t \to 0} \frac{p_{ij}(t, t + \Delta t)}{\Delta t}, \quad i \ne j \tag{5.2}$$

and

$$q_{ii}(t) = -\sum_{j \ne i} q_{ij}(t), \quad i = 1, \ldots, k, \tag{5.3}$$

and let $Q(t)$ be the $m \times m$ transition intensity matrix with entries $q_{ij}(t)$. In this chapter only time-homogeneous models are investigated, which implies $q_{ij}(t) = q_{ij}$. In the time-homogeneous case the process is stationary and

$$P(t) = P(s, s+t) = P(0, t); \tag{5.4}$$

in this case we can write

$$P(t) = e^{Qt} = \sum_{h=0}^{\infty} \frac{Q^h t^h}{h!}. \tag{5.5}$$

We can also allow $q_{ij} = q_{ij}(\theta)$ to depend on $b$ functionally independent parameters $\theta_1, \theta_2, \ldots, \theta_b$. If each individual's multi-state transitions depend on measured covariates then adjustments can be made using a generalized Cox proportional hazards model.
This entails setting the transition rate as a function of an individual's covariates as follows:

$$q(x) = \Phi(x; \beta)\, q_0, \tag{5.6}$$

where $x$ is a vector of patient covariates, $\beta$ the corresponding coefficients, and $q_0$ the baseline transition rate. $\Phi(\cdot)$ can take several parameterizations, including the exponential, linear, logistic, and augmented family forms.

To be able to estimate the transition probabilities we must determine the likelihood function. Suppose that a random sample of $N$ individuals is observed at times $t_0, t_1, \ldots, t_n$. If $N_{ijl}$ denotes the number of individuals in state $i$ at $t_{l-1}$ and state $j$ at $t_l$ then, conditioning on the distribution of individuals among states at $t_0$, the likelihood function for $\theta$ is

$$L(\theta) = \prod_{l=1}^{n} \left\{ \prod_{i,j=1}^{m} p_{ij}(t_{l-1}, t_l)^{N_{ijl}} \right\}. \tag{5.7}$$

5.2 Maximum Likelihood Estimation

Maximum likelihood estimation can be conducted to estimate transition probabilities whether or not these transitions depend on measured covariates. This is done using an efficient quasi-Newton procedure that uses first derivatives of $\log L(\theta)$, with $P(t; \theta) = \exp(Q(\theta) t)$ computed from an eigendecomposition. For a given $\theta$, $Q(\theta)$ is decomposed into $Q = A D A^{-1}$. Here $D = \mathrm{diag}(d_1, \ldots, d_k)$, where $d_1, \ldots, d_k$ are the distinct eigenvalues of $Q$, and $A$ is the $k \times k$ matrix whose $j$th column is the right eigenvector corresponding to $d_j$. Then

$$P(t) = A\, \mathrm{diag}(e^{d_1 t}, \ldots, e^{d_k t})\, A^{-1}. \tag{5.8}$$

Using this expression for $P(t)$ in our likelihood function and applying the quasi-Newton (or scoring) procedure, the MLE can be determined. This procedure is implemented in the R package msm.

5.3 Continuous Time Hidden Markov Process

In a hidden continuous-time Markov model the states of the Markov chain are not observed. The observed data are governed by some probability distribution conditional on the unobserved state. The evolution of the underlying Markov chain is governed by a transition intensity matrix $Q$. Hidden Markov models are mixture models, where observations are generated from a certain number of unknown distributions.
However, the distribution changes through time according to the states of a hidden Markov chain. Multi-state models with misclassification are hidden Markov models. Here the observed data are states, assumed to be misclassifications of the true, underlying states. As an extension to the simple multi-state model, the msm package can fit a general multi-state model with misclassification. For patient $i$ and observation time $t_{ij}$, observed states $X^*_{ij}$ are generated conditionally on true states $X_{ij}$ according to a misclassification matrix $E$. This is an $m \times m$ matrix whose $(r,s)$ entry is

$$e_{rs} = \Pr(X^*(t_{ij}) = s \mid X(t_{ij}) = r), \tag{5.9}$$

which we assume to be independent of time $t$. When the exposure is binary, this matrix is completely determined by (SN, SP). Analogously to the entries of $Q$, some of the $e_{rs}$ may be fixed to reflect knowledge of the diagnosis process. For example, the probability of misclassification may be negligibly small for non-adjacent states. Thus the progression through underlying states is governed by the transition intensity matrix $Q$, while the observation process of the underlying states is governed by the misclassification matrix $E$. Both $Q$ and $E$ can depend on accurately measured covariates, but in this thesis we only consider the case where neither does.

5.4 Inference for Continuous-Time Hidden Markov Process

Consider now a hidden Markov model in continuous time. The true state of the model $X_{ij}$ evolves as an unobserved Markov process. Observed data $x^*_{ij}$ are generated conditionally on the true states $X_{ij} = 1, 2, \ldots, m$ according to a set of distributions $f_1(x^* \mid \theta_1), f_2(x^* \mid \theta_2), \ldots, f_m(x^* \mid \theta_m)$ respectively, where $\theta_r$ is a vector of parameters for the state $r$ distribution.

A type of EM algorithm known as the Baum-Welch or forward-backward algorithm is commonly used for hidden Markov model estimation in continuous time (Bureau et al. 2000).

To develop the likelihood for a continuous time hidden Markov process we start by looking at each subject separately.
The $i$th subject's contribution to the likelihood is

$$L_i = \Pr(x^*_{i1}, \ldots, x^*_{i m_i}) = \sum \Pr(x^*_{i1}, \ldots, x^*_{i m_i} \mid X_{i1}, \ldots, X_{i m_i}) \Pr(X_{i1}, \ldots, X_{i m_i}),$$

where the sum is taken over all possible paths of underlying states $X_{i1}, \ldots, X_{i m_i}$ (Jackson 2009). Assume that the observed states are conditionally independent given the values of the underlying states. Also assume the Markov property, $\Pr(X_{ij} \mid X_{i,j-1}, \ldots, X_{i1}) = \Pr(X_{ij} \mid X_{i,j-1})$. Then the contribution $L_i$ can be written as a product of matrices by decomposing the overall sum above into sums over each underlying state. The sum is accumulated over the unknown first state, the unknown second state, and so on until the unknown final state, so

$$L_i = \sum_{X_{i1}} \Pr(x^*_{i1} \mid X_{i1}) \Pr(X_{i1}) \sum_{X_{i2}} \Pr(x^*_{i2} \mid X_{i2}) \Pr(X_{i2} \mid X_{i1}) \cdots \sum_{X_{i m_i}} \Pr(x^*_{i m_i} \mid X_{i m_i}) \Pr(X_{i m_i} \mid X_{i, m_i - 1}),$$

where $\Pr(x^*_{ij} \mid X_{ij})$ is the misclassification probability density, in the binary case determined by (SN, SP). For general hidden Markov models, this is the probability density $f_{X_{ij}}(x^*_{ij} \mid \theta_{X_{ij}})$.

$\Pr(X_{i,j+1} \mid X_{ij})$ is the $(X_{ij}, X_{i,j+1})$ entry of the Markov chain transition matrix $P(t) = (p_{ij}(t))_{1 \le i,j \le m}$ evaluated at $t = t_{i,j+1} - t_{ij}$. Let $f$ be the vector whose $r$th element is the product of the initial state occupation probability $\Pr(X_{i1} = r)$ and $\Pr(x^*_{i1} \mid r)$, and let $\mathbf{1}$ be a column vector consisting of ones. For $j = 2, \ldots, m_i$ let $T_{ij}$ be the $m \times m$ matrix with $(r,s)$ entry

$$\Pr(x^*_{ij} \mid s)\, p_{rs}(t_{ij} - t_{i,j-1}). \tag{5.10}$$

Then the likelihood contribution for subject $i$ is

$$L_i = f\, T_{i2} T_{i3} \cdots T_{i m_i} \mathbf{1}. \tag{5.11}$$

5.5 Recreating the Exposure Path for Continuous Time Hidden Markov Processes

The most common method of reconstructing a continuous time hidden Markov chain is the Viterbi algorithm. The Viterbi algorithm is a dynamic programming algorithm for finding the most likely sequence of hidden states. Originally proposed by Viterbi (1967), it is also described by Durbin et al. (1998) and MacDonald & Zucchini (1997).
For continuous-time models it proceeds as follows. Suppose that a hidden Markov model has been fitted and a Markov transition matrix $P(t)$ and misclassification matrix $E$ are known. Let $v_k(t_i)$ be the probability of the most probable path ending in state $k$ at time $t_i$.

1. Estimate $v_k(t_1)$ using known or estimated initial-state occupation probabilities.
2. For $i = 2, \ldots, N$, calculate $v_l(t_i) = e_{l, X^*_{t_i}} \max_k v_k(t_{i-1}) P_{kl}(t_i - t_{i-1})$. Let $K_i(l)$ be the maximizing value of $k$.
3. At the final time point $t_N$, the most likely underlying state $\hat{X}_N$ is the value of $k$ which maximizes $v_k(t_N)$.
4. Retrace back through the time points, setting $\hat{X}_{i-1} = K_i(\hat{X}_i)$.

5.6 Adjusting for Misclassification: True Exposure Path Recreation in Continuous Time

To demonstrate the performance of adjusting for time-varying misclassification using the most likely path through true exposure states, we conduct a simulation study under four cases. In each case the underlying time-varying exposure and misclassification are generated under the same conditions, then differing exposure-outcome models are developed and the effectiveness of the misclassification adjustment is evaluated through the estimation of transition and misclassification probabilities, the number of misclassified states and the estimation of the exposure-outcome model.

The R package 'msm' consists of functions for fitting general continuous time Markov and hidden Markov multi-state models to longitudinal data. Both Markov transition rates and the hidden Markov output process can be modeled in terms of covariates. A variety of observation schemes are supported, including processes observed at arbitrary times, completely observed processes, and censored states. The package can estimate transition probabilities as well as misclassification probabilities from the observed data. This allows adjustment for misclassification to be done with only the observed misclassified data, so a validation study is not needed.
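To make the recursion of Section 5.5 concrete, the following is a minimal base R sketch for the two-state case, written independently of msm. It is an illustration under stated assumptions rather than the implementation used in the simulation: the names pmat and viterbi_ct are hypothetical, states are coded 0/1, the initial probabilities default to (0.5, 0.5), $v_k(t_1)$ is taken as the initial probability multiplied by the probability of the first observation given state $k$, and the recursion is carried out on the log scale for numerical stability. The parameter values are the true values used in the simulation study.

```r
# Illustrative sketch only; pmat() and viterbi_ct() are hypothetical helper names.
# True intensities from the simulation study: q01 = 0.2, q10 = 0.3.
Q <- matrix(c(-0.2,  0.2,
               0.3, -0.3), nrow = 2, byrow = TRUE)
# Misclassification matrix E[r, s] = Pr(X* = s | X = r) with SN = 0.8, SP = 0.95.
E <- matrix(c(0.95, 0.05,
              0.20, 0.80), nrow = 2, byrow = TRUE)

# Transition matrix P(t) = A diag(exp(d t)) A^{-1}, as in equation 5.8.
pmat <- function(Q, t) {
  eg <- eigen(Q)
  Re(eg$vectors %*% diag(exp(eg$values * t)) %*% solve(eg$vectors))
}

# obs: observed states coded 0/1; times: increasing observation times;
# init: assumed initial-state occupation probabilities.
viterbi_ct <- function(obs, times, Q, E, init = c(0.5, 0.5)) {
  n <- length(obs)
  m <- nrow(Q)
  v <- matrix(0, n, m)       # log-probability of the best path ending in each state
  back <- matrix(0L, n, m)   # K_i(l): best previous state for each current state
  v[1, ] <- log(init) + log(E[, obs[1] + 1])
  if (n > 1) for (i in 2:n) {
    logP <- log(pmat(Q, times[i] - times[i - 1]))
    for (l in 1:m) {
      cand <- v[i - 1, ] + logP[, l]
      back[i, l] <- which.max(cand)
      v[i, l] <- max(cand) + log(E[l, obs[i] + 1])
    }
  }
  path <- integer(n)
  path[n] <- which.max(v[n, ])                          # step 3
  if (n > 1) for (i in n:2) path[i - 1] <- back[i, path[i]]  # step 4
  path - 1L                                             # back to 0/1 coding
}

# Example calls with unit spacing between observations (an assumption).
viterbi_ct(c(0, 0, 0, 1, 0, 0, 0), 1:7, Q, E)
viterbi_ct(c(0, 0, 1, 1, 0, 0), 1:6, Q, E)
```

With these parameter values, an isolated observed exposure surrounded by unexposed observations is relabelled as unexposed, whereas a run of two observed exposures is retained. This is the mechanism by which the recreated path can reduce the number of misclassified states.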
This package can also calculate the most likely path through hidden states using the Viterbi algorithm. This allows misclassification to be adjusted for by recreating the true exposure path of each individual in continuous time. The effectiveness of this adjustment is shown through the simulation study below.

5.6.1 Data Simulation

To demonstrate the performance of misclassification adjustment, 500 Monte Carlo samples were simulated as follows:

1. The total number of subjects is set to $N = 1{,}000$ and the number of exposure measurements on each subject is set to $n = 10$.
2. For each subject $i$ generate a continuous time exposure Markov chain $X_i(t)$ for $t \ge 0$, where the first exposure measurement $X_i(0)$ is generated with equal probability of being exposed ($X_i(0) = 1$) or unexposed ($X_i(0) = 0$) and the transition intensities are $(q_{01} = 0.2, q_{10} = 0.3)$.
3. Censor the continuous Markov chain by observing the process state of subject $i$ at time points $t_{i1}, t_{i2}, \ldots, t_{in}$ to obtain the observations $X_i(t_{i1}), X_i(t_{i2}), \ldots, X_i(t_{in})$ (panel data, where there is no information about the process between the observation times $t_{ij}$).
4. Generate the outcome from $X_i(t_{ij})$ in two cases:
   - $Y_{ij} \sim N(X_i(t_{ij}), 0.1)$ for $i = 1, \ldots, N$ and $j = 1, \ldots, n$.
   - $Y_{ij} \sim N(X_i(t_{ij}) + 0.5 X_i(t_{i,j-1}), 0.1)$ for $i = 1, \ldots, N$ and $j = 2, \ldots, n$.
5. Misclassify the Markov chain to obtain $X^*_i(t_{i1}), X^*_i(t_{i2}), \ldots, X^*_i(t_{in})$ with (SN = 0.8, SP = 0.95).
6. Estimate the transition intensities $(q_{01}, q_{10})$ and misclassification probabilities $(e_{11}, e_{00})$ using the R function msm.
7. Recreate the most likely exposure path $X^{est}_i(t_{i1}), X^{est}_i(t_{i2}), \ldots, X^{est}_i(t_{in})$ using the Viterbi algorithm with the estimates obtained in the previous step.
8. Consider two different exposure-outcome models:
   - $E(Y_{ij} \mid X_i(t_{i,1:j})) = \beta_1 X_i(t_{ij})$
   - $E(Y_{ij} \mid X_i(t_{i,1:j})) = \beta_1 X_i(t_{ij}) + \beta_2 X_i(t_{i,j-1})$
9. Fit each exposure-outcome model using

$$L(\beta) = \prod_{i=1}^{N} \prod_{j=1}^{n} f(Y_{ij} \mid d_i(t_{i,1:j})), \tag{5.12}$$

where $d_i(t_{i,1:j}) = X_i(t_{i,1:j})$, $X^*_i(t_{i,1:j})$, or $X^{est}_i(t_{i,1:j})$ when fitting with the true, misclassified, or estimated exposure, respectively.

5.6.2 Simulation Results

The simulation can be broken into four cases. Cases are determined by the form of the linear outcome model (lagged term or no lagged term) and whether the model has been correctly specified. In each case the coefficient estimates for the exposure-outcome model are presented based on the true data, misclassified data and estimated data. The simulation standard deviation from the 500 Monte Carlo samples and 95% Monte Carlo confidence intervals are also presented in the tables below. The percentages of misclassified states, before and after the most likely path recreation, are presented, as well as estimates for the transition intensities $(q_{01}, q_{10})$, the corresponding transition probabilities $(p_{01}, p_{10})$ and the misclassification probabilities $(e_{11}, e_{00})$. The misclassification probabilities $(e_{11}, e_{00})$ correspond to (SN, SP), respectively. The transition probabilities are calculated using the matrix exponential of $Q$.

Case 1.
True Model: $E(Y_{ij} \mid X_i(t_{i,1:j})) = X_i(t_{ij})$
Assumed Model: $E(Y_{ij} \mid X_i(t_{i,1:j})) = \beta X_i(t_{ij})$

Parameter | Param. est | Std. Dev. | 95% CI
$q_{01}$ | 0.200 | 0.0152 | (0.199, 0.201)
$q_{10}$ | 0.301 | 0.0282 | (0.299, 0.304)
$p_{01}$ | 0.157 | 0.0094 | (0.156, 0.158)
$p_{10}$ | 0.237 | 0.0168 | (0.235, 0.239)
$e_{11}$ | 0.800 | 0.0214 | (0.798, 0.802)
$e_{00}$ | 0.950 | 0.0607 | (0.945, 0.955)

Table 5.1: Simulation results for continuous time hidden Markov parameters when (SN = 0.8, SP = 0.95), $(q_{01} = 0.2, q_{10} = 0.3)$ and a sample of 1,000 subjects was taken with 10 observations per subject.

State classification | % of States | Std. Dev. | 95% CI
Misclassified | 9.50 | 0.30 | (9.47, 9.53)
Misclassified after Recreation | 8.70 | 0.42 | (8.66, 8.74)

Table 5.2: Comparison of the percentage of misclassified states before and after Viterbi path recreation when a sample of 1,000 subjects was taken with 10 observations per subject. The parameters used for this recreation are shown in Table 5.1.

Coeff. | Coeff. est | Std. Dev. | 95% CI
$\beta_{\text{true}}$ | 1.000 | 0.0023 | (0.9998, 1.0002)
$\beta_{\text{estimate}}$ | 0.820 | 0.0112 | (0.819, 0.821)
$\beta_{\text{misclassified}}$ | 0.790 | 0.0071 | (0.789, 0.791)

Table 5.3: Simulation results for misclassification adjustment in continuous time for the linear outcome model with no lag term when (SN = 0.8, SP = 0.95), $(q_{01} = 0.2, q_{10} = 0.3)$ and a sample of 1,000 subjects was taken with 10 observations each.

Case 2.
True Model: $E(Y_{ij} \mid X_i(t_{i,1:j})) = X_i(t_{ij}) + 0.5 X_i(t_{i,j-1})$
Assumed Model: $E(Y_{ij} \mid X_i(t_{i,1:j})) = \beta_1 X_i(t_{ij})$

Parameter | Param. est | Std. Dev. | 95% CI
$q_{01}$ | 0.201 | 0.0121 | (0.201, 0.202)
$q_{10}$ | 0.301 | 0.0276 | (0.299, 0.303)
$p_{01}$ | 0.158 | 0.0076 | (0.157, 0.159)
$p_{10}$ | 0.237 | 0.0178 | (0.235, 0.239)
$e_{11}$ | 0.800 | 0.0179 | (0.798, 0.802)
$e_{00}$ | 0.951 | 0.0057 | (0.950, 0.952)

Table 5.4: Simulation results for continuous time hidden Markov parameters when (SN = 0.8, SP = 0.95), $(q_{01} = 0.2, q_{10} = 0.3)$ and a sample of 1,000 subjects was taken with 10 observations per subject.

State classification | % of States | Std. Dev. | 95% CI
Misclassified | 9.48 | 0.30 | (9.45, 9.51)
Misclassified after Recreation | 8.64 | 0.38 | (8.61, 8.67)

Table 5.5: Comparison of the percentage of misclassified states before and after Viterbi path recreation when a sample of 1,000 subjects was taken with 10 observations per subject. The parameters used for this recreation are shown in Table 5.4.

Coeff. | Coeff. est | Std. Dev. | 95% CI
$\beta_{\text{true}}$ | 1.281 | 0.0049 | (1.2806, 1.2814)
$\beta_{\text{estimate}}$ | 1.092 | 0.0212 | (1.090, 1.094)
$\beta_{\text{misclassified}}$ | 1.015 | 0.0097 | (1.014, 1.016)

Table 5.6: Simulation results for misclassification adjustment in continuous time for the misspecified linear outcome model with no lag term when (SN = 0.8, SP = 0.95), $(q_{01} = 0.2, q_{10} = 0.3)$ and a sample of 1,000 subjects was taken with 10 observations per subject.

Case 3.
True Model: $E(Y_{ij} \mid X_i(t_{i,1:j})) = X_i(t_{ij}) + 0.5 X_i(t_{i,j-1})$
Assumed Model: $E(Y_{ij} \mid X_i(t_{i,1:j})) = \beta_1 X_i(t_{ij}) + \beta_2 X_i(t_{i,j-1})$

Parameter | Param. est | Std. Dev. | 95% CI
$q_{01}$ | 0.200 | 0.0142 | (0.199, 0.201)
$q_{10}$ | 0.296 | 0.0272 | (0.293, 0.299)
$p_{01}$ | 0.158 | 0.0093 | (0.157, 0.159)
$p_{10}$ | 0.233 | 0.0177 | (0.231, 0.235)
$e_{11}$ | 0.798 | 0.0210 | (0.796, 0.800)
$e_{00}$ | 0.950 | 0.0060 | (0.949, 0.951)

Table 5.7: Simulation results for continuous time hidden Markov parameters when (SN = 0.8, SP = 0.95), $(q_{01} = 0.2, q_{10} = 0.3)$ and a sample of 1,000 subjects was taken with 10 observations per subject.

State classification | % of States | Std. Dev. | 95% CI
Misclassified | 9.50 | 0.33 | (9.47, 9.53)
Misclassified after Recreation | 8.68 | 0.42 | (8.64, 8.72)

Table 5.8: Comparison of the percentage of misclassified states before and after Viterbi path recreation when a sample of 1,000 subjects was taken with 10 observations per subject. The parameters used for this recreation are shown in Table 5.7.

Coeff. | Coeff. est | Std. Dev. | 95% CI
$\beta_{1,\text{true}}$ | 1.000 | 0.0031 | (0.9997, 1.0003)
$\beta_{1,\text{estimate}}$ | 0.837 | 0.0163 | (0.836, 0.838)
$\beta_{1,\text{misclassified}}$ | 0.712 | 0.0089 | (0.711, 0.713)
$\beta_{2,\text{true}}$ | 0.499 | 0.0027 | (0.4987, 0.4993)
$\beta_{2,\text{estimate}}$ | 0.479 | 0.0287 | (0.476, 0.482)
$\beta_{2,\text{misclassified}}$ | 0.576 | 0.0081 | (0.575, 0.577)

Table 5.9: Simulation results for misclassification adjustment in continuous time for the linear outcome model with lag term when (SN = 0.8, SP = 0.95), $(q_{01} = 0.2, q_{10} = 0.3)$ and a sample of 1,000 subjects was taken with 10 observations per subject.

Case 4.
True Model: $E(Y_{ij} \mid X_i(t_{i,1:j})) = X_i(t_{ij})$
Assumed Model: $E(Y_{ij} \mid X_i(t_{i,1:j})) = \beta_1 X_i(t_{ij}) + \beta_2 X_i(t_{i,j-1})$

Parameter | Param. est | Std. Dev. | 95% CI
$q_{01}$ | 0.199 | 0.0126 | (0.198, 0.200)
$q_{10}$ | 0.301 | 0.0241 | (0.299, 0.303)
$p_{01}$ | 0.157 | 0.0084 | (0.156, 0.158)
$p_{10}$ | 0.237 | 0.0157 | (0.236, 0.238)
$e_{11}$ | 0.801 | 0.0195 | (0.799, 0.803)
$e_{00}$ | 0.950 | 0.0058 | (0.949, 0.951)

Table 5.10: Simulation results for continuous time hidden Markov parameters when (SN = 0.8, SP = 0.95), $(q_{01} = 0.2, q_{10} = 0.3)$ and a sample of 1,000 subjects was taken with 10 observations per subject.

State classification | % of States | Std. Dev. | 95% CI
Misclassified | 9.49 | 0.33 | (9.46, 9.52)
Misclassified after Recreation | 8.67 | 0.40 | (8.63, 8.71)

Table 5.11: Comparison of the percentage of misclassified states before and after Viterbi path recreation when a sample of 1,000 subjects was taken with 10 observations per subject. The parameters used for this recreation are shown in Table 5.10.

Coeff. | Coeff. est | Std. Dev. | 95% CI
$\beta_{1,\text{true}}$ | 1.000 | 0.0025 | (0.9998, 1.0002)
$\beta_{1,\text{estimate}}$ | 0.750 | 0.0163 | (0.749, 0.751)
$\beta_{1,\text{misclassified}}$ | 0.707 | 0.0079 | (0.706, 0.708)
$\beta_{2,\text{true}}$ | 0.000 | 0.0030 | (-0.0003, 0.0003)
$\beta_{2,\text{estimate}}$ | 0.087 | 0.0275 | (0.085, 0.089)
$\beta_{2,\text{misclassified}}$ | 0.238 | 0.0077 | (0.237, 0.239)

Table 5.12: Simulation results for misclassification adjustment in continuous time for the linear outcome model with misspecified lag term when (SN = 0.8, SP = 0.95), $(q_{01} = 0.2, q_{10} = 0.3)$ and a sample of 1,000 subjects was taken with 10 observations per subject.

The results above show that we can accurately estimate the transition probabilities and misclassification probabilities. This is done without any information besides the observed panel data, so a validation study is not needed.
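As a rough consistency check on the transition probabilities reported in the tables above, the base R sketch below evaluates $P(1) = e^{Q}$ from the true intensities using a truncated version of the power series in equation 5.5. It assumes that successive observations are one time unit apart, which is not stated explicitly in the simulation description; the function name expm_series and the truncation at 30 terms are illustrative choices.

```r
# Illustrative sketch only; expm_series() is a hypothetical helper name.
# True intensities used in the simulation: q01 = 0.2, q10 = 0.3.
Q <- matrix(c(-0.2,  0.2,
               0.3, -0.3), nrow = 2, byrow = TRUE)

# Truncated power series for P(t) = exp(Qt) = sum_h (Qt)^h / h!  (equation 5.5).
expm_series <- function(Q, t, terms = 30) {
  P <- diag(nrow(Q))      # h = 0 term (identity)
  term <- diag(nrow(Q))
  for (h in 1:terms) {
    term <- term %*% (Q * t) / h   # (Qt)^h / h!, built up recursively
    P <- P + term
  }
  P
}

P1 <- expm_series(Q, 1)   # assumed unit spacing between observations
round(P1, 3)              # off-diagonal entries: p01 = 0.157, p10 = 0.236
```

Under the unit-spacing assumption these values are close to the estimated $(p_{01}, p_{10})$ reported in the tables. For a two-state chain the same quantities are available in closed form, $p_{01}(t) = \frac{q_{01}}{q_{01} + q_{10}}\left(1 - e^{-(q_{01} + q_{10})t}\right)$, with $p_{10}(t)$ obtained by exchanging the roles of $q_{01}$ and $q_{10}$.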
When estimates for transition probabilities and misclassification probabilities are obtained, we can use the results of Chapter 3 to see how this form of misclassification affects the coefficients. This gives us some intuition on how estimation of the exposure-outcome model will be affected and allows us to better interpret our results. It can also be seen that the Viterbi algorithm, supplied with the estimates of $Q$ and $E$, is effective in reducing the number of misclassified states. This reduction allows the coefficient estimates of the exposure-outcome model to be more accurate and reduces the effect of misclassification. The adjustment is most effective in adjusting for misclassification when a misspecified model is fit that includes extra lagged terms. This is very beneficial because when model selection procedures are employed most procedures start with a saturated or partially saturated model and then remove covariates that are not significant. We see that in Case 4, when the exposure status $X(t_{n-1})$ had no association with the outcome variable, $\beta_{2,\text{misclassified}}$ was still large, indicating a relationship between $X(t_{n-1})$ and $Y$ that did not exist. When the most likely true exposure path was recreated the effect of $X_{n-1}$ dropped dramatically from $\beta_{2,\text{misclassified}} = 0.238$ to $\beta_{2,\text{estimate}} = 0.087$. This is very helpful for model selection and allows artificial associations that result from misclassification to be minimized.

Chapter 6
Conclusion and Future Work

In this dissertation we concentrate on time-varying exposure misclassification. We determine and characterize the bias that results from time-varying misclassification for various misclassification parameters and time-varying exposure parameters. We have also determined two separate, easily implemented adjustment methods that allow for estimation of the misclassification parameters and exposure parameters while reducing the effect of misclassification.
When potential measurement error on the time-varying exposure is not accounted for, statistical assessment of the impact of the exposure variable on a health-related outcome is misleading. The direction in which the association between the actual but unobserved explanatory variable and the response is biased is unpredictable, and the bias can be substantial.

The bias that results from misclassification is determined by tabulating the discrete probability distribution $\Pr(X_{1:n}, X^*_{1:n})$; the one-to-one correspondence between this distribution and all the combinations of the random variables $X^*_{1:n}$ is then used to determine the coefficients in $E(Y_n \mid X^*_{1:n})$. This allows us to determine how misclassified time-varying exposure measurements affect the exposure-outcome association. This development is presented in Chapter 2. Code for this determination is presented in Appendix A. This calculation allows us to characterize the effect of misclassification on the estimation of the regression coefficients and model bias. This is done by determining the effect of misclassification as (SN, SP) vary, as well as when the exposure switching probabilities, $(\phi_1, \phi_2)$, vary. The results of this characterization are presented in Chapter 3. It can be seen that (SN, SP) and $(\phi_1, \phi_2)$ cause the effect of misclassification to change in a complicated way, and no general rule can be determined to describe how misclassification affects the association between the actual explanatory variable and the response. This characterization allows the effect of misclassification to be determined for certain values of the misclassification parameters. Therefore we can use this to develop intuition on how the exposure effect is attenuated.

To adjust for the effect of misclassification we use algorithms for discrete time hidden Markov chains and continuous time hidden Markov chains to try to determine the true exposure by recreating the most likely path through the unobserved true exposure states (Chapters 4 and 5). These algorithms are easy to use and are already implemented in standard software. By using inferential techniques for hidden Markov chains in continuous time we are able to estimate the misclassification parameters (SN, SP) as well as the switching probabilities $(\phi_1, \phi_2)$ with only the observed data, so no validation study is needed. These parameter estimates allow us to use the misclassification characterization of Chapter 3 to determine how the coefficients in the exposure-outcome model are affected. When these parameter values are determined we can also adjust for the effect of misclassification in a more direct way by determining the most likely exposure path using the Viterbi algorithm. We can then determine the exposure-outcome relationship with this recreated path. It is shown through simulation that this recreated path allows a more accurate exposure-outcome relationship to be estimated and diminishes the effect of misclassification.

One analogous technique that parallels the adjustment method used in this thesis is regression calibration. Regression calibration reconstructs $X$ using $X^*$ and then regresses $Y$ on this reconstruction. In regression calibration $Y$ is regressed on $E(X \mid X^*)$ (Carroll et al. 2006). Our adjustment method regresses $Y$ on $\mathrm{mode}(X \mid X^*)$. In principle $E(X \mid X^*)$ could also be used in our analysis, possibly with some advantage.

The adjustment for misclassification presented in this thesis can be implemented in a quick and easy way. It does not, however, remove the effect of misclassification entirely. We can see that when using the recreated exposure path the coefficient estimates of the exposure-outcome model are still biased, but they are much closer to the true values than the coefficients obtained using the observed misclassified data.
The adjustment is most effective when the assumed model has more lagged terms than the true model. This allows proper model selection to be conducted by including many lag terms and then removing the ones that are not significant. The adjustment that we have proposed only makes use of the hidden Markov chain and techniques associated with hidden Markov theory. It only uses the observed data $X^*$; the outcome variable $Y$ is not used to predict the true exposure status $X$. A more complete adjustment for misclassification would use a Bayesian framework that would include both $X^*$ and $Y$ to predict $X$. The use of $Y$ might enable more accurate prediction of $X$ and a more efficient adjustment, but it would also be much more complicated to implement. The adjustment presented in this dissertation can be easily used by clinicians and epidemiologists and is effective at reducing the effect of misclassification.

The misclassification adjustment in this thesis can be extended in many ways. The inferential procedure implemented in the R package 'HiddenMarkov' for discrete time hidden Markov chains can be extended to use longitudinal data so that transition probabilities and misclassification probabilities can be estimated accurately. This would allow adjustment for misclassification to be done without a validation study in the discrete time case, as it is done in the continuous time case. In the continuous time case the function 'msm' allows transition probabilities to depend on covariates. Therefore, if there is reason to believe the exposure status of a subject switches based on accurately measured covariates, this can be accounted for. Our adjustment method can also be easily extended to adjust for differential misclassification. This can be done by using the outcome variable $Y$ as a covariate for the misclassification probabilities. The R package 'msm' also allows misclassification probabilities to depend on covariates, so existing software has the ability to account for differential misclassification. If the outcome variable $Y$ is binary then the hidden Markov adjustment is still effective, but how misclassification affects the results is unknown. The measurement error problems arising from combinations of the above scenarios are worth exploring. Further research should be conducted to improve the validity of scientific findings in epidemiological studies.

Bibliography

Baum, L. E. and Petrie, T. (1966). Statistical inference for probabilistic functions of finite state Markov chains, Annals of Mathematical Statistics 37: 1554-1563.

Baum, L. E., Petrie, T., Soules, G. and Weiss, N. (1970). A maximisation technique occurring in the statistical analysis of probabilistic functions of Markov chains, Annals of Mathematical Statistics 41: 164-171.

Bureau, A., Hughes, J. P. and Shiboski, S. C. (2000). An S-Plus implementation of hidden Markov models in continuous time, Journal of Computational and Graphical Statistics 9: 621-632.

Carroll, R. J., Ruppert, D., Stefanski, L. A. and Crainiceanu, C. M. (2006). Measurement Error in Nonlinear Models, Vol. 105 of Monographs on Statistics and Applied Probability, second edn, Chapman & Hall/CRC, Boca Raton.

Durbin, R., Eddy, S., Krogh, A. and Mitchison, G. (1998). Biological sequence analysis, Cambridge University Press.

Greenland, S. and Gustafson, P. (2006). Accounting for independent nondifferential misclassification does not increase certainty that an observed association is in the correct direction, American Journal of Epidemiology 164: 63-68.

Gustafson, P. (2004). Measurement Error and Misclassification in Statistics and Epidemiology: Impacts and Bayesian Adjustment, Vol. 13 of Interdisciplinary Statistics, Chapman & Hall/CRC, Boca Raton.

Harte, D. (2010). HiddenMarkov [Computer program].

Jackson, C. (2009). Multi-state modeling with R: the msm package, Cambridge, United Kingdom.

Jackson, C. H. and Sharples, L. D. (2002). Hidden Markov models for the onset and progression of bronchiolitis obliterans syndrome in lung transplant recipients, Statistics in Medicine 21: 113-128.

Jackson, C. H., Sharples, L. D., Thompson, S. G., Duffy, S. W. and Couto, E. (2003). Multi-state Markov models for disease progression with classification error, The Statistician 52: 193-209.

Kalbfleisch, J. and Lawless, J. F. (1985). The analysis of panel data under a Markov assumption, Journal of the American Statistical Association 80: 863-871.

MacDonald, I. L. and Zucchini, W. (1997). Hidden Markov and Other Models for Discrete-valued Time Series, Chapman & Hall/CRC, Boca Raton.

Pan, S. L. and Wu, H. M. (2007). A Markov regression random-effects model for remission of functional disability in patients following a first stroke: A Bayesian approach, Statistics in Medicine 26: 5335-5353.

Satten, G. A. and Longini, I. M. (1996). Markov chains with measurement error: estimating the 'true' course of a marker of the progression of human immunodeficiency virus disease (with discussion), Applied Statistics 45: 275-309.

Viterbi, J. (1967). Error bounds for convolutional codes and an asymptotically optimal decoding algorithm, IEEE Transactions on Information Theory 13: 260-269.

Zucchini, W. (2005). Hidden Markov Models Short Course, 3-4 April 2005. Macquarie University, Sydney.

Appendix A
R Code for Bias Determination

# defining the total number of random variables (X, X*), (SN, SP) and $(\phi_1, \phi_2)$
# Switch1, Switch2 (grids of switching probabilities) and the indices k, l are
# assumed to be defined by an enclosing loop that is not shown here
n <- 8
SN <- 0.8
SP <- 0.95
swit1 <- Switch1[k]
swit2 <- Switch2[l]

# creating all combinations of true data and misclassified data
com <- t(rep(0, n))
for (i in 1:n) {
  com <- rbind(com, t(combn(1:n, i, tabulate, nbins = n)))
}
d <- data.frame(com)

# creating all combinations of misclassified data
com2 <- t(rep(0, n/2))
for (i in 1:I(n/2)) {
  com2 <- rbind(com2, t(combn(1:I(n/2), i, tabulate, nbins = I(n/2))))
}

# determining stationary distribution for Markov chain
marMat <- cbind(c(-swit1, 1), c(swit2, 1))
sol <- solve(marMat) %*% c(0, 1)

# determining Pr(X, X*)
prob <- NULL
px4 <- NULL
for (i in 1:2^n) {
  # determining $Pr(X)$
  r <- NULL
  r[1] <- sol[I(d[i,1] + 1)]
  for (j in 2:I(n/2)) {
    if (d[i,j-1] == 0) {
      r[j] <- (1 - abs(d[i,j] - d[i,j-1])) * (1 - swit1) + abs(d[i,j] - d[i,j-1]) * swit1
    }
    if (d[i,j-1] == 1) {
      r[j] <- (1 - abs(d[i,j] - d[i,j-1])) * (1 - swit2) + abs(d[i,j] - d[i,j-1]) * swit2
    }
  }
  px14 <- prod(r)
  # determining $Pr(X^*|X)$
  x <- com[i, 1:I(n/2)]
  xs <- com[i, I(n/2+1):n]
  pStar <- prod(x * (SN^xs * (1-SN)^(1-xs)) + (1-x) * ((1-SP)^xs * SP^(1-xs)))
  # determining $Pr(X^*, X)$
  prob[i] <- px14 * pStar
}

# determining $Pr(X_4 = 1 | X^*)$
x4 <- NULL
xstar <- NULL
for (i in 1:length(com2[,1])) {
  x4[i] <- sum(prob[d[,4]==1 & d[,5]==com2[i,1] & d[,6]==com2[i,2] &
                    d[,7]==com2[i,3] & d[,8]==com2[i,4]])
  xstar[i] <- sum(prob[d[,5]==com2[i,1] & d[,6]==com2[i,2] &
                       d[,7]==com2[i,3] & d[,8]==com2[i,4]])
}
Eprob <- x4 / xstar

# creating the one-to-one correspondence
mat <- data.frame(com2)
Emat2 <- model.matrix(~ X1*X2*X3*X4, mat)

# determining $\beta$
E2 <- solve(Emat2) %*% Eprob
row.names(E2) <- names(as.data.frame(Emat2))
E2
Time-varying exposure subject to misclassification : bias characterization and adjustment Cormier, Eric 2010-08-27
Title | Time-varying exposure subject to misclassification : bias characterization and adjustment |
Creator |
Cormier, Eric |
Publisher | University of British Columbia |
Date Issued | 2010 |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2010-08-27 |
Provider | Vancouver : University of British Columbia Library |
DOI | 10.14288/1.0071231 |
URI | http://hdl.handle.net/2429/27839 |
Degree |
Master of Science - MSc |
Program |
Statistics |
Affiliation |
Science, Faculty of Statistics, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 2010-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |