EXPLORING INVERSE PROBABILITY WEIGHTED PER-PROTOCOL ESTIMATES TO ADJUST FOR NON-ADHERENCE USING POST-RANDOMIZATION COVARIATES: A SIMULATION STUDY

by

Lucy Mosquera
BSc, Queen's University, 2018

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Statistics)

The University of British Columbia (Vancouver)

August 2020

© Lucy Mosquera, 2020

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the thesis entitled:

Exploring Inverse Probability Weighted Per-Protocol Estimates to Adjust for Non-Adherence Using Post-Randomization Covariates: A Simulation Study

submitted by Lucy Mosquera in partial fulfillment of the requirements for the degree of Master of Science in Statistics.

Examining Committee:
M. Ehsanul Karim, School of Population and Public Health (Supervisor)
Lang Wu, Department of Statistics (Second Reader)

Abstract

In pragmatic trials, treatment strategies are randomly assigned at baseline, but patients may not adhere to their assigned treatments during follow-up. In the presence of non-adherence, we aim to compare the conventionally used analyses (e.g., intention-to-treat (ITT) and naive per-protocol (PP)) with inverse probability weighted (IPW) and baseline adjusted PP analyses. We have conducted comprehensive simulation studies to generate realistic two-armed pragmatic trial data with a baseline covariate and post-randomization time-varying covariates. Our simulation was applied to understand the impact of trial characteristics (e.g., non-adherence rates, event rates, trial size), varying the causal relationships (e.g., if the baseline covariate is unmeasured or a risk factor), and varying the measurement schedule for adherence rates and time-varying covariates in the follow-up period. We also assessed the key statistical properties of these estimators. In the presence of non-adherence, our results suggest that ITT, IPW-PP, and baseline adjusted PP estimates can recover the true null treatment effect. For non-null treatment effects, only the IPW-PP and baseline adjusted estimates were reasonably unbiased. If adherence and time-varying covariates are assessed less frequently, the bias and variability of effect estimates increase. This study demonstrates the feasibility of using adjusted PP estimates to recover the true effect of treatment in the presence of non-adherence, and the necessity of designing pragmatic trials that measure both pre- and post-randomization covariates to reduce bias in the estimation of the treatment effect.

Lay Summary

Clinical trials aim to assess the benefits and harms of proposed medical treatments by randomly assigning participants to either a new treatment or the current standard of care. In practice, not every participant follows through and receives their assigned treatment. When participants do not receive the treatment they are assigned, it makes it difficult to estimate the true benefits or harms of the proposed medical treatment. This research uses a simulation to compare how a variety of methods for estimating the benefits and harms of a proposed treatment perform when there are many participants who do not receive their assigned treatment.
This work aims to provide guidance to clinicians on how best to design and analyze clinical trials so that the true benefits and harms of a treatment may be estimated, even if participants do not receive the treatment they are assigned.

Preface

This dissertation presents original research conducted by Lucy Mosquera under the supervision of Dr. Ehsan Karim. Lucy Mosquera wrote all chapters in this thesis, and performed the simulations. This dissertation was proofread by Dr. Ehsan Karim and Dr. Lang Wu in their official capacity, and the final version was approved by both.

This work was supported by BC Support Unit's Real-World Clinical Trials Methods Cluster, Project #2, led by Dr. Karim (with research members Paul Gustafson, Joan Hu, Hubert Wong and Derek Ouyang), and Dr. Karim's Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Accelerator Supplements. This research was enabled in part by computing support provided by WestGrid (www.westgrid.ca), Compute Canada (www.computecanada.ca), and the Centre for Health Evaluation and Outcome Sciences.

Table of Contents

Abstract
Lay Summary
Preface
Table of Contents
List of Tables
List of Figures
Glossary
Acknowledgements
1 Introduction
  1.1 Pragmatic Trials
  1.2 Limitations of Pragmatic Trials
  1.3 Why is Non-Adherence a Problem?
  1.4 Estimates in the Presence of Non-Adherence
  1.5 Literature
  1.6 Our Contributions
2 Simulation and Effect Estimation Methods
  2.1 Key Notation
  2.2 Causal Diagram for the Simulated Data
  2.3 Data Generation
  2.4 Effect Estimates
    2.4.1 Naive Estimates
    2.4.2 Adjusted Per-Protocol (PP) Estimates
  2.5 Analysis Methods to Handle Sparse Follow-Up Data
  2.6 Statistical Properties of Effect Estimates
  2.7 Computational Details
3 Trial Characteristics
  3.1 Introduction
  3.2 Methods: Simulation Specifications
    3.2.1 Varying Treatment Effect and Trial Size
    3.2.2 Event Rate
    3.2.3 Non-Adherence Rates
    3.2.4 The Effect of Time on the Outcome
  3.3 Results
    3.3.1 Varying Treatment Effect and Trial Size
    3.3.2 Event Rate
    3.3.3 Non-Adherence Rates
    3.3.4 The Effect of Time on the Outcome
  3.4 Discussion: Interpretation
  3.5 Conclusion
4 Modifying the DAG
  4.1 Introduction
  4.2 Methods: Simulation Specifications
    4.2.1 Baseline Covariate as a Risk Factor
    4.2.2 Baseline Covariate Unmeasured
    4.2.3 The Effect of Time-Varying Covariates on Non-Adherence
    4.2.4 The Effect of Time-Varying Covariates on the Outcome
  4.3 Results
    4.3.1 Baseline Covariate as a Risk Factor
    4.3.2 Baseline Covariate Unmeasured
    4.3.3 The Effect of Time-Varying Covariates on Non-Adherence
    4.3.4 The Effect of Time-Varying Covariates on the Outcome
  4.4 Discussion: Interpretation
  4.5 Conclusion
5 Sparse Follow-Up
  5.1 Introduction
  5.2 Methods: Simulation Specifications
    5.2.1 Varying Measurement Schedule
    5.2.2 Varying Treatment Effect
    5.2.3 Varying Non-Adherence
  5.3 Results
    5.3.1 Varying Measurement Schedule
    5.3.2 Varying Treatment Effect
    5.3.3 Varying Non-Adherence
  5.4 Discussion: Interpretation
  5.5 Conclusion
6 Discussion and Conclusions
  6.1 Summary of Findings
  6.2 Our Contributions
  6.3 Strengths
  6.4 Limitations
  6.5 Future Work
  6.6 Conclusion
Bibliography

List of Tables

Table 2.1: Table of statistical properties reported for each treatment effect estimate. For each property, an estimate for that property is calculated as well as a standard error for that estimate if applicable.

Table 3.1: Table of parameter variations used to explore different trial characteristics. Note the settings that varied two parameters explored different combinations of the parameters simultaneously. The parameters in this table correspond to the data generation functions as defined in Section 2.3.

Table 3.2: Table of parameters and the corresponding non-adherence rates observed in each trial arm.
The parameters in this table correspond to the data generation functions as defined in Section 2.3.

Table 4.1: Table of parameter variations used to modify and assess causal relationships. The parameters in this table correspond to the data generation functions as defined in Section 2.3.

Table 5.1: Table of parameter variations used to assess the impact of sparse follow-up. The parameters in this table correspond to the data generation functions as defined in Section 2.3.

List of Figures

Figure 2.1: Directed Acyclic Graph (DAG) describing the desired simulation, where Z is randomization and B is the baseline covariate. L_{t-1} and L_t are time-varying covariates at times t-1 and t, respectively. A_{t-1} and A_t indicate whether treatment was received at times t-1 and t, respectively. Y_{t+1} is the outcome of interest observed at time t+1.

Figure 3.1: A) Bias plotted against treatment effect for each effect estimate, where the colour corresponds to the trial size in terms of number of participants per arm. B) Power, or proportion of times the treatment effect was found to be statistically significant, plotted against the treatment effect for each effect estimate, where the colour corresponds to the trial size. Note that for both figures a log(OR) effect of treatment of 0 corresponds to a null treatment effect.

Figure 3.2: Empirical standard error and model standard error (columns) for each effect estimate (rows) plotted against the log(OR) effect of treatment, for various trial sizes (colour).

Figure 3.3: Coverage probability and unbiased coverage probability (columns) for each effect estimate (rows) plotted against the log(OR) effect of treatment, for various trial sizes (colour).

Figure 3.4: Bias (plot A) and average model standard error (plot B) observed in each estimation method across 1000 iterations of the simulation as the event rate varies between 0.001 and 0.75.

Figure 3.5: Bias (plot A) and convergence rate (plot B) observed in each of the cumulative survival type estimates across 1000 iterations of the simulation as the event rate varies between 0.001 and 0.75.

Figure 3.6: Bias (plot A) and average model standard error (plot B) against non-adherence rates in the control arm for each effect estimate. The colour of the points corresponds to the relative non-adherence in the treatment arm, where 0% means the non-adherence in the treatment arm equals that in the control arm, and 10% indicates the non-adherence in the treatment arm is 10% greater than in the control arm.

Figure 3.7: Bias (plot A) and average model standard error (plot B) against the increase in event rate due to the effect of time on the outcome.

Figure 4.1: Bias (plot A) and model standard error (plot B) observed for each estimate when the baseline covariate B is a risk factor rather than a confounder. The X-axis corresponds to varying whether B has a strong positive or negative effect on Y.

Figure 4.2: Bias (plot A) and average model standard error (plot B) observed for each estimate as the log(OR) effect of treatment varies when the baseline covariate B is unmeasured.
Figure 4.3: Bias (plot A) and average model standard error (plot B) observed for each estimate as the log(OR) effect of treatment varies when the baseline covariate B is unmeasured and only affects the time-varying covariates L1 and L2 at time t = 0.

Figure 4.4: Bias (plot A) and model standard error (plot B) observed for each estimate as the effect of the time-varying covariates L on non-adherence varies. Non-adherence is designed to be equal in both trial arms and is equal to 40% when L has no effect on adherence.

Figure 4.5: Bias (plot A) and model standard error (plot B) observed for each estimate as the effect of the time-varying covariates L on the outcome varies.

Figure 5.1: Bias observed in each estimation method across 1000 iterations of the simulation as the measurement schedule during the follow-up period varied from monthly measures (m = 1) to measures every 2 years (m = 24). Columns correspond to whether the adherence (A), time-varying covariates (L1 & L2), or both (A, L1 & L2) were sparse during the follow-up period. Rows correspond to the 4 main estimation methods. Colour denotes whether the analysis was based on Complete Case (CC) or Last Observation Carried Forward (LOCF) imputation.

Figure 5.2: Average model standard error observed in each estimation method across 1000 iterations of the simulation as the measurement schedule during the follow-up period varied from monthly measures (m = 1) to measures every 2 years (m = 24). Columns correspond to whether the adherence (A), time-varying covariates (L1 & L2), or both (A, L1 & L2) were sparse during the follow-up period. Rows correspond to the 4 main estimation methods. Colour denotes whether the analysis was based on CC or LOCF imputation. Note that in this figure some error bars have been truncated due to outliers.

Figure 5.3: Bias observed in each estimation method across 1000 iterations of the simulation as the effect of treatment varied. Columns correspond to whether the adherence (A), time-varying covariates (L1 & L2), or both (A, L1 & L2) were sparse during the follow-up period. Rows correspond to the 4 main estimation methods. Colour denotes whether the analysis was based on CC or LOCF imputation.

Figure 5.4: Power observed in each estimation method across 1000 iterations of the simulation as the treatment effect varied. Colour denotes whether the analysis was based on CC or LOCF imputation. Note that no difference was observed depending on whether the adherence (A), time-varying covariates (L1 & L2), or both (A, L1 & L2) were sparse during the follow-up period. This figure corresponds to when both adherence and time-varying covariates were subject to sparse measurement during the follow-up period.

Figure 5.5: Bias observed in each estimation method across 1000 iterations of the simulation as the rate of non-adherence varies between 10% and 90%, with equal non-adherence rates in each arm. Colour denotes the time between measurements for the sparse adherence and time-varying covariates. Both time-varying covariates and adherence were sparse during the follow-up period and were imputed using LOCF.
Glossary

AC: Available Case
ADJUSTED-PP: Adjusted Per-Protocol
CC: Complete Case
CDP: Coronary Drug Project
CHARM: Candesartan in Heart Failure: Assessment of Reduction in Mortality and Morbidity
CONSORT: Consolidated Standards of Reporting Trials
DAG: Directed Acyclic Graph
IPW: Inverse Probability Weight
ITT: Intention to Treat
LOCF: Last Observation Carried Forward
LOG(OR): Log-Odds Ratio
MAR: Missing At Random
MCAR: Missing Completely At Random
MNAR: Missing Not At Random
NAIVE-PP: Naive Per-Protocol
OR: Odds Ratio
PP: Per-Protocol
PRECIS: Pragmatic-Explanatory Continuum Indicator Summary
RCT: Randomized Clinical Trial
RD: Risk Difference
TWSIPW: Truncated Weakly Stabilized Inverse Probability Weighted
WSIPW: Weakly Stabilized Inverse Probability Weighted

Acknowledgements

This thesis would not have been possible without my wonderful support network.

I thank my supervisor Ehsan Karim for his attentiveness, accommodating nature, and always helping keep things in perspective. I thank my second reader Lang Wu for his timely insights and inspiring my confidence.

I thank the UBC Statistics department for welcoming me to Vancouver and providing excellent opportunities to explore statistics through both academics and applications in consulting.

Lastly I want to thank my family for encouraging me and their unwavering support. Especially my husband Connor, for his patience and positivity.

Chapter 1: Introduction

1.1 Pragmatic Trials

The term "pragmatic trial" was first coined by Schwartz and Lellouch [1]. They argued that clinical studies exist on a continuum between addressing explanatory and pragmatic research questions. The continuum can also be characterized as the continuum between the research questions for assessing efficacy in ideal settings versus assessing effectiveness in clinical care settings, respectively [1, 2].

Randomized Clinical Trials (RCTs) are typically applied to address explanatory research questions. RCTs often assess the efficacy of a treatment, where efficacy is typically defined as "how well a treatment works under perfect adherence and highly controlled conditions" [3]. RCTs recruit participants from a highly selected patient population based on a set of strict inclusion and exclusion criteria. Consenting participants are then randomized to receive a pre-specified treatment regime from the set of regimes under investigation. Randomization is a critical element of RCTs. Randomization mitigates the confounding bias that would be introduced by allowing clinicians or participants to select which treatment the participants receive. Randomization with a sufficient sample size also ensures that patient characteristics are balanced across each treatment arm. Participants are then followed for a pre-specified follow-up period, allowing researchers to collect detailed information on the outcome of interest over time. Traditional RCTs often utilize placebos and blinding to minimize internal bias, which aids the estimation of the biological effect of the treatment of interest. RCTs are often regarded as the "gold standard" for medical research and are generally required for regulatory decision making such as approving new treatments [4, 5]. However, RCTs have been criticized because the observed effect of the treatment of interest may not be generalizable to a patient population in typical real world care settings [6–8].

On the other end of the continuum are pragmatic trials.
Pragmatic trials address research questions that take into account the context of receiving a treatment and are more concerned with comparing the practical effects of treatment, rather than the biological effects. A pragmatic trial aims to inform clinical decision-making rather than regulatory decision making [1, 9, 10]. In general, pragmatic trials may aim to quantify the effectiveness of a treatment, where effectiveness is commonly defined as "how well a treatment works in everyday practice" [3].

Key elements of pragmatic trial design include participant eligibility, recruitment, setting for care delivery, organization, delivery of intervention, adherence, follow-up of participants, primary outcome, and analysis methods [2, 9, 11]. For example, pragmatic trials should recruit participants that are representative of the patients who would receive the treatment if the treatment were to become the standard of care [9]. Similarly, the delivery of intervention should also mimic how the treatment would be delivered were it to become the standard of care. Follow-up of participants often utilizes electronic health records or disease registries to track participants' outcomes without obtrusive follow-up. Analysis of pragmatic trials should focus on outcomes that are relevant to patients (e.g., occurrence of any one of the following: cardiovascular death, non-fatal heart attack, non-fatal stroke, hospitalization for unstable angina or coronary revascularization), rather than composite outcomes [1, 9, 12].

Interest in pragmatic trials has grown since their proposition in the 1960s. Pragmatic trials are now included in popular guidelines for standardized reporting of clinical research. The Consolidated Standards of Reporting Trials (CONSORT) group recommends pragmatic trials be used to address questions related to clinical decision-making and has released standardized guidelines on the reporting of pragmatic trials [10]. Additional guidelines, such as the Pragmatic-Explanatory Continuum Indicator Summary (PRECIS) and PRECIS-2 tool, have been developed to help investigators incorporate pragmatism in different trial design elements [2, 11]. Pragmatic trials are becoming more mainstream and have been proposed to replace stage IV clinical trials [13] and to integrate with electronic health records to facilitate a learning health system [14, 15].

1.2 Limitations of Pragmatic Trials

Pragmatic trials are not without limitations. Key limitations of pragmatic trials include ethical considerations for recruitment, difficulty collecting outcome data for all participants, and non-adherence to assigned treatment [1, 9, 16]. Individuals who participate in RCTs are typically younger, healthier, and more affluent than those who do not participate [17]. Pragmatic trials explicitly aim to avoid this bias. In some pragmatic trials, patients may be recruited without the collection of informed consent as a means to recruit a representative sample of participants [9]. This type of recruitment may be a valid approach when participation in the trial does not extend beyond typical care, but should always be carefully considered before implementation.

Collecting outcome data for all participants in a non-obtrusive manner may be difficult, depending on the outcome of interest [9]. Electronic health records and disease registries may be used to collect outcome data in some situations, but when the outcome of interest is a patient-reported outcome, greater involvement from investigators is required [9].
Additionally, pragmatic trials may yield higher rates of loss to follow-up compared to traditional RCTs if the intensity of follow-up is designed to mimic standard care [11].

Non-adherence in pragmatic trials is broadly defined as when participants fail to fully receive the treatment protocol they were randomized to receive [18]. In some ways, non-adherence in pragmatic trials is an accurate reflection of how individuals respond to prescribed treatments in the real world. When a family doctor prescribes a treatment to their patient, such as a daily medication to lower their blood pressure, this does not guarantee the patient will take the medication. Whether or not the patient takes the blood pressure medication may depend on their medical literacy, ability to access the medication, and health conscientious behaviors [19].

1.3 Why is Non-Adherence a Problem?

Non-adherence in pragmatic trials is a concern for investigators and patients, as non-adherence may bias treatment effect estimates towards the null [3, 12, 20, 21]. This means that a trial may under-estimate the benefits or harms of a treatment [3, 21, 22].

Different treatment regimes may be more susceptible to non-adherence. Treatment regimes can generally be classified as acute, episodic, or sustained [18]. Acute treatment regimes are administered once over a short time period and are typically subject to low rates of non-adherence. Episodic treatment regimes are administered repeatedly, at intermittent intervals such as in response to a disease flare-up. Sustained treatments are meant to be received continuously over an extended period of time. Both episodic and sustained treatments may be affected by non-adherence [12, 18]. In any analysis focusing on mitigating the effects of non-adherence, it is crucial to explicitly define non-adherence in a manner appropriate for the trial's treatment strategy [23]. In this work, we focus on sustained treatment regimes, where non-adherence is quantified at regular intervals via follow-up visits.

In the field of pharmacoepidemiology, it is also common to quantify a participant's receipt of a sustained treatment regime using the concept of persistence. Persistence is defined as the time between a participant's randomization to receive a treatment regime and the participant's discontinuation of that treatment regime [24, 25]. However, in this work, we will not be directly addressing the impact of persistence on the calculation of treatment effect.

1.4 Estimates in the Presence of Non-Adherence

In pragmatic trials, several estimators are commonly used to estimate the treatment effect. The most common effect estimator is Intention to Treat (ITT), which corresponds to the causal effect of randomization on participants' outcomes [12]. If all participants receive their assigned treatment, the ITT effect also coincides with the average causal effect of receiving the treatment on the outcome of interest [3]. A biased ITT effect estimate due to non-adherence may be problematic if the adherence pattern in the trial is different than the adherence seen in clinical practice, as this would reduce the generalizability of the finding [3].

Per-Protocol (PP) estimates are often used when researchers are interested in quantifying the effect of genuinely receiving the treatment even when there is non-adherence among trial participants. Naive Per-Protocol (NAIVE-PP) estimates quantify the treatment effect exclusively using data collected from participants who adhered to their treatment [3, 12].
NAIVE-PP methods often produce biased treatment effect estimates, as they compare a subset of participants in each arm of a trial in such a way that randomization gets broken [12].

An alternative estimator is an Adjusted Per-Protocol (ADJUSTED-PP) effect estimator; commonly, it is adjusted by Inverse Probability Weights (IPWs) [3, 12, 26–28]. The adjustment via IPW aims to mitigate the bias introduced by comparing a non-randomized subset of participants who adhered to their randomized treatment. This work aims to compare the ITT, NAIVE-PP, and ADJUSTED-PP effect estimates.

1.5 Literature

Historically, the analysis of pragmatic trials has focused on quantifying the ITT effect of treatment [3]. This was largely due to the landmark Coronary Drug Project (CDP), which highlighted the difficulty of working with PP effect estimates in the presence of non-adherence [29]. The CDP was a large 5-armed trial that aimed to assess the efficacy of several lipid-lowering agents at reducing all-cause mortality among individuals with previous coronary heart disease [30]. Within the clofibrate treatment arm, it was noted that the five-year mortality was 20.0 percent, but those in the clofibrate arm who adhered to their treatment had a much lower 5-year mortality rate than those in the clofibrate arm who did not adhere to their treatment (15.0 vs. 24.6 percent; P = 0.00011). A similar trend was noted in the placebo arm, where those who adhered to the placebo treatment had a significantly lower 5-year mortality rate than those who did not adhere (15.1 percent vs. 28.3 percent; P = 4.7 × 10^-16) [29]. These observations have been used to argue the "serious difficulty, if not impossibility" of PP estimation of treatment effects in the presence of non-adherence. The original CDP analysis calculated a NAIVE-PP estimate. However, NAIVE-PP estimates are only unbiased if non-adherence has occurred completely at random [3]. Given that non-adherence is often related to prognostic factors, it is unlikely non-adherence occurred completely at random within the CDP, indicating that the NAIVE-PP analysis would be biased by definition.

Even as the ITT estimate was accepted as the default effect estimate in RCTs and pragmatic trials, criticisms of the ITT estimate have accumulated. Some researchers argue that the ITT doesn't provide useful information for clinicians and patients at the frontline of clinical decision-making, as the ITT only quantifies the effect of being assigned to a treatment, not the effect of receiving or completing a treatment [12, 21]. If clinicians or patients are deciding between two treatments where they reasonably expect to actually receive the treatment, knowing the effect of receiving the treatment would be most beneficial for making an informed decision. Additionally, ITT effects quantify the effect of a treatment for a given adherence pattern. This means if a treatment is being applied to a different situation with a different adherence pattern, the ITT estimate is no longer meaningful [3, 21]. This is especially problematic as adherence rates in clinical research typically differ from adherence rates seen in clinical practice, with adherence higher in clinical research [31, 32].
Similarly, ITT estimates may not be meaningful as an aggregated summary if a treatment is harmful to some patient subgroups and beneficial to others [21].

In recent years researchers have made strong cases for the importance of calculating both ITT and PP type estimates, largely due to the impact of adherence on the interpretation of these effect estimates [3, 12, 22]. A pair of papers by Murray and Hernán [26, 28] have recently reanalyzed the original CDP data using more modern ADJUSTED-PP methods, illustrating that proper adjustment using the appropriate covariates can remove the difference in survival between those who are and are not adherent. The most recent re-analysis was able to reduce the 5-year mortality risk difference between adherers and non-adherers in the placebo arm from 10.9 percentage points (95% confidence interval, 7.5, 14.4) to 0.01 percentage points (95% confidence interval, -12.2, 13.2) [26]. These findings have been critical in illustrating the applicability of ADJUSTED-PP estimates rather than NAIVE-PP estimates in mitigating bias due to non-adherence. Additional works have illustrated the use of ADJUSTED-PP estimates in the setting of partial adherence [33] as well as in other landmark studies such as the Candesartan in Heart Failure: Assessment of Reduction in Mortality and Morbidity (CHARM) trial [34].

Recently, Young et al. [27] considered a simulation study, limited to the null treatment effect, for assessing the impact of varying measurement schedules (frequent versus infrequent) for collecting the post-randomization covariates on estimating the average causal treatment effect in the presence of non-adherence. This work implemented cumulative survival type estimates of the ITT, NAIVE-PP, and IPW ADJUSTED-PP average causal effect of treatment. Similar work has been conducted to compare methods of analyzing sparse follow-up measurements when survival analysis is conducted using marginal structural models [35–37]. Moodie et al. [35] compared multiple imputation and inverse probability of missingness weighting for when information on an important confounder is missing. Vourli and Touloumi [37] compared multiple imputation, last observation carried forward, and inverse probability of missingness weighting for Cox marginal structural model analysis when there is either missing confounder or missing visit data. Mojaverian et al. [36] assessed analysis of sparse follow-up data using the imputation methods last observation carried forward and multiple imputation as well as available case analysis. These works have highlighted the need for collection of post-randomization covariates throughout the follow-up period while comparing methods for addressing sparse covariate data.

1.6 Our Contributions

This simulation study is interested in comparing the relative performance of ITT, NAIVE-PP, and ADJUSTED-PP effect estimates when there is non-adherence in a two-armed trial. Specifically, this simulation addresses when adherence is low, and a participant's adherence is associated with post-randomization covariates that are regularly measured. In this work, we have extended the framework of Young et al.
[27] to address non-null treatment effects with an extended covariate list (i.e., including baseline factors), and compared the results from survival curve estimates of ITT, NAIVE-PP, and ADJUSTED-PP (using unstabilized IPW, defined later) [38]. Furthermore, in this work, we have also introduced a model-based ADJUSTED-PP that is adjusted using stabilized IPW [26].

In this work, Chapter 2 describes the methods for the simulation, effect estimation, and statistical comparison metrics. We then conduct three groups of simulations aiming to compare the performance of these estimates. In Chapter 3: Trial Characteristics, we assess the impact of trial characteristics such as different treatment effects, trial sizes, event rates, and non-adherence rates on the performance of the proposed effect estimates. Next, in Chapter 4: Modifying the DAG, we explore estimate performance under alternative causal relationships. Lastly, in Chapter 5: Sparse Follow-Up, we assess the impact of sparsity for time-varying covariates and adherence during the follow-up period, comparing the popular Last Observation Carried Forward (LOCF) imputation method to Complete Case (CC) analysis for missing observations. The implications, strengths and limitations of this research, and suggestions for future work are discussed in Chapter 6: Discussion and Conclusions.

Chapter 2: Simulation and Effect Estimation Methods

The following chapter provides comprehensive descriptions of the methods used in this work. This includes specifying the mechanisms for generating the simulated data, calculating the treatment effect estimates, and assessing the statistical properties of each effect estimate. The methods described in this chapter are the basis for the simulation results presented in the three subsequent results chapters.

2.1 Key Notation

This simulation aims to mimic the structure of a two-armed randomized pragmatic trial comparing a newer treatment to the current "standard of care". In this simulation, we consider an event during the follow-up period (e.g., death, hospitalization) as the outcome Y. Our target parameter is the average causal effect of the newer treatment compared to the current standard of care.

This trial consists of monthly measurements, indexed by time as t = 0, 1, ..., K. The month t = 0 is the baseline visit, with months t = 1, 2, ..., K corresponding to follow-up visits. At the baseline visit, participants in this trial are randomly assigned to receive either the newer treatment (Z = 1) or the standard of care treatment (Z = 0). All participants have not experienced the outcome event at the baseline visit (Y_0 = 0). Additionally, at the baseline visit (t = 0), participants are measured for a baseline covariate B. Measurement of baseline covariates is common in clinical trials and may include demographic characteristics (e.g., age, sex, race, ethnicity) and baseline disease characteristics (e.g., symptom scores, lab tests, time since the initial diagnosis, etc.).

Participants are followed up via monthly visits until the outcome event is observed or the end of the follow-up period (K visits recorded). At each monthly visit, participants are measured for post-randomization covariates L1 and L2. Post-randomization covariates may include vital sign measurements (i.e., blood pressure, temperature, heart rate, etc.) or details on concomitant treatments (i.e., medications, procedures).
This trial includes the measurement of a single Gamma distributed continuous numeric baseline covariate B, and two post-randomization covariates: the continuous L1 (generated from a normal distribution) and the binary L2 (generated from a Bernoulli distribution), collectively referred to as L = {L1, L2}.

At each visit, it is also recorded whether the participant did (A_t = 1) or did not (A_t = 0) receive the newer treatment. Participants are considered to be adherent in month t when A_t = Z. For this simulation, adherence is considered binary. Assuming adherence is binary may be a natural assumption based on the nature of the treatment. For example, if the treatment is a monthly injection, then by the nature of the treatment, all participants either do or do not receive the treatment each month. Alternatively, binary adherence may be created by applying a threshold to more complex sustained treatment regimes. For instance, in the Coronary Drug Project, participants were deemed to have received treatment if, at the end of a month, their prescription bottle had > 80% of the pills missing [26].

For the iterations exploring the impact of sparse measurements of time-varying covariate and treatment adherence data, these quantities are observed every m months. This simulation also assumes that once a participant becomes non-adherent, they remain non-adherent (i.e., no participants switch adherence status multiple times). Whether a participant receives the newer treatment at each period is influenced by their randomization status (Z) as well as all measured covariates (B and L). For simplicity, in this simulation, no participants are lost to follow-up. The causal relationships between each quantity in this trial are summarized in Figure 2.1.

Figure 2.1: Directed Acyclic Graph (DAG) describing the desired simulation, where Z is randomization and B is the baseline covariate. L_{t-1} and L_t are time-varying covariates at times t-1 and t, respectively. A_{t-1} and A_t indicate whether treatment was received at times t-1 and t, respectively. Y_{t+1} is the outcome of interest observed at time t+1.

2.2 Causal Diagram for the Simulated Data

The causal relationships between each quantity in this trial are summarized in Figure 2.1. This DAG illustrates the presence of multiple back-door paths that confound the relationship between the receipt of the new treatment and the outcome of interest Y. By the back-door path adjustment theorem [39], the effect of the newer treatment on the outcome may be recovered by effect estimates that block the back-door paths created by the baseline covariate B. This indicates that unadjusted treatment effect estimates will be confounded, and necessitates the use of adjusted effect estimates to produce unbiased treatment effect estimates.

2.3 Data Generation

The data generating process for this work is based on that of Young et al. [27]. Notably, our work expands on Young et al. [27]'s data generation method by including an effect of treatment on the outcome. Additionally, in our work the baseline covariate B is measured rather than unmeasured, and has a direct effect on the receipt of treatment, in addition to the indirect effect through the time-varying covariates L. Further variants of this data generating process and the causal relationships it simulates are assessed in Chapter 4.

This simulation generates data for n participants in each arm in a trial with K monthly visits.
The specific data generating functions are as follows:

Randomization Z: Z_i = 1 indicates that participant i was randomized to treatment and Z_i = 0 indicates they were randomized to the control arm. The variable Z is randomly generated as a single Bernoulli trial with mean p_Z.

Baseline covariate: B ~ Γ(λ).

Time-varying continuous covariate L1: generated from a normal distribution such that at time t for individual i,

L_{1,t,i} = β_{1-0} + β_{1-1} B_i + β_{1-2} A_{t-1,i} + β_{1-3} cumsum(A_{t-2,i}) + β_{1-4} cumavg(L_{1,t-1,i}) + β_{1-5} L_{2,t-1,i} + β_{1-6} t + ε_i,   (2.1)

with ε_i ~ N(0, σ), where cumavg() denotes the cumulative average function, and cumsum() denotes the cumulative sum function.

Time-varying binary covariate L2: generated from a Bernoulli distribution with mean p_{L2} defined as:

logit(p_{L2}) = β_{2-0} + β_{2-1} B_i + β_{2-2} cumavg(L_{1,t,i}) + β_{2-3} L_{2,t-1,i} + β_{2-4} A_{t-1,i} + β_{2-5} cumavg(A_{t-2,i}) + β_{2-6} t.   (2.2)

Binary indicator for the receipt of treatment A: if a participant was non-adherent at time t-1 (i.e., A_{t-1,i} ≠ Z_i), then A_{t,i} was set to equal A_{t-1,i}. Otherwise, A was generated from a Bernoulli distribution with mean p_{A_i} defined by arm as:

Arm Z = 0: logit(p_{A_i}) = α_{00} + α_{10} cumavg(L_{1,t,i}) + α_{20} L_{2,t-1,i} + α_{30} t_i + α_{40} B_i,
Arm Z = 1: logit(p_{A_i}) = α_{01} + α_{11} cumavg(L_{1,t,i}) + α_{21} L_{2,t-1,i} + α_{31} t_i + α_{41} B_i.   (2.3)

Binary outcome Y: generated from a Bernoulli distribution with mean p_{Y_i} defined as:

logit(p_{Y_i}) = θ_0 + θ_1 B_i + θ_2 A_i + θ_3 L_{1,t-1,i} + θ_4 L_{2,t-1,i} + θ_5 t_i.   (2.4)

After the data has been generated, observations of the time-varying covariates L1 and L2 and/or the receipt of treatment A are suppressed for follow-up visits in months not divisible by the time between measurements m. This approach of suppression after the data has been generated is meant to reflect that, in the context of infrequent follow-up appointments, these quantities do exist and impact subsequent observations, regardless of whether or not they are observed by clinicians. Having sparse follow-up observations for the time-varying covariates and the receipt of treatment in this manner is considered Missing Completely At Random (MCAR) [36].
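To make the generative mechanism concrete, the following is a simplified R sketch of one arm of this process under Equations 2.1–2.4. All coefficient values are illustrative placeholders (the values actually used are given in the parameter tables of Chapters 3–5), the cumulative-sum term of Equation 2.1 is omitted for brevity, and the function name simulate_arm is ours, not the thesis code.

```r
# Simplified sketch of one arm of the data generation (Equations 2.1-2.4).
# Coefficient values are illustrative placeholders, not the thesis settings.
expit <- function(x) 1 / (1 + exp(-x))

simulate_arm <- function(z, n = 1000, K = 60) {
  B  <- rgamma(n, shape = 3, rate = 4)   # baseline covariate B ~ Gamma
  L1 <- matrix(0, n, K + 1)              # continuous covariate, months 0..K
  L2 <- matrix(0, n, K + 1)              # binary covariate
  A  <- matrix(z, n, K + 1)              # receipt of treatment; A_0 = Z
  Y  <- matrix(0, n, K + 1)              # outcome indicator
  at_risk <- rep(TRUE, n)                # event-free so far
  for (t in 2:(K + 1)) {                 # column t holds month t - 1
    cavgL1 <- rowMeans(L1[, 1:(t - 1), drop = FALSE])          # cumavg(L1)
    L1[, t] <- 0.1 + 0.2 * B - 0.3 * A[, t - 1] + 0.1 * cavgL1 +
               0.1 * L2[, t - 1] + 0.01 * (t - 1) + rnorm(n)
    L2[, t] <- rbinom(n, 1, expit(-1 + 0.2 * B + 0.2 * cavgL1 +
                                  0.5 * L2[, t - 1] - 0.3 * A[, t - 1]))
    # receipt of treatment: non-adherence is absorbing (Equation 2.3)
    pA <- expit((if (z == 1) 2.2 else -6.4) + 0.3 * cavgL1 +
                0.3 * L2[, t - 1] + 0.01 * (t - 1) + 0.2 * B)
    A[, t] <- ifelse(A[, t - 1] != z, A[, t - 1], rbinom(n, 1, pA))
    # rare binary outcome driven by B, current A, lagged L, and time (Eq. 2.4)
    pY <- expit(-7 + 0.3 * B - 0.5 * A[, t] + 0.2 * L1[, t - 1] +
                0.2 * L2[, t - 1] + 0.01 * (t - 1))
    Y[, t] <- ifelse(at_risk, rbinom(n, 1, pY), NA)   # NA after the event
    at_risk <- at_risk & (Y[, t] %in% 0)
  }
  list(B = B, L1 = L1, L2 = L2, A = A, Y = Y)
}

trial <- list(control = simulate_arm(z = 0), treated = simulate_arm(z = 1))
```

Reshaping these matrices into one row per person-month then yields the long-format data on which the pooled logistic regressions below operate.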
2.4 Effect Estimates

This simulation study estimates the average causal effect of treatment using naive and adjusted effect estimates. For both estimation methods, the treatment effect is quantified using the Log-Odds Ratio (log(OR)). Odds Ratios (ORs) quantify the average causal effect of treatment by comparing the odds of the outcome occurring in the treated arm relative to the odds of the outcome occurring in the control arm. The log(OR) is the natural logarithm transform of an OR. This is a departure from Young et al. [27]'s simulation, as they focused on Risk Difference (RD) effect estimates calculated using cumulative survival. Our work focuses on the performance of model-based effect estimates, although cumulative survival effect estimates are also calculated.

2.4.1 Naive Estimates

Naive estimates of the average causal effect of treatment include the widely popular Intention to Treat (ITT) estimate and the NAIVE-PP estimate.

Intention to Treat (ITT) Estimates

The first naive effect estimate considered in this simulation is the ITT estimate. The ITT estimate compares all patients randomized to the treatment arm against all patients randomized to the control arm to quantify the average causal effect of treatment. In this analysis, ITT estimates are calculated using model-based and cumulative survival approaches.

To calculate model-based ITT estimates, pooled logistic regression was implemented using the following model:

Model-based ITT Odds Ratio: logit(Y) = γ_0 + γ_Z Z + γ_t t.   (2.5)

Pooled logistic regression with time as a covariate is a suitable outcome model in this situation, as pooled logistic regression approximates Cox regression when the outcome of interest is rare [40–42]. Including time as a covariate in pooled logistic regression allows the effect of time on the outcome to be taken into account [41, 42]. The estimated coefficient γ_Z is the log(OR) average causal effect of treatment. The OR can be estimated by exponentiating the estimated coefficient γ_Z. Standard errors for these estimates were sandwich estimators that take into account the clustering of multiple observations from the same individual [43].

To calculate the ITT estimates using cumulative survival, survival is calculated for each treatment arm. The estimated survival at the end of the trial (t = K) for arm z can be estimated as

S_z(K) = \prod_{t=0}^{K} \frac{n_{tz} - \sum_{i=1}^{n_{tz}} y_{t,i}}{n_{tz}},

where n_{tz} denotes the number of participants alive in arm z at the start of time period t. The cumulative hazard for arm z at the final time point in a trial is calculated as 1 - S_z(K). The ITT OR estimate is then calculated as the ratio of hazards (1 - S_1(K)) / (1 - S_0(K)). Note that since this simulation yields a rare outcome, the hazard ratio approximates the OR. Lastly, the log(OR) is recovered by log[(1 - S_1(K)) / (1 - S_0(K))].

While the ITT estimate is readily used in the analysis of randomized clinical trials, in the presence of non-adherence the ITT estimate may underestimate the benefits or harms of a treatment, as it truly estimates the effect of being randomized to receive treatment [3, 21, 22]. This means in our simulation, when the treatment effect is null, the ITT estimate is unbiased, and when the treatment effect is non-null, the ITT estimate is biased.
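As a concrete illustration of the model-based ITT estimate in Equation 2.5, the sketch below fits a pooled logistic regression to an assumed long-format data frame trial_long (one row per person-month, with columns y, z, t, and id; this object is hypothetical) and computes cluster-robust sandwich standard errors.

```r
# Model-based ITT estimate (Equation 2.5) on an assumed long-format data
# frame `trial_long`: y = event, z = arm, t = month, id = participant.
library(sandwich)
library(lmtest)

fit_itt <- glm(y ~ z + t, family = binomial(link = "logit"), data = trial_long)

vc <- vcovCL(fit_itt, cluster = ~ id)   # clustered sandwich variance [43]
coeftest(fit_itt, vcov. = vc)["z", ]    # log(OR), robust SE, z value, p-value
exp(coef(fit_itt)["z"])                 # back-transform to the OR scale
```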
Naive Per-Protocol (PP) Estimates

The second naive estimate this simulation considers is the per-protocol estimate. The PP estimate compares patients randomized to the treatment arm who adhered against patients randomized to the control arm who adhered. The NAIVE-PP estimate quantifies the average causal effect of receiving the treatment. Calculating a PP estimate requires censoring, or removing from the analysis, participants at the time they become non-adherent to their randomized treatment. The NAIVE-PP estimates are again calculated using both logistic regression outcome models and survival curves.

The naive model-based PP estimates are calculated using pooled logistic regression on the trial data that has been censored based on adherence. The OR was estimated using the following model:

Naive Model-based PP Odds Ratio: logit(Y) = γ_0 + γ_A A + γ_t t.   (2.6)

Pooled logistic regression with time as a covariate is a suitable outcome model in this situation, as pooled logistic regression approximates Cox regression when the outcome of interest is rare [40–42]. Including time as a covariate in pooled logistic regression allows the effect of time on the outcome to be taken into account [41, 42]. The estimated coefficient γ_A is the log(OR) average causal effect of treatment. The OR can be estimated by exponentiating γ_A. Again, standard errors for these estimates were calculated using sandwich estimators that take into account the clustered nature of the observations [43].

For the cumulative survival estimates approach, patients are censored (i.e., removed from the analysis dataset) when they become non-adherent to their randomized treatment. Then cumulative survival is calculated for the censored data from each trial arm at each time point. The NAIVE-PP OR is one minus the survival estimate at the final time point (t = K) for adherent patients in the treatment arm, S_{Z=1,A=1}(K), divided by one minus the survival estimate for those adherent in the control arm, S_{Z=0,A=0}(K). The log(OR) can then be recovered as log[(1 - S_{Z=1,A=1}(K)) / (1 - S_{Z=0,A=0}(K))].

The PP estimate can be thought of as aiming to estimate the effect of truly receiving the treatment on the outcome. However, the NAIVE-PP estimate is flawed, as the groups compared are not generated via randomization and thus may not be comparable on the basis of confounding variables, leading to biased estimates of the effect of treatment [3, 44]. Thus we expect the NAIVE-PP estimate to perform poorly in the presence of non-random non-adherence.
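The artificial censoring shared by all PP estimates can be sketched as follows; trial_long is again an assumed long-format data frame (columns id, z, a, y, t), and person-months are dropped from the first non-adherent month onward.

```r
# Artificial censoring for the PP estimates: keep each participant's
# person-months only up to (and excluding) the first non-adherent month.
library(dplyr)

pp_data <- trial_long %>%
  group_by(id) %>%
  arrange(t, .by_group = TRUE) %>%
  mutate(deviated = cumany(a != z)) %>%   # TRUE from first deviation onward
  filter(!deviated) %>%
  ungroup()

# the naive model-based PP estimate (Equation 2.6) is then, for example,
fit_naive_pp <- glm(y ~ a + t, family = binomial, data = pp_data)
```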
2.4.2 Adjusted Per-Protocol (PP) Estimates

The final effect estimates assessed are ADJUSTED-PP estimates. Two forms of ADJUSTED-PP estimates are introduced: baseline adjusted and IPW ADJUSTED-PP estimates. These adjusted estimates aim to recover the true average causal effect of receiving treatment.

Baseline Adjusted Per-Protocol Estimates

The baseline ADJUSTED-PP estimates are calculated on the censored data, using a pooled logistic regression model where the baseline covariate B is included as a predictor. This yields the following equation:

Baseline ADJUSTED-PP Odds Ratio: logit(Y) = γ_0 + γ_A A + γ_B B + γ_t t,   (2.7)

where the estimated log(OR) effect of treatment is γ_A and the standard errors of this estimate are calculated using clustered sandwich estimators [43].

Within the context of RCTs, there remains much debate regarding whether or not baseline covariates should be adjusted for, as baseline confounding is supposed to be handled by design [45]. Some argue that adjusting for known prognostic factors in an RCT has methodological advantages, i.e., bias reduction and increased precision [46, 47]. However, the CONSORT statement on RCTs advises against identifying such adjustment variables via empirical exploration (e.g., statistical significance) of the same data [48]. In a pragmatic trial context, there are examples of needing to adjust for baseline characteristics to correct for baseline imbalances [49–51]. A recent guideline draft also proposed adjusting for prognostic factors that meet a pre-specified threshold for imbalance [52]. Given the current debate regarding baseline adjustment, we have opted to include and assess a baseline ADJUSTED-PP estimate.

Inverse Probability Weights

IPWs are calculated on the censored data to re-weight each observation, yielding a pseudo-population where the two comparator groups are generated in an unbiased manner [38]. In this simulation, stabilized IPWs are the main IPW, although weakly stabilized IPWs and truncated IPWs have also been calculated.

Stabilized weights adjust for the bias introduced by artificially censoring the data using both the post-randomization covariates L1 and L2 and the baseline covariate B. The appeal of using stabilized IPWs is that for individuals with a very low probability of censoring due to non-adherence, unstabilized IPWs yield very large weights. These very large weights lead to high variance in effect estimates, which is undesirable.

The stabilized weight for an observation from subject i at time t is calculated as:

W_{i,t} = \prod_{k=1}^{t} \frac{P(A_k | B, k)}{P(A_k | B, k, L_{1,k}, L_{2,k})},   (2.8)

where P(A_t | B, t) are the predicted values from a logistic regression of the form:

logit(A_t) = γ_t t + γ_B B,   (2.9)

and P(A_t | B, t, L_{1,t}, L_{2,t}) are the predicted values from a logistic regression of the form:

logit(A_t) = γ_t t + γ_B B + γ_{L1} L_{1,t} + γ_{L2} L_{2,t}.   (2.10)

Each form of IPW is calculated independently within each arm, as this accounts for the possibility that the pattern of adherence may be different for those randomized to receive the treatment than for those randomized to receive the placebo. This situation may be seen in pragmatic trials if the burden or side effects of one treatment outweigh the other [3].

The truncated versions replace all weights higher than the 99th percentile with the value at the 99th percentile, and all weights lower than the 1st percentile with the value at the 1st percentile. Truncation of IPWs is a commonly accepted strategy that aims to reduce the variance of treatment effect estimates at the cost of a slight increase in bias [38].

The weakly stabilized IPWs differ from the stabilized IPWs in that they do not include the baseline covariate B in the estimation of W_{i,t}. This form of IPW is convenient to implement in the alternative causal situation explored in Chapter 4, where the baseline covariate is unmeasured.
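A sketch of the stabilized weights of Equations 2.8–2.10 is given below. It assumes a hypothetical data frame risk_set (columns id, z, a, b, l1, l2, t) that keeps each participant's rows up to and including the first non-adherent month, so that the within-arm adherence models are not degenerate; the adherent person-months then carry their cumulated weights into the outcome model of Equation 2.11. This is our reading of the procedure, not the thesis code.

```r
# Stabilized weights (Equations 2.8-2.10), fit separately within each arm.
library(dplyr)

add_sipw <- function(dat) {
  z   <- dat$z[1]                                                  # arm label
  num <- glm(a ~ t + b,           family = binomial, data = dat)   # Eq. 2.9
  den <- glm(a ~ t + b + l1 + l2, family = binomial, data = dat)   # Eq. 2.10
  dat %>%
    # probability of receiving the assigned treatment, i.e. P(A_t = z | .)
    mutate(p_num = if (z == 1) fitted(num) else 1 - fitted(num),
           p_den = if (z == 1) fitted(den) else 1 - fitted(den)) %>%
    group_by(id) %>%
    arrange(t, .by_group = TRUE) %>%
    mutate(sw = cumprod(p_num / p_den)) %>%                        # Eq. 2.8
    ungroup()
}

weighted_pp <- risk_set %>% group_split(z) %>% lapply(add_sipw) %>% bind_rows()
# adherent person-months then supply `sw` to the weighted outcome model (2.11)
```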
Inverse Probability Weight Outcome Models

These weights can then be used to calculate the final IPW ADJUSTED-PP estimate, using either model-based or cumulative survival approaches. To calculate the IPW ADJUSTED-PP effect estimate, the trial data is artificially censored by removing participants when they become non-adherent. This artificially censored data is then used with the weights in either the model-based or cumulative survival effect estimation procedure.

In the case of model-based effect estimation, the artificially censored trial data is used in weighted logistic regression. The specific model depends on which weights are used:

Using stabilized weights:
Adjusted Model-based PP Odds Ratio: logit(Y) = γ_0 + γ_A A + γ_t t + γ_B B.   (2.11)

Using weakly stabilized weights:
Adjusted Model-based PP Odds Ratio: logit(Y) = γ_0 + γ_A A + γ_t t.   (2.12)

Weighted pooled logistic regression with time as a covariate is a suitable outcome model in this situation, as weighted pooled logistic regression approximates Cox regression when the outcome of interest is rare [40–42]. Including time as a covariate in pooled logistic regression allows the effect of time on the outcome to be taken into account [41, 42, 53].

In the case of stabilized weights, the final outcome model must also include the baseline covariate B. The estimated coefficient γ_A is the log(OR) average causal effect of treatment. The OR can be estimated by exponentiating γ_A. The standard errors for these estimates were calculated using sandwich estimators that take into account the clustered nature of the observations [43].

In the case of estimating the average causal effect of treatment using cumulative survival estimates, the artificially censored trial data is weighted using either the unstabilized weights or the truncated unstabilized weights. This censored and weighted data is then used to estimate cumulative survival in each trial arm. The ADJUSTED-PP cumulative survival OR is one minus the weighted survival estimate at the final time point for adherent patients in the treatment arm, divided by one minus the weighted survival estimate for those adherent in the control arm. The cumulative survival OR estimates may only be calculated with the unstabilized IPWs. Therefore, the cumulative survival estimate ADJUSTED-PP has no mechanism to include the baseline covariate.

2.5 Analysis Methods to Handle Sparse Follow-Up Data

Sparse values are problematic in data analysis, but common when working with healthcare data. In pragmatic trials where researchers aim to utilize non-intrusive forms of follow-up, existing electronic health record and administrative databases may be the source of key follow-up information [2, 9, 11, 54]. However, few countries have a single comprehensive data source for all forms of health data, requiring follow-up information to be derived by linking data from multiple sources, leading to sparse values during follow-up [9, 54].

The simplest strategy for analyzing data with sparse values in key covariates is to only use observations that have all covariates measured. This is called a Complete Case (CC) analysis or Available Case (AC) analysis. CC analyses are unbiased if the missing data mechanism is MCAR [36]. However, in the context of health research, the CC is unlikely to be a popular method, as trials are designed with a specific sample size in mind to achieve desired power. Discarding observations is thus undesirable, as it may lead to an underpowered study and signifies wasted resources.

Imputation is a strategy that allows analysis to be conducted on data that has missing values. In imputation, the missing values are imputed into the dataset using an algorithm. The simplest such algorithm for longitudinal data is LOCF. In LOCF imputation, any missing values are replaced by the last observation of that variable for the same individual. In our setting, where we have measurements for the outcome at every time point but sparse measurements for the time-varying covariates and/or adherence, these values can be imputed using LOCF using the most recent prior month's observed values.

While the LOCF method is widely used, it may produce biased treatment effect estimates even if the data is MCAR [36]. Despite the popularity of LOCF, compared to more modern imputation approaches such as multiple imputation or mixed-effect model repeated measure imputation, LOCF may produce biased results [35, 36, 55]. While LOCF may introduce bias, an advantage of this and other imputation methods is that it does not alter the sample size used in analysis, preventing a reduction in statistical power [36].
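Under the convention that unmeasured months are stored as NA, the LOCF and CC strategies might be implemented as below; trial_long is the same assumed long-format data frame as in the earlier sketches.

```r
# LOCF imputation for sparse follow-up, assuming unmeasured months are NA.
library(dplyr)
library(tidyr)

locf_data <- trial_long %>%
  group_by(id) %>%
  arrange(t, .by_group = TRUE) %>%
  fill(a, l1, l2, .direction = "down") %>%  # carry last observation forward
  ungroup()

# the Complete Case alternative instead drops unmeasured person-months:
cc_data <- trial_long %>% filter(!is.na(a), !is.na(l1), !is.na(l2))
```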
2.5 Analysis Methods to Handle Sparse Follow-Up Data

Sparse values are problematic in data analysis but common when working with healthcare data. In pragmatic trials where researchers aim to utilize non-intrusive forms of follow-up, existing electronic health records and administrative databases may be the source of key follow-up information [2, 9, 11, 54]. However, few countries have a single comprehensive data source for all forms of health data, requiring follow-up information to be derived by linking data from multiple sources, which leads to sparse values during follow-up [9, 54].

The simplest strategy for analyzing data with sparse values in key covariates is to use only the observations that have all covariates measured. This is called a Complete Case (CC) or Available Case (AC) analysis. CC analyses are unbiased if the missing data mechanism is MCAR [36]. However, in the context of health research, CC analysis is unlikely to be a popular method, as trials are designed with a specific sample size in mind to achieve a desired power. Discarding observations is thus undesirable, as it may lead to an underpowered study and represents wasted resources.

Imputation is a strategy that allows analysis to be conducted on data that has missing values: the missing values are filled in using an algorithm. The simplest such algorithm for longitudinal data is LOCF. In LOCF imputation, any missing value is replaced by the last observation of that variable for the same individual. In our setting, where we have measurements for the outcome at every time point but sparse measurements for the time-varying covariates and/or adherence, these values can be imputed using LOCF from the most recent prior month's observed values.

While the LOCF method is widely used, it may produce biased treatment effect estimates even if the data is MCAR [36]. Despite its popularity, LOCF may produce biased results compared to more modern imputation approaches such as multiple imputation or mixed-effect model repeated measure imputation [35, 36, 55]. While LOCF may introduce bias, an advantage of this and other imputation methods is that they do not alter the sample size used in the analysis, preventing a reduction in statistical power [36].

2.6 Statistical Properties of Effect Estimates

For each effect estimate under consideration, the following statistical properties have been calculated: convergence, bias, coverage probability, bias-adjusted coverage probability, empirical standard error, average model standard error, mean squared error, power or type I error, and confidence interval length. The formulas for these properties can be found in Table 2.1, adapted from Morris et al. [56]. For this table, let \hat{θ}_i denote the estimated treatment effect for iteration i of the simulation, where i ∈ 1, ..., n_sim. The true treatment effect is denoted θ, and the average of the estimated treatment effects across all iterations is \bar{θ}. Additionally, let p_i denote the p-value associated with the coefficient for the treatment effect at iteration i, and let \hat{θ}_{high,i} and \hat{θ}_{low,i} denote the upper and lower 95% confidence interval limits for the estimated treatment effect at iteration i. Finally, let \widehat{Var}(\hat{θ}_i) be the estimated variance associated with the effect estimate \hat{θ}_i. In these definitions we also use I to denote the indicator function, which returns 1 if the input condition is true and 0 otherwise.

Note that in the subsequent chapters of simulation results, not all statistical properties are reported for every setting. The most important properties are bias and average model standard error. Results for all statistical properties for each simulation setting are available upon request.

Table 2.1: Statistical properties reported for each treatment effect estimate. For each property, an estimate is calculated, along with a standard error for that estimate where applicable. All sums run over i = 1, ..., n_sim.

  Convergence:
    Estimate: (1/n_sim) Σ I(\hat{θ}_i ∈ ℝ)
    SE of estimate: not applicable
  Bias:
    Estimate: (1/n_sim) Σ \hat{θ}_i − θ
    SE of estimate: sqrt[ (1/(n_sim(n_sim − 1))) Σ (\hat{θ}_i − \bar{θ})² ]
  Coverage probability:
    Estimate: (1/n_sim) Σ I(\hat{θ}_{low,i} ≤ θ ≤ \hat{θ}_{high,i})
    SE of estimate: sqrt[ Cover. × (1 − Cover.) / n_sim ]
  Bias-adjusted coverage probability:
    Estimate: (1/n_sim) Σ I(\hat{θ}_{low,i} ≤ \bar{θ} ≤ \hat{θ}_{high,i})
    SE of estimate: sqrt[ B.A. Cover. × (1 − B.A. Cover.) / n_sim ]
  Empirical standard error (EmpSE):
    Estimate: sqrt[ (1/(n_sim − 1)) Σ (\hat{θ}_i − \bar{θ})² ]
    SE of estimate: \widehat{EmpSE} / sqrt[ 2(n_sim − 1) ]
  Average model standard error (ModSE):
    Estimate: sqrt[ (1/n_sim) Σ \widehat{Var}(\hat{θ}_i) ]
    SE of estimate: sqrt[ \widehat{Var}[\widehat{Var}(\hat{θ}_i)] / (4 n_sim \widehat{ModSE}²) ]
  Mean squared error (MSE):
    Estimate: (1/n_sim) Σ (\hat{θ}_i − θ)²
    SE of estimate: sqrt[ Σ ((\hat{θ}_i − θ)² − \widehat{MSE})² / (n_sim(n_sim − 1)) ]
  Power or type I error (Power):
    Estimate: (1/n_sim) Σ I(p_i ≤ α)
    SE of estimate: sqrt[ Power × (1 − Power) / n_sim ]
  Confidence interval length:
    Estimate: (1/n_sim) Σ (\hat{θ}_{high,i} − \hat{θ}_{low,i})
    SE of estimate: not applicable

2.7 Computational Details

This simulation was conducted in R [57]. Cumulative survival OR estimates of ITT, NAIVE-PP, and ADJUSTED-PP were calculated using dplyr and tidyr from the tidyverse. Pooled logistic regression outcome models were fit using the base R glm family of regression models with a logit link. Clustered standard errors were calculated using the sandwich package [58, 59]. Results plots were generated using ggplot2, also from the tidyverse [60]. For each simulation scenario, the convergence of each estimate was tracked; tracking convergence is helpful to understand how reliably a method produces effect estimates. The cumulative survival estimates were undefined in situations where the hazard in the control arm was 0.
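The following R sketch shows how the quantities in Table 2.1 might be computed from the simulation output. The vectors est, se, p, lo, and hi (one element per converged iteration) and the scalar true effect theta are hypothetical placeholders; convergence itself would be tallied separately over all attempted iterations.

  perf <- function(est, se, p, lo, hi, theta, alpha = 0.05) {
    est_bar <- mean(est)  # average estimate across iterations
    c(bias        = est_bar - theta,
      empse       = sd(est),            # 1/(n_sim - 1) denominator, as in Table 2.1
      modse       = sqrt(mean(se^2)),   # se^2 is the estimated variance Var-hat
      mse         = mean((est - theta)^2),
      coverage    = mean(lo <= theta & theta <= hi),
      ba_coverage = mean(lo <= est_bar & est_bar <= hi),  # bias-adjusted coverage
      power       = mean(p <= alpha),   # type I error when theta = 0
      ci_length   = mean(hi - lo))
  }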
Chapter 3
Trial Characteristics

3.1 Introduction

The first group of simulations conducted focused on varying key trial characteristics. The goal of simulating different trial characteristics is to compare the performance of these effect estimates across differing diseases and pragmatic trial scopes. The results of this section would be most applicable to clinical researchers and applied statisticians when designing new pragmatic trials.

The trial characteristics explored in this section include the treatment effect size, trial size, event rate, non-adherence rate, non-adherence pattern (i.e., whether or not there is differential non-adherence based on trial arm), and the effect of time on the outcome. Disease characteristics such as event rate, treatment effect size, and the effect of time on the outcome will likely be fixed, but are useful to explore so that clinical researchers can judge whether or not their research area will be suitable for this kind of analysis when studied using a pragmatic trial. In contrast, non-adherence rates and trial size may be influenced by study design. For example, non-adherence rates could be improved in a pragmatic trial by engaging with patient advocates to ensure the treatment instructions are clear to people of all backgrounds [32]. The results of this analysis could be used by clinicians to inform target non-adherence rates. Similarly, the exploration of trial size and treatment effect could be used to inform sample size calculations as part of trial design.

3.2 Methods: Simulation Specifications

This group of scenarios was simulated by modifying parameters in the data generation process described in Section 2.3 according to the ranges described in Table 3.1.

Table 3.1: Parameter variations used to explore different trial characteristics. Settings that varied two parameters explored combinations of the parameters simultaneously. The parameters in this table correspond to the data generation functions defined in Section 2.3.

  Varying Treatment Effect and Trial Size — for each n, vary θ2:
    n ∈ {200, 1000, 2000}
    θ2 ∈ {−1.3, −1, −0.7, −0.5, −0.3, 0, 0.3, 0.5, 0.7, 1, 1.3}
  Event Rate — vary θ0:
    θ0 ∈ {−19, −17.9, −17.3, −16.8, −16, −15, −14.6, −14.2, −13.8, −13.4, −12.8, −12.3, −11.8, −11.3, −10.8}
  Non-Adherence Rates — vary α01, α00:
    α01 ∈ {4.9, 4.4, 3.8, 3.3, 2.7, 2.2, 1.6, 1.1, 0.5}
    α00 ∈ {−9.6, −9.1, −8.5, −8, −7.5, −6.9, −6.4, −5.9, −5.3}
  Effect of Time on Y — vary θ5:
    θ5 ∈ {0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.075, 0.1}

3.2.1 Varying Treatment Effect and Trial Size

The assessments of treatment effect and trial size were conducted in parallel as a grid search, so all combinations of values were explored (i.e., for each trial size considered, simulations were conducted for each treatment effect considered; see the sketch below). Trial sizes were selected to reflect medium to large pragmatic trials. Treatment effects were selected to vary from no treatment effect up to a moderate-to-strong treatment effect, based on Cohen's d, for both positive and negative treatment effects [61]. This assessment of the treatment effect and trial size is the base case for this entire work, so greater emphasis is placed on understanding the statistical properties of the effect estimators in this setting than in subsequent scenarios.
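As a minimal sketch of how such a factorial grid might be enumerated in R, with run_trial() standing in (hypothetically) for the full data generation and estimation pipeline of Section 2.3:

  scenarios <- expand.grid(
    n_per_arm = c(200, 1000, 2000),
    theta2    = c(-1.3, -1, -0.7, -0.5, -0.3, 0, 0.3, 0.5, 0.7, 1, 1.3)
  )

  # One Monte Carlo run of 1000 iterations per scenario row.
  results <- lapply(seq_len(nrow(scenarios)), function(s) {
    replicate(1000, run_trial(n = scenarios$n_per_arm[s],
                              theta2 = scenarios$theta2[s]))
  })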
3.2.2 Event Rate

The event rate was varied via the background risk of the event occurring (θ0). This corresponds to varying the prevalence of the outcome of interest: for example, the mortality rate for a given condition, or the risk of re-admission following a procedure. The parameters selected correspond to event rates between 0.1% and 75%.

3.2.3 Non-Adherence Rates

Non-adherence rates were varied by modifying the parameters α01 and α00, which govern the likelihood that a patient in the treatment arm or control arm, respectively, receives the treatment. The parameter values explored were designed to correspond to 10% increments in non-adherence for each arm, between 10% and 90% non-adherence. Table 3.2 lists which parameter values yielded which rates of non-adherence. Note that the non-adherence rate in the treatment arm is independent of the non-adherence rate in the control arm. This allowed simulations to be conducted where non-adherence was equal as well as unequal between the trial arms, by selecting different combinations of values for α01 and α00.

3.2.4 The Effect of Time on the Outcome

The effect of time on the outcome of interest was assessed in the final group of simulations in this chapter. For this simulation, the effect of time on the outcome of interest was assumed to be linear. The values selected correspond to time having no effect on the outcome, up to time increasing the event rate from 3% to 60%.

Table 3.2: Parameters and the corresponding non-adherence rates observed in each trial arm. The parameters in this table correspond to the data generation functions defined in Section 2.3.

  Arm 1 (treatment)               Arm 0 (control)
  Parameter       Non-Adherence   Parameter        Non-Adherence
  α01 = 4.909     10%             α00 = −9.601     10%
  α01 = 4.364     20%             α00 = −9.066     20%
  α01 = 3.818     30%             α00 = −8.532     30%
  α01 = 3.273     40%             α00 = −7.997     40%
  α01 = 2.727     50%             α00 = −7.463     50%
  α01 = 2.182     60%             α00 = −6.928     60%
  α01 = 1.636     70%             α00 = −6.394     70%
  α01 = 1.091     80%             α00 = −5.859     80%
  α01 = 0.545     90%             α00 = −5.325     90%
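Parameter values like those in Table 3.2 could be found by a simple calibration search of the following kind. Here simulate_nonadherence() is a hypothetical placeholder that would return the observed non-adherence rate for a candidate parameter under the data generation process of Section 2.3, using a large simulated sample so that the observed rate is stable.

  # Grid search for the alpha01 value whose observed non-adherence rate is
  # closest to a 30% target in the treatment arm.
  candidates <- seq(0.5, 5, by = 0.05)
  observed   <- sapply(candidates, function(a) simulate_nonadherence(alpha01 = a))
  candidates[which.min(abs(observed - 0.30))]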
3.3 Results

3.3.1 Varying Treatment Effect and Trial Size

Figure 3.1 plot A illustrates the bias observed for each estimation method as the treatment effect varies. The stabilized IPW and baseline ADJUSTED-PP estimates are approximately unbiased for all assessed treatment effects. In contrast, the ITT estimate is unbiased only for the null treatment effect, with non-null effect estimates biased towards the null. In Figure 3.1 the colour of the points indicates the trial size; for strong treatment effects, the stabilized IPW and baseline ADJUSTED-PP estimates are slightly biased when the trial size is limited to n = 200 participants per arm.

Figure 3.1 plot B illustrates the power observed as the treatment effect changes. Power and type I error both refer to the rate at which the null hypothesis is rejected: the type I error rate is this rate when the null hypothesis is true, and power is this rate when it is false. An estimate that controls the type I error rate will reject the null hypothesis at a rate of α when the treatment effect is null, and will reject the null hypothesis increasingly often as the treatment effect moves away from the null [56]. This means the rejection rate should be approximately 0.05 when the log(OR) effect of treatment is 0, and should increase as the treatment effect moves away from the null.

Figure 3.1: A) Bias plotted against treatment effect for each effect estimate, where the colour corresponds to the trial size in terms of the number of participants per arm. B) Power, or the proportion of times the treatment effect was found to be statistically significant, plotted against the treatment effect for each effect estimate, where the colour corresponds to the trial size. Note that in both panels a log(OR) effect of treatment of 0 corresponds to a null treatment effect.

We see this U-shaped pattern for all effect estimates. For the NAIVE-PP estimate, the "U shape" is not centred at the null treatment effect, indicating undesirable performance. The stabilized IPW and baseline ADJUSTED-PP estimates all yield 100% power for log(OR) treatment effects > 0.7 or < −0.7, indicating that for strong treatment effects a clinical trial could achieve standard statistical power (e.g., 80%) using smaller sample sizes. In contrast, the ITT estimate's power values form a "U shape" centred at the null treatment effect, but do not reach 100% power until the log(OR) treatment effect is > 1.3 or < −1.3.

Considering the precision of the effect estimates, the empirical standard error and model standard error results can be seen in Figure 3.2. For all effect estimates, the empirical standard error decreases as the log(OR) treatment effect increases. This is likely due to strong positive treatment effects leading to increases in the event rate, allowing greater precision in estimation.

Figure 3.2: Empirical standard error and model standard error (columns) for each effect estimate (rows) plotted against the log(OR) effect of treatment, for various trial sizes (colour).

Empirical standard error also decreases as the trial size increases, for all effect estimates. Empirical standard error quantifies the long-run standard deviation of the effect estimates, so comparing the empirical standard error with the model standard error quantifies the accuracy of the standard error values estimated by the models [56]. For all effect estimates presented, the empirical standard error and the model standard error align well for all treatment effects if the trial size is n = 1000 or larger. This indicates that our models are accurately estimating the standard error of the treatment effect.

A final key statistical property to assess is the performance of the confidence intervals, using coverage probability. The results for coverage probability and unbiased coverage probability as the treatment effect varies for different trial sizes can be seen in Figure 3.3. Coverage probability is the proportion of confidence intervals that include the true treatment effect; for a traditional 95% confidence interval, the coverage probability should equal 95%. Under- or over-coverage may be attributed to bias or to incorrect interval width. To remove the effect of bias, Morris et al. [56] proposed bias-adjusted (unbiased) coverage probability as a metric to assess the accuracy of the confidence interval length, and thus of the standard error.

Figure 3.3: Coverage probability and unbiased coverage probability (columns) for each effect estimate (rows) plotted against the log(OR) effect of treatment, for various trial sizes (colour).

In Figure 3.3, we see that the ITT and NAIVE-PP estimates largely suffer from undercoverage, and we can discern that this is due to bias: the ITT estimate has appropriate coverage probability for a null treatment effect (where it is also unbiased), and both estimates have unbiased coverage probability approximately equal to 95% for all treatment effects assessed. For the stabilized IPW and baseline ADJUSTED-PP estimates, both coverage and unbiased coverage probabilities are approximately equal to 95% for all treatment effects and trial sizes.

3.3.2 Event Rate

For this set of simulations, the event rates were varied between 0.1% and 75% among all participants. Figure 3.4 plot A illustrates the bias observed as the event rate varies. Notably, the stabilized IPW estimate and the baseline ADJUSTED-PP estimate are unbiased for event rates between approximately 1% and 75%. The ITT and NAIVE-PP estimates are biased for all event rates. The estimates that are largely unbiased (stabilized IPW and baseline ADJUSTED-PP) show greater bias for small event rates (< 0.01). The variability in bias between iterations also increases with small event rates, for all estimation methods.
The average model standard error, seen in Figure 3.4 plot B, which averages the standard error of the estimated treatment effect across iterations, was also observed to increase when the event rate is below 1%.

Figure 3.4: Bias (plot A) and average model standard error (plot B) observed for each estimation method across 1000 iterations of the simulation as the event rate varies between 0.001 and 0.75.

The cumulative survival method was also used to calculate the ITT, NAIVE-PP, and unstabilized IPW ADJUSTED-PP effect estimates as the event rate varied. Figure 3.5 plot A illustrates the bias for the cumulative survival estimates. The biases observed for the cumulative survival ITT and NAIVE-PP estimates are similar to those of the model-based ITT and NAIVE-PP estimates, at approximately 0.5 and 0.7 respectively (note the different scales between Figures 3.4 and 3.5). For the cumulative survival unstabilized IPW ADJUSTED-PP estimate, we see greater bias than with the model-based stabilized IPW ADJUSTED-PP estimate. For these simpler estimates, low event rates did not increase the variability of the effect estimates but rather decreased the proportion of iterations that converged (illustrated in Figure 3.5 plot B). With the event rate below 1%, as few as 30% of iterations produced valid effect estimates.

Figure 3.5: Bias (plot A) and convergence rate (plot B) observed for each of the cumulative survival type estimates across 1000 iterations of the simulation as the event rate varies between 0.001 and 0.75.

3.3.3 Non-Adherence Rates

Figure 3.6 plot A illustrates the bias observed for each effect estimate as non-adherence rates vary. For these simulations, non-adherence was varied between 10% and 90%. The colours in Figure 3.6 correspond to the observed difference in non-adherence in the treatment arm relative to the control arm, allowing exploration of differential non-adherence. In terms of bias, the stabilized IPW and baseline ADJUSTED-PP estimates are unbiased when the control arm non-adherence is between 10% and 60%, regardless of whether the treatment arm has the same or greater non-adherence. For the ITT and NAIVE-PP estimates, bias increases as the non-adherence rate in the control arm increases; additionally, for a given non-adherence rate in the control arm, bias increases as the relative non-adherence in the treatment arm increases.

Considering the precision of these effect estimates, Figure 3.6 plot B illustrates the average model standard error as the non-adherence rates change. The ITT estimate produces a reasonably constant average model standard error regardless of the non-adherence rate. In contrast, the average model standard errors of the NAIVE-PP, stabilized IPW, and baseline ADJUSTED-PP estimates increase substantially when the non-adherence rate exceeds approximately 60%.

3.3.4 The Effect of Time on the Outcome

The impact of time on the outcome was assessed for linear effects, with the effect of time varying from having no impact on the outcome to having a moderate to strong impact. The bias (plot A) and average model standard error (plot B) for these simulations are presented in Figure 3.7. For this simulation, time was set to have no effect on the outcome when the event rate was approximately 3%. The effect of time on the outcome was then increased until a total event rate of approximately 58% was observed, corresponding to a 55% increase in the event rate due to the effect of time.
The stabilized IPW and baseline ADJUSTED-PP estimates were unbiased for all effects of time assessed. In contrast, the ITT and NAIVE-PP estimates were biased for all effects of time assessed, with a strong impact of time on the outcome leading to the most biased effect estimates. When time had a strong impact on the outcome, more events were observed, resulting in a lower average model standard error for all effect estimates.

Figure 3.6: Bias (plot A) and average model standard error (plot B) against non-adherence rates in the control arm for each effect estimate. The colour of the points corresponds to the relative non-adherence in the treatment arm, where 0% means the non-adherence in the treatment arm equals that in the control arm, and 10% indicates the non-adherence in the treatment arm is 10% greater than in the control arm.

Figure 3.7: Bias (plot A) and average model standard error (plot B) against the increase in event rate due to the effect of time on the outcome.

3.4 Discussion: Interpretation

Overall, varying the event rate has shown that the stabilized IPW and baseline ADJUSTED-PP estimates are unbiased when the event rate is greater than 1% for our simulated trial (n = 2000). Effect estimates for event rates < 0.01 are biased, with high variability. While this is likely due to an insufficient sample size to provide adequate power for this analysis, these findings are interesting because our method of using pooled logistic regression to estimate a log(OR) typically assumes a rare outcome [40–42]. Isolating the impact of the rare outcome assumption itself would likely require increasing the sample size of the simulation while decreasing the event rate. The cumulative survival curve effect estimates, in contrast, do not require this assumption.

The trial size simulations have illustrated which combinations of treatment effects and trial sizes lead to unbiased effect estimates and desirable power. The stabilized IPW and baseline ADJUSTED-PP effect estimates produce reasonably unbiased results and achieve high power for smaller treatment effects than the ITT estimate. These results also illustrate that the ITT estimate is both unbiased and appropriately powered when the treatment effect is null, consistent with the results of Young et al. [27]. The ITT estimates have also been shown to be biased towards the null in the presence of non-adherence, which is consistent with our expectations. These results would be particularly helpful for estimating the sample sizes required to achieve a desired power for a hypothesized treatment effect. The combination of the standard error results in Figure 3.2 and the coverage probability results in Figure 3.3 provides strong evidence that the clustered sandwich estimates are unbiased estimates of the standard error of the treatment effect estimate for all treatment effects and trial sizes, validating the use of standard errors calculated via clustered sandwich estimators.

The non-adherence simulations have provided valuable context for the performance of these effect estimates under differing adherence rates. Pragmatic trials are likely to experience higher rates of non-adherence than RCTs, due to their design aiming to mimic standard of care [2, 11].
It is not uncommon for pragmatic trials to have non-adherence rates between 30% and 60% [28, 62]. Our results indicate that, under our simulated data generation mechanism, the stabilized IPW and baseline ADJUSTED-PP estimates are unbiased when the non-adherence rate in the control arm is as high as 60%, regardless of whether the non-adherence rate in the treatment arm is higher. These results are promising and indicate that these effect estimates will be valuable for a wide range of pragmatic trials, although the exact thresholds may vary depending on the underlying causal relationships.

The effect of time on the outcome of interest did not seem to substantially affect the performance of the effect estimates. This indicates that the stabilized IPW and baseline ADJUSTED-PP estimates would be suitable for trials in both chronic (where time does not substantially affect outcomes) and degenerative (where time does substantially affect outcomes) disease areas.

3.5 Conclusion

This section has provided simulation results for the impact of varying trial characteristics on the performance of different treatment effect estimators. By varying the event rate, we have assessed how well each effect estimate performs if clinical researchers were to assess more or less common outcomes. By varying the treatment effect, we have expanded the work of Young et al. [27] to compare these effect estimates when the effect of treatment is non-null. We have seen that the stabilized IPW and baseline ADJUSTED-PP estimates are unbiased and produce estimates with low variability for all assessed treatment effects and effects of time on the outcome. Non-adherence has been shown to increase the bias of effect estimates, particularly when it exceeds 60%. Overall, these results will provide valuable insight to clinicians when designing pragmatic trials and assessing potential effect estimates.

Chapter 4
Modifying the DAG

4.1 Introduction

This group of simulations aims to explore the base case causal diagram proposed in Section 2.1 by modifying relationships between key variables via simulation. This section will highlight the impacts of adding, removing, or varying different causal relationships on the performance of each of the effect estimates.

In the base case, our simulation consists of a two-armed randomized pragmatic trial. There is one baseline covariate B that is measured and affects an individual's receipt of treatment A and the time-varying covariates (L1 and L2, collectively referred to as L). The time-varying covariates also affect an individual's receipt of treatment. The outcome for an individual is influenced by their receipt of treatment and their baseline covariate. In this causal setting, the baseline covariate B is a confounder, as it affects both the receipt of treatment and the outcome.

The main scenarios explored in this section assess the performance of the estimators under the following modifications to the DAG:

• Baseline covariate B acting as a risk factor rather than a confounder. This means removing the causal relationships between B and the receipt of treatment A, as well as the relationship between B and the time-varying covariates L.

• Baseline covariate B being unmeasured. This does not change the causal relationships depicted in Figure 2.1, but results in altered analysis methods, as B cannot be included in effect estimation.
• Time-varying covariates L1 and L2 affecting the outcome. This corresponds to additional causal relationships between each time-varying covariate (L1 and L2) at time t and the outcome Y at time t+1.

• Time-varying covariates L1 and L2 having a varying effect on adherence. This corresponds to varying the causal relationship between L1 and L2 at time t and the receipt of treatment A at time t, including positive, negative, and null effects.

4.2 Methods: Simulation Specifications

This section consists of four main situations, the parameters of which are described in Table 4.1. All parameters in this table correspond to the description of the data generating process in Section 2.3.

Table 4.1: Parameter variations used to modify and assess causal relationships. The parameters in this table correspond to the data generation functions defined in Section 2.3.

  B as a Risk Factor — set α41 = α40 = 0 and β1−1 = β2−1 = 0, then vary θ1:
    θ1 ∈ {−1.5, −1.3, −1, −0.7, −0.3, 0, 0.3, 0.7, 1, 1.3, 1.5}
  B Unmeasured (base case) — vary θ2; do not use B in effect estimation:
    θ2 ∈ {−1.5, −1.3, −1, −0.7, −0.3, 0, 0.3, 0.7, 1, 1.3}
  B Unmeasured (B only affects L at t = 0) — set β1−1 = β2−1 = 0 for t > 0; vary θ2; do not use B in effect estimation:
    θ2 ∈ {−1.5, −1.3, −1, −0.7, −0.3, 0, 0.3, 0.7, 1, 1.3}
  L Effect on Non-Adherence — vary α11 = α21 and α10 = α20:
    [α11 = α21, α10 = α20] ∈ {[0.9, −0.2], [0.2, −0.08], [0, 0], [−0.11, 0.07], [−0.23, 0.13], [−0.38, 0.19], [−0.7, 0.27]}
  L Effect on Outcome — vary θ3 = θ4:
    θ3 = θ4 ∈ {−0.28, −0.2, −0.12, −0.06, 0, 0.05, 0.1, 0.18, 0.27}

4.2.1 Baseline Covariate as a Risk Factor

To consider the baseline covariate B as a risk factor rather than a confounder, the influence of B on A and on L was set to 0 (α41 = α40 = 0 and β1−1 = β2−1 = 0, respectively). The degree of risk associated with the baseline covariate was then varied between moderately positive and moderately negative (θ1). In this situation, estimating the effect of receiving treatment on the outcome is not confounded; as such, we expect both the NAIVE-PP and ADJUSTED-PP estimates to be unbiased.

4.2.2 Baseline Covariate Unmeasured

The second situation explored in this section is the baseline covariate B being unmeasured, for example due to cost or flawed trial design. If the baseline covariate is unmeasured, then not all of the estimators assessed in this research are applicable; specifically, the stabilized IPW and baseline ADJUSTED-PP estimates are not feasible. As such, in this section the results for the Weakly Stabilized Inverse Probability Weighted (WSIPW) and Truncated Weakly Stabilized Inverse Probability Weighted (TWSIPW) ADJUSTED-PP estimates are presented instead (see the sketch below). Two variants of the unmeasured baseline covariate are considered: 1) under the base case DAG, and 2) when the effect of B on the time-varying covariates L is limited to the first time step. For each variant, the performance of each estimate that does not require the baseline covariate is assessed as the treatment effect varies.
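For reference, a minimal sketch of the weakly stabilized weight models, which differ from the stabilized models sketched in Chapter 2 only in omitting the unmeasured B; the data frame and column names are again hypothetical placeholders.

  # Numerator and denominator adherence models without the baseline covariate B.
  num_fit_ws <- glm(A ~ t,           family = binomial(), data = dat)
  den_fit_ws <- glm(A ~ t + L1 + L2, family = binomial(), data = dat)
  # The per-subject cumulative ratio of predicted probabilities is then formed
  # exactly as in Chapter 2; the TWSIPW variant again caps the resulting
  # weights at the 1st and 99th percentiles.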
4.2.3 The Effect of Time-Varying Covariates on Non-Adherence

The third situation explored in this chapter is varying the strength of the impact of the time-varying covariates on adherence. For this setting, parameter variations were selected such that the non-adherence rates observed in the two arms were equal, and such that the coefficients associated with L1 and L2 were equal within each arm. The parameter range selected varied non-adherence rates from 20% to 80%, with the time-varying covariates having no impact on adherence at a non-adherence rate of 40%.

4.2.4 The Effect of Time-Varying Covariates on the Outcome

The final situation considered in this section is the time-varying covariates L directly influencing the outcome Y. This corresponds to the addition of a directed edge between each L and Y in the DAG illustrated in Figure 2.1. The parameter range for this situation corresponded to an event rate of 30% when the time-varying covariates had no impact on the outcome; the coefficients θ3 and θ4 were then set equal and varied to increase or decrease the event rate by up to 20%.

4.3 Results

4.3.1 Baseline Covariate as a Risk Factor

When the baseline covariate B acts as a risk factor rather than a confounder, the NAIVE-PP, stabilized IPW, and baseline ADJUSTED-PP estimates are unbiased (Figure 4.1, plot A). The ITT estimate is biased, with the bias not appearing to be correlated with the strength of the effect of the baseline covariate B on the outcome Y. In this situation, the estimation of the effect of receipt of treatment A on the outcome is no longer confounded; however, the simulated trials still have a non-adherence rate of approximately 30% in each trial arm, resulting in a biased ITT effect estimate.

The average model standard error results are presented in Figure 4.1 plot B. Both the average model standard error and its variability decrease as the log(OR) effect of the baseline covariate on the outcome increases, likely due to an increase in the event rate leading to greater precision in the effect estimates.

Figure 4.1: Bias (plot A) and model standard error (plot B) observed for each estimate when the baseline covariate B is a risk factor rather than a confounder. The X-axis corresponds to varying whether B has a strong positive or negative effect on Y.

4.3.2 Baseline Covariate Unmeasured

The bias results when the baseline covariate B is unmeasured, but all causal relationships described in Figure 2.1 are present, can be seen in Figure 4.2 plot A. All estimates are biased for all treatment effects, with the exception of the ITT estimate for the null treatment effect. The ITT estimate always under-estimates non-zero treatment effects, indicating that it is biased towards the null. All PP estimates are biased, with the WSIPW and TWSIPW ADJUSTED-PP estimates consistently less biased than the NAIVE-PP estimate. Regarding precision, the TWSIPW estimate has a lower average model standard error than the WSIPW estimate. Figure 4.2 plot B illustrates the average model standard error of the estimators under consideration as the effect of treatment varies. The ITT, NAIVE-PP, and TWSIPW ADJUSTED-PP estimates show a low average model standard error that decreases slightly as the treatment effect increases. In contrast, the WSIPW ADJUSTED-PP estimate shows a higher average model standard error with greater variability.

Figure 4.2: Bias (plot A) and average model standard error (plot B) observed for each estimate as the log(OR) effect of treatment varies when the baseline covariate B is unmeasured.
This illustrates the benefit of truncation for IPW ADJUSTED-PP estimates.

Figure 4.3 illustrates the related scenario where B is unmeasured but only directly influences the time-varying covariates at the first time point. The bias results (plot A) are similar to those of Figure 4.2: the ITT estimate is unbiased for null treatment effects and consistently under-estimates non-null treatment effects. All three PP estimates are biased, with the exception of when the treatment effect is log(OR) = 1. Figure 4.3 plot B illustrates the average model standard error, which decreases as the treatment effect increases for the estimators shown. Compared to the previous situation, the difference in model standard error between the ITT and WSIPW ADJUSTED-PP estimates is smaller.

Figure 4.3: Bias (plot A) and average model standard error (plot B) observed for each estimate as the log(OR) effect of treatment varies when the baseline covariate B is unmeasured and only affects the time-varying covariates L1 and L2 at time t = 0.

4.3.3 The Effect of Time-Varying Covariates on Non-Adherence

Figure 4.4 plot A illustrates the bias observed as the effect of the time-varying covariates varies from decreasing non-adherence by up to 20% to increasing non-adherence by up to 40%. When the time-varying covariates have no effect on adherence, the non-adherence rate is 40% in each arm; in this setting, non-adherence rates have been tuned to be equal in both trial arms. The ITT estimate is biased for all values, and the NAIVE-PP estimate is unbiased only when the non-adherence rates have increased by 20% due to the time-varying covariates. The stabilized IPW and baseline ADJUSTED-PP estimates are approximately unbiased until the non-adherence rate reaches 80% (40% non-adherence plus a 40% increase due to L). Figure 4.4 plot B illustrates the model standard error as the effect of L on adherence varies. The ITT estimate shows approximately the same model standard error regardless of the effect of L on adherence. All three PP estimates show the model standard error increasing as the non-adherence rate increases due to the time-varying covariates. The high model standard error when the non-adherence rate reaches 80% is consistent with the results in Section 3.3.

Figure 4.4: Bias (plot A) and model standard error (plot B) observed for each estimate as the effect of the time-varying covariates L on non-adherence varies. Non-adherence is designed to be equal in both trial arms and is equal to 40% when L has no effect on adherence.

4.3.4 The Effect of Time-Varying Covariates on the Outcome

Figure 4.5 plot A illustrates the bias of each estimate as the effect of the time-varying covariates varies from decreasing to increasing the event rate by 20%. The ITT and NAIVE-PP estimates are biased for all parameter settings. The stabilized IPW and baseline ADJUSTED-PP estimates are unbiased only when the time-varying covariates do not affect the outcome. Figure 4.5 plot B illustrates the average model standard error as the effect of the time-varying covariates on the outcome varies. For all estimators shown, the model standard error and its variability decrease as the event rate increases due to the influence of L.
This is likely due to an increased event rate increasing the number of events observed, and thus the precision of the effect estimates.

Figure 4.5: Bias (plot A) and model standard error (plot B) observed for each estimate as the effect of the time-varying covariates L on the outcome varies.

4.4 Discussion: Interpretation

This chapter has explored varying the relationships presented in the DAG in Figure 2.1. The relationships varied can be described as four different simulation settings: the baseline covariate acting as a risk factor, the baseline covariate being unmeasured, varying the effect of the time-varying covariates on adherence, and varying the effect of the time-varying covariates on the outcome.

If the baseline covariate B is a risk factor rather than a confounder, all PP effect estimates are unbiased, even in the presence of 30% non-adherence. If this situation arises in a trial, the analysis is easier, as the assessment of the treatment effect is no longer confounded by non-adherence. In contrast, if the baseline covariate B is an unmeasured confounder, the estimators available to analysts are limited. ITT can provide an unbiased treatment effect estimate in the null treatment effect setting. Our simulations also showed that the truncated and untruncated weakly stabilized IPW ADJUSTED-PP estimates were unbiased when the log(OR) effect of treatment was 1; in practice, however, an estimator that is unbiased for a single effect size is unlikely to be of use to analysts. We can conclude that in the presence of unmeasured baseline confounding, estimation of the treatment effect will be biased, which is consistent with the assumptions the IPW method makes [38].

Our simulations have shown that varying the effect of the time-varying covariates on adherence does not affect the performance of the estimators. The stabilized IPW and baseline ADJUSTED-PP estimates were unbiased until the non-adherence rate in both arms reached 80%, which is consistent with the results from varying the adherence rate independently of the trial quantities (see Section 3.3.3). This shows that these ADJUSTED-PP estimates are robust to the influence of the time-varying covariates on adherence. In contrast, if the time-varying covariates directly affect the outcome, all of the presented effect estimates are biased, as the time-varying covariates are then also acting as confounders [63]. Alternative analysis methods, such as including the time-varying covariates in the outcome model to allow for adjustment, may yield better performance [64].

Additional simulations that varied causal relationships have been omitted from this thesis. These include a group of three simulations which varied the strength of the effect of the baseline covariate B on adherence, on the time-varying covariates L, and on the outcome Y, respectively. These simulations did not identify substantial differences in performance as the strength of the effect of the baseline covariate B varied: consistently, the stabilized IPW and baseline ADJUSTED-PP effect estimates were unbiased, while the ITT and NAIVE-PP estimates were biased.

Another simulation assessed the case where the baseline covariate B affects the time-varying covariates L at the first time point but not at any other time point (i.e., the causal arrow between B and L0 is retained, but the arrows between B and Li, i ∈ {1, 2, 3, ..., K}, are removed) as the treatment effect was varied.
In this case no trend was observed: again, the stabilized IPW and baseline ADJUSTED-PP effect estimates were unbiased, while the ITT and NAIVE-PP estimates were biased, except for the ITT estimate when the treatment effect is null.

Two final additional simulations removed the baseline covariate B from the data generating process entirely (i.e., the arrows between B and L, A, and Y were removed). By removing the effect of B, the estimation of the causal effect of receiving the treatment is no longer confounded. Accordingly, our simulation showed that when B was removed from the data generation process and the treatment effect was varied, all PP estimates were unbiased, and the ITT estimate was unbiased only for null treatment effects. Building on this situation, we assessed the case where the baseline covariate B was removed from the data generating process but the time-varying covariates L had a direct effect on the outcome (i.e., adding a causal arrow between L and Y). In this situation, although the baseline covariate B has been removed, confounding is present because the time-varying covariates L affect both the receipt of treatment A and the outcome Y. In this case, all effect estimates were biased when L had a non-null effect on the outcome.

4.5 Conclusion

This section has highlighted key limitations of analyzing pragmatic trial data in the presence of non-adherence. Notably, the presence of unmeasured confounding violates the exchangeability assumption that IPW adjustment requires [38]. Additionally, these results highlight the need to understand the underlying causal diagram prior to analysis in order to select the most appropriate model. If the time-varying covariates directly influence the outcome, alternative model specifications should be considered.

Chapter 5
Sparse Follow-Up

5.1 Introduction

Pragmatic trials are designed to be representative of standard care and to minimize the burden of participation for patients and clinicians [2, 11]. This means that the follow-up period is often designed to minimize regularly scheduled formal follow-up visits, and instead to utilize administrative datasets, mailed questionnaires, or follow-up appointments that occur only in response to a specific event (e.g., hospitalization or recurrence) [2, 11]. While these methods are advantageous for minimizing cost and burden on participants, combining follow-up data from administrative datasets and other sources may lead to inconsistency in the frequency of data collection for adherence, the time-varying covariates, and the outcome of interest. In this chapter we examine the impact of sparse follow-up measures for the receipt of treatment (A), the time-varying covariates (L1, L2), or both simultaneously.

Adherence information may or may not be subject to sparsity during the follow-up period, depending on the format of the intervention. Often in studies of pharmacological treatments, adherence can be estimated from prescription dispensation records [2, 36]. Prescription dispensation records may underestimate rates of non-adherence but are likely to provide consistent measurements [32, 54]. Self-reporting, physicians' estimates, and assays that measure compound concentration within participants are methods of assessing adherence that may be subject to sparse measurement within pragmatic trials [2, 32].

Sparse assessment of time-varying covariates is an area of active research for adjusted effect estimates such as the IPW ADJUSTED-PP estimate we are implementing [27, 35–37].
When designing a pragmatic trial, clinicians must weigh the burden of acquiring time-varying covariate information against its analytic value when adjusting for non-adherence. The goal of this chapter is to provide valuable context for clinicians designing trials to balance these competing priorities.

In this work we do not explore varying the interval for measuring the outcome. These simulations are based on an outcome of interest that is the (first) occurrence of an event; examples include death, hospitalization, emergency room visits, and recurrence of cancer. Such outcomes are well defined, relevant to patients, and unlikely to be subject to sparse measurement [2]. Instead, these scenarios assess the situations where the receipt of treatment, the time-varying covariates, or both are subject to sparse measurement.

The specific scenarios explored within the sparse follow-up context are: varying the measurement schedule, varying the treatment effect given sparse follow-up, and varying the effect of non-adherence given sparse follow-up. Each of these scenarios was conducted with the receipt of treatment (A), the time-varying covariates (L1, L2), or both sparse during follow-up. Additionally, LOCF imputation was compared to CC analysis as strategies for handling the sparse follow-up information. Both LOCF imputation and CC analysis are common strategies for analyzing data with missing values: LOCF fills in missing values using the last observation, while CC restricts the dataset used in the analysis to the observations that are complete. By analyzing only complete observations, a CC analysis reduces the sample size, and thus the statistical power, of the trial analyzed; a sketch of both strategies follows below.
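The following R sketch imposes an every-m-month measurement schedule on a long-format trial data set and applies each handling strategy. The data frame dat (one row per subject per month, with baseline t = 0 always measured) and its column names are hypothetical placeholders.

  library(dplyr)
  library(tidyr)

  m <- 12  # months between scheduled measurements

  # Keep adherence and time-varying covariates only at scheduled months.
  sparse <- dat %>%
    mutate(across(c(A, L1, L2), ~ ifelse(t %% m == 0, .x, NA)))

  # LOCF: carry each subject's most recent observed value forward.
  locf <- sparse %>%
    group_by(id) %>%
    arrange(t, .by_group = TRUE) %>%
    fill(A, L1, L2, .direction = "down") %>%
    ungroup()

  # CC: restrict the analysis to fully observed person-months.
  cc <- sparse %>% filter(!is.na(A) & !is.na(L1) & !is.na(L2))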
For a fixed months between measurements of49m= 12, the effect of treatment varied between log(OR)=−1.3 and log(OR)= 1.3.Again, the impacts of which variables are sparse and the method of handling thesparsity are also assessed. This group of simulations was conducted with a fixednon-adherence rate of 30% in each trial arm.5.2.3 Varying Non-AdherenceThe third group of sparse follow-up measurement simulations address the relation-ship between non-adherence rates and sparsity of follow-up. For these simulations,non-adherence rates were set to be equal in both the control and treatment arms,and varied between 10% and 90%. The measurement frequency was then variedbetween monthly measurements (m = 1) and measures every two years (m = 24).LOCF was used to impute the sparse follow-up measures. This group of simulationswas conducted with a fixed treatment effect of log(OR) =−0.7.5.3 Results5.3.1 Varying Measurement ScheduleFigure 5.1 illustrates the bias observed for each estimation method (rows), for eachset of sparse follow-up variables (columns), for each missing data handling method(colours). The ITT and NAIVE-PP estimates are biased for all measurement sched-ules, sets of sparse follow-up variables, and missing data methods.The stabilized IPW ADJUSTED-PP estimates are largely unbiased, with very in-frequent measurements (m > 20) producing substantially biased estimates whencomputed using CC. The LOCF stabilized IPW ADJUSTED-PP estimates are slightlybiased for infrequent measurement schedules (starting at annual measurements,m = 12) where adherence is sparse. In contrast, the baseline ADJUSTED-PP esti-mate is largely unbiased with bias increasing marginally as the frequency of mea-surements decreases.For all estimation methods, LOCF produces lower variation in bias, especiallyfor infrequent measurement schedules. When CC is conducted on the least frequentmeasurements, the greatest variability in bias is observed.Figure 5.2 illustrates the average model standard error observed across the sim-50Figure 5.1: Bias observed in each estimation method across 1000 iterations ofthe simulation as the measurement schedule during the follow-up periodvaried from monthly measures (m= 1) to measures every 2 years (m=24). Columns correspond to whether the adherence (A), time-varyingcovariates (L1&L2), or both (A,L1&L2) were sparse during the follow-up period. Rows correspond to the 4 main estimation methods. Colourdenotes whether the analysis was based on CC or LOCF imputation.ulation scenarios. For the ITT, NAIVE-PP, and baseline ADJUSTED-PP the CCanalyses produce consistently greater model standard error than LOCF, with thedifference becoming more pronounced with less frequent measurement schedules.This trend is also observed for the stabilized IPW ADJUSTED-PP estimate whenthe time-varying covariates are the only variables subject to sparse measurements.However, when adherence is subject to sparse measurement, the stabilized IPWADJUSTED-PP sees a significant increase in model standard error when the mea-surement schedule is every 21 months or less frequent (m ≥ 21). When the mea-51Figure 5.2: Average model standard error observed in each estimationmethod across 1000 iterations of the simulation as the measurementschedule during the follow-up period varied from monthly measures(m = 1) to measures every 2 years (m = 24). Columns correspond towhether the adherence (A), time-varying covariates (L1&L2), or both(A,L1&L2) were sparse during the follow-up period. Rows correspondto the 4 main estimation methods. 
When the measurement schedule is every 30 months (m = 30), we see a high degree of variability in the model standard error for both ADJUSTED-PP estimates presented.

5.3.2 Varying Treatment Effect

Figure 5.3 illustrates the bias for each estimate (rows), for each set of sparse follow-up variables (columns), and for each missing data handling method (colours). Note that these results are for a fixed measurement schedule of annual measurements (m = 12). The ITT estimates are unbiased for null treatment effects, regardless of the set of sparse follow-up variables or the missing data handling method. When the effect of treatment is strongly negative (log(OR) = −1.3), the ITT estimate is approximately unbiased, which is not observed when follow-up measures are taken monthly (see Section 3.3.1). The NAIVE-PP estimates are biased for all treatment effects.

Figure 5.3: Bias observed in each estimation method across 1000 iterations of the simulation as the effect of treatment varied. Columns correspond to whether the adherence (A), time-varying covariates (L1 & L2), or both (A, L1, & L2) were sparse during the follow-up period. Rows correspond to the 4 main estimation methods. Colour denotes whether the analysis was based on CC or LOCF imputation.

Both the stabilized IPW and the baseline ADJUSTED-PP estimates are approximately unbiased when LOCF is applied to handle the sparse variables. The stabilized IPW and baseline ADJUSTED-PP estimates show slight increases in bias for strong treatment effects (log(OR) ≥ 1 or log(OR) ≤ −1) when LOCF is used and adherence is sparse (columns A and A, L1, & L2). When CC analysis is conducted, the stabilized IPW and baseline ADJUSTED-PP estimates show slight bias, with greater variability in bias compared to LOCF. The CC analysis produces particularly biased stabilized IPW ADJUSTED-PP estimates when the log(OR) effect of treatment is strongly negative (log(OR) ≤ −1).

Figure 5.4 illustrates the power for each estimate, for both missing data handling methods, as the effect of treatment varies. No substantial difference in power was observed depending on which variables were sparse during follow-up, so this plot presents the findings for when both adherence and the time-varying covariates were sparse. CC analysis consistently has less power than LOCF imputation. The difference in power between the CC and LOCF analyses is most pronounced for strong non-zero treatment effects for the ITT, stabilized IPW, and baseline ADJUSTED-PP estimates. The ITT, stabilized IPW, and baseline ADJUSTED-PP estimates all achieve the nominal type I error rate of 0.05 for null treatment effects, regardless of the missing data method.

Figure 5.4: Power observed in each estimation method across 1000 iterations of the simulation as the treatment effect varied. Colour denotes whether the analysis was based on CC or LOCF imputation. Note that no difference was observed depending on whether the adherence (A), time-varying covariates (L1 & L2), or both (A, L1, & L2) were sparse during the follow-up period; this figure corresponds to when both adherence and the time-varying covariates were subject to sparse measurement during the follow-up period.
The baseline ADJUSTED-PP estimate combined with LOCF reaches 100% power sooner than the stabilized IPW ADJUSTED-PP estimate combined with LOCF (|log(OR)| = 0.7 compared to |log(OR)| = 1).

Additionally, this group of simulations showed that effect estimates calculated using LOCF-imputed data converged 100% of the time for the considered effect estimates, while cumulative survival effect estimates calculated using CC often failed to converge. With strong negative treatment effects (log(OR) = −1.3), convergence was as low as 93% for the cumulative survival treatment effect estimates.

5.3.3 Varying Non-Adherence

Figure 5.5 illustrates the bias as the non-adherence rate varies, for different numbers of months between measurements (colour). In this simulation, non-adherence rates were set to be equal in both treatment arms, the effect of treatment was fixed at log(OR) = −0.7, and LOCF was implemented during the analysis. The results presented are based on having both the receipt of treatment A and the time-varying covariates L subject to sparse measurement.

The ITT estimate is consistently biased, with bias increasing with the non-adherence rate. The measurement frequency does not impact the bias or the variability of the bias (i.e., all 5 lines coincide) for the ITT estimate. This is consistent with expectations, as sparse time-varying covariate or adherence data does not affect the calculation of an ITT effect estimate. In contrast, the NAIVE-PP estimate is biased for all non-adherence rates, with infrequent measurements yielding slightly greater bias at high rates of non-adherence.

Figure 5.5: Bias observed in each estimation method across 1000 iterations of the simulation as the rate of non-adherence varies between 10% and 90%, with equal non-adherence rates in each arm. Colour denotes the time between measurements for the sparse adherence and time-varying covariates. Both the time-varying covariates and adherence were sparse during the follow-up period and were imputed using LOCF.

The stabilized IPW ADJUSTED-PP estimate shows bias increasing as the non-adherence rates increase. When non-adherence exceeds approximately 40%, bias increases for a measurement schedule of every 24 months; in contrast, monthly measurements do not show this increase in bias until non-adherence rates exceed 60%. The variability of the bias increases as non-adherence rates increase. This indicates that high rates of non-adherence combined with infrequent measurements degrade performance faster than high rates of non-adherence on their own.

The baseline ADJUSTED-PP estimate is approximately unbiased for all non-adherence rates, with measurements taken every 24 months yielding slightly higher bias than monthly measurements.

The measurement schedule resulted in moderate and mild increases in bias for the stabilized IPW and baseline ADJUSTED-PP estimates, respectively, and only when adherence was one of the variables subject to sparse follow-up. When only the time-varying covariates were subject to sparse follow-up, all estimates had the same performance regardless of the measurement schedule (not shown). For all estimates, 100% convergence was observed regardless of the non-adherence rate.

5.4 Discussion: Interpretation

In almost all situations, a decreased frequency of measurements during the follow-up period resulted in increased bias and variability in the treatment effect estimates produced by the generally reliable stabilized IPW and baseline ADJUSTED-PP estimation methods.
In this simulation, the degradation of performance occurred when measurements were scheduled annually or less often. These results are consistent with those of Young et al. [27] and illustrate the importance of designing pragmatic trials to collect regular measurements of adherence and time-varying covariates during the follow-up period. Young et al.'s work focused on RD effect estimates, while our work implements OR effect estimates, highlighting that this finding holds whether RD or OR effect estimates are used.

These simulations have shown that for simple trials, where there are few time-varying covariates with which to build more complex imputation models, LOCF produces low-bias, low-variability estimates compared to CC. The CC approach is also unlikely to appeal to clinicians, as it requires discarding time points and observations that are incomplete, leading to reduced power compared to imputation methods. In the context of pragmatic trials, LOCF is favourable compared to CC both in terms of statistical properties and in terms of making better use of the collected data. The current literature on the analysis of health data with sparse follow-up favours multiple imputation and inverse probability of missingness methods over LOCF and CC, as they require fewer assumptions regarding the mechanism of missingness [35–37, 65]. However, multiple imputation would be best implemented if more baseline or time-varying covariates were available to build an informative imputation model, and it was thus not applied in this simulation.

These simulations have also shown the impact of whether the adherence, the time-varying covariates, or both are sparse during the follow-up period. When varying the measurement schedule or the treatment effect, it was observed that having sparse measures of adherence led to a slight increase in the bias of the LOCF-imputed stabilized IPW and baseline ADJUSTED-PP estimates. In contrast, sparse time-varying covariates on their own did not substantially increase bias.

The increase in bias observed by decreasing the frequency of measures in the follow-up period was somewhat more pronounced for strong treatment effects (|log(OR)| ≥ 1) for the LOCF-imputed stabilized IPW and baseline ADJUSTED-PP estimates.

Decreasing the frequency of the follow-up measurements resulted in higher bias when the non-adherence rates in both arms exceeded 40% for the stabilized IPW ADJUSTED-PP estimate combined with LOCF imputation. The baseline ADJUSTED-PP estimate saw marginal increases in bias as the measurement frequency decreased, but it remained unbiased at all non-adherence rates. This indicates that the combination of non-adherence and sparse follow-up measurements can seriously limit the performance of popular effect estimates.

5.5 Conclusion

This section has highlighted the importance of regular measures of adherence and time-varying covariates during the follow-up period of pragmatic trials. For simple trials, where there are few baseline or time-varying covariates with which to build imputation models, LOCF imputation produces low-bias, low-variability estimates compared to CC analysis. Sparse adherence data has been shown to increase bias at a greater rate than sparse time-varying covariate data for the same measurement schedule.
Current literature on the analysis of health data with sparse follow-up data favours multiple imputation and inverse probability of missingness methods over LOCF and CC, as they require fewer assumptions regarding the mechanism of missingness [35-37, 65]. However, multiple imputation would be best implemented if there were more baseline covariates or time-varying covariates available to build an informative imputation model, and thus was not applied in this simulation.

These simulations have shown the impact of whether the adherence, the time-varying covariates, or both are sparse during the follow-up period. When varying the measurement schedule or the treatment effect, it was observed that having sparse measures of adherence led to a slight increase in the bias of LOCF imputed stabilized IPW and baseline ADJUSTED-PP estimates. In contrast, sparse time-varying covariates on their own did not substantially increase bias.

The increase in bias observed by decreasing the frequency of measures in the follow-up period was somewhat more pronounced with strong treatment effects (|log(OR)| >= 1) for the LOCF imputed stabilized IPW and baseline ADJUSTED-PP estimates.

Decreasing the frequency of the follow-up measurements resulted in higher bias when non-adherence rates in both arms exceeded 40% for the stabilized IPW ADJUSTED-PP estimate combined with LOCF imputation. The baseline ADJUSTED-PP estimate saw marginal increases in bias as the measurement frequency decreased but remained unbiased at all non-adherence rates. This indicates that the combination of non-adherence and sparse follow-up measurements can seriously limit the performance of popular effect estimates.

5.5 Conclusion

This section has highlighted the importance of regular measures of adherence and time-varying covariates during the follow-up period for pragmatic trials. For simple trials where there are few baseline or time-varying covariates to build imputation models, LOCF imputation produces low bias and low variability estimates when compared to CC analysis. Sparse adherence data has been shown to increase bias at a greater rate than sparse time-varying covariate data for the same measurement schedule. Additionally, this work has illustrated that sparse follow-up measurements combined with high rates of non-adherence (> 40%) result in increased bias and variability in the estimation of stabilized IPW ADJUSTED-PP estimates.

Chapter 6: Discussion and Conclusions

6.1 Summary of Findings

In this study, we aimed to contrast the ITT, NAIVE-PP, and ADJUSTED-PP estimates in a two-armed pragmatic trial setting with substantial non-adherence. We have accomplished this aim for a variety of trial characteristics, causal relationships, and measurement schedules during the follow-up period.

Our results have consistently shown that the ITT effect estimate is biased towards the null in the presence of non-adherence. The stabilized IPW and baseline ADJUSTED-PP estimates perform well in most situations but may yield biased results in the presence of high rates of non-adherence (e.g., greater than 60% according to our simulation's data generation mechanism), low event rates (e.g., less than 1%), or infrequent measurements during the follow-up period. The analysis of sparse measurement schedules (e.g., measurements as infrequent as every 24 months) highlighted that LOCF produces estimates with lower bias and variance than CC and works well when applied with stabilized IPW or baseline ADJUSTED-PP estimates. In the presence of unmeasured confounding, all effect estimates considered were biased, illustrating the importance of identifying potential confounders prior to study design.

6.2 Our Contributions

This work builds on existing literature that calls for the use of PP estimates in pragmatic trials by illustrating the efficacy of ADJUSTED-PP estimates at producing unbiased treatment effect estimates in the presence of non-adherence, for a variety of situations [3, 12, 26-28]. Specifically, compared to Young et al. [27]'s work, we have shown that IPW ADJUSTED-PP estimates are effective for recovering non-null treatment effects whether time-varying covariates are regularly or infrequently measured. Our evaluation of sparse measurement during the follow-up period expanded on that of Young et al. [27], producing similar recommendations when assessing non-null treatment effects.

By modifying trial characteristics this work has also illustrated that trials with low sample size (e.g., n = 200 participants per arm according to our simulations) often produce unbiased effect estimates with sufficient coverage, but to achieve sufficient power, larger trial sizes are required. In trials where the event rate is low (e.g., < 1% according to our simulations), model-based estimates have high variability and converge 100% of the time, while cumulative survival-type estimates have lower variability and converge less than 100% of the time.

By exploring modifications to the causal diagram we have shown that when the baseline covariate B is a risk factor rather than a confounder, all PP estimates are unbiased, while the ITT effect is biased. This shows that non-adherence in a trial biases ITT treatment effects towards the null, while NAIVE-PP estimates are only biased in the presence of non-adherence if a factor that influences non-adherence also influences the outcome. Similarly, we have shown that if the baseline covariate B is unmeasured and still influences both the receipt of treatment and the outcome, all estimates are biased for all treatment effects, with the exception of the ITT effect estimate for null treatment effects. This indicates that in the presence of unmeasured confounders, treatment effect estimation is unreliable. Similarly, if there are unmeasured time-varying covariates L that affect both the receipt of treatment and the outcome, the presented effect estimates are biased.
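A simplified, single-time-point version of this kind of causal diagram can be encoded and queried with the dagitty package [63]. The sketch below is illustrative only: the node names (Z for randomization, A for received treatment, B for the baseline covariate, L for the time-varying covariate, Y for the outcome) are assumptions, and with genuinely time-varying L affected by earlier treatment, simple covariate adjustment is replaced by the weighting approach evaluated in this work.

# Encode a simplified trial DAG and ask which covariates are needed to
# identify the effect of received treatment A on outcome Y. Node names
# are hypothetical.
library(dagitty)

g <- dagitty("dag {
  Z -> A
  A -> Y
  B -> A
  B -> Y
  L -> A
  L -> Y
}")

adjustmentSets(g, exposure = "A", outcome = "Y")
# Returns { B, L }: both must be measured to block the non-causal paths,
# consistent with the bias observed here when B is unmeasured.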
By comparing different measurement schedules during the follow-up period we have shown that the stabilized IPW and baseline ADJUSTED-PP estimates show a slight increase in bias as the frequency of measurements decreases in the follow-up period. We have also compared our estimators' performance when there are sparse follow-up measurements which are imputed using LOCF or analyzed as a CC. This is an extension of the work of Mojaverian et al. [36]. Our work illustrated that LOCF estimates have lower variability and bias than CC estimates as the measurement schedule becomes more infrequent. LOCF imputation also yields greater power than CC as the treatment effect varies. The baseline ADJUSTED-PP estimate showed only marginal increases in bias as non-adherence rates increased when combined with LOCF imputation, indicating the benefit of the baseline ADJUSTED-PP estimate when a trial is subject to both high rates of non-adherence and sparse measurements during the follow-up period.

6.3 Strengths

This work utilizes a robust simulation and implements a thorough comparison of estimates of treatment effect. Each estimate of treatment effect has been evaluated for a variety of statistical properties over a range of treatment effects, providing valuable insights about these methods. A strength of this work is the number of different scenarios to which these effect estimates have been applied.

A previous simulation study used a sample size of n = 100,000 per trial arm and analyzed this one large dataset [27]. Even though there exist large randomized pragmatic trials (e.g., Cocoros et al. [66] had 44,786 subjects), the statistical properties of moderately sized pragmatic trials are of much interest. In this work, each simulated trial had n = 1,000 participants per arm (total n = 2,000), and we performed a Monte Carlo simulation with 1,000 iterations to be able to study the statistical properties of such moderately sized trials.

This work has extended the work of Young et al. [27] by including non-null treatment effects and OR-type effect estimates. Our work also implements model-based effect estimates in addition to cumulative survival effect estimates. This is an important extension of previous work, as model-based effect estimates allow the direct estimation of the variance of the treatment effect and easier inclusion of covariates. This allowed us to report additional statistical properties for each effect estimate, including power, average model standard error, empirical standard error, coverage probability, and unbiased coverage probability. These statistical properties provide strong evidence of the performance of each effect estimate beyond bias.

6.4 Limitations

The limitations of this work are largely due to analytic simplicity. Key assumptions that limit the generalizability of these findings include the assumption that no participants were lost to follow-up and that once a participant became non-adherent, they remained non-adherent for the remainder of the trial. We also assumed that adherence was binary. Depending on the treatment, continuous or partial non-adherence may be informative, and these analytic methods may be extended to address continuous non-adherence [67]. The DAG used to inform the data generating process was also simple. In practice, real clinical trials are likely to have more baseline and time-varying covariates as well as potentially more complicated causal relationships.

Our work was also limited in that we quantified the average causal effect of treatment using the OR rather than the RD. In practice, both OR and RD are regularly reported means of quantifying a treatment effect, each with their own merits [20]. In our work, estimating the RD proved difficult: the outcome models often required experimenting with various starting values, as models using the identity link in the generalized linear model frequently failed to converge.
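To make the distinction concrete: the two scales correspond to different link functions in the same binomial generalized linear model. The sketch below uses hypothetical columns y and a and illustrative starting values; it is not the outcome model from our simulation.

# OR scale: logit link, which essentially always converges.
fit_or <- glm(y ~ a, family = binomial(link = "logit"), data = dat)

# RD scale: identity link, which constrains fitted probabilities to [0, 1]
# and often fails to converge without carefully chosen starting values,
# e.g. start = c(baseline risk, risk difference); the values below are
# illustrative only.
fit_rd <- glm(y ~ a, family = binomial(link = "identity"),
              data = dat, start = c(0.05, 0.01))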
Additionally, this work is limited in that these findings are derived from a single simulation framework. These effect estimates should also be compared using real data. However, comparisons using real data are complicated by the fact that the ground truth and true causal relationships are not known to the analysts. Alternatively, for these findings to be shown to be generalizable, they could be replicated using alternative simulation frameworks such as those proposed by Vourli and Touloumi [37], Young et al. [68], and Xiao et al. [69].

6.5 Future Work

Future work in this active research area would benefit from exploring a greater range of treatment effects, such as settings where both the most recent treatment received and the cumulative dose received are able to affect outcomes. Treatment effects in real-world pragmatic trials may also include dynamic treatment regimes (i.e., where dosages may be adapted throughout the study based on a disease indicator). Studying a wider variety of treatment effects would help ensure these findings are reproducible in different settings that may be seen in real-world pragmatic trials. Additional future work could also explore the number of baseline and post-randomization covariates collected, as well as whether the treatment effect is cumulative.

Additional directions for future work include comparisons to further estimation techniques such as g-estimation, the parametric g-formula, and instrumental variable methods [12, 70]. Each of these estimation techniques may be suitable in different situations, depending on what assumptions can be made and the desired causal quantity. For example, instrumental variable methods can be effective at generating unbiased PP estimates without relying on baseline and time-varying covariates if randomization can be treated as an instrument and there are no "defiers" in the trial [12]. For randomization status to be a suitable instrument, the effect of randomization on the outcome of interest must be mediated exclusively through the treatment (i.e., the exclusion restriction) [12]. Defiers can be described as individuals who instinctively take the alternate treatment compared to their randomization status. If there are no defiers (i.e., the monotonicity assumption), then IV methods may produce an unbiased estimate of the complier average causal effect of treatment [12].
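As a concrete point of reference, the simplest such IV estimator is the Wald ratio, sketched below on the risk difference scale under the exclusion restriction and monotonicity assumptions just described. The columns z (randomized assignment), a (treatment received), and y (outcome) are hypothetical, with one row per participant.

# Wald estimator of the complier average causal effect (CACE): the ITT
# effect on the outcome divided by the effect of assignment on receipt.
itt_y <- mean(dat$y[dat$z == 1]) - mean(dat$y[dat$z == 0])  # ITT effect on outcome
itt_a <- mean(dat$a[dat$z == 1]) - mean(dat$a[dat$z == 0])  # effect on receipt
cace  <- itt_y / itt_a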
Additional work would also be beneficial for addressing more complex adherence patterns. In this study, once a patient became non-adherent they were assumed to remain non-adherent for the remainder of the trial [27]. While a convenient assumption for our analysis, in real trials patients may start and stop their receipt of treatment many times throughout a trial. It may also be beneficial to explore more nuanced ways of tracking non-adherence. In this study, adherence was binary. Depending on the treatment, at each time point patients may truly only adhere to part of their assigned treatment, leading to partial adherence. While it is common to dichotomize partial adherence [26, 28], this has been shown to introduce bias in effect estimation [33]. Additional work could compare similar effect estimates using partial adherence and more complex patterns of adherence over a trial.

Further work regarding the sparse measurement of time-varying covariates and treatment adherence data could include exploring alternatives for handling missing values, including multiple imputation [35, 36]. However, this work would best be implemented in conjunction with increasing the number of baseline and time-varying covariates available to analysts. Additional work could explore the impacts of other forms of missing data on the estimation of the treatment effect in the presence of non-adherence, such as Missing Not At Random (MNAR) and Missing At Random (MAR) data.

6.6 Conclusion

This work highlights the benefits and feasibility of analyzing pragmatic trials using an ADJUSTED-PP approach in the presence of non-adherence. Our simulations have illustrated that while ITT estimates may be unbiased for null treatment effects, ADJUSTED-PP estimates were unbiased for non-null treatment effects under a variety of conditions. Specifically, the stabilized IPW and baseline ADJUSTED-PP effect estimates were largely unbiased when varying each of the following: event rates, trial sizes, non-adherence rates, effects of time on the outcome, and measurement schedules during the follow-up period. In contrast, NAIVE-PP effect estimates were unbiased only if the receipt of treatment was independent of the outcome. This study has also demonstrated the necessity of designing pragmatic trials that measure both baseline and post-randomization covariates to facilitate bias reduction in the estimation of treatment effects. Further work on this topic can aim to explore issues of model misspecification, more complex treatment effects, alternative missing data mechanisms, and more complex causal scenarios with a greater number of covariates.

Bibliography

[1] Daniel Schwartz and Joseph Lellouch. Explanatory and pragmatic attitudes in therapeutical trials. Journal of Chronic Diseases, 20(8):637-648, 1967. doi:10.1016/0021-9681(67)90041-0.

[2] K. Loudon, S. Treweek, F. Sullivan, P. Donnan, K. E. Thorpe, and M. Zwarenstein. The PRECIS-2 tool: designing trials that are fit for purpose. BMJ, 350(h2147), Aug 2015. doi:10.1136/bmj.h2147.

[3] Miguel A. Hernán and Sonia Hernández-Díaz. Beyond the intention-to-treat in comparative effectiveness research. Clinical Trials, 9(1):48-55, 2012. doi:10.1177/1740774511420743. PMID: 21948059.

[4] Christopher Ty Williams. Food and drug administration drug approval process: a history and overview. Nursing Clinics, 51(1):1-11, 2016.

[5] Tetsuya Tanimoto, Masaharu Tsubokura, Jinichi Mori, Monika Pietrek, Shunsuke Ono, and Masahiro Kami. Differences in drug approval processes of 3 regulatory agencies: a case study of gemtuzumab ozogamicin. Investigational New Drugs, 31(2):473-478, 2013.

[6] Jon D Lurie and Tamara S Morgan. Pros and cons of pragmatic clinical trials. Journal of Comparative Effectiveness Research, 2(1):53, Jan 2013. doi:10.2217/cer.12.74.
[7] Kevin P Weinfurt, Adrian F Hernandez, Gloria D Coronado, Lynn L DeBar, Laura M Dember, Beverly B Green, Patrick J Heagerty, Susan S Huang, Kathryn T James, Jeffrey G Jarvik, et al. Pragmatic clinical trials embedded in healthcare systems: generalizable lessons from the NIH collaboratory. BMC Medical Research Methodology, 17(1):144, 2017.

[8] Noel S Weiss, Thomas D Koepsell, and Bruce M Psaty. Generalizability of the results of randomized trials. Archives of Internal Medicine, 168(2):133-135, 2008.

[9] Ian Ford and John Norrie. Pragmatic trials. New England Journal of Medicine, 375(5):454-463, 2016. doi:10.1056/NEJMra1510059. PMID: 27518663.

[10] M. Zwarenstein, S. Treweek, J. J Gagnier, D. G Altman, S. Tunis, B. Haynes, A. D Oxman, and D. Moher. Improving the reporting of pragmatic trials: an extension of the CONSORT statement. BMJ, 337(a2390), Nov 2008. doi:10.1136/bmj.a2390.

[11] K. E. Thorpe, M. Zwarenstein, A. D. Oxman, S. Treweek, C. D. Furberg, D. G. Altman, S. Tunis, E. Bergel, I. Harvey, D. J. Magid, et al. A pragmatic-explanatory continuum indicator summary (PRECIS): a tool to help trial designers. Canadian Medical Association Journal, 180(10), 2009. doi:10.1503/cmaj.090523.

[12] Miguel A. Hernán and James M. Robins. Per-protocol analyses of pragmatic trials. New England Journal of Medicine, 377(14):1391-1398, May 2017. doi:10.1056/nejmsm1605385.

[13] Eneida Mendonca and Umberto Tachinardi. Pragmatic trials and new informatics methods to supplement or replace phase IV trials. In Personalized and Precision Medicine Informatics, pages 199-213. Springer, 2020.

[14] Gary E Rosenthal. The role of pragmatic clinical trials in the evolution of learning health systems. Transactions of the American Clinical and Climatological Association, 125:204, 2014.

[15] Mark J Pletcher, Bernard Lo, and Deborah Grady. Informed consent in randomized quality improvement trials: a critical barrier for learning health systems. JAMA Internal Medicine, 174(5):668-670, 2014.

[16] Scott Y Kim and Franklin G Miller. Informed consent for pragmatic trials: the integrated consent model. New England Journal of Medicine, 2014.

[17] M. Mckee, A. Britton, N. Black, K. Mcpherson, C. Sanderson, and C. Bain. Methods in health services research: Interpreting the evidence: choosing between randomised and non-randomised studies. BMJ, 319(7205):312-315, 1999. doi:10.1136/bmj.319.7205.312.

[18] Robert D. Herbert, Jessica Kasza, and Kari Bø. Analysis of randomised trials with long-term follow-up. BMC Medical Research Methodology, 18(48), 2018. doi:10.1186/s12874-018-0499-5.

[19] Marie T. Brown and Jennifer K. Bussell. Medication adherence: Who cares? Mayo Clinic Proceedings, 86(4):304-314, 2011. doi:10.4065/mcp.2010.0575.

[20] Sengwee Toh, Sonia Hernández-Díaz, Roger Logan, James M. Robins, and Miguel A. Hernán. Estimating absolute risks in the presence of nonadherence. Epidemiology, 21(4):528-539, 2010. doi:10.1097/ede.0b013e3181df1b69.

[21] Ian Shrier, Evert Verhagen, and Steven D. Stovitz. The intention-to-treat analysis is not always the conservative approach. The American Journal of Medicine, 130(7):867-871, 2017. doi:10.1016/j.amjmed.2017.03.023.

[22] Mohammad Ali Mansournia, Julian P. T. Higgins, Jonathan A. C. Sterne, and Miguel A. Hernán. Biases in randomized trials. Epidemiology, 28(1):54-59, 2017. doi:10.1097/ede.0000000000000564.
[23] Miguel A. Hernán and Daniel Scharfstein. Cautions as regulators move to end exclusive reliance on intention to treat. Annals of Internal Medicine, 168(7):515, 2018. doi:10.7326/m17-3354.

[24] Charity Evans, Ruth Ann Marrie, Feng Zhu, Stella Leung, Xinya Lu, Dessalegn Y. Melesse, Elaine Kingwell, Yinshan Zhao, and Helen Tremlett. Adherence and persistence to drug therapies for multiple sclerosis: A population-based study. Multiple Sclerosis and Related Disorders, 8:78-85, 2016. doi:10.1016/j.msard.2016.05.006.

[25] Solmaz Setayeshgar, Elaine Kingwell, Feng Zhu, Tingting Zhang, Robert Carruthers, Ruth Ann Marrie, Charity Evans, and Helen Tremlett. Persistence and adherence to the new oral disease-modifying therapies for multiple sclerosis: A population-based study. Multiple Sclerosis and Related Disorders, 27:364-369, 2019. doi:10.1016/j.msard.2018.11.004.

[26] Eleanor J. Murray and Miguel A. Hernán. Improved adherence adjustment in the coronary drug project. Trials, 19(158), May 2018. doi:10.1186/s13063-018-2519-5.

[27] Jessica G. Young, Rajet Vatsa, Eleanor J. Murray, and Miguel A. Hernán. Interval-cohort designs and bias in the estimation of per-protocol effects: a simulation study. Trials, 20(1), May 2019. doi:10.1186/s13063-019-3577-z.

[28] Eleanor J Murray and Miguel A Hernán. Adherence adjustment in the coronary drug project: A call for better per-protocol effect estimates in randomized trials. Clinical Trials: Journal of the Society for Clinical Trials, 13(4):372-378, Jul 2016. doi:10.1177/1740774516634335.

[29] Coronary Drug Project Research Group. Influence of adherence to treatment and response of cholesterol on mortality in the coronary drug project. New England Journal of Medicine, 303(18):1038-1041, 1980. doi:10.1056/nejm198010303031804.

[30] Coronary Drug Project Research Group. The coronary drug project. Circulation, 47(3s1), Mar 1973. doi:10.1161/01.cir.47.3s1.i-1.

[31] William Storms. Clinical trials: are these your patients? Journal of Allergy and Clinical Immunology, 112(5):S107-S111, 2003.

[32] Doreen Matsui. Strategies to measure and improve patient adherence in clinical trials, 2009.

[33] Ian Shrier, Robert W. Platt, Russell J. Steele, and Mireille Schnitzer. Estimating causal effects of treatment in a randomized trial when some participants only partially adhere. Epidemiology, 29(1):78-86, 2018. doi:10.1097/ede.0000000000000771.

[34] Eleanor J. Murray, Brian L. Claggett, Bradi Granger, Scott D. Solomon, and Miguel A. Hernán. Adherence-adjustment in placebo-controlled randomized trials: An application to the candesartan in heart failure randomized trial. Contemporary Clinical Trials, 90, 2020. doi:10.1016/j.cct.2020.105937.

[35] E. E. Moodie, J. A. Delaney, G. Lefebvre, and R. W. Platt. Missing confounding data in marginal structural models: a comparison of inverse probability weighting and multiple imputation. Int J Biostat, 4(1):Article 13, 2008.

[36] N. Mojaverian, E. E. Moodie, A. Bliu, and M. B. Klein. The impact of sparse follow-up on marginal structural models for time-to-event data. Am. J. Epidemiol., 182(12):1047-1055, Dec 2015.

[37] G. Vourli and G. Touloumi. Performance of the marginal structural models under various scenarios of incomplete marker's values: a simulation study. Biom J, 57(2):254-270, Mar 2015.
[38] S. R. Cole and M. A. Hernán. Constructing inverse probability weights for marginal structural models. American Journal of Epidemiology, 168(6):656-664, 2008. doi:10.1093/aje/kwn164.

[39] Judea Pearl. Causal diagrams for empirical research. Biometrika, 82(4):702-710, 1995. doi:10.1093/biomet/82.4.702.

[40] Manfred S. Green and Michael J. Symons. A comparison of the logistic risk function and the proportional hazards model in prospective epidemiologic studies. Journal of Chronic Diseases, 36(10):715-723, 1983. doi:10.1016/0021-9681(83)90165-0.

[41] Ralph B. D'Agostino, Mei-Ling Lee, Albert J. Belanger, L. Adrienne Cupples, Keaven Anderson, and William B. Kannel. Relation of pooled logistic regression to time dependent Cox regression analysis: The Framingham heart study. Statistics in Medicine, 9(12):1501-1515, 1990. doi:10.1002/sim.4780091214.

[42] Julius S. Ngwa, Howard J. Cabral, Debbie M. Cheng, Michael J. Pencina, David R. Gagnon, Michael P. Lavalley, and L. Adrienne Cupples. A comparison of time dependent Cox regression, pooled logistic regression and cross sectional pooling with simulations and an application to the Framingham Heart Study. BMC Medical Research Methodology, 16(148), Mar 2016. doi:10.1186/s12874-016-0248-6.

[43] A. Colin Cameron and Douglas L. Miller. A practitioner's guide to cluster-robust inference. Journal of Human Resources, 50(2):317-372, 2015. doi:10.3368/jhr.50.2.317.

[44] Michael Baiocchi, Jing Cheng, and Dylan S. Small. Instrumental variable methods for causal inference. Statistics in Medicine, 33(13):2297-2340, 2014. doi:10.1002/sim.6128.

[45] P. C. Austin, A. Manca, M. Zwarenstein, D. N. Juurlink, and M. B. Stanbrook. A substantial and confusing variation exists in handling of baseline covariates in randomized controlled trials: a review of trials published in leading medical journals. J Clin Epidemiol, 63(2):142-153, Feb 2010.

[46] J. D. Ciolino, H. L. Palac, A. Yang, M. Vaca, and H. M. Belli. Ideal vs. real: a systematic review on handling covariates in randomized controlled trials. BMC Med Res Methodol, 19(1):136, 07 2019.

[47] B. C. Kahan, V. Jairath, C. J. Doré, and T. P. Morris. The risks and rewards of covariate adjustment in randomized trials: an assessment of 12 outcomes from 8 studies. Trials, 15:139, Apr 2014.

[48] D. Moher, S. Hopewell, K. F. Schulz, V. Montori, P. C. Gøtzsche, P. J. Devereaux, D. Elbourne, M. Egger, and D. G. Altman. CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. BMJ (clinical research edition), 340(c869), Mar 2010. doi:10.1136/bmj.c869.

[49] J. E. Rossouw, G. L. Anderson, R. L. Prentice, A. Z. LaCroix, C. Kooperberg, M. L. Stefanick, R. D. Jackson, S. A. Beresford, B. V. Howard, K. C. Johnson, J. M. Kotchen, and J. Ockene. Risks and benefits of estrogen plus progestin in healthy postmenopausal women: principal results from the Women's Health Initiative randomized controlled trial. JAMA, 288(3):321-333, Jul 2002.

[50] O. Holme, M. Loberg, M. Kalager, M. Bretthauer, M. A. Hernán, E. Aas, T. J. Eide, E. Skovlund, J. Lekven, J. Schneede, K. M. Tveit, M. Vatn, G. Ursin, and G. Hoff. Long-term effectiveness of sigmoidoscopy screening on colorectal cancer incidence and mortality in women and men: A randomized trial. Ann. Intern. Med., 168(11):775-782, 06 2018.
[51] J. K. Carroll, G. Pulver, L. M. Dickinson, W. D. Pace, J. A. Vassalotti, K. S. Kimminau, B. K. Manning, E. W. Staton, and C. H. Fox. Effect of 2 clinical decision support strategies on chronic kidney disease outcomes in primary care: A cluster randomized trial. JAMA Netw Open, 1(6):e183377, 10 2018.

[52] Eleanor J. Murray, Sonja A. Swanson, and Miguel A. Hernán. Guidelines for estimating causal effects in pragmatic randomized trials, 2019.

[53] Zoe Fewell, Miguel A. Hernán, Frederick Wolfe, Kate Tilling, Hyon Choi, and Jonathan A. C. Sterne. Controlling for time-dependent confounding using marginal structural models. The Stata Journal: Promoting communications on statistics and Stata, 4(4):402-420, 2004. doi:10.1177/1536867x0400400403.

[54] Terrence F. Blaschke, Lars Osterberg, Bernard Vrijens, and John Urquhart. Adherence to medications: Insights arising from studies on the unreliable link between prescribed and actual drug dosing histories. Annual Review of Pharmacology and Toxicology, 52(1):275-301, 2012. doi:10.1146/annurev-pharmtox-011711-113247. PMID: 21942628.

[55] O. Siddiqui, H. M. Hung, and R. O'Neill. MMRM vs. LOCF: a comprehensive comparison based on simulation study and 25 NDA datasets. J Biopharm Stat, 19(2):227-246, 2009.

[56] Tim P. Morris, Ian R. White, and Michael J. Crowther. Using simulation studies to evaluate statistical methods. Statistics in Medicine, 38(11):2074-2102, 2019. doi:10.1002/sim.8086.

[57] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2017. URL https://www.R-project.org/.

[58] Susanne Berger, Nathaniel Graham, and Achim Zeileis. Various versatile variances: An object-oriented implementation of clustered covariances in R. Working Paper 2017-12, Working Papers in Economics and Statistics, Research Platform Empirical and Experimental Economics, University of Innsbruck, July 2017. URL http://EconPapers.RePEc.org/RePEc:inn:wpaper:2017-12.

[59] Achim Zeileis. Object-oriented computation of sandwich estimators. Journal of Statistical Software, 16(9):1-16, 2006. doi:10.18637/jss.v016.i09.

[60] Hadley Wickham, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D'Agostino McGowan, Romain Francois, Garrett Grolemund, Alex Hayes, Lionel Henry, Jim Hester, Max Kuhn, Thomas Lin Pedersen, Evan Miller, Stephan Milton Bache, Kirill Muller, Jeroen Ooms, David Robinson, Dana Paige Seidel, Vitalie Spinu, Kohske Takahashi, Davis Vaughan, Claus Wilke, Kara Woo, and Hiroaki Yutani. Welcome to the tidyverse. Journal of Open Source Software, 4(43):1686, 2019. doi:10.21105/joss.01686.

[61] Henian Chen, Patricia Cohen, and Sophie Chen. How big is a big odds ratio? Interpreting the magnitudes of odds ratios in epidemiological studies. Communications in Statistics - Simulation and Computation, 39(4):860-864, 2010. doi:10.1080/03610911003650383.

[62] David Price, Stanley D. Musgrave, Lee Shepstone, Elizabeth V. Hillyer, Erika J. Sims, Richard F.T. Gilbert, Elizabeth F. Juniper, Jon G. Ayres, Linda Kemp, Annie Blyth, Edward C.F. Wilson, Stephanie Wolfe, Daryl Freeman, H. Miranda Mugford, Jamie Murdoch, and Ian Harvey. Leukotriene antagonists as first-line or add-on asthma-controller therapy. New England Journal of Medicine, 364(18):1695-1707, 2011. doi:10.1056/NEJMoa1010846. PMID: 21542741.
[63] Johannes Textor, Benito van der Zander, Mark S Gilthorpe, Maciej Liśkiewicz, and George TH Ellison. Robust causal inference using directed acyclic graphs: the R package 'dagitty'. International Journal of Epidemiology, 45(6):1887-1894, 2016.

[64] Jennifer S Barber, Susan A Murphy, and Natalya Verbitsky. Adjusting for time-varying confounding in survival analysis. Sociological Methodology, 34(1):163-192, 2004.

[65] Geert Molenberghs, Herbert Thijs, Ivy Jansen, Caroline Beunckens, Michael G Kenward, Craig Mallinckrodt, and Raymond J Carroll. Analyzing incomplete longitudinal clinical trial data. Biostatistics, 5(3):445-464, 2004.

[66] Noelle M Cocoros, Sean D Pokorney, Kevin Haynes, Crystal Garcia, Hussein R Al-Khalidi, Sana M Al-Khatib, Patrick Archdeacon, Jennifer C Goldsack, Thomas Harkins, Nancy D Lin, et al. FDA-Catalyst: using FDA's Sentinel Initiative for large-scale pragmatic randomized trials: approach and lessons learned during the planning phase of the first trial. Clinical Trials, 16(1):90-97, 2018. doi:10.1177/1740774518812776.

[67] Kerollos Nashat Wanis, Arin L Madenci, Miguel A Hernán, and Eleanor J Murray. Adjusting for adherence in randomized trials when adherence is measured as a continuous variable: An application to the lipid research clinics coronary primary prevention trial. Clinical Trials, page 1740774520920893, 2020.

[68] Jessica G Young, Miguel A Hernán, Sally Picciotto, and James M Robins. Relation between three classes of structural models for the effect of a time-varying exposure on survival. Lifetime Data Analysis, 16(1):71, 2010.

[69] Yongling Xiao, Erica EM Moodie, and Michal Abrahamowicz. Comparison of approaches to weight truncation for marginal structural Cox models. Epidemiologic Methods, 2(1):1-20, 2013.

[70] Miguel A. Hernán, Sonia Hernández-Díaz, and James M. Robins. Randomized trials analyzed as observational studies. Annals of Internal Medicine, Oct 2013. doi:10.7326/0003-4819-159-8-201310150-00709.
