UBC Theses and Dissertations


Incorporating partial adherence into the principal stratification analysis framework Sanders, Eric 2019

INCORPORATING PARTIAL ADHERENCE INTO THE PRINCIPAL STRATIFICATION ANALYSIS FRAMEWORK

by

ERIC SANDERS

B.Sc. Queen’s University, 2017

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF SCIENCE

in

THE FACULTY OF GRADUATE AND POST-DOCTORAL STUDIES

(Statistics)

THE UNIVERSITY OF BRITISH COLUMBIA

(Vancouver)

August 2019

© Eric Sanders, 2019

The following individuals certify that they have read, and recommend to the Faculty of Graduate and Postdoctoral Studies for acceptance, the thesis entitled:

Incorporating partial adherence into the principal stratification analysis framework

Submitted by Eric Sanders in partial fulfillment of the requirements for the degree of Master of Science in Statistics

Examining Committee:

Co-supervisor: Paul Gustafson, Statistics
Co-supervisor: M. Ehsan Karim, Population and Public Health

Abstract

Participants in pragmatic clinical trials often partially adhere to treatment. In the presence of partial adherence, simple statistical analyses of binary adherence (receiving either full or no treatment) introduce biases. We developed a framework which expands the principal stratification approach to allow partial adherers to have their own principal stratum and treatment level. We derived consistent estimates for bounds on population values of interest. A Monte Carlo posterior sampling method was derived that is computationally faster than Markov Chain Monte Carlo sampling, with confirmed equivalent results. Simulations indicate that the two methods agree with each other and are superior in most cases to the biased estimators created through standard principal stratification. The results suggest that these new methods may lead to increased accuracy of inference in settings where study participants only partially adhere to assigned treatment.

Lay Summary

Clinical trials attempt to identify when certain treatments cause outcomes that are better on average than some alternative.
There are a lot of problems that can pop up when doing clinical trials, which might cause results to be inaccurate. The key goal in this thesis is to offer clinical researchers a new method to analyze clinical trial data without bias when faced with the problem of patients only partially following a prescribed treatment, particularly in the context of when the clinical trial is being done in a ‘real world’ setting where researchers might not be able to meticulously gather a great deal of patient information.

Preface

This dissertation is original, unpublished, independent work by the author, Eric Sanders. Original motivation for the problem of addressing partial adherence in pragmatic clinical trials arose from the BC Support Unit Real-World Clinical Trials Methods Cluster, Project #2, headed by Ehsan Karim.

Contents

Abstract
Lay Summary
Preface
Contents
List of Tables
List of Figures
Acknowledgements
1 Introduction
1.1 Patient-Oriented Research
1.2 Pragmatic Clinical Trials
1.3 Standard Clinical Trial Analytical Methods
1.3.1 Intent-to-Treat Analysis
1.3.2 As-Treated Analysis
1.3.3 Per-Protocol Analysis
1.4 Principal Stratification Analysis in the Pragmatic Clinical Trial Setting
1.5 Problems Introduced by Partial Adherence to Assigned Treatment
2 Incorporating Partial Adherence into Principal Stratification Analysis
2.1 Framework Setup for Partial Adherence in Principal Stratification Analysis
2.2 Possible Assumptions and Their Implications
2.2.1 Bounds Resulting From the Monotone-within-Treatment Assumption
2.3 A Monte Carlo Method of Generating Posterior Samples of Effect Estimates
3 Examples and Simulation Results
3.1 Limiting Distributions of Parameter Estimates
3.1.1 Simulation 1: Both Monotonicity Types
3.1.2 Simulation 2: Neither Monotonicity Type
3.2 Validation Trial of the Monte Carlo Method through Comparison with JAGS
3.3 Simulation Set Comparing to the Case Discussed by Shrier
4 Conclusions, Future Work
References

List of Tables

1 Table of parameters describing data structure.
2 Mean square error of various inference methods and across sample sizes of 1000 and 8000.

List of Figures

1 Posterior sample and limiting distribution for partial treatment effect on partial compliers, monotonicity types 1 & 2.
2 Posterior sample and limiting distribution for full treatment effect on full compliers, monotonicity types 1 & 2.
3 Posterior sample and limiting distribution for full treatment effect on full population, monotonicity types 1 & 2.
4 Posterior sample and limiting distribution for partial treatment effect on partial compliers, no monotonicity types.
5 Posterior sample and limiting distribution for full treatment effect on full compliers, no monotonicity types.
6 Posterior sample and limiting distribution for full treatment effect on full population, no monotonicity types.
7 JAGS Validation, posterior sample for partial treatment effect on partial compliers, monotonicity type 2.
8 JAGS Validation, posterior sample for full treatment effect on full compliers, monotonicity type 2.
9 JAGS Validation, posterior sample for full treatment effect on full population, monotonicity type 2.
10 Boxplots of treatment effect point estimates over 2000 simulations with sample size 1000.
11 Bar plot of treatment effect 95% interval coverage over 2000 simulations with sample size 1000.
12 Boxplots of treatment effect point estimates over 2000 simulations with sample size 8000.
13 Bar plot of treatment effect 95% interval coverage over 2000 simulations with sample size 8000.

Acknowledgements

I thank the faculty, staff, and students of the UBC Department of Statistics, who have taught me a great deal during my years of study.
Their motivating support helped in the production of this thesis, and the breadth of knowledge within the department knows no bounds. Special thanks to Paul Gustafson for patiently teaching me so many new concepts, and to Ehsan Karim for always having constructive and helpful feedback on the work I had completed.

I also offer gratitude to my parents and siblings for their unending moral support over the years of my education while I have been away from home.

1 Introduction

1.1 Patient-Oriented Research

There are records of humans using experiments to inform medical and public health decisions for more than two thousand years [4]. In more recent history, due to scientific advancement, there can often be discrepancies between the understanding and goals of clinical professionals and those of care receivers. While innovatively complex research can lead to large advancements in the fields of medicine and public health, there have been calls for support of more ‘patient-oriented research’ by organizations such as the US Patient-Centered Outcomes Research Institute, the Canadian Institutes of Health Research, and the UK National Institute for Health Research [38, 40, 14].

The precise definition of ‘patient-oriented research’ may differ between organizations, but some tenets are consistent, such as that research primarily focuses on supplying whatever information is directly relevant to and desired by patients who are potentially going to receive a treatment. Furthermore, patient-oriented research aims to form observations and conclusions that incorporate such factors as patient burdens, service availability, and other diverse factors that make up an entire treatment plan [38, 40, 14]. By fulfilling these goals, patients and their caregivers can be informed to the same degree and make the best choices available to them without any confusion.
Many factors are relevant when discussing whether research is ‘patient-oriented,’ but some distinctions exist in study design that can make this simpler. Pragmatic clinical trials, as the relevant example to be further discussed, may be viewed as having some patient-oriented characteristics.

1.2 Pragmatic Clinical Trials

In a standard randomized clinical trial (RCT), it is most common to have highly experienced investigators working with highly selected participants, following strict rules with detailed follow-up schedules in order to determine the efficacy of a new intervention. A concern that may arise from a standard RCT such as this is that the results may not be directly applicable in regular clinical practice. This is because study designs tend to promote internal validity (ability to accurately inform on trends in the rigidly defined trial population) more than external validity (ability to accurately inform on trends in the total population). Pragmatic trials, originally described by Schwartz and Lellouch in 1967 [45], are trials which loosen some of the strict requirements of a standard RCT in an attempt to solve this problem and promote results that are more directly applicable to a larger and more realistic population. Standard RCTs may not identify highly relevant details, such as if less experienced medical staff are likely to make mistakes administering treatment, or if patients outside strict inclusion criteria can have adverse effects, or if patients in a general clinical setting are likely to inadequately follow guidelines for treatment. Pragmatic trials thus aim to overcome this by being more realistic than standard RCTs from the very start in terms of the patients who come for treatment, the medical professionals offering treatment, and the steps necessary to receive treatment.
Looking at it this way, it is clear that pragmatic clinical trials are a step towards more patient-oriented research.

Trials that place treatment and assessment responsibility on everyday caregivers, or are less restrictive and more holistic when it comes to trial participation and selection, or require less intensive and demanding participation in study proceedings and data collection are often referred to as pragmatic [11]. While the distinction tends to be loose between a standard RCT and a pragmatic trial, the PRECIS (Pragmatic-Explanatory Continuum Indicator Summary) tool has been created in an attempt to measure the level of pragmatism of a trial [49]. While pragmatism in the clinical trial setting was first described over 50 years ago, only in recent years has it been gaining popularity [41], and there are multiple studies of note in recent years that have pragmatic features [23, 44, 13, 8, 43, 6, 46]. One of the main reasons that pragmatic trials are gaining popularity is that they are better suited than RCTs to answer questions about the feasibility or benefit of fully employing an intervention in the most ‘everyday’ clinical setting.

By making a study more pragmatic in nature, it is a much smaller step to infer what would happen if a new intervention were implemented on a large scale, and how this intervention may best benefit the population. As will be seen, this comes with important considerations.

1.3 Standard Clinical Trial Analytical Methods

Whether a clinical trial is considered pragmatic or not, there are a number of options for how to analyze collected data. Primarily, the question arises of how to handle situations where the randomization of intervention assignment to patients is threatened. The major concern to be addressed moving forward is how to proceed with useful analysis when patients do not conform to prescribed interventions, and instead behave in a way that more closely fits the description of a different intervention in the study, or none at all.
As an example, a trial comparing compound Salvia pellets to nitrates as treatments for chronic angina pectoris may begin by randomizing intervention assignment to patients [28], but after data collection, it might become apparent that some patients assigned compound Salvia pellets, in fact, had access to and made some use of nitrates. This type of situation is very common and of considerable concern because different approaches in analysis can lead to changes in conclusions. Moving forward, some of the classic approaches to a problem such as this are summarized, as well as notable benefits and drawbacks to applying each method. In many applications, it is currently preferred to present the results of several of the analysis types below together. Each analytical approach has its weaknesses, so a certain strength is gained when different analyses agree in their findings.

1.3.1 Intent-to-Treat Analysis

In an intent-to-treat analysis, data are analyzed as if patients always received the intervention they were assigned. In other words, no changes are made in the analysis if it is revealed that a patient assigned intervention A actually received intervention B, and the data would record that the patient received intervention A. That is, assigned intervention groups are considered immutable post-randomization, regardless of intervention received [19]. Utilizing this approach, the most notable result is that the randomization process of assigning treatments is not violated, and treatment assignment can still be considered truly random, granted that the study was designed correctly. However, as a result of some people in each group potentially receiving the intervention of a different group, results tend to bias towards the null, indicating a smaller or less significant difference between intervention outcomes (though some cases exist where bias may be away from the null [48]). Here the use of the term ‘bias’ is applicable only depending on perspective.
If one is interested in analyzing ‘the effect of being assigned an intervention compared to a control,’ there is no bias towards the null in using the intent-to-treat analysis, but if one is interested in analyzing ‘the effect of receiving an intervention compared to a control,’ then the intent-to-treat analysis could possibly be biased, likely towards there being no effect [48]. These two estimable quantities are in fact quite distinct and should not be confused with each other, because the first of the above effects is heavily impacted by intervention program implementations, while the second is an inherent property of a new intervention. Both of these can be highly relevant, and care is needed to tease the two apart and acknowledge that the intent-to-treat analysis cannot perfectly capture the effect of actually receiving treatment.

As an example of when intent-to-treat analysis has a clear flaw, suppose that a study was poorly designed and did not adequately give participants an incentive to stick to an intervention. If the control group is doing nothing, but the treatment group has an intensive intervention plan without adequate incentive supplied, many people in the treatment group may cease treatment, to a point where a significant proportion of people in the ‘treatment’ arm of the study are doing nothing, just like in the control arm. As a result, the analysis may indicate that the treatment is ineffective, even if the treatment would have been highly effective had enough incentive been given for people to adhere. This type of problem is even more common in the pragmatic trial context, because there are many factors that are free to impact whether a participant will carry through with proper assigned treatment.
As a result, using intent-to-treat analysis in a pragmatic trial can seriously threaten the accuracy of results, because a higher proportion of patients can be expected not to follow treatment regimens perfectly.

Acknowledging that the intent-to-treat approach can accurately capture the effect of being assigned an intervention, but becomes biased when estimating the effect of receiving an intervention, it is important to note what problems may arise as soon as analysis departs from the intent-to-treat approach. As soon as one decides to alter a data set based on adherence (such as in as-treated analysis or per-protocol analysis discussed below), the randomization process has been interfered with, and as a result there is an opportunity for notable differences to arise, either between the control group and intervention group, or in the definition of the target population [27]. These can lend significant bias to estimates if not controlled for.

1.3.2 As-Treated Analysis

An as-treated analysis, as the name suggests, considers all participants to belong to the intervention group that most closely matches the treatment they actually received, regardless of what treatment they were assigned. If the two final treatment groups were appropriately similar with respect to relevant pre-treatment factors, the results could accurately estimate the effect of receiving treatment. However, the issue is that as-treated analysis is highly likely to introduce distinct differences between treatment groups, because there are factors that impact adherence, and thus impact which intervention group patients belong to. Some of these factors are often thought to be immeasurable, such as personality traits, but it is also quite possible for the intervention itself to affect adherence.
Since the two patient groups were not defined through a randomization process, there is a high risk of bias that is difficult to quantify.

As an example of when a flaw might arise in as-treated analysis, suppose a large fraction of people receiving an intervention had a negative side effect, and many of these people ceased treatment. As a result of these people being treated as members of the control group, only people who responded well to the intervention are left in the treatment group, which will bias results towards having a positive outlook on the treatment.

1.3.3 Per-Protocol Analysis

In 1980, an article from the Coronary Drug Project reported that when assigned a placebo pill, adherers to the protocol had a lower 5-year mortality risk than non-adherers, which frightened many researchers away from performing analyses that separate adherers from non-adherers [9]. In a per-protocol analysis, participants are only retained in the data set for as long as they adhere to the intervention they were assigned. Thus, someone who never adheres to their assigned intervention will be completely removed from the study for all analyses. After the mentioned 1980 article was published, per-protocol analyses were regarded very critically, since the approach separates the study population from the target population according to whatever factors (unmeasured and measured) determine adherence, some of which may be related to the effect of the intervention [25, 26]. However, some methods have been developed to update per-protocol effect estimates by adjusting for pre-randomization and post-randomization variables thought to impact adherence [37, 26], with notable improvement in results when revisiting previous studies (including the results from the 1980 article regarding the Coronary Drug Project).
The development of these techniques has led to suggestions that adjusted per-protocol analyses be presented alongside intent-to-treat analyses, so as to compare valid estimates of the effect of receiving treatment to the effect of being assigned treatment [24].

1.4 Principal Stratification Analysis in the Pragmatic Clinical Trial Setting

The three standard approaches above are the most straightforward methods available for analyzing clinical trial data, and it is seen that each has drawbacks and limitations. While the intent-to-treat approach is most popular, it is often combined with the per-protocol approach in order to actually comment on the effect of receiving treatment, as opposed to simply being assigned treatment. However, more complicated methods do exist to estimate treatment effects in clinical trials. Some of the more sophisticated methods include propensity scoring, instrumental variable analysis, and principal stratification [47, 15, 2, 22, 29]. Of interest here is the last of these, principal stratification, which will be extended in this thesis to combat the issue of partial adherence to assigned treatment, which is currently not addressed by any of the mentioned analytical methods.

Principal stratification relies on characterizing participants not based on a selection of baseline features, but based on how one variable is determined by another, and by referring to each group of similarly behaving individuals as a ‘principal stratum’ [42]. For example, if one has two binary variables, one for treatment assigned and one for treatment received, then there are four possible functions from one to the other (always being equal to the other, always equalling one of the two values, or always being the opposite of the other). It is supposed that each participant follows one of these four functions, or in other words, belongs to one of these four principal strata.
This can be put in more plain language in this example by supposing that one treatment arm can be called the ‘active intervention,’ while the other is the ‘control intervention,’ and then saying that participants either always comply, always defy, always take the active intervention, or always take the control intervention. After supposing that these strata exist, it is possible to form estimates of treatment effects within each principal stratum, and to form estimates of the distribution of the population across principal strata. This type of analysis is commonly used in trials with adherence issues in binary treatment settings, and has been used to form effect estimates for over twenty years [1, 30, 12, 16, 3, 7, 35].

The principal stratification approach is also particularly enticing in the pragmatic trial setting because of the more nuanced inferences that can be gained from the analysis. An overall estimate can be formed of the effect of treating a whole population with the active intervention compared to the control intervention, but individual estimates can also be formed per principal stratum, such as the estimate of treatment effect among those who comply with treatment assignments. These are arguably more informative analyses than simple intent-to-treat or per-protocol approaches. Often the estimates for a principal stratification analysis are calculated through simple arithmetic based on the groups that can be clearly identified, but more in-depth estimates can be formed through Bayesian inference to give explicit interval estimates and posterior densities [47, 29].

1.5 Problems Introduced by Partial Adherence to Assigned Treatment

All previously discussed methods are faced with obstacles when clinical trial participants only partially adhere to an assigned intervention, placing them somewhere between receiving no treatment and receiving full treatment.
Since all of the general methods for analyzing clinical trial data assume some dichotomous (or at least discrete) indicator for treatment received, decisions must be made on what to do with the data in order to make it suitable for the chosen analytical method. In most applications, the question has become whether to count partial compliance as full compliance or non-compliance, and efforts have been lacking in fully accounting for the fact that some participants satisfied some (but not all) specified components of the prescribed intervention [47, 10, 50].

The most naïve approach to handle the problem of partial adherence is to simply treat partially adherent participants as either fully compliant or non-compliant. Shrier et al. [47] assessed the effects of analyses which designate a partially compliant person as fully non-compliant or fully compliant. It was found that treating partially compliant participants as fully compliant and proceeding with principal stratification analysis led to the same kinds of biases as intent-to-treat analysis (generally towards the null); on the other hand, treating partially compliant participants as non-compliant and proceeding with principal stratification analysis led to bias away from the null when estimating treatment effect in fully compliant participants [47]. These results indicate a need for some other way to properly analyze clinical trial data when some participants only partially adhere. Jin and Rubin [31] reported a method of principal stratification which allowed for compliance to be measured continuously, where a participant’s principal stratum was defined using four continuous variables measuring the proportion of each of the two treatments received in each of the two treatment assignment settings.
The method of Jin and Rubin relies on multiple steps of Bayesian sampling to estimate dose-response effects, and has since been used in multiple studies where exact proportions of treatments received may be quantified [17, 32, 39], some of which extend the methods to incorporate copula analysis [36, 5]. Given the potential errors introduced by not recognizing partial compliance, and the rigorous information keeping necessary to perform the analytical methods of Jin and Rubin, there remains no method of analysis for the case of partial compliance in highly pragmatic settings where detailed compliance information may not be available.

Guo [18] presented a framework that used a trichotomous variable for treatment received that included a level for ‘partial’ treatment as well as full treatment and no treatment. Propensity scoring methods are used to form a variety of estimates regarding different types of defined compliers. In this thesis, I define a framework that also uses a trichotomous variable for treatment received, and I present simple bound estimators which consistently estimate true bounds on treatment effects in relevant population groups. These bounds could have applications as simpler inference tools that do not require a complex Bayesian formulation. I also present a Bayesian inference method of forming posterior samples of treatment effects in relevant population groups, which uses a derived sampling algorithm that is less complex than the methods of Jin and Rubin as well as Guo.

2 Incorporating Partial Adherence into Principal Stratification Analysis

2.1 Framework Setup for Partial Adherence in Principal Stratification Analysis

For a supposed clinical trial, define the following variables for individual $i$, and suppose there are a total of $n$ participants in a trial.
Note that outcome is presumed to be a binary variable for simplicity, but could be extended to a continuous variable:

$$Z_i = \text{Treatment assigned} \in \begin{cases} 0 & \text{Assigned to not receive treatment, with probability } 1 - p_Z, \\ 1 & \text{Assigned to receive treatment, with probability } p_Z. \end{cases}$$

$$X_i = \text{Level of treatment received} \in \begin{cases} N & \text{No treatment received,} \\ P & \text{Partial treatment received,} \\ F & \text{Full treatment received.} \end{cases}$$

$$Y_i = \text{Outcome variable} \in \{0, 1\}.$$

Note that $p_Z$ is held constant across all individuals, as is the standard for randomized clinical trials. It is further supposed that before random treatment assignment occurs, each individual belongs to one of three principal strata defined as follows:

1. Those who will comply fully with either treatment assigned. This group is called the “fully compliant” (F) group.
2. Those who will never begin the treatment. This group is called the “never taking” (N) group.
3. Those who will, if assigned treatment, only partially comply with the prescribed treatment schedule. If not assigned treatment, they will not receive treatment as directed. This group is called the “partially compliant” (P) group.

The principal strata may be referred to as follows:

$$P_i = \text{Principal stratum} \in \begin{cases} F & \text{Fully Compliant,} \\ P & \text{Partially Compliant,} \\ N & \text{Never Taker.} \end{cases}$$

Temporal patterns will not be observed here. It is assumed that principal strata are determined before randomization and do not shift in response to the course of active treatment or control treatment. The principal stratum of an individual determines the nature of the relationship between $Z$ and $X$ as follows:

$$X_i = \begin{cases} N & \text{if } Z_i = 0 \text{ or } P_i = N, \\ P & \text{if } Z_i = 1 \text{ and } P_i = P, \\ F & \text{if } Z_i = 1 \text{ and } P_i = F. \end{cases}$$

Next, notation will be set up for some values that would be calculated from a full data set of $Z_i, X_i, Y_i$ values collected for individuals $i = 1, \ldots, n$. Let $n_j$ = the number of participants with $Z_i = j$, and let $\bar{Y}_j$ = the proportion of participants with $Z_i = j$ that have outcome $Y_i = 1$.
Thus, $\bar{Y}_j$ can be represented as

$$\bar{Y}_j = \frac{1}{n_j} \sum_{i : Z_i = j} Y_i.$$

Similarly, let $n_{j,k}$ = the number of participants with $Z_i = j$ and $X_i = k$, and $\bar{Y}_{j,k}$ = the proportion of participants with $Z_i = j$ and $X_i = k$ that have outcome $Y_i = 1$, which can then be represented as

$$\bar{Y}_{j,k} = \frac{1}{n_{j,k}} \sum_{i : Z_i = j,\, X_i = k} Y_i.$$

It is thus seen that given a data set, there may be values calculated for $n_0, n_1$ and $\bar{Y}_0, \bar{Y}_1$. There can also be values calculated for $n_{0,n}, n_{1,n}, n_{1,p}, n_{1,f}$, and $\bar{Y}_{0,n}, \bar{Y}_{1,n}, \bar{Y}_{1,p}, \bar{Y}_{1,f}$. Note that the terms $n_{0,p}$ and $n_{0,f}$ will be 0 while $\bar{Y}_{0,p}$ and $\bar{Y}_{0,f}$ will be undefined if considered, as it is assumed that anyone not assigned treatment cannot access treatment, meaning for cases where $Z_i = 0$ it is certain that $X_i = N$ in any observable data.

The goal of the principal stratification approach is to form estimates of average outcomes for treatments within each principal stratum, so notation must be defined for these parameters that are to be estimated.

$\gamma_j$ = Population proportion of persons with $Z = j$ that will have outcome $Y = 1$.
$\gamma_{k,p}$ = Population proportion of persons with $X = k$ and $P = p$ that will have outcome $Y = 1$.
$p_p$ = True population proportion of persons with $P = p$.

There are two equations which show the relationship of these parameters:

$$\gamma_1 = \gamma_{f,F}\, p_F + \gamma_{p,P}\, p_P + \gamma_{n,N}\, p_N, \tag{1}$$
$$\gamma_0 = \gamma_{n,F}\, p_F + \gamma_{n,P}\, p_P + \gamma_{n,N}\, p_N. \tag{2}$$

A selection of these parameters may be consistently estimated from the data set. First,

$$\hat{\gamma}_1 = \bar{Y}_1, \qquad \hat{\gamma}_0 = \bar{Y}_0.$$

Next, in observations where $Z = 0$, no information can be inferred about principal stratum, because $X = N$ always when $Z = 0$. Conversely, observations where $Z = 1$ can have principal stratum fully identified by observing the $X$ value.
Thus, consistent estimation of principal strata proportions can be achieved using only the $Z = 1$ data as follows:

$$\hat{p}_N = \frac{n_{1,n}}{n_1}, \qquad \hat{p}_P = \frac{n_{1,p}}{n_1}, \qquad \hat{p}_F = \frac{n_{1,f}}{n_1}.$$

There are also estimates for cases where the treatment assigned and treatment received correspond to one and only one combination of principal stratum and treatment received, in other words when the exact principal stratum of the patient can be identified. Specifically, being assigned treatment and receiving full treatment indicates an F patient, being assigned treatment and receiving partial treatment indicates a P patient, and being assigned treatment and receiving no treatment indicates an N patient. As a result:

$$\hat{\gamma}_{f,F} = \bar{Y}_{1,f}, \qquad \hat{\gamma}_{p,P} = \bar{Y}_{1,p}, \qquad \hat{\gamma}_{n,N} = \bar{Y}_{1,n}.$$

All of the above estimates can be substituted into Equations (1) and (2), leaving the following:

$$\hat{\gamma}_1 = \hat{\gamma}_{f,F}\,\hat{p}_F + \hat{\gamma}_{p,P}\,\hat{p}_P + \hat{\gamma}_{n,N}\,\hat{p}_N, \qquad (3)$$
$$\hat{\gamma}_0 = \gamma_{n,F}\,\hat{p}_F + \gamma_{n,P}\,\hat{p}_P + \hat{\gamma}_{n,N}\,\hat{p}_N. \qquad (4)$$

Here it can be seen that Equation (3) already has all elements estimated, while Equation (4) has two parameters without estimates, meaning consistent estimates for $\gamma_{n,F}$ and $\gamma_{n,P}$ have not been achieved.

2.2 Possible Assumptions and Their Implications

If considering all combinations of treatment received and principal strata, there are 9 parameters defined using the $\gamma_{k,p}$ notation. This is despite only 5 parameters governing the distribution of the observable data: $\gamma_{f,F}, \gamma_{n,F}, \gamma_{p,P}, \gamma_{n,P}, \gamma_{n,N}$.
Allowing for the hypothetical existence of the other 4 counterfactual parameters, it is useful to consider them in a table as follows.

Table 1: Mean responses for all treatment and principal stratum combinations, with counterfactual values marked with an asterisk.

| Principal Stratum | Treatment Received: Full | Treatment Received: Partial | Treatment Received: None |
|---|---|---|---|
| F | $\gamma_{f,F}$ | $\gamma_{p,F}$ * | $\gamma_{n,F}$ |
| P | $\gamma_{f,P}$ * | $\gamma_{p,P}$ | $\gamma_{n,P}$ |
| N | $\gamma_{f,N}$ * | $\gamma_{p,N}$ * | $\gamma_{n,N}$ |

Two possible assumptions could be made regarding these mean response parameters, and (depending on the situation and validity of the assumptions) they may help in forming more precise estimates of treatment effects. The two possible assumptions are:

Assumption 1: Within a specific treatment, the parameter values are monotone from principal strata N to P to F. The direction of monotonicity is the same for each treatment. This will be otherwise known as the "Monotone Compliance Response" assumption. Referring to Table 1, this corresponds to each column being monotone in the same direction from the top row to the bottom row. Disregarding the counterfactual parameters, this assumption only results in the conclusion that $\min(\gamma_{n,F}, \gamma_{n,N}) \le \gamma_{n,P} \le \max(\gamma_{n,F}, \gamma_{n,N})$.

Assumption 2: Within a specific principal stratum, the mean response parameter values are monotone from no treatment to partial treatment to full treatment. The direction of monotonicity is the same in each principal stratum. This will be otherwise known as the "Monotone Dose Response" assumption. Referring to Table 1, this corresponds to each row being monotone in the same direction from the left column to the right column.
Disregarding the counterfactual parameters, this assumption only results in the conclusion that $\gamma_{f,F} - \gamma_{n,F}$ shares its sign with $\gamma_{p,P} - \gamma_{n,P}$.

2.2.1 Bounds Resulting From the Monotone-within-Treatment Assumption

Taking Assumption 1 (monotone-within-treatment) to be true results in the following:

$$\min(\gamma_{n,F}, \gamma_{n,N}) \le \gamma_{n,P} \le \max(\gamma_{n,F}, \gamma_{n,N}).$$

To see in what way this bounds estimation of $\gamma_{n,F}$ and $\gamma_{n,P}$, Equation (2) can be rearranged to express $\gamma_{n,F}$ in terms of other parameters as follows:

$$\gamma_{n,F} = \frac{\gamma_0 - \gamma_{n,P}\,p_P - \gamma_{n,N}\,p_N}{p_F}.$$

The following is then arrived at, from which the situation branches into two cases:

$$\min\left(\frac{\gamma_0 - \gamma_{n,P}\,p_P - \gamma_{n,N}\,p_N}{p_F},\ \gamma_{n,N}\right) \le \gamma_{n,P} \le \max\left(\frac{\gamma_0 - \gamma_{n,P}\,p_P - \gamma_{n,N}\,p_N}{p_F},\ \gamma_{n,N}\right). \qquad (5)$$

It is uncertain which term is larger in the minimum/maximum functions, so one case is tentatively assumed to be true and its conclusion is found, and then the other case is assumed and the same process is completed. The two conclusions can then be combined into the most general answer.

Case 1, $\gamma_{n,N}$ is the minimum in Equation (5)

The two sides of the resulting inequality are separately rearranged, so as to have $\gamma_{n,P}$ in the center, because it is the only non-estimated quantity present.

$$\begin{aligned}
\gamma_{n,N} &\le \gamma_{n,P}, & \gamma_{n,P} &\le \frac{\gamma_0 - \gamma_{n,P}\,p_P - \gamma_{n,N}\,p_N}{p_F} \\
\gamma_{n,N} &\le \gamma_{n,P}, & p_F\,\gamma_{n,P} &\le \gamma_0 - \gamma_{n,P}\,p_P - \gamma_{n,N}\,p_N \\
\gamma_{n,N} &\le \gamma_{n,P}, & (p_F + p_P)\,\gamma_{n,P} &\le \gamma_0 - \gamma_{n,N}\,p_N \\
\gamma_{n,N} &\le \gamma_{n,P}, & \gamma_{n,P} &\le \frac{\gamma_0 - \gamma_{n,N}\,p_N}{p_F + p_P}
\end{aligned}$$

An inequality that provides information for the estimation of $\gamma_{n,P}$ is now directly observable, and $\gamma_{n,F}$ may be substituted in terms of the other terms in Equation (2) to also provide information for the estimation of $\gamma_{n,F}$. Thus the following two intervals are seen for the two unknown parameters $\gamma_{n,P}$ and $\gamma_{n,F}$, after some algebra:

$$\gamma_{n,N} \le \gamma_{n,P} \le \frac{1}{p_F + p_P}\gamma_0 - \frac{p_N}{p_F + p_P}\gamma_{n,N},$$
$$\frac{1}{p_F + p_P}\gamma_0 - \frac{p_N}{p_F + p_P}\gamma_{n,N} \le \gamma_{n,F} \le \frac{1}{p_F}\gamma_0 - \frac{p_P + p_N}{p_F}\gamma_{n,N}.$$

Instead of having bounds for $\gamma_{n,F}$ and $\gamma_{n,P}$, it might be desired to have bounds for the actual treatment effects within the principal strata: $(\gamma_{f,F} - \gamma_{n,F})$ and $(\gamma_{p,P} - \gamma_{n,P})$.
These would be expressed as follows:

$$\gamma_{p,P} - \frac{1}{p_F + p_P}\gamma_0 + \frac{p_N}{p_F + p_P}\gamma_{n,N} \le \gamma_{p,P} - \gamma_{n,P} \le \gamma_{p,P} - \gamma_{n,N},$$
$$\gamma_{f,F} - \frac{1}{p_F}\gamma_0 + \frac{p_P + p_N}{p_F}\gamma_{n,N} \le \gamma_{f,F} - \gamma_{n,F} \le \gamma_{f,F} - \frac{1}{p_F + p_P}\gamma_0 + \frac{p_N}{p_F + p_P}\gamma_{n,N}.$$

Case 2, $\gamma_{n,N}$ is the maximum in Equation (5)

Again, steps similar to the previous case will be used to separately rearrange the two sides of the resulting inequality, so as to have $\gamma_{n,P}$ in the center, because it is the only non-estimable quantity present. After performing similar arithmetic as in the previous case, the following intervals are arrived at for $\gamma_{n,P}$ and $\gamma_{n,F}$:

$$\frac{1}{p_F + p_P}\gamma_0 - \frac{p_N}{p_F + p_P}\gamma_{n,N} \le \gamma_{n,P} \le \gamma_{n,N},$$
$$\frac{1}{p_F}\gamma_0 - \frac{p_P + p_N}{p_F}\gamma_{n,N} \le \gamma_{n,F} \le \frac{1}{p_F + p_P}\gamma_0 - \frac{p_N}{p_F + p_P}\gamma_{n,N}.$$

Thus the intervals from Case 1 have been flipped in Case 2 for $\gamma_{n,F}$ and $\gamma_{n,P}$. Similarly, extending to interval estimates of $(\gamma_{f,F} - \gamma_{n,F})$ and $(\gamma_{p,P} - \gamma_{n,P})$, flipped intervals are seen once more:

$$\gamma_{p,P} - \gamma_{n,N} \le \gamma_{p,P} - \gamma_{n,P} \le \gamma_{p,P} - \frac{1}{p_F + p_P}\gamma_0 + \frac{p_N}{p_F + p_P}\gamma_{n,N},$$
$$\gamma_{f,F} - \frac{1}{p_F + p_P}\gamma_0 + \frac{p_N}{p_F + p_P}\gamma_{n,N} \le \gamma_{f,F} - \gamma_{n,F} \le \gamma_{f,F} - \frac{1}{p_F}\gamma_0 + \frac{p_P + p_N}{p_F}\gamma_{n,N}.$$

Combining the two cases

Without ever trying to infer which terms were the smallest or largest in the original minimum/maximum functions, the desired intervals may simply be expressed as lying between the minimum and the maximum of the two bounding terms, since the two cases simply have flipped intervals.

In other words, if the following are defined:

$$A = \gamma_{p,P} - \gamma_{n,N},$$
$$B = \gamma_{p,P} - \frac{1}{p_F + p_P}\gamma_0 + \frac{p_N}{p_F + p_P}\gamma_{n,N},$$
$$C = \gamma_{f,F} - \frac{1}{p_F + p_P}\gamma_0 + \frac{p_N}{p_F + p_P}\gamma_{n,N},$$
$$D = \gamma_{f,F} - \frac{1}{p_F}\gamma_0 + \frac{p_P + p_N}{p_F}\gamma_{n,N},$$

then it follows that:

$$\min(A, B) \le \gamma_{p,P} - \gamma_{n,P} \le \max(A, B),$$
$$\min(C, D) \le \gamma_{f,F} - \gamma_{n,F} \le \max(C, D).$$

It is clear that $A, B, C, D$ may be estimated consistently, and therefore there are consistently estimable bounds obtained by using the estimates $\hat{\gamma}_{p,P}, \hat{\gamma}_{n,N}, \hat{\gamma}_{f,F}, \hat{p}_P, \hat{p}_N, \hat{p}_F, \hat{\gamma}_0$ to form $\hat{A}, \hat{B}, \hat{C}, \hat{D}$.

2.3 A Monte Carlo Method of Generating Posterior Samples of Effect Estimates

It has been seen that introducing a third treatment level to distinguish partial adherers to study protocols results in an
ability to form consistent parameter estimates using methods similar to a simpler principal stratification approach, but only for some parameters. For other parameters, it has been seen that only consistent bound estimates may be computed. As a result, one next step in developing this expansion to the principal stratification approach is to consider the possibility of determining posterior distributions of effects of interest.

The present model for the data includes unknown parameters for how the population is distributed across principal strata ($p_F, p_P, p_N$), as well as unknown parameters for probabilities of the outcome of interest for each treatment type within each principal stratum (9 different $\gamma_{Q,W}$ values, for treatment levels $Q$ from $f, p, n$ and principal strata $W$ from $F, P, N$). Out of these parameters, only some values and linear combinations of values may be consistently estimated with increasing accuracy as the study sample size increases, while others will not become informed by gathering data in the given study framework. For this reason, the model is called partially identified. By following the steps outlined in Gustafson [21] for partially identified models as described below, a Monte Carlo simulation method can be designed that is computationally less expensive than other methods such as Markov Chain Monte Carlo:

1. Identify a 'transparent reparameterization': a one-to-one invertible mapping $h$ from the original parameter set $\theta$ to a new parameter set defined such that it can be strictly separated into fully identifiable ($\phi$) and non-identifiable ($\lambda$) parts.

2. Transform a specified prior distribution $\pi_\theta(\theta)$ such that it is expressed in terms of the new parameterization.

3. Identify a convenience prior distribution in the new parameter space, $\pi^*(\phi, \lambda)$. We will have that $\pi^*(\phi, \lambda)$ is independently uniform, and it follows that:

$$\begin{aligned}
\pi^*_{\phi,\lambda|y}(\phi, \lambda | y) &= \pi^*(\phi | y)\,\pi^*(\lambda | \phi) \\
&\propto \pi^*(\phi | y) \\
&\propto L(y | \phi)\,\pi^*(\phi) \\
&\propto L(y | \phi)
\end{aligned}$$

4. Determine $L(y | \phi)$, which is proportional to the convenience posterior, by separating it into likelihood components. The $y$ vector can be separated into four different groups of independent and assumed identically distributed observations. Thus the target likelihood can be calculated as the product of each of the four groups' outcome likelihoods as well as the likelihood of the groups' distribution in the population.

5. An importance sampling scheme [34] is used to sample from the convenience posterior distribution. Weights are assigned to each sample in order to adjust from convenience posterior samples to samples representing the target posterior distribution.

Step 1: Transparent Reparameterization

Let the original parameter set be defined as the vector $\theta$:

$$\theta = (\gamma_{n,P},\ \gamma_{f,F},\ \gamma_{p,P},\ \gamma_{n,N},\ p_F,\ p_P,\ p_N,\ \gamma_{n,F},\ \gamma_{p,F},\ \gamma_{f,P},\ \gamma_{f,N},\ \gamma_{p,N})^T.$$

Referring back to the derivations directly preceding Equations (3) and (4), it has been shown that consistent estimates are available for $\gamma_{f,F}$, $\gamma_{p,P}$, $\gamma_{n,N}$, $p_F$, $p_P$, and $p_N$. Furthermore, a consistent estimate is available for $\gamma_{n,F}\,p_F + \gamma_{n,P}\,p_P + \gamma_{n,N}\,p_N$, as it is equivalent to $\gamma_0$, the average outcome seen in those assigned the control intervention, as per Equation (2). Define the following parameter for simplicity in further derivations:

$$\chi = \gamma_{n,F}\,p_F + \gamma_{n,P}\,p_P + \gamma_{n,N}\,p_N.$$

With this information, it is possible to form a simple mapping from $\theta$ to a set of parameters that clearly distinguishes parameters fully identifiable by data from parameters that are not at all identifiable by data.
Calling this mapping $h$, the new set of identifiable parameters $\phi$, and the new set of non-identifiable parameters $\lambda$, the mapping is written as follows:

$$h(\theta) = h\begin{pmatrix} \gamma_{n,P} \\ \gamma_{f,F} \\ \gamma_{p,P} \\ \gamma_{n,N} \\ p_F \\ p_P \\ p_N \\ \gamma_{n,F} \\ \gamma_{p,F} \\ \gamma_{f,P} \\ \gamma_{f,N} \\ \gamma_{p,N} \end{pmatrix} = \left(\begin{array}{c} \chi \\ \gamma_{f,F} \\ \gamma_{p,P} \\ \gamma_{n,N} \\ p_F \\ p_P \\ p_N \\ \hline \gamma_{n,F} \\ \gamma_{p,F} \\ \gamma_{f,P} \\ \gamma_{f,N} \\ \gamma_{p,N} \end{array}\right) = \left(\begin{array}{c} \phi \\ \hline \lambda \end{array}\right) = (\phi, \lambda)^T.$$

In the above mapping, the resulting vector has a horizontal line which shows the distinction between the parameters fully informed by data ($\phi$) and the parameters not informed by data ($\lambda$). It can be observed that this transparent reparameterization is simple in that the mapping $h$ maps most parameters to themselves, and only one parameter has been mapped to a linear combination of itself and some other parameters, as $\gamma_{n,P}$ maps to $\chi = \gamma_{n,F}\,p_F + \gamma_{n,P}\,p_P + \gamma_{n,N}\,p_N$.

Step 2: Target Prior

We specify a prior on the original parameter space $\theta$ as independently Uniform(0,1) for each $\gamma_{Q,W}$, and Dirichlet(1,1,1) for $(p_F, p_P, p_N)$. Then, in order to express the prior distribution in terms of the new parameters $(\phi, \lambda)$, the following identity may be used:

$$\pi_{\phi,\lambda}(\phi, \lambda) = \pi_\theta\left(h^{-1}(\phi, \lambda)\right) \left|\frac{\partial (h^{-1})}{\partial (\phi, \lambda)}\right|.$$

Because $\pi_\theta(\theta)$ is uninformative except for defining the bounds of the parameter space, the term $\pi_\theta(h^{-1}(\phi, \lambda))$ will simply be an indicator function that maps to 0 if the inputs $(\phi, \lambda)$ lie outside the image of $h$. Let $H$ represent the image of $h$. The determinant of the inverse mapping's derivative is simpler than it may appear, because the mapping $h$, and thus $h^{-1}$, maps most vector elements to themselves. In the inverse derivative matrix, all but the first row and first column are as in an identity matrix, and thus the determinant of the matrix will simply be the element in the (1,1) position of the matrix. This element is the derivative of the first element of $h^{-1}$ (the mapping from $\chi$ back to $\frac{\chi - \gamma_{n,F}\,p_F - \gamma_{n,N}\,p_N}{p_P}$) with respect to $\chi$, which is simply $p_P^{-1}$.
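The determinant argument above can be written out explicitly. The display below is only a sketch of that reasoning, writing $p_F$ for the full-complier proportion; the block layout of the Jacobian (a single non-trivial first row $r$ above identity rows) is an illustrative way of presenting it, not notation taken from the thesis.

$$h^{-1}: \quad \gamma_{n,P} = \frac{\chi - \gamma_{n,F}\,p_F - \gamma_{n,N}\,p_N}{p_P}, \qquad \text{all other coordinates unchanged,}$$

$$\frac{\partial (h^{-1})}{\partial (\phi, \lambda)} = \begin{pmatrix} \dfrac{1}{p_P} & r \\ 0 & I_{11} \end{pmatrix}, \qquad \left|\frac{\partial (h^{-1})}{\partial (\phi, \lambda)}\right| = \frac{1}{p_P} \cdot \det(I_{11}) = \frac{1}{p_P}.$$

Here $r$ collects the partial derivatives of $\gamma_{n,P}$ with respect to the remaining eleven coordinates. These entries are generally non-zero, but because the matrix is upper triangular they do not enter the determinant.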
Thus, we arrive at the specified prior in terms of the new parameterization as follows:

$$\pi_{\phi,\lambda}(\phi, \lambda) = \frac{I_H(\phi, \lambda)}{p_P}.$$

Step 3: Convenience Prior

For more convenient calculations, it is necessary to identify a convenience prior in the new parameter space. Similar to the true prior in the original parameter space, define the convenience prior $\pi^*(\phi, \lambda)$ to be independently Unif(0,1) for $\chi$ and the $\gamma_{Q,W}$ parameters, and Dirichlet(1,1,1) for $(p_F, p_P, p_N)$. Then, due to the independence of the parameters, $\pi^*(\lambda | \phi)$ is equal to $\pi^*(\lambda)$ and thus uniformly distributed, and $\pi^*(\phi)$ is also uniform.

Step 4: Convenience Posterior

Due to the way the convenience prior was defined, the following simplifying steps can be used when forming an expression for the convenience posterior distribution:

$$\begin{aligned}
\pi^*(\phi, \lambda | y) &= \pi^*(\phi | y)\,\pi^*(\lambda | \phi) \\
&\propto \pi^*(\phi | y) \\
&\propto L(y | \phi)\,\pi^*(\phi) \\
&\propto L(y | \phi)
\end{aligned}$$

So it is seen that the likelihood of the data given $\phi$ is proportional to the convenience posterior distribution. At this point the likelihood function can be broken into 5 components. There are four separable 'bins' that independent observations can belong to: one for those assigned no treatment, then three bins for those assigned treatment, one for each of the three possible treatments received. Each of these bins can independently have its likelihood calculated using relevant terms. The fifth component of the likelihood function is the likelihood that the observations were assigned to their respective bins. Subscripts are used here to indicate specific subsets of observations (e.g. $y_{z=1,x=0}$ is the set of outcomes corresponding only to observations with $z = 1$ and $x = 0$) or to indicate counts of observations (e.g. $n_{z=1,x=0}$ is the number of observations with $z = 1$, $x = 0$), where $x = 0, 1, 2$ code the treatment levels N, P, F respectively.
The likelihood function's separation into components can be expressed as follows:

$$L(y | \phi) = L(y_{z=0} | \phi)\,L(y_{z=1,x=0} | \phi)\,L(y_{z=1,x=1} | \phi)\,L(y_{z=1,x=2} | \phi)\,L(n_{z=0}, n_{z=1,x=0}, n_{z=1,x=1}, n_{z=1,x=2} | \phi).$$

Here, each part of the likelihood function depends only on certain parts of $\phi$:

$$L(y | \phi) = L(y_{z=0} | \chi)\,L(y_{z=1,x=0} | \gamma_{n,N})\,L(y_{z=1,x=1} | \gamma_{p,P})\,L(y_{z=1,x=2} | \gamma_{f,F}) \times L(n_{z=0}, n_{z=1,x=0}, n_{z=1,x=1}, n_{z=1,x=2} | p_F, p_P, p_N).$$

The first four likelihoods can be expressed using simple binomial distribution functions, while the fifth likelihood does not vary with $n_{z=0}$, since no parameters inform this value, but the remaining three $n$ values may be expressed in a multinomial distribution. Disregarding terms not involving the parameters, the resulting form is:

$$\begin{aligned}
L(y | \phi) \propto{} & \chi^{n_{z=0,y=1}} (1 - \chi)^{n_{z=0,y=0}} \\
& \times \gamma_{n,N}^{n_{z=1,x=0,y=1}} (1 - \gamma_{n,N})^{n_{z=1,x=0,y=0}} \\
& \times \gamma_{p,P}^{n_{z=1,x=1,y=1}} (1 - \gamma_{p,P})^{n_{z=1,x=1,y=0}} \\
& \times \gamma_{f,F}^{n_{z=1,x=2,y=1}} (1 - \gamma_{f,F})^{n_{z=1,x=2,y=0}} \\
& \times p_N^{n_{z=1,x=0}}\, p_P^{n_{z=1,x=1}}\, p_F^{n_{z=1,x=2}}.
\end{aligned}$$

From this, the convenience posterior $\pi^*(\phi, \lambda | y)$ can be identified as follows, where all parameters have posterior distributions independent of each other:

$$\begin{aligned}
\chi &\sim \mathrm{Beta}(n_{z=0,y=1} + 1,\ n_{z=0,y=0} + 1), \\
\gamma_{f,F} &\sim \mathrm{Beta}(n_{z=1,x=2,y=1} + 1,\ n_{z=1,x=2,y=0} + 1), \\
\gamma_{p,P} &\sim \mathrm{Beta}(n_{z=1,x=1,y=1} + 1,\ n_{z=1,x=1,y=0} + 1), \\
\gamma_{n,N} &\sim \mathrm{Beta}(n_{z=1,x=0,y=1} + 1,\ n_{z=1,x=0,y=0} + 1), \\
(p_F, p_P, p_N) &\sim \mathrm{Dirichlet}(n_{z=1,x=2} + 1,\ n_{z=1,x=1} + 1,\ n_{z=1,x=0} + 1), \\
\gamma_{n,F},\ \gamma_{p,F},\ \gamma_{f,P},\ \gamma_{f,N},\ \gamma_{p,N} &\sim \mathrm{Unif}(0, 1).
\end{aligned}$$

Step 5: Target Posterior Sampling Scheme

In order to adjust any samples taken from the convenience posterior to reflect the true posterior distribution, weights are assigned to each sample taken from the convenience posterior that are defined such that the following hold.
Letting $m$ represent the size of the Monte Carlo sample, we take:

$$w_i \propto \frac{\pi(\phi_i, \lambda_i)}{\pi^*(\phi_i, \lambda_i)}, \qquad \text{with } \sum_{i=1}^{m} w_i = 1.$$

As $\pi(\phi_i, \lambda_i)$ and $\pi^*(\phi_i, \lambda_i)$ have already been determined, it is seen that:

$$w_i \propto \frac{I_H(\phi_i, \lambda_i)}{p_{P,i}},$$

where $p_{P,i}$ is the value of $p_P$ in the $i$th sample.

In the following section, it will be more clearly shown how to use these results (particularly the posterior distributions from Step 4 and the weights from Step 5) in order to build a posterior sampling scheme.

3 Examples and Simulation Results

In the following sections, the "Monte Carlo method" refers to the posterior sampling method previously formulated in Section 2.3. One recurring choice that will be discussed for all simulations is whether any of the monotonicity types discussed in Section 2.2 are satisfied during data simulation, and furthermore whether any of the same monotonicity types are assumed to be true when performing inference on simulated data sets. In the following sections, the phrase "monotonicity type X" refers to the case where "assumption X" is true during data simulation, while "assumption X" refers only to using the assumption during inference and not during data simulation.

The main idea in the setup of the forthcoming simulations is that different monotonicity types can be crossed with different assumptions. For example, the case of data being sampled from a distribution satisfying monotonicity type 1 and being analyzed using assumption 2 could potentially be very different from the case of data being sampled from a distribution satisfying monotonicity types 1 and 2 and being analyzed using no assumptions. As a result, forthcoming simulations try to observe the full scope of combinations of monotonicity types used when generating data and of assumptions made when analyzing data, although only a selection of these combinations are shown in full detail.

The steps taken to generate data using a specific monotonicity type are as follows:

1. Decide on a parameter set $\theta$ (as defined in Section 2.3) that satisfies the given monotonicity type(s). Also, decide on $n$, the number of people in the simulated study, and $p$, the proportion of patients who would be randomly assigned treatment (for all simulations shown in this thesis, $p = 0.5$).
2. Randomly generate the vector of $Z_i$ values as $n$ samples from a Bernoulli($p$) distribution.
3. Randomly generate principal strata $P_i$ by sampling $n$ values from the three possible principal strata with replacement, with sampling weights $p_F, p_P, p_N$ as determined in $\theta$.
4. Determine the vector of $X_i$ values using the $P_i$ and $Z_i$ values (this may be coded using nested 'if' statements, such that for example if $Z_i = 1$ and $P_i = F$, then $X_i = F$).
5. Define a new vector of probabilities $K_i$ that the $i$th patient will have outcome $Y_i = 1$ based on $\theta$, $P_i$ and $X_i$ (this may again be coded using nested 'if' statements, such that for example if $P_i = F$ and $X_i = F$, then $K_i = \gamma_{f,F}$).
6. Generate the $n$ values of $Y_i$ from Bernoulli($K_i$) distributions.
7. The generated data to be analyzed are the vectors of $Z_i$, $X_i$, and $Y_i$ values.

After simulating data, posterior estimates $\hat{\theta}$ are generated using the Monte Carlo method, which is done using the following steps, drawing on the results from Section 2.3:

1. Select which assumption(s) will be made for the analysis, if any. Select the desired posterior sample size, $N$.
2. Define some $G$ such that $G \ge N$ (some samples will be assigned weights of 0 as a result of violating requirements of posterior samples, reducing the actual posterior sample size, so more should be generated to accommodate this). We generally used $G = 5N$.
3. Using the data set of $Z_i$, $X_i$, and $Y_i$ values, sample $G$ values from the following distributions as (unweighted) posterior samples for each parameter:

$$\begin{aligned}
\chi &\sim \mathrm{Beta}(n_{z=0,y=1} + 1,\ n_{z=0,y=0} + 1), \\
\gamma_{f,F} &\sim \mathrm{Beta}(n_{z=1,x=2,y=1} + 1,\ n_{z=1,x=2,y=0} + 1), \\
\gamma_{p,P} &\sim \mathrm{Beta}(n_{z=1,x=1,y=1} + 1,\ n_{z=1,x=1,y=0} + 1), \\
\gamma_{n,N} &\sim \mathrm{Beta}(n_{z=1,x=0,y=1} + 1,\ n_{z=1,x=0,y=0} + 1), \\
(p_F, p_P, p_N) &\sim \mathrm{Dirichlet}(n_{z=1,x=2} + 1,\ n_{z=1,x=1} + 1,\ n_{z=1,x=0} + 1), \\
\gamma_{n,F},\ \gamma_{p,F},\ \gamma_{f,P},\ \gamma_{f,N},\ \gamma_{p,N} &\sim \mathrm{Unif}(0, 1).
\end{aligned}$$

4. Calculate the posterior sample $\gamma_{n,P}$ values as $(\chi - \gamma_{n,N}\,p_N - \gamma_{n,F}\,p_F)\,p_P^{-1}$.
5. Identify and remove any posterior observations that violate assumptions, and any posterior observations that have $\gamma_{n,P}$ values less than 0 or greater than 1. If the total number of remaining samples is less than $N$, repeat the previous two steps. This will continue until at least $N$ posterior samples are achieved.
6. Calculate weights such that each $w_i$ is proportional to the inverse of the $i$th posterior observation of $p_P$, and the sum of all weights is 1.
7. The posterior samples are thus the remaining samples for each parameter, with their associated weights $w_i$.

Forthcoming figures will only show posterior distributions of three parameters of particular interest: the effect of receiving partial treatment for partial adherers, the effect of receiving full treatment for full adherers, and the average effect of receiving full treatment within the whole population. These three are expressed algebraically in their respective order as:

$$\psi_1 = \gamma_{p,P} - \gamma_{n,P},$$
$$\psi_2 = \gamma_{f,F} - \gamma_{n,F},$$
$$\psi_3 = (\gamma_{f,F}\,p_F + \gamma_{f,P}\,p_P + \gamma_{f,N}\,p_N) - (\gamma_{n,F}\,p_F + \gamma_{n,P}\,p_P + \gamma_{n,N}\,p_N).$$

3.1 Limiting Distributions of Parameter Estimates

As described in Section 2.2.1, when Assumption 1 is made, data may be used to estimate $\hat{A}$, $\hat{B}$, $\hat{C}$, and $\hat{D}$, and these values may be used to form estimated bounds on two quantities of interest: the average effect of partial treatment in partially compliant people $(\gamma_{p,P} - \gamma_{n,P})$ and the average effect of full treatment in fully compliant people $(\gamma_{f,F} - \gamma_{n,F})$.
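The data-generation and Monte Carlo steps listed earlier in this section can be sketched in a few lines of Python using only the standard library. This is a minimal sketch rather than the implementation used for the thesis results: the names `THETA`, `simulate_trial`, `outcome_counts` and `sample_posterior` and the values in `THETA` are hypothetical, no monotonicity assumption is imposed, only $\psi_1$ and $\psi_2$ are returned (so the counterfactual parameters are not drawn), and candidates are drawn one at a time until the requested number is accepted instead of in batches of size $G$.

```python
import random

# Hypothetical generating values, chosen only for illustration.
# Keys of "gamma" are (treatment received, principal stratum).
THETA = {
    "p": {"F": 0.5, "P": 0.3, "N": 0.2},
    "gamma": {("F", "F"): 0.7, ("P", "P"): 0.6, ("N", "F"): 0.3,
              ("N", "P"): 0.4, ("N", "N"): 0.5},
}

def simulate_trial(theta, n, p_assign=0.5, rng=random):
    """Data-generation steps 1-7: return a list of (z, x, y) tuples."""
    strata = list(theta["p"])
    weights = [theta["p"][s] for s in strata]
    data = []
    for _ in range(n):
        z = 1 if rng.random() < p_assign else 0
        s = rng.choices(strata, weights=weights)[0]
        x = s if z == 1 else "N"   # controls cannot access treatment
        y = 1 if rng.random() < theta["gamma"][(x, s)] else 0
        data.append((z, x, y))
    return data

def outcome_counts(data, z, x=None):
    """Return (# with y = 1, # with y = 0) among rows matching z (and x)."""
    ys = [yi for zi, xi, yi in data if zi == z and (x is None or xi == x)]
    return sum(ys), len(ys) - sum(ys)

def sample_posterior(data, n_keep, rng=random):
    """Monte Carlo steps 3-6 with no monotonicity assumption imposed."""
    c0 = outcome_counts(data, 0)
    cF, cP, cN = (outcome_counts(data, 1, k) for k in "FPN")
    draws = []
    while len(draws) < n_keep:
        chi = rng.betavariate(c0[0] + 1, c0[1] + 1)
        g_fF = rng.betavariate(cF[0] + 1, cF[1] + 1)
        g_pP = rng.betavariate(cP[0] + 1, cP[1] + 1)
        g_nN = rng.betavariate(cN[0] + 1, cN[1] + 1)
        # Dirichlet draw obtained by normalising independent Gamma variates.
        gam = [rng.gammavariate(sum(c) + 1, 1.0) for c in (cF, cP, cN)]
        pF, pP, pN = (g / sum(gam) for g in gam)
        g_nF = rng.random()        # non-identified parameter, Unif(0, 1)
        g_nP = (chi - g_nN * pN - g_nF * pF) / pP
        if not 0.0 <= g_nP <= 1.0:
            continue               # outside the image of h, so weight 0
        draws.append({"psi1": g_pP - g_nP, "psi2": g_fF - g_nF,
                      "g_nP": g_nP, "w": 1.0 / pP})
    total = sum(d["w"] for d in draws)
    for d in draws:
        d["w"] /= total            # weights proportional to 1 / p_P, summing to 1
    return draws

if __name__ == "__main__":
    rng = random.Random(1)
    post = sample_posterior(simulate_trial(THETA, 1000, rng=rng), 2000, rng=rng)
    print(sum(d["w"] * d["psi1"] for d in post))  # weighted posterior mean of psi_1
```

Passing a seeded `random.Random` instance as `rng` keeps a run reproducible. An assumption check, if wanted, would be one more rejection condition next to the range check on `g_nP`.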
As $n$ tends towards infinity these estimated bounds will converge to true bounds. Just as the bounds converge to true bounds as study sample size approaches infinity, so too will the posterior distributions of target parameters converge to limiting posterior distributions. It is of interest to visualize, using some examples, how estimated posterior distributions and estimated bounds compare to limiting bounds and the limiting posterior distribution. Two examples will be visualized, where one has data generated from a distribution satisfying monotonicity types 1 and 2 and thus no invalid assumptions, and one has no monotonicity types satisfied, and thus neither of the assumptions is valid. Both have a study size of 1000 for the estimated bounds and distribution, and a study size approaching infinity for the limiting bounds and distribution.

3.1.1 Simulation 1: Both Monotonicity Types

The parameter set used for the case with both monotonicity types is:

$$\theta_{1,2} = (\gamma_{n,P},\ \gamma_{f,F},\ \gamma_{p,P},\ \gamma_{n,N},\ p_F,\ p_P,\ p_N,\ \gamma_{n,F},\ \gamma_{p,F},\ \gamma_{f,P},\ \gamma_{f,N},\ \gamma_{p,N})^T = (0.4,\ 0.55,\ 0.45,\ 0.35,\ \dots)^T.$$

Following in Figures 1, 2, and 3 are plots showing posterior sample values for the parameters of interest $\psi_1$, $\psi_2$, and $\psi_3$ respectively. For each possible assumption, the posterior sample distribution for a study of size 1000 is compared to the posterior sample distribution for a study of a size approaching infinity (with bounds additionally shown when Assumption 1 is made).

Figure 1: Histogram of the posterior sample of the partial treatment effect on P patients, with monotonicity types 1 and 2 satisfied. Posteriors formed using violated assumptions have a grey fill, whereas samples with all assumptions valid have a white fill. A solid vertical line indicates the true value, dashed indicates estimated bounds, dotted indicates limiting bounds.
The limiting posterior density curve is plotted over the histogram.

Figure 2: Histogram of the posterior sample of the full treatment effect on F patients, with monotonicity types 1 and 2 satisfied. Posteriors formed using violated assumptions have a grey fill, whereas samples with all assumptions valid have a white fill. A solid vertical line indicates the true value, dashed indicates estimated bounds, dotted indicates limiting bounds. The limiting posterior density curve is plotted over the histogram.

Figure 3: Histogram of the posterior sample of the full treatment effect on all patients, with monotonicity types 1 and 2 satisfied. Posteriors formed using violated assumptions have a grey fill, whereas samples with all assumptions valid have a white fill. A solid vertical line indicates the true value. The limiting posterior density curve is plotted over the histogram.

In Figures 1, 2, and 3 it can be seen that invoking Assumption 2 does not affect inference as much as Assumption 1 in this example. Assumption 1 appears to reduce variation in the posterior distribution of the partial treatment effect on P patients and the full treatment effect on F patients, but less so in the full treatment effect on the population. Furthermore, it is visible that when assumptions were made that allowed for bounds to be calculated, there were cases where estimated bounds completely missed the true value, but as sample size increased these estimated bounds converged to accurate ones.
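The estimated bounds discussed here are the plug-in versions of $A$, $B$, $C$ and $D$ from Section 2.2.1. As a rough illustration, the following Python sketch computes them from a data set stored as `(z, x, y)` tuples. The function names, the dictionary keys and the tuple layout are assumptions made for this example, and the sketch assumes every treatment-received group among the treated is non-empty.

```python
def plug_in_estimates(data):
    """Consistent estimates from (z, x, y) rows, with x in {"N", "P", "F"}."""
    def mean_y(rows):
        return sum(y for _, _, y in rows) / len(rows)

    control = [r for r in data if r[0] == 0]
    treated = [r for r in data if r[0] == 1]
    est = {"g0": mean_y(control)}                 # gamma_0 hat
    for k in "NPF":
        rows = [r for r in treated if r[1] == k]
        est["p" + k] = len(rows) / len(treated)   # stratum proportion hat
        est["g" + k] = mean_y(rows)               # gamma hat for x = k in stratum k
    return est

def assumption1_bounds(e):
    """Bounds on psi_1 and psi_2 under Assumption 1, via A, B, C, D."""
    shared = (e["g0"] - e["pN"] * e["gN"]) / (e["pF"] + e["pP"])
    a = e["gP"] - e["gN"]
    b = e["gP"] - shared
    c = e["gF"] - shared
    d = e["gF"] - (e["g0"] - (e["pP"] + e["pN"]) * e["gN"]) / e["pF"]
    return (min(a, b), max(a, b)), (min(c, d), max(c, d))

if __name__ == "__main__":
    # Hypothetical population values satisfying Assumption 1
    # (gamma_nF = 0.3, gamma_nP = 0.4, gamma_nN = 0.5).
    truth = {"g0": 0.37, "pF": 0.5, "pP": 0.3, "pN": 0.2,
             "gF": 0.7, "gP": 0.6, "gN": 0.5}
    print(assumption1_bounds(truth))
```

With the hypothetical population values in the usage example, the true effects $\psi_1 = 0.2$ and $\psi_2 = 0.4$ lie inside the returned intervals. With finite samples the plug-in intervals can miss the true value, as noted above.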
It is also worth noting that the estimated bounds seem to agree with the posterior distributions, as they lay somewhat near the posterior modes in these examples.

3.1.2 Simulation 2: Neither Monotonicity Type

The parameter set used for the case where no monotonicity types are actually satisfied is:

$$\theta_0 = (\gamma_{n,P},\ \gamma_{f,F},\ \gamma_{p,P},\ \gamma_{n,N},\ p_F,\ p_P,\ p_N,\ \gamma_{n,F},\ \gamma_{p,F},\ \gamma_{f,P},\ \gamma_{f,N},\ \gamma_{p,N})^T = (0.5,\ 0.55,\ 0.45,\ 0.55,\ \dots)^T.$$

In the same format as in Simulation 1, following are Figures 4, 5, and 6 showing posterior sample values for the parameters of interest $\psi_1$, $\psi_2$, and $\psi_3$ respectively. Again, for each possible assumption, the posterior sample distribution for a study of size 1000 is compared to the posterior sample distribution for a study with size approaching infinity (with bounds additionally shown when Assumption 1 is made).

Figure 4: Histogram of the posterior sample of the partial treatment effect on P patients, with no monotonicity types satisfied. Posteriors formed using violated assumptions have a grey fill, whereas samples with all assumptions valid have a white fill. A solid vertical line indicates the true value, dashed indicates estimated bounds, dotted indicates limiting bounds. The limiting posterior density curve is plotted over the histogram when Assumption 1 is not made, and omitted when Assumption 1 is made due to the height of the spiked density within the bounds of the dotted lines.

Figure 5: Histogram of the posterior sample of the full treatment effect on F patients, with no monotonicity types satisfied. Posteriors formed using violated assumptions have a grey fill, whereas samples with all assumptions valid have a white fill. A solid vertical line indicates the true value, dashed indicates estimated bounds, dotted indicates limiting bounds.
The limiting posterior density curve is plotted over the histogram when Assumption 1 is not made, and omitted when Assumption 1 is made due to the height of the spiked density within the bounds of the dotted lines.

Figure 6: Histogram of the posterior sample of the full treatment effect on all patients, with no monotonicity types satisfied. Posteriors formed using violated assumptions have a grey fill, whereas samples with all assumptions valid have a white fill. A solid vertical line indicates the true value. The limiting posterior density curve is plotted over the histogram.

In Figures 4 and 5, it can be seen that invoking Assumption 1 incorrectly has caused posterior densities to converge to an incorrect interval when estimating the treatment effect among F or P patients, while invoking Assumption 2 incorrectly tended to simply reduce posterior density in the area of the true parameter value compared to surrounding values. In the case of Assumption 2 being made, it seems possible that the posterior mean may still be close to the true value, at least in this example, even though the posterior density surrounding the true value has decreased.
As in the case where the assumptions were correctly made in Figures 1 and 2, initial bound estimates can be expected to sometimes miss the true value, but in this case we observe that the limiting bounds do not include the true value.

Invoking incorrect assumptions had different effects on population-level estimates, as seen in Figure 6. Incorrectly taking Assumption 2 had a minimal visible effect on posterior probabilities, but taking Assumption 1 with or without Assumption 2 led to the posterior densities being warped away from the true value, although to a lesser extent than in the cases observed in Figures 4 and 5 with the F and P treatment effects.

Not shown here are the figures corresponding to a data set generated where monotonicity type 1 but not 2 is satisfied, and the figures corresponding to a data set generated where monotonicity type 2 but not 1 is satisfied. The most consistent observation across all completed simulations is that Assumption 1 appears to have a greater impact on inference than Assumption 2. In cases where Assumption 1 is valid, it improves estimate intervals by a notable amount, but in cases where Assumption 1 is invalid, estimate intervals can become entirely incorrect and narrowly focussed on an irrelevant region. In comparison, the effect of Assumption 2 appears more benign: when it is incorrectly made, inference tends to be shifted only slightly away from the true value, while when it is correctly made, it tends to have a small beneficial effect on inference.

3.2 Validation Trial of the Monte Carlo Method through Comparison with JAGS

In order to validate the Monte Carlo simulation method that has been derived, posterior distribution samples can be formed from a simulated data set, comparing the results from using the Monte Carlo simulation method and Markov Chain Monte Carlo sampling implemented using the 'R2jags' package in R.
Posterior samples are generated for comparison in sixteen cases covering every combination of generating parameter set and inference assumption, and in all cases the posterior samples indicated the same posterior distribution of parameters when comparing the Monte Carlo approach and Markov Chain Monte Carlo sampling. Figures 7, 8, and 9 display posterior sample distributions of all samples taken for the cases where monotonicity type 1 is satisfied. Each figure compares the Monte Carlo method to the MCMC method under all four assumption settings, for one estimand of interest.

For this case with monotonicity type 1 satisfied, a study sample size $n = 10000$ is used, with a probability of being assigned treatment of 0.5, and data are simulated using the following parameter set:

$$\theta_2 = (\gamma_{n,P},\ \gamma_{f,F},\ \gamma_{p,P},\ \gamma_{n,N},\ p_F,\ p_P,\ p_N,\ \gamma_{n,F},\ \gamma_{p,F},\ \gamma_{f,P},\ \gamma_{f,N},\ \gamma_{p,N})^T = (0.5,\ 0.55,\ 0.45,\ 0.45,\ \dots)^T.$$

Figure 7: Posterior sample of partial treatment effects on P patients with monotonicity type 2 satisfied. Samples that were taken using violated assumptions have a black fill, whereas samples with all assumptions valid have a white fill.

Figure 8: Posterior sample of full treatment effects on F patients with monotonicity type 2 satisfied. Samples that were taken using violated assumptions have a black fill, whereas samples with all assumptions valid have a white fill.

Figure 9: Posterior sample of full treatment effects on all patients with monotonicity type 2 satisfied. Samples taken using violated assumptions have a black fill, whereas samples with all assumptions valid have a white fill.

The plots for simulated data with no monotonicity, monotonicity type 2, and monotonicity types 1 and 2 (not shown here) have similar results to the above. All simulated posterior estimates from the Monte Carlo method and the Markov Chain Monte Carlo method have distributions that are visually indistinguishable, as in Figures 7, 8, and 9.
The observations are thus consistent with the samples arising from the same distribution.

Since the Monte Carlo method uses an approach comparable to importance sampling, by assigning weights to each observation, it is useful to quantify the effective sample size of the posterior distribution. Kish's effective sample size [33] is calculated as $\left(\sum_{i=1}^{n} w_i^2\right)^{-1}$, and for all simulations completed with a posterior sample size of over 1000, none of the calculated Kish's effective sample sizes were more than 1 less than the raw sample sizes. This supports the idea that the weighting used did not severely limit the reliability of the posterior sample.

3.3 Simulation Set Comparing to the Case Discussed by Shrier

The simulations presented by Shrier et al. [47] were for cases where the outcomes measured were continuous values between 0 and 100, and patients who were assigned to the control group could access treatment (i.e. there could be a further principal stratum of 'always takers' in the population). The simulations showed a bias away from the null hypothesis whenever partial treatment was assumed to be full treatment, or when partial treatment was assumed to be no treatment. After having developed the new methods in Sections 2 and 3, it is of interest how these new estimators may compare to the biased approaches observed in Shrier's simulations.

Thus, we set up a simulation scheme that is as similar to Shrier's as possible, and is able to compare our methods to the previous biased methods [47]. Instead of having outcomes sampled from normal distributions with a parameter set of means between 0 and 100, we divide these means by 100 and have the parameters instead convey the probability of a binary outcome.
Furthermore, all patients in the original simulation scheme that were considered 'always takers' are now baseline compliers, so instead of simulating data on 400 full compliers, 200 partial compliers, 200 always takers, and 200 never takers, there are 600 full compliers, 200 partial compliers, and 200 never takers. The simulation steps are as follows:

1. The vector of principal strata P has 600 entries F, 200 entries P, and 200 entries N.

2. The vector of assigned treatment Z satisfies that exactly 300 people with principal stratum F are assigned treatment, 100 people with principal stratum P, and 100 people with principal stratum N.

3. Determine the vector X using the vectors P and Z, which may be coded using nested 'if' statements.

4. Use the following parameter vector θ, as adapted from Shrier's parameter set, to define the vector of probabilities $K_i$ that each patient will have outcome $Y_i = 1$:

$$\theta = (\gamma_{n,P}, \gamma_{f,F}, \gamma_{p,P}, \gamma_{n,N}, p_F, p_P, p_N, \gamma_{n,F}, \gamma_{p,F}, \gamma_{f,P}, \gamma_{f,N}, \gamma_{p,N})^{\top} = \ldots$$

Note that with this parameter set, both monotonicity assumptions are met, so any assumption made during inference is valid.

5. Generate the 1000 values $Y_i$ from Bernoulli($K_i$) distributions.

6. Analyze the generated $Z_i$, $X_i$, $Y_i$ values to form point estimates of the treatment effect in full compliers using each of the following approaches:

• The adapted principal stratification set up in Section 3 using no monotonicity assumptions (using mean of posterior as point estimate). Determine a 95% credible interval using weighted quantiles of the posterior sample.

• The adapted principal stratification set up in Section 3 using monotonicity assumption 1 (using mean of posterior as point estimate).
Determine a 95% credible interval using weighted quantiles of the posterior sample.

• The adapted principal stratification set up in Section 3 using monotonicity assumption 2 (using mean of posterior as point estimate). Determine a 95% credible interval using weighted quantiles of the posterior sample.

• The adapted principal stratification set up in Section 3 using monotonicity assumptions 1 and 2 (using mean of posterior as point estimate). Determine a 95% credible interval using weighted quantiles of the posterior sample.

• Simple principal stratification assuming partial treatment is no treatment. Determine a 95% confidence interval by bootstrapping from the original data set and getting a standard bootstrap confidence interval.

• Simple principal stratification assuming partial treatment is full treatment. Determine a 95% confidence interval by bootstrapping from the original data set and getting a standard bootstrap confidence interval.

• The approach developed in Section 2.2.1 of estimating bounds of the treatment effect in full compliers by assuming monotonicity across principal strata.

After repeating steps 5 and 6 a total of 2000 times, there will then be 2000 sets of 6 point estimates of the effect, as well as 2000 pairs of bound estimates of the effect. Visible in Figure 10 is a comparison of the distributions of each of these sets of 2000 estimates.

Figure 10: Boxplots of estimates of treatment effect in F patients over 2000 simulations using a generative parameter set and population adapted from Shrier [47], separated by estimation method.
The true parameter value is shown as a solid horizontal line.

It was also observed that the true value was within the estimated bounds in 42.3% of simulations.

In addition to examining the performance of single point estimates, it may be observed how coverage of credible/confidence intervals varies across estimation methods, since each point estimation method also had steps included to determine a 95% credible/confidence interval. A plot of 95% interval coverage vs. estimation method is given in Figure 11:

Figure 11: Bar plot of coverage of estimated 95% credible/confidence intervals of treatment effect in F patients over 2000 simulations using a generative parameter set and population adapted from Shrier [47], separated by estimation method.

The main observation here is that in this specific case, assuming partial treatment is equal to receiving no treatment and performing simple principal stratification results in the highest mean square error compared to all other methods, and a coverage below 5%. Also, it is worth observing that the lowest mean square errors are found in any adapted principal stratification method where Assumption 1 is made. In comparison, Assumption 2 seems to have little impact on estimation error, but as was seen in Section 3.1.1 and Section 3.1.2, Assumption 1 tends to narrow posterior distributions, so it is not unexpected that Assumption 1's inclusion decreases RMSE (since Assumption 1 is valid) but also decreases coverage. The much lower coverage of the two simple principal stratification methods does not disagree with Shrier's suggestion that these estimation methods are inherently biased, and the higher coverage of the adapted principal stratification approach does not disagree with the suggestion that these methods result in a reduction of this bias.

As a next step in this observation process, we can try observing how these estimator performances change as the sample size increases.
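Because a rerun at a larger sample size only rescales the group counts, data-generating steps 1 to 5 of the scheme above can be written as one function with a scale argument. The following Python sketch is an illustration rather than the code used for the thesis: the function name `simulate_trial`, the `scale` argument, and in particular every outcome probability in `GAMMA` are hypothetical placeholders, not the parameter set adapted from Shrier. It also assumes, as in the framework described here, that treatment can only be received when it is assigned.

```python
import random

# Hypothetical P(Y = 1 | treatment received, principal stratum).
# Placeholder values for illustration only, NOT the thesis parameter set.
# Only the five observable combinations are used when generating data.
GAMMA = {
    ("none", "F"): 0.30, ("partial", "F"): 0.45, ("full", "F"): 0.60,
    ("none", "P"): 0.30, ("partial", "P"): 0.40, ("full", "P"): 0.50,
    ("none", "N"): 0.30, ("partial", "N"): 0.35, ("full", "N"): 0.40,
}

# Treatment level a patient in each stratum takes when assigned treatment.
RECEIVED_IF_ASSIGNED = {"F": "full", "P": "partial", "N": "none"}


def simulate_trial(rng, scale=1, gamma=GAMMA):
    """Generate strata, assignment Z, treatment received X and outcomes Y."""
    counts = {"F": 600 * scale, "P": 200 * scale, "N": 200 * scale}
    strata, z = [], []
    for stratum, count in counts.items():
        strata += [stratum] * count                      # step 1
        z += [1] * (count // 2) + [0] * (count // 2)     # step 2: half assigned
    # Step 3: unassigned patients cannot access treatment (assumption).
    x = [RECEIVED_IF_ASSIGNED[s] if zi == 1 else "none"
         for s, zi in zip(strata, z)]
    # Step 4: outcome probability K_i for each patient.
    k = [gamma[(xi, s)] for xi, s in zip(x, strata)]
    # Step 5: Bernoulli(K_i) draws.
    y = [1 if rng.random() < ki else 0 for ki in k]
    return strata, z, x, y


if __name__ == "__main__":
    strata, z, x, y = simulate_trial(random.Random(2019), scale=1)
    print(len(y), sum(y))
```

Passing a seeded `random.Random` instance keeps each replicate reproducible; repeating only step 5 for a fixed design, as the scheme describes, would amount to redrawing `y` from the same `k`.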
If we are correct that the developed estimation methods are unbiased and the simple principal stratification methods are biased, we would expect a lower coverage in the simple estimates' intervals and a high coverage in the new methods' intervals. The above simulations are repeated, but the number of patients in each group is increased by a factor of 8, so there are 8000 total patients instead of 1000. All groupings of patients are also increased by the same factor, so no proportions in the population change. After completing the simulation steps once more with this single change, Figure 12 and Figure 13 are produced:

Figure 12: Boxplots of estimates of treatment effect in F patients over 2000 simulations using a generative parameter set and population adapted from Shrier [47] with total population increased to 8000, separated by estimation method. The true parameter value is shown as a solid horizontal line.

Figure 13: Bar plot of coverage of estimated 95% credible/confidence intervals of treatment effect in F patients over 2000 simulations using a generative parameter set and population adapted from Shrier [47], separated by estimation method, after increasing total sample size to 8000.

It was also now observed that the true value was within the estimated bounds in exactly 52% of simulations. It is visible in these results that as sample size increased, all estimates have reduced variance.
We also see that the coverage of both simple principal stratification 95% intervals now approaches zero, further agreeing that these estimation methods are inherently biased.

We may also see in Table 2 the observed root mean square error (RMSE) resulting from each of these estimation methods (except the bound estimation), and see how these measures of error changed as sample size increased.

Table 2: Root mean square errors of six principal stratification (PS) estimation methods from 2000 simulations completed using a generative parameter set and population adapted from Shrier [47], in the case with a sample size of 1000 and in an altered case with a total population of 8000.

Estimation Method                    RMSE, Sample Size 1000    RMSE, Sample Size 8000
Adapted PS, No Assumption            0.0781                    0.0689
Adapted PS, Assumption 1             0.0565                    0.0367
Adapted PS, Assumption 2             0.0975                    0.0867
Adapted PS, Assumptions 1 & 2        0.0506                    0.0350
Simple PS, Assume Partial as None    0.273                     0.268
Simple PS, Assume Partial as Full    0.0878                    0.0794

It is visible that while the posterior mean estimator RMSE values all reduced to less than 90% of their original value, the simple principal stratification estimators assuming partial treatment as no treatment or as full treatment only had RMSE reduce to 98% and 91% of their original values respectively. In both cases the best performance was seen in the posterior mean of the adapted principal stratification, under any assumption setup except Assumption 2 without Assumption 1. These findings agree with the simulations shown in Sections 3.1.1 and 3.1.2, which indicated that Assumption 1 tends to narrow the posterior, while Assumption 2 can result in a more irregular shape in the posterior distribution, which is even more exaggerated when Assumption 2 is made without Assumption 1.
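The summary quantities used in this section can be sketched compactly. The Python code below is a minimal illustration under stated assumptions, not the thesis implementation: it computes Kish's effective sample size, a weighted-quantile credible interval from a weighted posterior sample, and the RMSE and interval coverage across repeated simulations. All function names are hypothetical, and the weighted quantile uses one common convention (the smallest value whose cumulative normalized weight reaches the requested level), which may differ from the convention used for the reported results.

```python
import math


def kish_ess(weights):
    """Kish's effective sample size.

    Written for unnormalized weights; when the weights sum to one this
    reduces to 1 / sum(w_i^2), the form quoted in the text.
    """
    total = sum(weights)
    return total ** 2 / sum(w * w for w in weights)


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative normalized weight is at least q."""
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight / total
        if cumulative >= q:
            return value
    return pairs[-1][0]  # guard against floating-point shortfall


def credible_interval(values, weights, level=0.95):
    """Equal-tailed interval from a weighted posterior sample."""
    tail = (1.0 - level) / 2.0
    return (weighted_quantile(values, weights, tail),
            weighted_quantile(values, weights, 1.0 - tail))


def rmse(estimates, truth):
    """Root mean square error of point estimates across simulations."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


def coverage(intervals, truth):
    """Proportion of (lower, upper) intervals that contain the true value."""
    return sum(lo <= truth <= hi for lo, hi in intervals) / len(intervals)
```

With equal weights `kish_ess` returns the raw sample size, which is the benchmark against which the weighted posterior samples were compared.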
For more theory on coverage of credible intervals in partially identified models, please see Gustafson [20].

4 Conclusions, Future Work

Bias was identified in standard clinical trial inference methods when faced with partial adherence to treatment protocol. Also described was the lack of more advanced methods that are feasible in patient-oriented research settings. New methods based on principal stratification were presented and have been shown to avoid the biases of previous approaches. One of the newly proposed methods is a simple bound calculation and one is a posterior sampling method. The simple bound calculation method is seen to be consistent as long as partial adherers can be accurately assumed to have outcomes between those of non-adherers and full adherers (described as Assumption 1), which may be a concern in settings where the treatment period is long enough for adherence to be impacted by treatment outcome. The posterior sampling method is more versatile and can be applied in situations where Assumption 1 is not valid, and can also explore the optional second assumption that treatment effect is monotone with the amount of treatment received (described as Assumption 2). Even in cases where Assumption 1 was valid, the coverage of 95% credible intervals calculated from posterior samples did outperform the coverage of estimated bounds based on the simple bound calculation method, although this was not explored across a full range of parameter sets. The evidence shown here does suggest that the posterior sampling scheme is superior to the simple bound estimation in all regards except for simplicity.

A range of simulations were completed which explored the impact of incorrectly making either monotonicity assumption while sampling posterior estimates. Also observed was the benefit seen when correctly making either assumption.
Overall, the impact of Assumptions 1 and 2 seemed varied enough that it proved challenging to describe the general expected outcome if either of them was made correctly or incorrectly. One key observation is that Assumption 1 tended to narrow posterior distribution ranges, while Assumption 2 did not, and only altered the shape of the distribution in a similar range. Assumption 2 seldom actually improved estimation, while Assumption 1 often improved it, even in low-sample-size cases where Assumption 1 was not valid, since the bias was small enough to still allow the correct parameter values to fall within the resulting narrowed credible interval. In a high-sample-size case, it would be reasonable to worry that incorrectly making Assumption 1 would result in a bias that outweighed the benefit of a narrowed posterior distribution, but this was also not fully explored.

The findings presented here are built on previously developed theory on principal stratification [42], pragmatic clinical trials [38, 40, 14, 45, 10, 50], and partially identified models [21, 20]. They add to and do not contradict the findings of Shrier et al. [47] that reported bias in cases where partial treatment was recorded as full treatment or no treatment in clinical trials. This thesis took a different approach than some previously presented methods that are less feasible in pragmatic clinical research settings, but that remain useful in settings where a larger amount of information can be collected from patients [17, 32, 39, 31, 36, 5, 29, 18].

This research is useful to caregivers, because it highlights that analytical tools exist for reducing bias in settings of partial treatment adherence that don't require incredibly strict regimens of collecting patient information.
Using the presented methods, patients could be enrolled in clinical research with minimal extra effort, which would be of great benefit given the unavoidable added stress faced when many steps are necessary to access treatment. The methods presented in this thesis can already be used for great benefit in binary-outcome settings where patients cannot access treatment unless granted access. However, these methods could also potentially be expanded in future research to more complex settings using the already presented framework. The most evident next steps to explore the use of these developed methods are to consider expanding them to cases of continuous responses or more diverse principal strata. While the methods of using partially identified model theory to create a more straightforward Monte Carlo sampling method were unique to the binary outcome and simple three-strata framework, a slower Markov Chain Monte Carlo method could be used to form estimates in more complicated strata setups or for continuous response data.

Overall, these methods have a promise that suggests there would be a great benefit in clinical trials moving forward to keep track of when treatments are partially administered. Current methods of assigning a threshold amount of treatment to count as receiving the full treatment should be abandoned for methods that properly acknowledge when treatment has been administered but not in full. Hopefully, as a result, biased results can be avoided in future pragmatic clinical trial research.

References

[1] J. Angrist, G. Imbens, and D. Rubin. Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91:444–455, Jun 1993.
[2] P. C. Austin. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research, 46(3):399–424, 2011.
[3] A. Balke and J. Pearl. Bounds on treatment effects from studies with imperfect compliance.
Journal of the American Statistical Association, 92(439):1171–1176, 1997.
[4] J. H. Baron. Evolution of clinical research: A history before and beyond James Lind. Perspectives in Clinical Research, 3(4):149, 2012.
[5] F. Bartolucci and L. Grilli. Modeling partial compliance through copulas in a principal stratification framework. Journal of the American Statistical Association, 106(494):469–479, 2011.
[6] R. Campbell, F. Starkey, J. Holliday, S. Audrey, M. Bloor, N. Parry-Langdon, R. Hughes, and L. Moore. An informal school-based peer-led intervention for smoking prevention in adolescence (ASSIST): a cluster randomised trial. The Lancet, 371(9624):1595–1602, 2008.
[7] J. Cheng and D. S. Small. Bounds on causal effects in three-arm trials with non-compliance. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(5):815–836, 2006.
[8] N. K. Choudhry, J. Avorn, R. Glynn, E. Antman, S. Schneeweiss, M. Toscano, L. Reisman, J. Fernandes, C. Spettell, J. L. Lee, et al. Full coverage for preventive medications after myocardial infarction. New England Journal of Medicine, 365:2088–2097, 2011.
[9] Coronary Drug Project Research Group. Influence of adherence to treatment and response of cholesterol on mortality in the Coronary Drug Project. Journal of Occupational and Environmental Medicine, 23(7):507, 1981.
[10] S. Dodd, I. R. White, and P. Williamson. Nonadherence to treatment protocol in published randomised controlled trials: a review. Trials, 13(1), 2012.
[11] I. Ford and J. Norrie. Pragmatic trials. New England Journal of Medicine, 375(5):454–463, Apr 2016.
[12] C. E. Frangakis, D. B. Rubin, and X. H. Zhou. Clustered encouragement designs with individual noncompliance: Bayesian inference with randomization, and application to advance directive forms. Biostatistics, 3(2):147–164, 2002.
[13] O. Fröbert, B. Lagerqvist, G. K. Olivecrona, E. Omerovic, T. Gudnason, M. Maeng, M. Aasa, O. Angerås, F. Calais, M. Danielewicz, et al.
Thrombus aspiration during ST-segment elevation myocardial infarction. New England Journal of Medicine, 369(17):1587–1597, 2013.
[14] Government of Canada, Canadian Institutes of Health Research. Strategy for patient-oriented research. CIHR, Jan 2019.
[15] S. Greenland. An introduction to instrumental variables for epidemiologists. International Journal of Epidemiology, 47(1):358–358, 2017.
[16] R. Greevy, J. H. Silber, A. Cnaan, and P. R. Rosenbaum. Randomization inference with imperfect compliance in the ACE-inhibitor after anthracycline randomized trial. Journal of the American Statistical Association, 99(465):7–15, 2004.
[17] S. F. Grey. How Much Compliance is Enough? Examining the Effect of Different Definitions of Compliance on Estimates of Treatment Efficacy in Randomized Controlled Trials. PhD thesis, Case Western Reserve University, 2013.
[18] T. Guo. Causal effects in randomized trials in the presence of partial compliance: breast-feeding effect on infant growth. PhD thesis, McGill University, 2009.
[19] S. Gupta. Intention-to-treat concept: A review. Perspectives in Clinical Research, 2(3):109, 2011.
[20] P. Gustafson. On the behaviour of Bayesian credible intervals in partially identified models. Electronic Journal of Statistics, 6:2107–2124, 2012.
[21] P. Gustafson. Bayesian inference for partially identified models: exploring the limits of limited data. CRC Press, Boca Raton, 2015.
[22] E. F. Halpern. Behind the numbers: Inverse probability weighting. Radiology, 271(3):625–628, 2014.
[23] Heart Protection Study Collaborative Group. MRC/BHF heart protection study of cholesterol lowering with simvastatin in 20,536 high-risk individuals: a randomised placebo-controlled trial. The Lancet, 360:7–22, 2002.
[24] M. A. Hernán, S. Hernández-Díaz, and J. M. Robins. Randomized trials analyzed as observational studies. Annals of Internal Medicine, Oct 2013.
[25] M. A. Hernán and S. Hernández-Díaz. Beyond the intention-to-treat in comparative effectiveness research.
Clinical Trials: Journal of the Society for Clinical Trials, 9(1):48–55, 2011.
[26] M. A. Hernán and J. M. Robins. Per-protocol analyses of pragmatic trials. New England Journal of Medicine, 377(14):1391–1398, May 2017.
[27] M. A. Hernán and D. Scharfstein. Cautions as regulators move to end exclusive reliance on intention to treat. Annals of Internal Medicine, 168(7):515, 2018.
[28] W. Huiping, W. Yu, J. Pei, L. Jiao, Z. Shian, J. Hugang, W. Zheng, and L. Yingdong. Compound salvia pellet might be more effective and safer for chronic stable angina pectoris compared with nitrates. Medicine, 98(9), Mar 2019.
[29] G. W. Imbens and D. B. Rubin. Bayesian inference for causal effects in randomized experiments with noncompliance. The Annals of Statistics, 25(1):305–327, 1997.
[30] G. W. Imbens and D. B. Rubin. Bayesian inference for causal effects in randomized experiments with noncompliance. The Annals of Statistics, 25(1):305–327, 1997.
[31] H. Jin and D. B. Rubin. Principal stratification for causal inference with extended partial compliance. Journal of the American Statistical Association, 103(481):101–111, 2008.
[32] H. Jin and D. B. Rubin. Public schools versus private schools: Causal inference with partial compliance. Journal of Educational and Behavioral Statistics, 34(1):24–45, 2009.
[33] L. Kish. Survey sampling. Wiley-Interscience publication, New York, 1965.
[34] P. M. Lee. Bayesian statistics: an introduction. Wiley, Chichester, 2012.
[35] R. J. Little, Q. Long, and X. Lin. A comparison of methods for estimating the causal effect of a treatment in randomized clinical trials subject to noncompliance. Biometrics, 65(2):640–649, 2008.
[36] Y. Ma and J. Roy. Causal models for randomized trials with continuous compliance. Statistical Causal Inferences and Their Applications in Public Health Research, ICSA Book Series in Statistics, pages 187–201, 2016.
[37] E. J. Murray and M. A. Hernán.
Adherence adjustment in the Coronary Drug Project: A call for better per-protocol effect estimates in randomized trials. Clinical Trials: Journal of the Society for Clinical Trials, 13(4):372–378, Jul 2016.
[38] National Institute of Health Research. About INVOLVE. INVOLVE.
[39] L. C. Page. Understanding the impact of career academy attendance. Evaluation Review, 36(2):99–132, 2012.
[40] Patient-Centered Outcomes Research Institute. PCORI. Dissemination and Implementation Framework and Toolkit, Jan 2019.
[41] N. A. Patsopoulos. A pragmatic view on pragmatic trials. Dialogues in Clinical Neuroscience, 13(2):217, 2011.
[42] J. Pearl. Principal stratification – a goal or a tool? The International Journal of Biostatistics, 7(1):1–13, 2011.
[43] R. Pickard, T. Lam, G. MacLennan, K. Starr, M. Kilonzo, G. McPherson, et al. Antimicrobial catheters for reduction of symptomatic urinary tract infection in adults requiring short-term catheterisation in hospital: a multicentre randomised controlled trial. The Lancet, 380(9857):1927–1935, Nov 2012.
[44] I. Roberts, D. Yates, P. Sandercock, B. Farrell, J. Wasserberg, G. Lomas, R. Cottingham, P. Svoboda, N. Brayley, G. Mazairac, et al. Effect of intravenous corticosteroids on death within 14 days in 10008 adults with clinically significant head injury (MRC CRASH trial): randomised placebo-controlled trial. Obstetrics & Gynecology, 105(1):214, 2005.
[45] D. Schwartz and J. Lellouch. Explanatory and pragmatic attitudes in therapeutical trials. Journal of Clinical Epidemiology, 62(5):499–505, 2009.
[46] A. S. V. Shah, A. Anand, F. E. Strachan, A. V. Ferry, K. K. Lee, A. R. Chapman, et al. High-sensitivity troponin in the evaluation of patients with suspected acute coronary syndrome: a stepped-wedge, cluster-randomised controlled trial. The Lancet, 392:919–928, 2018.
[47] I. Shrier, R. W. Platt, R. J. Steele, and M. Schnitzer. Estimating causal effects of treatment in a randomized trial when some participants only partially adhere.
Epidemiology, 29(1):78–86, 2018.
[48] I. Shrier, E. Verhagen, and S. D. Stovitz. The intention-to-treat analysis is not always the conservative approach. The American Journal of Medicine, 130(7):867–871, 2017.
[49] K. E. Thorpe, M. Zwarenstein, A. D. Oxman, S. Treweek, C. D. Furberg, D. G. Altman, S. Tunis, E. Bergel, I. Harvey, D. J. Magid, et al. A pragmatic-explanatory continuum indicator summary (PRECIS): a tool to help trial designers. Canadian Medical Association Journal, 180(10), 2009.
[50] T. J. VanderWeele. Mediation analysis with multiple versions of the mediator. Epidemiology, 23(3):454–463, 2012.

