UBC Faculty Research and Publications

Stopping randomized trials early for benefit: a protocol of the Study Of Trial Policy Of Interim Truncation-2… Briel, Matthias; Lane, Melanie; Montori, Victor M; Bassler, Dirk; Glasziou, Paul; Malaga, German; Akl, Elie A; Ferreira-Gonzalez, Ignacio; Alonso-Coello, Pablo; Urrutia, Gerard; Kunz, Regina; Culebro, Carolina R; Silva, Suzana d A; Flynn, David N; Elamin, Mohamed B; Strahm, Brigitte; Murad, M H; Djulbegovic, Benjamin; Adhikari, Neill K; Mills, Edward J; Gwadry-Sridhar, Femida; Kirpalani, Haresh; Soares, Heloisa P; Elnour, Nisrin O A; You, John J; Karanicolas, Paul J; Bucher, Heiner C; Lampropulos, Julianna F; Nordmann, Alain J; Burns, Karen E; Mulla, Sohail M; Raatz, Heike; Sood, Amit; Kaur, Jagdeep; Bankhead, Clare R; Mullan, Rebecca J; Nerenberg, Kara A; Vandvik, Per O; Coto-Yglesias, Fernando; Schünemann, Holger; Tuche, Fabio; Chrispim, Pedro P M; Cook, Deborah J; Lutz, Kristina; Ribic, Christine M; Vale, Noah; Erwin, Patricia J; Perera, Rafael; Zhou, Qi; Heels-Ansdell, Diane; Ramsay, Tim; Walter, Stephen D; Guyatt, Gordon H Jul 6, 2009

Full Text

BioMed Central
Trials
Open Access
Study protocol

Stopping randomized trials early for benefit: a protocol of the Study Of Trial Policy Of Interim Truncation-2 (STOPIT-2)

Matthias Briel (1,2), Melanie Lane (3), Victor M Montori* (3), Dirk Bassler (4), Paul Glasziou (5), German Malaga (6), Elie A Akl (7), Ignacio Ferreira-Gonzalez (8), Pablo Alonso-Coello (9), Gerard Urrutia (9), Regina Kunz (2), Carolina Ruiz Culebro (1), Suzana Alves da Silva (10), David N Flynn (3), Mohamed B Elamin (3), Brigitte Strahm (11), M Hassan Murad (3), Benjamin Djulbegovic (12), Neill KJ Adhikari (13), Edward J Mills (14), Femida Gwadry-Sridhar (15), Haresh Kirpalani (1,16), Heloisa P Soares (17), Nisrin O Abu Elnour (3), John J You (1), Paul J Karanicolas (15), Heiner C Bucher (2), Julianna F Lampropulos (3), Alain J Nordmann (2), Karen EA Burns (18), Sohail M Mulla (1), Heike Raatz (2), Amit Sood (3), Jagdeep Kaur (1), Clare R Bankhead (5), Rebecca J Mullan (3), Kara A Nerenberg (1), Per Olav Vandvik (19), Fernando Coto-Yglesias (20), Holger Schünemann (1), Fabio Tuche (10), Pedro Paulo M Chrispim (21), Deborah J Cook (1), Kristina Lutz (1), Christine M Ribic (1), Noah Vale (1), Patricia J Erwin (3), Rafael Perera (5), Qi Zhou (1), Diane Heels-Ansdell (1), Tim Ramsay (22), Stephen D Walter (1) and Gordon H Guyatt (1)

Address: (1) Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Canada; (2) Basel Institute for Clinical Epidemiology and Biostatistics, University Hospital Basel, Basel, Switzerland; (3) Knowledge and Encounter Research Unit, Mayo Clinic, Rochester, MN, USA; (4) Department of Neonatology, University Children's Hospital Tuebingen, Tuebingen, Germany; (5) Centre for Evidence-Based Medicine, Department of Primary Health Care, University of Oxford, Oxford, UK; (6) Universidad Peruana Cayetano Heredia, Lima, Peru; (7) State University of New York at Buffalo, Buffalo, NY, USA; (8) Cardiology Department, Vall d'Hebron Hospital, CIBER de Epidemiología y Salud Pública (CIBERESP), Spain; (9) Centro Cochrane Iberoamericano, Hospital Sant Pau, Barcelona, and CIBER de Epidemiologia y Salud Publica (CIBERESP), Spain; (10) Teaching and Research Center of Pro-Cardiaco, Rio de Janeiro, Brazil; (11) Pediatric Hematology and Oncology Centre for Pediatrics and Adolescent Medicine, University Hospital Freiburg, Freiburg, Germany; (12) Center for Evidence-based Medicine, USF Health Clinical Research, Tampa, FL, USA; (13) Sunnybrook Health Sciences Centre and University of Toronto, Toronto, ON, Canada; (14) British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia, Vancouver, BC, Canada; (15) University of Western Ontario, London, ON, Canada; (16) Children's Hospital Philadelphia, Philadelphia, PA, USA; (17) Mount Sinai Medical Center, Miami Beach, FL, USA; (18) St. Michael's Hospital, Keenan Research Centre and Li Ka Shing Knowledge Institute, University of Toronto, Toronto, ON, Canada; (19) Department of Medicine, Gjøvik, Innlandet Hospital Health Authority, Norway; (20) Hospital Nacional de Geriatría y Gerontología San José, Costa Rica; (21) National School of Public Health (ENSP), Rio de Janeiro, Brazil; (22) Ottawa Hospital Research Institute, University of Ottawa, Ottawa, ON, Canada

Email: Matthias Briel - brielm@uhbs.ch; Melanie Lane - lane.melanie@mayo.edu; Victor M Montori* - montori.victor@mayo.edu; Dirk Bassler - dirk.bassler@med.uni-tuebingen.de; Paul Glasziou - paul.glasziou@dphpc.ox.ac.uk; German Malaga - gmalaga01@gmail.com; Elie A Akl - elieakl@buffalo.edu; Ignacio Ferreira-Gonzalez - nacho_ferreira@hotmail.com; Pablo Alonso-Coello - palonso@santpau.cat; Gerard Urrutia - gurrutia@santpau.cat; Regina Kunz - rkunz@uhbs.ch; Carolina Ruiz Culebro - caro.ruiz09@gmail.com; Suzana Alves da Silva - suzana.silva@procardiaco.com.br; David N Flynn - dnflynn@gmail.com; Mohamed B Elamin - elamin.mohamed@mayo.edu; Brigitte Strahm - brigitte.strahm@uniklinik-freiburg.de; M Hassan Murad - murad.mohammad@mayo.edu; Benjamin Djulbegovic - bdjulbeg@health.usf.edu; Neill KJ Adhikari - neill.adhikari@utoronto.ca; Edward J Mills - emills@cfenet.ubc.ca; Femida Gwadry-Sridhar - femida.gwadrysridhar@lhsc.on.ca; Haresh Kirpalani - kirpalanih@email.chop.edu; Heloisa P Soares - soareshp@gmail.com; Nisrin O Abu Elnour - abuelnour.nisrin@mayo.edu; John J You - jyou@mcmaster.ca; Paul J Karanicolas - pjkarani@uwo.ca; Heiner C Bucher - hbucher@uhbs.ch; Julianna F Lampropulos - lampropulos.julianna@mayo.edu; Alain J Nordmann - anordmann@uhbs.ch; Karen EA Burns - burnsk@smh.toronto.on.ca; Sohail M Mulla - mullasm@mcmaster.ca; Heike Raatz - hraatz@uhbs.ch; Amit Sood - sood.amit@mayo.edu; Jagdeep Kaur - kaurj4@mcmaster.ca; Clare R Bankhead - clare.bankhead@dphpc.ox.ac.uk; Rebecca J Mullan - mullan.rebecca@mayo.edu; Kara A Nerenberg - nerenbk@mcmaster.ca; Per Olav Vandvik - perolav.vandvik@sykehuset-innlandet.no; Fernando Coto-Yglesias - fernandocoto@racsa.co.cr; Holger Schünemann - hjs@buffalo.edu; Fabio Tuche - fabiotuche@gmail.com; Pedro Paulo M Chrispim - chrispim@ensp.fiocruz.br; Deborah J Cook - debcook@mcmaster.ca; Kristina Lutz - kristina.lutz@learnlink.mcmaster.ca; Christine M Ribic - christine.ribic@medportal.ca; Noah Vale - noahvale@gmail.com; Patricia J Erwin - erwin.patricia@mayo.edu; Rafael Perera - rafael.perera@dphpc.ox.ac.uk; Qi Zhou - qzhou@mcmaster.ca; Diane Heels-Ansdell - ansdell@mcmaster.ca; Tim Ramsay - tramsay@ohri.ca; Stephen D Walter - walter@mcmaster.ca; Gordon H Guyatt - guyatt@mcmaster.ca
* Corresponding author

Trials 2009, 10:49 http://www.trialsjournal.com/content/10/1/49

Abstract

Background: Randomized clinical trials (RCTs) stopped early for benefit often receive great attention and affect clinical practice, but pose interpretational challenges for clinicians, researchers, and policy makers. Because the decision to stop the trial may arise from catching the treatment effect at a random high, truncated RCTs (tRCTs) may overestimate the true treatment effect.
The Study Of Trial Policy Of Interim Truncation (STOPIT-1), which systematically reviewed the epidemiology and reporting quality of tRCTs, found that such trials are becoming more common, but that reporting of stopping rules and decisions was often deficient. Most importantly, treatment effects were often implausibly large and inversely related to the number of events accrued. The aim of STOPIT-2 is to determine the magnitude and determinants of possible bias introduced by stopping RCTs early for benefit.

Methods/Design: We will use sensitive strategies to search for systematic reviews addressing the same clinical question as each of the tRCTs identified in STOPIT-1 and in a subsequent literature search. We will check all RCTs included in each systematic review to determine their similarity to the index tRCT in terms of participants, interventions, and outcome definition, and conduct new meta-analyses addressing the outcome that led to early termination of the tRCT. For each pair of tRCT and systematic review of corresponding non-tRCTs we will estimate the ratio of relative risks, and hence estimate the degree of bias. We will use hierarchical multivariable regression to determine the factors associated with the magnitude of this ratio.
Factors explored will include the presence and quality of a stopping rule, the methodological quality of the trials, and the number of total events that had occurred at the time of truncation. Finally, we will evaluate whether Bayesian methods using conservative informative priors to "regress to the mean" overoptimistic tRCTs can correct observed biases.

Discussion: A better understanding of the extent to which tRCTs exaggerate treatment effects and of the factors associated with the magnitude of this bias can optimize trial design and data monitoring charters, and may aid in the interpretation of the results from trials stopped early for benefit.

Published: 6 July 2009
Trials 2009, 10:49 doi:10.1186/1745-6215-10-49
Received: 11 February 2009
Accepted: 6 July 2009
This article is available from: http://www.trialsjournal.com/content/10/1/49
© 2009 Briel et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

Interim analyses conducted early in the course of a randomized clinical trial (RCT) may suggest larger than expected treatment effects that are inconsistent with chance. Consequently, investigators and data monitoring committees may conclude that one treatment is superior to the other and decide to stop the trial and release the results, arguing that completing the trial as planned is unadvisable or even unethical. The publicity surrounding such action often captures considerable attention because of the large treatment effects and the dramatic nature of the decision to terminate early.

Clinicians, authors of systematic reviews and meta-analyses, and professional organizations issuing recommendations face challenges when interpreting the results of such truncated RCTs (tRCTs), because the results could be overly optimistic. Bias arises because random fluctuations towards greater treatment effects may result in early termination [1]. In other words, the estimated treatment effect may be exaggerated because statistical stopping rules are prone to stop a trial at a random high. If the decision to stop the trial resulted from observing the apparent benefit of treatment at a random high, the resulting estimate of the treatment effect will be misleadingly large. If this occurs, data from future similar trials would be expected to yield a smaller estimate of treatment effect, as a consequence of the so-called "regression to the (true) mean" effect [2].

A systematic review of 143 RCTs stopped early for benefit, the Study Of Trial Policy Of Interim Truncation (STOPIT)-1, was published in 2005 [3]. This systematic review documented a recent doubling in the number of tRCTs and found that tRCTs are often published in high profile medical journals. The reporting of the methods that informed the decision to truncate the trials was often deficient, and treatment effects were often implausibly large, especially when the number of events was small.
A recent review focusing on oncology trials stopped early for benefit raised further concerns, finding that almost 80% of tRCTs published in the last three years were used to attain regulatory approval [4].

The extent to which tRCTs actually overestimate effects, the magnitude of bias, and the factors associated (and perhaps causally related) with the magnitude of bias in individual situations, remain uncertain. If one could identify a comprehensive set of trials not stopped early that addressed the same question as a tRCT, those trials or all trials (including the tRCT) could provide the least biased assessment of the true treatment effect that the corresponding tRCT (or tRCTs if there were more than one addressing the same question) was trying to estimate. The goal of this project is to obtain such a comprehensive set of trials matching a number of tRCTs, and thus address the question of magnitude of bias, and variables associated with the magnitude of bias.

Study objectives

The present study, STOPIT-2, seeks to determine the magnitude and determinants of bias that truncation of RCTs for benefit may introduce. Our primary research questions are:

• What is the extent to which tRCTs overestimate the treatment effect compared with the best available estimate of treatment effect as determined by a systematic review and meta-analysis of RCTs addressing the same question as the tRCT?
• What factors are associated with the size of the observed difference in the treatment effect between the tRCT and the respective meta-analytic estimate?
• Can Bayesian methods, using conservative priors, provide more likely estimates of the true underlying treatment effect, i.e. somehow "correct" for truncation bias?

Methods

Overview of methods

STOPIT-1 included 143 tRCTs and we were able to identify 14 additional RCTs stopped early for benefit through a hand search in the medical literature and personal contact with trial investigators.
The effort to identify tRCTs will continue by updating the search that led to the trials identified for STOPIT-1 (November 2004) using the same search strategy and through citation searching linked to the STOPIT-1 publication and accompanying editorial in JAMA [3,5].

In STOPIT-2, we will search for systematic reviews addressing the same question as the tRCTs (Figure 1). We will utilize the sensitive strategy for systematic reviews put forth for MEDLINE by Montori et al. [6]. Systematic reviews that ask a similar question to the tRCT but do not include the tRCT due to its publication after the search date of the systematic reviews will be updated to the present time. Other systematic reviews that include the tRCT will not be updated. Systematic reviews that are only similar under the broadest of definitions will be included only if the review authors chose to pool the tRCT within the systematic review.

For each eligible systematic review, we will blind the results of each included RCT, and two independent reviewers will determine eligibility. From each eligible trial we will then extract data and conduct new meta-analyses addressing the outcome that led to the early termination of the tRCT(s). First, we will compare the relative risk generated by the tRCT with the relative risk from all non-truncated studies. Second, we will use multivariable regression to determine the factors associated with the difference in magnitude of effect between the tRCTs and RCTs not stopped early. These factors will include the presence and quality of a stopping rule, the methodological quality of the trials, and the number of events that had occurred at the time of truncation. Finally, we will compare possible methods for correcting the treatment effect estimates from tRCTs, in particular the use of Bayesian methods using conservative informative priors to "regress to the mean" the tRCT estimates.
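As a rough illustration of how a conservative prior can "regress to the mean" a tRCT estimate, the sketch below applies a normal-normal conjugate update on the log relative risk scale. This is only one simple way to carry out such an adjustment and is not necessarily the model STOPIT-2 will use. The function name `shrink_log_rr` and the prior values (`prior_mean=0.0`, `prior_sd=0.35`) are hypothetical placeholders, not values taken from the protocol.

```python
import math


def shrink_log_rr(log_rr, se, prior_mean=0.0, prior_sd=0.35):
    """Combine a trial's log relative risk with a normal prior.

    prior_mean and prior_sd are illustrative assumptions; the protocol
    states only that the prior will be derived empirically from past trials.
    Returns the posterior mean and posterior standard deviation of log RR.
    """
    w_data = 1.0 / se ** 2          # precision of the observed tRCT estimate
    w_prior = 1.0 / prior_sd ** 2   # precision of the conservative prior
    post_var = 1.0 / (w_data + w_prior)
    post_mean = post_var * (w_data * log_rr + w_prior * prior_mean)
    return post_mean, math.sqrt(post_var)


# Same observed RR of 0.5 from a small trial (large SE) and a large trial (small SE).
observed = math.log(0.5)
small_mean, _ = shrink_log_rr(observed, se=0.40)
large_mean, _ = shrink_log_rr(observed, se=0.10)
print(f"small trial: adjusted RR = {math.exp(small_mean):.2f}")
print(f"large trial: adjusted RR = {math.exp(large_mean):.2f}")
```

Because the weights are precisions, the small trial is pulled further toward the prior mean (RR = 1) than the large trial.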
We will then compare the degree of disagreement with the meta-analytical estimates between the Bayesian-adjusted tRCT and the unadjusted tRCT results.

Figure 1: Flow chart of the Study of Trial Policy of Interim Truncation (STOPIT)-2. Abbreviations: RCT, randomized clinical trial; tRCT, truncated randomized trial due to stopping early for benefit; PICO, patient population, intervention, control, outcome.

Some authors have suggested that pooling tRCTs with non-truncated trials addressing the same question will yield minimally biased estimates of treatment effects [7,8]. However, our previous empirical finding that stopped-early studies contributed more than 40% of the weight in more than a third of meta-analyses including tRCTs challenges this view [9]. Nevertheless, if the overall estimate of treatment effect (based on all studies, including tRCTs) were the least biased estimate of the true underlying effect, it is this estimate to which one should compare tRCTs. Based on simulations and theoretical considerations we found compelling strengths and compelling limitations for each approach (Table 1). We will explore the extent to which results are consistent with the hypothesis that the pooled estimate including the tRCTs is least biased using our empirical data (for instance, the tRCTs should provide a relatively small weight in the meta-analysis as a result of their having fewer events because of stopping early).

Table 1: Comparison of non-truncated RCTs only and truncated + non-truncated RCTs as comparators to estimate the magnitude of bias associated with stopping clinical trials early for benefit based on simulations and theoretical considerations.

Non-truncated RCTs only:
- more appropriate when the number of non-truncated RCTs in meta-analyses is relatively small (= weight of tRCTs in meta-analyses relatively large)
- more appropriate when true treatment effects are small (RCTs in meta-analyses likely to be underpowered)
- more appropriate in the presence of considerable publication bias
- more appropriate when proportion of trials in meta-analyses without formal stopping rule is large
- trial sample separate/independent from tRCT(s) facilitates statistical analysis

Truncated & non-truncated RCTs:
- more appropriate when the number of non-truncated RCTs in meta-analyses is relatively large (= weight of tRCTs in meta-analyses relatively small)
- more appropriate when true treatment effects are large (RCTs in meta-analyses likely to be adequately powered)
- more conservative bias estimation

Abbreviations: RCTs, randomized clinical trials; tRCT(s), truncated randomized clinical trial(s) due to stopping early for benefit

Choosing a primary analysis for a study commonly involves some arbitrariness. Given compelling reasons for either approach, we decided to conduct both analyses. We chose non-truncated RCTs only as the comparator in our primary analysis.
In a complementary second analysis, we will compare the tRCT and the pooled estimate of all trials including the tRCT.

Literature Search for Systematic Reviews

For meta-analyses, we will search the Cochrane Database of Systematic Reviews and the Database of Abstracts of Reviews of Effects using the population and intervention of the tRCTs as search terms. We will also search for meta-analyses in MEDLINE with textwords and Medical Subject Heading terms based on the study population and the intervention specified in the research question of the tRCTs, if necessary supplemented by a specified outcome, and with textwords "meta-analysis" OR "overview" OR "systematic review" and in a second approach with limits "meta-analysis.pt." AND "human" [6].

Eligibility of Systematic Reviews

Systematic reviews will be considered eligible if they meet all of the following 5 criteria:

1) Report the methods used to conduct the review
2) Describe a literature search that, at minimum, includes MEDLINE
3) Include a population similar to that of the tRCT
4) Include an intervention similar to that of the tRCT
5) Include an outcome similar to the one that was the basis of the decision to stop the tRCT early

Because there is considerable judgment involved in the eligibility decisions, particularly criteria 3 to 5, every decision of the initial adjudicators will be reviewed and confirmed or refuted by another adjudicator and, if necessary, by a third party. If in doubt while applying the broadest similarity criteria, a key factor for eligibility will be that the systematic review pooled the tRCT. In general, if in doubt, we will judge the systematic review eligible, because there will be a second review of eligibility at the level of individual trials.

Updating of Systematic Reviews

The only systematic reviews we will update are those that did not include the index tRCT(s) because they were completed prior to the publication of the tRCT. In these instances, we will update the search of the systematic reviews to the present using the same strategy used in the systematic review. We will not update all meta-analyses in the systematic reviews, only the ones for the outcomes that led to the early termination of the matching tRCT(s).

Identification, retrieval and eligibility of RCTs included in the systematic reviews

For each systematic review we will retrieve all included RCTs in full text (including associated manuscripts describing methods) to determine their similarity to the index tRCT. We will obtain data from unpublished studies that were included in the systematic reviews by contacting the authors of the systematic review and/or the authors of the unpublished studies. Including trials addressing a question that was different to that addressed in the relevant tRCT would bias the assessment of magnitude of effect from the trials not stopped early. Thus, we will judge the eligibility of each trial in the systematic review on the basis of the following criteria:

1) Including a population similar to that of the tRCT
2) Including an intervention similar to that of the tRCT
3) Including a control similar to that of the tRCT
4) Including an outcome similar to the one that led to the early termination of the tRCT
5) Random allocation to intervention and control group

One could have criteria for similarity that are very strict, or very permissive. As it is uncertain what the right approach is, we will classify the population, intervention, control and outcome of each potentially eligible trial as either "more or less identical", "similar, but not identical" or "broadly similar". The eligibility form will allow differentiation between eligibility of the studies based on the narrow, the broad or the broadest criteria, and the "closeness" of the RCTs to the index tRCT will be considered in the analyses. We will construct a number of teams of two reviewers to make the eligibility decisions.

Each team will include individuals with expertise relevant to the content of the studies they will review. Within each pair of reviewers, the rating of the individual RCTs will be done independently and in duplicate. Disagreements will be resolved by discussion and, if necessary, by a third party. Because we are at risk of bias in the decision about whether to include an RCT based on the results, the reviewers who judge eligibility will be blinded to the results of the trial. Blinding will be accomplished by a separate team, not involved in study selection, using black ink on "hard copies" before these are scanned into electronic format or using black boxes overlaid on the sections describing results on electronic versions in portable document format of the paper. Every section of potentially eligible RCTs that reports the magnitude of results (abstract, results and discussion) will be blinded before the decision on eligibility is made. Blinding will be tested in a random set of 20 papers sent to 20 reviewers to ensure its success. For RCTs, disagreements in relation to similarity of 2 levels or greater will require adjudication. Disagreements in relation to similarity of 1 level will not require adjudication, and the broader similarity rating will be assumed correct (Figure 2).

Data extraction

From each RCT, we will collect the following data in duplicate.

1. Stopped early (yes/no)
2. Methodological quality: allocation concealment (documented as central independent randomization facility or numbered/coded medication containers prepared and distributed by an independent facility (e.g. pharmacy)); blinding of participants, care providers, and outcome adjudicators (blinding of participants and care providers will be rated as "probably yes" when the trial report states "double blinded" or "placebo controlled"); loss to follow-up (we will collect the number of participants randomized and the number of participants with outcome data for the outcome of interest, allowing for an estimation of loss to follow-up)
3. Measure of treatment effect for the outcome that terminated the tRCT (events and number randomized in intervention and control groups)
4. Pre-implemented stopping characteristics, if any (e.g., planned sample size, interim looks, stopping rules, number of events)
5. Date of conduct of the trial (start date, stop date, publication date)

Statistical Analysis

We will calculate relative risks for each RCT in our study. For studies that provide results as continuous data (means, standard deviations), we will estimate an approximate dichotomous equivalent. To do this we will assume normal distributions of the results and that half a standard deviation represents the minimal important change [10]. Using baseline data we will obtain the 0.5 standard deviation threshold from the baseline distribution and calculate the proportion of each follow-up distribution above or below (depending on the direction of the outcome) the threshold, i.e. the proportion of patients in each treatment arm who "did worse". This will allow us to specify relative risks and associated confidence intervals. If baseline data are not available, we will use the follow-up distribution of the control group to substitute for the 0.5 standard deviation threshold.

As well, for each meta-analysis we will calculate the pooled relative risk and 95% confidence interval for all trials that were not stopped early. Where there is more than one tRCT per meta-analysis we will also calculate a pooled relative risk and confidence interval for those tRCTs.
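The calculations in this section can be sketched as follows. The sketch rests on three assumptions: the 0.5 standard deviation threshold sits half a baseline standard deviation from the baseline mean in the "worse" direction, the standard error of the log relative risk is the usual large-sample one, and pooling uses the DerSimonian-Laird estimator as one common inverse variance weighted random effects method. The protocol does not name a specific estimator, and all function names and trial counts here are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def proportion_worse(fu_mean, fu_sd, base_mean, base_sd, higher_is_worse=True):
    """Share of a normal follow-up distribution beyond the 0.5 SD threshold.

    Assumption: threshold = baseline mean +/- 0.5 * baseline SD.
    """
    if higher_is_worse:
        threshold = base_mean + 0.5 * base_sd
        return 1.0 - normal_cdf((threshold - fu_mean) / fu_sd)
    threshold = base_mean - 0.5 * base_sd
    return normal_cdf((threshold - fu_mean) / fu_sd)


def log_relative_risk(events_t, n_t, events_c, n_c):
    """Log relative risk and its large-sample standard error."""
    log_rr = math.log((events_t / n_t) / (events_c / n_c))
    se = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    return log_rr, se


def pool_random_effects(log_rrs, ses):
    """DerSimonian-Laird random effects pooling on the log RR scale."""
    k = len(log_rrs)
    w = [1.0 / s ** 2 for s in ses]
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_rrs)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_rrs))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-trial variance, truncated at zero; undefined for a single trial.
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_star = [1.0 / (s ** 2 + tau2) for s in ses]
    pooled = sum(wi * y for wi, y in zip(w_star, log_rrs)) / sum(w_star)
    return pooled, math.sqrt(1.0 / sum(w_star))


# Hypothetical non-truncated trials: (events_t, n_t, events_c, n_c).
trials = [(12, 100, 20, 100), (30, 250, 41, 248), (8, 60, 11, 62)]
estimates = [log_relative_risk(*t) for t in trials]
pooled, pooled_se = pool_random_effects([e[0] for e in estimates],
                                        [e[1] for e in estimates])
low = math.exp(pooled - 1.96 * pooled_se)
high = math.exp(pooled + 1.96 * pooled_se)
print(f"pooled RR = {math.exp(pooled):.2f} (95% CI {low:.2f} to {high:.2f})")
```

For a continuous outcome, `proportion_worse` would be applied to each arm, and the two proportions would then be treated like event rates when forming the relative risk.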
Thesepooled estimates of relative risks will be calculated usingan inverse variance weighted random effects model.We will graphically present the results in a scatterplot ofthe effect size (relative risk) of the tRCT (horizontal axis)against the pooled effect size of non-tRCTs (vertical axis).If the tRCT and non-tRCTs give similar results, the pointsExample to illustrate the process of judging similarity between a randomized clinical trial and the corresponding truncated ran-domized clinical trialFigur  2Example to illustrate the process of judging similarity between a randomized clinical trial and the correspond-ing truncated randomized clinical trial. Level 1 = Meets narrow criteria; Level 2 = Meets broad criteria; Level 3 = Meets broadest criteria; Level 4 = Does not meet criteria. * For differences in reviewer ratings of 1 level we will consider the broader similarity rating for the overall rating. ** Differences in reviewer ratings of 2 levels or greater will require adjudication by a Category Rating Reviewer 1 Rating Reviewer 2 Overall RatingPatients Level 2 Level 1          Level 2*Intervention Level 1 Level 3   Adjudication**Control Level 2 Level 4   Adjudication**Outcome Level 1 Level 1          Level 1 Page 6 of 10(page number not for citation purposes)third reviewer.Trials 2009, 10:49 http://www.trialsjournal.com/content/10/1/49should be scattered along the diagonal of the scatterplot;if the tRCTs overestimate treatment effects they should befound above the diagonal.We will also perform a z-test for each meta-analysis tolook for differences between the truncated and non-trun-cated RCTs for the pooled relative risks. As a summarymeasure we will calculate a ratio of relative risks for eachmeta-analysis as follows:We will plot the log(ratio of relative risks) and calculate anoverall log(ratio of relative risks) as an inverse variance-weighted average of the log(ratio of relative risks). 
These will be back transformed and the ratio of relative risk values will be plotted for presentational purposes. The log ratio of relative risks is defined as

\[
\log(\text{ratio of relative risks}) = \log\!\left(\frac{\text{relative risk of tRCT}}{\text{relative risk of pooled non-tRCTs}}\right) = \log(\text{relative risk of tRCT}) - \log(\text{relative risk of pooled non-tRCTs})
\]

To investigate possible predictors of treatment effect sizes in RCTs, we will perform a hierarchical (multi-level) regression analysis. Our model will have two levels: individual RCT (study) level and meta-analysis level. The dependent variable in this analysis will be the logarithm of the relative risk (logRR) for each study and we will investigate the associations of the logRR with characteristics of the individual studies. We will investigate five possible predictors. Our main predictor of interest is a variable that we will construct from two different study characteristics, the presence and quality of a stopping rule and whether or not the RCT was truncated early.

The rule for stopping early will be categorized as one of three possibilities: (i) a rigorous rule (published prior to the trial plan), (ii) a not-so-rigorous rule such as ad hoc rules developed during the trial, (iii) no rule or unknown. Each of these three possibilities will be combined with whether or not the trial stopped early, creating 6 categories in total. It is very likely that there will be fewer than six categories in our final analysis as it is quite conceivable that some of the scenarios will not occur. We will carry out post hoc comparisons of outcomes between these 6 groups, focusing on contrasts that highlight the effects of the rule and the "truncated study" variable, and their interaction, to the extent that the available data permit.

Other study-level characteristics that we will examine are the methodological quality (blinding of patients, caregivers, and outcome assessors, and allocation concealment), and the total number of events. At the meta-analysis level, the only variable in the model will be an indicator of the specific meta-analyses to which each study belongs.

We will look at the main effects of all the variables and the interaction between the rule/truncated variable and the other predictor variables. Each study will yield a summary statistic (logRR) and an associated variance. The variance will provide weights for a meta-regression to evaluate the determinants of the estimated treatment effect.

The multivariable regression described above will be performed on 5 different datasets based on different levels of a variable which we will call closeness. This variable will measure how similar the non-truncated trials in each meta-analysis are to the corresponding truncated trial(s) with regard to a) patient population, b) treatment arm, c) control arm, and d) outcome. For each of these four, we will categorize closeness into one of three levels: very close (termed as 'fits the narrow criteria' in the database), moderately close (termed as 'fits the broad criteria' in the database), and less close (termed as 'fits the broadest criteria' in the database). This judgement will be coded by 2 reviewers, and the level of agreement (kappa) checked. Each trial will then be categorized by its least close category over the four areas, which we will use to define our 5 different datasets. The datasets will be: 1) only trials that are "very close" in all domains; 2) trials with one or more "moderately close" domain, but no "less close" domains and not "very close" in all domains; 3) trials that are "less close" in at least one domain; 4) trials that are "very close" or "moderately close" in all domains (corresponds to 1) and 2) combined); and 5) all trials.

As discussed previously, we will conduct a further analysis in which the comparison is between the tRCT and the pooled estimate of all trials including the tRCT. If the tRCTs provide relatively small weights in the meta-analyses as a result of fewer events because of stopping early, the pooled estimate including the tRCTs may provide the least biased summary estimate.

Finally, we will compare possible methods for correcting the estimates from tRCTs for possible bias, in particular the use of Bayesian methods. The basic approach here is to use a conservative prior for trials (derived empirically from past trials in other areas; we will review such existing reviews [11-13]) and combine this information with the data from the tRCT to obtain a posterior estimate of effect. The weight will depend on the relative variance of the conservative prior and the tRCT: small trials will lead to an emphasis on the conservative prior whereas large trials will attach relatively greater importance to the observed data. We will calculate such Bayesian relative risks for each tRCT in our study. As for the simple tRCT estimate, we will graphically present the results for a visual comparison of the effect size (relative risk) of the truncated RCT(s) and the non-truncated RCTs. Based on previous simulation work [14] we would predict that the Bayesian estimates obtained will be closer to the meta-analysis findings.

Trials 2009, 10:49 http://www.trialsjournal.com/content/10/1/49

Discussion

The results of STOPIT-2 will extend earlier modeling studies and a systematic review of trials stopped early to provide a precise estimate of the extent of that bias as it plays out in the real world.
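As an illustration of the Bayesian correction described in the analysis plan, the following is a minimal sketch only, not the model the protocol commits to. It assumes a normal approximation on the log relative risk scale and a precision-weighted (normal-normal) update. The function names, the prior mean of 0 (relative risk of 1), the prior variance of 0.05, and the event counts are all hypothetical placeholders; the protocol states that the prior will be derived empirically from past trials.

```python
import math


def log_rr_with_variance(events_trt, n_trt, events_ctl, n_ctl):
    """Return the log relative risk and its large-sample variance."""
    log_rr = math.log((events_trt / n_trt) / (events_ctl / n_ctl))
    variance = 1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl
    return log_rr, variance


def shrink_toward_prior(log_rr, variance, prior_mean=0.0, prior_variance=0.05):
    """Precision-weighted normal-normal update on the log relative risk scale.

    prior_mean and prior_variance are hypothetical defaults chosen for
    illustration; they are not values taken from the protocol.
    """
    data_precision = 1 / variance
    prior_precision = 1 / prior_variance
    posterior_variance = 1 / (data_precision + prior_precision)
    posterior_mean = posterior_variance * (
        data_precision * log_rr + prior_precision * prior_mean
    )
    return posterior_mean, posterior_variance


def bayesian_relative_risk(events_trt, n_trt, events_ctl, n_ctl, **prior):
    """Back-transform the posterior mean log relative risk to a relative risk."""
    log_rr, variance = log_rr_with_variance(events_trt, n_trt, events_ctl, n_ctl)
    posterior_mean, _ = shrink_toward_prior(log_rr, variance, **prior)
    return math.exp(posterior_mean)


if __name__ == "__main__":
    # Two hypothetical trials with the same observed relative risk of 0.5.
    small = bayesian_relative_risk(10, 100, 20, 100)
    large = bayesian_relative_risk(100, 1000, 200, 1000)
    print(f"small trial: {small:.3f}, large trial: {large:.3f}")
```

Under these assumptions the small trial, which has the larger variance, is pulled further toward the prior than the large trial, which matches the weighting behaviour the protocol describes.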
Estimating the extent of bias and factors associated with the magnitude of bias will have implications for the design, conduct, and reporting of future clinical trials.

Strengths and limitations of our protocol

Strengths of our study protocol include a systematic and extensive literature search with the goal of compiling a comprehensive sample of RCTs stopped early for benefit. However, despite complementing our electronic literature search of multiple databases with hand searches and personal contacts, we may fail to identify relevant tRCTs as truncation is often not explicitly stated in trial abstracts or methods [15].

We will use sensitive search strategies to identify systematic reviews corresponding to tRCTs [6], and will undertake a labor-intensive process of judging eligibility of several thousand individual RCTs, each being assessed independently by two reviewers blinded to the RCT results. The blinding of trials prior to review is a particular strength of this study, limiting the potential for biased eligibility assessment on the basis of the magnitude of effect of the studies.

The accuracy of our results depends on the comprehensiveness of the searches in the systematic reviews we will include. We have set a relatively low threshold for inclusion of corresponding systematic reviews: that the review included a methods section and that the systematic search included at least MEDLINE. Publication bias threatens all systematic reviews, and many do not include a thorough search for unpublished trials. To the extent that publication bias exists, however, it will likely lead to overestimates of the effects found in the pooled non-truncated RCTs. We assume that publication bias is less likely to affect the tRCTs.
Thus, we anticipate that publication bias will yield an underestimate of the upward bias of the tRCTs relative to the non-tRCTs, and hence that our estimate of bias associated with truncation is likely to be conservative.

Despite objective criteria, when assessing the methodological quality of RCTs we are limited by the quality of the reporting of the trials.

A further strength of the study is the planned hierarchical analysis in which we link the estimates of treatment effect from tRCTs and non-tRCTs addressing the same question. Since the extent to which studies address the same question is a matter of judgment, the provision for a sensitivity analysis based on similarity of question will further strengthen the results.

Ethical and data monitoring implications

The results of this study may have a profound effect on the decision-making of data monitoring committees [16]. Data monitoring committees have an ethical obligation to ensure patients are offered effective treatment as soon as it is clear that an effective treatment is indeed available. To many people, this mandates a stopping rule that will ensure that any trial that shows an apparently large treatment effect at an early stage does not continue beyond a certain point. Data monitoring committees also have, however, an ethical obligation to future patients [14], who require precise and accurate data addressing patient-important outcomes to make optimal treatment choices. While there is growing awareness that stopping trials early for apparent benefit may lead to overestimated results [17], little is known about the magnitude and the determinants of bias that truncation introduces. The results of STOPIT-2 will provide valuable guidance to investigators, institutional review boards, funding agencies, and data monitoring committees regarding appropriate use of stopping rules in clinical trials.

Public Engagement in Science

The results of STOPIT-2 will impact on numerous issues of public interest. Investigators, patients and their advocates, institutional review boards, and funding agencies may have different but convergent interests in stopping a study as soon as an important difference between experimental and control group emerges. Increased impact and publicity may motivate investigators, journals, and funding agents. A commitment to promptly offer participants in the less favorable arm the better treatment choice may motivate investigators, patients and their advocates, and data monitoring committees. The opportunity to save research dollars by truncating a trial may motivate the funding agencies.

Although a recent extension to the CONSORT statement for abstracts highlights the importance of reporting if the trial has stopped earlier than planned and the reason for doing so [18], a number of observations suggest that investigators, journal editors, and clinical experts remain mostly unaware of the problematic inferences that may arise from truncated RCTs. Top journals continue to publish results of trials stopped early, but fail to require authors to note the early stopping in the abstract and to report details that would allow readers to carefully evaluate the decision to stop early [3]. The QUOROM guidelines for meta-analyses and systematic reviews of RCTs [19] recommend that authors describe potential biases in the review process, but systematic reviewers pay scant attention to the potential bias introduced by including in their meta-analyses RCTs that were stopped early for benefit [9]. Important professional organizations continue to issue recommendations on the basis of trials stopped early for benefit, including those that had reported very few endpoints and which therefore seem most likely to overestimate effects [20,21].

Empirical evidence of the magnitude and determinants of bias that truncation introduces will ensure that investigators, editors, authors of meta-analyses and guideline panels become appropriately cautious in their interpretation of trials stopped early. Ultimately, this will reduce the risk of prematurely translating unreliable study findings into clinical practice.

The findings of STOPIT-2 will influence recommendations on how to design and conduct RCTs and meta-analyses, and how to report their results, as summarized in the CONSORT [22] and QUOROM [19] guidelines. The results of the proposed study will also have implications for systematic reviews, including those published in the Cochrane Library. Grading the strength of recommendations and quality of evidence in clinical guidelines will also be influenced by the findings of STOPIT-2 [23]. Given the increasing frequency with which trials are stopped early, the results of STOPIT-2 will be of interest to numerous stakeholders including patients and their physicians, investigators, research ethics boards, funding agencies, journal editors, and policy makers.

Abbreviations

RCTs: Randomized clinical trials; tRCTs: Truncated randomized clinical trials due to stopping early for benefit; STOPIT: Study of Trial Policy of Interim Truncation; MESH: Medical Subject Heading; logRR: Logarithm of the relative risk; CONSORT: Consolidated Standards of Reporting Trials; QUOROM: Quality of Reporting of Meta-analyses; PICO: Patient population, intervention, control, outcome.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

All listed authors contributed to the design of this protocol.

Acknowledgements

We thank Amanda Bedard, Monica Owen, Michelle Murray, Shelley Anderson, and Deborah Maddock for their administrative assistance.
We are grateful to Ratchada Kitsommart and Chusak Okascharoen for their help with blinding of articles, and Luma Muhtadie, Kayi Li, and Amandeep Rai for their help with the literature search.

The study is funded by the British Medical Research Council. The funding agency had no role in the study design, in the writing of the manuscript, or in the decision to submit this or future manuscripts for publication.

Matthias Briel is supported by a scholarship for advanced researchers from the Swiss National Science Foundation (PASMA-112951/1) and the Roche Research Foundation. Heiner C. Bucher, Alain J. Nordmann, Regina Kunz, and Heike Raatz are supported by grants from Santésuisse and the Gottfried and Julia Bangerter-Rhyner-Foundation.

References

1. Schulz KF, Grimes DA: Multiplicity in randomised trials II: subgroup and interim analyses. Lancet 2005, 365:1657-1661.
2. Pocock S, White I: Trials stopped early: too good to be true? Lancet 1999, 353:943-944.
3. Montori VM, Devereaux PJ, Adhikari NK, Burns KE, Eggert CH, Briel M, Lacchetti C, Leung TW, Darling E, Bryant DM, Bucher HC, Schunemann HJ, Meade MO, Cook DJ, Erwin PJ, Sood A, Sood R, Lo B, Thompson CA, Zhou Q, Mills E, Guyatt GH: Randomized trials stopped early for benefit: a systematic review. JAMA 2005, 294:2203-2209.
4. Trotta F, Apolone G, Garattini S, Tafuri G: Stopping a trial early in oncology: for patients or for industry? Ann Oncol 2008, 19:1347-1353.
5. Pocock SJ: When (not) to stop a clinical trial for benefit. JAMA 2005, 294:2228-2230.
6. Montori VM, Wilczynski NL, Morgan D, Haynes RB: Optimal search strategies for retrieving systematic reviews from Medline: analytical survey. BMJ 2005, 330:68.
7. Goodman SN: Stopping at nothing? Some dilemmas of data monitoring in clinical trials. Ann Intern Med 2007, 146:882-887.
8. Goodman SN: Systematic reviews are not biased by results from trials stopped early for benefit. J Clin Epidemiol 2008, 61:95-96.
9. Bassler D, Ferreira-Gonzalez I, Briel M, Cook DJ, Devereaux PJ, Heels-Ansdell D, Kirpalani H, Meade MO, Montori VM, Rozenberg A, Schunemann HJ, Guyatt GH: Systematic reviewers neglect bias that results from trials stopped early for benefit. J Clin Epidemiol 2007, 60:869-873.
10. Norman GR, Sloan JA, Wyrwich KW: Interpretation of changes in health-related quality of life: the remarkable universality of half a standard deviation. Med Care 2003, 41:582-592.
11. Djulbegovic B, Lacevic M, Cantor A, Fields KK, Bennett CL, Adams JR, Kuderer NM, Lyman GH: The uncertainty principle and industry-sponsored research. Lancet 2000, 356:635-638.
12. Soares HP, Kumar A, Daniels S, Swann S, Cantor A, Hozo I, Clark M, Serdarevic F, Gwede C, Trotti A, Djulbegovic B: Evaluation of new treatments in radiation oncology: are they better than standard treatments? JAMA 2005, 293:970-978.
13. Joffe S, Harrington DP, George SL, Emanuel EJ, Budzinski LA, Weeks JC: Satisfaction of the uncertainty principle in cancer clinical trials: retrospective cohort analysis. BMJ 2004, 328:1463.
14. Pocock SJ, Hughes MD: Practical problems in interim analyses, with particular regard to estimation. Control Clin Trials 1989, 10:209S-221S.
15. Sydes MR, Altman DG, Babiker AB, Parmar MK, Spiegelhalter DJ: Reported use of data monitoring committees in the main published reports of randomized controlled trials: a cross-sectional study. Clin Trials 2004, 1:48-59.
16. Mueller PS, Montori VM, Bassler D, Koenig BA, Guyatt GH: Ethical issues in stopping randomized trials early because of apparent benefit. Ann Intern Med 2007, 146:878-881.
17. McCartney M: Leaping to conclusions. BMJ 2008, 336:1213-1214.
18. Hopewell S, Clarke M, Moher D, Wager E, Middleton P, Altman DG, Schulz KF: CONSORT for reporting randomized controlled trials in journal and conference abstracts: explanation and elaboration. PLoS Med 2008, 5:e20.
19. Moher D, Cook DJ, Eastwood S, Olkin I, Rennie D, Stroup DF: Improving the quality of reports of meta-analyses of randomised controlled trials: the QUOROM statement. Quality of Reporting of Meta-analyses. Lancet 1999, 354:1896-1900.
20. Palda VA, Detsky AS: Perioperative assessment and management of risk from coronary artery disease. Ann Intern Med 1997, 127:313-328.
21. Eagle KA, Berger PB, Calkins H, Chaitman BR, Ewy GA, Fleischmann KE, Fleisher LA, Froehlich JB, Gusberg RJ, Leppo JA, Ryan T, Schlant RC, Winters WL Jr, Gibbons RJ, Antman EM, Alpert JS, Faxon DP, Fuster V, Gregoratos G, Jacobs AK, Hiratzka LF, Russell RO, Smith SC Jr: ACC/AHA guideline update for perioperative cardiovascular evaluation for noncardiac surgery - executive summary: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines (Committee to Update the 1996 Guidelines on Perioperative Cardiovascular Evaluation for Noncardiac Surgery). J Am Coll Cardiol 2002, 39:542-553.
22. Moher D, Schulz KF, Altman DG: The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. Ann Intern Med 2001, 134:657-662.
23. Guyatt GH, Oxman AD, Vist GE, Kunz R, Falck-Ytter Y, Alonso-Coello P, Schunemann HJ: GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ 2008, 336:924-926.

