Incorporating external evidence in trial-based cost-effectiveness analyses: the use of resampling methods
Sadatsafavi, Mohsen; Marra, Carlo; Aaron, Shawn; Bryan, Stirling
Jun 3, 2014


METHODOLOGY  Open Access

Incorporating external evidence in trial-based cost-effectiveness analyses: the use of resampling methods

Mohsen Sadatsafavi (1,2,3,*), Carlo Marra (2), Shawn Aaron (4) and Stirling Bryan (3,5)

Abstract

Background: Cost-effectiveness analyses (CEAs) that use patient-specific data from a randomized controlled trial (RCT) are popular, yet such CEAs are criticized because they neglect to incorporate evidence external to the trial. A popular method for quantifying uncertainty in a RCT-based CEA is the bootstrap. The objective of the present study was to further expand the bootstrap method of RCT-based CEA for the incorporation of external evidence.

Methods: We utilize the Bayesian interpretation of the bootstrap and derive the distribution for the cost and effectiveness outcomes after observing the current RCT data and the external evidence. We propose simple modifications of the bootstrap for sampling from such posterior distributions.

Results: In a proof-of-concept case study, we use data from a clinical trial and incorporate external evidence on the effect size of treatments to illustrate the method in action. Compared to the parametric models of evidence synthesis, the proposed approach requires fewer distributional assumptions, does not require explicit modeling of the relation between external evidence and outcomes of interest, and is generally easier to implement.
A drawback of this approach is potential computational inefficiency compared to the parametric Bayesian methods.

Conclusions: The bootstrap method of RCT-based CEA can be extended to incorporate external evidence, while preserving its appealing features such as no requirement for parametric modeling of cost and effectiveness outcomes.

Keywords: Cost-benefit analysis, Bayes theorem, Clinical trial, Statistics-nonparametric

Background

Randomized controlled trials (RCTs), especially 'pragmatic' RCTs that measure the effectiveness of interventions in realistic settings, are an attractive opportunity to provide information on cost-effectiveness [1]. In the context of such a RCT, many aspects of treatment, from clinical outcomes to adverse events and costs, are measured at the individual level, which can be used to formulate an efficient policy based on cost-effectiveness principles. A growing number of trials incorporate economic endpoints at the design stage and there are established guidelines for conducting a cost-effectiveness analysis (CEA) alongside a RCT [2,3].

The statistic of interest in a CEA is the incremental cost-effectiveness ratio (ICER), which is defined as the difference in cost (ΔC) between two competing treatments over the difference in their health outcome (effectiveness) (ΔE). With patient-specific cost and health outcomes at hand, estimating the population value of the ICER from an observed sample becomes a classical statistical inference problem.
However, given the awkward statistical properties of cost data and some health outcomes such as quality-adjusted life years (QALYs), and issues around parametric inference on ratio statistics, many investigators choose resampling methods for quantifying the sampling variation around costs, health outcomes, and the ICER [4]. In parallel-arm RCTs, this can be performed by obtaining a bootstrap sample within each arm of the trial and calculating the mean cost and effectiveness within each arm from the bootstrap sample; repeating this step many times provides a random sample from the joint distribution of arm-specific cost and effectiveness outcomes. This sample can then be used to make inference on (such as calculate the confidence or credible interval for) the ICER [5].

Recently, such a framework for evaluating the cost and outcomes of health technologies has received some criticism [6-8]. Specifically, critics argue that making decisions on the cost-effectiveness of competing treatments should be based on all the available evidence, not just that obtained from a single RCT [8].

* Correspondence: msafavi@mail.ubc.ca
1 Institute for Heart and Lung Health, Faculty of Medicine, University of British Columbia, 317 – 2194 Health Sciences Mall (Woodward Instructional Resource Centre), Vancouver, Canada V6T 1Z3
2 Collaboration for Outcomes Research and Evaluation, Faculty of Pharmaceutical Sciences, University of British Columbia, 2405 Wesbrook Mall, Vancouver, Canada V6T 1Z3
Full list of author information is available at the end of the article

Sadatsafavi et al. Trials 2014, 15:201
© 2014 Sadatsafavi et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.
In this context, evidence synthesis is the practice of combining multiple sources of evidence (from other RCTs, expert opinion, and case histories) in informing the treatment decision, a task that is quantitatively performed using Bayes' rule [9].

A conventional analysis of a clinical trial often involves making inference primarily on the effect size and secondarily on other aspects of treatment such as safety or compliance. These measures are conceptually distinct enough to be analyzed and reported separately, and trialists have a full arsenal of standard statistical methods at their grasp for such analyses. Evidence synthesis is often conducted separately, usually through quantitative meta-analysis, after the results of several studies are available. An economist, on the other hand, does not have the luxury of dissecting RCT results into different components, as cost-effectiveness is a function of all aspects of an intervention. As such, evidence external to the trial on any aspect of treatment has a bearing on the results of the CEA. In addition, when a RCT is used as a vehicle for the CEA, the incorporation of external evidence must be part of the analysis. Results of a CEA have direct policy implications and the economist cannot defer evidence synthesis to any subsequent stage [8].

For trial-based CEAs, if external evidence on cost or effectiveness is available then the investigator can use standard parametric Bayesian methods to combine this information with trial results [9]. This has been the dominant paradigm in the Bayesian analysis of RCT-based CEAs [10-14]. However, prior information on cost and typical effectiveness outcomes such as QALYs is rarely available, and when it is, it is often inappropriate to transfer to other settings [15,16]. This is because such outcomes are, to a large extent, affected by the specific settings of the jurisdiction in which they are measured (such as unit prices for medical resources).
On the other hand, evidence on the aspects of the intervention that relate to the pathophysiology of the underlying health condition and the biologic impact of treatment, such as the effect size of treatment or the rate of adverse events, is less affected by specific settings and is therefore more transferable [17]. This puts the investigator in a difficult situation for a RCT-based CEA: inference is made directly on cost and effectiveness using the observed sample, but evidence is available on some other aspects of treatment. One way to overcome this challenge is to create a parametric model to connect cost-effectiveness outcomes with the parameters for which external evidence is available, and use Bayesian analysis, for example through Markov Chain Monte Carlo (MCMC) sampling techniques [18]. But such a model must connect several parameters through link functions, regression equations, and error terms. This involves a multitude of parametric assumptions and there is always the danger of model misspecification [19,20]. In addition, even with the advent of generic statistical software for Bayesian analysis, implementing such a model and comprehensive model diagnostics is not an easy undertaking. For an investigator using resampling methods for the CEA who wishes to incorporate external evidence in the analysis, this paradigm shift to parametric modeling can be a challenge.

In this proof-of-concept study, we propose and illustrate simple modifications of the bootstrap approach for RCT-based CEAs that enable Bayesian evidence synthesis. Our proposed method requires a parametric specification of the external evidence while avoiding parametric assumptions on the cost-effectiveness outcomes and their relation with the external evidence. The remainder of the paper is structured as follows: after outlining the context, a Bayesian interpretation of the bootstrap is presented. Next, the theory of the incorporation of external evidence into such a sampling scheme is explained.
A case study featuring a real-world RCT is used to demonstrate the applicability and face validity of the proposed method. A discussion section on the various aspects of the new method and its strengths and weaknesses compared to parametric approaches concludes the paper.

Methods

Context

Let θ = {θi, θe} be the set of parameters to be estimated from the data of a RCT and some external evidence. It consists of two subsets: θi, the parameter(s) of interest for which there is no external evidence, and θe, some parameters for which external evidence is available. Typically, θi includes cost and effectiveness outcomes, and θe consists of some biological measures of treatment such as treatment effect. Let D represent the individual-level data of the current parallel-arm RCT, fully available to the investigator. We assume the population of interest for inference is the same as the population from which D is obtained, a fundamental assumption in any RCT-based CEA.

The Bayesian bootstrap

In a Bayesian context, the problem of inference on θ from a sample D can be conceptualized as incorporating some prior information with the information provided by the data to obtain a posterior distribution for θ:

P(θ|D) ∝ π(θ) · P(D|θ)    (1)

omitting a normalizing constant which is a function of D, but not θ. Here π(θ) is our prior distribution on θ, P(D|θ) is the likelihood of the current data, and P(θ|D) is the posterior distribution having observed the trial data D. If prior and posterior distributions are from a parametric family indexed by a set of distribution parameters, then a fully parametric model can be used to draw inference on P(θ|D). However, one can perform such Bayesian inference non-parametrically: Rubin showed that if we assume a non-informative Dirichlet prior distribution for D itself (regardless of which parameter is to be estimated), then we can draw directly from P(θ|D) using a simple process called the Bayesian bootstrap [21].
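As a concrete illustration of a single Bayesian-bootstrap draw (a minimal sketch with made-up numbers, not the authors' supplementary R code), the observations are weighted with a Dirichlet(1, …, 1) probability vector and the statistic of interest is computed under those weights:

```python
import numpy as np

rng = np.random.default_rng(0)

def bayesian_bootstrap_mean(x, rng):
    """One Bayesian-bootstrap draw of the mean of x: weight the n
    observations with a Dirichlet(1, ..., 1) probability vector."""
    w = rng.dirichlet(np.ones(len(x)))  # random probability vector P = (p1, ..., pn)
    return float(np.sum(w * x))

# Hypothetical per-patient costs from one trial arm (illustrative only)
costs = np.array([1200.0, 950.0, 3100.0, 780.0, 2050.0])
draws = [bayesian_bootstrap_mean(costs, rng) for _ in range(5000)]
# `draws` approximates the posterior distribution of the arm's mean cost.
```

Each draw is a weighted mean, so it always lies between the smallest and largest observation; repeating the draw many times traces out the posterior of the mean.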
In the Bayesian bootstrap of a dataset D consisting of n independent observations, a probability vector P = (p1, …, pn) is generated by randomly drawing from Dirichlet(n; 1, …, 1). The probability distribution that puts the mass of pi on the ith observation in D can be considered a random draw from the 'distribution of the distribution' that has generated D. Let D* represent a bootstrapped sample of D generated in this way; then, according to the argument made above, θ*, the value of θ measured in this sample, is a random draw from P(θ|D) [21].

Ordinary bootstrap as an approximation of the Bayesian bootstrap

The process of the ordinary bootstrap can also be seen as assigning a probability vector to the data, except that the probability vector is generated from the scaled multinomial distribution [22]. Such a process does not mathematically correspond to formal Bayesian inference. Nevertheless, the similarity in both operation and results to the Bayesian bootstrap has led some investigators to interpret the ordinary bootstrap in a Bayesian way [23]. For example, the widely popular non-parametric imputation of missing data uses the ordinary bootstrap as an approximation to the Bayesian bootstrap [22,24]. Indeed, it has already been shown that the ordinary and Bayesian bootstrap methods generate very similar results in non-parametric value-of-information analysis of RCT data [21]. Given this, for the rest of this work we use the Bayesian and ordinary bootstraps interchangeably.

CEA without the incorporation of external evidence

In a CEA in which we do not intend to incorporate any external evidence, the quantity of interest for inference is P(θ|D). As described in the previous section, a sample from this quantity can be obtained using a simple resampling algorithm:

1 For i = 1,…,M, where M is the number of bootstraps:
   a. Generate D*, a (Bayesian) bootstrap sample, with bootstrapping performed within each arm of the trial.
   b. Calculate θ* from D*.
2 Store the value of θ* and jump to 1.

This approach generates M random draws from the posterior distribution of θ having observed the RCT data. This is indeed the widely popular bootstrap method of RCT-based CEA [4]. An estimator for the ICER from the bootstrapped data can be obtained by calculating the ratio of the mean cost over the mean effectiveness from the bootstrap samples [4]. Various methods can be used to construct a credible interval from the bootstrapped samples around this value [4,25]. These samples can also be used to present uncertainty in the form of a cost-effectiveness plane or cost-effectiveness acceptability curve (CEAC) [26].

Incorporating external evidence

Let De be some external data providing evidence on θe. While the external data is not fully available to the investigator, evidence is available most typically in the form of the external likelihood P(De|θe), for example, recovered from the reported maximum likelihood estimate and confidence bounds of treatment effect from a previously published study. We require D and De to be independent samples. This is a typical and fundamental assumption in evidence synthesis, for example in the meta-analysis of treatment effect from multiple trials. By our definition of θi and θe, we know that the external likelihood only provides information on θe (the information on θi is either not collected or is not reported by the investigators of the external study). As such, the external likelihood is a marginal likelihood for θe and hence is not a function of θi. We also note that sometimes external evidence is obtained through a more subjective process, such as elicitation of expert opinion.
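The two-step algorithm above can be sketched as follows (simulated data; the arm sizes, cost and QALY distributions, and function names are illustrative, not from any real trial):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-patient cost and QALY data for two trial arms
arm1 = {"cost": rng.gamma(2.0, 1500.0, 150), "qaly": rng.normal(0.70, 0.05, 150)}
arm2 = {"cost": rng.gamma(2.0, 2000.0, 150), "qaly": rng.normal(0.72, 0.05, 150)}

def bootstrap_cea(arm_a, arm_b, m, rng):
    """Ordinary bootstrap within each arm; returns m draws of (dC, dE),
    the incremental cost and effectiveness of arm_b versus arm_a."""
    out = np.empty((m, 2))
    for i in range(m):
        means = []
        for arm in (arm_a, arm_b):
            n = len(arm["cost"])
            idx = rng.integers(0, n, n)  # resample patients with replacement
            means.append((arm["cost"][idx].mean(), arm["qaly"][idx].mean()))
        out[i] = (means[1][0] - means[0][0], means[1][1] - means[0][1])
    return out

draws = bootstrap_cea(arm1, arm2, 2000, rng)
icer = draws[:, 0].mean() / draws[:, 1].mean()  # mean dC over mean dE

def ceac_at(lam):
    """Probability that arm2 is cost-effective versus arm1 at
    willingness-to-pay lam per QALY (net-benefit criterion)."""
    return float(np.mean(lam * draws[:, 1] - draws[:, 0] > 0.0))
```

The ICER estimator and the CEAC here follow the paper's definitions: the ratio of mean incremental cost to mean incremental effectiveness, and the proportion of draws with positive net benefit at each willingness-to-pay value.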
In such cases, De becomes an abstract entity and P(De|θe) can be seen as a 'weight' function representing the degree of plausibility of θe against external knowledge.

In the presence of external data De, the quantity of interest is P(θ|D, De), which can be expanded, through three steps, as:

P(θi, θe|D, De) ∝ π(θi, θe) · P(D, De|θi, θe)
               ∝ π(θi, θe) · P(D|θi, θe) · P(De|θi, θe)
               ∝ P(θ|D) · P(De|θe)    (2)

In the above derivation, the first step applies Bayes' rule; the second step factorizes the likelihood given the independence of the external and current data; and the third step is based on the fact that the external data provides no information about θi (that is, P(De|θi, θe) is not a function of θi), so the likelihood term P(De|θi, θe) is reduced to P(De|θe).

Sampling from the posterior distribution

Suppose that a random sample can be generated from an 'easy' distribution g, but we are actually interested in obtaining a sample from a 'difficult' distribution h. How can we use the samples from g to obtain samples from h? Two popular methods for converting samples from g to h are rejection sampling [27] and importance sampling [28]; both are based on applying weights proportional to the density ratio h/g to each observation from g. In the present context, g = P(θ|D) and h = P(θ|D, De); the weights are, according to Equation 2, proportional to P(De|θe). That is, to obtain samples from P(θ|D, De), each θ*, as a sample from P(θ|D) obtained through bootstrapping, needs to be weighted by P(De|θe*). To operationalize this, we propose two approaches based on rejection and importance sampling schemes. The reader can refer to Smith and Gelfand for an elegant elaboration on these two sampling schemes (along with the derivations) [27].

Rejection sampling

In this scheme, each D*, the entire bootstrap sample of the RCT data, is accepted with a probability that is proportional to P(De|θe*), the weight of θe obtained from D*.
This results in the following algorithm:

1 For i = 1,…,M, where M is the desired size of the sample:
   a. Generate D*, a (Bayesian) bootstrap sample of D, with bootstrapping performed separately within each arm of the trial.
   b. Calculate the parameters θ* = {θi*, θe*} from this sample.
   c. Calculate P* = P(De|θe*), the weight of θe* according to the external evidence.
   d. Randomly draw u from a uniform distribution on the interval [0,1]. If u > P*, then ignore the bootstrap sample and jump to step a.
2 Store the value of θ* and jump to 1.

This approach generates M random draws from the posterior distribution of θ having observed the RCT data and the external evidence. All the subsequent steps of the CEA, such as calculating the average cost and effectiveness outcomes, interval estimation, and drawing the cost-effectiveness plane and the CEAC, remain unchanged. Of note, this algorithm requires that the P* be valid probabilities bounded between 0 and 1. As such, the external likelihood should be scaled (e.g., divided by its maximum over θe).

Importance sampling

As an alternative to probabilistically accepting or rejecting bootstrap samples, one can assign the weights directly to each bootstrap sample [27]. That is, one proceeds by obtaining a desired number of bootstraps, calculating θe* in each sample, and assigning a weight proportional to P(De|θe*) to each bootstrap. All subsequent calculations incorporate such weights (for example, the ICER will be the ratio of the weighted mean of costs over the weighted mean of effectiveness).

Regularity conditions

Fundamental to the proposed sampling scheme is that the joint likelihood of D and De can be factorized into two independent likelihoods. The onus is on the investigator to ensure this condition is satisfied, at least to a good approximation. This can be context-specific.
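Both schemes can be sketched on a toy scalar example (all names and numbers here are hypothetical; the external likelihood is scaled so its maximum is 1, as the rejection scheme requires):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical external evidence on a scalar parameter theta_e
# (e.g., a log rate ratio), expressed as a normal likelihood.
MU, SIGMA = -0.5, 0.4

def external_weight(theta_e):
    """External likelihood scaled so its maximum is 1 (at theta_e == MU),
    making it a valid acceptance probability."""
    return np.exp(-(theta_e - MU) ** 2 / (2.0 * SIGMA ** 2))

def rejection_sample(bootstrap_draws, rng):
    """Keep each bootstrap draw (theta_i, theta_e) with probability equal
    to the scaled external likelihood of its theta_e."""
    return [(ti, te) for ti, te in bootstrap_draws
            if rng.uniform() <= external_weight(te)]

def importance_mean(bootstrap_draws):
    """Posterior mean of theta_i via importance weights, with no
    accept/reject step."""
    ti = np.array([t for t, _ in bootstrap_draws])
    w = external_weight(np.array([t for _, t in bootstrap_draws]))
    return float(np.sum(w * ti) / np.sum(w))

# Toy bootstrap draws in which theta_i happens to equal theta_e:
draws = [(t, t) for t in rng.normal(0.0, 0.5, 4000)]
```

On this toy example both estimators pull the posterior mean from the bootstrap center (0.0) toward the external evidence (MU), as the theory predicts; the two schemes differ only in whether the weight is applied probabilistically or arithmetically.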
A few scenarios that violate this assumption are when D and De have overlapping samples, when De is an estimate from a meta-analysis of studies that included the current study D, or when De represents experts' opinion about treatment effect if their opinion is already influenced by the results of the current study (hindsight bias [29]).

In addition, the general regularity conditions required for rejection and importance sampling should hold [27]. In particular, since P(θ|D) is most often continuous (or, for the regular bootstrap, takes many discrete values), the external likelihood P(De|θ) should also be continuous; otherwise the chance of samples from P(θ|D) hitting non-zero areas of P(De|θ) will be infinitesimally small. Next, θe should be identifiable (unique) within each D*. This assumption holds for the most typical forms of external evidence such as rates or measures of relative risk [30]. Further, P(De|θ) should be bounded. If P(De|θ) has an infinite maximum, for example if it is proportional to the density function of a beta distribution with either of its parameters less than one, the proposed sampling schemes might fail. Such distributions are, however, mainly used as non-informative priors and seldom represent external evidence in realistic scenarios. On the other hand, mixed-type distributions, such as the so-called lump-and-smear priors that put point mass on the value of the parameter consistent with the null hypothesis ([31] page 161), have unbounded density functions and cannot readily be used in the proposed sampling methods.

We used data from a real-world RCT to show the practical aspects of implementing the proposed algorithms. Ethics approval was obtained from the Ottawa Hospital Research Ethics Board (#2002623-01H) and the Vancouver Coastal Health Authority (#C03-0275).

Results

An illustrative example

This case study is intended to demonstrate the operational aspects of implementing the algorithm; it is not intended to be an exercise in comprehensive evidence synthesis to inform policy.

The case study is based on the OPTIMAL trial, a multicenter study evaluating the benefits of combination pharmacological therapy in preventing respiratory exacerbations in patients with chronic obstructive pulmonary disease (COPD) [32,33]. Pharmacological treatment of COPD, typically with inhaled medications, is often required to keep the symptoms under control and reduce the risk of exacerbations. Sometimes patients receive combinations of treatments of different classes in an attempt to bring the disease under control. However, there is a lack of evidence on whether such combination therapies are effective. The OPTIMAL trial was designed to estimate the comparative efficacy and cost-effectiveness of single and combination therapies in COPD. It included 449 patients randomized into three treatment groups: T1, monotherapy with an inhaled anticholinergic (tiotropium, N = 156); T2, double therapy with an inhaled anticholinergic plus an inhaled beta-agonist (tiotropium + salmeterol, N = 148); and T3, triple therapy with an inhaled anticholinergic, an inhaled beta-agonist, and an inhaled corticosteroid (tiotropium + fluticasone + salmeterol, N = 145). The primary outcome measure of the RCT was the proportion of patients who experienced at least one respiratory exacerbation by the end of the follow-up period (52 weeks). This outcome was not significantly different across the three arms: the odds ratio (OR) for the risk of having at least one exacerbation by the end of the follow-up period was 1.03 (95% CI, 0.63 to 1.67) for T2 versus T1 and 0.84 (95% CI, 0.47 to 1.49) for T3 versus T1 (a lower OR indicates a better outcome).
Because the T2 arm in the OPTIMAL trial was dominated (associated with higher costs and worse effectiveness outcomes) in the original CEA, and for the sake of brevity, in this case study we restrict the analysis to a comparison between T3 and T1.

Details of the original CEA are reported elsewhere [34]. Data on both resource use and quality of life were collected at the individual level during the trial and were used to carry out the CEA. The main outcome of the CEA was the incremental cost per QALY gained for T3 versus T1 (that is, the difference in mean costs over the difference in mean QALYs). Since individual-level resource use and effectiveness outcomes were available, the CEA was based on direct inference on their distribution. No external information was incorporated in the original CEA.

External evidence

The set of parameters with external evidence in this analysis (θe) consists of one quantity: the logarithm of the rate ratio (RR) of exacerbations between T3 and T1 (denoted by θT3,T1) within the follow-up period. We used a formal process for evidence synthesis by performing a MEDLINE search for all clinical trials as well as systematic reviews on the treatment effect of combination pharmacotherapies for COPD. In synthesizing evidence, we assumed a 'class effect' for the study medications, in line with conventional wisdom and several pharmacoepidemiology studies evaluating such medications in COPD [35-37]. The most relevant source of evidence on the effect size of T3 versus T1 was from a RCT comparing budesonide (in the same class as fluticasone) and formoterol added to tiotropium versus tiotropium alone in COPD patients [38]. This study reported a RR of 0.38 (95% CI 0.25 to 0.57). The evidence was parameterized by using normal likelihoods on the log-RR scale.
When transferring evidence from one setting to another, it is important to consider the likely presence of between-study variation (due to differences in inclusion criteria, treatment protocol, measurements, and so on) [39]. Because only one study on this comparison was at hand, no estimate for between-study variation could be obtained. As such, we used the estimated between-study variance of 0.01783 from the multiple-treatment comparison of COPD treatments (personal communication with the author K Thorlund) [35]. This results in the external evidence being associated with a RR of 0.38 (95% CI 0.24 to 0.59); thus:

log RR ~ Normal(μ, σ); μ = −0.968, σ = 0.246    (3)

with μ and σ corresponding to the mean and standard deviation of the normal distribution. We note that the uncertainty around the log-RR from the external evidence, represented by the above probability distribution, stems from two sources: the finite sample of the external study, and our assumption on between-study variability. Overall, the RR representing the external evidence is much more in favor of combination therapy than the RR observed in the OPTIMAL trial. As such, we a priori expect that the incorporation of external evidence shall improve the cost-effectiveness outcomes in favor of T3.

Putting all these together, the external evidence can be parameterized as:

P(De|θ) ∝ exp(−(θT3,T1 − μ)² / (2σ²)) ∝ exp(−(θT3,T1 + 0.968)² / 0.121)    (4)

a normal likelihood function representing our knowledge on the treatment effect. The original algorithm for the CEA can now be updated to incorporate the external evidence as follows (using the rejection sampling scheme):

1 For i = 1,2,…,M:
   a. Generate D*, a (Bayesian) bootstrap sample within each of the three arms of the RCT.
   b. Impute the missing values in costs, utilities, and exacerbations in D*.
   c. Calculate θ*T3,T1, the log(RR) of exacerbation during the follow-up period for T3 versus T1, from the bootstrapped sample.
   d. Calculate P* = P(De|θ*T3,T1) using the distribution constructed for the external evidence.
   e. Randomly draw u from a uniform distribution on the interval [0,1]. If u > P*, then ignore the bootstrapped sample and jump to step a.
   f. Calculate mean costs, exacerbations, and QALYs for each arm from D*.
2 Store the average values for costs, exacerbation rates, and QALYs; then jump to 1.

The simulation was stopped after 10,000 accepted bootstraps for the rejection sampling method incorporating the external evidence were generated. To obtain the results using the importance sampling method, we used the same set of bootstraps generated in the above algorithm, including all the accepted and rejected bootstraps.

In addition to the ICER, we also report the expected values of the cost and health outcomes for each trial arm, and plot the CEAC, without and with the incorporation of the external evidence. The CEAC between two treatments is the probability that a treatment is cost-effective compared to another at a given value of the decision-maker's willingness-to-pay (λ) for one unit of the health outcome [26]. The statistical code for this case study is provided in Additional file 1.

Results of the case study

Table 1 presents the expected costs and QALYs for the T1 and T3 arms of the OPTIMAL trial without and with the incorporation of the external evidence. The Bayesian and ordinary bootstraps generated very similar results (Table 1). Similarly, results from the rejection and importance sampling methods were very similar (results not shown).

As this table demonstrates, the incorporation of external evidence shifted the outcomes of the T3 arm in the favorable direction (lower costs and higher QALYs), and shifted the outcomes of the T1 arm in the opposite direction. This is an expected finding given the strong evidence in favor of T3 for the effect size of T3 versus T1 from the external source.

The impact of incorporating external evidence is more evident on the ICER.
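The scaled acceptance weight implied by Equations 3 and 4 can be written in a few lines (a sketch consistent with those equations; it is not the authors' Additional file 1 R code, and the trial-side value 0.84 is the OPTIMAL odds ratio used here only as an illustrative input):

```python
from math import exp, log

# External evidence on the log rate ratio (Equations 3 and 4):
# RR = 0.38 (95% CI 0.24 to 0.59) after adding between-study variance.
MU, SIGMA = -0.968, 0.246

def evidence_weight(log_rr):
    """Normal external likelihood on the log-RR scale, scaled so that its
    maximum is 1 (at log_rr == MU), as rejection sampling requires."""
    return exp(-(log_rr - MU) ** 2 / (2.0 * SIGMA ** 2))

# A bootstrap replicate matching the external point estimate is always
# accepted; one near the OPTIMAL estimate for the binary outcome (0.84)
# is accepted only rarely.
w_external = evidence_weight(log(0.38))  # close to 1
w_trial = evidence_weight(log(0.84))     # well below 0.01
```

This makes concrete why the rejection scheme discards most bootstrap replicates whose effect size disagrees with the external evidence, which in turn drives the posterior shift reported in Table 1.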
The ICER of T3 versus T1 decreased by 52% after the incorporation of the external evidence. Again, this reflects the fact that the external evidence is more in favor of T3 than the likelihood (RCT data) is.

Figure 1 presents the results of incorporating external evidence on the CEAC (using the Bayesian bootstrap). The incorporation of external evidence increased the probability of cost-effectiveness for T3, especially at higher willingness-to-pay (λ) values. Without the incorporation of external evidence, the probability of T3 being cost-effective compared to T1 reached the 50% threshold at λ values greater than $240,000/QALY, while the incorporation of the external evidence moved this threshold to $115,000/QALY.

Discussion

Contemporarily, when an economic evaluation is conducted alongside a single RCT, the practice of evidence synthesis is not an integral part of the analysis. In our opinion, this is partly because parametric Bayesian modeling, the hitherto only available method, results in complex statistical models. In this work we propose simple and intuitive algorithms for the incorporation of external evidence in RCT-based CEAs that use bootstrapping to draw inference.

Table 1 Outcomes of the OPTIMAL CEA without and with the incorporation of external evidence*

No external evidence
                              T1                T3                Difference (T3 − T1)   ICER
Bayesian bootstrap  Costs     2649 (466)        4074 (547)        1425 (721)             250,329
                    QALY      0.7071 (0.0075)   0.7128 (0.0093)   0.0057 (0.0087)
Ordinary bootstrap  Costs     2650 (467)        4077 (551)        1427 (721)             251,171
                    QALY      0.7071 (0.0075)   0.7128 (0.0093)   0.0057 (0.0087)

With external evidence
Bayesian bootstrap  Costs     2753 (492)        3959 (510)        1205 (709)             121,260
                    QALY      0.7053 (0.0074)   0.7152 (0.0092)   0.0099 (0.0085)
Ordinary bootstrap  Costs     2742 (477)        3966 (536)        1225 (709)             126,387
                    QALY      0.7054 (0.0074)   0.7151 (0.0092)   0.0098 (0.0084)

*Results are mean (standard deviation). ICER, incremental cost-effectiveness ratio; QALY, quality-adjusted life year.
Rejection and importance sampling, which form the basis of the proposed method, are popular paradigms in which sampling from a 'difficult' distribution is replaced by sampling from a proposal (or instrumental) distribution [40]. Here, sampling from P(θ|D, De) is performed via P(θ|D), and the latter can easily be sampled through (Bayesian) bootstrapping.

In synthesizing evidence for RCT-based CEAs, a carefully crafted parametric model with comprehensive analysis of model convergence and of the sensitivity of results to parametric assumptions has indisputable strengths over resampling approaches, including the higher computational efficiency of MCMC or likelihood-based methods and the ability to synthesize and propagate all evidence in a single analytical framework [41,42]. Nevertheless, important advantages make the proposed resampling methods a competitive option. The proposed methods are intuitive and easy extensions of the popular bootstrap method of RCT-based CEAs; they do not require specialist software or in-depth content expertise for implementation. In addition to such practical advantages, the proposed resampling methods connect the parameters for which external evidence is available to the cost and effectiveness outcomes without an explicit model, which is a requirement in parametric Bayesian approaches.

Our paper provides a conceptual framework, and further research into the theory, as well as the practical issues in using this method, should follow. The apparent simplicity of the bootstrap may conceal the assumptions being made, especially with small datasets [21,43].
Furthermore, if the external evidence and the RCT data differ substantially in the information they provide (that is, if the prior and data are in conflict) [44], or when there are multiple parameters for which external evidence is available, the sampling methods will become inefficient.

Further research is needed to improve sampling efficiency and to incorporate external evidence in other paradigms such as cluster or crossover RCTs. Importantly, the theoretical construct of the proposed method does not necessarily restrict it to RCT-based CEAs. A similar concept can be used to reconcile evaluations based on observational data with external evidence. This will inevitably invoke questions about the applicability of different metrics of the effect size in non-randomized studies (for example, average treatment effect versus average treatment effect for the treated), and the validity of the bootstrap as the sampling method (for example, in a propensity-score-matched cohort). In addition, further empirical research is required to evaluate the real-world applicability and feasibility of the method and to demonstrate its comparative performance against conventional methods of evidence synthesis (for example, parametric Bayesian analysis using MCMC).

Figure 1 Cost-effectiveness acceptability curve (CEAC) without and with the incorporation of external evidence. The horizontal grey line represents the 50% threshold on the probability of cost-effectiveness.

This paper deliberately stays away from the debate on whether to incorporate external evidence in a given situation and focuses on the 'how to' question. The 'whether to' question is context-specific and great care is required for the sensible use of external evidence in each setting.
For the case study, for example, the substantial discrepancy in the results between the external and current RCTs (with regard to the efficacy of triple therapy versus monotherapy) should, more than anything, generate misgivings about the suitability of borrowing evidence from that external source. However, the case study was undertaken as a step in the direction of proof of concept, applicability, and face validity of the proposed methods. This is not a withdrawal from the deep considerations required for sensible evidence synthesis.

Conclusions

Faced with the escalating costs of RCTs and the requirement by many decision-making bodies for formal economic evaluation of emerging health technologies, trialists and health economists are hard-pressed to generate as much relevant information for policymakers as possible. As such, and despite criticisms, it appears that RCT-based CEAs are here to stay. The incorporation of external evidence helps optimize adoption decisions. Aside from their theoretical contribution, if their real-world applicability is proven, the proposed methods can provide the large camp of analysts using the bootstrap for RCT-based CEAs with a statistically sound, easily implementable tool for this purpose.

Additional file

Additional file 1: File name: R code.r. Description: This is the R code used for the case study.

Abbreviations

CEA: Cost-effectiveness analysis; CEAC: Cost-effectiveness acceptability curve; COPD: Chronic obstructive pulmonary disease; ICER: Incremental cost-effectiveness ratio; MCMC: Markov chain Monte Carlo; OR: Odds ratio; RCT: Randomized controlled trial; RR: Rate ratio; QALY: Quality-adjusted life year.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

This work was part of MS's PhD research. MS developed the research question and the methodology. MS and SB designed the case study. CM and SA helped with the acquisition of the data and provided content advice for the case study.
MS performed the computer simulations. MS and SB developed the first draft of the manuscript. All authors critically revised the manuscript and approved the final version.

Acknowledgments

This study was part of MS's PhD research, which was funded by a graduate fellowship award from the Canadian Institutes of Health Research. The authors would like to thank Dr Craig Mitton (University of British Columbia) and Dr Lawrence McCandless (Simon Fraser University) for their valuable advice, and Ms Stephanie Harvard and Ms Jenny Leese for editorial assistance.

Author details

1Institute for Heart and Lung Health, Faculty of Medicine, University of British Columbia, 317 – 2194 Health Sciences Mall (Woodward Instructional Resource Centre), Vancouver, Canada V6T 1Z3. 2Collaboration for Outcomes Research and Evaluation, Faculty of Pharmaceutical Sciences, University of British Columbia, 2405 Wesbrook Mall, Vancouver, Canada V6T 1Z3. 3Centre for Clinical Epidemiology and Evaluation, Vancouver Coastal Health Institute, University of British Columbia, 7th Floor, 828 West 10th Avenue, Research Pavilion, Vancouver, Canada V5Z 1M9. 4Faculty of Health Sciences, University of Ottawa, 451 Smyth Road, Ottawa, Canada K1H 8M5. 5School of Public and Population Health, University of British Columbia, 2206 East Mall, Vancouver, Canada V6T 1Z3.

Received: 25 September 2013  Accepted: 19 May 2014  Published: 3 June 2014

References

1. Drummond M: Introducing economic and quality of life measurements into clinical studies. Ann Med 2001, 33:344–349.
2. Glick H, Doshi J, Sonnad S, Polsky D: Economic Evaluation in Clinical Trials. New York: Oxford University Press; 2007.
3. Ramsey S, Willke R, Briggs A, Brown R, Buxton M, Chawla A, Cook J, Glick H, Liljas B, Petitti D, Reed S: Good research practices for cost-effectiveness analysis alongside clinical trials: the ISPOR RCT-CEA Task Force report. Value Health 2005, 8:521–533.
4. Briggs A, Wonderling D, Mooney C: Pulling cost-effectiveness analysis up by its bootstraps: a non-parametric approach to confidence interval estimation. Health Econ 1997, 6:327–340.
5. Drummond M, O'Brien B, Stoddart G, Torrance G: Methods for the Economic Evaluation of Health Care Programmes. United Kingdom: Oxford University Press; 2005.
6. Buxton MJ, Drummond MF, Van Hout BA, Prince RL, Sheldon TA, Szucs T, Vray M: Modelling in economic evaluation: an unavoidable fact of life. Health Econ 1997, 6:217–227.
7. Brennan A, Akehurst R: Modelling in health economic evaluation. What is its place? What is its value? Pharmacoeconomics 2000, 17:445–459.
8. Sculpher M, Claxton K, Drummond M, McCabe C: Whither trial-based economic evaluation for health care decision making? Health Econ 2006, 15:677–687.
9. Spiegelhalter D, Freedman L, Parmar M: Bayesian approaches to randomized trials. Journal of the Royal Statistical Society Series A (Statistics in Society) 1994, 157:357–416.
10. O'Hagan A, Stevens JW, Montmartin J: Bayesian cost-effectiveness analysis from clinical trial data. Stat Med 2001, 20:733–753.
11. Briggs A: A Bayesian approach to stochastic cost-effectiveness analysis. An illustration and application to blood pressure control in type 2 diabetes. Int J Technol Assess Health Care 2001, 17:69–82.
12. Heitjan D, Moskowitz A, Whang W: Bayesian estimation of cost-effectiveness ratios from clinical trials. Health Econ 1999, 8:191–201.
13. Heitjan D, Li H: Bayesian estimation of cost-effectiveness: an importance-sampling approach. Health Econ 2004, 13:191–198.
14. Al M, Van Hout B: A Bayesian approach to economic analyses of clinical trials: the case of stenting versus balloon angioplasty. Health Econ 2000, 9:599–609.
15. O'Brien B: A tale of two (or more) cities: geographic transferability of pharmacoeconomic data. Am J Manag Care 1997, 3(Suppl):S33–S39.
16. Cook JR, Drummond M, Glick H, Heyse JF: Assessing the appropriateness of combining economic data from multinational clinical trials.
Stat Med 2003, 22:1955–1976.
17. Drummond M, Barbieri M, Cook J, Glick H, Lis J, Malik F, Reed S, Rutten F, Sculpher M, Severens J: Transferability of economic evaluations across jurisdictions: ISPOR Good Research Practices Task Force report. Value Health 2009, 12:409–418.
18. Lunn D, Thomas A, Best N, Spiegelhalter D: WinBUGS – a Bayesian modelling framework: concepts, structure, and extensibility. Statistics and Computing 2000, 10:325–337.
19. Mihaylova B, Briggs A, O'Hagan A, Thompson S: Review of statistical methods for analysing healthcare resources and costs. Health Econ 2011, 20:897–916.
20. Thompson S, Nixon R: How sensitive are cost-effectiveness analyses to choice of parametric distributions? Med Decis Making 2005, 25:416–423.
21. Rubin D: The Bayesian bootstrap. Ann Statist 1981, 9:130–134.
22. Rubin D: Multiple Imputation for Nonresponse in Surveys. New York: John Wiley; 1987.
23. Lo A: A large sample study of the Bayesian bootstrap. Ann Statist 1987, 15:360–375.
24. Schafer J: Multiple imputation: a primer. Statistical Methods in Medical Research 1999, 8:3–15.
25. Polsky D, Glick HA, Willke R, Schulman K: Confidence intervals for cost-effectiveness ratios: a comparison of four methods. Health Econ 1997, 6:243–252.
26. Fenwick E, Claxton K, Sculpher M: Representing uncertainty: the role of cost-effectiveness acceptability curves. Health Econ 2001, 10:779–787.
27. Smith A, Gelfand A: Bayesian statistics without tears: a sampling–resampling perspective. The American Statistician 1992, 46:84–88.
28. Von Neumann J: Various techniques used in connection with random digits. Nat Bureau Stand Appl Math Ser 1951, 12:36–38.
29. Roese NJ, Vohs KD: Hindsight bias. Perspectives on Psychological Science 2012, 7:411–426.
30. Lehmann EL, Casella G: Theory of Point Estimation. New York: Springer; 1998.
31. Spiegelhalter D, Abrams K, Myles J: Bayesian Approaches to Clinical Trials and Health Care Evaluation. Chichester: John Wiley & Sons; 2004.
32. Aaron S, Vandemheen K, Fergusson D, FitzGerald M, Maltais F, Bourbeau J, Goldstein R, McIvor A, Balter M, O'Donnell D: The Canadian Optimal Therapy of COPD Trial: design, organization and patient recruitment. Can Respir J 2004, 11:581–585.
33. Aaron S, Vandemheen K, Fergusson D, Maltais F, Bourbeau J, Goldstein R, Balter M, O'Donnell D, McIvor A, Sharma S, Bishop G, Anthony J, Cowie R, Field S, Hirsch A, Hernandez P, Rivington R, Road J, Hoffstein V, Hodder R, Marciniuk D, McCormack D, Fox G, Cox G, Prins H, Ford G, Bleskie D, Doucette S, Mayers I, Chapman K, et al: Tiotropium in combination with placebo, salmeterol, or fluticasone-salmeterol for treatment of chronic obstructive pulmonary disease: a randomized trial. Ann Intern Med 2007, 146:545–555.
34. Najafzadeh M, Marra C, Sadatsafavi M, Aaron S, Sullivan S, Vandemheen K, Jones P, FitzGerald J: Cost effectiveness of therapy with combinations of long acting bronchodilators and inhaled steroids for treatment of COPD. Thorax 2008, 63:962–967.
35. Mills EJ, Druyts E, Ghement I, Puhan MA: Pharmacotherapies for chronic obstructive pulmonary disease: a multiple treatment comparison meta-analysis. Clin Epidemiol 2011, 3:107–129.
36. Ernst P, Gonzalez AV, Brassard P, Suissa S: Inhaled corticosteroid use in chronic obstructive pulmonary disease and the risk of hospitalization for pneumonia. Am J Respir Crit Care Med 2007, 176:162–166.
37. Spitzer WO, Suissa S, Ernst P, Horwitz RI, Habbick B, Cockcroft D, Boivin JF, McNutt M, Buist AS, Rebuck AS: The use of beta-agonists and the risk of death and near death from asthma. N Engl J Med 1992, 326:501–506.
38. Welte T, Miravitlles M, Hernandez P, Eriksson G, Peterson S, Polanowski T, Kessler R: Efficacy and tolerability of budesonide/formoterol added to tiotropium in patients with chronic obstructive pulmonary disease. Am J Respir Crit Care Med 2009, 180:741–750.
39. Ades A, Lu G, Higgins J: The interpretation of random-effects meta-analysis in decision models. Medical Decision Making 2005, 25:646–654.
40. Robert C, Casella G: Monte Carlo Statistical Methods. New York: Springer; 2004.
41. Cooper N, Sutton A, Abrams K, Turner D, Wailoo A: Comprehensive decision analytical modelling in economic evaluation: a Bayesian approach. Health Econ 2004, 13:203–226.
42. Ades A, Sculpher M, Sutton A, Abrams K, Cooper N, Welton N, Lu G: Bayesian methods for evidence synthesis in cost-effectiveness analysis. Pharmacoeconomics 2006, 24:1–19.
43. Beran R: The impact of the bootstrap on statistical algorithms and theory. Statistical Science 2003, 18:175–184.
44. Hoch J, Briggs A, Willan AR: Something old, something new, something borrowed, something blue: a framework for the marriage of health econometrics and cost-effectiveness analysis. Health Econ 2002, 11:415–430.

doi:10.1186/1745-6215-15-201
Cite this article as: Sadatsafavi et al.: Incorporating external evidence in trial-based cost-effectiveness analyses: the use of resampling methods. Trials 2014, 15:201.

