BMC Medical Research Methodology (BioMed Central)
Open Access
Research article

Comparison of Bayesian and classical methods in the analysis of cluster randomized controlled trials with a binary outcome: The Community Hypertension Assessment Trial (CHAT)

Jinhui Ma1, Lehana Thabane*1, Janusz Kaczorowski2, Larry Chambers3, Lisa Dolovich4, Tina Karwalajtys4 and Cheryl Levitt4

Address: 1Department of Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, Canada, 2Centre for Applied Health Research & Evaluation, University of British Columbia, Vancouver, British Columbia, Canada, 3Department of Epidemiology and Community Medicine, University of Ottawa, Ottawa, Ontario, Canada and 4Department of Family Medicine, McMaster University, Hamilton, Ontario, Canada

Email: Jinhui Ma - maj26@mcmaster.ca; Lehana Thabane* - thabanl@mcmaster.ca; Janusz Kaczorowski - janusz.kaczorowski@familymed.ubc.ca; Larry Chambers - lchamber@scohs.on.ca; Lisa Dolovich - ldolovic@mcmaster.ca; Tina Karwalajtys - karwalt@mcmaster.ca; Cheryl Levitt - clevitt@mcmaster.ca
* Corresponding author

Abstract
Background: Cluster randomized trials (CRTs) are increasingly used to assess the effectiveness of interventions to improve health outcomes or prevent diseases. However, the efficiency and consistency of different analytical methods for the analysis of binary outcomes have received little attention. We described and compared various statistical approaches in the analysis of CRTs using the Community Hypertension Assessment Trial (CHAT) as an example.
The CHAT study was a cluster randomized controlled trial aimed at investigating the effectiveness of pharmacy-based blood pressure clinics led by peer health educators, with feedback to family physicians (CHAT intervention) against the Usual Practice model (Control), on the monitoring and management of BP among older adults.

Methods: We compared three cluster-level and six individual-level statistical analysis methods in the analysis of binary outcomes from the CHAT study. The three cluster-level analysis methods were: i) un-weighted linear regression, ii) weighted linear regression, and iii) random-effects meta-regression. The six individual-level analysis methods were: i) standard logistic regression, ii) robust standard errors approach, iii) generalized estimating equations, iv) random-effects meta-analytic approach, v) random-effects logistic regression, and vi) Bayesian random-effects regression. We also investigated the robustness of the estimates after adjustment for cluster-level and individual-level covariates.

Results: Among all the statistical methods assessed, the Bayesian random-effects logistic regression method yielded the widest 95% interval estimate for the odds ratio and consequently led to the most conservative conclusion. However, the results remained robust under all methods, showing sufficient evidence in support of the hypothesis of no effect for the CHAT intervention against the Usual Practice control model for management of blood pressure among seniors in primary care. The individual-level standard logistic regression is the least appropriate method in the analysis of CRTs because it ignores the correlation of the outcomes for the individuals within the same cluster.

Conclusion: We used data from the CHAT trial to compare different methods for analysing data from CRTs.
Using different methods to analyse CRTs provides a good approach to assess the sensitivity of the results to enhance interpretation.

Published: 16 June 2009
BMC Medical Research Methodology 2009, 9:37 doi:10.1186/1471-2288-9-37
Received: 29 September 2008
Accepted: 16 June 2009
This article is available from: http://www.biomedcentral.com/1471-2288/9/37
© 2009 Ma et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background
Cluster randomized trials (CRTs) are increasingly used in the assessment of the effectiveness of interventions to improve health outcomes or prevent diseases [1]. The units of randomization for such trials are groups or clusters such as family practices, families, hospitals, or entire communities rather than individuals themselves. CRT designs are used to evaluate the effectiveness of not only group interventions but also individual interventions where group-level effects are relevant. CRTs have drawbacks: they may lead to substantially reduced statistical efficiency compared to trials that randomize the same number of individuals [2], and they may produce selection bias since the allocation arm that the subject receives is often known in advance [3]. However, in practice, CRT designs have several attractive features that may outweigh these disadvantages. Cluster randomization minimizes the likelihood of contamination between the intervention and the control arms.
In addition, the nature of the intervention itself may dictate its application as the optimal strategy [4].

The main consequence of a cluster design is that the outcomes for subjects within the same cluster cannot be assumed to be independent. This is because the subjects within the same cluster are more likely to be similar to each other than those from different clusters. This leads to a reduction in statistical efficiency due to clustering, i.e. the design effect. The design effect is a function of the variance inflation factor (VIF), given by $1 + (\bar{m} - 1)\rho$, where $\bar{m}$ denotes the average cluster size and $\rho$ is a measure of intra-cluster correlation, interpretable as the correlation between any two responses in the same cluster [2,5]. Considering the two components of the variation in the outcome, between-cluster and within-cluster variations, $\rho$ may also be interpreted as the proportion of overall variation in outcome that can be accounted for by the between-cluster variation.

These principles are well established in the design of CRTs, especially when there are implications for the sample size planning. In statistical analysis, it has long been recognized that ignoring the clustering effect will increase the chance of obtaining statistically significant but spurious findings [6]. Although many papers have compared analytical methods for CRTs with binary outcomes over the last decade, none have investigated the Bayesian model in the analysis of CRTs in detail. In particular, comparison of the random-effects meta-analytic approach with other methods for the analysis of matched-pair CRTs [7] has not been done. In this paper, we compare various statistical approaches in the analysis of CRTs using the Community Hypertension Assessment Trial (CHAT) as an example. The CHAT study is a multi-centre randomized controlled trial using blocked stratified, matched-pair cluster randomization. We also explored in detail the application of the Bayesian random-effects model in the analysis of CRTs. In particular, we investigated the impact of different prior distributions on the estimate of the treatment effect.

Methods
Overview of the CHAT study
The CHAT study was a cluster randomized controlled trial aimed at investigating the effectiveness of pharmacy-based blood pressure (BP) clinics led by peer health educators, with feedback to family physicians (FP), on the monitoring and management of BP among older adults [8]. The participants of the trial included the FP practices, patients, pharmacies and peer health educators. Eligible FPs were non-academic, full-time practitioners with regular family practices in terms of size and case-mix, and were able to provide an electronic roster that included a mailing address of their patients 65 years and older. FPs who worked in walk-in clinics or emergency departments, were about to retire or worked part-time, had fewer than 50 patients 65 years or older, or had a specialized practice profile were excluded from the study. Eligible patients were 65 years or older at the beginning of the study, considered by their FPs to be regular patients, community-dwelling and able to leave their homes to attend the community-based pharmacy sessions. To ensure that the results would be generalizable to patients in other FP practices, the trial had very few exclusion criteria for patients.

The study design was a multi-centre randomized controlled trial using blocked stratified, matched-pair cluster randomization. Family practices were the unit of randomization. Eligible practices were stratified according to (1) the median number of patients in the practice with adequate BP control and (2) the median number of patients aged 65 years and older, and matched according to centers. The trial started in 2003 with 28 FPs practising in Ottawa and Hamilton randomly selected from the eligible FPs. Fourteen were randomly allocated to the intervention (pharmacy BP clinics) and 14 to the control group (no BP clinics offered). Fifty-five eligible patients were randomly selected from each FP roster. Therefore, 1540 patients participated in the study.

All eligible patients in both the intervention and control groups received usual health services at their FP's office. Patients in the practices allocated to the intervention group were invited to visit the community BP clinics. Peer health educators assisted patients to measure their BP and record their readings on a form that also asked about cardiovascular risk factors. Research nurses, assisted by the FP office staff, conducted the baseline and end-of-trial (12 months after the randomization) audits of the health records of the 1540 patients (55 per practice) who participated in the study. These data were collected to determine the effect of the intervention.

Outcomes
The primary outcome of the CHAT study was a binary outcome with "1" indicating the patient's BP was controlled at the end of the trial and "0" otherwise. We defined the patient's BP as controlled as follows:
• if the BP reading was available in the patient's chart at the end of the trial and the systolic BP ≤ 140 mmHg and diastolic BP ≤ 90 mmHg for a patient without diabetes or target organ damage, or
• the systolic BP ≤ 130 mmHg and diastolic BP ≤ 80 mmHg for a patient with diabetes or target organ damage.

Secondary outcomes of the CHAT study included 'BP monitored', frequency of BP monitoring, systolic BP reading, and diastolic BP reading. The analyses presented in this paper are based on the primary outcome only.
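As an illustration only, the primary-outcome definition given under Outcomes can be written as a small function. The function and argument names are hypothetical, and the sketch assumes that the requirement of an available chart reading applies to both patient groups, with a missing reading coded as "0" in line with the "0 otherwise" rule.

```python
from typing import Optional


def bp_controlled(systolic: Optional[float],
                  diastolic: Optional[float],
                  diabetes_or_organ_damage: bool) -> int:
    """Return 1 if BP counts as controlled at the end of the trial, else 0."""
    # No reading in the chart: coded 0 (assumption based on "0 otherwise").
    if systolic is None or diastolic is None:
        return 0
    # Stricter targets for patients with diabetes or target organ damage.
    if diabetes_or_organ_damage:
        systolic_limit, diastolic_limit = 130, 80
    else:
        systolic_limit, diastolic_limit = 140, 90
    return int(systolic <= systolic_limit and diastolic <= diastolic_limit)


# The same reading is classified differently depending on comorbidity.
print(bp_controlled(138, 85, diabetes_or_organ_damage=False))
print(bp_controlled(138, 85, diabetes_or_organ_damage=True))
```

Both thresholds must hold at once, so a reading of 138/85 mmHg meets the general target but not the stricter one.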
The analysis of secondary outcomes will be the subject of another paper reporting the trial results.

Statistical methods
The analysis of CRTs may be based on the analysis of aggregated data from each cluster or on individual-level data, which correspond to the cluster-level and the individual-level analysis methods, respectively. The adjustment for individual-level covariates may be applied only for the individual-level analysis, while the adjustment for cluster-level covariates may be applied for both the cluster-level and the individual-level analysis. In this paper, the random-effects meta-regression method was performed using STATA Version 8.2 (College Station, TX). Other standard analyses were performed using SAS Version 9.0 (Cary, NC). The Bayesian analysis was performed using WinBUGS Version 1.4. The results from classical analyses for binary outcomes are reported as odds ratio (OR) and corresponding 95% confidence interval (CI). The results from the Bayesian method are reported as posterior estimate and corresponding 95% credible interval (CrI). CrIs are the Bayesian analog of confidence intervals. The reporting of the results follows the CONSORT (Consolidated Standards of Reporting Trials) statement guidelines for reporting cluster-randomized trials [9] and the ROBUST guideline [10] for reporting Bayesian analysis.

Cluster-level analyses methods
Following Peters et al [11], we assume that the number of patients in cluster/FP i (i = 1 to 28) with BP controlled and the total number of patients in the cluster/FP are denoted by $r_i$ and $n_i$ respectively. For FP i, the log odds of a patient having BP controlled is estimated as

\[ \log \mathrm{odd}_i = \log\left(\frac{r_i}{n_i - r_i}\right), \]

and its variance is

\[ \mathrm{var}_i = \frac{1}{r_i} + \frac{1}{n_i - r_i}. \]

Un-weighted linear regression
Considering the log odds for each cluster as the dependent variable, the un-weighted linear regression model [12] can be expressed as:

\[ \log \mathrm{odd}_i = x_i \beta + u_i, \]

where $x_i$ denotes the vector of covariates (intervention groups and centers), $\beta$ represents the effect of the covariates on the log odds scale, and $u_i$ represents the cluster-level random effect. The $u_i$ here is assumed to follow a normal distribution with a zero mean and a constant variance. In this method, each cluster/FP is given equal weight when estimating the regression coefficient $\beta$. We implemented this model using SAS proc glm.

Weighted linear regression
The weighted linear regression method [12] has the same model expression as the un-weighted linear regression method. It treats the log odds estimated from each cluster as the outcome, and treatment group as one of the explanatory variables. The weight was defined as the inverse variance of the log odds, i.e. $w_i = 1/\mathrm{var}_i$ for FP i. Compared to the un-weighted linear regression, in which all cluster estimates are weighted equally, the weighted linear regression gave clusters with higher precision more weight, and therefore more contribution in estimating the treatment effect. We implemented this model using SAS proc glm.

Random-effects meta-regression
The random-effects meta-regression model [13] is similar to the un-weighted linear regression model:

\[ \log \mathrm{odd}_i = x_i \beta + u_i. \]

However, the $u_i$ here is assumed to follow a normal distribution on the log odds scale with a zero mean and an unknown variance, which represents the between-cluster variance and can be estimated when fitting the model. We implemented this model using STATA metareg.

Individual-level analyses
We used six individual-level statistical methods. Apart from the standard logistic regression itself, they extend the standard logistic regression method by adding specific strategies to handle the clustering of the data, and therefore are valid for analyzing clustered data.

Standard logistic regression
The standard logistic regression model [14,15] can be expressed as:

\[ \mathrm{logit}(\pi_{ij}) = x_{ij} \beta, \]

where $x_{ij}$ denotes the vector of covariates (BP controlled at baseline, intervention groups etc.) for patient j in the cluster/FP i; $y_{ij}$ is the binary outcome indicating if the BP is controlled for patient j in the cluster/FP i; and $\pi_{ij} = \Pr(y_{ij} = 1 \mid x_{ij})$.

The standard logistic model assumes that data from different patients are independent. Since this assumption is not valid for correlated data, the model is not valid for analyzing cluster randomized trials. We implemented this model using SAS proc genmod.

Robust standard errors
The robust standard error method [14,16] gives the same point estimates as the standard logistic regression, since both assume independent data when estimating the treatment effect. However, in the robust standard errors method, the standard errors for all the estimates are obtained using the 'Huber sandwich estimator', which can be used to estimate the variance of the maximum likelihood estimate when the underlying model is incorrect or the model assumption is wrong [17]. It is often used for clustered data [18]. We implemented this model using SAS proc genmod.

Generalized estimating equations
The generalized estimating equations (GEE) [14,19,20] method permits the specification of a working correlation matrix that accounts for the form of within-cluster correlation of the outcomes. In the analysis of CRTs, we generally assume that there is no logical ordering for individuals within a cluster, i.e. the individuals within the same cluster are equally correlated. In this case, an exchangeable correlation matrix should be used.
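To make the exchangeable structure concrete, the sketch below builds an exchangeable correlation matrix in plain Python and checks that it reproduces the variance inflation factor $1 + (m - 1)\rho$ from the Background. This is an illustration, not the GEE fitting procedure itself: the function names are hypothetical, the cluster size m = 55 is taken from the trial design, and the value ρ = 0.077 is the unadjusted ICC reported with the GEE results.

```python
from typing import List


def exchangeable_matrix(m: int, rho: float) -> List[List[float]]:
    """m x m working correlation: 1 on the diagonal, rho everywhere else."""
    return [[1.0 if j == k else rho for k in range(m)] for j in range(m)]


def variance_inflation_factor(m: float, rho: float) -> float:
    """VIF = 1 + (m - 1) * rho for clusters of (average) size m."""
    return 1.0 + (m - 1.0) * rho


m, rho = 55, 0.077  # patients per practice; unadjusted ICC from the GEE fit
R = exchangeable_matrix(m, rho)

# For unit-variance outcomes, the variance of a cluster mean is sum(R) / m**2,
# so sum(R) / m is the inflation relative to m independent observations.
vif_from_matrix = sum(sum(row) for row in R) / m
vif_from_formula = variance_inflation_factor(m, rho)
effective_sample_size = (28 * m) / vif_from_formula

print(round(vif_from_matrix, 3), round(vif_from_formula, 3))
print(round(effective_sample_size, 1))
```

Under these inputs the 1540 patients carry about as much information as roughly 300 independent patients, which is why ignoring the clustering gives intervals that are too narrow.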
We implemented this model using SAS proc genmod.

Though the sandwich standard error estimator is consistent even when the underlying model is specified incorrectly, it tends to underestimate the standard error of the regression coefficient when the number of clusters is not large enough [21,22]. Furthermore, the estimate of the standard error is highly variable when the number of clusters is too small. In this paper, we employed two methods proposed by Ukoumunne [23] to correct this bias. Both methods can be used when there are equal numbers of clusters in each arm and no covariate adjustment. In the first method, modified GEE (1), the bias of the sandwich standard error is corrected by multiplying it by $\sqrt{J/(J-1)}$, where J is the number of clusters in each arm. In the second method, modified GEE (2), the increased variability of the sandwich standard error estimator was accounted for by building the confidence interval for the treatment effect based on the quantiles from the t-distribution with 2(J-1) degrees of freedom.

Random-effects meta-analytic approach
This method is appropriate only for CRTs with a matched-pair design [2]. If we assume that the data from each pair of clusters arise from a meta-analysis of independent randomized controlled clinical trials, then we can apply the traditional random-effects meta-analysis method to pool the results from all the pairs [13]. The random-effects meta-analytic approach for analysing CRTs consists of two steps. First, the treatment effect is estimated for each pair of clusters. Second, the overall treatment estimator is calculated as a weighted average of the pair-specific estimates, where the weights are the inverse of the estimated variances of the treatment effects of the pairs. We implemented this model using SAS proc genmod and proc mixed.

Random-effects logistic regression
The random-effects logistic regression [15,19] is a special kind of hierarchical linear model. Compared to the standard logistic regression, the random-effects logistic regression includes a cluster-level random effect in the model, which is assumed to follow a normal distribution with a zero mean and an unknown variance $\tau^2$ (the between-cluster variance); $\tau^2$ is estimated in the regression. We adopted the pseudo-likelihood estimation algorithm of Wolfinger and O'Connell (1993) [24], which allows an over-dispersion parameter to be estimated. Compared to the Bayesian model, the CI for the treatment effect from this method is narrower since it is based on estimated constant variance components without allowance for uncertainty [25]. In practice, it may be difficult to assess the validity of the model assumption that the cluster-level random effects follow a normal distribution. We implemented this model using SAS macro glimmix.

Bayesian random-effects regression
The Bayesian random-effects regression model [26] has the same format as the traditional random-effects logistic regression. However, it is based on different assumptions about the variance of the cluster-level random effect. The Bayesian approach treats the variance of the random effect, $\tau^2$, as an unknown parameter, while the traditional regression approach treats it as a constant. In the Bayesian approach, the uncertainty of $\tau^2$ is taken into account by assuming a prior distribution which represents the researcher's pre-belief or external information about $\tau^2$. The observed data are presented as a likelihood function, which is used to update the researcher's pre-belief and then obtain the final results.
The final results are presented as the posterior distribution.

When applying the Bayesian model, it is essential to state in advance the source and structure of the prior distributions that are proposed for the principal analysis [27,28]. In our Bayesian analysis, we assumed the non-informative uniform prior distribution with lower and upper bounds of 0 and 10, respectively, to minimize the influence of the researcher's pre-belief or external information on the observed data. Consequently, the result from the Bayesian approach should be comparable to the results from the classical statistical methods. We also assumed that the prior distribution for all the coefficients follows a normal distribution with a mean of zero and precision 1.0E-6. The total number of iterations to obtain the posterior distribution for each end point is 500,000, the burn-in number is 10,000, and the seed is 314159. Non-convergence of the Markov chain was evaluated by examining the estimated Monte Carlo error for posterior distributions and dynamic trace plots, time series plots, density plots and autocorrelation plots.

Impact of priors for Bayesian analysis
Even though the researcher's subjective pre-beliefs, which are expressed as prior distribution functions, can be updated by the likelihood function of the observed data, misspecification of priors has an impact on the posterior in some cases. To verify the robustness of the results from the Bayesian random-effects logistic regression, we evaluated the impact of different prior distributions of the variance parameter in the analysis of the primary outcome, BP controlled, without adjustment for any covariates.

The commonly used priors for the variance parameter are uniform (non-informative prior) and inverse gamma (non-informative and conjugate prior) [29]. The estimates of treatment effect when assuming different prior distributions were quite consistent based on the results presented in Table 1.
For uniform priors, the estimates and the 95% CIs were similar when the upper bound of the uniform distribution was greater than or equal to 5. For the inverse-gamma prior, Gelman [29] pointed out that when $\tau^2$ is close to zero, the results may be sensitive to different choices of the parameter $\varepsilon$. Since $\tau^2$ is approximately 0.5 (estimated from random-effects meta-regression, random-effects logistic regression and Bayesian approaches), which is not close to zero, our results are stable when using the inverse-gamma prior with different choices of the parameter $\varepsilon$.

Results
Since the data collection was based on chart review, there were very few missing values for the CHAT study. Demographic information and health conditions were balanced between the two study arms at baseline. Of the 1540 patients who were included, 41% (319/770) of patients in the control group and 44% (339/769) of patients in the intervention group were male. At the beginning of the trial, the mean age of the patients was 74.36 years with a standard deviation (SD) of 6.22 in the control group, and 74.16 years with an SD of 6.14 in the intervention group. In the intervention and control groups, 55% (425/770) and 55% (420/770) of patients, respectively, had BP controlled at baseline; 57% (437/770) and 53% (409/770) of patients had BP controlled at the end of the trial.

In analyzing the binary primary outcome of the CHAT trial (BP controlled), the results from different statistical methods were different.
However, the estimates obtained from all of the nine methods showed that there were no significant differences in improving the patients' BP between the intervention and the control groups.

Table 1: Comparison of the Impact of Different Priors on Bayesian Model
Outcome: BP controlled (unadjusted for covariates)

Type of Prior                    Prior distribution       Odds Ratio   95% CI
Non-informative                  Uniform (0, 1)           1.11         (0.64, 1.92)
                                 Uniform (0, 5)           1.09         (0.61, 1.94)
                                 Uniform (0, 10)          1.09         (0.61, 1.94)
                                 Uniform (0, 50)          1.09         (0.61, 1.94)
                                 Uniform (0, 100)         1.09         (0.61, 1.94)
Non-informative and Conjugate    IGamma (0.001, 0.001)    1.11         (0.63, 1.94)
                                 IGamma (0.01, 0.01)      1.11         (0.63, 1.95)
                                 IGamma (0.1, 0.1)        1.12         (0.64, 1.95)

CI = confidence interval; BP = blood pressure; IGamma = Inverse Gamma

For the cluster-level methods, we compared the odds ratios and 95% confidence intervals with and without adjustment for the stratifying variable, 'centre' (Hamilton or Ottawa). The variable 'centre' is not significant at α = 0.05 in predicting if the patients' BPs were controlled at the end of the trial. When adjusting for the covariate 'centre', the treatment effects were slightly different and the 95% confidence intervals for the treatment effects were narrower. For individual-level methods, we compared the results of the analysis with and without adjustment for patients' characteristics at baseline. These baseline characteristics included diabetes, heart disease, and whether or not the BP of the patient was controlled at baseline. All of these covariates were significantly associated with the outcome at level α = 0.05. When we included these baseline characteristics as covariates in the models, the odds ratios of the treatment effect changed slightly and the 95% confidence intervals tended to be much narrower compared to estimates without adjustment for any covariate.
The intra-cluster correlation coefficient (ICC) reduced from 0.077 to 0.054 after adjusting for covariates. The 95% confidence intervals for the treatment effect from the two modified GEE models became slightly wider after the bias of the sandwich standard error estimator was corrected, but our conclusions remained robust. The comparison of the results from different statistical methods is presented in Table 2 and Figure 1.

Figure 1: Forest Plot: Comparison of Methods without Adjustment for Covariates.

Discussion
Summary of Key Findings
We applied three cluster-level and six individual-level approaches to analyse results of the CHAT study. We also employed two methods to correct the bias of the sandwich standard error estimator from the GEE model. Among all the analytic approaches, only the individual-level standard logistic regression was inappropriate since it does not account for the between-cluster variation. This is because it tends to underestimate the standard error of the treatment effect and its p-value. Correspondingly, this method might overstate the evidence for a treatment effect. All the other methods handle the clustering by different techniques, and therefore were appropriate. All but the weighted regression method yielded similar point estimates of the treatment effect. This is not surprising since the weighting can affect the location of the estimate as well as its precision. The Bayesian random-effects logistic regression yielded the widest 95% interval. This was due to the fact that the Bayesian random-effects logistic regression incorporates the uncertainty of all parameters.
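The effect of the two small-sample corrections described in the Methods can be checked approximately against the unadjusted GEE result in Table 2 (OR 1.14, 95% CI 0.72 to 1.80). The sketch below is a rough reconstruction rather than the authors' computation. It assumes a Wald interval on the log odds scale, recovers the sandwich standard error from the rounded interval limits, and hard-codes the tabulated 97.5% quantile of the t-distribution with 2(J-1) = 26 degrees of freedom. All names are hypothetical.

```python
import math
from statistics import NormalDist

Z_975 = NormalDist().inv_cdf(0.975)  # standard normal quantile, about 1.96
T_975_DF26 = 2.0555                  # tabulated t quantile for 26 df (hard-coded)


def se_from_ci(lower: float, upper: float) -> float:
    """Back out the log-odds standard error from a 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * Z_975)


def or_interval(odds_ratio: float, se: float, quantile: float) -> tuple:
    """Interval for the odds ratio built on the log odds scale."""
    centre = math.log(odds_ratio)
    return (math.exp(centre - quantile * se), math.exp(centre + quantile * se))


J = 14                            # clusters per arm in CHAT
se_gee = se_from_ci(0.72, 1.80)   # approximate: the limits are rounded

# Modified GEE (1): inflate the sandwich standard error by sqrt(J / (J - 1)).
ci_modified_1 = or_interval(1.14, se_gee * math.sqrt(J / (J - 1)), Z_975)

# Modified GEE (2): keep the standard error, use t quantiles with 2(J - 1) df.
ci_modified_2 = or_interval(1.14, se_gee, T_975_DF26)

print([round(x, 2) for x in ci_modified_1])
print([round(x, 2) for x in ci_modified_2])
```

With these inputs both intervals come out close to the modified GEE rows of Table 2, roughly (0.71, 1.83) and (0.71, 1.84), so with 14 clusters per arm either correction widens the interval only slightly.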
The 95% confidence intervals for the treatment effect from the two modified GEE models are slightly wider than that from the GEE model.

Adjusting for important covariates that are correlated with the outcome increased the precision and reduced the ICC. This is consistent with the finding from Campbell for the analysis of cluster trials in family medicine with a continuous outcome [30]. By adjusting for important covariates, we are able to control for the effect of imbalances in baseline risk factors and reduce unexplained variation. In general, it is important to note that for logistic regression, the population-averaged model (fitted using GEE) and the cluster-specific method (modelled by random-effects models) are in fact estimating different population models. This is covered in detail by Campbell [31] and was first discussed by Neuhaus and Jewell [32]. Thus, we would not expect the estimates for the GEE and the random-effects logistic regression to be exactly the same. However, they are related through the ICC [31]. In our case, the estimates from the two models are similar since the ICC in the CHAT study is relatively small.

Sensitivity analysis and simulation study
Several sensitivity analyses can be considered for CRTs. First, since different methods yield different results, and very few methodological studies provide guidance on determining which method is the best, comparing the results from different methods might help researchers to draw a safer conclusion, though the marginal odds ratio estimated by the GEE and the conditional odds ratio estimated from random-effects models may be interpreted differently [33]. Second, sensitivity analysis can be used to investigate the sensitivity of the conclusions to different model assumptions. For example, in the random-effects model, we assume that the cluster-level random effects follow a normal distribution on the log odds scale. However, a sensitivity analysis can be carried out by allowing empirical investigation of the distribution of the random effects. Finally, a sensitivity analysis can also indicate which parameter values are reasonable to use in the model.

Table 2: Comparison of Nine Methods with and without Adjustment for Covariates

                                                           Unadjusted for Covariates   Adjusted for Covariates *
Unit of Analysis   Method of Analysis                      OR     95% CI               OR     95% CI
Cluster            Un-weighted Regression                  1.05   (0.59, 1.87)         1.05   (0.60, 1.84)
                   Weighted Regression                     1.27   (0.81, 1.99)         1.27   (0.82, 1.96)
                   Random-effects Meta Regression          1.05   (0.60, 1.85)         1.05   (0.61, 1.82)
Individual         Standard Logistic Regression            1.14   (0.93, 1.39)         1.17   (0.95, 1.44)
                   Robust Standard Error                   1.14   (0.72, 1.80)         1.17   (0.79, 1.73)
                   Generalized Estimating Equations **     1.14   (0.72, 1.80)         1.15   (0.76, 1.72)
                   Modified GEE (1) ***                    1.14   (0.71, 1.83)
                   Modified GEE (2) ****                   1.14   (0.71, 1.84)
                   Random-effects Meta Analysis            1.09   (0.68, 1.74)         1.12   (0.73, 1.70)
                   Random-effects Logistic Regression      1.10   (0.65, 1.86)         1.13   (0.71, 1.80)
                   Bayesian Random-effects Regression      1.12   (0.64, 1.95)         1.13   (0.68, 1.87)

OR = odds ratio; CI = confidence interval
* For the cluster-level analysis, 'center' (i.e. Hamilton and Ottawa) is included as the covariate; for the individual-level analysis, 'diabetes at baseline', 'heart disease at baseline', and 'BP controlled at baseline' are included as the covariates.
** The intra-cluster correlation coefficients (ICC) estimated from GEE are 0.077 and 0.054 when unadjusted for covariates and adjusted for covariates respectively.
*** The confidence interval was calculated based on the corrected standard error, which was equal to the sandwich standard error estimator multiplied by $\sqrt{J/(J-1)}$, where J is the number of clusters in each arm.
**** The confidence interval was calculated based on the quantiles from the t-distribution with 2(J-1) degrees of freedom instead of quantiles from the standard normal distribution.

The Bayesian analysis incorporates different sources of information in the model. However, a disadvantage of this technique is that the results of the analysis are dependent on the choice of prior distributions. We performed more analyses to assess the sensitivity of the results to different prior distributions representing weak information (i.e. non-informative prior) relative to the trial data, and the results remained robust.

A simulation study by Austin [34] suggested that the statistical power of GEE is the highest among the t-test, Wilcoxon rank sum test, permutation test, adjusted chi-square test and logistic random-effects model for the analysis of CRTs. However, researchers should be cautioned about the limitations of the GEE method. First, when the number of clusters is small, the estimate of variance produced under GEE could be biased [21,22], particularly if the number of clusters is less than 20 [35]. In this case, correction for the bias would be necessary. Second, the research on goodness-of-fit tests for GEE applications still faces some challenges [36]. Third, Ukoumunne et al [23] compared the accuracy of the estimation and the confidence interval coverage from three cluster-level methods (the un-weighted cluster-level mean difference, the weighted cluster-level mean difference and cluster-level random-effects linear regression) and the GEE model in the analysis of a binary outcome from a CRT. Their results showed that the cluster-level methods performed well for trials with a sufficiently large number of subjects in each cluster and a small ICC. The GEE model led to some bias of the sandwich standard error estimator when the number of clusters was relatively small.
However, this bias could be corrected by multiplying the sandwich standard error by $\sqrt{J/(J-1)}$, where J is the number of clusters in each arm, or by building the confidence interval for the treatment effect based on the quantiles from the t-distribution with 2(J-1) degrees of freedom. With these corrections, the GEE was found to have good properties and would be generally preferred in practice over the cluster-level methods since both cluster-level and individual-level confounders can be adjusted for.

Conclusion
We used data from the CHAT trial to compare different methods for analysing data from CRTs. Among all the statistical methods, Bayesian analysis gives the largest standard error for the treatment effect and the widest 95% CI, and therefore provides the most conservative evidence to researchers. However, the results remained robust under all methods, showing sufficient evidence in support of the hypothesis of no effect for the CHAT intervention against the Usual Practice control model for management of blood pressure among seniors in primary care. Our analysis reinforces the importance of building sensitivity analyses to support the primary analysis of trial data so as to assess the impact of different model assumptions on results. Nonetheless, we cannot infer from these analyses which method is superior in the analysis of CRTs with binary outcomes. Further research based on simulation studies is required to provide better insights into the comparability of the methods in terms of statistical power for designing CRTs.

Competing interests
The authors declare that they have no competing interests.

Authors' contributions
JM and LT conceived the study. LT, JK, LC, LD, TK and CL participated in the design and implementation of the CHAT study. Data cleaning was done by TK and JM. JM conducted the data analysis and wrote the initial draft of the manuscript. Results of the data analysis were interpreted by JM and LT. All authors reviewed and revised the draft version of the manuscript. All authors read and approved the final version of the manuscript.

Acknowledgements
The CHAT trial was funded by a grant from the Canadian Institutes of Health Research (CIHR). Dr. Lehana Thabane is a clinical trials mentor for CIHR. We thank the reviewers for insightful comments that improved the presentation of the manuscript.

References
1. Campbell MK, Grimshaw JM: Cluster randomised trials: time for improvement. The implications of adopting a cluster design are still largely being ignored. BMJ 1998, 317(7167):1171-1172.
2. Donner A, Klar N: Design and Analysis of Cluster Randomisation Trials in Health Research. London: Arnold; 2000.
3. Torgerson DJ: Diabetes education: Selection bias in cluster trial. BMJ 2008, 336(7644):573.
4. The COMMIT Research Group: Community Intervention Trial for Smoking Cessation (COMMIT): I. cohort results from a four-year community intervention. Am J Public Health 1995, 85(2):183-192.
5. Donner A, Klar N: Pitfalls of and controversies in cluster randomization trials. Am J Public Health 2004, 94(3):416-422.
6. Cornfield J: Randomization by group: a formal analysis. Am J Epidemiol 1978, 108(2):100-102.
7. Ukoumunne OC, Gulliford MC, Chinn S, Sterne JA, Burney PG: Methods for evaluating area-wide and organisation-based interventions in health and health care: a systematic review. Health Technol Assess 1999, 3(5):iii-92.
8. Community Hypertension Assessment Trial [http://www.chapprogram.ca]
9. Campbell MK, Elbourne DR, Altman DG, CONSORT group: CONSORT statement: extension to cluster randomised trials. BMJ 2004, 328(7441):702-708.
10. Sung L, Hayden J, Greenberg ML, Koren G, Feldman BM, Tomlinson GA: Seven items were identified for inclusion when reporting a Bayesian analysis of a clinical study. J Clin Epidemiol 2005, 58(3):261-268.
11. Peters TJ, Richards SH, Bankhead CR, Ades AE, Sterne JA: Comparison of methods for analysing cluster randomized trials: an example involving a factorial design. Int J Epidemiol 2003, 32(5):840-846.
12. Draper NR, Smith H: Applied regression analysis. 3rd edition. New York: Wiley; 1998.
13. Thompson SG, Sharp SJ: Explaining heterogeneity in meta-analysis: a comparison of methods. Stat Med 1999, 18(20):2693-2708.
14. Dobson AJ: An introduction to generalized linear models. 2nd edition. Boca Raton: Chapman & Hall/CRC; 2002.
15. Hosmer DW, Lemeshow S: Applied logistic regression. 2nd edition. New York; Toronto: Wiley; 2000.
16. Huber PJ: Robust statistics. New York: Wiley; 1981.
17. Long JS, Ervin LH: Using heteroscedastic consistent standard errors in the linear regression model. American Statistician 2000, 54:795-806.
18. White H: A heteroscedastic-consistent covariance matrix estimator and a direct test of heteroskedasticity. Econometrica 1980, 48(4):817-838.
19. McCullagh P, Nelder J: Generalized linear models. 2nd edition. London; New York: Chapman and Hall; 1989.
20. Zeger SL, Liang KY, Albert PS: Models for longitudinal data: a generalized estimating equation approach. Biometrics 1988, 44(4):1049-1060.
21. Prentice RL: Correlated binary regression with covariates specific to each binary observation. Biometrics 1988, 44(4):1033-1048.
22. Mancl LA, DeRouen TA: A covariance estimator for GEE with improved small-sample properties. Biometrics 2001, 57(1):126-134.
23. Ukoumunne OC, Carlin JB, Gulliford MC: A simulation study of odds ratio estimation for binary outcomes from cluster randomized trials. Stat Med 2007, 26(18):3415-3428.
24. Wolfinger RD, O'Connell M: Generalized linear models: a pseudo-likelihood approach. Journal of Statistical Computation and Simulation 1993:233-243.
25. Omar RZ, Thompson SG: Analysis of a cluster randomized trial with binary outcome data using a multi-level model. Stat Med 2000, 19(19):2675-2688.
26. Gelman A: Bayesian data analysis. 2nd edition. Boca Raton, Fla.: Chapman & Hall/CRC; 2004.
27. Spiegelhalter DJ: Bayesian methods for cluster randomized trials with continuous responses. Stat Med 2001, 20(3):435-452.
28. Turner RM, Omar RZ, Thompson SG: Bayesian methods of analysis for cluster randomized trials with binary outcome data. Stat Med 2001, 20(3):453-472.
29. Gelman A: Prior distributions for variance parameters in hierarchical models. Bayesian Analysis 2006, 1(3):515-533.
30. Campbell MJ: Cluster randomized trials in general (family) practice research. Stat Methods Med Res 2000, 9(2):81-94.
31. Campbell MJ, Donner A, Klar N: Developments in cluster randomized trials and Statistics in Medicine. Stat Med 2007, 26(1):2-19.
32. Neuhaus JM, Jewell NP: A geometric approach to assess bias due to omitted covariates in generalized linear models. Biometrika 1993, 80(4):807-815.
33. FitzGerald PE, Knuiman MW: Use of conditional and marginal odds-ratios for analysing familial aggregation of binary data. Genet Epidemiol 2000, 18(3):193-202.
34. Austin PC: A comparison of the statistical power of different methods for the analysis of cluster randomization trials with binary outcomes. Stat Med 2007, 26(19):3550-3565.
35. Horton NJ, Lipsitz SR: Review of software to fit generalized estimating equation regression models. American Statistician 1999, 53:160-169.
36. Ballinger GA: Using Generalized Estimating Equations for Longitudinal Data Analysis. Organizational Research Methods 2004, 7(2):127-150.

Pre-publication history
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/9/37/prepub
Comparison of Bayesian and classical methods in the analysis of cluster randomized controlled trials… Ma, Jinhui; Thabane, Lehana; Kaczorowski, Janusz; Chambers, Larry; Dolovich, Lisa; Karwalajtys, Tina; Levitt, Cheryl Jun 16, 2009
Item Metadata
Title | Comparison of Bayesian and classical methods in the analysis of cluster randomized controlled trials with a binary outcome: The Community Hypertension Assessment Trial (CHAT) |
Creator | Ma, Jinhui; Thabane, Lehana; Kaczorowski, Janusz; Chambers, Larry; Dolovich, Lisa; Karwalajtys, Tina; Levitt, Cheryl |
Contributor | University of British Columbia. Centre for Applied Health Research and Evaluation |
Publisher | BioMed Central |
Date Issued | 2009-06-16 |
Description | Background: Cluster randomized trials (CRTs) are increasingly used to assess the effectiveness of interventions to improve health outcomes or prevent diseases. However, the efficiency and consistency of using different analytical methods in the analysis of binary outcome have received little attention. We described and compared various statistical approaches in the analysis of CRTs using the Community Hypertension Assessment Trial (CHAT) as an example. The CHAT study was a cluster randomized controlled trial aimed at investigating the effectiveness of pharmacy-based blood pressure clinics led by peer health educators, with feedback to family physicians (CHAT intervention) against Usual Practice model (Control), on the monitoring and management of BP among older adults. Methods: We compared three cluster-level and six individual-level statistical analysis methods in the analysis of binary outcomes from the CHAT study. The three cluster-level analysis methods were: i) un-weighted linear regression, ii) weighted linear regression, and iii) random-effects meta-regression. The six individual-level analysis methods were: i) standard logistic regression, ii) robust standard errors approach, iii) generalized estimating equations, iv) random-effects meta-analytic approach, v) random-effects logistic regression, and vi) Bayesian random-effects regression. We also investigated the robustness of the estimates after the adjustment for the cluster and individual level covariates. Results: Among all the statistical methods assessed, the Bayesian random-effects logistic regression method yielded the widest 95% interval estimate for the odds ratio and consequently led to the most conservative conclusion. However, the results remained robust under all methods, showing sufficient evidence in support of the hypothesis of no effect for the CHAT intervention against Usual Practice control model for management of blood pressure among seniors in primary care. The individual-level standard logistic regression is the least appropriate method in the analysis of CRTs because it ignores the correlation of the outcomes for the individuals within the same cluster. Conclusion: We used data from the CHAT trial to compare different methods for analysing data from CRTs. Using different methods to analyse CRTs provides a good approach to assess the sensitivity of the results to enhance interpretation. |
Genre | Article |
Type | Text |
Language | eng |
Date Available | 2016-01-05 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution 4.0 International (CC BY 4.0) |
DOI | 10.14288/1.0223057 |
URI | http://hdl.handle.net/2429/56166 |
Affiliation | Medicine, Faculty of; Non UBC |
Citation | BMC Medical Research Methodology. 2009 Jun 16;9(1):37 |
Publisher DOI | 10.1186/1471-2288-9-37 |
Peer Review Status | Reviewed |
Scholarly Level | Faculty |
Copyright Holder | Ma et al. |
Rights URI | http://creativecommons.org/licenses/by/4.0/ |
Aggregated Source Repository | DSpace |