UBC Faculty Research and Publications

Clinical significance in pediatric oncology randomized controlled treatment trials: a systematic review Howard, A. Fuchsia; Goddard, Karen; Rassekh, Shahrad R; Samargandi, Osama A; Hasan, Haroon Oct 5, 2018


REVIEW (Open Access)

Clinical significance in pediatric oncology randomized controlled treatment trials: a systematic review

A. Fuchsia Howard1*, Karen Goddard2, Shahrad Rod Rassekh3, Osama A Samargandi4 and Haroon Hasan2,5

Abstract

Background: Clinical significance in a randomized controlled trial (RCT) can be determined using the minimal clinically important difference (MCID), which should inform the delta value used to determine sample size. The primary objective was to assess clinical significance in the pediatric oncology RCT treatment literature by evaluating: (1) the relationship between the treatment effect and the delta value as reported in the sample size calculation, and (2) the concordance between statistical and clinical significance. The secondary objective was to evaluate the reporting of methodological attributes related to clinical significance.

Methods: RCTs of pediatric cancer treatments, where a sample size calculation with a delta value was reported or could be calculated, were systematically reviewed. MEDLINE, EMBASE, and the Cochrane Childhood Cancer Group Specialized Register through CENTRAL were searched from inception to July 2016.

Results: Seventy-seven RCTs (11 non-inferiority and 66 superiority; herein reported respectively), representing 95 randomized questions (13 and 82), were included. A minority (22.1% overall; 76.9 and 13.4%) of randomized questions reported conclusions based on clinical significance, and only 4.2% (15.4 and 2.4%) explicitly based the delta value on the MCID. Over half (67.4% overall; 92.3 and 63.4%) reported a confidence interval or standard error for the primary outcome experimental and control values, and 12.6% (46.2 and 7.3%) reported the treatment effect. Of the 47 randomized questions in superiority trials that reported statistically non-significant findings, 25.5% were possibly clinically significant. Of the 24 randomized questions in superiority trials that were statistically significant, only 8.3% were definitely clinically significant.

Conclusions: A minority of RCTs in the pediatric oncology literature reported methodological attributes related to clinical significance, and a notable portion of statistically non-significant studies were possibly clinically significant.

Keywords: Clinical significance, Minimally clinically important difference, Randomized controlled trials, Pediatric oncology

Background

Cancer among children is rare, accounting for less than 1% of all new cases in Canada [1]. Over the past 50 years, the 5-year relative survival rate for pediatric cancers has risen dramatically, from 10 to 83% [2, 3], largely because of treatment advances and high rates of clinical trial participation, estimated to be upwards of 60% [4]. Pediatric clinical trials are remarkably complex because of the lower incidence of disease, safety concerns, stringent regulatory requirements and limited commercial interest [5]. As such, randomized controlled trials (RCTs) are predominantly multi-institutional, resource intensive, often take years to complete and rarely have pharmaceutical industry support [6]. By and large, national and international collaborative efforts, such as the Children's Oncology Group, coordinate the majority of trials, the results of which often provide the basis for changes in treatment regimens and standard of care [7].
Even one trial can dramatically change standard of care [8, 9].

The concept of clinical significance is now considered crucial in RCT planning and interpretation [10]. The 2010 Consolidated Standards of Reporting Trials (CONSORT) Statement acknowledged the importance of assessing study results based on sample size calculations and assumptions that include clinical significance [11]. Clinical significance can be determined using the minimal clinically important difference (MCID): "the smallest treatment effect that would result in a change in patient management, given its side effects, costs, and inconveniences" [12, 13]. It is critical that a study be powered based on a delta value that reflects the MCID. The delta value component of the sample size calculation is the difference between the experimental and the control group that can be detected based on the type 1 (α value) and type 2 (β value) errors. A delta value that reflects the MCID ensures that the sample size calculation allows for an evaluation of clinical significance. In addition, the appropriate clinical interpretation of results requires authors to report methodological attributes related to clinical significance, such as justification of the MCID [11, 14, 15]. Thus, it is essential for studies to be designed and interpreted based on an evidence-based MCID and clinically relevant measures, such as confidence intervals (CIs), which provide information on statistical significance and on the direction and size of the treatment effect [12, 16–18].
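To make the dependence of the sample size on the delta value concrete, the following sketch computes a per-arm sample size for a two-arm superiority comparison of proportions using the standard normal-approximation formula, with the delta set equal to the MCID. It is an illustrative Python sketch with hypothetical inputs (control event rate, MCID, alpha, power); it is not a calculation taken from any of the trials discussed here.

    # Illustrative sketch: per-arm sample size for a two-arm superiority trial with a
    # dichotomous outcome, using the standard normal-approximation formula.
    # All numbers are hypothetical and are not taken from the reviewed trials.
    from math import ceil, sqrt
    from statistics import NormalDist

    def per_arm_sample_size(p_control: float, mcid: float,
                            alpha: float = 0.05, power: float = 0.80,
                            two_sided: bool = True) -> int:
        """Smallest n per arm to detect an absolute difference of `mcid` over `p_control`."""
        p_exp = p_control + mcid                       # anticipated experimental rate
        z_alpha = NormalDist().inv_cdf(1 - alpha / (2 if two_sided else 1))
        z_beta = NormalDist().inv_cdf(power)
        pooled = (p_control + p_exp) / 2
        numerator = (z_alpha * sqrt(2 * pooled * (1 - pooled))
                     + z_beta * sqrt(p_control * (1 - p_control) + p_exp * (1 - p_exp))) ** 2
        return ceil(numerator / mcid ** 2)

    # Hypothetical example: control 5-year event-free survival of 70%, MCID of 10
    # percentage points. Halving the delta roughly quadruples the required sample size,
    # which is the feasibility pressure that tempts trialists toward larger deltas.
    print(per_arm_sample_size(0.70, 0.10))   # approximately 294 per arm
    print(per_arm_sample_size(0.70, 0.05))   # approximately 1251 per arm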
Research suggests that RCT authors rely primarily on statistical significance, that they do not consistently provide their own interpretation of the clinical importance of results, and that they rarely provide sufficient information to enable readers to draw their own conclusions [13, 19, 20]. The degree to which clinical significance has been assessed in the pediatric oncology literature remains unknown. The primary study objective was to assess clinical significance, and the reporting of clinical significance, in the pediatric oncology RCT literature by, first, evaluating the relationship between the treatment effect (with its associated CI) and the delta value, as reported in the sample size calculation and, second, assessing the concordance between statistical and clinical significance. The secondary study objective was to assess methodological attributes related to clinical significance.

Methods

This systematic review was conducted following a pre-defined protocol, which was informed by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement [21].

Search strategy

The academic literature was systematically searched using a comprehensive search strategy to identify all RCTs in pediatric oncology (Additional file 1: Appendix A). We searched MEDLINE, EMBASE, and the Cochrane Childhood Cancer Group Specialized Register through CENTRAL from inception to July 2016. Our search strategy was developed by initially using the Canadian Agency for Drugs and Technologies in Health search filter to identify RCTs [22] and was subsequently adapted to identify RCTs in pediatric oncology using the Cochrane Childhood Cancer Search Filter, which has been validated by Leclercq et al. [23]. We also assessed the reference lists of the studies that fulfilled our inclusion criteria to identify additional studies. Our search was restricted to studies in English and was inclusive of the published literature.

Eligibility criteria

Studies were deemed eligible if the study adhered to an RCT design (i.e., did not include a non-randomized component or a historical control), the study population consisted of pediatric patients diagnosed with cancer, the primary outcome of the trial was a relevant cancer treatment outcome (e.g., a treatment regimen assessing overall survival, event-free survival, etc.), and the trial was a phase III trial that did not stop early due to futility. This did not include studies wherein the primary outcome was treatment complications or side effects, pharmacokinetic trials, toxicity trials, non-clinical interventions, or drug safety profile trials. Only studies that reported a sample size calculation where a delta value was reported or could be calculated for the randomized question were included. Trials that were long-term follow-up studies were excluded, and only studies that reported the most recent trial results were included. RCTs where the study population consisted of both pediatric and adult patients were deemed eligible if adults were less than or equal to 25 years of age. We chose this age range to reflect norms in pediatric oncology treatment research, wherein trials most often include participants up to 21 years of age, though many have also included participants up to age 25 years because of the potential benefit for these slightly older patients. Restricting the age limit to 21 years would have resulted in the exclusion of a number of trials where the majority of participants were aged less than 21 years.

Study identification

Two investigators (HH and KN, non-independently) screened the titles and abstracts based on the specified inclusion criteria. The full text was retrieved and reviewed if the title and abstract were insufficient to determine fulfillment of the inclusion criteria. Subsequently, one investigator (HH) conducted a full-text review of all studies that passed the first round of title and abstract screening to assess eligibility for inclusion.
The principal investigator (AFH) was available to resolve any discrepancies or disagreements encountered during study selection.

Data extraction

A standardized data extraction template was developed to collect attributes relevant to clinical and statistical significance in addition to general characteristics. The data extraction template was initially piloted on a sample of 15 included studies to ensure that pertinent information was captured and was subsequently finalized based on the results of the pilot. Data were collected by one investigator (HH) for each of the randomized questions within all RCTs, thereby capturing each outcome and the corresponding reported sample size calculation.

Analysis

Characteristics of included studies

The characteristics of the included studies comprised: journal, region, and year of publication; source of funding; whether the RCT included exclusively children or adults and children; the disease site of focus of the RCT (hematological, central nervous system, non-central nervous system solid tumor); RCT study design (2 × 2 factorial, greater than two arms, parallel-group); trial group; primary outcome defined as time-to-event or dichotomous; and primary outcome intervention (chemotherapy, multimodal therapy, hematopoietic stem-cell transplant, radiation therapy). This analysis was based on studies and stratified by RCT type (non-inferiority or superiority).

Reporting of methodological attributes associated with clinical significance

The selection of methodological attributes related to clinical significance was informed by evidence from the literature and the expertise of the research team [11, 13, 24, 25]. These attributes consisted of: explicitly identifying the expected magnitude of difference as the MCID and providing justification for why this MCID was selected, whether based on clinical relevance or methodological grounds; reporting the delta value as an absolute and/or relative difference, stratified by primary outcome type (time-to-event or dichotomous); reporting the anticipated control and experimental values from which the delta value was derived and providing the rationale for why the assumed control value was selected; the type 1 error (α value) and the number of sides of the p value; power (1 − β value); reporting the statistical significance of the primary outcome via a p value; reporting a confidence interval (CI) or standard error around the experimental and control estimates for the primary outcome; reporting the treatment effect (i.e., experimental value − control value); and reporting, within the discussion, an assessment of the clinical importance of the results of the primary outcome through an interpretation of the results based on the delta value specified in the sample size calculation. As methodological attributes are relevant to the primary outcome, this analysis was restricted to randomized questions, stratified by RCT type (non-inferiority or superiority). Therefore, studies that involved multiple primary outcomes (e.g., 2 × 2 factorial trials) would have more than one randomized question.
Clinical significance

Clinical significance was determined based on the guidelines proposed by Man-Son-Hing et al. [10], which consider the relationship between the MCID, the treatment effect, and its CI, and assign one of the following four levels (Fig. 1): (1) Definite – the MCID is smaller than the lower limit of the CI of the treatment effect; (2) Probable – the MCID is greater than the lower limit of the CI of the treatment effect, but smaller than the treatment effect; (3) Possible – the MCID is less than the upper limit of the CI of the treatment effect, but greater than the treatment effect; and (4) Definitely Not – the MCID is greater than the upper limit of the CI of the treatment effect. The CI should be based on the α value specified in the sample size calculation.

[Fig. 1 Relationship between clinical significance and statistical significance (adapted from Man-Son-Hing et al. [10]). The figure plots the treatment effect and its confidence interval against the MCID on a harm-benefit axis and maps each of the four levels (Definite, Probable, Possible, Definitely Not) to statistical significance (yes/no).]
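The scheme above can be expressed compactly. The following is a minimal Python sketch of the four-level classification, assuming benefit is coded as a positive absolute difference (experimental minus control) and that the delta value from the sample size calculation stands in for the MCID, as in this review; the function names and inputs are illustrative only, and the review's own analyses were performed in SAS.

    # Minimal sketch of the four-level scheme adapted from Man-Son-Hing et al. [10],
    # assuming benefit is a positive absolute difference (experimental minus control)
    # and that the delta value stands in for the MCID.
    def classify_clinical_significance(effect: float, ci_low: float, ci_high: float,
                                       mcid: float) -> str:
        """Classify a treatment effect and its confidence interval against the MCID."""
        if mcid < ci_low:        # even the lower CI limit exceeds the MCID
            return "Definite"
        if mcid < effect:        # MCID falls between the lower CI limit and the point estimate
            return "Probable"
        if mcid < ci_high:       # MCID falls between the point estimate and the upper CI limit
            return "Possible"
        return "Definitely Not"  # the entire CI lies below the MCID

    def is_statistically_significant(ci_low: float, ci_high: float) -> bool:
        """Statistical significance at the alpha used to construct the CI: the CI excludes zero."""
        return ci_low > 0 or ci_high < 0

For the harm direction shown in Fig. 1, the same logic applies with the signs reversed.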
We restricted this analysis to randomized questions in superiority trials because these guidelines are intended to be applied to superiority trials. For each study, the delta value was assumed to be the MCID, irrespective of whether this was explicitly stated by the authors. This assumption was applied in an earlier study by Chan et al. [13], and is a pragmatic approach in the context of pediatric oncology. It is based on the premise that the delta value must be reflective of the MCID to the extent that it will result in strong evidence to change standard of care, while also being feasible to achieve in a rare disease population. A traditional approach of surveying clinicians and patients to determine an MCID is not realistic in the scope of rare diseases, and thus the delta value must follow a definition that is pragmatic yet clinically relevant and evidence-based. This analysis was restricted to randomized questions to account for studies with multiple primary outcomes. Additionally, only randomized questions where the treatment effect and its CI were reported or could be calculated were included.

Statistical analysis

When the CI of the treatment effect was not reported, it was determined for each randomized question with the methodology outlined by Hackshaw [26] for dichotomous primary outcomes and by Altman and Andersen [27] for time-to-event primary outcomes. The CI of the treatment effect for time-to-event outcomes could be calculated only if the CIs associated with the experimental and control estimates were reported or a Kaplan-Meier curve with patients at risk was reported. The time point specified in the sample size calculation was used; if it could not be inferred from a Kaplan-Meier curve, or was not reported, the reported time point was used. If the aforementioned information was not provided, the CI for the treatment effect could not be calculated and the randomized question was excluded from this analysis. The treatment effect CI was calculated based on the design α value if this was reported in the sample size calculation; if not, an alpha of 0.05 was assumed. In the event the primary outcome of the sample size calculation included both an absolute and a relative difference, the absolute difference was used. The level of concordance between statistical and clinical significance was assessed through descriptive statistics. Descriptive statistics were also calculated to assess the reporting of methodological attributes associated with clinical significance. SAS (Statistical Analysis Software) version 9.4 (SAS Institute, Cary, NC, USA) was used to perform all analyses.
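The referenced formulas are standard normal-approximation intervals. The sketch below illustrates the dichotomous case (difference of two independent proportions) and an analogous time-to-event case built from the standard errors of two Kaplan-Meier estimates at a fixed time point; it is a simplified illustration with hypothetical numbers, not the SAS code used in the review, and it assumes the per-arm standard errors can be recovered from the reported CIs or curves.

    # Illustrative normal-approximation confidence intervals for a treatment effect
    # (experimental minus control). Hypothetical inputs; not the review's SAS code.
    from math import sqrt
    from statistics import NormalDist

    def risk_difference_ci(p_exp, n_exp, p_ctrl, n_ctrl, alpha=0.05):
        """CI for the difference of two independent proportions (dichotomous outcome)."""
        diff = p_exp - p_ctrl
        se = sqrt(p_exp * (1 - p_exp) / n_exp + p_ctrl * (1 - p_ctrl) / n_ctrl)
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return diff, diff - z * se, diff + z * se

    def survival_difference_ci(s_exp, se_exp, s_ctrl, se_ctrl, alpha=0.05):
        """CI for the difference of two survival proportions at a fixed time point,
        given each arm's Kaplan-Meier estimate and its standard error."""
        diff = s_exp - s_ctrl
        se = sqrt(se_exp ** 2 + se_ctrl ** 2)   # arms assumed independent
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return diff, diff - z * se, diff + z * se

    # Hypothetical example: 5-year EFS of 76% (SE 3%) vs 70% (SE 3%)
    # gives an effect of 0.06 with a 95% CI of roughly -0.02 to 0.14.
    print(survival_difference_ci(0.76, 0.03, 0.70, 0.03))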
Results

Our search identified 3750 unique studies from MEDLINE, EMBASE, and the Cochrane Childhood Cancer Group Specialized Register accessed through CENTRAL. Following title and abstract screening, 406 studies were evaluated for eligibility based on full-text review. Of these, 329 studies were excluded and 77 studies were included in the systematic review (Fig. 2) (Additional file 1: Appendix B). Table 1 summarizes the characteristics of the included studies.

[Fig. 2 Selection of randomized controlled trials in the systematic review]

Table 1 Characteristics of 77 included studies, by non-inferiority and superiority trials. Values are n (%) for non-inferiority trials (N = 11) and superiority trials (N = 66), respectively.

Journal of publication
  Journal of Clinical Oncology: 3 (27.3) | 25 (37.9)
  Blood: 2 (18.2) | 10 (15.2)
  Pediatric Blood & Cancer: 0 (0.0) | 6 (9.1)
  Leukemia: 1 (9.1) | 4 (6.1)
  Lancet: 2 (18.2) | 1 (1.5)
  Cancer: 0 (0.0) | 4 (6.1)
  Lancet Oncology: 1 (9.1) | 3 (4.5)
  New England Journal of Medicine: 1 (9.1) | 2 (3.0)
  Other: 1 (9.1) | 11 (16.7)
Region of publication
  Europe: 6 (54.5) | 23 (34.8)
  North America: 4 (36.4) | 41 (62.1)
  Other: 1 (9.1) | 2 (3.0)
Year of publication
  1976 to 1989: 1 (9.1) | 4 (6.1)
  1990 to 2003: 4 (36.4) | 28 (42.4)
  2004 to 2016: 6 (54.5) | 34 (51.5)
Source of funding
  Non-industry: 10 (90.9) | 56 (84.8)
  Industry and non-industry: 0 (0.0) | 2 (3.0)
  Not stated: 1 (9.1) | 8 (12.1)
Study participants
  Exclusively children: 7 (63.6) | 41 (62.1)
  Adults included: 4 (36.4) | 25 (37.9)
Disease site
  Hematological: 7 (63.6) | 43 (65.2)
  Central nervous system tumor: 1 (9.1) | 11 (16.7)
  Non-central nervous system solid tumor: 3 (27.3) | 12 (18.2)
RCT study design
  2 × 2 factorial: 0 (0.0) | 4 (6.1)
  Greater than 2 arms: 1 (9.1) | 5 (7.6)
  Two-armed: 10 (90.9) | 57 (86.3)
RCT trial group
  POG: 1 (9.1) | 15 (22.7)
  CCG: 1 (9.1) | 12 (18.2)
  COG: 0 (0.0) | 8 (12.1)
  BFM: 1 (9.1) | 9 (13.6)
  UK MRC: 1 (9.1) | 4 (6.1)
  Other: 7 (63.6) | 18 (27.3)
Outcome
  Time-to-event: 10 (90.9) | 56 (84.8)
  Dichotomous: 1 (9.1) | 10 (15.2)
Intervention in question
  Chemotherapy: 9 (81.8) | 57 (86.4)
  Multimodal therapy: 0 (0.0) | 2 (3.0)
  Hemopoietic stem-cell transplant: 1 (9.1) | 6 (9.1)
  Radiation therapy: 1 (9.1) | 1 (1.5)
RCT randomized controlled trial, POG Pediatric Oncology Group, CCG Children's Cancer Group, COG Children's Oncology Group, BFM Berlin-Frankfurt-Münster Study Group, UK MRC United Kingdom Medical Research Council

Table 2 summarizes the methodological attributes relevant to assessing clinical significance for all included studies, by randomized question and stratified by RCT type (non-inferiority and superiority RCT, herein, respectively). Only 4.2% (15.4 and 2.4%) of randomized questions explicitly identified that the delta value was based on the MCID, while 22.1% (76.9 and 13.4%) of randomized questions discussed the clinical importance in relation to the delta value specified in their sample size calculation. The majority (95.6% overall; 100.0 and 95.1%) of randomized questions reported the delta value in the sample size calculation as an absolute value and the minority in relative terms (e.g., relative risk reduction, relative hazard rate, etc.). About three quarters (76.8% overall; 76.9 and 76.8%) reported the estimate assumed for the control group, of which only 18.9% (46.2 and 14.6%) reported justification for why the estimate was assumed. The statistical significance of the primary outcome was reported using a p value in 83.2% (100.0 and 80.5%) of randomized questions, while over half (67.4% overall; 92.3 and 63.4%) reported CIs or standard error bars for the experimental and control values. The majority of studies reported type 1 and type 2 errors in their sample size calculations; however, only 12.6% (46.2 and 7.3%) reported the treatment effect in the results.
Table 2 Methodological attributes relevant to interpretation of study results from a clinical perspective for 95 randomized questions, by non-inferiority and superiority trials. Values are n (%b) for non-inferiority trials (N = 13) and superiority trials (N = 82), respectively.

Methods
  Expected magnitude of difference identified explicitly as the MCIDc: 2 (15.4) | 2 (2.4)
  Justification for MCIDa
    Clinical relevance: 1 (50.0) | 1 (50.0)
    Methodological: 1 (50.0) | 1 (50.0)
Delta value
  Stated as an absolute difference: 13 (100.0) | 78 (95.1)
    Margin (median, IQR): −0.10 (−0.10, 0.10) | 0.12 (0.10, 0.17)
    Time-to-event: 12 (92.3) | 69 (88.5)
    Dichotomous outcome: 1 (7.7) | 9 (11.5)
  Stated as a relative difference: 0 (0.0) | 7 (8.5)
    Margin (median, IQR): N/A | 0.63 (0.60, 2.50)
    Time-to-event: N/A | 6 (85.7)
    Dichotomous outcome: N/A | 1 (14.3)
  Stated as a percentage and ratio: 0 (0.0) | 4 (4.9)
  Anticipated control value stated: 10 (76.9) | 63 (76.8)
  Assumptions in the control group stated: 6 (46.2) | 12 (14.6)
    Results from previous trial or systematic review: 5 (83.3) | 11 (91.7)
    Based on clinical expertise: 1 (16.7) | 1 (8.3)
Type 1 error (α value)
  Stated: 10 (76.9) | 52 (63.4)
    0.20: 0 (0.0) | 1 (1.9)
    0.10: 2 (20.0) | 2 (3.8)
    0.05: 8 (80.0) | 49 (94.2)
  Sides stated: 12 (92.3) | 40 (48.8)
    One-sided: 10 (83.3) | 29 (72.5)
    Two-sided: 2 (16.7) | 11 (27.5)
Power (1 − β value)
  Stated: 12 (92.3) | 81 (98.8)
    < 80%: 2 (16.7) | 7 (8.6)
    80 to 85%: 6 (50.0) | 58 (71.6)
    85 to 90%: 1 (8.3) | 7 (8.6)
    ≥ 90%: 3 (25.0) | 9 (11.1)
Results
  Statistical significance of primary outcome reported via p value: 13 (100.0) | 66 (80.5)
  Confidence intervals/standard error for primary outcome reported: 12 (92.3) | 52 (63.4)
  Treatment effect stated: 6 (46.2) | 6 (7.3)
Discussion (and/or Results)
  Clinical importance of primary outcome discussed: 10 (76.9) | 11 (13.4)
aMCID minimally clinically important difference, IQR interquartile range. bPercentages may not sum to 100% due to rounding. cAssumed to be the delta value from the sample size calculation.

Table 3 and Fig. 3 summarize the level of clinical significance in superiority trials, determined by examining the relationship between the MCID and the treatment effect with its associated CI, in relation to the reporting of statistically significant findings for the superiority RCTs that satisfied the criteria. Of the 47 randomized questions that reported statistically non-significant findings, 25.5% (n = 12) were found to have possible clinical significance. Of the 24 randomized questions that reported statistically significant findings, 8.3% (n = 2) were found to have clinical significance categorized as "Definitely Not," 83.4% (n = 20) as "Probable or Possible," and 8.3% (n = 2) as "Definite." Of the total 71 randomized questions, only 2.8% (n = 2) were found to have definite clinical importance, while 45.1% (n = 32) were found to have "Probable or Possible" clinical significance and the remaining 52.1% (n = 37) were "Definitely Not" clinically significant.

Table 3 Relationship between statistical significance and clinical significance in superiority randomized controlled trials consisting of 71 randomized questions. Values are N (%) of randomized questions that were not statistically significant (n = 47), statistically significant (n = 24), and total (N = 71), respectively.

  Definite: 0 (0.0) | 2 (8.3) | 2 (2.8)
  Probable: 0 (0.0) | 7 (29.2) | 7 (9.9)
  Possible: 12a (25.5) | 13 (54.2) | 25 (35.2)
  Definitely Not: 35 (74.5) | 2b (8.3) | 37 (52.1)
aTwo randomized questions were included where the confidence interval of the treatment effect was based on an alpha of 0.05 although the sample size calculation stated an alpha of 0.10; insufficient information was reported, which precluded calculating the 90% confidence interval. bStatistically significant solely because the direction of the effect was related to harm as opposed to benefit.

[Fig. 3 Relationship between statistical significance and clinical significance in superiority randomized controlled trials (RCTs); footnotes a and b as in Table 3]
Discussion

In this systematic review, we demonstrated that only a minority of the 77 RCTs (11 non-inferiority and 66 superiority RCTs) in the published pediatric oncology treatment literature reported methodological attributes related to clinical significance. A notable portion of RCTs reporting statistically non-significant results were found to have possible clinical significance, and discordance was likewise observed among those reporting statistically significant results.

Strengths and weaknesses

The strengths of this study stem from the inclusion of all pediatric oncology RCTs, from database inception to July 2016, that evaluated a range of cancer treatments for various cancer types. To our knowledge, this is the first study to assess clinical significance in the rare disease context of pediatric cancer RCTs. A limitation is that the search was restricted to studies in English and was inclusive of the published literature, and was thus prone to language and publication bias. The assessment of clinical significance was based on the delta value in the published report and not the trial protocol; therefore, it is possible that information recorded as not reported was reported in the trial protocol. However, the parameters used in this review comprise CONSORT-mandated items and thus should be reported in the published report. A limitation associated with assessing clinical significance as per Man-Son-Hing et al. [10] is that the weight given to each level of clinical significance will vary depending on the research question; the definitions applied to classify the level of clinical significance are somewhat arbitrary and not meant to be adhered to strictly. For example, in the context of a disease with no available effective treatments, a treatment found to be statistically significant but that only shows possible clinical significance would likely warrant implementation in the clinical setting. This is of particular relevance to pediatric cancer, where some cancer subtypes still have dismal survival and high relapse rates. Conversely, demonstrating superiority to a well-established treatment would require adherence to a strict definition of clinical importance, because changes to recommended treatment should not proceed unless definite clinical significance is demonstrated. In interpreting the study findings, it is also important to note that only 13 of the 95 randomized questions were from non-inferiority trials, representing a small minority. Thus, it is critical that the assessments of these non-inferiority trials not be over-interpreted or generalized to the superiority trials.

Comparison with existing literature

Based on our analysis, the majority (77.9%) of randomized questions did not describe the clinical importance of their findings in relation to the delta value of the interventions in question. A minority (4.2%) of randomized questions explicitly identified the delta value as the MCID with justification. These findings are in line with the limited number of previous studies investigating clinical significance reporting, wherein under-reporting was found [13, 19, 20, 28, 29]. Chan et al. [13] found that, among a random sample of 27 RCTs in major medical journals, 20 articles included sample size calculations, 90% of which reported a delta value, but only 11% stated that the delta value was chosen to reflect the MCID of the intervention. Study results were interpreted from the perspective of clinical importance in 20 of 27 (74%) articles, with only one article discussing clinical importance in relation to a reported sample size MCID. In a review of 57 dementia drug RCTs, Molnar et al. [29] found that 46% discussed the clinical significance of their results, and no studies used formally derived MCIDs. These results are in line with our review findings.

Study explanations and implications

In our study, the failure to incorporate an MCID into study design and/or to state whether or not the delta value in the sample size calculation was based on the MCID could, in part, be attributed to poor reporting in combination with the difficulty of achieving a reasonable sample size, even when recruiting patients from multiple institutions and over a long duration of time [30–32]. The formidable challenges of conducting pediatric oncology trials include parent and physician reluctance to involve children in trials, difficulties obtaining consent, permission and assent for study participation, and the dedicated time and attention required to educate children and families, not to mention the remarkably complex logistics of involving multiple sites in multiple countries and the stringent safety monitoring required [33, 34]. Therefore, sample size calculations are perhaps often based on study feasibility, and a larger delta value is chosen to reduce the sample size required for the study. This has the potential to place patients at risk of participating in a trial that might lead to erroneous conclusions based on flawed study design, and risks the mismanagement of precious time and resources [35]. Oftentimes, studies report that the intervention groups do not differ when in actuality they lacked sufficient power to support this claim, as well as to detect a clinically meaningful treatment effect [11, 36–39]. This is concerning in our review, wherein about two thirds of the randomized questions in superiority trials were not statistically significant, yet 25.5% were found to have probable or possible clinical significance.
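To illustrate this discordance with a worked, entirely hypothetical example: a superiority question powered for an MCID of 10 percentage points that reports a treatment effect of 6 points with a 95% CI of -2 to 14 points is not statistically significant, because the CI crosses zero, yet it is possibly clinically significant, because the upper CI limit exceeds the MCID.

    # Hypothetical illustration of a statistically non-significant but possibly
    # clinically significant result (all numbers invented).
    mcid = 0.10                                    # delta value from the sample size calculation
    effect, ci_low, ci_high = 0.06, -0.02, 0.14    # reported treatment effect and 95% CI

    statistically_significant = ci_low > 0 or ci_high < 0
    if mcid < ci_low:
        clinical = "Definite"
    elif mcid < effect:
        clinical = "Probable"
    elif mcid < ci_high:
        clinical = "Possible"
    else:
        clinical = "Definitely Not"

    print(statistically_significant, clinical)     # False Possible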
Arguably, it is unreasonable to expect RCTs to be powered purely on the basis of the MCID with disregard for feasibility issues; however, trials would be improved if they were powered with consideration of the MCID as well as feasibility, as recommended in the CONSORT Statement [11]. This point is relevant in rare disease trials, where an MCID purely reflective of clinician and patient preferences is not realistic. Rather, the MCID in a rare disease context is perhaps best determined by weighing the evidence in the literature and/or pilot studies, clinical expertise, and patient preferences in combination with feasibility. Careful evaluation of the aforementioned would help ensure that evidence-based decisions to change clinical practice are supported.

As Cook et al. [24] state, improved standards in both RCT sample size calculations and the reporting of these calculations could assist health care professionals, patients, researchers and funders to judge the strength of the available evidence and ensure responsible use of scarce resources. Without explicit discussion of the treatment effect in relation to the MCID, we allow for subjective interpretation of trial results based solely on whether or not the results were statistically significant. Statistical significance determined by a p value only provides information on whether a significant difference exists and does not provide information on the direction and size of the effect [14].
For instance, if a decision-maker relies solely on statistical significance, they are only able to infer that an experimental treatment is, or is not, significantly different from the control, based on the power of the study. However, if a decision-maker relies on clinical significance, which also provides information on statistical significance, they can infer the direction and size of this difference in relation to an MCID (by comparing where the treatment effect and its CI fall in relation to the MCID). The latter approach provides greater utility because a decision-maker can ascertain how harmful or beneficial an experimental treatment is in comparison to the control and assess the value of a trial with greater confidence, whether deemed statistically significant or not.

A study might be statistically significant based on an arbitrary delta value, and conclusions might be drawn without any consideration of the precision of the treatment effect and whether it was clinically meaningful. This was demonstrated in our study, where the studies found to be statistically significant but definitely not clinically significant were so because the direction of the effect was related to harm as opposed to benefit, which cannot be ascertained solely through a p value. Additionally, a notable portion of studies found to be statistically non-significant were found to have possible clinical significance, which demonstrates, as stated in the CONSORT Statement, that statistically non-significant results do not preclude potentially clinically meaningful findings.

In addition to designing a study based on a delta value reflective of an MCID, it is necessary to provide justification for the MCID, the experimental value and the control value, which our study revealed were only reported by a minority of studies. Reporting this justification allows the reader to apply the appropriate weight to the authors' conclusions, as values based on systematic reviews or meta-analyses will carry more weight than values based on the research team's expertise. It is also essential for the treatment effect to be interpreted with the precision of the CI in mind [10, 14]. For instance, a treatment effect may be within the MCID, but due to inadequate power from an inaccurate sample size calculation, the precision of this estimate may be weak and thus the findings should be graded as low evidence. CIs should be reported for the experimental and control values as well as the treatment effect, as recommended by the CONSORT Statement [11].

Recommendations

Given our study results and the implications discussed, in Table 4 we propose recommendations, adapted from and informed by Cook et al. [24], Moher et al. [11], and Koynova et al. [25], to promote the incorporation of clinical significance into RCT design and interpretation. Our results also raise the question of whether clinical significance is under-utilized and poorly reported in other rare disease contexts wherein obtaining an adequate study sample challenges feasibility. Moreover, research and knowledge translation efforts are required to raise awareness and understanding of the importance of clinical significance.

Table 4 Recommendations for incorporating clinical significance into randomized controlled trial design and interpretation (adapted from and informed by Cook et al. [24], Moher et al. [11], and Koynova et al. [25])

1. Conduct a comprehensive review of the literature to identify the MCID. If the RCT is completely novel, use preliminary pilot data to inform the MCID.
2. Perform a sample size calculation using a delta value that is based on the MCID. If the resulting sample size is not feasible given resource constraints, adjust the delta value to reduce the required sample size while remaining clinically meaningful.
3. When reporting the results of an RCT, ensure the following are reported in the sample size calculation:
   - Type 1 error (α value), including whether the p value is one- or two-sided
   - Type 2 error (β value); power of at least 80% is recommended
   - Estimated control value and justification
   - Estimated experimental value and justification
   - Delta value in absolute terms and justification of the treatment effect
   - Explicit identification of the primary outcome when multiple outcomes are being investigated
4. Calculate and report confidence intervals for the experimental and control values as well as the treatment effect.
5. Interpret the treatment effect and its confidence interval in relation to the MCID and place weight on conclusions based on the precision determined by the confidence interval.
6. Ensure conclusions reflect the quality of the trial based on the recommendations of the CONSORT Statement.
MCID minimally clinically important difference, RCT randomized controlled trial, CONSORT Consolidated Standards of Reporting Trials

Additional file

Additional file 1: Comprehensive search strategy data (DOCX 66 kb)
Abbreviations

CI: Confidence interval; CONSORT: Consolidated Standards of Reporting Trials; MCID: Minimal clinically important difference; PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses; RCT: Randomized controlled trial

Acknowledgements

The authors gratefully acknowledge the work of Kelly Newton in conducting literature searches and collecting data for this review.

Availability of data and materials

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

Authors' contributions

AFH, OAS, and HH conceived and designed the study. HH collected and analyzed the data. AFH and HH wrote the first drafts of the manuscript, and OAS, SRR, and KG contributed to subsequent drafts. All authors had full access to all of the data in the review and take responsibility for the integrity of the data and the accuracy of the data analysis. All authors read and approved the final manuscript.

Ethics approval

This systematic review did not require ethical approval because no participants were involved, nor was identifiable information collected.

Consent for publication

Not applicable.

Competing interests

The authors certify that they have no affiliations with, or involvement in, any organization or entity with any financial interest, or non-financial interest, in the subject matter or materials discussed in this manuscript.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access

© The Author(s) 2018. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided appropriate credit is given to the original author(s) and the source, a link to the Creative Commons license is provided, and any changes made are indicated.

Author details

*Correspondence: fuchsia.howard@ubc.ca
1School of Nursing, The University of British Columbia, T201-2211 Wesbrook Mall, Vancouver, BC V6T 2B5, Canada. 2Department of Radiation Oncology, BC Cancer, Vancouver, BC V5Z 4E6, Canada. 3Division of Hematology/Oncology, BC Children's Hospital, Vancouver, BC V6H 3N1, Canada. 4Division of Plastic Surgery, QEII Health Sciences Centre, Halifax, NS B3H 3A7, Canada. 5Epi Methods Consulting, Toronto, ON M5V 0C4, Canada.

Received: 4 June 2018. Accepted: 19 September 2018.

References

1. Canadian Cancer Society's Advisory Committee on Cancer Statistics. Canadian Cancer Statistics 2015. Toronto: Canadian Cancer Society; 2015.
2. American Cancer Society. Cancer Facts & Figures 2017. Atlanta: American Cancer Society; 2017.
3. O'Leary M, Krailo M, Anderson JR, et al. Progress in childhood cancer: 50 years of research collaboration, a report from the Children's Oncology Group. Semin Oncol. 2008;35:484–93.
4. Bleyer A, Budd T, Montello M. Adolescents and young adults with cancer: the scope of the problem and criticality of clinical trials. Cancer. 2006;107(7 Suppl):1645–55.
5. Joseph PD, Craig JC, Tong A, et al. Researchers', regulators', and sponsors' views on pediatric clinical trials: a multinational study. Pediatrics. 2016;138(4):e20161171.
6. Pritchard-Jones K, Lewison G, Camporesi S, et al. The state of research into children with cancer across Europe: new policies for a new decade. Ecancermedicalscience. 2011;5:1–80.
7. Akobeng AK. Confidence intervals and p-values in clinical decision making. Acta Paediatr. 2008;97(8):1004–7.
8. Vora A, Goulden N, Wade R, et al. Treatment reduction for children and young adults with low-risk acute lymphoblastic leukaemia defined by minimal residual disease (UKALL 2003): a randomised controlled trial. Lancet Oncol. 2013;14(3):199–209.
9. Yu AL, Gilman AL, Ozkaynak MF, et al. Anti-GD2 antibody with GM-CSF, interleukin-2, and isotretinoin for neuroblastoma. N Engl J Med. 2010;363(14):1324–34.
10. Man-Son-Hing M, Laupacis A, O'Rourke K, et al. Determination of the clinical importance of study results. J Gen Intern Med. 2002;17(6):469–76.
11. Moher D, Hopewell S, Schulz KF, et al. CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. Int J Surg. 2012;10(1):28–55.
12. Jaeschke R, Singer J, Guyatt GH. Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials. 1989;10(4):407–15.
13. Chan KB, Man-Son-Hing M, Molnar FJ, et al. How well is the clinical importance of study results reported? An assessment of randomized controlled trials. CMAJ. 2001;165(9):1197–202.
14. Ferrill MJ, Brown DA, Kyle JA. Clinical versus statistical significance: interpreting P values and confidence intervals related to measures of association to guide decision making. J Pharm Pract. 2010;23(4):344–51.
15. David MC. How to make clinical decisions from statistics. Clin Exp Optom. 2006;89(3):176–83.
16. Pocock SJ, Hughes MD, Lee RJ. Statistical problems in the reporting of clinical trials. A survey of three medical journals. N Engl J Med. 1987;317(7):426–32.
17. Gardner MJ, Altman DG. Confidence intervals rather than P values: estimation rather than hypothesis testing. Br Med J (Clin Res Ed). 1986;292(6522):746–50.
18. Bland JM, Peacock JL. Interpreting statistics with confidence. The Obstetrician & Gynaecologist. 2002;4(3):176–80.
19. Hoffmann TC, Thomas ST, Shin PN, et al. Cross-sectional analysis of the reporting of continuous outcome measures and clinical significance of results in randomized trials of non-pharmacological interventions. Trials. 2014;15:362.
20. van Tulder M, Malmivaara A, Hayden J, et al. Statistical significance versus clinical importance: trials on exercise therapy for chronic low back pain as example. Spine. 2007;32(16):1785–90.
21. Moher D, Liberati A, Tetzlaff J, et al. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009;6(7):e1000097.
22. CADTH. Strings attached: CADTH database search filters [Internet]. https://www.cadth.ca/resources/finding-evidence. Accessed 26 Sept 2018.
23. Leclercq E, Leeflang MM, van Dalen EC, et al. Validation of search filters for identifying pediatric studies in PubMed. J Pediatr. 2013;162(3):629–634.e2.
24. Cook JA, Hislop J, Altman DG, et al. Specifying the target difference in the primary outcome for a randomised controlled trial: guidance for researchers. Trials. 2015;16:12.
25. Koynova D, Lühmann R, Fischer R. A framework for managing the minimal clinically important difference in clinical trials. Ther Innov Regul Sci. 2013;47(4):447–54.
26. Hackshaw A. Statistical formulae for calculating some 95% confidence intervals. In: A concise guide to clinical trials. Oxford: Wiley-Blackwell; 2009. p. 205–7.
27. Altman DG, Andersen PK. Calculating the number needed to treat for trials where the outcome is time to an event. BMJ. 1999;319(7223):1492.
28. Castellini G, Gianola S, Bonovas S, et al. Improving power and sample size calculation in rehabilitation trial reports: a methodological assessment. Arch Phys Med Rehabil. 2016;97(7):1195–201.
29. Molnar FJ, Man-Son-Hing M, Fergusson D. Systematic review of measures of clinical significance employed in randomized controlled trials of drugs for dementia. J Am Geriatr Soc. 2009;57(3):536–46.
30. Caldwell PHY, Murphy SB, Butow PN, et al. Clinical trials in children. Lancet. 364(9436):803–11.
31. Estlin EJ, Ablett S. Practicalities and ethics of running clinical trials in paediatric oncology - the UK experience. Eur J Cancer. 2001;37(11):1394–8; discussion 1399–401.
32. Burke ME, Albritton K, Marina N. Challenges in the recruitment of adolescents and young adults to cancer clinical trials. Cancer. 2007;110(11):2385–93.
33. Joseph PD, Craig JC, Caldwell PH. Clinical trials in children. Br J Clin Pharmacol. 2015;79(3):357–69.
34. Berg SL. Ethical challenges in cancer research in children. Oncologist. 2007;12(11):1336–43.
35. Detsky AS. Using economic analysis to determine the resource consequences of choices made in planning clinical trials. J Chronic Dis. 1985;38(9):753–65.
36. Moher D, Dulberg CS, Wells GA. Statistical power, sample size, and their reporting in randomized controlled trials. JAMA. 1994;272(2):122–4.
37. Freiman JA, Chalmers TC, Smith H Jr, et al. The importance of beta, the type II error and sample size in the design and interpretation of the randomized control trial. Survey of 71 "negative" trials. N Engl J Med. 1978;299(13):690–4.
38. Charles P, Giraudeau B, Dechartres A, et al. Reporting of sample size calculation in randomised controlled trials: review. BMJ. 2009;338:b1732.
39. Yusuf S, Collins R, Peto R. Why do we need some large, simple randomized trials? Stat Med. 1984;3(4):409–22.
