UBC Theses and Dissertations


Developing statistical methodologies to support embedding clinical trials within learning health systems

Zhan, Denghuang

Abstract

A learning health system (LHS) is a framework designed to generate and apply evidence for improving healthcare decisions and outcomes. In practice, bridging the gap from data generation to evidence is challenging and is often constrained by a lack of statistical methodologies that make the best use of available LHS data. This dissertation contributes methodologies for embedding clinical trials into learning health systems, addressing challenges in adaptive trial designs, real-world data integration, and personalized medicine. First, we developed an R package, BayesAET, to evaluate Bayesian adaptive enrichment designs via simulation, allowing for early termination, flexible randomization, and incorporation of external information. This tool supports efficient trial designs tailored to subpopulations, facilitating more personalized treatment recommendations. Second, we investigated the properties of Bayesian group-sequential non-inferiority trials, highlighting trade-offs in power, sample size, and interpretability compared to frequentist designs. Using plasmode simulations based on a real-world dataset, we showed that Bayesian group-sequential designs significantly reduce expected sample sizes, but that the number of interim analyses must be planned carefully to optimize a specific design. We also found that, in our test scenarios, response-adaptive randomization (RAR) resulted in more participants being randomized to the inferior treatment in group-sequential non-inferiority trials, emphasizing the need to evaluate carefully whether RAR is appropriate in such settings. Third, we introduced a novel method for efficiently constructing a prior distribution for a treatment effect given two datasets, one of which is subject to confounding due to missing covariates. This method demonstrates comparable performance to Bayesian MCMC approaches while significantly improving computational efficiency, making it practical for evolving LHS environments.
Lastly, we conducted a scoping review of methodologies for combining randomized and observational data to estimate Conditional Average Treatment Effects (CATE), and evaluated the performance of three representative approaches via simulation, highlighting their unique strengths and limitations depending on confounding levels, covariate overlap, and sample sizes. The scoping review and these comparisons provide practical guidance for researchers seeking to combine diverse data sources effectively in precision medicine research.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International