Developing statistical methodologies to support embedding clinical trials within learning health systems

UBC Theses and Dissertations

Abstract

A learning health system (LHS) is a framework designed to generate and apply evidence for improving healthcare decisions and outcomes. In practice, bridging the gap from data generation to evidence is challenging, often constrained by the lack of statistical methodologies to utilize available LHS data optimally. This dissertation contributes to methodologies for embedding clinical trials into learning health systems, addressing challenges in adaptive trial designs, real-world data integration, and personalized medicine. First, we developed an R package, BayesAET, to evaluate Bayesian adaptive enrichment designs via simulation, allowing for early termination, flexible randomization, and incorporation of external information. This tool supports efficient trial designs tailored to subpopulations, facilitating more personalized treatment recommendations. Second, we investigated the properties of Bayesian group-sequential non-inferiority trials, highlighting trade-offs in power, sample size, and interoperability compared to frequentist designs. Using plasmode simulations based on a real-world dataset, the study demonstrated that Bayesian group-sequential designs reduce expected sample sizes significantly, but that the number of interim analyses must be planned carefully to optimize a specific design. We found that, in our test scenarios, response-adaptive randomization (RAR) resulted in more participants being randomized to the inferior treatment in group-sequential non-inferiority trials, emphasizing the need for careful evaluation of RAR's appropriateness in such settings. Third, we introduced a novel method for efficiently constructing a prior distribution for a treatment effect given two datasets, one of which is subject to confounding due to missing covariates. This method demonstrates comparable performance to Bayesian MCMC approaches while significantly improving computational efficiency, making it practical for evolving LHS environments.
Lastly, we conducted a scoping review of methodologies for combining randomized and observational data to estimate conditional average treatment effects (CATE), and evaluated the performance of three representative approaches via simulation, highlighting their respective strengths and limitations depending on confounding levels, covariate overlap, and sample sizes. The scoping review and these comparisons provide practical guidance for researchers seeking to combine diverse data sources effectively in precision medicine research.

Item Metadata

Title	Developing statistical methodologies to support embedding clinical trials within learning health systems
Creator	Zhan, Denghuang
Supervisor	Wong, Hubert; Vila-Rodriguez, Fidel
Publisher	University of British Columbia
Date Issued	2025
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2025-04-23
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0448505
URI	http://hdl.handle.net/2429/90720
Degree (Theses)	Doctor of Philosophy - PhD
Program (Theses)	Population and Public Health
Affiliation	Medicine, Faculty of; Population and Public Health (SPPH), School of
Degree Grantor	University of British Columbia
Graduation Date	2025-05
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
