Open Collections
UBC Theses and Dissertations
Harnessing the power of causal inference and predictive analytics for survival outcomes with health administrative data : applications to tuberculosis research
Hossain, Md. Belal
Abstract
Background: Analyzing health administrative data is challenging for both causal inference and predictive analytics because the data are not collected for research purposes. Using tuberculosis (TB) as a motivating example, we compared different epidemiological methods for analyzing health administrative data with survival outcomes. We primarily focused on (i) evaluating the performance of the high-dimensional disease risk score (hdDRS) in addressing residual confounding bias in causal inference, (ii) assessing the benefits of repeated nested case-control (NCC) and matched-cohort analyses for estimating a stable hazard ratio (HR) with a time-dependent exposure in causal inference, (iii) developing LASSO-based survival prediction models with multiply imputed datasets, and (iv) determining the performance of the high-dimensional prediction model (hdPM) in enhancing risk prediction by supplementing unobserved clinical predictors.
Methods: We developed a retrospective cohort of immigrants in British Columbia, Canada, 1985-2019, using linked health administrative databases. Notable methods include (i) designing plasmode simulations to compare eight hdDRS methods that differed in the approach used to fit the risk score model; (ii) conducting extensive data analysis with Monte Carlo simulations, varying the number of controls with and without repeated sampling; (iii) conducting extensive data analysis and plasmode simulations to explore the performance of three distinct statistical methods: prediction average, performance average, and stacked; and (iv) conducting comprehensive plasmode simulations in scenarios where a strong or weak clinical predictor was unavailable to the analyst.
Results: Key methodological findings were that (i) hdDRS effectively reduced residual confounding bias, (ii) repeating NCC and matched-cohort analyses multiple times and pooling the estimates improved the stability of the HR estimate, (iii) LASSO models applied to stacked datasets produced the most robust survival predictions, and (iv) hdPM with LASSO shrinkage enhanced model accuracy by incorporating high-dimensional predictors.
Conclusions: Evidence was generated to overcome some limitations of prior studies analyzing health administrative data with survival outcomes. Specifically, we minimized residual confounding bias, stabilized estimates for time-dependent exposures, developed LASSO-based survival prediction models after addressing missing data using multiple imputation, and improved risk prediction by supplementing important unobserved clinical predictors. These methodological advancements have implications for the analysis and interpretation of health administrative databases in the TB literature and beyond.
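As a rough illustration of the pooling idea in finding (ii), the sketch below averages hazard ratio estimates from repeated analyses on the log scale. The abstract does not state which pooling rule the thesis uses, so the log-scale mean, the function name `pool_hazard_ratios`, and the example values are all assumptions made for illustration only.

```python
import math
from statistics import mean, stdev


def pool_hazard_ratios(hazard_ratios):
    """Pool HR estimates from repeated analyses by averaging on the log scale.

    Assumption: a simple unweighted mean of log-HRs. The thesis may use a
    different pooling rule; this is only an illustrative sketch.
    """
    if not hazard_ratios:
        raise ValueError("need at least one hazard ratio")
    if any(hr <= 0 for hr in hazard_ratios):
        raise ValueError("hazard ratios must be positive")

    log_hrs = [math.log(hr) for hr in hazard_ratios]
    pooled_log_hr = mean(log_hrs)
    # Spread of the log-HRs across repetitions: a rough indicator of how
    # unstable a single sampled analysis would be.
    spread = stdev(log_hrs) if len(log_hrs) > 1 else 0.0
    return math.exp(pooled_log_hr), spread


# Made-up HR estimates from five hypothetical repetitions of a sampled analysis.
example_hrs = [1.8, 2.3, 1.6, 2.1, 2.0]
pooled_hr, log_sd = pool_hazard_ratios(example_hrs)
print(f"pooled HR = {pooled_hr:.2f}, SD of log-HR = {log_sd:.2f}")
```

Averaging on the log scale treats an HR of 2.0 and an HR of 0.5 as equal and opposite, which is why the pooled value of that pair is 1.0 and not 1.25.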
Item Metadata
Title | Harnessing the power of causal inference and predictive analytics for survival outcomes with health administrative data : applications to tuberculosis research |
Creator | Hossain, Md. Belal |
Supervisor | |
Publisher | University of British Columbia |
Date Issued | 2025 |
Genre | |
Type | |
Language | eng |
Date Available | 2025-04-10 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0448347 |
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor | University of British Columbia |
Graduation Date | 2025-05 |
Campus | |
Scholarly Level | Graduate |
Rights URI | |
Aggregated Source Repository | DSpace |