Open Collections
UBC Theses and Dissertations
A new data driven framework for simulating mendelian randomization data
Tinti Tomio, Giuseppe
Abstract
Mendelian randomization (MR) is a causal inference method that allows biostatisticians to leverage DNA measurements to study causal effects using only observational data. Recent advancements, including two-sample summary-level Mendelian randomization (TSSLMR) and the IEU OpenGWAS database, have lowered the barrier to conducting MR studies and opened the opportunity to mine causal effects. In the first part of the thesis, I show that there is a mismatch between the characteristics of modern TSSLMR data and the way articles proposing popular TSSLMR models conduct their simulations. Next, I propose a solution: a data-driven simulation framework for MR data that aims to be realistic, interpretable, and easy to use thanks to a complementary R package implementation. I then show that models perform far better in literature-based simulations than in more realistic simulations based on the proposed framework. Lastly, I warn that the mismatch between simulated and real data, together with these results, may lead researchers to hold overoptimistic expectations about model performance in real applications.
Item Metadata
Title | A new data driven framework for simulating mendelian randomization data
Creator | Tinti Tomio, Giuseppe
Publisher | University of British Columbia
Date Issued | 2023
Language | eng
Date Available | 2023-09-05
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0435751
Degree Grantor | University of British Columbia
Graduation Date | 2023-11
Scholarly Level | Graduate
Aggregated Source Repository | DSpace