Open Collections
UBC Theses and Dissertations
Hidden at the root : statistical methods for population size estimation on trees
Flynn, Mallory J.
Abstract
In many fields, populations of interest are hidden from data for a variety of reasons, though their magnitude remains important in determining resource allocation and appropriate policy. One popular approach to population size estimation, the multiplier method, is a back-calculation tool requiring only a marginal subpopulation size and an estimate of the proportion of the target population belonging to this subgroup. Another approach is to use Bayesian methods, which are inherently well-suited to incorporating multiple data sources. However, both methods have their drawbacks. A framework for applying the multiplier method which combines information from several known subpopulations has not yet been established; Bayesian models, though able to incorporate complex dependencies and various data sources, are difficult for researchers in less technical fields to design and implement. The growth of data collection and linkage across diverse fields suggests that accessible methods of estimating population size with synthesized data are needed. In public health and epidemiology, these linkages often admit a tree structure, with the target population represented by the root, and root-to-leaf paths representing pathways of care after a health event. In this thesis, we propose an extension to the well-known multiplier method which is applicable to tree-structured data, where multiple subpopulations and corresponding proportions combine to generate a population size estimate via the minimum variance estimator. The estimates given by this methodology are compared to those from a Bayesian hierarchical model, for both simulated and real-world data, the latter provided by BC's opioid overdose cohort, a tree-like data structure which tracks individuals along pathways of care after an overdose. Subsequent analysis elucidates which data are key to estimation in each method, and examines the robustness and feasibility of the methods.
Finally, two R packages have been developed to facilitate the use of these methods on similar applications. The first provides a straightforward method of estimating population size on tree-structured data with the modified multiplier methodology. The second provides functionality to automatically generate Bayesian model code for tree-structured data intended to be used for estimation with the MCMC sampler, JAGS, lowering the technical barrier of implementation.
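The back-calculation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's estimator or the API of either R package: the names `MultiplierEstimate`, `multiplier_estimate` and `combine_inverse_variance` are hypothetical. The sketch assumes the subgroup count is known exactly, uses a delta-method variance, and treats the separate estimates as independent, which generally does not hold for leaves of the same tree.

```python
from dataclasses import dataclass


@dataclass
class MultiplierEstimate:
    estimate: float  # estimated size of the hidden population
    variance: float  # approximate variance of that estimate


def multiplier_estimate(subgroup_count: int, proportion: float,
                        proportion_se: float) -> MultiplierEstimate:
    """Back-calculate N = M / p from a known subgroup size M and an
    estimated proportion p of the target population in that subgroup."""
    if not 0 < proportion <= 1:
        raise ValueError("proportion must lie in (0, 1]")
    estimate = subgroup_count / proportion
    # Delta method, treating M as fixed: Var(M / p) ~ M^2 * Var(p) / p^4.
    variance = subgroup_count ** 2 * proportion_se ** 2 / proportion ** 4
    return MultiplierEstimate(estimate, variance)


def combine_inverse_variance(estimates: list[MultiplierEstimate]) -> MultiplierEstimate:
    """Combine estimates with weights proportional to 1 / variance.

    Assumption: the estimates are independent and unbiased. Under that
    assumption this weighting minimises the variance of a linear combination.
    """
    weights = [1.0 / e.variance for e in estimates]
    total = sum(weights)
    combined = sum(w * e.estimate for w, e in zip(weights, estimates)) / total
    return MultiplierEstimate(combined, 1.0 / total)


# Illustrative numbers only; they are not taken from the thesis.
first = multiplier_estimate(subgroup_count=200, proportion=0.10, proportion_se=0.01)
second = multiplier_estimate(subgroup_count=450, proportion=0.25, proportion_se=0.05)
pooled = combine_inverse_variance([first, second])
```

The pooled estimate lies between the two single-subgroup estimates and has a smaller approximate variance than either, which is the motivation for combining several known subpopulations rather than relying on one.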
Item Metadata
Title | Hidden at the root : statistical methods for population size estimation on trees
Creator | Flynn, Mallory J.
Publisher | University of British Columbia
Date Issued | 2023
Language | eng
Date Available | 2024-10-31
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0437150
Degree Grantor | University of British Columbia
Graduation Date | 2023-11
Scholarly Level | Graduate
Aggregated Source Repository | DSpace