Bayesian adjustments for disease misclassification in epidemiological studies of health administrative databases, with applications to multiple sclerosis research

UBC Theses and Dissertations

Högg, Tanja

Abstract

With disease information routinely established from diagnostic codes or prescriptions in health administrative databases, the topic of outcome misclassification is gaining importance in epidemiological research. Motivated by a Canada-wide observational study into the prodromal phase of multiple sclerosis (MS), this thesis considers the setting of a matched exposure-disease association study where the disease is measured with error.

We initially focus on the special case of a pair-matched case-control study. Assuming non-differential misclassification of study participants, we give a closed-form expression for asymptotic biases in odds ratios arising under naive analyses of misclassified data, and propose a Bayesian model to correct association estimates for misclassification bias. For identifiability, the model relies on information from a validation cohort of correctly classified case-control pairs, and also requires prior knowledge about the predictive values of the classifier. In a simulation study, the model shows improved point and interval estimates relative to the naive analysis, but is also found to be overly restrictive in a real data application.

In light of these concerns, we propose a generalized model for misclassified data that extends to the case of differential misclassification and allows for a variable number of controls per case. Instead of prior information about the classification process, the model relies on individual-level estimates of each participant's true disease status, which were obtained from a counting process mixture model of MS-specific healthcare utilization in our motivating example.

Lastly, we consider the problem of assessing the non-differential misclassification assumption in situations where the exposure is suspected to impact the classification accuracy of cases and controls, but information on the true disease status is unavailable. Motivated by the non-identified nature of the problem, we consider a Bayesian analysis and examine the utility of Bayes factors to provide evidence against the null hypothesis of non-differential misclassification. Simulation studies show that for a range of realistic misclassification scenarios, and under mildly informative prior distributions, posterior distributions of the exposure effect on classification accuracy exhibit sufficient updating to detect differential misclassification with moderate to strong evidence.

Item Metadata

Title	Bayesian adjustments for disease misclassification in epidemiological studies of health administrative databases, with applications to multiple sclerosis research
Creator	Högg, Tanja
Publisher	University of British Columbia
Date Issued	2018
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2018-11-26
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0374224
URI	http://hdl.handle.net/2429/67885
Degree (Theses)	Doctor of Philosophy - PhD
Program (Theses)	Statistics
Affiliation	Science, Faculty of; Statistics, Department of
Degree Grantor	University of British Columbia
Graduation Date	2019-02
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
