UBC Theses and Dissertations

Bayesian causal inference for discrete data

Daly-Grafstein, Daniel

Abstract

Causal inference provides a framework for estimating how a response changes when a given cause of interest changes. When all data are discrete, we can use saturated nonparametric models, which specify a unique parameter for every possible combination of treatments and confounders when estimating an outcome, to avoid unnecessary assumptions in our causal inference modelling. Bayesian methods allow us to incorporate prior information into these saturated models, making them usable beyond simple settings with low-dimensional confounders. In this thesis we propose two new nonparametric Bayes methods for causal inference based on saturated modelling. The first method combines a parametric model with a nonparametric saturated outcome model to estimate treatment effects in observational studies with longitudinal data. By conceptually splitting the data, we can combine these models while maintaining a conjugate framework, allowing us to avoid the use of Markov chain Monte Carlo methods. Approximations using the central limit theorem and random sampling allow our method to scale to high-dimensional confounders. The second method uses prior restrictions on the parameter space of a saturated model to partially identify causal effect estimates in scenarios with nonignorable missing outcome data. We focus on two common restrictions, instrumental variables and the direction of missing data bias, and investigate how these restrictions narrow the identification region for parameters of interest. Additionally, we propose a rejection sampling algorithm that allows us to quantify the evidence for these assumptions in the data. Saturated models require discrete data, so continuous data must be discretized before these methods can be used, which can introduce residual confounding. We conclude by proposing a new soft-thresholding technique to discretize continuous confounders in the context of frequentist linear regression. We show that using a triangular distribution weighting function can reduce the bias induced by discretization while maintaining the interpretability benefits typically associated with discrete variables.
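
To make the idea of a conjugate saturated model concrete, the following is a minimal sketch, not the method developed in the thesis. It assumes the simplest possible setting: a single binary treatment, a binary outcome, and one discrete confounder, with no longitudinal structure and no missing data. Each (treatment, confounder) cell gets its own outcome probability with an independent Beta prior, the confounder distribution gets a Dirichlet prior, and posterior draws of the average treatment effect are obtained by standardizing over the confounder, with no Markov chain Monte Carlo required. The function name `posterior_ate_draws`, the prior settings, and the simulated data are all illustrative assumptions.

```python
import numpy as np


def posterior_ate_draws(t, x, y, n_draws=4000, a=1.0, b=1.0, seed=0):
    """Posterior draws of the average treatment effect under a saturated
    Beta-Bernoulli outcome model (hypothetical helper, illustration only).

    t: binary treatment (0/1), x: discrete confounder, y: binary outcome (0/1).
    a, b: Beta prior parameters shared by every (treatment, confounder) cell.
    """
    rng = np.random.default_rng(seed)
    t = np.asarray(t, dtype=int)
    y = np.asarray(y, dtype=int)
    levels, x_idx = np.unique(np.asarray(x), return_inverse=True)
    k = len(levels)

    # Count successes and trials in each (treatment, confounder) cell.
    succ = np.zeros((2, k))
    n = np.zeros((2, k))
    np.add.at(succ, (t, x_idx), y)
    np.add.at(n, (t, x_idx), 1)

    # Conjugate update: each cell probability is Beta(a + successes, b + failures).
    theta = rng.beta(a + succ, b + (n - succ), size=(n_draws, 2, k))

    # Confounder distribution: Dirichlet posterior under a uniform Dirichlet prior.
    w = rng.dirichlet(1.0 + n.sum(axis=0), size=n_draws)

    # Standardize: average the cell-level risk differences over the confounder.
    return ((theta[:, 1, :] - theta[:, 0, :]) * w).sum(axis=1)


# Simulated data (assumed for illustration) with a true risk difference of 0.3.
sim = np.random.default_rng(1)
n_obs = 5000
x = sim.integers(0, 3, size=n_obs)
t = sim.binomial(1, 0.2 + 0.2 * x)             # treatment depends on the confounder
y = sim.binomial(1, 0.1 + 0.3 * t + 0.15 * x)  # outcome depends on both

draws = posterior_ate_draws(t, x, y)
print("posterior mean:", draws.mean())
print("95% credible interval:", np.quantile(draws, [0.025, 0.975]))
```

Because every cell is updated independently in closed form, the cost of this sketch grows with the number of cells rather than with a sampler's mixing time. The number of cells grows quickly with the number of confounders, which is the scaling difficulty the abstract addresses through informative priors and approximations.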

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International