UBC Theses and Dissertations

Exploring the impact of different missing data mechanisms on the efficiency of parameter estimates

Chen, Lihan

Abstract

Modern missing data techniques, such as full information maximum likelihood (FIML) and multiple imputation (MI), have become more accessible and popular in recent years. When data are missing at random (MAR), these techniques produce consistent parameter estimates, correct standard errors, and valid statistical inferences, without requiring researchers to specify the details of the underlying missing data mechanism. However, the details of the MAR mechanism can affect the efficiency of parameter estimates. Under current practice, this efficiency loss is typically measured only by the rate of missing data; yet even when the rate of missing data is held constant, variations in the MAR mechanism can lead to different amounts of efficiency loss. As a result, missing data can affect the statistical power of psychological studies in ways that are unexpected and unreported. The fraction of missing information (FMI) is a direct measure of efficiency loss due to missing data. In this dissertation, I first used FMI to explore the impact of different variations of MAR under a wide range of scenarios. I found that efficiency loss has a complex relationship with many factors, such as the conditioning relationship and the values of model parameters. These findings demonstrate the need to adopt FMI as a diagnostic measure in empirical studies when data are missing. They also show the need to control for moderators of efficiency loss in simulation studies, and to use FMI as a guide for study design. Next, I conducted a series of simulation studies to evaluate the properties of several sample estimates of FMI, and found that accurate estimation of FMI required a sample size of at least 200. In the final part of the dissertation, I provided further reasons why it is important to study efficiency loss under MAR, by demonstrating the connection between MAR efficiency and sampling strategies in psychological studies. Using extreme groups design (EGD) as an example, I showed that more efficient forms of MAR mechanisms can help improve planned missing data designs.
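
The claim that efficiency loss can differ at a fixed missing data rate can be illustrated with a small Monte Carlo sketch. It is not taken from the dissertation; the setup is an assumption chosen for simplicity: a bivariate normal pair (X, Y) with X fully observed, Y missing for exactly half of the cases, and the mean of Y estimated with the regression estimator (the maximum likelihood estimate for this monotone pattern). FMI is approximated here as one minus the ratio of the complete-data sampling variance to the incomplete-data sampling variance. The function names, the two mechanisms, and the default values (rho = 0.5, n = 200) are all hypothetical.

```python
import numpy as np


def regression_mean(x, y, observed):
    """Estimate E[Y] when X is complete and Y is seen only where `observed` is True."""
    xo, yo = x[observed], y[observed]
    # Slope of Y on X from the observed cases only.
    slope = np.cov(xo, yo)[0, 1] / np.var(xo, ddof=1)
    # Adjust the observed-case mean of Y using the full-sample mean of X.
    return yo.mean() + slope * (x.mean() - xo.mean())


def simulate_fmi(mechanism, rho=0.5, n=200, reps=2000, seed=0):
    """Monte Carlo approximation of FMI for the mean of Y at a 50% missing rate."""
    rng = np.random.default_rng(seed)
    complete, incomplete = [], []
    for _ in range(reps):
        x = rng.standard_normal(n)
        y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)
        if mechanism == "random":
            # Y missing for a random half of cases (MCAR, a special case of MAR).
            observed = np.zeros(n, dtype=bool)
            observed[rng.choice(n, n // 2, replace=False)] = True
        elif mechanism == "threshold":
            # Y missing whenever X is above its sample median (MAR given X).
            observed = x <= np.median(x)
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        complete.append(y.mean())
        incomplete.append(regression_mean(x, y, observed))
    # FMI approximated as 1 - Var(complete-data estimate) / Var(incomplete-data estimate).
    return 1 - np.var(complete) / np.var(incomplete)


if __name__ == "__main__":
    for mechanism in ("random", "threshold"):
        print(mechanism, round(simulate_fmi(mechanism), 3))
```

Both mechanisms remove the same number of Y values, so any gap between the two approximations reflects the mechanism rather than the missing data rate. In this particular setup the threshold mechanism should show the larger loss, because the mean of Y has to be extrapolated from cases in the lower half of X.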

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International