UBC Theses and Dissertations

Two-stage maximum likelihood approach for item-level missing data in regression

Chen, Lihan


Psychologists often use scales composed of multiple items to measure underlying constructs, such as well-being, depression, and personality traits. Missing data often occurs at the item level. For example, participants may skip items on a questionnaire for various reasons. If variables in the dataset can account for the missingness, the data is missing at random (MAR). Modern missing data approaches can deal with MAR missing data effectively, but existing analytical approaches cannot accommodate item-level missing data. A very common practice in psychology is to average all available items to produce scale means when there is missing data. This approach, called available-case maximum likelihood (ACML), may produce biased results in addition to incorrect standard errors. Another approach is scale-level full information maximum likelihood (SL-FIML), which treats the whole scale as missing if even one item is missing. SL-FIML is inefficient and prone to bias. A new analytical approach, called the two-stage maximum likelihood approach (TSML), was recently developed as an alternative (Savalei & Rhemtulla, 2017b). The original work showed that the method outperformed ACML and SL-FIML in structural equation models with parcels. The current simulation study examined the performance of ACML, SL-FIML, and TSML in the context of bivariate regression. It was shown that when item loadings or item means are unequal within the composite, ACML and SL-FIML produced biased estimates of regression coefficients under MAR. Apart from convergence issues when the sample size is small and the number of variables is large, TSML performed well in all simulated conditions, showing little bias, high efficiency, and good coverage. Additionally, the current study investigated how changing the strength of the MAR mechanism may lead to drastically different conclusions in simulation studies. A preliminary definition of MAR strength is provided in order to demonstrate its impact. Recommendations are made for future simulation studies on missing data.
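The two scoring rules contrasted in the abstract can be illustrated with a minimal sketch. It is not code from the thesis: the function names `available_item_mean` and `scale_level_score` and the response matrix are hypothetical. It covers only how a scale score is formed from item responses (rows are respondents, columns are items, `NaN` marks a skipped item), not the maximum likelihood estimation that follows in ACML, SL-FIML, or TSML.

```python
import numpy as np


def available_item_mean(items):
    """Average whichever items each respondent answered.

    This is the scoring step behind the available-case approach:
    a respondent gets a scale mean as long as one item is observed.
    """
    items = np.asarray(items, dtype=float)
    observed = ~np.isnan(items)
    counts = observed.sum(axis=1)
    totals = np.where(observed, items, 0.0).sum(axis=1)
    # Respondents with no observed items keep a NaN score.
    means = np.full(items.shape[0], np.nan)
    np.divide(totals, counts, out=means, where=counts > 0)
    return means


def scale_level_score(items):
    """Treat the whole scale as missing if any item is missing.

    A plain mean propagates NaN, so one skipped item yields a NaN score.
    """
    items = np.asarray(items, dtype=float)
    return items.mean(axis=1)


# Illustrative data only: three respondents, three items.
responses = np.array([
    [1.0, 2.0, 3.0],        # complete
    [4.0, np.nan, 2.0],     # one item skipped
    [np.nan, np.nan, 5.0],  # two items skipped
])

acml_scores = available_item_mean(responses)
sl_scores = scale_level_score(responses)
```

In this toy data the third respondent's available-item score rests on a single item, so if items differ in their means the score depends on which items happened to be skipped. The scale-level rule avoids that by discarding the respondent's observed items altogether, which is where its loss of efficiency comes from.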


Attribution-NonCommercial-NoDerivatives 4.0 International