Open Collections
UBC Theses and Dissertations
Two-stage maximum likelihood approach for item-level missing data in regression
Chen, Lihan
Abstract
Psychologists often use scales composed of multiple items to measure underlying constructs, such as well-being, depression, and personality traits. Missing data often occurs at the item level. For example, participants may skip items on a questionnaire for various reasons. If variables in the dataset can account for the missingness, the data is missing at random (MAR). Modern missing data approaches can deal with MAR missing data effectively, but existing analytical approaches cannot accommodate item-level missing data. A very common practice in psychology is to average all available items to produce scale means when there is missing data. This approach, called available-case maximum likelihood (ACML), may produce biased results in addition to incorrect standard errors. Another approach is scale-level full information maximum likelihood (SL-FIML), which treats the whole scale as missing if even one item is missing. SL-FIML is inefficient and prone to bias. A new analytical approach, called the two-stage maximum likelihood approach (TSML), was recently developed as an alternative (Savalei & Rhemtulla, 2017b). The original work showed that the method outperformed ACML and SL-FIML in structural equation models with parcels. The current simulation study examined the performance of ACML, SL-FIML, and TSML in the context of bivariate regression. It showed that when item loadings or item means are unequal within the composite, ACML and SL-FIML produced biased estimates of regression coefficients under MAR. Apart from convergence issues when the sample size is small and the number of variables is large, TSML performed well in all simulated conditions, showing little bias, high efficiency, and good coverage. Additionally, the current study investigated how changing the strength of the MAR mechanism may lead to drastically different conclusions in simulation studies. A preliminary definition of MAR strength is provided in order to demonstrate its impact. Recommendations are made for future simulation studies on missing data.
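The two scoring rules that the abstract contrasts can be illustrated with a minimal sketch. This is not code from the thesis: the function names, the `Row` alias, and the response values are hypothetical. It covers only how each approach builds the scale score before any model is fitted, not the maximum likelihood estimation itself.

```python
from typing import List, Optional

# One participant's item responses; None marks a skipped item (assumed encoding).
Row = List[Optional[float]]


def available_item_mean(items: Row) -> Optional[float]:
    """Average whatever items were answered (the scoring step behind ACML)."""
    observed = [x for x in items if x is not None]
    return sum(observed) / len(observed) if observed else None


def scale_level_score(items: Row) -> Optional[float]:
    """Return the scale mean for complete rows only; if any item is missing,
    the whole scale is treated as missing (the data setup SL-FIML starts from)."""
    if any(x is None for x in items):
        return None
    return sum(items) / len(items)


# Hypothetical three-item scale with unequal item means (illustrative values only).
responses = [
    [1.0, 3.0, 5.0],   # complete
    [1.0, None, 5.0],  # middle item skipped
    [1.0, 3.0, None],  # highest item skipped
]

acml_scores = [available_item_mean(r) for r in responses]
sl_scores = [scale_level_score(r) for r in responses]
```

In this toy data all three participants agree on every item they answered, yet the available-item mean for the third participant falls from 3.0 to 2.0 simply because the skipped item is the highest one. The scale-level rule avoids that distortion but discards the observed items from two of the three rows, which is the inefficiency the abstract describes.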
Item Metadata
Title | Two-stage maximum likelihood approach for item-level missing data in regression
Publisher | University of British Columbia
Date Issued | 2017
Language | eng
Date Available | 2017-08-18
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0354497
Degree Grantor | University of British Columbia
Graduation Date | 2017-09
Scholarly Level | Graduate
Aggregated Source Repository | DSpace