UBC Theses and Dissertations

Investigating the impact of hidden subpopulations on individual test scores using IRT-based methods
Zou, Danjie

Abstract

The accuracy of test scores is critical to ensuring validity and fairness in educational and psychological assessments. This dissertation aims to investigate how hidden subpopulations affect the accuracy of individuals’ latent ability estimates and to identify which factors are related to the accuracy of those estimates. Two simulation studies were conducted to examine the influence of latent class structure on the accuracy of latent ability estimates. In Study 1, measurement invariance (MI) was assumed, with identical models across classes, whereas Study 2 allowed class-specific models. Specifically, I compared the accuracy of ability estimates under two item response theory (IRT) approaches: regular IRT and mixture IRT. Four key factors were manipulated (sample size, class proportion, class ability distributions, and test difficulty range) to assess their impact on person-level and sample-level biases. Four consistent findings emerged across both studies. First, mixture IRT did not outperform regular IRT, despite accounting for latent classes. Second, person-level biases exhibited non-uniform patterns relative to true ability: when the mixture was ignored, the relationship followed a cubic pattern, whereas modeling the mixture produced two distinct quadratic bias patterns. Third, the separation between classes was the most influential factor in reducing bias magnitude, regardless of whether measurement invariance held. Fourth, both classes experienced bias. Several differences between the studies merit particular attention. When measurement invariance did not hold, the interaction between class distribution and class proportion had a substantial impact on bias magnitude, an effect not observed under MI. Additionally, the four main factors and their two-way interactions accounted for considerably less variation in bias when MI did not hold. These findings suggest that construct inequivalence introduces additional sources of bias beyond what these factors alone can explain. This dissertation uncovers the complex challenges posed by mixture distributions in the population and underscores how they can introduce substantial bias into individual ability estimates. It also identifies specific conditions that exacerbate these biases and discusses their implications for validity and test fairness, particularly when scores are used for high-stakes decisions. Finally, the findings suggest methodological alternatives for handling mixture distributions in ability estimation and point to key directions for future research.
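
To make the simulation design concrete, the sketch below (not taken from the dissertation; the class means, mixing proportion, item difficulty range, and sample size are illustrative assumptions) generates Rasch responses for a two-class mixture population and scores every person with a single-group EAP estimator that ignores the classes, so that person-level bias can be inspected by class.

```python
# Minimal sketch of the kind of setup described in the abstract:
# simulate item responses from a two-class mixture under a Rasch model,
# estimate each person's ability with a "regular" (single-group) IRT
# scorer that ignores the classes, and summarize person-level bias.
# All numeric settings below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_persons, n_items = 1000, 30
class_proportion = 0.7                    # assumed share of class 1
item_b = np.linspace(-2.0, 2.0, n_items)  # assumed difficulty range

# True abilities: two latent classes with different means (assumed values).
is_class1 = rng.random(n_persons) < class_proportion
theta_true = np.where(is_class1,
                      rng.normal(-0.5, 1.0, n_persons),
                      rng.normal(1.0, 1.0, n_persons))

# Rasch response probabilities and simulated 0/1 responses.
p = 1.0 / (1.0 + np.exp(-(theta_true[:, None] - item_b[None, :])))
responses = (rng.random((n_persons, n_items)) < p).astype(int)

# "Regular IRT" scoring that ignores the mixture: EAP estimates against a
# single N(0, 1) population prior, evaluated on a quadrature grid.
grid = np.linspace(-4.0, 4.0, 81)
prior = np.exp(-0.5 * grid**2)
prior /= prior.sum()
p_grid = 1.0 / (1.0 + np.exp(-(grid[:, None] - item_b[None, :])))
log_lik = responses @ np.log(p_grid).T + (1 - responses) @ np.log(1 - p_grid).T
post = np.exp(log_lik - log_lik.max(axis=1, keepdims=True)) * prior
post /= post.sum(axis=1, keepdims=True)
theta_eap = post @ grid

# Person-level bias (estimate minus true ability), summarized by class.
bias = theta_eap - theta_true
print("mean bias, class 1:", bias[is_class1].mean())
print("mean bias, class 2:", bias[~is_class1].mean())
```

Plotting `bias` against `theta_true` for data generated this way is one way to visualize the kind of non-uniform, ability-dependent bias pattern described above when the mixture is ignored.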

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International