Accuracy of differential item functioning detection methods in structurally missing data due to booklet design

UBC Theses and Dissertations

Sandilands, Debra Anne

Abstract

Differential item functioning (DIF) analyses are conducted on structurally missing data (SMD) that arise from the balanced incomplete block (BIB) booklet designs commonly used in large-scale assessments (LSAs). Only one DIF method, the Mantel-Haenszel (MH) method, has previously been studied in this context. The purposes of this study were to investigate and compare the power and Type I error rates of an additional DIF method, the IRT-based Lord’s Wald test, with those of the MH method, and to extend the research on methods of forming the MH matching variable (MV) by proposing and testing a modification to the MH MV in the SMD context. A simulation study investigated the effects of sample size, ratio of group sizes, test length, percentage of DIF items, and differences in group abilities on the power and Type I error rates of four DIF methods: the IRT-Lord’s method and the MH method using a block-wise, a booklet-wise, and a modified MV. The study design was selected to reflect authentic situations in which DIF might be investigated in LSAs that typically use BIB designs. The three MH methods maintained better Type I error rates than the IRT-Lord’s method, whose Type I error rate was inflated when the group sample sizes were unequal. None of the four methods had high power to detect DIF at the smallest sample size (1200). In the other sample size conditions, the IRT-Lord’s method had high power to detect DIF only when group sizes were equal. None of the MH methods had high power when the group mean ability levels differed or when the proportion of DIF in the MV was high. These results indicate that DIF may go undetected in many realistic SMD conditions, potentially undermining the validity of score comparisons across groups. Recommendations to maximize DIF detection in SMD include using the MH method with a block-wise MV, ensuring a large overall sample size, and over-sampling small policy-relevant groups to produce more balanced group sample sizes. The results also indicate that other sources of validity evidence to support score comparability should be provided, since DIF analyses cannot yet be relied upon solely for this purpose.

Item Metadata

Title	Accuracy of differential item functioning detection methods in structurally missing data due to booklet design
Creator	Sandilands, Debra Anne
Publisher	University of British Columbia
Date Issued	2014
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2014-06-20
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivs 2.5 Canada
DOI	10.14288/1.0167505
URI	http://hdl.handle.net/2429/47031
Degree	Doctor of Philosophy - PhD
Program	Measurement, Evaluation and Research Methodology
Affiliation	Education, Faculty of; Educational and Counselling Psychology, and Special Education (ECPS), Department of
Degree Grantor	University of British Columbia
Graduation Date	2014-09
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/2.5/ca/
Aggregated Source Repository	DSpace
