UBC Theses and Dissertations


Examination of test equivalence between French and English language versions of Progress in International Reading Literacy Study 2011

Goodrich, Shawna


In Canada, international large-scale assessments (LSAs), such as the Progress in International Reading Literacy Study (PIRLS), are administered in the two official languages, French and English. The validity of decisions made from these assessments depends on the equivalence of different language versions and the comparability of scores across language groups. Previous research examining French and English language versions of large-scale assessments administered in Canada indicates that equivalence cannot be assumed when tests are adapted (Ercikan, Gierl, McCreith, Puhan & Koh, 2004; Ercikan & McCreith, 2002; Gierl, 2000; Oliveri & Ercikan, 2011). Research has shown that the quality of test adaptation is particularly important to ensure comparability, interpretability and consequential equity across language groups. The purpose of this study is to examine test equivalence and score comparability at the item and test level between the French and English language groups administered PIRLS 2011. Confirmatory factor analysis and two methods of differential item functioning (DIF) analysis were conducted to examine score comparability. Four bilingual reviewers with expertise in reading literacy conducted independent, blind linguistic and cultural reviews to identify the degree of test equivalence and potential sources of differences between the French and English language versions of released items from PIRLS 2011. As a whole, evidence from this study indicates that there are important scale-level differences between the French and English language versions of PIRLS 2011 that call for further investigation, and that on average 25% of items across thirteen booklets function differently at the item level. Expert reviews of released items indicate that there are many differences between the two language versions, both for items statistically identified as DIF and for items not so identified. Reviewers concluded that inappropriate translation produced unintended differences in content and difficulty levels between the two language versions.



Attribution-NonCommercial-NoDerivatives 4.0 International