UBC Theses and Dissertations

A simulation study comparing the reliability and validity of methods of scoring ratings

Phillips, Norman

Abstract

Simulated rating data were generated according to a uni-factor model under varying conditions of: number of judges; number of targets; discrepancies in judges' scales of measurement; and mean and variance in distributions of individual judges' reliabilities. Burt's (1936) method of standardizing ratings, estimating judges' individual reliabilities from the rating data, and weighting ratings by a function of the judges' estimated reliabilities resulted in higher correlations with true scores than did the simple consensus. A method of scaling true score estimates to an optimal absolute scale resulted in reduced mean square deviations from the true scores. Burt's estimates showed close resemblance to maximum likelihood factor scores. Several proposed methods of estimating individual judges' reliabilities were tested. Only Cronbach's performed poorly under some conditions. The maximum likelihood factor loading estimate appeared to give the best estimate overall. The alpha coefficient was found to be a much poorer estimate of the reliability of the sum (or mean) of a group of judges than another estimate which involved estimating judges' individual reliabilities.
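The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's procedure or Burt's (1936) exact method. It assumes a one-factor model in which each judge's standardized rating is a loading times the true score plus independent noise, it uses first principal component loadings as a crude stand-in for the reliability estimators the thesis tests, and it weights each judge by loading / (1 - loading^2), the regression factor score weight under that model. All function names and parameter values are hypothetical.

```python
import numpy as np


def simulate_ratings(n_targets, loadings, rng):
    """Generate ratings under an assumed one-factor model:
    rating[i, j] = loading[j] * true[i] + sqrt(1 - loading[j]**2) * noise[i, j].
    """
    loadings = np.asarray(loadings, dtype=float)
    true_scores = rng.standard_normal(n_targets)
    noise = rng.standard_normal((n_targets, loadings.size))
    ratings = true_scores[:, None] * loadings + noise * np.sqrt(1.0 - loadings**2)
    return true_scores, ratings


def standardize(ratings):
    """Put every judge (column) on a common scale: mean 0, SD 1."""
    return (ratings - ratings.mean(axis=0)) / ratings.std(axis=0)


def estimate_loadings(z):
    """Estimate each judge's loading from the data alone.

    Assumption: first principal component loadings of the inter-judge
    correlation matrix, used here only as a simple stand-in estimator.
    """
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    loadings = eigvecs[:, -1] * np.sqrt(eigvals[-1])
    if loadings.sum() < 0:  # eigenvector sign is arbitrary
        loadings = -loadings
    return np.clip(loadings, 0.01, 0.99)


def weighted_scores(z, loadings):
    """Weight judges by loading / (1 - loading**2), then sum."""
    weights = loadings / (1.0 - loadings**2)
    return z @ weights


def compare(seed=0, n_targets=2000, loadings=(0.9, 0.9, 0.3, 0.3, 0.3, 0.3)):
    """Return (correlation of simple consensus with truth,
    correlation of reliability-weighted score with truth)."""
    rng = np.random.default_rng(seed)
    true_scores, ratings = simulate_ratings(n_targets, loadings, rng)
    z = standardize(ratings)
    simple = z.mean(axis=1)
    weighted = weighted_scores(z, estimate_loadings(z))
    r_simple = np.corrcoef(simple, true_scores)[0, 1]
    r_weighted = np.corrcoef(weighted, true_scores)[0, 1]
    return r_simple, r_weighted


if __name__ == "__main__":
    r_simple, r_weighted = compare()
    print(f"simple consensus vs. true scores:    r = {r_simple:.3f}")
    print(f"reliability-weighted vs. true scores: r = {r_weighted:.3f}")
```

With judges of very unequal reliability, as in the hypothetical default loadings, the weighted score should track the true scores more closely than the unweighted mean. When judges are about equally reliable, the two scores are expected to be nearly equivalent.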

Rights

For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.