UBC Theses and Dissertations
Implications of using Likert data in multiple regression analysis
Owuor, Charles Ochieng
Many of the measures obtained in educational research are Likert-type responses on questionnaires. These Likert-type variables are sometimes used in ordinary least-squares regression analysis. However, a key implication of the regression assumptions is that the criterion is continuous. Little research has examined how much information is lost, and how inappropriate it is, when Likert variables are used in ordinary least-squares multiple regression. Therefore, this study examined the effect of Likert-type responses in the criterion variable and the predictors, for various numbers of scale points, on the accuracy of regression models using normal and skewed observed response patterns. This was done for the case of three predictors and one criterion. Eight levels of Likert-type categorization, ranging from two to nine scale points, were considered for both the predictors and the criterion. It was found that the largest bias in the estimation of the model R-squared, the relative Pratt index, and the Pearson correlation coefficient occurred for two- and three-point Likert scales. The bias did not decrease substantially beyond the four-point Likert scale. The type of correlation matrix had no effect on the model fit. However, skewed response distributions resulted in large biases in both R-squared and the Pearson correlation, but not in the relative Pratt index, which was unaffected by the response distribution. The practical contribution of the study is that it shows how much information is lost due to bias, and the extent to which accuracy is compromised, when Likert data are used in linear regression models in education and social science research. It is recommended that researchers and practitioners recognize the extent of the bias in ordinary least-squares regression models with Likert data, which results in a substantial loss of information.
For variable importance, the relative Pratt index should be used, given that it is robust to Likert conditions and response distributions. Finally, when interpreting reported regression results in the research literature, one should recognize that the reported R-squared values are underestimated and that the Pearson correlations are also typically underestimated, sometimes substantially.
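The kind of comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the thesis's actual design: the population correlation matrix `Rxx`, the coefficients `beta_true`, the sample size, and the equal-width thresholds in the hypothetical helper `likert` are all assumptions chosen for illustration, and only a roughly symmetric response pattern is covered. It uses the standard definition of the relative Pratt index, d_j = (beta_j * r_j) / R-squared, where beta_j is the standardized regression coefficient and r_j is the correlation between predictor j and the criterion.

```python
import numpy as np

def likert(x, k):
    """Cut a roughly standard-normal variable into k ordered categories (1..k).

    Thresholds are equally spaced between -2 and 2; this scheme is an
    assumption made for illustration, not the thesis's procedure.
    """
    cuts = np.linspace(-2.0, 2.0, k + 1)[1:-1]  # k - 1 interior thresholds
    return np.digitize(x, cuts) + 1

def r2_and_pratt(X, y):
    """Return the OLS R-squared and relative Pratt indices beta_j * r_j / R^2."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)      # standardized predictors
    zy = (y - y.mean()) / y.std()                 # standardized criterion
    beta, *_ = np.linalg.lstsq(Z, zy, rcond=None)  # standardized coefficients
    r = Z.T @ zy / len(zy)                        # predictor-criterion correlations
    r2 = float(beta @ r)                          # R^2 = sum_j beta_j * r_j
    return r2, beta * r / r2

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical population: three equally correlated predictors.
Rxx = np.array([[1.0, 0.3, 0.3],
                [0.3, 1.0, 0.3],
                [0.3, 0.3, 1.0]])
beta_true = np.array([0.4, 0.3, 0.2])
explained = float(beta_true @ Rxx @ beta_true)    # population R-squared

X = rng.multivariate_normal(np.zeros(3), Rxx, size=n)
y = X @ beta_true + rng.normal(0.0, np.sqrt(1.0 - explained), size=n)

r2_cont, pratt_cont = r2_and_pratt(X, y)
print(f"continuous: R2={r2_cont:.3f}, Pratt={np.round(pratt_cont, 3)}")

# Categorize predictors and criterion alike, from two to nine scale points.
results = {}
for k in range(2, 10):
    results[k] = r2_and_pratt(likert(X, k), likert(y, k))
    r2_k, pratt_k = results[k]
    print(f"{k}-point: R2={r2_k:.3f}, Pratt={np.round(pratt_k, 3)}")
```

Comparing each `results[k]` with the continuous baseline gives a rough sense of how much R-squared is attenuated at each number of scale points. A skewed response pattern could be approximated by replacing the symmetric thresholds with asymmetric ones, which is again an assumption about how such a condition might be set up.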