The interactive effects of data categorization and noncircularity on the sampling distribution of generalizability coefficients in analysis of variance models : an empirical investigation

UBC Theses and Dissertations

Eom, Han J.

Abstract

The present study employed Monte Carlo procedures to investigate the effects of data categorization and noncircularity on generalizability (G) coefficients for the one-facet and two-facet fully crossed balanced designs, as well as on the Type I error rates for F tests in repeated measures ANOVA designs. Computer programs were developed to conduct a series of simulations under various sampling conditions. Five independent parameters were considered in the simulations: (a) three levels of repeated measures (3, 5, 7); (b) three G coefficients (.60, .75, .90); (c) three epsilon values (.50, .70, 1.0); (d) three sample sizes (15, 30, 45); and (e) six measurement scales (continuous, 5-point and 3-point scales with either normal or uniform distribution, and dichotomous). For the one-facet design, the results of the simulations indicated that categorical data yielded a considerably smaller G coefficient than the parent continuous data, especially for scales with three or fewer points. Noncircularity did not introduce any bias into the estimate, but yielded more variable estimates of the G coefficient. The sampling theory of G coefficients with continuous data was fairly robust to a moderate departure from circularity, but somewhat sensitive to severe noncircularity (about 6% of the sample estimates for ε = .70 and about 7.2% for ε = .50 lay in the 5% region of the upper tail). However, it was not adequate for categorical data, especially for scales with three or fewer points. The results of the two-facet design closely paralleled those of the one-facet design in terms of the effects of categorization, sample size, and population G values. The primary difference in the findings between the two designs was that the sampling theory of G coefficients for the two-facet design, which was developed using Satterthwaite's procedure, was very satisfactory and quite robust to violations of the circularity assumption.
Type I error rates of the F test for continuous data were inflated when the circularity assumption failed, with categorization causing a slight reduction in this inflation. Relationships among the population epsilon, the sample estimate, and the Type I error rates across the 81 simulated conditions revealed a strong negative relationship between the epsilon estimates and the associated Type I error rates, thus supporting current theory. However, for the ε = 1.0 condition the associated Type I error rates were all close to the nominal level, and the correlation with the estimated epsilon was near zero. Further investigation of the correlations among the sample estimates (ε̂, MSe, and MSr) within each population epsilon condition suggested that the inflation in Type I error rates is not, as is commonly assumed, merely a function of the population epsilon value. This led us to question the current practice of utilizing an epsilon-adjusted F test in repeated measures ANOVA designs.
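
The simulation programs developed for the thesis are not reproduced in this record. As a rough illustration only, the Python sketch below shows the kind of quantities the abstract refers to, under stated assumptions: the one-facet G coefficient is estimated with the standard mean-square form (MSp - MSres) / MSp for the mean of k conditions, epsilon is computed with Box's formula from a covariance matrix, and categorization is done by cutting continuous scores at fixed thresholds. All function names, the cut points, and the data-generating model (person effect plus independent error, which implies circularity, i.e. epsilon = 1.0) are hypothetical choices for this sketch, not the author's procedure.

```python
import random


def g_coefficient(scores):
    """Estimate the relative G coefficient for the mean of k conditions
    in a one-facet crossed design (n persons x k conditions).
    Assumes the persons do not all share the same mean (MSp > 0)."""
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    person_means = [sum(row) / k for row in scores]
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_p = k * sum((m - grand) ** 2 for m in person_means)
    ss_res = sum(
        (scores[i][j] - person_means[i] - cond_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_p = ss_p / (n - 1)
    ms_res = ss_res / ((n - 1) * (k - 1))
    return (ms_p - ms_res) / ms_p


def box_epsilon(cov):
    """Box's epsilon for a k x k covariance matrix (1.0 under circularity)."""
    k = len(cov)
    mean_diag = sum(cov[j][j] for j in range(k)) / k
    grand = sum(map(sum, cov)) / (k * k)
    row_means = [sum(row) / k for row in cov]
    num = (k * (mean_diag - grand)) ** 2
    den = (k - 1) * (
        sum(c * c for row in cov for c in row)
        - 2 * k * sum(m * m for m in row_means)
        + k * k * grand ** 2
    )
    return num / den


def categorize(x, cuts):
    """Map a continuous score to a category 1..len(cuts)+1 (hypothetical cuts)."""
    return sum(x > c for c in cuts) + 1


def simulate_mean_g(n, k, g_pop, cuts=None, reps=200, seed=1):
    """Average estimated G over replications for circular (epsilon = 1) data.
    Person variance is fixed at 1; error variance is chosen so that the
    population G for the mean of k conditions equals g_pop."""
    rng = random.Random(seed)
    sd_e = (k * (1 - g_pop) / g_pop) ** 0.5
    total = 0.0
    for _ in range(reps):
        data = []
        for _ in range(n):
            p = rng.gauss(0, 1)  # person (true score) effect
            row = [p + rng.gauss(0, sd_e) for _ in range(k)]
            if cuts is not None:
                row = [categorize(x, cuts) for x in row]
            data.append(row)
        total += g_coefficient(data)
    return total / reps
```

For example, comparing `simulate_mean_g(30, 5, 0.75)` with `simulate_mean_g(30, 5, 0.75, cuts=[0.0])` contrasts continuous scores with a dichotomized version of the same model. Simulating noncircular conditions (epsilon below 1.0) would require generating data from a covariance matrix with unequal variances of condition differences, which this sketch does not attempt.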

Item Metadata

Title	The interactive effects of data categorization and noncircularity on the sampling distribution of generalizability coefficients in analysis of variance models : an empirical investigation
Creator	Eom, Han J.
Publisher	University of British Columbia
Date Issued	1993
Extent	8985409 bytes
Genre	Thesis/Dissertation
Type	Text
File Format	application/pdf
Language	eng
Date Available	2008-09-10
Provider	Vancouver : University of British Columbia Library
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
DOI	10.14288/1.0076803
URI	http://hdl.handle.net/2429/1758
Degree	Doctor of Philosophy - PhD
Program	Interdisciplinary Studies
Affiliation	Graduate and Postdoctoral Studies
Degree Grantor	University of British Columbia
Graduation Date	1993-11
Campus	UBCV
Scholarly Level	Graduate
Aggregated Source Repository	DSpace

Item Media

ubc_1993_fall_phd_eom_han.pdf -- 8.57MB
