STATISTICAL POWER FOR REPEATED MEASURES ANOVA

by

PATRICK JOHN POTVIN

B.Sc., Concordia University, 1988

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

in

THE FACULTY OF GRADUATE STUDIES
(School of Human Kinetics)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
April 1996
© Patrick John Potvin, 1996

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
Vancouver, Canada

Abstract

Determining power a priori for univariate repeated measures (RM) ANOVA designs is a difficult and often excluded practice in the planning of experimental research. Complicated procedures and lack of accessibility to computer power programs are among the problems which have discouraged researchers from performing power analysis on these designs. Another more serious issue has been the lack of methods available for estimating power of designs with two or more RM factors. Due to uncertainties on how to compute an appropriate error term when more than one variance-covariance matrix exists, analytical methods for approximating power are currently restricted to RM designs with only one within-subjects variable. The purpose of this study, therefore, was to facilitate the process of power determination by providing a series of power tables for ANOVA designs with one and two within-subject variables.
A secondary objective was to investigate less well known power trends among ANOVA designs having heterogeneous (nonspherical) correlation matrices or two RM factors. Power was generated using analytical and Monte Carlo simulation methods for varying experimental conditions of sample size (5, 10, 15, 20, 25 & 30), effect size (small, medium & large), alpha (.01, .05 & .10), correlation (.4 & .8), variance-covariance matrix patterns (constant, ε = 1.00, and trend, ε < .56) and levels of RM (3, 6 & 9). Examination of power results revealed that under conditions of nonsphericity (trend matrix pattern), power was greater at small effect sizes and lower at medium and large effect sizes compared to those values generated under conditions involving spherical (constant matrix) structures. Regarding designs with two RM factors, power of main effects tests was observed to be greatest for a given condition so long as the average correlation among trials of the pooled factor was equal to or below that of the main effects factor. For interaction tests of the same model, power was found to be greatest for a given condition when at least one factor had an average correlation across its trials equal to .80. From simulation results, the relationship between error variance and power across different correlation matrices of the two-way RM design was examined and approximations of the noncentrality parameter for each test of this model were derived.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgment
Chapter One: Introduction
    Introduction
    Purposes of Study
    Definitions
Chapter Two: Literature Review
    I. Factors Related To Power
    II. Power, RM ANOVA Assumptions and Violation of Sphericity
    III. Power Determination For RM Designs
    Study Expectations
    Hypotheses
Chapter Three: Methodology
    I. RM ANOVA Designs and Experimental Conditions
    II. Power Determination
    III. Accuracy and Reliability of Power Estimates
    Delimitations of Study
Chapter Four: Results
    I. One-Way Repeated Measures ANOVA
    II. Two-Way Repeated Measures ANOVA
    III. Two-Way Mixed ANOVA
Chapter Five: Discussion
    I. One-Way Repeated Measures ANOVA
    II. Two-Way Repeated Measures ANOVA
Chapter Six: Summary and Conclusions
References
Appendix 1.0  Letter Requesting Data From Authors
Appendix 2.1-2 J  Empirical Data Collected From Various Studies and Used To Determine Experimental Conditions of This Study
Appendix 3.0  FORTRAN Program For Calculating Noncentrality Parameter and Effect Sizes (d & f)
Appendix 4.1-4.3  Cell and Marginal Means of RM ANOVA Designs
Appendix 5.0  Function Used To Compute Effect Size (d) For Tests of Interaction in Two-Way (AxB) ANOVA Designs
Appendix 6.1-6.2  Correlation Matrices For ANOVA Designs
Appendix 7.1-7.4  Effect Size and Noncentrality Parameter Values For RM ANOVA Designs
Appendix 8.0  Seeds Used For Monte Carlo Simulations

List of Tables

Table 2.1  Summary of Relevant Studies Related To Power and Sample Size Estimation For RM ANOVA Designs
Table 2.2  Comparison of Computer Programs Available For Determining Power in RM ANOVA Designs
Table 3.1  Methods and Computer Programs Used to Determine Power For the Different Repeated Measures (RM) ANOVA Designs of This Study
Table 4.1  Power For a One-Way Repeated Measures ANOVA With 3 Levels
Table 4.2  Power For a One-Way Repeated Measures ANOVA With 6 Levels
Table 4.3  Power For a One-Way Repeated Measures ANOVA With 9 Levels
Table 4.4a  Power of the A Main Effect For a 3(A) x 3(B) ANOVA With Repeated Measures on Two Factors
Table 4.4b  Power of the B Main Effect For a 3(A) x 3(B) ANOVA With Repeated Measures on Two Factors
Table 4.4c  Power of the AB Interaction For a 3(A) x 3(B) ANOVA With Repeated Measures on Two Factors
Table 4.5a  Power of the A Main Effect For a 3(A) x 6(B) ANOVA With Repeated Measures on Two Factors
Table 4.5b  Power of the B Main Effect For a 3(A) x 6(B) ANOVA With Repeated Measures on Two Factors
Table 4.5c  Power of the AB Interaction For a 3(A) x 6(B) ANOVA With Repeated Measures on Two Factors
Table 4.6a  Power of the A Main Effect For a 3(A) x 9(B) ANOVA With Repeated Measures on Two Factors
Table 4.6b  Power of the B Main Effect For a 3(A) x 9(B) ANOVA With Repeated Measures on Two Factors
Table 4.6c  Power of the AB Interaction For a 3(A) x 9(B) ANOVA With Repeated Measures on Two Factors
Table 4.7a  Power of the Groups Main Effect For a 2(Groups) x 3(Trials) ANOVA With Repeated Measures on One Factor
Table 4.7b  Power of the Trials Main Effect For a 2(Groups) x 3(Trials) ANOVA With Repeated Measures on One Factor
Table 4.7c  Power of the Interaction Test For a 2(Groups) x 3(Trials) ANOVA With Repeated Measures on One Factor
Table 4.8a  Power of the Groups Main Effect For a 2(Groups) x 6(Trials) ANOVA With Repeated Measures on One Factor
Table 4.8b  Power of the Trials Main Effect For a 2(Groups) x 6(Trials) ANOVA With Repeated Measures on One Factor
Table 4.8c  Power of the Interaction Test For a 2(Groups) x 6(Trials) ANOVA With Repeated Measures on One Factor
Table 4.9a  Power of the Groups Main Effect For a 2(Groups) x 9(Trials) ANOVA With Repeated Measures on One Factor
Table 4.9b  Power of the Trials Main Effect For a 2(Groups) x 9(Trials) ANOVA With Repeated Measures on One Factor
Table 4.9c  Power of the Interaction Test For a 2(Groups) x 9(Trials) ANOVA With Repeated Measures on One Factor
Table 5.1  Mean Statistics Generated From Monte Carlo Simulation (replications = 3000) For Different Two-Way RM ANOVA Designs Under Medium Effect Size

List of Figures

Figure 3.01  Experimental conditions of RM ANOVA designs power was generated for in this study
Figure 4.01  Comparisons of power between one-way repeated measures ANOVA (K=6) with constant and trend correlation matrices under varying effect sizes
Figure 4.02  Comparisons of power for one-way ANOVA designs with 3, 6, and 9 repeated measures (RM) under constant and trend correlation matrices and small effect size (.2)
Figure 4.03  Comparisons of power for one-way ANOVA designs with 3, 6, and 9 repeated measures (RM) under constant and trend correlation matrices and large effect size (.8)
Figure 4.04  Power for one-way ANOVA designs with 3, 6, and 9 repeated measures under varying correlation matrices, effect sizes and alpha
Figure 4.05  A comparison of power across different correlation matrices for tests of a 3 x 6 RM ANOVA design
Figure 4.06  Change in power for the "A" test of a two-way RM ANOVA as the number of levels of factor "B" increase under varying effect sizes and correlation matrices
Figure 4.07  Change in power for the "B" test of a two-way RM ANOVA as the number of levels of factor "B" increase under varying effect sizes and correlation matrices
Figure 4.08  Change in power for the "AB" test of a two-way RM ANOVA as the number of levels of factor "B" increase under varying effect sizes and correlation matrices
Figure 4.09  A comparison of power between A, B and AB tests of the two-way RM ANOVA under different levels of factor B (3, 6, and 9) and correlation matrices
Figure 4.10  Comparisons of power between two-way mixed ANOVA designs for the Groups, Trials and Groups by Trials tests as the average correlation among repeated trials is increased
Figure 4.11  Comparisons of power between two-way mixed ANOVA designs with different levels of RM (3, 6 and 9) under varying effect sizes for the Groups, Trials and Groups by Trials tests
Figure 4.12  A comparison of power between tests of a 2 x 6 mixed ANOVA design as the average correlation among repeated trials is increased
Figure 5.01  F distributions for one-way repeated measures ANOVA designs when effect size is small (.2) and the pattern of the correlation matrix is altered
Figure 5.02  F distributions for one-way repeated measures ANOVA designs when effect size is medium (.5) and the pattern of the correlation matrix is altered
Figure 5.03  F distributions for one-way ANOVA designs with 3, 6 and 9 repeated measures under a constant correlation matrix and varying effect size
Figure 5.04  F distributions for one-way ANOVA designs with 3, 6 and 9 repeated measures under a trend correlation matrix and varying effect size

Acknowledgment

I would like to extend a special thanks to Dr. Schutz for his tremendous help, guidance and support throughout this one and a half year project. He rescued me when it seemed I would drown in a sea of academic discontent and gave me an opportunity to move in a new and challenging direction. Dr. Schutz represents the true researcher, fully dedicated to his field and the pursuit of knowledge. He is the epitome of the ideal graduate advisor and I feel privileged to have learnt from him and to have worked under his wing. I would also like to thank Dr. Martin and Dr. Courts for agreeing to be a part of this project and extending their interests in an area that was rather removed from their own. Lastly, I would like to thank Michelle for being as wonderful as she is. She gave me the confidence and courage to undertake such a project and was the light at the end of my tunnel each and every day. Her love and support is my foundation of strength and I love her very much.

"First choo get da money; then when choo get da money, choo get da Power. And when choo get da Power, then choo get da women!" - Tony Montana, Scarface.

Chapter One
Introduction

Potvin '96

Introduction

An important part in the planning and formulation of experimental research involves determining a study's power to show a statistically significant effect. Power, in hypothesis testing, represents the probability of detecting a true effect of given magnitude or, more specifically, the chance of rejecting a null hypothesis when a difference in the population actually exists.
Determining a study's power a priori is instrumental in helping the researcher decide whether an experiment is worth the time, money and effort required to conduct it. An investigation having low power stands a small chance of showing an effect and therefore should not be pursued without at least some modification to its experimental design (Olejnik, 1984). Modifying a research design in order to maximize its chances of showing a true effect requires an understanding of those statistical parameters which influence power. These include sample size or the number of subjects used in the study, the expected effect size or magnitude of the difference between means judged meaningful, the significance level (alpha), the error variance associated with the dependent variable(s) and the type of statistical analysis being used (Cohen, 1988; Kraemer & Thiemann, 1987; Lipsey, 1990; St. Pierre, 1980). Generally, an increase in sample size, effect size or level of significance increases the power of an experimental design while an increase in error variation reduces it. Among types of designs, those which take into consideration more information about subjects (e.g. ANCOVA, repeated measures ANOVA) tend to provide greater power (Olejnik, 1984) than those which account for less (e.g. randomized group ANOVA). Since the process of power estimation requires previous knowledge of these factors, power analysis serves as a helpful adjunct in understanding the limitations and strengths of a study.

Despite its importance and obvious benefit to the researcher, power analysis is frequently a neglected component in experimental planning (Cohen, 1988; Howell, 1992). One of the main reasons for this is that methods of power determination are often laborious and computationally complex.
For the simpler statistical procedures such as t-tests, significance testing of correlation, and randomized group
analysis of variance (RG ANOVA), the process of power estimation is relatively straightforward. Computer programs, tables of power values and analytical formulae are abundant in the literature and provide simple, quick and, in most cases, reliable methods for determining the power of these tests (Borenstein & Cohen, 1988; Bradley, 1989; Cohen, 1988; Kraemer & Thiemann, 1987; Lipsey, 1990; PASS, 1991; SOLO, 1992).

For other designs, the process of power determination is not as straightforward. This is particularly evident in the case of repeated measures analysis of variance (RM ANOVA) designs, where several problematic issues limit the implementation of power estimation. At the forefront is the perceived complexity and tedious nature of most analytical solutions described in the literature, which discourages their use among researchers who lack a strong statistical background. Although both commercial and local computer programs have become available over the years to facilitate power analysis of these designs, these too have limitations in that many are not user-friendly, are difficult to access and, in some cases, do not provide accurate estimates of power.

A more serious limitation of power analysis for these designs is the lack of solutions available for certain types of RM ANOVA tests. Currently, a priori analytical methods for approximating power are mainly limited to simpler RM designs, specifically those with a single within-subjects variable. Davidson (1972) provided univariate and multivariate solutions for approximating power in the single repeated measures (RM) design while Marcucci (1986) presented power procedures for the one-way within-subjects and the two-way mixed experimental design using both a univariate and multivariate approach.
Among other work closely related to power estimation, Vonesh and Schork (1986) derived sample size formulae for the one-way RM design using univariate and multivariate approaches while Rochon (1991) extended their work to include the two group within-subjects design using only a multivariate technique. What is evident from a survey of the literature on power analysis is the apparent lack of methods available for estimating power or sample size for ANOVA designs with more than one repeated measures factor.

Presently, power solutions (univariate or multivariate) which effectively account for the various correlation matrices of multiple RM designs are deficient. One of the main reasons for this apparent deficiency may be the inherent difficulty in deriving power approximations analytically when two or more distinct correlation (r) matrices are present within a design. Normally, when determining power analytically for a RM ANOVA with one r matrix (one within-subjects variable), the average r of the matrix is used in power calculations to express the amount of within-subject variance. However, when two or more RM factors are present, three types of average correlations exist, one for each independent matrix of the design (within-factor A, within-factor B and AB matrices), and it is not clear how these different matrices interact, if at all, to affect the error variance and thus power of a particular test. Nor is it certain whether simply substituting the average r of a given matrix (instead of, for example, the minimum correlation coefficient) in the power formulae of a particular test is an appropriate procedure to follow. Pilot work conducted by the researchers of this study (Potvin & Schutz, 1995) has revealed that the relationship between power and correlation for designs with multiple RM factors is more complicated than that expressed in the power formulae of designs with just one r matrix.
That is, the average correlation coefficient of a given matrix in a multiple RM design does not adequately account for the change that occurs in the error variance of its respective test. Perhaps because of the uncertainty concerning this relationship, few attempts have been made to resolve this issue computationally. Although Winer, Brown and Michels (1991) and Dodd and Schultz (1973) provide a post-hoc procedure for determining power for these designs using omega-squared (ω²), a measure of the magnitude of the experimental effect, such a method is not very practical for a priori power analysis since it requires the researcher to know ahead of time the mean square treatment and error variance of the test involved.

Another problem complicating computation of power for these designs is the pattern or variability among r coefficients of the matrix or matrices involved. When r coefficients within a matrix result in a heterogeneous or simplex (trend) pattern, that is, they decrease substantially in magnitude across the levels of the matrix, an important assumption of univariate ANOVA, called sphericity, is violated. Under this assumption, the variances of all pairwise differences between trial means involved should be about equal. When the sphericity assumption is violated (nonsphericity), using an average r value in existing power formulae becomes inappropriate since the effect on the error variance of the involved test is no longer expressed adequately by this variable alone. Although some methods have been developed to deal with this (Muller & Barton, 1989, 1991; Mulvenon, 1993; Rochon, 1991), they are complicated and in some cases not appropriate for univariate tests, making power estimation difficult for designs with such r structures. In addition, how power is affected under conditions of nonsphericity has not been well documented.
As a result of these problematic issues, power estimation, particularly for RM designs with multiple within-subject variables or heterogeneous r matrices, remains difficult and/or obsolete. Currently, investigators whose experimental designs involve multiple RM factors or nonspherical r matrices are faced with either avoiding the power issue altogether or estimating values using procedures that are difficult and/or less practical and applicable for their design. Since such designs are frequently encountered in the field of human kinetics, it is important that the power of these statistical tests under varying conditions be determined and their values made available.

While the complexity involved in deriving power estimates for these conditions may discourage attempts at resolving these issues analytically, an alternative but somewhat less accurate method for accomplishing this task is Monte Carlo simulation. This process, which uses computer simulation to generate several hundred or thousand replications of a particular RM analysis test, can provide approximations of power based on the number of tests found significant. By tabulating these results, both power and sample size values can be made available for specific RM designs under varying conditions. In the past, Monte Carlo simulation has proven useful for approximating power in two-way mixed RM ANOVA models (Grima, 1987; Mendoza et al., 1974; Muller & Barton, 1989), but this method has not been extended to designs involving two or more RM factors and has received only limited use among tests that do not meet the assumption of sphericity.

Purposes of Study

The problems inherent with power estimation for RM designs served as the rationale for this present study.
One of the main purposes of this investigation was to provide researchers in the field of human kinetics with a more accessible method for determining univariate power of one- and two-way ANOVA designs involving single and double RM factors. This was accomplished by generating power values using analytical and Monte Carlo methods and making these estimates available in the form of power tables for varying conditions of sample size, effect size, and magnitudes and patterns of correlation. A secondary purpose of this study was to describe some of the power trends which occurred under conditions of nonsphericity and among tests having two RM factors.

Definitions

The following terms appear frequently throughout this dissertation and have been defined to facilitate understanding.

One- and Two-Way RM ANOVA
A univariate analysis of variance test having one and two repeated measures (within-subjects) variables, respectively.

Two-Way Mixed ANOVA
A univariate analysis of variance test having one repeated measures (within-subjects) and one randomized-group (between-subjects) variable.

Test
Refers to any of the F tests in a one- or two-way ANOVA design (main effects and interaction).

Experimental Condition
Refers to any of the statistical parameters in a design including effect size, sample size, alpha, number of repeated measures and magnitude and pattern of correlation.

Constant r matrix pattern
Refers to a correlation (r) structure in which all coefficients included are equal in magnitude and meet the assumption of sphericity (see section II of chapter 2 for a definition of this assumption). Synonymous terms include a "spherical design/condition" and "r structures with high epsilon (1.00)".

Trend r matrix pattern
Refers to an r structure in which the coefficients decrease in magnitude across the levels of the matrix (simplex pattern), resulting in violation of the assumption of sphericity.
Synonymous terms include a "nonspherical design/condition" or "r structures with low epsilon (<.56)".

Pooled Factor
Refers to the factor (e.g. A or B) in a two-way ANOVA whose levels (scores) are averaged-over or "pooled" to produce a single score for each level of the other factor. A synonymous term for this is an "averaged-over factor".

Chapter Two
Literature Review

This chapter discusses important aspects related to power analysis of RM designs. The first section includes an explanation of those factors which influence power and how they are interrelated. The second section discusses vital assumptions about RM designs and illustrates how violation of some can affect power of the univariate F test. The last section concludes with a review of methods currently available for estimating power in RM ANOVA.

I. Factors Related to Power

The following provides a brief explanation of the theory behind power estimation and how effect size, level of significance, sample size, correlation and the noncentrality parameter relate to statistical power.

Noncentrality Parameter ($\lambda$)

When the null hypothesis is false, the F ratio (MS trials/MS error) for the one-way RM ANOVA or main effect of a factorial RM test no longer assumes a central F distribution (Howell, 1992; Winer et al., 1991).
Rather it follows a noncentral F distribution based upon the noncentrality parameter, $\lambda$ (lambda), where,

$$E(F) = \frac{df_2}{df_2 - 2}\left(1 + \frac{\lambda}{df_1}\right) \tag{2.00}$$

and,

$$\lambda = \frac{n\sum_{j}(\mu_j - \mu)^2}{\sigma_e^2} \quad \text{for a one-way RM ANOVA} \tag{2.01}$$

$$\lambda_A = \frac{nq\sum_{i}(\mu_i - \mu)^2}{\sigma_e^2} \quad \text{for the grouping factor (A) main effect of a two-way mixed ANOVA} \tag{2.02}$$

$$\lambda_B = \frac{np\sum_{j}(\mu_j - \mu)^2}{\sigma_e^2} \quad \text{for the RM factor (B) main effect of a two-way mixed ANOVA} \tag{2.03}$$

$$\lambda_{AB} = \frac{n\sum_{i}\sum_{j}(\mu_{ij} - \mu_i - \mu_j + \mu)^2}{\sigma_e^2} \quad \text{for the grouping x RM interaction (AxB) of a two-way mixed ANOVA} \tag{2.04}$$

$E(F)$ represents the expected value for the overall F ratio, $df_1$ and $df_2$ are the degrees of freedom for the numerator and denominator of the F statistic, $\mu_{ij}$ = the cell mean, $\mu_i$ and $\mu_j$ = the marginal means for the levels of the randomized group (RG) and repeated measures (RM) factors respectively, $\mu$ = the grand mean, $n$ = the sample size per group, $p$ and $q$ = the number of levels of the RG and RM factors respectively and $\sigma_e^2$ = the error variance for the specific effect (Bradley, 1989). The noncentrality parameter represents the factor by which the F ratio departs from the central F distribution and signifies the true distribution of F when a difference between means actually exists (Howell, 1992; Winer et al., 1991).

Closely associated with $\lambda$ is another statistic, $\phi$ (phi), which is a function of $\lambda$ and the number of trials and/or groups in the design. For a one-way RM ANOVA with $k$ trials it is represented as,

$$\phi = \sqrt{\frac{\lambda}{k}} \tag{2.05}$$

While some researchers use $\lambda$ (Bradley, 1989; Howell, 1992) and others $\phi$ (Winer et al., 1991) in their discussions on power, both are considered general representations of a noncentral measure of the F distribution.

The relationship between power and $\lambda$ is a curvilinear one. Generally, as $\lambda$ increases so does the power of the test until a maximum level is reached, at which point any further increase in $\lambda$ has no effect on power. Therefore, an experimental design with a large difference between means (e.g.
a big $\sum(\mu_j - \mu)^2$) will have greater power than one with a smaller difference, all other conditions being constant. Lui and Cumberland (1992) and Vonesh and Schork (1986) expressed the relationship between $\lambda$ and power using the cumulative distribution function,

$$\text{Power} = P\left[F(w, df_1, df_2, \lambda) > F(\alpha, df_1, df_2)\right] \tag{2.06}$$

where $F(\alpha, df_1, df_2)$ is the upper $\alpha$th percentile of the F-distribution with $df_1$ and $df_2$ degrees of freedom and $F(w, df_1, df_2, \lambda)$ is the non-central F-distribution with the noncentrality parameter $\lambda$. When a true effect exists, most other factors related to power exert their influence by either increasing or decreasing $\lambda$.

Effect Size

Effect size (ES) represents the magnitude of the difference between means or the extent of the treatment effect in standard deviation units (Lipsey, 1990). Cohen (1988) describes ES for a design involving only two groups as the standardized difference between the means represented by $d$, where,

$$d = \frac{\mu_A - \mu_B}{\sigma} \tag{2.07}$$

and $\mu_A - \mu_B$ is the difference between the population means and $\sigma$ = the common or pooled standard deviation of the two groups. Since both the numerator and denominator are expressed in the units of the dependent variable, ES, like $\lambda$, is unitless. For those designs with three or more groups (RG ANOVA), Cohen (1988, p. 275) expresses ES as $f$, the 'standard deviation of the standardized means', represented by,

$$f = \frac{\sqrt{\sum_{j}(\mu_j - \mu)^2 / k}}{\sqrt{\sigma^2}} \tag{2.08}$$

where $k$ = the number of groups in the design and $\sigma^2$ = the average within-groups error variance. Since both $d$ and $f$ are representations of ES, they should be related to one another. Cohen (1988) shows this relationship for the condition where population means are evenly spread apart as follows:

$$f = \frac{d}{2}\sqrt{\frac{k + 1}{3(k - 1)}} \tag{2.09}$$

where

$$d = \frac{\mu_{max} - \mu_{min}}{\sigma} \tag{2.10}$$

and $\mu_{max} - \mu_{min}$ is the difference between the maximum and minimum population means. As shown by Winer et al.
(1991) and Bradley (1989), under the two-way mixed ANOVA model, a slight modification to Cohen's $f$ is required since the appropriate error term is different from that of the complete RG model for each test involved. For a main effect and interaction term involving one repeated measures factor, the error variance is represented by $\sigma^2_{SwG \times T}$, the subjects within-groups by trials interaction, which can be calculated from $\sigma^2$ as follows: $\sigma^2(1 - \rho)$, where $\rho$ is the average of the $k(k-1)/2$ correlation coefficients among the $k$ trials (Winer et al., 1991). For a main effect involving a randomized group factor, the error term, $\sigma^2_{SwG}$, is expressed as $\sigma^2_{SwG} = \sigma^2[1 + (q-1)\rho]$, the variance attributed to subjects within groups, where $q$ = the number of levels of the RM factor. Therefore, substituting these modified error terms for a RM ANOVA test, $f$ can be expressed as,

$$f = \frac{\sqrt{\sum_{j}(\mu_j - \mu)^2 / k}}{\sqrt{\sigma^2(1 - \rho)}} \quad \text{for the one-way RM test} \tag{2.11}$$

$$f_A = \frac{\sqrt{\sum_{i}(\mu_i - \mu)^2 / p}}{\sqrt{\sigma^2[1 + (q-1)\rho]}} \quad \text{for the grouping factor (A) main effect of a two-way mixed ANOVA} \tag{2.12}$$

$$f_B = \frac{\sqrt{\sum_{j}(\mu_j - \mu)^2 / q}}{\sqrt{\sigma^2(1 - \rho)}} \quad \text{for the RM factor (B) main effect of a two-way mixed ANOVA} \tag{2.13}$$

$$f_{AB} = \frac{\sqrt{\sum_{i}\sum_{j}(\mu_{ij} - \mu_i - \mu_j + \mu)^2 / [(p-1)(q-1) + 1]}}{\sqrt{\sigma^2(1 - \rho)}} \quad \text{for the G x RM interaction (AxB) of a two-way mixed ANOVA} \tag{2.14}$$

It should seem evident from these formulae that effect size is directly related to $\lambda$ by,

$$\lambda = nkf^2 \quad \text{for the one-way RM ANOVA,} \tag{2.15}$$

$$\lambda = npqf^2 \quad \text{for main effects of the two-way mixed RM ANOVA,} \tag{2.16}$$

and

$$\lambda = n[(p-1)(q-1) + 1]f^2 \quad \text{for interaction of the two-way mixed RM ANOVA.} \tag{2.17}$$

Therefore any increase in effect size caused by a larger difference between means, a reduction in the variance of the dependent variable or a combination of both will increase $\lambda$ and thus power among these designs.

Level of Significance ($\alpha$)

Level of significance or alpha ($\alpha$), which represents the probability of rejecting a true null hypothesis (type I error) in significance testing, has a direct nonlinear relationship to power (Lipsey, 1990).
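The link between effect size and the noncentrality parameter stated in equation 2.15 can be checked with a short calculation. The sketch below assumes the one-way RM form of f described in the text (the standard deviation of the trial means divided by the square root of the error variance σ²(1 − ρ)); the function names and the example means are hypothetical and chosen only for illustration.

```python
import math


def cohen_f_one_way_rm(means, sigma, rho):
    """Effect size f for a one-way RM ANOVA, assuming error variance sigma^2 * (1 - rho)."""
    k = len(means)
    grand = sum(means) / k
    sigma_means = math.sqrt(sum((m - grand) ** 2 for m in means) / k)
    sigma_error = math.sqrt(sigma ** 2 * (1.0 - rho))
    return sigma_means / sigma_error


def lambda_from_f(n, k, f):
    """Noncentrality parameter from effect size: lambda = n * k * f^2 (equation 2.15)."""
    return n * k * f ** 2


def lambda_direct(n, means, sigma, rho):
    """Noncentrality parameter computed directly as n * sum of squared deviations / error variance."""
    grand = sum(means) / len(means)
    return n * sum((m - grand) ** 2 for m in means) / (sigma ** 2 * (1.0 - rho))


if __name__ == "__main__":
    means, sigma, n = [10.0, 11.0, 12.0], 2.0, 10
    for rho in (0.4, 0.8):  # the two correlation magnitudes studied in the thesis
        f = cohen_f_one_way_rm(means, sigma, rho)
        print(rho, round(f, 4), round(lambda_from_f(n, len(means), f), 4),
              round(lambda_direct(n, means, sigma, rho), 4))
```

Both routes give the same value of the noncentrality parameter, and raising the average correlation raises f and the noncentrality parameter together, which is the behaviour the text attributes to RM tests.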
This relationship is reflected in equation 2.06, which shows that when $\alpha$ is increased for any given numerator and denominator degrees of freedom, the critical F ratio of the central F distribution decreases (shifts to the left or more towards the central part of this distribution). Since power represents that proportion of the noncentral distribution immediately right of this value, a decrease in the critical F causes a greater proportion of the noncentral F distribution to fall within the rejection zone, thereby increasing power. Unlike other factors related to power, $\alpha$ represents one of the few parameters a researcher has complete control over since it is set by the investigator him/herself. Unfortunately, by convention, $\alpha$ is almost universally set at .05 or .01 and is rarely tolerated above the 5% level (Lipsey, 1990). Therefore the extent to which an alteration in $\alpha$ improves the power of a design is modest at best and in many cases the least effective of the related factors.

Sample Size (n)

Sample size or the number of subjects ($n$) in a group has a direct nonlinear relationship to power as expressed in equations 2.01-2.04 for $\lambda$. Since $n$ is found in the numerator of these formulae, it is clear that for any given RM design, an increase in $n$ will result in an increase in $\lambda$ and therefore power, when all other parameters are held constant. A less mathematical approach to interpreting the relationship of $n$ to power can be explained by the central limit theorem. This theorem states that the sampling distribution of the mean will approach a normal distribution as $n$ increases (Howell, 1992). In other words, since the sample mean is distributed normally with a standard deviation of $s/\sqrt{n}$, an increase in $n$ is more likely to produce a sample mean that deviates less from the true population mean.
Therefore, if a difference exists in the population or there is a true treatment effect, then the sample means are more likely to reflect this as $n$ becomes larger. In a visual display of sampling distributions represented by large $n$'s, this is reflected as less overlap between those distributions that are truly different from one another, an example of the greater power that exists with larger sample sizes. In an ANOVA test, the influence of increased sample size results in a larger mean sum of squares for the treatment effect ($MS_{treatment}$) since the variance associated with this effect is weighted by $n$ (e.g. $n\sum(\mu_j - \mu)^2$). The overall result is a larger numerator and therefore a bigger F ratio.

Correlation

One of the advantages of RM designs over randomized group (RG) designs is that the former allow the overall variability of treatment scores to be reduced for those effects involving a RM factor (Howell, 1992). This is because the dependency or correlation that results among scores when the same subjects are used for all conditions offers an opportunity to remove between-subject differences from the within-groups (RG) error term ($\sigma^2$). This was illustrated earlier in equations 2.11-2.14 in which $f$, the effect size index for randomized group ANOVA designs, was modified to account for correlated measures within the RM designs. These equations show how the correlation among the levels of the treatment condition modifies the RG error variance to produce unique error terms for the different effects associated with the RM ANOVA design. Winer et al. (1991, pp. 261-267) explained these modifications mathematically.
For a main effect or one way ANOVA involving a RM factor, the mean sum of squares error ($MS_{error}$) is given by,

$MS_{error} = \overline{var} - \overline{cov}$ or $MS_{error} = \sigma^2 - \sigma_{cov}$ (2.18)

where $\sigma^2$ = the mean variance among the levels of the treatment condition and $\sigma_{cov}$ is the average of all the covariances in the variance-covariance matrix of the RM factor which, as Winer et al. (1991) point out, solely represents the variance attributed to between subject differences. Since $\sigma_{cov} = \sigma^2\rho$, with substitution we arrive at,

$MS_{error} = \sigma^2 - \sigma^2\rho = \sigma^2(1-\rho)$ or simply $\sigma^2(1-\rho)$ (2.19)

which is used in the denominator of equations 2.11, 2.13 and 2.14. With subject differences reduced, the error term now represents the residual variance associated with the subjects by treatment interaction (one way design) or the subjects within-groups by treatment interaction (mixed factorial design).

It should therefore seem evident that the degree to which the error variance is reduced, as shown in equations 2.11, 2.13 and 2.14, is dependent on the magnitude of the average correlation among the treatment conditions. As the average correlation increases, the degree to which the error term is reduced also increases. For a one-way RM ANOVA design or RM main effect and interaction of a mixed factorial test, this reduction in error variance produces a greater $f$ and $\lambda$ and a concomitant increase in the power of the associated effect when a true difference exists. With regard to ANOVA test results, a reduction in the error term ($MS_{error}$) will produce a larger F value, all else being equal. Unlike other factors affecting power, the relationship between correlation and power is not consistent for all effects within a factorial RM design. Winer et al.
(1991) describe that for main effects involving a grouping factor, the $MS_{error}$ (represented by $MS_{Swg}$, the subjects within groups variance) is directly related to correlation by,

$MS_{error} = \overline{var} + (q-1)\overline{cov} = \sigma^2 + (q-1)\sigma^2\rho = \sigma^2[1 + (q-1)\rho]$ (2.20)

Therefore, from equation 2.20 we see that in a mixed ANOVA design, the error term for a grouping main effect will actually increase as the number of levels and/or the correlation among the treatment conditions increases. This makes sense when one considers that averaging many highly dependent scores across levels of a RM factor will produce less residual variance among scores for each subject, enhancing between-subject differences within a group and therefore resulting in greater within-group variability than when many independent scores are averaged. In contrast, a lower correlation among scores will decrease differences between subjects due to a higher residual variance in each person's scores and thus reduce within-group error.

The effects of altering the within-groups variance on the effect size and power of a mixed ANOVA test can be demonstrated by equations 2.12 and 2.16. As the error term increases with a rise in correlation, $f$ and $\lambda$ decrease, resulting in less power to detect a true group effect. In a simulated ANOVA test, this will result in a smaller F ratio.

The effects of correlation on the error variance and power of different tests in the two-way mixed ANOVA model, as described in this section, may also be applied to those RM designs with two or more randomized group factors. The only difference is that an additional variable is required in equations 2.12-2.17 for each new factor added to the design.
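The opposite effects of correlation on the two error terms can be illustrated numerically. The following is a minimal sketch, assuming the error variances $\sigma^2(1-\rho)$ (equation 2.19) and $\sigma^2[1+(q-1)\rho]$ (equation 2.20) and a Cohen-style $f$ defined as the standard deviation of the means divided by the square root of the error variance. The function names and the numeric values (sigma2, sigma_m, q) are hypothetical and chosen only for illustration.

```python
import math


def rm_error_variance(sigma2, rho):
    """Error variance for effects involving the RM factor: sigma^2 * (1 - rho)."""
    return sigma2 * (1.0 - rho)


def group_error_variance(sigma2, rho, q):
    """Error variance for the grouping main effect: sigma^2 * [1 + (q - 1) * rho]."""
    return sigma2 * (1.0 + (q - 1) * rho)


def effect_size_f(sigma_m, error_variance):
    """Cohen-style f: SD of the population means over the SD of the error term."""
    return sigma_m / math.sqrt(error_variance)


if __name__ == "__main__":
    sigma2, sigma_m, q = 100.0, 2.5, 3  # hypothetical illustration values
    for rho in (0.0, 0.4, 0.8):
        f_rm = effect_size_f(sigma_m, rm_error_variance(sigma2, rho))
        f_group = effect_size_f(sigma_m, group_error_variance(sigma2, rho, q))
        print(f"rho={rho:.1f}  f(RM effect)={f_rm:.3f}  f(group effect)={f_group:.3f}")
```

As the average correlation rises, the printed $f$ for the RM effect grows while the $f$ for the grouping effect shrinks, which is the pattern described in the text.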
However, for designs with two or more RM factors in which a separate variance-covariance matrix exists for each variable and interaction of variables having repeated measures, no mathematical derivations are available at present to explain what influence multiple matrices will have on the error variance of different tests. While it may seem the equation for calculating the appropriate error term in single RM factor ANOVA (2.19) may also apply to these tests, a major problem exists in determining what correlation value to use in the calculations. Results from a preliminary project conducted by this researcher suggest that simply using the average correlation of each respective matrix in equation 2.19 will not suffice when other RM factors are present (Potvin and Schutz, 1995). This is because the average correlation of one factor seems to be influenced by the magnitude of the average correlation and number of repeated measures of the other factor(s) present in the design. Therefore, under these conditions, the effect on error variance of these tests can be expected to differ from what would otherwise be observed if the average correlation coefficient of a pooled or overall matrix was simply used. This implies the relationship between correlation and error variance is different from that expressed in equation 2.19. It appears no other work has been undertaken to explain this relationship analytically for designs with two or more RM factors. Thus the effects of multiple RM variables on power are currently unknown.

II. Power, RM ANOVA Assumptions and Violation of Sphericity

A. Univariate Assumptions

In addition to the usual statistical assumptions necessary for all ANOVA designs (multivariate normality, homogeneity of variances, linearity and independence across subjects/units), RM tests are also restricted by certain assumptions concerning the structure of the variance-covariance matrix(es) involved.
These include the assumptions of compound symmetry, circularity and sphericity.

Compound Symmetry: When all the variances of treatment levels are equal (i.e. $\sigma_j^2 = \sigma_k^2 = \sigma_l^2$) as well as all their covariances (i.e. $\sigma_{jk} = \sigma_{jl} = \sigma_{kl}$), the resultant variance-covariance matrix, $\Sigma$, is said to have a pattern of compound symmetry, CS (Winer et al., 1991). Since $\Sigma$ is simply a different expression of the correlation matrix, this implies that the correlations between observations of any pair of treatment conditions in the matrix are also all equal (i.e. $\rho_{jk} = \rho_{jl} = \rho_{kl}$). Huynh & Feldt (1970) showed that the assumption of equal correlations (compound symmetry) is a sufficient but not necessary condition for the univariate RM ANOVA. This means that so long as other less restrictive assumptions hold (to be discussed), violations in the pattern of compound symmetry will not require adjustments to the critical F ratio of the univariate test.

Circularity: Unlike CS, the assumption of circularity does not require that all covariances of a matrix be homogeneous. Instead, Winer et al. (1991) demonstrated that if the variances of all pairwise differences between treatment means equal a constant, then the F test will be valid and not require adjustment. This less restrictive assumption can be expressed algebraically as follows:

$\sigma_j^2 + \sigma_k^2 - 2\sigma_{jk} = 2\lambda$ for all $j, k$ pairs (2.21)

where $\sigma_j^2$ and $\sigma_k^2$ are the variances for a pair of treatment conditions, $\sigma_{jk}$ is the covariance for the pair and $2\lambda$ is the constant (here $\lambda$ denotes a constant of the covariance structure, not the noncentrality parameter). Winer et al. also showed that since

$\overline{var} - \overline{cov} = \lambda$ or $\sigma^2 - \sigma_{cov} = \lambda$ (2.22)

that is, the difference between the average variance and average covariance of a matrix is equal to a constant, circularity also implies that the residual error variances for all pairs of treatments are homogeneous (see also equations 2.18 & 2.19). Since CS is a special case of circularity, it is possible for a matrix to have circularity but not CS and still result in a valid F test.
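The distinction between compound symmetry and circularity can be checked directly from a covariance matrix. The sketch below is illustrative only: the function names and the three example matrices are hypothetical, and the checks simply apply the definitions above (equal variances and equal covariances for CS; a constant variance of all pairwise differences, as in equation 2.21, for circularity).

```python
from itertools import combinations


def difference_variances(cov):
    """Variance of each pairwise difference: s_jj + s_kk - 2 * s_jk."""
    size = len(cov)
    return [cov[j][j] + cov[k][k] - 2.0 * cov[j][k]
            for j, k in combinations(range(size), 2)]


def is_compound_symmetric(cov, tol=1e-9):
    """True when all variances are equal and all covariances are equal."""
    size = len(cov)
    variances = [cov[j][j] for j in range(size)]
    covariances = [cov[j][k] for j, k in combinations(range(size), 2)]
    return (max(variances) - min(variances) <= tol
            and max(covariances) - min(covariances) <= tol)


def is_circular(cov, tol=1e-9):
    """True when every pairwise difference has the same variance."""
    diffs = difference_variances(cov)
    return max(diffs) - min(diffs) <= tol


# Hypothetical example matrices for k = 3 trials.
cs_matrix = [[10.0, 4.0, 4.0], [4.0, 10.0, 4.0], [4.0, 4.0, 10.0]]
circular_not_cs = [[4.0, 3.0, 4.0], [3.0, 6.0, 5.0], [4.0, 5.0, 8.0]]
trend_matrix = [[10.0, 8.0, 4.0], [8.0, 10.0, 8.0], [4.0, 8.0, 10.0]]

if __name__ == "__main__":
    for name, m in [("CS", cs_matrix), ("circular, not CS", circular_not_cs),
                    ("trend", trend_matrix)]:
        print(name, is_compound_symmetric(m), is_circular(m), difference_variances(m))
```

The second matrix has unequal variances and covariances yet every pairwise difference has variance 4, so it satisfies circularity without satisfying CS; the third satisfies neither.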
However, when the assumption of circularity is violated, the power or type I error rate of the F test can be seriously affected, requiring proper adjustment to the critical F value (Collier et al., 1967; Davidson, 1972; Huynh & Feldt, 1970; Muller & Barton, 1989; Mendoza et al., 1974; Rouanet & Lepine, 1970).

Sphericity: The condition of sphericity simply represents an alternative expression of the property of circularity. Rather than presenting the variables of a matrix as is, it is helpful when using matrix algebra to transform the covariance matrix using an orthonormal matrix. An orthonormal matrix here is one whose rows are normalized coefficients of orthogonal contrasts (i.e. like the coefficients used in trend analysis). As Winer et al. (1991) explain, under the assumption of sphericity, the transformed matrix should have the property,

$\Sigma_y = M^* \Sigma_x M^{*\prime} = \lambda I$ (2.23)

where $M^*$ is any orthonormal matrix of contrasts across repeated occasions with dimensions $(k-1) \times k$, $\Sigma_x$ represents the actual variance-covariance matrix, $M^{*\prime}$ is the transpose of $M^*$, $\lambda$ is a constant and $I$ is an identity matrix with ones on the diagonal (variances) and zeros on the off-diagonal elements (covariances). According to Winer et al., a transformed matrix ($\Sigma_y$) having the form $\lambda I$ is said to be spherical. Since sphericity is an alternative form of the circularity condition, the ramifications incurred to the F test when the assumption of sphericity is violated are similar to those described previously for conditions of noncircularity. For the purpose of clarity throughout this dissertation, the term sphericity will be used to indicate either assumption.

B. Multivariate Assumptions

Apart from the univariate approach, statistical tests involving RM can also be analyzed under a multivariate model (Hotelling's $T^2$ or MANOVA).
That is, each trial of a RM design may be regarded as a separate dependent variable and treated as such in the analysis. Under the multivariate model, most assumptions described previously for the univariate technique also hold for this test with the exception of the assumption of sphericity which is not required (Muller, LaVange, Ramey & Ramey, 1992). For this reason, the multivariate technique is often considered a better choice for analyzing RM designs when sphericity is not met since it does not require adjustment of a test statistic (Davidson, 1972; Grima, 1987; Huynh & Feldt, 1970; Mendoza et al., 1974; Rouanet & Lepine, 1970; Schutz & Gessaroli, 1987). However, when the assumption of sphericity is met, the univariate model, which results in a conventional F value ($MS_{treatment}/MS_{error}$), offers a valid and at times better approach (Davidson, 1972; Green, 1992; Mendoza et al., 1974; Muller & Barton, 1989). Thus, the decision of whether to use a univariate or a multivariate model for analysis of RM designs, in most cases, rests upon which test provides the greater power. Ideally, a researcher wishing to maximize the power of their RM design should determine power using both approaches and choose the technique offering the most power. Realistically however, this power comparison is rarely carried out since, anecdotally, the univariate approach is by far the more common one utilized by health and behavioral researchers, regardless of any power advantage the multivariate model may have.

C. Epsilon ($\varepsilon$)

When the assumption of sphericity is not met in univariate tests, Box (1954) showed that the degree to which this assumption is violated could be measured by the population parameter, epsilon ($\varepsilon$). $\varepsilon$ ranges from 1.00 to $1/(k-1)$ with a value of 1.00 representing perfect sphericity and $1/(k-1)$ representing the greatest level of violation possible under $k$ occasions.
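To make the range of epsilon concrete, the following is a minimal sketch that computes Box's epsilon from a $k \times k$ covariance matrix. It assumes the commonly cited closed form based on the mean of the diagonal, the row means and the grand mean of the matrix; the function name and example matrices are hypothetical, and applying the same formula to a sample covariance matrix gives the Greenhouse-Geisser type estimate.

```python
def box_epsilon(cov):
    """Box's epsilon for a k x k covariance matrix.

    epsilon = k^2 (mean_diag - grand_mean)^2 /
              ((k - 1) (sum of squared elements
                        - 2k * sum of squared row means
                        + k^2 * grand_mean^2))
    """
    k = len(cov)
    mean_diag = sum(cov[i][i] for i in range(k)) / k
    row_means = [sum(row) / k for row in cov]
    grand_mean = sum(row_means) / k
    sum_sq = sum(v * v for row in cov for v in row)
    numerator = (k * (mean_diag - grand_mean)) ** 2
    denominator = (k - 1) * (sum_sq
                             - 2.0 * k * sum(m * m for m in row_means)
                             + (k * grand_mean) ** 2)
    return numerator / denominator


# Hypothetical matrices for k = 3 trials.
constant_matrix = [[10.0, 4.0, 4.0], [4.0, 10.0, 4.0], [4.0, 4.0, 10.0]]
trend_matrix = [[10.0, 8.0, 4.0], [8.0, 10.0, 8.0], [4.0, 8.0, 10.0]]

if __name__ == "__main__":
    k = 3
    print("constant pattern:", round(box_epsilon(constant_matrix), 4))
    print("trend pattern:   ", round(box_epsilon(trend_matrix), 4))
    print("lower bound 1/(k-1):", 1.0 / (k - 1))
```

For the constant (compound symmetric) matrix the result is 1.00, while the trend-like matrix yields a value between the lower bound of .50 and 1.00.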
Since population values of $\varepsilon$ are rarely known, Greenhouse and Geisser (1959) suggested using the sample covariance matrix to approximate $\varepsilon$ and thus this estimate is often referred to as the G-G estimate, $\hat{\varepsilon}$. Huynh and Feldt (1976) showed that $\hat{\varepsilon}$ was a biased estimate of $\varepsilon$ when $\varepsilon > .75$ and they developed an alternate estimator (which takes into account sample size and the number of levels of the factors) which is referred to as the H-F estimator, $\tilde{\varepsilon}$.

In the event that a RM ANOVA test exhibits nonsphericity, $\varepsilon$ or its estimates, $\hat{\varepsilon}$ and $\tilde{\varepsilon}$, are used to correct the degrees of freedom of the critical F ratio. The adjustment to the critical value is necessary since several researchers have shown that under a true null hypothesis, the type I error rate (chance of rejecting a truly nonsignificant effect) is inflated (Davidson, 1972; Eom, 1993; Greenhouse & Geisser, 1959; Huynh & Feldt, 1976; Mendoza et al., 1974). By multiplying the degrees of freedom of the F statistic by one of these correction factors, a test is rendered more conservative (the critical F is increased) thereby decreasing the chance of making a type I error. Since $\varepsilon$, $\hat{\varepsilon}$ and $\tilde{\varepsilon}$ do not all produce identical values, the degree to which a test is adjusted is dependent on the value being used. As Muller and Barton (1989) explain, the uncorrected F test provides the least conservative adjustment possible, followed by $\tilde{\varepsilon}$, then $\varepsilon$ and finally $\hat{\varepsilon}$-adjusted tests. In situations where maximum protection against type I errors under nonsphericity is required, Greenhouse and Geisser (1959) suggested using the most conservative adjustment possible, that involving multiplication by the lower limit of $\varepsilon$, $1/(k-1)$. Under conditions where $\varepsilon = 1$ (sphericity), the correction factor is unnecessary.

D.
Power Calculations Under Different Conditions of Sphericity

When $\varepsilon = 1$, the uncorrected F follows an exact noncentral F distribution under a false null hypothesis or true effect (Muller & Barton, 1989). In this case correction using epsilon is not necessary and power can be calculated using those equations presented earlier in section I of this chapter. In the case where $\varepsilon \neq 1$, the uncorrected F does not follow an exact noncentral F distribution and therefore the usual derivations of power for a RM test may be misleading. Here, the effects on power under conditions of nonsphericity are not as well defined as they are for type I error rates. Several researchers have found power of the uncorrected F test to be overestimated as $\varepsilon$ decreases (Marcucci, 1986; Muller & Barton, 1986) while others have shown it to be underestimated (Mendoza et al., 1974) due to an increase in the number of outliers of F that occur. A reason for the contrasts in power trends seen in these studies may be a reflection of the magnitude of effect size involved. That is, large effect sizes (e.g. large differences between treatment means) are likely to exhibit lower power estimates under nonsphericity due to a greater overlap between existing population distributions while small effect sizes (small differences between means) are more likely to result in larger power values than expected because of less overlap between distributions. Regardless of whether power is over or underestimated, adjustment to the univariate F statistic is necessary in order to minimize inaccuracies. Muller and Barton (1989, 1991) and Muller et al. (1992) provided approximations of power for several adjusted F tests.
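The power computations referred to here can be sketched in code. The block below is a rough, dependency-free illustration and not the procedure of any of the cited programs: it evaluates the noncentral F distribution as a Poisson-weighted sum of incomplete beta terms, finds the central F critical value by bisection, and then scales the degrees of freedom and the noncentrality parameter by epsilon in the manner of the adjusted tests discussed in this section. All function names are hypothetical, and treating the expected value of the epsilon estimate as equal to the population epsilon by default is a simplifying assumption.

```python
import math


def _beta_cf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c, d = 1.0, 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def noncentral_f_cdf(x, df1, df2, lam):
    """Pr{F <= x} for a noncentral F with noncentrality lam (Poisson mixture)."""
    y = df1 * x / (df1 * x + df2)
    weight, total, j = math.exp(-lam / 2.0), 0.0, 0
    while True:
        total += weight * reg_inc_beta(df1 / 2.0 + j, df2 / 2.0, y)
        weight *= (lam / 2.0) / (j + 1)
        j += 1
        if weight < 1e-16 and j > lam / 2.0:
            return total


def f_critical(alpha, df1, df2):
    """Central F critical value with upper-tail area alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if reg_inc_beta(df1 / 2.0, df2 / 2.0, mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    y = (lo + hi) / 2.0
    return df2 * y / (df1 * (1.0 - y))


def rm_power(lam, df1, df2, alpha=0.05, eps=1.0, expected_eps_hat=None):
    """Approximate power; eps = 1 gives the uncorrected (spherical) case."""
    e_hat = eps if expected_eps_hat is None else expected_eps_hat  # assumption
    crit = f_critical(alpha, df1 * e_hat, df2 * e_hat)
    return 1.0 - noncentral_f_cdf(crit, df1 * eps, df2 * eps, lam * eps)


if __name__ == "__main__":
    # Hypothetical one-way RM design: k = 3 trials, n = 10 subjects.
    for eps in (1.0, 0.6):
        print(eps, round(rm_power(lam=8.0, df1=2, df2=18, eps=eps), 3))
```

A pure-Python series was chosen here only to keep the sketch self-contained; a statistical library's noncentral F functions would normally be preferable in practice.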
An example of one of their methods for estimating power using the $\varepsilon$ and $\hat{\varepsilon}$ correction factors is as follows. First, the epsilon-adjusted critical value of any F test is found from a central (inverse) F distribution function, namely

$F_{crit}(E(\hat{\varepsilon})) = FINV[1-\alpha,\ df_1 \cdot E(\hat{\varepsilon}),\ df_2 \cdot E(\hat{\varepsilon})]$,

where $FINV[1-\alpha,\ df_1 \cdot E(\hat{\varepsilon}),\ df_2 \cdot E(\hat{\varepsilon})]$ represents the value of the F statistic based on epsilon-adjusted numerator ($df_1$) and denominator ($df_2$) degrees of freedom such that $\Pr\{F \le F_{crit}\} = 1-\alpha$, the probability that F observed will be less than or equal to epsilon-adjusted F critical. Here, $E(\hat{\varepsilon})$¹ represents the expected estimate of $\varepsilon$ which according to Muller and Barton (1989) is a better measure to use over sample $\hat{\varepsilon}$ for adjusting the degrees of freedom since they found it to improve the accuracy of power approximations under conditions of nonsphericity. Second, the noncentrality parameter (NCP) is calculated using the appropriate function from equations 2.00-2.04 and then multiplied by Box's $\varepsilon$ as follows,

$NCP(\varepsilon) = \lambda \cdot \varepsilon$. (2.24)

Finally, power is computed from a noncentral F distribution function as,

$Power(\hat{\varepsilon}) = 1 - FPROB[F_{crit}(E(\hat{\varepsilon})),\ df_1 \cdot \varepsilon,\ df_2 \cdot \varepsilon,\ \lambda\varepsilon]$,

where $FPROB[F_{crit}(E(\hat{\varepsilon})),\ df_1 \cdot \varepsilon,\ df_2 \cdot \varepsilon,\ \lambda\varepsilon]$ represents the noncentral F distribution function, namely $\Pr\{F \le F_{crit}\}$, for a $E(\hat{\varepsilon})$-corrected noncentral F statistic based on $\varepsilon$-adjusted numerator and denominator df and the $\varepsilon$-adjusted noncentrality parameter. In addition to the $\hat{\varepsilon}$-adjusted functions above, Muller et al. (1992) provided power functions for all other corrected tests as well (see p. 1215 in their article). As they explained, the only difference among these functions is the way in which the critical value is determined.

¹ $E(\hat{\varepsilon})$ represents the long range average of $\hat{\varepsilon}$ from many sample estimates and can be approximated using formulae 2.16-2.19 of Muller & Barton (1989).

III. Power Determination For RM Designs

A.
Review of Past Work

Over the last 60 years, both analytical and Monte Carlo simulation methods have been used to obtain power and sample size estimates for a variety of ANOVA designs. Work in this area has mostly focused on randomized group designs (Barcikowski & Holthouse, 1972; Borenstein & Cohen, 1988; Borich & Godbout, 1974; Cohen, 1969, 1988; Kraemer & Thiemann, 1987; Koele, 1982; Pearson & Hartley, 1951; Rotton & Schonemann, 1978; Tang, 1938; Tiku, 1967) while efforts at providing estimates for those involving repeated measures have only been attempted in the last 25 years (Davidson, 1972; Grima, 1987; Marcucci, 1986; Mendoza et al., 1974; Muller & Barton, 1989; Mulvenon, 1993; Robey & Barcikowski, 1984). Part of the reason for this lag, despite the frequent use of RM designs in the behavioral and biological sciences (Edgington, 1974), is due perhaps to challenges statisticians faced in deriving power formulae that could account for a correlation structure and the frequent conditions of nonsphericity associated with these designs. Despite these challenges however, methods for providing power estimates have been successfully implemented for the one way within-subjects and two way mixed models.

One Way RM Designs

The earliest efforts to approximate power for the RM ANOVA model involved those designs with a single group within-subjects variable. Davidson (1972) was among the first to provide power estimates for this design when he compared analytical approaches to power using univariate (uncorrected and conservative F) and multivariate (Hotelling's $T^2$) methods. In his study, Davidson derived power values for designs involving a range of RM levels (3, 6 & 16), noncentrality parameters (.5 to 3.0) and sample sizes (4 to $\infty$) in which the covariance matrix either met or violated the assumption of compound symmetry.
His findings revealed that when $\varepsilon = 1$, the uncorrected univariate test exhibited the greatest power but that the multivariate test approached an almost equal level as $n$ increased. When $\varepsilon < 1$, the multivariate test was found to be almost always more powerful than the $\varepsilon$-adjusted F test except when effect size was large and in some cases when sample size was small.

One of the first attempts to computerize procedures for estimating power in RM designs was conducted by Barcikowski (1973). He developed a computer program for calculating power of the one-way RM design through use of the multivariate (Hotelling's $T^2$) statistic and in later years, with the collaborative efforts of another researcher, extended his program to include power determination under the univariate model as well (Robey & Barcikowski, 1984). Their more recent program allowed input of treatment means, levels of repeated measures, several sample sizes (up to 20 at a time) and level of significance ($\alpha$). It also provided an option for users to determine the power of several univariate tests (uncorrected, $\varepsilon$-adjusted and $\tilde{\varepsilon}$-adjusted F tests) under conditions of sphericity and nonsphericity. Unfortunately, these researchers did not provide detailed descriptions of the power computations involved in their methods.

Similar to Davidson's (1972) study, Marcucci (1986) compared the power and type I error rate of univariate and multivariate tests for a single RM factor design under conditions of sphericity and nonsphericity. Power estimates were derived analytically using an approximation to the distribution of the F statistic and values for 3 competing tests (the uncorrected and Box's $\varepsilon$-adjusted F tests and Hotelling's $T^2$) involving either 3, 4 or 5 repeated measures were provided over a range of covariance structures ($\varepsilon$ = 1.00, .98, .90 & .72), effect sizes (zero to high) and sample sizes (10 & 20).
Results derived from the approximations were similar to those seen in the Davidson (1972) study with the conventional F test having the highest power when the assumption of sphericity was met. As $\varepsilon$ decreased however, the multivariate method was shown to gain a substantial power advantage over the univariate tests under most conditions. In particular, the multivariate test was most sensitive in detecting small mean differences among highly correlated trials when the correlation between the other treatment conditions present was low. In addition to comparing values between tests, Marcucci (1986) also provided evidence supporting the accuracy of his power formulae.

Within the same year Marcucci published his power formulae, Vonesh and Schork (1986) provided an analytical solution for determining sample size in the univariate (uncorrected) and multivariate (Hotelling's $T^2$) analysis of single-sample repeated measurements. Sample sizes were derived and tabulated for power values of .8 and .9 and alpha levels of .01 and .05 under a range of conditions involving different effect sizes (Cohen's $d$ = 1 to 3), minimum correlation values (0-.9) and repeated measurements (3-6). However, sample sizes were given only for the multivariate model and no attempt was made to compare power between different tests.

Two Way ANOVA With One RM Factor

The two way mixed design and its associated multiple tests presented a more complex model for which to approximate power. Mendoza et al. (1974) were among the first to provide power estimates for this model. They used Monte Carlo simulation to compare power values between univariate and multivariate tests for the trial and interaction effects of a 3 (groups) by 4 (trials) design.
Power and type I error estimates were computed for four univariate tests (conventional, $\varepsilon$-, conservative $\varepsilon$- and $\hat{\varepsilon}$-corrected) and two multivariate tests (Hotelling's $T^2$ for the trials effect and Roy's Largest Root criterion for the interaction effect) under conditions that either met or violated the assumptions of normality (normal or skewed distribution) and sphericity ($\varepsilon$ = 1, .5087 & .5365). Simulations were conducted for three different effect sizes (none, small and large) using a single sample size (9 per group) at alpha = .05. From their results, they concluded that the uncorrected F was the more powerful test under all conditions and effects when $\varepsilon = 1$ but that the multivariate tests provided superior power over all univariate tests when $\varepsilon < 1$. The only exception was for the multivariate interaction term which displayed less power when effect size was large. Skewed distributions were found to have little effect on the results.

In a similar but unpublished study, Grima (1987) also examined power and type I error differences between univariate tests (uncorrected F and $\varepsilon$-adjusted F) and multivariate tests (Hotelling's $T^2$ for trial effects and Wilks', Hotelling-Lawley's, Pillai-Bartlett's and Roy's criteria for interaction effects) using Monte Carlo simulation on the same 3 x 4 design as Mendoza et al. (1974). Grima extended the work of her predecessors by generating values under varying variance-covariance structures that conformed either to CS, sphericity or multisample circularity² ($\hat{\varepsilon}$ = .98, .81 & .99, respectively). Small and moderate effect sizes were chosen (Cohen's $f$ = .15 & .25) and a range of sample sizes selected (13-98 per group) so that power values obtained approximated fixed values of .75, .80 and .85³. The conventional and corrected F tests were found to give greater power than the multivariate tests under assumptions of CS and sphericity but the difference between these tests decreased as sample size increased.
Under some conditions of multisample sphericity for both the trials and interaction effects, several of the multivariate tests were observed to approach or even surpass the power of the univariate tests when effect size was small or moderate.

One of the first analytical approaches to power for the two-way design was reported by Marcucci (1986) who, in addition to deriving formulae for the simple RM design, also provided a solution for the mixed model. He illustrated how power for both the trials and interaction effects of the univariate and multivariate tests could be approximated using the same formulae for the one-way design with only minor substitution of expressions for $\lambda$ and error terms. Results from the application of these formulae however, were not presented or discussed as was done for the single factor model.

² Grima (1987 pp. 52-68) used the term reducibility to describe sphericity and multisample sphericity to describe the condition where all groups of a particular design exhibit homogeneous spherical covariance matrices.

³ In fact, the power estimates generated from this study were actually greater than expected theoretical values of .2, .5 & .8 since the latter were based on Cohen's tables (1977) for randomized group ANOVA.

In an effort to improve the accuracy of power approximations under conditions of nonsphericity, Muller and Barton (1989, 1991) derived formulae for several corrected univariate tests of the mixed model. Their study was an extension of earlier work by Muller and Peterson (1984) who provided convenient power approximations for several multivariate tests. In the Muller and Barton study, power formulae were given for the uncorrected, the $\varepsilon$-adjusted and $\tilde{\varepsilon}$-adjusted F tests. These authors proposed that the critical F of a particular test should be corrected using the "expected value of the epsilon estimator" (i.e.
$E(\hat{\varepsilon})$ or $E(\tilde{\varepsilon})$) rather than the sample $\hat{\varepsilon}$ or $\tilde{\varepsilon}$ usually used in significance testing (see section II of this chapter for an example of procedures involved). Muller and Barton (1989) tested the accuracy of their approximations by comparing their computed values (uncorrected and $\hat{\varepsilon}$ tests only) with the simulated results from the Mendoza et al. (1974) study described earlier. They also compared their analytical values with results from their own Monte Carlo simulation for the interaction effect of a 3 x 4 design involving varying covariance structures ($\varepsilon$ = 1.00, .897, .814, .757, .533 & .532), sample sizes (N = 15 & 30) and power estimates (.2, .5 & .8). They found their power approximations to be generally accurate (the largest absolute difference found was .052) but recommended using the $\hat{\varepsilon}$-corrected test over the $\tilde{\varepsilon}$-adjusted one in future cases of nonsphericity since the former provided the most power while maximizing control of type I error.

In a follow-up of their own work, Muller et al. (1992) extended their power approximations to include Geisser-Greenhouse's conservative test as well. Using a case study involving a two way mixed model, they also demonstrated how their multivariate and univariate power equations could be used effectively to provide important information during the planning of a study. Examples of power estimates were presented for the interaction effect of the $\hat{\varepsilon}$-adjusted test and Wilks's LR criterion under conditions involving different covariance matrices ($\varepsilon$ = 1.0 & .9), sample sizes (N = 100, 200 and 400), number of repeated measures (2 & 3) and effect sizes.

As a continuation of Muller and Barton's (1989, 1991) work, Mulvenon (1993) provided four alternative methods for calculating the power of the univariate F test of the mixed model without the need of computing an expected value of epsilon ($E(\hat{\varepsilon})$), as required with existing formulae. Since in practice
Since in practice A estimates of population values for the variance-covariance matrix are rarely known, Mulvenon suggested using sample values of I instead when computing power . His study investigated how the use of sample values affected the accuracy of four new formulae in estimating power. The equations involved included a modified version of the one derived by Muller and Barton (1989) as well as three others developed by Betz and Thompson (1990, unpublished). Results from these new equations were compared with those generated using the existing procedures recommended by Muller and Barton (1989, 1991). Through Monte Carlo simulation, mean power estimates were obtained for each formulae under conditions similar to those used by Muller and Barton (1989) as well a range of other design conditions involving different sample sizes (10, 20 & 30), e values (.35, .55. 75, .80, .85, .90 & .95), repeated measures levels (3, 5, 7, & 9) and fixed effect sizes (chosen to reflect power values of .2, .5 & .8). In most cases, the new formulae proved reasonably accurate at approximating power compared to the existing method with the modified Muller and Barton (1989,1991) equation appearing to be the most reliable of the four. Accuracy was found to improve with greater sample sizes and decrease with higher numbers of repeated measures while the degree of nonsphericity (e) had little effect on their reliability. Methods have also been conducted for estimating sample size in the two-way mixed design. Rochon (1991) extended the sample size formulae of Vonesh and Schork (1986) to include a betweensubjects variable. However, unlike his predecessors, he provided analytical procedures for only the multivariate test (Hotelling's 7% He also accounted for the correlation structure of the RM variable by deriving formulae for different patterns of the correlation matrix (compound symmetry and autoregression). 
Tables of sample size were provided for different hypotheses (multivariate, group main effect and group by time interaction) under varying conditions of effect size (Cohen's $d$ = .1, .3, .5, .7, .9, 1.1), correlation (0 to .9), covariance pattern and repeated measures (3, 5, 7, 9). All sample sizes presented corresponded to a power value of .8 and alpha level of .05.

Other research efforts related to sample size include works by Bloch (1986) and Lui and Cumberland (1992). These researchers also provided sample size formulae for the two-way mixed model but unlike Rochon (1991), used a univariate technique instead. Unfortunately, their solutions only applied to the between-subjects main effect and did not include the other hypotheses (RM main effect and interaction).

ANOVA Designs With One Repeated Measures Variable and Two or More Between-Subject Variables

Although no published work dealing directly with power solutions for ANOVA designs having two or more randomized group factors and a single within-subjects variable appears in the literature, present analytical methods for the two-way mixed model can be extended to these designs as well. Winer et al. (1991) showed how the numerator and denominator terms for the different tests of a k-way mixed model remain almost the same using the univariate approach. The only exceptions are an increase in the number of "degrees of freedom + 1" variables included in the effect size and noncentrality parameter formulae given in equations 2.12-2.17 (one extra variable for each new factor added) and the need to calculate power for higher order interactions (which parallels procedures for simple interactions). Therefore, by extending existing principles of the two-way mixed ANOVA, power determination for these designs under the univariate model is possible.
Summary of Previous Work

From the review presented, it seems evident that a variety of univariate and multivariate methods now exist for estimating power or sample size of single and mixed RM ANOVA designs. The choice of whether to use a univariate or multivariate technique seems dependent on whether certain conditions of the correlation matrix are met. Generally, when evidence of sphericity exists, the univariate approach is more powerful and therefore the better choice. However, when the assumptions of sphericity are severely violated, the multivariate model generally shows a power advantage and thus is the preferred analysis. Table 2.1 provides a summary of all relevant studies related to power and sample size estimation among RM designs discussed. Apparent from this review and Table 2.1, and of particular relevance to this proposed study, is the lack of methods currently available for estimating power or sample size among those designs involving multiple RM factors.

B. Computer Programs Available

Despite the availability of analytical solutions for determining power of some RM designs, the rather complex and often tedious nature of the procedures involved frequently discourages their use among investigators. Over the years, computer programs have been developed to expedite the process of power analysis and make it more appealing to researchers. Many of the programs available today are the result of experimenters' own efforts to incorporate and improve the application of their own or colleagues' approximations. Unfortunately, the majority of these programs involve crude FORTRAN-type subroutines requiring a certain amount of computer language and programming knowledge, which limits their use among the less computer-literate users. In addition, accessibility to some of these programs is often difficult. Recent commercial products have emerged that offer a more accessible and user-friendly approach to power determination in RM designs.
[Table 2.1. Summary of studies related to power and sample size estimation among RM designs; table body not recoverable from the extracted text.]

Borenstein and Cohen's (1988) power program, although not designed to provide estimates for RM ANOVA, can be modified to do so by manually calculating the appropriate effect size (f) and degrees of freedom for the RM test, and using these values in place of those for RG ANOVA. Programs that offer more direct methods of power estimation for one-way RM and two-way mixed designs include PASS (1991), SOLO (1992), DATASIM (1989) and Robey and Barcikowski (1984). DATASIM, a program developed by Bradley (1989), can determine power directly using a cumulative distribution function or indirectly using Monte Carlo simulation for both the one-way RM and two-way mixed designs. The software, however, is limited since power under conditions of nonsphericity can only be estimated through simulation (and then again, only for the one-way design) while direct computations of power require the user to calculate λ manually. PASS and SOLO are actually identical programs developed by the same author but distributed through two separate companies (Number Cruncher Statistical System (NCSS) & BMDP, respectively). Although these programs apparently allow direct computation of power for RM designs, inaccuracy of results obtained on test trials by the authors of this project has left doubts concerning the appropriateness of formulae involved. In addition, neither program accommodates differences in the patterns of covariance matrices and therefore, accurate power estimates under conditions of nonsphericity are unobtainable.
Although Robey and Barcikowski's (1984) program can compute power values under conditions of sphericity and nonsphericity, their program is limited to just the single group within-subjects model as described earlier. For purposes of comparison, the characteristics of these programs have been summarized in Table 2.2.

[Table 2.2. Comparison of power software: DATASIM (Bradley '89), SOLO/PASS '92, Robey & Barcikowski '84 and Statistical Power Analysis (Borenstein & Cohen '88), by design (one-way RM, two-way mixed, RM x RM), tests for which power is given, whether covariance structure is accounted for, and Monte Carlo capability; cell entries not recoverable from the extracted text.]

Study Expectations

Based on the relationship between power and certain experimental parameters, the following power trends were expected:

> As sample size increased, power for all tests of a given design would increase due to an increase in the numerator of the F ratio, all other factors held constant.
> As effect size increased, power for all tests of a design would increase due to an increase in the numerator of the F ratio, all other factors held constant.
> As the level of significance increased from .01 to .10, power for all tests of a given design would increase due to a reduction in the critical F ratio of the central F distribution, all other factors held constant.
> As the average correlation among trials increased for those designs involving a single repeated measures factor, power for the interaction and within-subjects main effects tests would increase due to a reduction in the error term of the F ratio, all other factors held constant.
  In contrast, the power for the between-subjects (group) main effects test of a two-way mixed ANOVA would actually decrease with an increase in the average correlation among trials due to a larger denominator in the F ratio, all other factors held constant.
> Under conditions involving trend correlation matrix patterns (sphericity severely violated), power values obtained would be either greater or less than those under constant correlation matrix patterns due to an increase in the variability of the F ratio.

Hypotheses⁴

In addition to the expectations listed, it was hypothesized that for two-way ANOVA designs in which both factors involve repeated measures (A & B):

Main Effects

> As the average correlation among trials of factor B decreased, the power of the A main effect would increase due to a reduction in the error term of the F ratio, all other conditions held constant.
> As the number of repeated measures (levels) of factor B increased, the power of the A main effect would increase due to an increase in the F ratio, all other conditions held constant.

Interaction

> As the average correlation among trials of both factors increased, in general, the power of the interaction effect would increase due to a reduction in the error term of the F ratio, all other conditions held constant.

In addition, for those conditions in which unequal average correlations exist between the two factors:

> As the number of repeated measures for the factor with the higher average correlation increased, the power of the interaction effect would increase due to an increase in the F ratio, all other conditions held constant.
> As the number of repeated measures for the factor with the lower average correlation increased, the power of the interaction effect would decrease due to a reduction in the F ratio, all other conditions held constant.

⁴ Based on pilot project results.
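The expectations above concerning average correlation in designs with a single RM factor can be illustrated with a short worked expression. This is only a sketch under the assumption of compound symmetry (a constant correlation matrix); the symbols $\sigma^2$ (common variance of each trial), $\rho$ (common correlation among trials) and $K$ (number of trials) are introduced here for illustration and are not taken from the equations of Chapter Two.

```latex
% Expected error mean squares in a two-way mixed (group x trials) design
% under compound symmetry, with common variance \sigma^2 and correlation \rho.
\begin{align}
  E\left(MS_{\text{error, within}}\right)  &= \sigma^{2}\,(1-\rho) \\
  E\left(MS_{\text{error, between}}\right) &= \sigma^{2}\,\bigl[\,1+(K-1)\,\rho\,\bigr]
\end{align}
% As \rho rises, the within-subjects error term shrinks (larger F for the
% trials and interaction tests), while the between-subjects error term grows
% (smaller F for the group test).
```

Under these assumptions, raising the average correlation from .4 to .8 with $K = 3$ would reduce the within-subjects error term from $.6\sigma^2$ to $.2\sigma^2$ and raise the between-subjects error term from $1.8\sigma^2$ to $2.6\sigma^2$, which is the direction of change the expectations describe.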
Chapter Three

Methodology

A description of the methodology used in this investigation has been organized into three separate sections. The first section outlines the process by which experimental parameters and ANOVA designs were chosen for use in the power analysis of this project. The second section describes the computer programs utilized to generate power values for the different designs involved. The last section provides a brief description of the methods used to verify the reliability of these power programs.

I. RM ANOVA Designs and Experimental Conditions

A. Collecting Empirical Data

In order to generate power values reflective of those experimental parameters common to the field of human kinetics, efforts were made to collect empirical data from several disciplines within the field. Research databases common to the area (SportsDiscus, Med-line) were searched for relevant studies involving univariate RM ANOVA designs and any one of the following dependent variables: oxygen consumption (l/min or ml x kg⁻¹ x min⁻¹), torque (N·m or ft-lb), reaction time (msec) and acquisition and retention scores. These dependent variables were chosen to encompass different disciplines in the field of human kinetics and to provide a wide variety of parameters (i.e. effect size, magnitude and pattern of correlation, RM levels) for power analysis. They were also selected out of familiarity and interest to the researcher. A total of 49 independent studies meeting these criteria were identified and letters were then sent to the investigators requesting access to their raw data (a sample copy of the forwarded letter is given in appendix 1.0). Of the 45 letters sent (4 studies were conducted by the same author), only 9 responses were received and of these, only 2 authors provided data which could be used in this study.
Since this fell short of the number of data sets desired (15-20), dissertations from the School of Human Kinetics were also searched and restrictions on the type of dependent variable involved were dropped in an attempt to increase the amount obtained. This raised the total number of usable data sets to 21,⁵ which met the researcher's original objective.

B. Selecting RM Designs and Experimental Conditions

The RM ANOVA designs used in this study included the one-way RM, the two-way mixed (2 (group) x K (RM)) and the two-way within-subjects (3 (RM) x K (RM)) models. These designs were chosen due to their frequent occurrence in the field. The 3 x K design in particular was selected to test the specific hypotheses of this study. Empirical data collected from the studies were used to determine mean and range values of effect size (ES), sample size (n), average correlation of a given matrix (Ave r) and sample epsilon (ε) for each RM design involved. These values were then used to establish the experimental parameters for which power would be generated (see appendix 2.1-2.3 for a list of both the raw and mean values of these statistics for all studies selected). With regard to effect size, values of .2, .5 and .8 were determined to be accurate representations of small, medium and large effects, respectively, among the studies chosen. These were equivalent to values proposed by Cohen (1988) for behavioral science data. Regarding Ave r, .4 was found to be reflective of a moderately-low correlation while .8 was considered to be representative of a moderately-high correlation among the repeated trials examined. The range of epsilon values observed (.3347-1.000) provided verification of the different degrees of heterogeneity (violations of sphericity) that exist among r matrices obtained from human kinetics data.
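To make the epsilon values mentioned above more concrete, the sketch below computes Box's epsilon (the population form of the Greenhouse-Geisser correction) for a constant matrix and for a declining "simplex" matrix. It is a minimal illustration only: the function names are hypothetical, and the simplex pattern r^|i-j| is an assumed stand-in, not the trend matrices actually used in this study.

```python
# Box / Greenhouse-Geisser epsilon from a K x K covariance or correlation
# matrix. Standard library only; all names here are illustrative assumptions.

def box_epsilon(cov):
    """Return epsilon: 1.0 under sphericity, 1/(K-1) at the lower bound."""
    k = len(cov)
    mean_diag = sum(cov[i][i] for i in range(k)) / k
    grand_mean = sum(map(sum, cov)) / k ** 2
    row_means = [sum(row) / k for row in cov]
    sum_sq = sum(v * v for row in cov for v in row)
    numerator = k ** 2 * (mean_diag - grand_mean) ** 2
    denominator = (k - 1) * (
        sum_sq - 2 * k * sum(m * m for m in row_means) + k ** 2 * grand_mean ** 2
    )
    return numerator / denominator

def constant_matrix(k, r):
    """Constant (compound symmetric) correlation matrix: every off-diagonal = r."""
    return [[1.0 if i == j else r for j in range(k)] for i in range(k)]

def simplex_matrix(k, r):
    """Hypothetical trend pattern: correlation decays as r ** |i - j|."""
    return [[r ** abs(i - j) for j in range(k)] for i in range(k)]

if __name__ == "__main__":
    for k in (3, 6, 9):
        print(k, round(box_epsilon(constant_matrix(k, 0.4)), 3),
              round(box_epsilon(simplex_matrix(k, 0.8)), 3))
```

A constant matrix gives epsilon = 1 regardless of the size of the correlation, whereas any pattern in which correlations fall off with the distance between trials gives a value below 1.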
Although sample sizes of 5 to 30 and RM levels of 3, 6 and 9 were pre-selected prior to the commencement of this study, information derived from the data sets and 49 studies examined supported the prevalence of these values in the disciplines of exercise science.

⁵ Some studies involved several dependent variables and thus provided more than one data set.

Thus, for a given design, power was estimated under a variety of experimental conditions involving three different effect sizes (.2, .5 and .8), three levels of K (3, 6 and 9), two Ave r values (.4 and .8), two r matrix patterns (constant, ε = 1.000 and trend, ε < .560), six sample sizes (5, 10, 15, 20, 25 and 30) and three levels of significance (.01, .05 and .10). Due to time restraints, power under trend matrix patterns was limited only to the one-way model. Figure 3.01 provides a flow chart of the various conditions for which power was determined among each of the three ANOVA designs involved.

One-Way RM ANOVA:    trials (3) x alpha (3) x ES (3) x Ave r (2) x r matrix pattern (2) x n (6) = 648 conditions
Two-Way Mixed ANOVA: trials (3) x alpha (3) x ES (3) x Ave r (2) x r matrix pattern (1, constant) x n (6) = 324 conditions per test
Two-Way RM ANOVA:    trials (3) x alpha (3) x ES (3) x Ave r of A (2) x Ave r of B (2) x n (6) = 648 conditions per test

Design                 # of Tests   Conditions per Test   Conditions per Design
One-Way RM ANOVA       1            648                   648
Two-Way Mixed ANOVA    3            324                   972
Two-Way RM ANOVA       3            648                   1944
Total Conditions Involved                                 3564

Figure 3.01. Experimental conditions of RM ANOVA designs for which power was generated in this study. (ES = effect size; Ave r = average correlation among trials; n = sample size; Test = main effects and interaction).

II. Power Determination

A. Analytical

For those conditions of the one-way RM and two-way mixed designs conforming to the assumptions of sphericity (constant r matrix pattern), power was calculated directly using analytical methods. Using equations 2.01-2.04 (see chapter 2), a FORTRAN program was first developed by the researcher in order to compute the noncentrality parameter (λ) for each unique condition of a design (appendix 3.0 includes a copy of this program). By inputting the appropriate treatment means and variances, the FORTRAN program produced λ values for any given Ave r, n and effect size (d). When calculating λ, treatment means and variances were selected to ensure effect sizes conformed to d values of .2, .5 and .8 for all tests of a design (see appendix 4.1-4.3 for all test means and variances used in this study). In all cases except tests of interaction, definitions of d were similar to those given by Cohen (1988) and equation 2.10 of this thesis. Since no expressions of d for interaction tests could be found in the literature, equations had to be developed by the researcher of this study. These functions are given in appendix 5.0. Once λ values were obtained, they were used to compute exact power for different levels of α by inputting them into a cumulative distribution function. DATASIM, a statistical software program developed by Bradley (1989), was used for this purpose.

B. Monte Carlo Simulation

For one-way RM designs under conditions of nonsphericity (i.e. trend r matrix patterns) and two-way RM designs, Monte Carlo (M-C) simulation was used to estimate power. Simulation programs developed by Eom (1993) were adapted for this task. These FORTRAN programs (one for each RM design) provided estimates of power through an iterative process. This entailed generation of a database of random numbers (120,000) based on specific population parameters from which random samples of data were then repeatedly drawn and subjected to the appropriate RM ANOVA test. Power was determined by totaling the number of F tests found significant for a given α and then dividing by the overall number of tests performed for the simulation. Prior to each simulation run, the program required the user to input the treatment means, the variance-covariance structure and all sets of orthogonal contrasts of a given design. A single run, therefore, produced power values for one specific effect size and correlation matrix and all sample sizes and levels of alpha of a particular RM design.

To ensure reasonable accuracy of power estimates, the number of replicated tests the Monte Carlo programs performed per simulation was set at 3000. Although this was rather on the low side for a Monte Carlo investigation, the vast number of conditions which were to be simulated in this study necessitated that this number remain at a manageable level. With replications set at 3000, the standard error of proportion (SEP) for true power values of .50 and .99 was ±.009 and ±.002, respectively.⁶ With 95% confidence, therefore, the accuracy of power values derived from the Monte Carlo simulations of this study was expected to be about ±.02 for tests with moderate power (.50) and ±.004 for those exhibiting power at the extremes (.99 or .01). Table 3.1 summarizes the power methods used for each design of this study.

⁶ $\mathrm{SEP} = \sqrt{p(1-p)/n}$, where p = true power and n = number of replications.

Table 3.1
Methods and Computer Programs Used to Determine Power For the Different Repeated Measures (RM) ANOVA Designs of This Study.

ANOVA Design     Programs Used                  Method         Experimental Conditions
One-Way RM       NCP (1996) - DATASIM (1989)    Analytic       Only spherical
                 Eom (1993)                     Monte Carlo*   All nonspherical
Two-Way Mixed    NCP (1996) - DATASIM (1989)    Analytic       Only spherical
Two-Way RM       Eom (1993)                     Monte Carlo*   Only spherical

NCP = FORTRAN program for calculating the noncentrality parameter
* number of replications = 3000

III. Accuracy and Reliability of Power Estimates

Several measures were undertaken to verify the accuracy of methods used in this study to approximate power. For Monte Carlo programs, three procedures were employed. First, in order to ensure the M-C routines were computing correctly, ANOVA test results generated from these programs were compared to those computed from a well known statistical analysis package (BMDP). This was done by extracting test data generated by the simulation program and then subjecting them to the appropriate statistical procedure(s) using BMDP. For all conditions examined, ANOVA statistics (mean square values, F ratios) were identical between the two programs.

A second method for assessing the accuracy of the M-C programs involved comparing power values from the simulation program to those given by DATASIM under identical conditions of a one-way design. A total of 162 conditions were involved in which absolute differences in power between M-C and DATASIM were determined for designs with 3, 6 and 9 RM. The mean absolute difference (MAD) among conditions was .0071 or .71%. According to Muller and Barton (1989), absolute differences equal to or below .04 (4.0%) are sufficient for power purposes. Of the 162 differences, none were above this value and only 4 were above .025 (2.5%).
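The Monte Carlo procedure and the sampling error attached to 3000 replications can be sketched in a few lines. The code below is not the FORTRAN program adapted from Eom (1993); it is a simplified stand-in that assumes a constant correlation matrix with unit variances, draws fresh samples directly rather than from a stored database of random numbers, and takes the critical F value from a table. The treatment means and all function names are hypothetical.

```python
import math
import random

def rm_anova_f(data):
    """F ratio for the trials effect of a one-way RM ANOVA.

    data: list of n subjects, each a list of K trial scores.
    """
    n, k = len(data), len(data[0])
    grand = sum(map(sum, data)) / (n * k)
    trial_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]
    ss_trials = n * sum((m - grand) ** 2 for m in trial_means)
    ss_error = sum(
        (data[i][j] - subj_means[i] - trial_means[j] + grand) ** 2
        for i in range(n) for j in range(k)
    )
    return (ss_trials / (k - 1)) / (ss_error / ((n - 1) * (k - 1)))

def simulate_power(means, rho, n, f_crit, reps, rng):
    """Proportion of simulated F tests that exceed the critical value."""
    hits = 0
    for _ in range(reps):
        data = []
        for _ in range(n):
            subject = rng.gauss(0, 1)  # shared subject effect gives correlation rho
            data.append([
                m + math.sqrt(rho) * subject + math.sqrt(1 - rho) * rng.gauss(0, 1)
                for m in means
            ])
        if rm_anova_f(data) > f_crit:
            hits += 1
    return hits / reps

def sep(p, reps):
    """Standard error of a proportion p estimated from reps replications."""
    return math.sqrt(p * (1 - p) / reps)

rng = random.Random(1996)
F_CRIT = 3.5546  # tabled F(.05; 2, 18), i.e. K = 3 trials and n = 10 subjects
power = simulate_power([0.0, 0.5, 1.0], rho=0.4, n=10, f_crit=F_CRIT, reps=3000, rng=rng)
print(f"estimated power = {power:.3f} +/- {1.96 * sep(power, 3000):.3f}")
print(f"SEP at p = .50: {sep(0.50, 3000):.4f}")  # 0.0091
print(f"SEP at p = .99: {sep(0.99, 3000):.4f}")  # 0.0018
```

The two SEP values reproduce the ±.009 and ±.002 quoted for 3000 replications, and multiplying them by 1.96 gives the approximate ±.02 and ±.004 limits. The same limits put the .04 tolerance in context: a difference of that size is roughly four standard errors at moderate power.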
Although there was a trend for the simulation program to underestimate power at extreme ends of the power curve and overestimate at moderate power levels (when compared to DATASIM values), these tendencies were slight. It was therefore concluded that both programs produced similar power estimates.

A third procedure for testing the reliability of the M-C programs was conducted by replicating simulation runs under identical experimental parameters. Simulations were repeated ten and four times for several conditions of the one- and two-way RM designs, respectively, in which only the random number generator seed (i.e. the number used to initiate the data generation process) was changed. For the one-way design, 95% confidence intervals were determined for each condition involved using DATASIM values as true power and the number of M-C estimates falling outside of these intervals was established. Of the 180 power values generated, 27 (15%) fell outside of their respective intervals, the majority (21) occurring when true power was between .25 and .64. Although this was higher than expected (with 95% confidence, only 5% should occur), the range among power values generated for a given condition rarely surpassed .03 and none of the outliers had absolute differences (from those of DATASIM) above .04. For the two-way program, confidence intervals could not be established since true power values were unknown. However, the largest range among power values from repeated simulations of a given condition was .025. Based on this and other pilot results, the M-C programs were considered reliable and capable of providing accurate estimates of true power.

In order to validate the accuracy of power estimates generated by the DATASIM program, values computed by this program following calculation of the noncentrality parameter were compared to those given by Davidson (1972) for one-way RM designs with 3, 6 and 16 repeated trials.
Of 36 power comparisons made, only two conditions (6%) produced absolute differences greater than .04. MAD values for the designs with 3, 6 and 16 RM were .006, .026 and .003, respectively. Thus the analytical methods used to calculate power in this study were generally found to produce estimates equivalent to those reported elsewhere.

Delimitations of Study

Power determination for this study was delimited:

> to the univariate RM ANOVA approach.
> to those designs and experimental conditions selected.
> to effect size conditions where population means were equally spaced apart.
> to equal sample sizes and number of observations for all groups and trials involved.
> to correlation coefficients equal either to .4 or .8 for AB pairs in a two-way RM design.
> to a trend or simplex r pattern for all matrices not conforming to sphericity.
> to severe violations of sphericity among trend r structures (ε < .56).
> by the expressions of effect size (d) used in this study.
> by the accuracy of the computer and Monte Carlo simulation programs used.

Chapter Four

Results

The results of this study have been organized according to the type of design involved. Data for the one-way RM ANOVA design are presented first, followed by those for the two-way design with multiple repeated measures, while results for the two-way mixed ANOVA (group by trials) are given last.

I. One-Way Repeated Measures ANOVA

A. Power Tables

Power values for the one-way RM ANOVA with 3, 6 and 9 repeated measures are given in Tables 4.1, 4.2 and 4.3, respectively. Each table provides power for the different levels of alpha, effect size (ES), sample size (n), average correlation coefficients (Ave r) and patterns of correlation matrices (C and T) involved. Power values derived using DATASIM are under the "C" column (constant correlation matrix) while values estimated using Monte Carlo simulation are under the "T" column (trend correlation matrix).
The description of power trends which follows may be clarified by referring to these tables and appropriate figures when indicated.

Table 4.1
Power For a One-Way Repeated Measures ANOVA With 3 Levels

Alpha = .01
ES:       Small (.20)         Medium (.50)        Large (.80)
Ave r:    0.40      0.80      0.40      0.80      0.40      0.80
Pattern:  C    T    C    T    C    T    C    T    C    T    C    T
n =  5    01   03   02   05   03   07   07   13   06   11   23   29
n = 10    02   04   03   06   07   10   25   29   21   23   73   66
n = 15    02   05   05   08   12   16   47   45   39   39   95   88
n = 20    03   05   07   10   18   22   67   61   57   53   99   96
n = 25    03   06   10   13   24   26   82   72   73   66   100  99
n = 30    04   07   12   16   32   32   91   82   84   77   100  100

Alpha = .05
ES:       Small (.20)         Medium (.50)        Large (.80)
Ave r:    0.40      0.80      0.40      0.80      0.40      0.80
Pattern:  C    T    C    T    C    T    C    T    C    T    C    T
n =  5    06   09   08   11   11   15   24   27   21   25   54   53
n = 10    07   11   12   15   20   22   52   50   45   43   93   83
n = 15    08   11   16   19   29   30   74   65   67   60   99   95
n = 20    10   13   20   21   39   38   88   77   82   73   100  99
n = 25    11   14   25   26   48   44   95   86   91   83   100  100
n = 30    13   16   30   30   57   51   98   92   --   90   100  100

Alpha = .10
ES:       Small (.20)         Medium (.50)        Large (.80)
Ave r:    0.40      0.80      0.40      0.80      0.40      0.80
Pattern:  C    T    C    T    C    T    C    T    C    T    C    T
n =  5    11   16   14   17   19   22   37   36   33   35   71   63
n = 10    13   16   20   22   31   30   67   59   60   54   97   89
n = 15    15   17   26   26   42   39   85   74   79   71   100  97
n = 20    17   18   31   29   53   48   94   85   90   82   100  100
n = 25    19   21   37   34   62   54   98   91   96   91   100  100
n = 30    21   22   42   38   70   61   99   95   98   95   100  100

C = Constant Correlation Matrix Pattern (ε = 1.0); T = Trend Correlation Matrix Pattern (ε < .56); ES = Effect Size (d); Ave r = Average of Correlation Coefficients in a given Matrix; n = Sample Size. All power values are in percent. -- = value missing from the source.

Table 4.2
Power For a One-Way Repeated Measures ANOVA With 6 Levels

Alpha = .01
ES:       Small (.20)         Medium (.50)        Large (.80)
Ave r:    0.40      0.80      0.40      0.80      0.40      0.80
Pattern:  C    T    C    T    C    T    C    T    C    T    C    T
n =  5    01   04   02   05   03   07   08   14   07   13   31   35
n = 10    02   05   03   08   06   12   29   33   23   26   83   68
n = 15    02   05   05   10   12   18   53   50   44   41   98   87
n = 20    02   06   07   12   18   23   74   62   64   54   100  96
n = 25    03   07   09   15   26   29   88   74   79   66   100  99
n = 30    03   08   12   17   34   34   95   82   89   75   100  100

Alpha = .05
ES:       Small (.20)         Medium (.50)        Large (.80)
Ave r:    0.40      0.80      0.40      0.80      0.40      0.80
Pattern:  C    T    C    T    C    T    C    T    C    T    C    T
n =  5    06   09   08   12   11   14   25   27   21   25   60   51
n = 10    07   10   11   16   19   22   54   48   47   42   96   81
n = 15    08   12   15   19   29   31   78   66   69   58   100  94
n = 20    10   13   20   22   39   37   91   77   85   71   100  98
n = 25    11   14   24   26   49   44   97   85   93   80   100  99
n = 30    12   14   29   29   58   50   99   90   97   86   100  100

Alpha = .10
ES:       Small (.20)         Medium (.50)        Large (.80)
Ave r:    0.40      0.80      0.40      0.80      0.40      0.80
Pattern:  C    T    C    T    C    T    C    T    C    T    C    T
n =  5    11   16   14   18   19   21   37   36   33   33   74   62
n = 10    13   17   19   22   30   30   68   58   61   52   98   86
n = 15    15   17   25   26   42   39   87   73   81   67   100  96
n = 20    16   19   30   28   53   45   96   82   92   77   100  99
n = 25    18   20   36   33   62   54   99   89   97   86   100  100
n = 30    20   20   41   36   71   60   100  93   99   90   100  100

C = Constant Correlation Matrix Pattern (ε = 1.0); T = Trend Correlation Matrix Pattern (ε < .56); ES = Effect Size (d); Ave r = Average of Correlation Coefficients in a given Matrix; n = Sample Size. All power values are in percent.

Table 4.3
Power For a One-Way Repeated Measures ANOVA With 9 Levels

Alpha = .01
ES:       Small (.20)         Medium (.50)        Large (.80)
Ave r:    0.40      0.80      0.40      0.80      0.40      0.80
Pattern:  C    T    C    T    C    T    C    T    C    T    C    T
n =  5    01   04   02   06   03   08   10   18   08   14   40   40
n = 10    02   05   03   08   07   15   36   37   28   31   92   76
n = 15    02   06   05   12   14   20   64   54   53   45   100  93
n = 20    02   08   07   14   22   26   84   69   74   60   100  98
n = 25    03   08   10   17   31   33   94   79   88   70   100  100
n = 30    03   09   14   20   40   38   98   87   95   80   100  100

Alpha = .05
ES:       Small (.20)         Medium (.50)        Large (.80)
Ave r:    0.40      0.80      0.40      0.80      0.40      0.80
Pattern:  C    T    C    T    C    T    C    T    C    T    C    T
n =  5    06   11   08   13   11   16   28   31   24   26   68   57
n = 10    07   12   12   16   21   26   61   52   53   46   98   87
n = 15    08   14   16   21   32   33   84   69   77   61   100  97
n = 20    09   15   21   25   44   41   95   81   90   74   100  100
n = 25    11   15   26   29   55   48   99   89   97   82   100  100
n = 30    12   16   32   32   65   54   100  93   99   89   100  100

Alpha = .10
ES:       Small (.20)         Medium (.50)        Large (.80)
Ave r:    0.40      0.80      0.40      0.80      0.40      0.80
Pattern:  C    T    C    T    C    T    C    T    C    T    C    T
n =  5    11   17   14   19   19   23   41   39   36   35   80   66
n = 10    13   17   20   23   32   33   74   59   66   54   99   91
n = 15    15   19   26   27   45   41   91   76   86   69   100  98
n = 20    17   21   32   32   57   49   98   86   95   81   100  100
n = 25    19   21   38   36   68   56   100  93   99   87   100  100
n = 30    21   21   45   39   77   62   100  96   100  92   100  100

C = Constant Correlation Matrix Pattern (ε = 1.0); T = Trend Correlation Matrix Pattern (ε < .56); ES = Effect Size (d); Ave r = Average of Correlation Coefficients in a given Matrix; n = Sample Size. All power values are in percent.

B. Power Trends

Alpha (α)

As expected and observed from each table, power for any given ES, n, Ave r and matrix pattern increases as the level of significance increases. Power is lowest when alpha is set to .01 and highest when set to .10.

Sample Size (n)

From the power tables, we see that an increase in sample size, as predicted, brings about an increase in power when all other conditions for a particular design are held constant. Interestingly, when sample size is small (n = 5), power is, at best, only moderately high (.80 for K = 9 under large ES and high Ave r). This exemplifies the importance of having a sufficient n to achieve a reasonable amount of power when other factors in a study's design are less than optimal.

Effect Size (ES)

Also from the power tables, we see that when effect size increases for any given α, n, Ave r and matrix pattern of a particular design, power increases. The degree of influence ES has on the power of an experimental design is demonstrated by observing how maximum power for those conditions with small ES is, at best, moderately-low (.45) but increases considerably when a medium ES is involved (1.00).

Average Correlation (Ave r)

In agreement with the researcher's expectations, the power tables demonstrate that when the Ave r for a given design increases, power increases at any given α, n, ES and r matrix pattern. In addition, when the Ave r is moderately low (.4), a large ES and/or n is required to achieve a high degree of power (> .85).
In contrast, when the Ave r is moderately high (.8), a medium ES and only a moderate sample size (n = 20) are required to obtain high power. This illustrates the substantial effect correlation has on power in RM ANOVA designs.

Pattern of Correlation Matrix

Unlike those factors previously discussed, the pattern of coefficients within a correlation matrix (constant or trend) does not have a common effect on the power of a one-way RM ANOVA at any given n, α, ES, Ave r and RM level (K). From the power tables, it can be observed that when a design involving a constant correlation matrix pattern (C) has low power (< .20), regardless of what statistical parameters are involved, power for that same design under a trend correlation matrix pattern (T) will be slightly greater (largest differences .06-.08). In contrast, when a design under C has moderate to high power (.50-.99), that same design under T appears to have less power (largest differences .16-.18). For example, a design with 6 RM levels, small effect size and moderately low Ave r (.4) will have a power of .03 at n = 30 and α = .01 under C and a power of .08 under T. On the other hand, when effect size is large and all other conditions are held constant for that same design, power under C is .89 compared to .75 under T. Figure 4.01 illustrates how these power differences between constant and trend matrices change as effect size is increased.

Figure 4.01. Comparisons of power between one-way repeated measures ANOVA (K = 6) with constant and trend correlation matrices under varying effect sizes. Note: All design conditions were based on Ave r = .4 and alpha = .05. Numbers within graphs represent differences in power.

Also evident from the tables and this figure is that differences in power between tests with C and T are lowest when statistical power is within the range of .20 to .40.
In addition, the point within this range at which power becomes equal appears to be influenced by the magnitude of other experimental parameters. Generally, when a design has a small effect size and low Ave r, the range at which power for designs with C and T becomes equal is between .20 and .25. Likewise, when ES is moderate and Ave r = .4, power between designs with C and T is equal at about .30 to .35, while those designs with large ES and high Ave r produce equal power at about .35 to .40. Once power surpasses .40, regardless of the conditions involved, designs demonstrating a high degree of nonsphericity (low epsilon) result in less power and do not equal the power of spherical (high epsilon) designs again until power approaches 1.00.

Another interesting observation is how this magnitude of difference in power between designs with C and T changes across designs with different levels of repeated measures. As Figure 4.02 illustrates, when power of a design under C is very low (< .20), the power advantage tests with T have over those with C tends to become slightly larger as the number of repeated measures in a design increases. Therefore, a design with 9 repeated measures under T has greater power over one with 6, which in turn has greater power over one with 3 levels under identical conditions. Similarly, when a design under C has moderate to high power (.50-.90), the reduction in power for those same designs under T tends to be slightly greater for K = 9 and K = 6 than for K = 3 (Figure 4.03). This power difference between tests with C and T and different K will be discussed further in the following section.

Figure 4.02. Comparisons of power for one-way ANOVA designs with 3, 6, and 9 repeated measures (RM) under constant and trend correlation matrices and small effect size (.2). Note: All design conditions were based on Ave r = .8 and alpha = .01.
Figure 4.03. Comparisons of power for one-way ANOVA designs with 3, 6, and 9 repeated measures (RM) under constant and trend correlation matrices and large effect size (.8). Note: All design conditions were based on Ave r = .4 and alpha = .10. Numbers within graphs represent differences in power. (Recovered panel titles: K = 3, K = 9; series: Trend r, Constant r; x-axis: Sample Size.)

Number of Repeated Measures (K)

When comparing power values between one-way ANOVA tests with different levels of repeated measures (that is, making comparisons between tables 4.1 (K = 3), 4.2 (K = 6) and 4.3 (K = 9)), it becomes apparent that the effects of an increase in the levels of repeated measures (K) on power are dependent on the effect size and pattern of the correlation matrix involved. In general, for tests with C, an increase in K results in very little difference in power between designs when effect size is small and other factors are held constant. However, at medium and large effect sizes (.5 and .8, respectively), designs with a greater number of RM hold a power advantage over those with fewer RM levels across most n, α and Ave r, with maximum differences as high as .14-.19. Referring to the top graph in Figure 4.04, it is evident that a one-way design with 9 repeated measures has greater power than a design with K = 6 or K = 3 levels when effect size is large. Likewise, a test with 6 levels shows superior power over one with 3 at the same effect size. However, as effect size decreases, the power advantage gained from having a greater number of RM levels declines to the point where little apparent difference exists between the three tests at small effect size. Interestingly, an almost opposite effect seems to occur under T.
Looking at power values under a trend r matrix or referring to the middle graph in Figure 4.04, we see that at a small effect size, slightly greater power exists for tests with higher levels of K when α = .01, but this power advantage is reduced as α, n and/or Ave r increases (the latter two factors are not shown in the graph). In addition, as effect size becomes large, we see that a test with K = 3 levels demonstrates greater power than one with K = 6 and as much if not greater power than one with 9 RM at both levels of alpha (middle and bottom graphs of Figure 4.04). Thus when power is generally high among designs (large ES), the tendency for power to increase with a greater number of repeated measures, as was observed under C, appears to be reduced or lost entirely when the assumptions of sphericity are severely violated (ε < .56). Likewise, when power is generally low (small ES), the rather constant values observed among designs with varying K under C seem to give way to a power advantage in favor of designs with a greater number of repeated measures under nonsphericity.

Figure 4.04. Power for one-way ANOVA designs with 3, 6, and 9 repeated measures under varying correlation matrices, effect sizes and alpha. Note: All designs based on Ave r = .4 and n = 30. r = correlation. (Panels, top to bottom: Constant r, alpha = .01; Trend r, alpha = .01; Trend r, alpha = .10; x-axis: K (3, 6, 9).)

II. Two-Way Repeated Measures ANOVA

A. Power Tables

Power values for 3 x 3, 3 x 6 and 3 x 9 ANOVA designs with two repeated measures factors are given in tables 4.4a-c, 4.5a-c and 4.6a-c, respectively. Within each table, power is provided for the different levels of alpha, effect size, sample size and average correlation coefficients of RM factors involved.
Each column within a given ES and α represents power values for one of the four correlation (AB) matrices examined under the two-way RM design (Ave r = .4 for A and B; Ave r = .4 for A, Ave r = .8 for B; Ave r = .8 for A, Ave r = .4 for B; Ave r = .8 for A and B). For comparison, the overall average r of each matrix is also given. Power values were generated under constant r matrix patterns (assumptions of sphericity met) using Monte Carlo procedures. Means, standard deviations and complete matrices are given in appendices 4.1-4.3 and 6.1-6.2, respectively. The description of power trends which follows may be clarified by referring to these tables and appropriate figures when indicated.

Table 4.4a. Power of the A Main Effect For a 3(A) x 3(B) ANOVA With Repeated Measures on Two Factors.
Test: Main Effect of Factor A (3 levels)

ES:          Small (.20)           Medium (.50)          Large (.80)
r for A:     0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8
r for B:     0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8
Overall r:   0.40 0.50 0.50 0.80   0.40 0.50 0.50 0.80   0.40 0.50 0.50 0.80

Alpha = .01
n =  5       02   01   04   03     08   03   27   28     23   08   75   --
n = 10       04   02   10   10     25   09   83   81     74   30   100  --
n = 15       05   03   19   19     48   16   98   98     95   54   100  --
n = 20       08   04   30   30     66   25   100  100    99   73   100  --
n = 25       11   04   41   41     81   36   100  100    100  85   100  --
n = 30       13   05   51   50     89   45   100  100    100  93   100  --

Alpha = .05
n =  5       08   06   15   12     23   12   61   61     53   25   95   96
n = 10       12   08   27   28     53   26   95   96     92   58   100  100
n = 15       16   10   42   42     73   38   100  100    99   79   100  100
n = 20       21   12   56   55     87   50   100  100    100  90   100  100
n = 25       27   13   67   65     94   61   100  100    100  95   100  100
n = 30       31   16   75   75     97   69   100  100    100  99   100  100

Alpha = .10
n =  5       14   11   25   23     37   22   76   77     69   39   99   99
n = 10       20   15   40   41     67   37   98   98     96   71   100  100
n = 15       26   17   56   56     83   52   100  100    100  88   100  100
n = 20       33   20   68   67     93   64   100  100    100  95   100  100
n = 25       38   22   78   76     97   73   100  100    100  98   100  100
n = 30       44   25   85   84     99   80   100  100    100  99   100  100

r = Average of Correlation Coefficients in a given Matrix. ES = Effect Size (d). n = Sample Size per group. All power values are in percent.
r coefficients for AB pairs of a given matrix are equal to the lowest r among factors A and B.

Table 4.4b. Power of the B Main Effect For a 3(A) x 3(B) ANOVA With Repeated Measures on Two Factors.
Test: Main Effect of Factor B (3 levels)

ES:          Small (.20)           Medium (.50)          Large (.80)
r for A:     0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8
r for B:     0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8
Overall r:   0.40 0.50 0.50 0.80   0.40 0.50 0.50 0.80   0.40 0.50 0.50 0.80

Alpha = .01
n =  5       02   05   01   05     09   33   04   33     30   81   11   81
n = 10       05   15   03   15     34   88   14   88     82   100  38   100
n = 15       08   27   05   27     61   99   25   99     98   100  64   100
n = 20       11   39   06   39     77   100  35   100    100  100  81   100
n = 25       15   51   07   50     88   100  47   100    100  100  91   100
n = 30       19   61   08   61     94   100  56   100    100  100  96   100

Alpha = .05
n =  5       09   17   06   16     26   63   14   65     59   97   29   97
n = 10       15   33   10   32     59   97   32   96     94   100  62   100
n = 15       21   48   13   48     80   100  44   100    100  100  83   100
n = 20       26   62   16   61     91   100  57   100    100  100  92   100
n = 25       32   71   18   72     96   100  67   100    100  100  97   100
n = 30       37   79   20   79     --   100  76   100    100  100  99   100

Alpha = .10
n =  5       15   25   12   25     39   78   24   78     72   99   44   99
n = 10       23   44   17   45     70   99   42   99     97   100  73   100
n = 15       31   59   20   59     87   100  57   100    100  100  90   100
n = 20       36   71   24   71     94   100  67   100    100  100  --   100
n = 25       42   79   25   80     98   100  76   100    100  100  99   100
n = 30       47   86   28   87     99   100  83   100    100  100  100  100

r = Average of Correlation Coefficients in a given Matrix. ES = Effect Size (d). n = Sample Size per group. All power values are in percent.
r coefficients for AB pairs of a given matrix are equal to the lowest r among factors A and B.

Table 4.4c. Power of the AB Interaction For a 3(A) x 3(B) ANOVA With Repeated Measures on Two Factors.
Test: A by B Interaction

ES:          Small (.20)           Medium (.50)          Large (.80)
r for A:     0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8
r for B:     0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8
Overall r:   0.40 0.50 0.50 0.80   0.40 0.50 0.50 0.80   0.40 0.50 0.50 0.80

Alpha = .01
n =  5       01   02   02   01     02   03   03   03     02   09   08   08
n = 10       01   02   02   02     03   08   09   08     06   29   31   28
n = 15       01   02   02   02     04   14   15   15     11   52   54   53
n = 20       01   02   03   03     06   23   23   24     17   72   73   72
n = 25       02   03   03   03     07   33   32   34     24   86   86   85
n = 30       02   04   04   04     10   42   42   43     33   93   94   93

Alpha = .05
n =  5       05   06   07   06     07   12   12   12     10   25   24   --
n = 10       06   07   07   08     10   22   24   23     19   54   55   --
n = 15       06   08   09   09     14   35   35   36     29   76   77   --
n = 20       07   09   11   10     17   46   46   47     38   89   89   --
n = 25       07   11   12   12     21   57   56   58     48   96   96   --
n = 30       08   14   13   13     24   67   66   67     55   98   99   --

Alpha = .10
n =  5       10   12   12   12     13   22   19   21     18   39   36   38
n = 10       12   14   14   13     18   34   35   35     31   68   69   67
n = 15       12   14   16   15     22   48   48   48     41   85   86   85
n = 20       14   17   17   18     27   60   59   60     52   94   94   94
n = 25       13   18   19   21     32   69   69   70     61   98   98   98
n = 30       14   23   22   22     36   78   77   77     68   99   99   99

r = Average of Correlation Coefficients in a given Matrix. ES = Effect Size (d).
r coefficients for AB pairs of a given matrix are equal to the lowest r among factors A and B.
n = Sample Size per group. All power values are in percent.

Table 4.5a. Power of the A Main Effect For a 3(A) x 6(B) ANOVA With Repeated Measures on Two Factors.
Test: Main Effect of Factor A (3 levels)

ES:          Small (.20)           Medium (.50)          Large (.80)
r for A:     0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8
r for B:     0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8
Overall r:   0.40 0.52 0.45 0.80   0.40 0.52 0.45 0.80   0.40 0.52 0.45 0.80

Alpha = .01
n =  5       02   01   07   07     17   03   61   63     53   09   98   98
n = 10       07   02   26   25     59   09   99   99     98   33   100  100
n = 15       11   03   47   46     85   18   100  100    100  58   100  100
n = 20       18   04   65   65     96   28   100  100    100  76   100  100
n = 25       25   04   79   79     99   39   100  100    100  89   100  100
n = 30       32   07   88   88     100  49   100  100    100  95   100  100

Alpha = .05
n =  5       11   06   23   23     43   12   89   91     84   28   100  --
n = 10       20   08   52   52     83   27   100  100    100  60   100  --
n = 15       30   11   72   73     96   40   100  100    100  81   100  --
n = 20       38   12   86   86     99   53   100  100    100  93   100  --
n = 25       48   14   93   93     100  63   100  100    100  97   100  --
n = 30       56   18   96   97     100  72   100  100    100  99   100  --

Alpha = .10
n =  5       20   12   37   36     59   22   96   97     92   43   100  100
n = 10       31   14   66   66     91   40   100  100    100  74   100  100
n = 15       41   18   83   84     98   53   100  100    100  89   100  100
n = 20       51   20   92   93     100  67   100  100    100  97   100  100
n = 25       61   24   96   96     100  75   100  100    100  99   100  100
n = 30       69   27   98   99     100  83   100  100    100  100  100  100

r = Average of Correlation Coefficients in a given Matrix. ES = Effect Size (d). n = Sample Size per group. All power values are in percent.
r coefficients for AB pairs of a given matrix are equal to the lowest r among factors A and B.

Table 4.5b. Power of the B Main Effect For a 3(A) x 6(B) ANOVA With Repeated Measures on Two Factors.
Test: Main Effect of Factor B (6 levels)

ES:          Small (.20)           Medium (.50)          Large (.80)
r for A:     0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8
r for B:     0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8
Overall r:   0.40 0.52 0.45 0.80   0.40 0.52 0.45 0.80   0.40 0.52 0.45 0.80

Alpha = .01
n =  5       02   05   01   04     10   43   05   45     37   94   12   92
n = 10       05   13   02   13     36   92   11   93     86   100  39   100
n = 15       06   23   03   24     60   100  20   100    98   100  66   100
n = 20       10   37   04   37     79   100  31   100    100  100  84   100
n = 25       11   50   04   50     90   100  42   100    100  100  93   100
n = 30       16   61   05   61     96   100  54   100    100  100  98   100

Alpha = .05
n =  5       08   14   06   15     28   70   14   72     63   99   30   99
n = 10       13   30   09   30     58   98   27   98     96   100  63   100
n = 15       18   45   12   46     81   100  40   100    100  100  85   100
n = 20       23   61   12   59     92   100  53   100    100  100  94   100
n = 25       29   72   14   72     97   100  65   100    100  100  98   100
n = 30       34   81   16   80     99   100  75   100    100  100  99   100

Alpha = .10
n =  5       15   23   12   25     40   82   23   81     76   100  43   100
n = 10       21   42   16   42     70   99   40   99     98   100  74   100
n = 15       27   59   19   58     88   100  53   100    100  100  91   100
n = 20       34   73   21   70     96   100  65   100    100  100  97   100
n = 25       40   82   23   80     99   100  76   100    100  100  99   100
n = 30       46   88   25   88     99   100  84   100    100  100  100  100

r = Average of Correlation Coefficients in a given Matrix. ES = Effect Size (d). n = Sample Size per group. All power values are in percent.
r coefficients for AB pairs of a given matrix are equal to the lowest r among factors A and B.

Table 4.5c. Power of the AB Interaction For a 3(A) x 6(B) ANOVA With Repeated Measures on Two Factors.
Test: A by B Interaction

ES:          Small (.20)           Medium (.50)          Large (.80)
r for A:     0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8
r for B:     0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8
Overall r:   0.40 0.52 0.45 0.80   0.40 0.52 0.45 0.80   0.40 0.52 0.45 0.80

Alpha = .01
n =  5       01   01   01   01     02   03   03   03     02   08   09   08
n = 10       01   01   01   01     02   07   07   07     06   30   29   30
n = 15       01   02   02   02     04   14   15   14     11   56   56   56
n = 20       01   03   03   03     05   23   23   24     18   77   76   77
n = 25       01   03   04   03     07   33   32   32     25   89   90   91
n = 30       01   04   04   03     09   42   42   42     34   96   96   96

Alpha = .05
n =  5       06   06   05   06     07   10   12   11     10   25   25   23
n = 10       06   06   07   06     09   20   20   21     18   55   54   53
n = 15       07   08   09   08     13   34   33   33     28   79   78   78
n = 20       06   10   10   10     15   45   46   46     39   92   91   91
n = 25       07   11   11   10     19   55   56   55     48   96   97   97
n = 30       08   12   13   12     23   66   68   67     59   99   99   99

Alpha = .10
n =  5       11   12   11   11     13   19   19   19     17   37   35   36
n = 10       11   12   13   12     15   32   32   33     28   68   68   67
n = 15       13   15   16   14     21   47   46   46     40   87   87   87
n = 20       12   16   17   17     24   58   58   60     52   96   95   95
n = 25       13   18   20   18     29   68   69   68     61   98   98   99
n = 30       14   20   22   20     35   77   78   77     71   99   100  100

r = Average of Correlation Coefficients in a given Matrix. ES = Effect Size (d). n = Sample Size per group. All power values are in percent.
r coefficients for AB pairs of a given matrix are equal to the lowest r among factors A and B.

Table 4.6a. Power of the A Main Effect For a 3(A) x 9(B) ANOVA With Repeated Measures on Two Factors.
Test: Main Effect of Factor A (3 levels)

ES:          Small (.20)           Medium (.50)          Large (.80)
r for A:     0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8
r for B:     0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8
Overall r:   0.40 0.52 0.43 0.80   0.40 0.52 0.43 0.80   0.40 0.52 0.43 0.80

Alpha = .01
n =  5       03   01   12   12     28   04   83   82     76   10   100  100
n = 10       10   02   42   43     80   10   100  100    100  34   100  100
n = 15       19   03   70   69     98   18   100  100    100  60   100  100
n = 20       29   04   86   88     100  28   100  100    100  78   100  100
n = 25       39   05   94   95     100  39   100  100    100  89   100  100
n = 30       50   05   98   99     100  50   100  100    100  95   100  100

Alpha = .05
n =  5       13   06   33   34     61   14   98   98     96   28   100  100
n = 10       28   08   69   70     96   28   100  100    100  62   100  100
n = 15       41   10   89   88     100  41   100  100    100  83   100  100
n = 20       54   12   96   97     100  53   100  100    100  94   100  100
n = 25       64   15   99   99     100  66   100  100    100  97   100  100
n = 30       74   16   100  100    100  74   100  100    100  99   100  100

Alpha = .10
n =  5       22   12   49   49     76   24   99   100    99   42   100  100
n = 10       41   14   81   81     98   40   100  100    100  75   100  100
n = 15       55   18   94   94     100  55   100  100    100  90   100  100
n = 20       67   21   98   99     100  67   100  100    100  97   100  100
n = 25       76   24   100  100    100  77   100  100    100  99   100  100
n = 30       84   27   100  100    100  84   100  100    100  100  100  100

r = Average of Correlation Coefficients in a given Matrix. ES = Effect Size (d). n = Sample Size per group. All power values are in percent.
r coefficients for AB pairs of a given matrix are equal to the lowest r among factors A and B.

Table 4.6b. Power of the B Main Effect For a 3(A) x 9(B) ANOVA With Repeated Measures on Two Factors.
Test: Main Effect of Factor B (9 levels)

ES:          Small (.20)           Medium (.50)          Large (.80)
r for A:     0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8
r for B:     0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8
Overall r:   0.40 0.52 0.43 0.80   0.40 0.52 0.43 0.80   0.40 0.52 0.43 0.80

Alpha = .01
n =  5       02   04   01   05     12   54   04   54     46   98   14   98
n = 10       04   14   02   15     41   97   12   97     93   100  44   100
n = 15       07   27   03   27     68   100  23   100    100  100  74   100
n = 20       10   42   03   41     86   100  35   100    100  100  90   100
n = 25       12   55   04   56     95   100  50   100    100  100  97   100
n = 30       16   68   05   68     98   100  62   100    100  100  99   100

Alpha = .05
n =  5       09   15   07   16     29   78   14   77     72   100  33   100
n = 10       14   33   08   32     65   99   27   99     98   100  68   100
n = 15       20   49   10   49     86   100  45   100    100  100  89   100
n = 20       24   64   11   66     96   100  58   100    100  100  97   100
n = 25       29   77   13   77     99   100  72   100    100  100  99   100
n = 30       35   85   15   86     100  100  81   100    100  100  100  100

Alpha = .10
n =  5       16   25   12   25     43   87   24   87     82   100  46   100
n = 10       22   45   14   44     77   100  40   100    99   100  79   100
n = 15       30   62   17   62     92   100  57   100    100  100  93   100
n = 20       35   75   19   76     98   100  70   100    100  100  99   100
n = 25       41   85   21   85     99   100  80   100    100  100  100  100
n = 30       48   91   24   92     100  100  88   100    100  100  100  100

r = Average of Correlation Coefficients in a given Matrix. ES = Effect Size (d). n = Sample Size per group. All power values are in percent.
r coefficients for AB pairs of a given matrix are equal to the lowest r among factors A and B.

Table 4.6c. Power of the AB Interaction For a 3(A) x 9(B) ANOVA With Repeated Measures on Two Factors.
Test: A by B Interaction

ES:          Small (.20)           Medium (.50)          Large (.80)
r for A:     0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8    0.4  0.4  0.8  0.8
r for B:     0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8    0.4  0.8  0.4  0.8
Overall r:   0.40 0.52 0.43 0.80   0.40 0.52 0.43 0.80   0.40 0.52 0.43 0.80

Alpha = .01
n =  5       01   02   01   01     02   04   03   03     02   11   10   10
n = 10       02   01   02   01     03   09   06   09     06   35   37   35
n = 15       01   02   02   02     04   17   17   16     14   65   66   64
n = 20       02   03   02   03     05   27   26   26     20   85   86   85
n = 25       01   03   03   03     07   40   38   40     29   95   94   95
n = 30       01   04   04   04     10   50   50   51     39   99   98   98

Alpha = .05
n =  5       06   06   05   06     08   12   12   13     10   28   27   27
n = 10       06   06   07   07     11   23   24   24     19   61   61   61
n = 15       06   08   09   08     14   37   37   36     30   84   86   84
n = 20       07   10   09   10     16   52   51   51     41   95   96   95
n = 25       06   11   10   12     21   63   62   64     53   99   99   99
n = 30       07   13   12   13     25   75   74   74     63   100  100  100

Alpha = .10
n =  5       10   11   11   11     14   20   20   22     18   40   39   39
n = 10       11   13   13   14     19   35   35   36     29   73   72   74
n = 15       11   15   15   15     22   51   50   50     42   91   92   91
n = 20       13   18   16   17     26   64   64   64     55   98   98   98
n = 25       13   19   18   20     31   74   74   76     65   100  99   99
n = 30       13   21   20   23     36   84   83   84     76   100  100  100

r = Average of Correlation Coefficients in a given Matrix. ES = Effect Size (d). n = Sample Size per group. All power values are in percent.
r coefficients for AB pairs of a given matrix are equal to the lowest r among factors A and B.

B. Power Trends

Alpha, Sample Size and Effect Size

Similar to the one-way RM ANOVA and as predicted, power was found to increase as α, n or ES increased for any given experimental condition and test of the two-way RM design.
The only exceptions were when power approached either the level of significance or a value of 1.00, in which case values were approximately equal across the levels of a condition.

Average Correlation Among Factors (Different Correlation Matrices)

Main Effect of A

Beginning with the main effects test of factor A, tables 4.4a, 4.5a and 4.6a reveal that when the Ave r of factor A increases from .4 to .8 and all other conditions are held constant (that is, compare column 1 with 3 and column 2 with 4 within a given ES), power increases in all three designs. In addition, the degree of increase in power as Ave r goes from .4 to .8 is, in general, greater as K_B, the number of RM of factor B, increases (i.e., the power increase is greatest for the 3 x 9 design and least for the 3 x 3). Of further interest is the power trend observed among the four different correlation structures. Here we see that for almost any given α, n, ES and K_B, power is lowest for a test having a matrix with an average correlation among trials of factors A and B equal to .4 and .8, respectively (abbreviated 4-8 matrix), and is greatest (equally) for a test whose matrices involve an Ave r for A of .8 and Ave r for B of either .4 or .8 (abbreviated 8-4 and 8-8, respectively), while a test having a matrix with Ave r equal to .4 for A and B (abbreviated 4-4 matrix) displays a magnitude of power in between. The only exception to this trend occurs at the upper end of the power curve, where the difference in power between tests with different matrices diminishes as power approaches 1.00. The top graph in Figure 4.05 depicts this common power trend across matrices of the A main effects test for a 3 x 6 design with small ES and α = .05.

Figure 4.05. A comparison of power across different correlation matrices for tests of a 3 x 6 RM ANOVA design. Note: Design based on small ES and alpha = .05. (Panels, top to bottom: Main Effect of A, Main Effect of B, AB Interaction; x-axis: Sample Size (n).)
Both the tables and this figure indicate that when the average correlation among A trials (Ave r_A) is high (.8) and the average correlation among B trials (Ave r_B) is either equal to (.8) or lower (.4) than Ave r_A, power will be greater than when Ave r_A is lower (.4) and especially greater than when Ave r_B (.8) surpasses Ave r_A. Thus it seems power for the A main effects test under different correlation matrices in a two-way RM design is dependent on the average correlation of A and independent of the average correlation of B, at least up until Ave r_B becomes larger than Ave r_A, at which point power is negatively affected.

Main Effect of B

Referring now to power values for the main effects test of factor B shown in tables 4.4b, 4.5b and 4.6b, we see similar power trends across the different correlation matrices as those observed for the A main effects test. Examining the Ave r of factor B first, we see that as Ave r_B increases from .4 to .8 and all other experimental conditions are held constant, power for the B test increases (compare column 1 with 2 and column 3 with 4 for any given ES and α). Secondly, those matrices having an Ave r_B = .8 and Ave r_A = .4 or .8 (i.e., 4-8 and 8-8) produce a test with the most power, followed by a matrix with average correlations equal to .4 for A and B (4-4), while a matrix in which Ave r_B is less than Ave r_A (8-4) displays the least power of the four. Again, as in the A test, we see that if the Ave r of the factor being averaged over is above the Ave r of the main effects factor (B in this case), power of the test drops considerably. Likewise, if the Ave r of the pooled factor is less than or equal to the Ave r of the main effects factor (8-8 or 4-8), power will be highest.
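The ordering of the four matrices for the two main effects tests can be illustrated with a small variance calculation. The sketch below is not the derivation used in this study. It assumes unit variances in every cell, and it assumes that r for A applies to pairs of cells sharing a B level, that r for B applies to pairs sharing an A level, and that all remaining (AB) pairs take the lower of the two values, as stated in the table notes. It then computes the variance of the difference between two marginal means of a factor for a single subject; a smaller variance implies a smaller error term and hence more power for a fixed mean difference. All function names are illustrative.

```python
"""Error variance of main-effect contrasts under the four AB correlation matrices."""


def cell_correlation(cell_1, cell_2, r_a, r_b):
    """Assumed correlation between two cells (a, b) of the A x B design."""
    (a1, b1), (a2, b2) = cell_1, cell_2
    if a1 == a2 and b1 == b2:
        return 1.0          # a cell with itself (unit variance)
    if b1 == b2:
        return r_a          # same B level, different A levels
    if a1 == a2:
        return r_b          # same A level, different B levels
    return min(r_a, r_b)    # AB pairs: lowest r among factors A and B


def contrast_variance(weights, r_a, r_b):
    """Variance of sum(weight * score) over cells, assuming unit variances."""
    cells = list(weights)
    return sum(
        weights[c1] * weights[c2] * cell_correlation(c1, c2, r_a, r_b)
        for c1 in cells
        for c2 in cells
    )


def main_effect_error(k_a, k_b, r_a, r_b, factor):
    """Variance of the difference between the first two marginal means of `factor`."""
    weights = {}
    if factor == "A":       # average over the k_b levels of B
        for b in range(k_b):
            weights[(0, b)] = 1.0 / k_b
            weights[(1, b)] = -1.0 / k_b
    else:                   # average over the k_a levels of A
        for a in range(k_a):
            weights[(a, 0)] = 1.0 / k_a
            weights[(a, 1)] = -1.0 / k_a
    return contrast_variance(weights, r_a, r_b)


# (r for A, r for B) for the 4-4, 4-8, 8-4 and 8-8 matrices of a 3 x 6 design.
matrices = {"4-4": (0.4, 0.4), "4-8": (0.4, 0.8), "8-4": (0.8, 0.4), "8-8": (0.8, 0.8)}
error_a = {m: main_effect_error(3, 6, ra, rb, "A") for m, (ra, rb) in matrices.items()}
error_b = {m: main_effect_error(3, 6, ra, rb, "B") for m, (ra, rb) in matrices.items()}

for name in matrices:
    print(f"{name}: A contrast variance = {error_a[name]:.3f}, "
          f"B contrast variance = {error_b[name]:.3f}")
```

Under these assumptions the A contrast has its smallest variance for the 8-4 and 8-8 matrices, an intermediate variance for 4-4 and its largest for 4-8, and the B contrast shows the mirror-image ordering, which is consistent with the power ordering reported in the tables.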
The middle graph of Figure 4.05 illustrates the general power trend among the different matrices of the B main effects test.

Interaction

Different power trends emerge for tests of interaction (AB) compared to those described previously for main effects tests. Referring to tables 4.4c, 4.5c and 4.6c, we see that an increase in the overall Ave r of the four matrices does not necessarily produce a concomitant increase in power for the AB test. Rather, those matrices in which at least one factor (A or B or both) has an Ave r = .8 result in the highest power for the AB test. As illustrated in the bottom graph of Figure 4.05, three of the four matrices (8-8, 8-4 and 4-8) produce relatively equal power for the AB test, while the 4-4 matrix is the only one of the four which exhibits inferior power values across most n. These results suggest power for the interaction test in the two-way RM model remains the same among correlation structures with different overall Ave r so long as all matrices involved have at least one RM factor with an Ave r among its trials equal in magnitude to the highest overall Ave r observed among the AB matrices. It appears, therefore, that the mean magnitude of correlation coefficients among pooled trials of factor A or B is a more influential variable affecting the power of an interaction test than the overall Ave r of the AB matrix.

Number of Repeated Measures of Factor B (K_B) - Differences Between Designs

As in the one-way design under conditions of sphericity, a general increase in power across designs with greater K_B was also observed among the tests of the two-way RM model. However, the extent of this trend was not the same for all tests.

Main Effect of A

Of the three tests involved, the tendency for power to increase as K_B increases was most noticeable for the A main effects test.
Contrasts in power between K_B = 9 and K_B = 3 designs of this test were generally high, with differences as large as .53-.58. Figure 4.06 illustrates this common pattern for the A test across different effect sizes and correlation matrices when n = 10 and α = .05. One exception to the rule seems to be a test having a 4-8 matrix, as shown in the top graph, where at small ES power seems to be about equal across designs with different K_B. However, under medium and large effect sizes (middle and bottom graphs) even this same test begins to demonstrate slightly greater power as K_B increases. Another exception is when power for the A test under the other three matrices approaches 1.00 (medium and large effect sizes), in which case differences between designs are reduced and eventually negated.

Main Effect of B

A general tendency for power to increase with larger K_B also exists for the B main effect. However, unlike the A test, the increase is generally small (largest differences ranging between .16 and .21) and only observable when K_B = 9 and the designs involved have a moderate to high degree of power (above approximately .40). Figure 4.07 demonstrates these differences in power between designs with distinct K_B under the same conditions described for the A test. As can be seen in the top graph, power is about equal across designs for the four matrices under a small ES. At medium ES (middle graph), a design with 9 RM begins to show a slight power advantage over the other designs under most conditions, except when the power of a test remains low (the 8-4 matrix), in which case a reverse effect occurs. However, under a large ES (bottom graph), even this test demonstrates a tendency towards greater power as K_B increases. Once power approaches extremely high values (1.00), all designs regardless of K_B have about equal power.
The observed power trend associated with the B main effects test is similar to the general pattern observed in the one-way RM model under a constant r matrix.

Figure 4.06. Change in power for the "A" test of a two-way RM ANOVA as the number of levels of factor "B" increase under varying effect sizes and correlation matrices. Note: Based on alpha = .05 and n = 10. (Panels, top to bottom: A Test; Small Effect Size. A Test; Medium Effect Size. A Test; Large Effect Size.)

Figure 4.07. (Panels, top to bottom: B Test; Small Effect Size. B Test; Medium Effect Size. B Test; Large Effect Size.)

AB Interaction

For the interaction test, the increase in power accompanying larger K_B is mainly evident, like the B test, only when K_B = 9. However, unlike the B test, a particular design's power does not need to be as high in order for this pattern to become noticeable (only above approximately .20-.30). Figure 4.08 illustrates how, when effect size is small (low power), designs with 3, 6 or 9 RM produce similar power values, but as ES becomes larger (power increased above approximately .20), a design with 9 RM shows a slight power advantage over those with fewer RM under most correlation matrices. Only when a test involves a 4-4 matrix does this pattern fail to emerge, since power among all three designs still remains fairly low. Under more favorable experimental conditions (greater n, α), however, even this test demonstrates a similar power trend to the others (not shown in figure). Of the three tests in the two-way RM ANOVA, the AB test showed the smallest contrast in power as K_B increased, with differences between K_B = 9 and K_B = 3 reaching a maximum of only .08 to .13.

Between Test Comparisons: Main Effects and Interaction

When comparing power values between different tests of a two-way RM ANOVA, some distinctive patterns emerge under each of the correlation matrices and designs involved.
Figure 4.09 provides a comparison of power among the main effects and interaction tests of all three designs under small ES, α = .05 and n = 30. Referring to the top graph, we see that for a 3 x 3 design involving a 4-4 matrix, power for the B test is slightly greater (.37) than that of the A test (.31), while both exhibit greater power than the interaction test (.08). The same design under an 8-8 matrix reveals a similar power order among the three tests, except that the difference in power between the main effects tests and the AB test is considerably larger. For the heterogeneous matrices (4-8 and 8-4), the power order appears to be dependent on the magnitude of the Ave r of the main effects factor. As illustrated by the bar graphs, when the Ave r of B is greater than that of A (4-8 matrix), the B test shows superior power, whereas when the Ave r of A is greater (8-4 matrix), the A test has the greater power. In addition, the degree to which power is greater for the B test over the A test under a 4-8 matrix (.79 - .16 = .63) is slightly larger than that observed when the A test dominates under an 8-4 matrix (.75 - .20 = .55). Although these comparisons are specific to conditions involving a small ES, a level of significance of .05 and a sample size of 30, examination of most corresponding power values between tables 4.4a, 4.4b and 4.4c reveals a similar power order between tests across all four matrices. Exceptions are those conditions in which power approaches 1.00.

Figure 4.08. Change in power for the "AB" test of a two-way RM ANOVA as the number of levels of factor "B" increase under varying effect sizes and correlation matrices. Note: Based on alpha = .05 and n = 10. (Panels, top to bottom: AB Test; Small Effect Size. AB Test; Medium Effect Size. AB Test; Large Effect Size.)

For a 3 x 6 design, the order of greatest to least power among tests changes (center graph).
Under the 4-4 and 8-8 matrices, we see that the A test, in contrast to results observed for the 3 x 3 ANOVA, gains a substantial power advantage over the B test. Further, the difference in power between tests A and B under the heterogeneous matrices seems to favor the A test. Although the B test still shows greater power than the A test when the Ave r of factor B = .8, the difference between the two tests (.81 - .18 = .63) is less than that observed when A dominates under an 8-4 matrix (.96 - .16 = .80). As in the 3 x 3 design, the interaction test again displays the least power across all four matrices.

For a 3 x 9 design, the power order between tests is similar to that observed in the 3 x 6 model, with the exception that gains in power for the A test over the other tests are even further enhanced under all r matrices except 4-8 (see bottom graph).

Figure 4.09. A comparison of power between A, B and AB tests of the two-way RM ANOVA under different levels of factor B (3, 6 and 9) and correlation matrices. Note: Designs based on small ES, n = 30 and alpha = .05. (Panels, top to bottom: 3 x 3 ANOVA, 3 x 6 ANOVA, 3 x 9 ANOVA; bars: A, B, AB; x-axis: r Matrix (4-4, 4-8, 8-4, 8-8).)

III. Two-Way Mixed ANOVA

A. Power Tables

Power values for the 2 x K mixed ANOVA with 3, 6 and 9 repeated measures on the second factor are given in tables 4.7a-c, 4.8a-c and 4.9a-c, respectively. Each table provides power for the different levels of alpha, effect size, sample size and average correlation coefficients involved.
All values were derived by first calculating the corresponding noncentrality parameter (λ) of each experimental condition and then converting λ to power using a cumulative distribution function given in DATASIM. Due to time constraints, power under conditions of nonsphericity (trend) was not determined. Means, standard deviations and complete correlation matrices of the conditions involved are given in appendices 4.14.3 and 6.1-6.2, respectively. The description of power results for the mixed model that follows may be facilitated by referring to these tables and to the appropriate figures when indicated.

Table 4.7a. Power of the Groups Main Effect for a 2 (Groups) x 3 (Trials) ANOVA With Repeated Measures on One Factor.

Test: Randomized Group Main Effect

            Small (.20)     Medium (.50)     Large (.80)
Alpha   n   r=.40   r=.80   r=.40   r=.80   r=.40   r=.80
.01     5      01      01      04      03      10      07
.01    10      02      02      10      07      30      19
.01    15      03      02      17      11      52      34
.01    20      04      03      25      16      71      50
.01    25      04      03      34      22      84      63
.01    30      05      04      42      27      92      75

.05     5      06      06      14      11      29      22
.05    10      08      07      26      20      58      43
.05    15      10      09      39      28      79      62
.05    20      12      10      50      36      91      76
.05    25      14      11      60      45      96      86
.05    30      16      12      69      52      99      92

.10     5      12      12      24      20      43      34
.10    10      15      13      39      30      73      58
.10    15      18      15      53      41      89      75
.10    20      20      17      64      50      96      87
.10    25      23      19      74      59      99      93
.10    30      25      21      81      66     100      97

All columns use the constant correlation matrix pattern, C (ε = 1.0). ES = effect size (d). r = average of correlation coefficients in a given matrix. n = sample size per group. All power values are in percent.

Table 4.7b. Power of the Trials Main Effect for a 2 (Groups) x 3 (Trials) ANOVA With Repeated Measures on One Factor.
Test: Trials (Repeated Measures) Main Effect

            Small (.20)     Medium (.50)     Large (.80)
Alpha   n   r=.40   r=.80   r=.40   r=.80   r=.40   r=.80
.01     5      02      03      06      24      20      71
.01    10      03      07      18      67      57      99
.01    15      04      12      32      91      84     100
.01    20      05      18      46      98      95     100
.01    25      06      25      60     100      99     100
.01    30      08      32      72     100     100     100

.05     5      07      12      19      51      44      92
.05    10      10      20      39      88      81     100
.05    15      13      30      57      98      96     100
.05    20      15      39      72     100      99     100
.05    25      18      48      83     100     100     100
.05    30      21      56      90     100     100     100

.10     5      13      20      30      66      59      97
.10    10      17      31      53      94      90     100
.10    15      21      42      70      99      98     100
.10    20      25      52      83     100     100     100
.10    25      28      61      91     100     100     100
.10    30      32      69      95     100     100     100

All columns use the constant correlation matrix pattern, C (ε = 1.0). ES = effect size (d). r = average of correlation coefficients in a given matrix. n = sample size per group. All power values are in percent.

Table 4.7c. Power of the Interaction Test for a 2 (Groups) x 3 (Trials) ANOVA With Repeated Measures on One Factor.

Test: Group by Trials Interaction

            Small (.20)     Medium (.50)     Large (.80)
Alpha   n   r=.40   r=.80   r=.40   r=.80   r=.40   r=.80
.01     5      01      01      02      05      04      14
.01    10      01      02      04      12      10      41
.01    15      02      03      06      22      18      67
.01    20      02      04      08      32      26      85
.01    25      02      05      11      44      36      94
.01    30      02      06      13      54      45      98

.05     5      06      07      08      16      14      34
.05    10      06      09      13      30      26      68
.05    15      07      11      17      44      38      88
.05    20      07      13      22      58      50      96
.05    25      08      15      27      69      61      99
.05    30      09      17      31      78      71     100

.10     5      11      12      15      25      23      48
.10    10      12      15      21      43      38      80
.10    15      13      18      27      58      52      94
.10    20      14      21      33      71      64      99
.10    25      15      24      38      81      74     100
.10    30      16      27      44      88      82     100

All columns use the constant correlation matrix pattern, C (ε = 1.0). ES = effect size (d). r = average of correlation coefficients in a given matrix. n = sample size per group. All power values are in percent.

Table 4.8a. Power of the Groups Main Effect for a 2 (Groups) x 6 (Trials) ANOVA With Repeated Measures on One Factor.

Test: Randomized Group Main Effect

            Small (.20)     Medium (.50)     Large (.80)
Alpha   n   r=.40   r=.80   r=.40   r=.80   r=.40   r=.80
.01     5      02      01      05      03      12      07
.01    10      02      02      12      07      38      20
.01    15      03      02      21      12      63      36
.01    20      04      03      31      17      81      52
.01    25      05      03      42      23      91      66
.01    30      06      04      52      29      96      77

.05     5      07      06      16      12      34      22
.05    10      09      07      31      20      67      45
.05    15      11      09      45      29      86      64
.05    20      13      10      58      38      95      78
.05    25      16      11      69      46      99      88
.05    30      18      13      78      54     100      93

.10     5      13      12      26      20      49      35
.10    10      16      14      44      31      80      59
.10    15      19      15      60      42      94      77
.10    20      22      17      72      52      98      88
.10    25      25      19      81      60     100      94
.10    30      28      21      88      68     100      97

All columns use the constant correlation matrix pattern, C (ε = 1.0). ES = effect size (d). r = average of correlation coefficients in a given matrix. n = sample size per group. All power values are in percent.

Table 4.8b. Power of the Trials Main Effect for a 2 (Groups) x 6 (Trials) ANOVA With Repeated Measures on One Factor.

Test: Trials (Repeated Measures) Main Effect

            Small (.20)     Medium (.50)     Large (.80)
Alpha   n   r=.40   r=.80   r=.40   r=.80   r=.40   r=.80
.01     5      02      03      06      28      22      82
.01    10      02      07      18      74      63     100
.01    15      03      12      34      95      89     100
.01    20      04      18      50      99      98     100
.01    25      06      25      65     100     100     100
.01    30      07      33      77     100     100     100

.05     5      07      11      19      54      46      95
.05    10      09      20      39      91      84     100
.05    15      12      29      58      99      97     100
.05    20      14      39      74     100     100     100
.05    25      17      48      85     100     100     100
.05    30      20      57      92     100     100     100

.10     5      13      19      30      67      60      98
.10    10      16      30      52      95      92     100
.10    15      20      41      71     100      99     100
.10    20      23      52      84     100     100     100
.10    25      27      61      92     100     100     100
.10    30      31      70      96     100     100     100

All columns use the constant correlation matrix pattern, C (ε = 1.0). ES = effect size (d). r = average of correlation coefficients in a given matrix. n = sample size per group. All power values are in percent.

Table 4.8c. Power of the Interaction Test for a 2 (Groups) x 6 (Trials) ANOVA With Repeated Measures on One Factor.
Test: Group by Trials Interaction

            Small (.20)     Medium (.50)     Large (.80)
Alpha   n   r=.40   r=.80   r=.40   r=.80   r=.40   r=.80
.01     5      01      01      02      05      04      15
.01    10      01      02      03      12      10      45
.01    15      01      03      05      22      18      73
.01    20      02      03      07      34      27      90
.01    25      02      04      10      47      38      97
.01    30      02      05      13      59      48      99

.05     5      05      06      08      15      13      35
.05    10      06      08      12      29      25      70
.05    15      07      10      16      45      38      90
.05    20      07      12      21      59      51      97
.05    25      08      14      26      71      63      99
.05    30      08      16      31      81      72     100

.10     5      11      12      15      25      22      49
.10    10      12      15      20      42      37      81
.10    15      12      17      26      58      51      95
.10    20      13      20      32      71      64      99
.10    25      14      23      37      82      75     100
.10    30      15      25      43      89      83     100

All columns use the constant correlation matrix pattern, C (ε = 1.0). ES = effect size (d). r = average of correlation coefficients in a given matrix. n = sample size per group. All power values are in percent.

Table 4.9a. Power of the Groups Main Effect for a 2 (Groups) x 9 (Trials) ANOVA With Repeated Measures on One Factor.

Test: Randomized Group Main Effect

            Small (.20)     Medium (.50)     Large (.80)
Alpha   n   r=.40   r=.80   r=.40   r=.80   r=.40   r=.80
.01     5      02      01      05      03      13      07
.01    10      02      02      13      07      41      21
.01    15      03      02      23      12      67      37
.01    20      04      03      34      17      84      53
.01    25      06      03      45      23      93      66
.01    30      07      04      56      29      98      78

.05     5      07      06      17      12      36      23
.05    10      09      07      33      20      70      45
.05    15      12      09      48      29      89      64
.05    20      14      10      61      38      97      79
.05    25      16      11      72      47      99      88
.05    30      19      13      81      55     100      94

.10     5      13      12      28      20      52      35
.10    10      16      14      46      32      83      60
.10    15      20      16      62      42      95      78
.10    20      23      17      75      52      99      89
.10    25      26      19      84      61     100      95
.10    30      29      21      90      69     100      98

All columns use the constant correlation matrix pattern, C (ε = 1.0). ES = effect size (d). r = average of correlation coefficients in a given matrix. n = sample size per group. All power values are in percent.
Table 4.9b. Power of the Trials Main Effect for a 2 (Groups) x 9 (Trials) ANOVA With Repeated Measures on One Factor.

Test: Trials (Repeated Measures) Main Effect

            Small (.20)     Medium (.50)     Large (.80)
Alpha   n   r=.40   r=.80   r=.40   r=.80   r=.40   r=.80
.01     5      02      03      07      35      27      91
.01    10      02      07      21      84      74     100
.01    15      03      14      40      98      95     100
.01    20      05      21      59     100      99     100
.01    25      06      30      75     100     100     100
.01    30      08      39      86     100     100     100

.05     5      07      12      21      61      52      98
.05    10      09      21      44      95      90     100
.05    15      12      32      65     100      99     100
.05    20      15      43      81     100     100     100
.05    25      18      54      91     100     100     100
.05    30      22      64      96     100     100     100

.10     5      13      20      32      73      66      99
.10    10      17      32      57      98      95     100
.10    15      21      45      77     100     100     100
.10    20      25      56      89     100     100     100
.10    25      29      67      95     100     100     100
.10    30      33      75      98     100     100     100

All columns use the constant correlation matrix pattern, C (ε = 1.0). ES = effect size (d). r = average of correlation coefficients in a given matrix. n = sample size per group. All power values are in percent.

Table 4.9c. Power of the Interaction Test for a 2 (Groups) x 9 (Trials) ANOVA With Repeated Measures on One Factor.
Test: Group by Trials Interaction

            Small (.20)     Medium (.50)     Large (.80)
Alpha   n   r=.40   r=.80   r=.40   r=.80   r=.40   r=.80
.01     5      01      01      02      05      04      18
.01    10      01      02      04      14      11      54
.01    15      01      03      06      26      21      83
.01    20      02      04      08      41      33      95
.01    25      02      04      11      55      45      99
.01    30      02      06      15      68      57     100

.05     5      05      06      08      16      14      39
.05    10      06      08      12      33      28      77
.05    15      07      10      17      50      43      95
.05    20      07      12      22      66      57      99
.05    25      08      14      28      78      69     100
.05    30      08      17      34      87      79     100

.10     5      --      12      15      26      23      53
.10    10      12      15      21      45      40      86
.10    15      12      18      27      63      56      97
.10    20      13      21      34      77      70     100
.10    25      14      24      40      87      80     100
.10    30      15      27      47      93      88     100

All columns use the constant correlation matrix pattern, C (ε = 1.0). ES = effect size (d). r = average of correlation coefficients in a given matrix. n = sample size per group. All power values are in percent. -- = value illegible in the source.

B. Power Trends

Alpha, Sample Size and Effect Size

Similar to the one-way and two-way RM ANOVA, and as predicted, power was found to increase as α, n or ES increased for any given experimental condition and test of the two-way mixed design. Exceptions were those conditions in which power approached either the level of significance or 1.00, in which case values were approximately equal.

Average Correlation

The effects of increasing the Ave r among trials of the repeated measures factor in the mixed ANOVA model differ depending on the test involved. For a main effect test on the grouping factor, an increase in Ave r from .4 to .8, as expected, causes a decrease in power for almost any given α, n, ES and K, as shown in tables 4.7a, 4.8a and 4.9a. Figure 4.10 (top graph) also depicts this common trend for the groups test of all three designs under conditions involving medium ES, level of significance = .05 and n = 15.
For the trials main effect and interaction tests, the reverse is true, with both displaying an increase in power under most conditions as Ave r is increased (center and bottom graphs, respectively, and tables 4.7b-c, 4.8b-c and 4.9b-c).

Figure 4.10. Comparisons of power between two-way mixed ANOVA designs for the Groups, Trials and Group by Trials tests as the average correlation among repeated trials is increased. Based on medium ES, α = .05 and n = 15 per group. [Graphs not reproduced. Panels: Groups Main Effect, Trials Main Effect, Group x Trials Interaction; horizontal axis: average correlation (Ave r), .4 and .8.]

Number of Repeated Measures - Differences Between Designs

Figure 4.11. Comparisons of power between two-way mixed ANOVA designs with different levels of RM (3, 6 and 9) under varying effect sizes for the Group, Trials and Group by Trials tests. Based on α = .05, n = 15 per group. [Graphs not reproduced. Panels: Groups Main Effect (r = .4), Trials Main Effect (r = .4), Groups by Trials Interaction (r = .8); horizontal axis: number of RM (K); lines: small ES (.2), medium ES (.5), large ES (.8).]

Figure 4.11 illustrates the common power trends across designs under different effect sizes for each of the three tests of the mixed model. In most cases, the changes in power across K were similar to those seen in the two-way RM model. For the group test, like the A test of the RM model, there was a tendency for power to increase as K increased. This increase was greatest at larger effect sizes and when the Ave r among trials was low (.4), as demonstrated in the top graph of this figure. Although not shown, when the Ave r was moderately high (.8), differences across designs were slight (.03 at most). For the trials test, a trend similar to that of the group test was seen (center graph), with differences across K being greatest at medium and large effect sizes. However, unlike the group test, differences in power were found under both values of Ave r.
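The group-test trends described above can be checked directly against the tabled values. The following is a minimal sketch, not part of the original analysis: it copies the groups main effect entries for α = .05 and n = 15 from tables 4.7a, 4.8a and 4.9a into a dictionary (the names `GROUP_POWER` and `spread_across_k` are illustrative) and reports, for each effect size and Ave r, how much power changes across K.

```python
# Power (percent) of the groups main effect at alpha = .05, n = 15 per group,
# copied from tables 4.7a (K = 3), 4.8a (K = 6) and 4.9a (K = 9).
# Keys are (effect size label, Ave r); values map K to power.
GROUP_POWER = {
    ("small", 0.4): {3: 10, 6: 11, 9: 12},
    ("small", 0.8): {3: 9, 6: 9, 9: 9},
    ("medium", 0.4): {3: 39, 6: 45, 9: 48},
    ("medium", 0.8): {3: 28, 6: 29, 9: 29},
    ("large", 0.4): {3: 79, 6: 86, 9: 89},
    ("large", 0.8): {3: 62, 6: 64, 9: 64},
}


def spread_across_k(power_by_k):
    """Largest minus smallest power (percentage points) across K = 3, 6, 9."""
    values = list(power_by_k.values())
    return max(values) - min(values)


for (effect_size, ave_r), power_by_k in GROUP_POWER.items():
    print(effect_size, ave_r, power_by_k, "spread:", spread_across_k(power_by_k))
```

Under these conditions the spread across K is largest at Ave r = .4 with medium and large effect sizes and stays within a few points at Ave r = .8, which is the pattern the text describes. Group power is also lower at Ave r = .8 than at .4 for every K.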
The interaction test showed results similar to those of the trials test, but only at larger effect sizes and when K = 9 (bottom graph). The group by trials results were also similar to those of the AB test in the two-way RM model. Of the values reported in tables 4.7a - 4.9c, the largest differences in power across designs for the group, trials and interaction tests were .15, .20 and .16, respectively. Minimum differences were found when power among designs was generally low (small effect sizes) or approached 1.00.

Between Test Comparisons: Main Effects and Interaction

Figure 4.12 portrays the general power order between tests under different Ave r for most experimental conditions of the two-way mixed ANOVA. As can be seen, at a moderately low Ave r (.4), the trials test displays the most power, followed by the group test, while the interaction test exhibits the least. At a moderately high Ave r (.8), and as expected, power for the group test is reduced, dropping below that of the interaction test, while power for the other tests increases from the values observed under Ave r = .4. This general power order across Ave r for tests of the two-way model holds under most experimental conditions and designs. Exceptions exist for the 2 x 6 design under small ES and moderately low Ave r (.4), where power values remain about the same between the main effect tests, as well as at the extremes of the power curve, where differences are minimal.

Figure 4.12. A comparison of power between tests of a 2 x 6 mixed ANOVA design as the average correlation among repeated trials is increased. Based on medium ES, α = .05 and n = 15 per group. [Graph not reproduced. Lines: Groups Main Effect, Trials Main Effect, Groups x Trials Interaction; horizontal axis: average correlation (r), .4 and .8.]
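The power order shown in Figure 4.12 can be reproduced from the tables. The sketch below is illustrative only: it takes the 2 x 6 entries for medium ES, α = .05 and n = 15 from tables 4.8a-c and sorts the three tests by power at each Ave r (the names `POWER_2X6` and `power_order` are assumptions made for this example).

```python
# Power (percent) for the 2 x 6 mixed design, medium ES, alpha = .05, n = 15,
# copied from tables 4.8a (groups), 4.8b (trials) and 4.8c (interaction).
POWER_2X6 = {
    0.4: {"groups": 45, "trials": 58, "interaction": 16},
    0.8: {"groups": 29, "trials": 99, "interaction": 45},
}


def power_order(power_by_test):
    """Return the test names sorted from greatest to least power."""
    return sorted(power_by_test, key=power_by_test.get, reverse=True)


for ave_r, power_by_test in POWER_2X6.items():
    print("Ave r =", ave_r, "->", power_order(power_by_test))
```

At Ave r = .4 the order is trials, groups, interaction; at Ave r = .8 the group test falls below the interaction test, as stated in the text.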
Chapter Five

Discussion

The relationships between power and level of significance, sample size and effect size were discussed in chapter 2 and require no further elaboration. In addition, the effects of varying the average correlation coefficient on the power of a one-way RM ANOVA and on tests of a two-way design with a single repeated measures factor have been explained in detail and need not be repeated here. The focus of this section, therefore, is on relationships that are less well understood: in particular, the influence that nonspherical correlation matrices, different levels of RM and the existence of multiple RM factors have on power.

I. One-Way Repeated Measures ANOVA

Power Comparisons Across Constant (Spherical) and Trend (Nonspherical) Correlation Matrices

The observed differences in power between one-way ANOVA designs with constant and trend r matrices were in agreement with the hypothesis that power would be altered under heterogeneous r matrices. Our findings also agree with results from previous studies that have examined power under conditions of low epsilon. Marcucci (1986) and Muller and Barton (1986) showed, under a false null hypothesis (effect size greater than 0), that as epsilon was lowered, power for an RM ANOVA design was overestimated, while Mendoza et al. (1974) found it to be underestimated. One of the main reasons for this difference in power between designs with high and low epsilon, as seen in this and other studies, is the increase in the variability of the F ratio which occurs as epsilon is decreased. Eom (1993) indicated that the increase in the variance of the F ratio is a direct consequence of an increase in the variability of both its numerator (mean square for trials, MS_K) and denominator (mean square error, MS_ERR) terms.
In his study he demonstrated, using Monte Carlo simulation, that under conditions of nonsphericity the null F ratio of an RM ANOVA test is more variable, resulting in a greater occurrence of outliers. Although his study examined type I error rates rather than power, it is believed that an increase in the variability of the F ratio is also responsible for the power trends observed in the present study under T. In order to verify that this was the case, power under C for various conditions of the one-way design was also derived using Monte Carlo procedures, and several statistics, including the F ratio, were compared with those from identical conditions under T. For all conditions examined, values averaged over three thousand ANOVA test replications revealed that the mean F, MS_K and MS_ERR values under C and T remained relatively equal, while the respective standard deviations (SD) of these statistics were all greater under T. Figures 5.01 and 5.02 include plots of the distributions of the F values under small and medium ES, respectively, and varying K. For comparison, the respective F distributions under the null hypothesis (Ho C) are also plotted. Among the distributions under the alternate hypothesis (Ha) of any one design (one graph), we see that the mean F ratios of tests under C and T are more or less similar, but their SDs are quite different, with those under T all being higher. Moreover, all distributions under T tend to be more concentrated in the tail regions. Under medium ES these distributions appear to be more platykurtic (flatter), while under small ES they are more positively skewed than those of C. This greater concentration in the tails and altered shape of the distributions under T is due to the larger number of outliers that result when epsilon is low, as explained earlier.
Such an effect tends to spread out the distribution of F values over a greater range, thereby altering the power of a design under a true alternate hypothesis.

Figure 5.01. F distributions for one-way repeated measures ANOVA designs when effect size is small (.2) and the pattern of the correlation matrix is altered. Note: All designs based on Ave r = .8, n = 30. [Plots not reproduced; the statistics printed within the panels are listed below. Critical F marked in the K = 3 panel: 5.00.]

Design   Distribution   Mean F      SD   Power
K = 3    Ho C             1.02    1.09
K = 3    Ha C             2.64    2.22    0.12
K = 3    Ha T             2.60    2.88    0.16
K = 6    Ho C             1.01    0.67
K = 6    Ha C             1.85    1.07    0.12
K = 6    Ha T             1.86    1.72    0.17
K = 9    Ho C             1.02    0.52
K = 9    Ha C             1.72    0.81    0.14
K = 9    Ha T             1.73    1.30    0.20

Figure 5.02. F distributions for one-way repeated measures ANOVA designs when effect size is medium (.5) and the pattern of the correlation matrix is altered. Note: All designs based on Ave r = .8, n = 30. [Plots not reproduced; the statistics printed within the panels are listed below. Critical F marked in the K = 3 panel: 5.00; in the K = 9 panel: 2.59.]

Design   Distribution   Mean F      SD   Power
K = 3    Ho C             1.02    1.09
K = 3    Ha C            10.89    5.19    0.91
K = 3    Ha T            11.12    7.16    0.82
K = 6    Ho C             1.01    0.67
K = 6    Ha C             6.40    2.31    0.95
K = 6    Ha T             6.57    3.77    0.82
K = 9    Ho C             1.02    0.52
K = 9    Ha C             5.44    1.64    0.98
K = 9    Ha T             5.48    2.84    0.87

Ho C = null F distribution under constant r matrix. Ha C = alternate F distribution under constant correlation matrix. Ha T = alternate F distribution under trend correlation matrix. Mean F is marked in each panel.
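The constant and trend patterns compared in these figures differ in Box's epsilon, which equals 1.0 under sphericity and falls toward 1/(K - 1) as the matrix departs from it. The sketch below computes epsilon from a correlation matrix using the standard double-centering form of Box's formula. The trend matrices actually used in this study are given in the appendices and are not reproduced here; the simplex pattern `rho ** |i - j|` is only an assumed stand-in for a nonspherical structure, and all function names are illustrative.

```python
def box_epsilon(matrix):
    """Box's epsilon for a symmetric k x k covariance or correlation matrix."""
    k = len(matrix)
    row_means = [sum(row) / k for row in matrix]
    grand_mean = sum(row_means) / k
    # Double-centre the matrix: remove row means, column means and grand mean.
    centred = [
        [matrix[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(k)]
        for i in range(k)
    ]
    trace = sum(centred[i][i] for i in range(k))
    sum_of_squares = sum(value * value for row in centred for value in row)
    return trace ** 2 / ((k - 1) * sum_of_squares)


def constant_matrix(k, r):
    """Constant (compound symmetric) correlation matrix with off-diagonal r."""
    return [[1.0 if i == j else r for j in range(k)] for i in range(k)]


def simplex_matrix(k, rho):
    """Assumed stand-in for a trend pattern: correlation decays with lag."""
    return [[rho ** abs(i - j) for j in range(k)] for i in range(k)]


for k in (3, 6, 9):
    print(k, box_epsilon(constant_matrix(k, 0.8)), box_epsilon(simplex_matrix(k, 0.8)))
```

Any constant matrix returns epsilon = 1 regardless of r, whereas the decaying pattern returns a value below 1. How far below depends on the particular matrix, so the output for the stand-in should not be read as the epsilon of the thesis conditions.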
Although power is known to be altered under conditions of nonsphericity, as shown in other studies (Marcucci, 1986; Muller & Barton, 1986; Grima, 1987; Mendoza et al., 1974), the reason(s) why power can sometimes fall above or below the values obtained under C have not been well documented. One possible explanation emerges when identical designs with different effect sizes are compared across Figures 5.01 and 5.02. Referring to any one graph in Figure 5.01, we see that when ES is small, all the mean F ratios under Ha fall below the F critical values of their respective null distributions (e.g. F(2, 58; .01) = 5.00 for a design with K = 3 levels). More importantly, the area of the F distribution past this critical point (to the right) is slightly greater under T than it is under C. With more variability in the F ratio under T, the greater number of outliers occurring above this point compared to C results in more tests achieving significance, therefore producing greater power for the design with a trend r matrix pattern. In contrast, when examining any one graph in Figure 5.02 under a medium effect size, all the mean F ratios under Ha are above the null F critical values. In addition, the area of the F distribution above the critical point is less under T than it is under C. Thus, with more variability in the F ratio under T, the greater number of outliers falling below (to the left of) this point compared to C reduces the power of designs involving a trend matrix. What seems especially important in deciding whether a trend design will show greater or less power is not so much the magnitude of the effect size involved but, rather, all the statistical factors that contribute to a design having a mean F ratio which falls above or below its F critical value.
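The argument above, that extra spread raises power when the mean F lies below the critical value and lowers it when the mean F lies above, can be illustrated numerically. The following is a rough sketch under a strong simplifying assumption: each F distribution is replaced by a gamma distribution matched to the mean and SD reported in Figures 5.01 and 5.02 for K = 3. This is not how power was obtained in the study, and the function name `tail_area` is hypothetical.

```python
import random


def tail_area(mean, sd, critical, draws=100_000, seed=1):
    """Monte Carlo estimate of P(X > critical) for a gamma variable
    matched to the given mean and SD (a stand-in for the F distribution)."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    rng = random.Random(seed)
    hits = sum(rng.gammavariate(shape, scale) > critical for _ in range(draws))
    return hits / draws


CRITICAL_F = 5.00  # critical value marked in the K = 3 panels

# Mean and SD of F for K = 3 as reported in Figures 5.01 (small ES) and 5.02 (medium ES).
small_c = tail_area(2.64, 2.22, CRITICAL_F)
small_t = tail_area(2.60, 2.88, CRITICAL_F)
medium_c = tail_area(10.89, 5.19, CRITICAL_F)
medium_t = tail_area(11.12, 7.16, CRITICAL_F)

print("small ES:  C", small_c, " T", small_t)
print("medium ES: C", medium_c, " T", medium_t)
```

With the mean F below the critical value (small ES), the more variable T distribution places more area beyond the critical point; with the mean F above it (medium ES), the more variable distribution places less. The gamma stand-in only approximates the shape of the simulated F distributions, so the tail areas should be read as indicative rather than as the tabled power values.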
When the mean F ratio falls above its critical value, the greater variability in F that results for tests under nonsphericity will cause a decrease in power relative to identical designs under sphericity. Likewise, when the mean F ratio is below its critical value, the greater variance in F produces a design with more power under T. This helps explain why, in this study, a design exhibiting low power under C showed slightly greater power when a nonspherical r matrix was involved, while another design with high power under C showed less power under T. This may also clarify findings from the Mendoza et al. (1974) study, in which a test's power was shown to be less under T when power under C was moderate (.5) to high (.89). However, for reasons that are unclear, it does not account for the results of Marcucci (1986) and Muller and Barton (1986), who found power to be greater under T regardless of a test's power under spherical structures. The discrepancy in results may be due to the fact that the latter studies involved analytical approximations of power, whereas the Mendoza et al. investigation, like this one, involved Monte Carlo simulation.

Power Comparisons Across Designs With Different Levels of RM (K)

Constant r Matrix

Under a constant r matrix pattern, the general observation that power increased slightly as the number of repeated measures within a one-way ANOVA design increased can, in most instances, be attributed directly to an increase in a design's noncentrality parameter (λ), as given in equation 2.01. The reader may recall that λ represents the factor by which the F ratio departs from the central F distribution when a true difference between means exists, and that it has a curvilinear relationship with power.
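To make the link between λ and power concrete, the sketch below estimates power by simulating noncentral F ratios directly. It is an illustration only and does not reproduce the DATASIM routine used in this study. The value λ = 3.1 is an assumed example, roughly what a mean F of 2.64 implies for K = 3 and n = 30, not a value taken from the appendices; the degrees of freedom (2 and 58) and the critical value 5.00 correspond to that design at α = .01. The function name `simulate_power` is hypothetical.

```python
import math
import random


def simulate_power(lam, df1, df2, f_crit, reps=20_000, seed=1):
    """Estimate P(F > f_crit) for a noncentral F with noncentrality lam."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Noncentral chi-square on df1: one squared normal with mean sqrt(lam)
        # plus a central chi-square on df1 - 1.
        numerator = rng.gauss(math.sqrt(lam), 1.0) ** 2
        if df1 > 1:
            numerator += rng.gammavariate((df1 - 1) / 2.0, 2.0)
        # Central chi-square on df2 for the error term.
        denominator = rng.gammavariate(df2 / 2.0, 2.0)
        f_ratio = (numerator / df1) / (denominator / df2)
        hits += f_ratio > f_crit
    return hits / reps


for lam in (0.0, 3.1, 6.0, 12.0):
    print("lambda =", lam, "power ~", simulate_power(lam, 2, 58, 5.00))
```

With λ = 0 the estimate sits near the nominal α of .01, and it rises along a curve rather than a straight line as λ grows, which is the curvilinear relationship referred to above.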
The increase in λ when K increases and all other parameters are held constant is the result of a greater numerator sum of squares for the trials effect, which occurs because of the greater number of means that exist in a design with larger K (i.e. a greater number of deviations from the grand mean in $\sum_j (\mu_j - \mu)^2$). Comparing λ across designs with different K for those conditions in which power was determined using DATASIM (λ values are given in appendices 7.1-7.4) shows that, under most conditions, an increase in K results in an increase in λ and thus in power. However, when effect size is small, resulting in a design with low power (< .20), an increase in K seems to have little effect on improving power, despite an increase in λ (see the top graph of Figure 4.04 again). This finding seems contrary to the expected relationship between λ and power.

In order to help explain this occurrence, the F distributions from Figures 5.01 and 5.02 for the different designs under C were plotted against each other and are shown in Figure 5.03 for each ES. For comparison, the F distributions of the designs under the null hypothesis (Ho C) are also shown (top graph). Examining the different distributions under a small effect size (center graph), we see that as the number of RM increases from 3 to 9, the F distributions change in shape, going from a relatively flat and positively skewed distribution (K = 3) to one that is more leptokurtic and bell-shaped (K = 9). Also important to note is how the mean and standard deviation values of the alternate F distributions and their respective critical F values decrease as K increases. This decrease in mean F values as K increases is the result of a decrease in the numerator of the F ratio alone, since mean MS_K was found to decrease with higher levels of RM while MS_ERR (the denominator of F) remained constant across designs (these latter statistics are not shown).
From this information, therefore, it appears that for a design with K = 3 levels, despite its larger F critical value (F = 5.00), the greater variability (more outliers) and flatter distribution of its F ratio result in an almost equal area (power) occurring to the right of the critical point as that observed in designs with 6 or 9 RM levels. Thus, it seems the differences in F distributions observed between designs with different K under C and small effect size (low power) tend to dissipate the power advantages gained from having a greater number of RM levels. This helps explain why an increase in λ across designs with greater K did not always produce higher power under such conditions. Although the decrease in mean F values across designs with greater K may at first seem surprising given that λ increases (with λ and F directly related, one would expect an increase in one to be mirrored by the other), it should be remembered that the numerator of λ, i.e. $n \sum_j (\mu_j - \mu)^2$, is not divided by the numerator degrees of freedom as is that of F, i.e. $SS_K/(K-1)$, and therefore is not reduced when K is increased.

In contrast, when examining the distributions under a medium effect size (bottom graph of Figure 5.03), we see a different effect emerge. Although the change in the shape of the distributions when K is increased is similar to that observed earlier under a small effect size (with the exception that all distributions are less positively skewed), the area to the right of each design's respective F critical value differs between K. Here a design with 9 RM levels has most of the area of its distribution to the right of its F critical point (F = 2.59 at α = .01), followed by a design with K = 6 levels, which has slightly less, while a design with K = 3 displays the least. Again, the mean and standard deviation values of F are found to decrease as K increases.
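The point that λ can rise while the mean F falls follows from the expected value of a noncentral F, $E(F) = \nu_2(\nu_1 + \lambda) / [\nu_1(\nu_2 - 2)]$ for $\nu_2 > 2$, in which λ is divided by the numerator degrees of freedom. The sketch below inverts this standard formula to back out the λ implied by the mean F values reported for the constant matrix, small ES designs (2.64, 1.85 and 1.72 for K = 3, 6 and 9, n = 30). These implied values are rough estimates from simulated means, not the λ values listed in the appendices, and the function name `implied_lambda` is hypothetical.

```python
def implied_lambda(mean_f, k, n):
    """Noncentrality implied by a mean F ratio in a one-way RM design
    with k trials and n subjects, using the noncentral F mean formula."""
    df1 = k - 1
    df2 = (n - 1) * (k - 1)
    # Invert E(F) = df2 * (df1 + lam) / (df1 * (df2 - 2)).
    return mean_f * df1 * (df2 - 2) / df2 - df1


# Mean F under Ha C, small ES, Ave r = .8, n = 30 (Figure 5.01).
MEAN_F = {3: 2.64, 6: 1.85, 9: 1.72}

for k, mean_f in MEAN_F.items():
    print("K =", k, "mean F =", mean_f, "implied lambda ~", round(implied_lambda(mean_f, k, 30), 2))
```

The implied λ increases from about 3.1 at K = 3 to about 5.6 at K = 9 even though the mean F decreases, because the larger λ is spread over more numerator degrees of freedom.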
In this case, the flatter distribution and greater variability of F for those designs with fewer repeated measures serve as a detriment, leading to less power, while the more leptokurtic distribution and smaller variability of F among tests with more RM provide a power advantage (e.g. power of .91, .95 and .98 for K = 3, 6 and 9, respectively). Thus, under conditions of medium effect size (moderately high power) and a constant r matrix pattern, the increase in λ that accompanies designs with more RM levels is not overshadowed by a less variable F but is, in fact, enhanced, leading to greater power for such designs. This explains why, when the assumptions of sphericity are met, we see a tendency for power to increase as K increases under larger effect sizes.

Trend r Matrix

The changes in power observed between designs with different numbers of RM when the assumptions of sphericity are not met (low epsilon) may also be explained by comparing the pattern of F distributions across designs. Such a comparison is given in Figure 5.04 for conditions involving both small and medium effect sizes. Again, each design's respective null F distribution (under C) is shown (top graph). Referring to the center graph, we see that when effect size is small, all three distributions are fairly skewed, with the highest frequency of F values occurring at or below 1.0. The distributions, like those observed under C, differ from one another in how values are concentrated in the tail regions, with K = 3 showing the most concentration and K = 9 the least. This difference in concentration is also reflected in the decrease in the SD and mean of F that occurs when the number of RM levels is increased, as shown to the right of the graph. Although the distribution for the design with 3 levels has greater variability than those with 6 or 9, the area to the right of its F critical point is smallest, as indicated by its power (.16).
This is different from that seen earlier under C, where the area was approximately equal among the three designs. The reason for this discrepancy under T lies in differences in the number of outliers that reach significance among designs with varying K. When epsilon is low, the number of outliers that surpass the F critical value for a design with 3 RM is less than that for a design with 6, and less again than that for a design with 9 (even though the magnitude of outlier values is greatest for K = 3). This is because the F distribution of K = 3 is less centrally concentrated around the critical F, providing fewer opportunities to obtain outliers past this point (compare the slopes of each design's distribution at the F critical points).

Figure 5.03. F distributions for one-way ANOVA designs with 3, 6 and 9 repeated measures under a constant correlation matrix and varying effect size. Note: All designs based on Ave r = .8, n = 30. F critical values given within graphs. [Plots not reproduced; the statistics printed beside the panels are listed below.]

Panel                                 Design   Mean F      SD   Power
Null F (effect size = 0)              K = 3      1.02    1.09    0.01
Null F (effect size = 0)              K = 6      1.01    0.67    0.01
Null F (effect size = 0)              K = 9      1.02    0.51    0.01
Alternate F (effect size = .2)        K = 3      2.64    2.22    0.12
Alternate F (effect size = .2)        K = 6      1.85    1.07    0.12
Alternate F (effect size = .2)        K = 9      1.72    0.81    0.14
Alternate F (effect size = .5)        K = 3     10.89    5.19    0.91
Alternate F (effect size = .5)        K = 6      6.40    2.31    0.95
Alternate F (effect size = .5)        K = 9      5.44    1.64    0.98

Referring now to the bottom graph of Figure 5.04, in which power for the different designs is relatively high due to a larger effect size, we see an almost opposite effect occurring.
Here, the mean F values for the three distributions are all above their respective F critical values, with K = 3 again demonstrating the highest mean value, followed by K = 6 and then K = 9. The distributions are all flatter and less positively skewed than those observed under a small ES, with the K = 3 distribution showing the highest SD and the K = 9 distribution the lowest. In this situation, we see that a larger portion of the distribution for K = 9 falls above its F critical point compared to that of K = 6 and K = 3, resulting in greater power for this design. What is more important to note from this graph is the larger concentration of F values bordering the critical points of designs with higher K. Under T, such a concentration provides a greater opportunity for outliers to occur below the F critical value (i.e. to reduce power) among those designs with more RM. Although the example given in the bottom graph of Figure 5.04 does not adequately represent this effect (power is greatest for K = 9), other comparisons between these designs under medium and large ES do indicate this trend very well (see α = .05 and .10 in the power tables as well as the center and bottom graphs of Figure 4.04).

Therefore, it appears that under conditions of nonsphericity, the degree to which power is affected by a difference in the number of RM depends on the magnitude of power a design acquires from other factors and on the F distribution involved. When a design with small effect size (low power) has a low number of RM (e.g. K = 3), there seems to be a slightly lower probability of obtaining an F value above the critical value because of the lower concentration of F values occurring around its critical point compared to that observed for designs having larger numbers of RM (e.g. K = 9). In such a case, power generally tends to be lower. In contrast, when the same design involves a medium effect size (high power), the probability of obtaining an F value below the critical value can be less than that of designs with more RM, for the same reasons mentioned above. In such a situation, power for a design with fewer RM may be even greater than that of a design with larger K, as was evidenced for some conditions of this study.

Figure 5.04. F distributions for one-way ANOVA designs with 3, 6 and 9 repeated measures under a trend correlation matrix and varying effect size. Note: All designs based on Ave r = .8, n = 30. F critical values given within graphs. [Plots not reproduced; the statistics printed beside the panels are listed below. Critical F marked for K = 9: 2.59.]

Panel                                         Design   Mean F      SD   Power
Constant matrix, null F (effect size = 0)     K = 3      1.02    1.09    0.01
Constant matrix, null F (effect size = 0)     K = 6      1.01    0.67    0.01
Constant matrix, null F (effect size = 0)     K = 9      1.02    0.51    0.01
Trend matrix, alternate F (effect size = .2)  K = 3      2.60    2.88    0.16
Trend matrix, alternate F (effect size = .2)  K = 6      1.86    1.72    0.17
Trend matrix, alternate F (effect size = .2)  K = 9      1.73    1.30    0.20
Trend matrix, alternate F (effect size = .5)  K = 3     11.12    7.16    0.82
Trend matrix, alternate F (effect size = .5)  K = 6      6.57    3.77    0.82
Trend matrix, alternate F (effect size = .5)  K = 9      5.48    2.84    0.87

An important point worth mentioning when examining power across designs with different K, under conditions of both sphericity and nonsphericity, is that the results obtained depend to a great extent on the definition of ES (d) used in this study.
Since d was fixed across designs, it should be expected that the tendency for power to increase as K increased would be less than what would have been observed if ES was defined by Cohen's f. This is because the latter statistic accounts for the reduction in numerator variance that occurs among designs with greater RM by enlarging the difference between means (i.e. making d bigger) for designs with more K. Therefore, under a fixed f, power would be expected to increase more so as K increased. However, unlike the reasons given in this study, the increase in power across designs would be more attributable to larger differences between means than to an increase in K.

II. Two-Way RM ANOVA

Power Comparisons Across Different Correlation Matrices

Main Effect Tests

The power trends observed across different r matrices of the two-way RM ANOVA are in partial agreement with the hypotheses of this study. It was predicted, based on earlier pilot work, that power for a main effects test (e.g. B test) would increase as the Ave r among pooled trials (factor A) decreased. As shown from the simulation results of the two-way RM ANOVA, this was true only when the Ave r of the averaged-over RM factor was greater than that of the main effects factor (an 8-4 matrix in this example). When it was equal to or less than that of the main effects factor, power was no longer affected. The discrepancy between these results and findings from earlier pilot work can be traced to errors that were made in the pilot study when generating covariance matrices. Misleading information which resulted from these errors (r coefficients for AB pairs were entered incorrectly) was used in the formation of some of the hypotheses of this study. The reasons for seeing a change in the power of main effects tests with different r matrices were also somewhat different from those anticipated.
Since alterations in the r matrix are known to affect the error variance of an RM test (Winer et al., 1991), decreases in power for tests with different matrices are expected to be caused by an increase in the error term (denominator) of their F ratios. Although this was found to occur for all main effects tests (and interaction tests as well), what was surprising to observe, and contrary to results from pilot work, was a concomitant increase in the numerator of the F ratio for those tests with matrices resulting in lower power. This is demonstrated in Table 5.1, in which the mean and standard deviation of several statistics generated from Monte Carlo simulation are given for two-way RM designs with medium ES, n = 10 and a = .05. Under any one particular design (i.e. 3 x 3, 3 x 6 or 3 x 9) we see that the mean sum of squares among pooled trials of factor B and the mean square error for the B test (MS_B and MS_ERR.B, respectively) are lowest for those matrices showing the highest power and greatest for those demonstrating the least. The same holds true for the A main effect and AB interaction tests as well.

A possible reason for this occurrence can be explained by examining the expected mean squares model for two-way RM designs, as given by Howell (1992). In this model, the expected F ratios of main effects and interaction tests are as follows:

$$E(F_A) = \frac{E(MS_A)}{E(MS_{ERR.A})} = \frac{\sigma^2_e + b\sigma^2_{\alpha\pi} + nb\sigma^2_{\alpha}}{\sigma^2_e + b\sigma^2_{\alpha\pi}} \tag{5.01}$$

$$E(F_B) = \frac{E(MS_B)}{E(MS_{ERR.B})} = \frac{\sigma^2_e + a\sigma^2_{\beta\pi} + na\sigma^2_{\beta}}{\sigma^2_e + a\sigma^2_{\beta\pi}} \tag{5.02}$$

$$E(F_{AB}) = \frac{E(MS_{AB})}{E(MS_{ERR.AB})} = \frac{\sigma^2_e + \sigma^2_{\alpha\beta\pi} + n\sigma^2_{\alpha\beta}}{\sigma^2_e + \sigma^2_{\alpha\beta\pi}} \tag{5.03}$$

In all three equations, we see that the error term (e.g. $\sigma^2_e + a\sigma^2_{\beta\pi}$ for the B test) is included along with the treatment effect (e.g. $na\sigma^2_{\beta}$) in forming the numerator of the F ratio. Therefore, according to this model, changes occurring in the denominator term also bring about similar changes in the numerator with repeated replications of an ANOVA test. This explains why the results of this study showed an increase in both the numerator and denominator across different r matrices.
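The role of the error term in both halves of the expected F ratio can be illustrated numerically. The sketch below assumes the usual expected mean squares for a two-way design with both factors repeated (error components of the form error variance plus a subject-by-factor interaction component); the function name `expected_ms` and all variance-component values are hypothetical and are not taken from Table 5.1.

```python
def expected_ms(n, a, b, var_e, var_a_pi, var_b_pi, var_ab_pi,
                var_alpha, var_beta, var_ab):
    """Return {test: (E[MS numerator], E[MS error])} for an a x b RM design.

    Assumed model: each numerator equals its own error term plus the
    treatment component (n*b, n*a or n times the effect variance).
    """
    err_a = var_e + b * var_a_pi
    err_b = var_e + a * var_b_pi
    err_ab = var_e + var_ab_pi
    return {
        "A": (err_a + n * b * var_alpha, err_a),
        "B": (err_b + n * a * var_beta, err_b),
        "AB": (err_ab + n * var_ab, err_ab),
    }


# Hypothetical components: raising the B-by-subject component inflates both
# the numerator and the denominator of the B test by the same amount.
for var_b_pi in (0.1, 0.4):
    num, den = expected_ms(n=10, a=3, b=6, var_e=1.0, var_a_pi=0.2,
                           var_b_pi=var_b_pi, var_ab_pi=0.1,
                           var_alpha=0.05, var_beta=0.05, var_ab=0.02)["B"]
    print(var_b_pi, round(num, 2), round(den, 2), round(num - den, 2))
```

In this sketch the difference between numerator and denominator is the treatment component alone, so it stays fixed while both terms rise together, which is the pattern the expected mean squares model predicts.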
Further evidence of an error variance contribution to the numerator term can be confirmed by subtracting the MS_ERR of any given test condition in Table 5.1 from its corresponding numerator MS value. Such a subtraction will result in approximately equal trials effect totals ($na\sigma^2_{\beta}$ for the B test) across all matrices of a test.

[Table 5.1. Mean and standard deviation of the statistics generated from Monte Carlo simulation (numerator MS, MS_ERR, F and power) for the A, B and AB tests of 3 x 3, 3 x 6 and 3 x 9 two-way RM designs under the 4-4, 4-8, 8-4 and 8-8 correlation matrices; medium ES, n = 10, a = .05.]

Interestingly, results from our initial pilot project did not show an increase in the numerator of F, instead producing constant numerator MS values across different r matrices. The reason for this is uncertain, but it may have been due to the fact that only a single replication (one ANOVA test) was performed instead of many in the pilot study. With one replication, the contribution of error variance to the numerator of F may not be observable since error due to random sampling is absent.

With changes in the numerator variance accounted for, it therefore seems evident that differences in the power of tests with different r matrices are the result of alterations in MS_ERR alone. Referring back to Table 5.1, we see that for main effect tests, those matrices associated with lower power, regardless of design, produce higher MS_ERR values. For the A test, conditions involving an 8-4 or 8-8 matrix produced an equally low MS_ERR, while those with a 4-4 matrix resulted in a greater MS_ERR and tests having a 4-8 matrix produced the largest MS_ERR. The B test showed similar results with the exception that the 8-4 matrix and 4-8 matrix reversed their rank order. These findings seem to suggest that MS_ERR is dependent on the magnitude of the Ave r of both A and B pooled trials, but not necessarily in every case. When the Ave r among trials of the main effects factor is equal to or greater than that of the averaged-over factor, MS_ERR seems only affected by the magnitude of the Ave r of the main effects factor and unaltered by the magnitude of the Ave r of the pooled-over factor. In contrast, when the Ave r among trials of the main effects factor is less than that of the other factor, the magnitude of the Ave r of the pooled factor seems to also play an important role in influencing MS_ERR.
In this case, the influence on MS_ERR is somewhat similar to that observed when the Ave r of the RM factor is increased in a groups effect test of a two-way mixed ANOVA (see chapter 4).

Interaction

The results observed across the four matrices for the interaction test agreed with the researcher's initial hypothesis that power would increase as the Ave r among trials of both factors increased (i.e. from 4-4 to 8-8 matrix). Of surprise is the finding that an increase in only one, and not both, factors' Ave r was necessary to achieve an increase in power equal in magnitude to that obtained for the 8-8 matrix. Table 5.1 shows that the greater power and larger F ratios of AB tests with 4-8, 8-4 & 8-8 matrices over a test with a 4-4 matrix are due entirely to a reduction in MS_ERR.AB (MS_AB is lower also, but for the same reasons outlined earlier). In addition, MS_ERR.AB values for those matrices resulting in equal power are all similar. This seems to suggest the error variance of AB tests is affected solely by the magnitude of the Ave r of the pooled matrix (A or B) having the highest average correlation among its trials. Thus, when the Ave r of either the A or B pooled matrix is equal to .8, error variance remains the same as when the entire AB matrix is equal to .8. Only when both factors' pooled matrices fall below an Ave r of .8 is error variance increased, thereby causing a decrease in power. Evidently, the overall Ave r of the AB matrix does not seem to be a determining factor.

Power Comparisons Across Designs With Varying K

Although there was a tendency for all three tests of the two-way RM ANOVA to show an increase in power as K_B increased, especially at larger effect sizes, reasons for the increase were different among the separate tests according to how numerator and denominator terms of respective F ratios were affected.
Main Effect of A

For the A main effect test, the increase in power (and F) across designs agreed with the researcher's original hypothesis that power for a main effect test would increase as the number of levels of the pooled factor increased. Under matrices 4-4, 8-4 and 8-8, the increase seems entirely the result of an accompanying increase in mean MS_A, since the mean MS_ERR.A of these matrices remained constant across designs (see Table 5.1). The increase in MS_A itself is a direct consequence of the number of trials of factor B (i.e. K_B), which in the numerator expression of eqn. 5.01 is depicted by variable b. As evidenced by this equation, an increase in b elevates MS_A, resulting in a bigger F value and thus power for the A test across designs. Interestingly, an exception exists for the condition involving a 4-8 matrix. Under this matrix, not only does the numerator increase but so does its denominator (MS_ERR.A), which is somewhat puzzling considering other matrices did not demonstrate this. Referring to eqn. 5.01 again, it may seem apparent that the increase in MS_ERR.A is due to the presence of b in the denominator as well. However, this does not explain why A tests involving the other three matrices maintain a constant MS_ERR.A as b increases. The reason for this finding therefore remains unknown, but it seems evident that the effect only occurs when Ave r_A is below Ave r_B (4-8 matrix).

Main Effect of B

For the B main effects test, the alterations in numerator and denominator terms with increasing K_B are similar to those observed in the one-way RM design under a constant r matrix pattern. That is, a corresponding decrease in mean F values with larger K_B is due to a reduction in mean MS_B alone since mean MS_ERR.B remains constant (see Table 5.1).
Under this test, the decrease in MS_B across designs with higher K_B is caused by a greater number of degrees of freedom (df) in the numerator term (df_3x3 = 2; df_3x6 = 5; df_3x9 = 8) which offsets any expected increase in their respective sum of squares trials (SS_B). Unlike the A test, this pattern among the mean statistics of the B test is the same for all matrices involved. Even when the Ave r of B is below that of A (8-4 matrix) we see that MS_ERR.B remains constant (1.39) across designs. This is because the number of repeated measures of factor A, depicted by a in eqn. 5.02, does not change across designs as does b in eqn. 5.01 for the A test.

Despite the reduction in F across designs, the tendency for power of the B test under medium and large effect sizes to increase slightly as K_B increased is due to an accompanying decrease in the variability of the F ratio (SD_FB). As explained earlier for the one-way RM model under conditions of sphericity, designs with larger K_B have a greater concentration of F values along their critical F borders, leading to greater potential for statistical significance and therefore higher power when experimental conditions are sufficiently favorable.

AB Interaction

The changes in the mean statistics of the interaction test (MS_AB, MS_ERR.AB, F_AB, SD_FAB) as K_B increases from 3 to 9 are similar to those seen in the B main effects test. For any given matrix condition, as K_B increases, MS_AB, F_AB and SD_FAB decrease while MS_ERR.AB remains constant. The decline in MS_AB is, again, the result of a greater number of numerator df overshadowing the expected increase in SS_AB (although the decline is very slight between 3 x 6 and 3 x 9 designs). MS_ERR.AB remains constant across designs since, according to eqn. 5.03, neither variable a nor b is an expression in the denominator term and should therefore not affect error variance of the AB test. The decrease in F_AB as K_B increases is therefore due entirely to a decrease in MS_AB.

The general trend for power to increase as K_B increases among more favorable conditions of the AB test, as described in the results section, is unfortunately not as evident in Table 5.1 since the power associated with those conditions shown is relatively low. For these conditions, the decrease in variability of F_AB (SD_FAB) across larger K_B, unlike that of the B test, does not provide a sufficient power advantage in favor of designs with greater K_B. Again, the reason for this can be attributed to the area of the F distribution bordering the critical point of each design. Those designs with a higher K_B have a greater concentration of values over a shorter F interval compared with those having fewer K_B. Therefore any condition (e.g. a change in ES) that causes a more highly concentrated distribution to shift left or right from its F critical value will bring about a greater change in the design's power when compared to a less concentrated distribution. Under less than optimal experimental conditions (low power), designs with more concentrated distributions (those with more K_B) may result in equal if not less power than designs having less concentrated distributions (those with fewer repeated measures). This not only clarifies why power for the AB test was relatively constant across designs with different K_B under the conditions shown in Table 5.1 (medium ES, small n), but also helps explain why the general trend for power to increase as K_B increases was not universal among other conditions and tests (B main effect) exhibiting low to moderate power.

The simulation results observed for the interaction test coincided only partly with the original hypotheses that power for the AB test across designs would be dependent on both a factor's number of RM and the magnitude of its average r. The findings of this study revealed that a factor's Ave r did not have an influential effect on power of the AB test as K_B increased.
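The degrees-of-freedom argument used for the B and AB tests can be checked with a few lines of code. This is a minimal sketch assuming the standard partition for a design in which both factors are repeated, where each effect is tested against its own effect-by-subjects error term; the function name `two_way_rm_df` is hypothetical, and n = 10 matches the conditions described for Table 5.1.

```python
def two_way_rm_df(n, a, b):
    """Return {test: (numerator df, error df)} for an a x b RM design.

    Assumes each effect is tested against its own effect-by-subjects
    error term, so error df = numerator df * (n - 1).
    """
    effects = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1)}
    return {test: (df, df * (n - 1)) for test, df in effects.items()}


# Numerator df for the B test grows from 2 to 5 to 8 as K_B goes 3, 6, 9,
# while the numerator df for the A test stays at 2.
for b in (3, 6, 9):
    print(f"3 x {b}:", two_way_rm_df(n=10, a=3, b=b))
```

Because a mean square is a sum of squares divided by its df, a sum of squares that grows more slowly than b - 1 (or (a - 1)(b - 1) for the interaction) yields a smaller mean square as K_B increases.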
Power Comparisons Between Tests: Main Effects and Interaction

The power differences between tests of the two-way RM ANOVA described in the results section and illustrated in Figure 4.09 can be explained by referring again to Table 5.1. Among the mean statistics shown, it is clear that power and F values for the interaction test under all matrices and designs given are smaller than those for main effects tests due entirely to a reduced numerator mean square trials (MS_AB < MS_A and MS_B). The smaller MS_AB is attributed to a smaller sum of squares (SS_AB) and greater numerator df. Again, it should be noted that these differences between main effect and interaction tests' statistics (particularly the numerator mean squares) are dependent on the definitions of ES (d) used in this study.

Comparing power between main effects tests, several observations require explanation. One is why power for a 3 x 3 design under a homogeneous matrix (4-4 or 8-8) is almost always slightly greater for the B test even though both A and B tests have an equal number of RM and identical r matrices.[7] Examination of these tests' mean statistics reveals that, under such conditions, their values are quite similar. Slightly larger numerator MS values appear to exist for the B test, which may perhaps be responsible for the somewhat larger mean F ratios of B seen, since MS_ERR between the two tests remained constant. For a simulation study, however, such small differences are negligible and therefore do not provide convincing evidence for the existence of a power difference between tests.

[7] The condition in Table 5.1 is a rather poor example since power between these tests under the 8-8 matrix is equal. However, this is an exception since under most other experimental conditions, B clearly shows an advantage (refer back to the top graph in Figure 4.09 or the appropriate power tables for better examples).
Thus, there seems to be no theoretical reason why, under these conditions, F and power should differ between the two tests. Although only speculative, perhaps the results are reflective of a bias in the simulation process which favors one test (in this case, B) over the other. A clear explanation for this finding remains absent.

In contrast to the uncertainty in the 3 x 3 design, the differences in power between main effect tests observed among the remaining designs (3 x 6 and 3 x 9) under the same conditions are more easily interpretable. Under homogeneous matrices (4-4, 8-8), the larger power and F values for the A test among these designs are clearly due to a greater MS_A, which results from having a larger number of RM levels (K_B or b) in the numerator term (compare equations 5.01 and 5.02). Interestingly, the variability of tests' numerator MS and MS_ERR (SD_MS and SD_MSERR, respectively) under homogeneous matrices also seems affected by an increase in K_B. However, these do not appear to be directly responsible for the differences in power seen between the two tests (with the exception of SD_MSERR, a change in the variability of a particular statistic mirrors a similar change in its corresponding mean value).

Another observation requiring elaboration is the alternating power advantage that occurs between main effects tests under heterogeneous matrices (4-8 and 8-4). For all designs in which the main effects test has a smaller Ave r across its trials than the other variable present (e.g. A test under 4-8 matrix or B test under 8-4 matrix), MS_ERR will be larger than that of the other main effects test under the same r matrix. The numerator MS is also larger, but not in every case.
Therefore, in contrast to the explanation given under homogeneous matrices, differences in power between tests under these matrices are due mostly to changes in MS_ERR. Comparing power between the A test under a 4-8 matrix and the B test under an 8-4 structure, the higher F values and the slight power advantage the A test generates as K_B increases are the result of having a larger number of RM in its analytical functions (compare b in eqn. 5.01 with a in eqn. 5.02). The same also applies when comparing values between an A test under an 8-4 matrix and a B test under a 4-8 matrix.

Noncentrality Parameter For Tests of Two-Way RM ANOVA

The results of this simulation study provide helpful information for identifying some of the analytical expressions involved in the computation of a noncentrality parameter (and thus power) for tests of the two-way RM ANOVA. For the numerator terms of the main effects and interactions, it appears the analytical function for each test is identical to those given in equations 2.01, 2.02 and 2.03 for the two-way mixed model. The denominator term, however, appears to have its own unique expression for the different tests of the two-way RM design, as explained in the previous sections of this chapter.

For main effect tests, it seems that the denominator of the noncentrality parameter involves a variable for each Ave r of the pooled matrices. The relationship between these variables and error variance is such that one variable, the Ave r of the main effects factor, acts to reduce the error variance as it increases in magnitude, whereas the pooled factor causes error variance to increase as the magnitude of its average r and/or number of RM levels increases.
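The opposing influences of the two average correlations can be illustrated under one explicit set of assumptions. The sketch below is not the author's derivation: it assumes unit error variance and a doubly compound-symmetric structure with three correlations (`rho_own` between levels of the tested factor, `rho_other` between levels of the pooled factor, and `rho_cross` between scores differing on both factors), for which the expected error mean squares follow from the variances of subject means. The function names and the choice `rho_cross = min(rho_a, rho_b)` are hypothetical.

```python
def error_variance_main(rho_own, rho_other, rho_cross, levels_other, sigma2=1.0):
    """Expected error MS for a main effect under the assumed structure."""
    return sigma2 * ((1.0 - rho_own) + (levels_other - 1) * (rho_other - rho_cross))


def error_variance_interaction(rho_a, rho_b, rho_cross, sigma2=1.0):
    """Expected error MS for the A x B interaction under the same structure."""
    return sigma2 * (1.0 - rho_a - rho_b + rho_cross)


def noncentrality(n, levels_other, effect_means, error_variance):
    """Effect sum of squares over error variance for a main effect."""
    grand = sum(effect_means) / len(effect_means)
    ss = n * levels_other * sum((m - grand) ** 2 for m in effect_means)
    return ss / error_variance


# Matrices named as in the text: first value is Ave r of A, second of B.
for rho_a, rho_b in ((0.4, 0.4), (0.4, 0.8), (0.8, 0.4), (0.8, 0.8)):
    cross = min(rho_a, rho_b)  # hypothetical choice for the AB correlations
    for b in (3, 6, 9):
        err_a = error_variance_main(rho_a, rho_b, cross, levels_other=b)
        err_ab = error_variance_interaction(rho_a, rho_b, cross)
        print(rho_a, rho_b, b, round(err_a, 2), round(err_ab, 2))
```

With these assumed values the A test error stays flat across b for the 4-4, 8-4 and 8-8 matrices and rises with b only for the 4-8 matrix, and the B test under an 8-4 matrix with a = 3 gives 1.4, close to the constant 1.39 reported for MS_ERR.B. This agreement is suggestive only, since the correlations among AB pairs used in the simulations are not stated here.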
These effects are comparable to those expressed in equations 2.19 ($\sigma^2(1-\rho)$) and 2.20 ($\sigma^2[1+(q-1)\rho]$), with the former exhibiting a similarity to the trials main effect of the two-way mixed model while the latter resembles the relationship expressed by the randomized groups main effect test of the same model. How exactly these two expressions interrelate to produce a resultant effect on error variance in the two-way RM model is still uncertain, but it appears any increase in MS_ERR (i.e. that part of the equation expressed by $\sigma^2[1+(q-1)\rho]$) is not noticeable until either both factors' Ave r decrease or the Ave r of the pooled factor becomes greater than that of the main effects factor. It may be that the error variance is also affected by a third correlation variable, the Ave r of the AB coefficients, which has not been thoroughly examined as of yet.

For the interaction test, it appears that the error variance is affected by only one correlation variable, that being the highest average r among the two pooled r matrices. The denominator expression therefore is similar to eqn. 2.19.

Based on this information, equations for estimating the noncentrality parameter of each test of the two-way RM design may be partly derived. The noncentrality parameter formulae might resemble the following functions,

$$\lambda_A = \frac{nb\sum_i(\mu_{i.}-\mu_{..})^2}{\sigma^2(1-\rho_A)+(b-1)(\rho_B-\rho_{AB})\sigma^2} \quad \text{for a main effects test of factor A} \tag{5.04}$$

$$\lambda_B = \frac{na\sum_j(\mu_{.j}-\mu_{..})^2}{\sigma^2(1-\rho_B)+(a-1)(\rho_A-\rho_{AB})\sigma^2} \quad \text{for a main effects test of factor B} \tag{5.05}$$

$$\lambda_{AB} = \frac{n\sum_i\sum_j(\mu_{ij}-\mu_{i.}-\mu_{.j}+\mu_{..})^2}{\sigma^2(1-\rho_{MAX})} \quad \text{for an A by B interaction test} \tag{5.06}$$

where
$\mu_{ij}$ = the cell mean,
$\mu_{i.}$ and $\mu_{.j}$ = the marginal means for the levels of the A and B factors respectively,
$\mu_{..}$ = the grand mean,
n = the sample size,
a and b = the number of levels of A and B factors respectively,
$\rho_A$ and $\rho_B$ = the average of the off-diagonal correlation coefficients of A and B pooled matrices respectively,
$\rho_{AB}$ = the average of the correlation coefficients of all AB pairs of the AB matrix,
$\rho_{MAX}$ = the highest average correlation among the two pooled matrices, and
$\sigma^2$ = the error variance of the dependent variable involved.

The accuracy of these equations is, of course, unknown at this point in time. Further testing and comparison of calculated values with simulated ones would be required in order to ascertain their reliability as well as decipher the currently unresolved relationship(s) between variables affecting the error variance of these different tests.

Chapter Six

Summary and Conclusions

The primary objective of this study was to provide researchers in the field of Human Kinetics with a more practical means of determining univariate power for one- and two-way repeated measures and two-way mixed ANOVA designs. This was accomplished by first generating power values using analytical and Monte Carlo simulation methods for varying conditions of sample size, effect size, and magnitudes and patterns of correlation, and then making these estimates available in the form of power tables. A secondary purpose of this investigation was to examine and interpret those power trends less well known among designs failing to meet the assumption of sphericity or involving two RM factors. From the results of this study the following conclusions were drawn.

In general:

When other conditions were held constant, an increase in either sample size, effect size, alpha or the average correlation (Ave r) across repeated trials resulted in a concomitant increase in power for most tests and designs of this study.
An exception included power for the group main effect test of the two-way mixed ANOVA, which decreased as the Ave r among repeated measures increased. The effects on power from these statistical parameters had been predicted and were based on changes in the numerator and/or denominator terms of F and $\lambda$.

With respect to power under conditions of nonsphericity:

When power for a one-way RM design under a constant r matrix pattern (C) was low (<.20), power for the same test under a trend r matrix pattern (T) was found to be greater by up to .08. In contrast, when power was moderate to high (.50-.90) under C, power for the same test under T was found to be lower by up to .18. In addition, the degree to which power was greater or less under T increased as the number of repeated measures (RM) in a design increased. Depending on the effect size involved, a design displayed about equal power under C and T when values were between .20 and .40. The reason for the discrepancy in power under conditions of T was attributed to an increase in the variability of the F ratio or, more specifically, MS_K and MS_ERR.

Regarding power as the number of RM (K) increased across designs:

1. Under conditions of sphericity, there was a common pattern for power to increase as K increased among all tests examined. However, the increase was mostly observable at medium to large effect sizes and among one-way RM and main effects tests of two-way designs. The largest differences in power occurred between designs with 3 and 9 RM, ranging from .15 to .58. In most cases, the power increase resulted from either a reduction in the variability of F or an increase in MS_K as K became larger. It was noted that the extent of this power trend was dependent on the definition of effect size (d) used in this study.

2. Under conditions of nonsphericity, the tendency for power to increase as K increased was mainly observable at small effect sizes or when alpha was low (.01).
At larger effect sizes and alpha, the trend tended to reverse itself, with power becoming equal to or greater for designs with fewer K. Reasons for this were attributed to the lower concentration of F values occurring at level of significance for designs with fewer K, which at small effect sizes (low power) served as a disadvantage but at large effect sizes (high power) provided a power advantage over designs with more RM.

Regarding power differences between tests of a two-way design:

1. For most experimental conditions of the two-way mixed model, the trials main effect test showed the greatest power of the three tests involved, followed by the groups main effect test, which exhibited more power than the interaction test but only when the Ave r among repeated trials was low (.4). The reduction in power for the groups test as Ave r increased was, as expected, caused by an increase in the within-group variability or error variance associated with the effect.

2. For main effect tests of the two-way RM model, under homogeneous r matrices the test with a higher number of "pooled" RM (i.e. A test) exhibited greater power than the other main effect test (i.e. B) except when the number of RM of both factors involved were equal, in which case the B test, for reasons unknown, was slightly favored. Under heterogeneous r matrices, power was greatest for the main effect test having the highest Ave r among its trials. Differences in MS_K were found to be responsible for the power order observed among tests under homogeneous r structures, while changes in MS_ERR and, to a lesser extent, MS_K were accountable for those seen under heterogeneous matrices.

For the interaction test, power was found to be the least among the three tests for almost all conditions examined. However, the lowered power observed was recognized as being largely dependent on the way in which the interaction test's effect size, d, was determined in this study.
Regarding power among tests with different correlation matrices in the two-way RM ANOVA:

1. For main effect tests, power was found to be greatest when the Ave r among trials of the main effect factor was high (regardless of whether the Ave r of the pooled factor was equal to or below it) and lowest when its Ave r fell below that of the pooled factor. The changes in MS_ERR responsible for these differences in power across matrices seemed to be dependent on the magnitude of both factors' Ave r and the number of RM involved.

2. For the interaction test, power was found to be greatest among those matrices in which at least one factor had an Ave r across its trials equal to .80. In addition, the Ave r of the overall (AB) matrix was found to be a less influential variable in affecting the power of the interaction test than the Ave r of either pooled (A or B) matrix. The changes in MS_ERR observed appeared to be entirely dependent on the highest Ave r among the pooled matrices.

In addition to providing a detailed interpretation and discussion of results obtained, findings from the two-way RM ANOVA simulations were used to construct preliminary analytical expressions of $\lambda$ for each test of the design.

From the results of this study and in consideration of future work in this area, it is recommended that:

> the effects of nonspherical r matrices on power be examined for the two-way mixed and RM models as well, in order to determine whether the findings observed under the one-way RM model are similar across the different RM tests of these designs.

> power under many more r structures and levels of RM of the two-way RM ANOVA model be determined in order to identify the exact relationship between these variables and the error variance of a particular test, so that specific functions of noncentrality parameters involved can be derived and validated.
> the information derived from this and follow-up studies be used to create a user-friendly computer program capable of providing power estimates for designs with more than one RM factor and heterogeneous r structures.
Appendix 3.0
FORTRAN Program For Calculating Noncentrality Parameter and Effect Sizes (d & f)

C
C     By Patrick 'POWER-SEEKER' Potvin        Developed Fall '95.
C
C*******Limitations of Program ************************************
C*                                                                *
C* - Cohen's d calculations for interaction effect               *
C*   for designs with 2 or 3 groups are performed using          *
C*   formulae created by this author. Cohen's d values given     *
C*   by this program for designs with more than 3 groups are     *
C*   not correct since they are only based on 3 groups.          *
C*                                                                *
C* - Calculations assume equal n per group (program does not     *
C*   calculate weighted means or variances)                      *
C*                                                                *
C* - number of RM levels should not exceed 9!!                   *
C*                                                                *
C******** Preparing to Execute this program ***********************
C     In 'datafile', do the following:
C     - Line 1: give a title to your program run (< 60 characters).
C     - Line 2: enter the # of levels for each factor as RG x RM.
C               col's 1-2 for RG, col's 4-5 for RM.
C     - Line 3: enter the format your means and SD's are arranged.
C               for example, (4f7.2) - make sure you include the
C               parentheses!
C     - Line 4: enter the pattern of the distribution of your means.
C               either as 'Even', 'Centered' or 'At ends'.
C               Start in column 1. Do not exceed 8 characters.
C     - Line 5 & +:
C               enter the cell means per group (RM means).
C               format should be as 'f7.2' and # of means not over 9.
C               one line of means per group. Start in col 1.
C     - Lines thereafter:
C               enter the Cell SD's per group, allowing only 1 group
C               of SD's per line. Format same as for means.
C
C     In the 'NCP.f' program (this one), do the following:
C     - Optional: you can change the R and n values as you wish by
C       changing the default settings (set at 0).
C
C     To compile the program, type:
C               f77 ncp.f -o luck
C     (compiled program saved under 'luck').
C     To execute program, type:
C               luck
C     at the unixg command prompt. To view results, type:
C               vi output
C
C*****DEFINITIONS*****
C     P                      number of levels for randomized group factor
C     Q                      number of levels for repeated measures factor
C     n                      sample size per group
C     U                      population cell mean
C     SD                     population cell standard deviation
C     R                      average correlation across RM trials
C     GPncp                  noncentrality parameter for group effect
C     RMncp                  noncentrality parameter for RM effect
C     INTncp                 noncentrality parameter for Interaction effect
C     GPf                    cohen's f for group effect
C     RMf                    cohen's f for RM effect
C     INTf                   cohen's f for interaction effect
C     GPd, RMd, INTd         cohen's d per effect
C     GPdf, RMdf, INTdf      degrees of freedom for each effect
C     Errdf1, Errdf2         degrees of freedom for between group &
C                            within group error terms, respectively
C     GSUM, RSUM, CSUM       sum of means for grand, rows and columns
C     GMEAN, RMEAN, CMEAN    grand mean, row and column marginal means
C     VARSUM, VARMEAN        sum and grand mean of variances (SD's)
C     GPMAX, RMMAX, DIFFMAX  Maximum group and RM marginal means
C                            Maximum difference of differences
C     GPMIN, RMMIN, DIFFMIN  Minimum group and RM marginal means
C                            Minimum difference of differences
C     DIFF??(Q)              Difference between group cell means at
C                            each level of Q
C     PATT                   Pattern or distribution of cell means
C                            Either: 'Even', 'At Ends' or 'Centered'
C     TITLE                  Title you give to each program run.
C     datafile               datafile where program retrieves data
C     output                 file where program outputs results
C
C*****VARIABLE DECLARATIONS*****
      INTEGER P, Q, n, A, B
      REAL*8 U(5,20), SD(5,20), R,
     +       GSUM, RSUM(5), CSUM(20), VARSUM,
     +       GMEAN, RMEAN(5), CMEAN(20), VARMEAN,
     +       GPSUM2, RMSUM2, INTSUM2,
     +       GPd, GPf, GPncp, RMd, RMf, RMncp, INTd, INTf, INTncp,
     +       GPMAX, GPMIN, RMMAX, RMMIN,
     +       DIFFAB(20), DIFFBC(20), DIFFAC(20),
     +       INTAB, INTBC, INTAC,
     +       ABMAX, ABMIN, BCMAX, BCMIN, ACMAX, ACMIN,
     +       INTMAX, GPdf, Errdf1, RMdf, Errdf2, INTdf
      CHARACTER*15 PATT
      CHARACTER*70 FMT1, TITLE
C
C     WRITE (FMT1,12) '(',Q,'(F7.2))'
C12   FORMAT (A1,I2,A17)
C
C***READ mean's and sd's factor levels and design pattern FROM DATAFILE*
C     NOTE: the header comments call the input file 'datafile' and the
C     results file 'output'; edit the FILE= names below to match your run.
      OPEN (UNIT=1, FILE='2x3.dat', STATUS='OLD',
     +      ACCESS='SEQUENTIAL', FORM='FORMATTED')
      READ (1,6) TITLE
      READ (1,7) P, Q
      READ (1,12) FMT1
      READ (1,8) PATT
      READ (1,FMT1) ((U(I,J), J=1,Q), I=1,P)
      READ (1,FMT1) ((SD(I,J), J=1,Q), I=1,P)
 6    FORMAT (A60)
 7    FORMAT (I2,1X,I2)
 8    FORMAT (A8)
 12   FORMAT (A60)
      CLOSE(1)
C
C*****Start Calculations for each r value and sample size*****
      OPEN (UNIT=2, FILE='datafile', STATUS='NEW', FORM='FORMATTED')
C     ***Set R to zero***
      R = 0.00
C     ***Write Title of Execution e.g. Effect Size***
      WRITE (2,14) '**** ', TITLE, ' ****'
      WRITE (2,16) ' '
 14   FORMAT (A5,A60,A5)
 16   FORMAT (A60)
      DO 5 A=1,5
C     ***Change R values on each loop to .20, .40, .60, .80, 1.00***
C     (R = 1.00 is replaced by .99 so that RMerr is never zero)
      R = R + 0.20
      IF (R .GE. 1.0) THEN
         R = 0.99
      ENDIF
C     ***Write heading for correlation to output file***
      WRITE (2,9) 'Average Correlation Used = ', R
      WRITE (2,9) ' '
 9    FORMAT (A30,F4.2)
C     ***Set sample size to 0***
      n = 0
      DO 10 B=1,6
C     ***Change sample size values to 5,10,15,20,25,30 with each loop*
      n = n + 5
C     ***Write heading for sample size to output file***
      WRITE (2,13) 'Sample Size Used = ', n
      WRITE (2,13) ' '
 13   FORMAT (9X,A21,I2)
C
C     Calculate Grand Mean and Marginal Means
C     First set all means & sums to zero!
      GSUM = 0
      GMEAN = 0
C     Then convert P, Q & n integers to real numbers!
      RP = P
      RQ = Q
      Rn = n
C     Print *, RP, RQ, Rn
      DO 15 I=1,P
         RMEAN(I) = 0.0
         RSUM(I) = 0.0
 15   CONTINUE
      DO 20 J=1,Q
         CMEAN(J) = 0
         CSUM(J) = 0
 20   CONTINUE
C     Then do calculations
      DO 25 I=1,P
         DO 30 J=1,Q
            RSUM(I) = RSUM(I) + U(I,J)
            CSUM(J) = CSUM(J) + U(I,J)
            GSUM = GSUM + U(I,J)
 30      CONTINUE
 25   CONTINUE
      GMEAN = GSUM/(RP*RQ)
C     PRINT *, GSUM, GMEAN
      DO 32 I=1,P
         RMEAN(I) = RSUM(I)/RQ
 32   CONTINUE
      DO 33 J=1,Q
         CMEAN(J) = CSUM(J)/RP
 33   CONTINUE
C     DO 17 I=1,P
C     DO 18 J=1,Q
C     PRINT *, CMEAN(J), CSUM(J)
C     PRINT *, RMEAN(I), RSUM(I)
C18   CONTINUE
C17   CONTINUE
C
C     Calculate Mean Variance from Standard Deviation in datafile****
C     First set variance mean & sum to zero!
      VARMEAN = 0
      VARSUM = 0
C     Then do calculations
      DO 35 I=1,P
         DO 40 J=1,Q
            VARSUM = VARSUM + (SD(I,J)**2)
 40      CONTINUE
 35   CONTINUE
      VARMEAN = VARSUM/(RP*RQ)
C*****Write Variance sum to screen - error checking*****
C     PRINT *, VARSUM
C
C***** Calculate the Mean deviations for Group, RM & Inter. effects
      GPSUM2 = 0
      RMSUM2 = 0
      INTSUM2 = 0
      DO 45 I=1,P
         GPSUM2 = GPSUM2 + (RMEAN(I) - GMEAN)**2
 45   CONTINUE
      DO 50 J=1,Q
         RMSUM2 = RMSUM2 + (CMEAN(J) - GMEAN)**2
 50   CONTINUE
      DO 55 I=1,P
         DO 60 J=1,Q
            INTSUM2 = INTSUM2 + (U(I,J) - RMEAN(I)-CMEAN(J)+GMEAN)**2
 60      CONTINUE
 55   CONTINUE
C     Print *, GPSUM2, RMSUM2, INTSUM2
C
C******Calculate MS error for each design effect*******
      GPerr = VARMEAN*(1.0 + ((RQ-1.0)*R))
      RMerr = VARMEAN*(1.0-R)
C     Print *, VARMEAN, GPerr, RMerr
      Print 77, R, n
 77   FORMAT (2X,F4.2,2X,I2)
C
C******Calculate NCP for each effect*******************
      GPncp = Rn*RQ*GPSUM2/GPerr
      RMncp = Rn*RP*RMSUM2/RMerr
      INTncp = Rn*INTSUM2/RMerr
C     Print *, GPncp, RMncp, INTncp
C     Print *, INTSUM2
C
C******Calculate Cohen's f for each effect*****
      GPf = SQRT(GPSUM2/(RP*GPerr))
      RMf = SQRT(RMSUM2/(RQ*RMerr))
      INTf = SQRT(INTSUM2/((((RP-1)*(RQ-1))+1)*RMerr))
C     Print *, GPf, RMf, INTf
C
C******Calculate Cohen's d for each effect*****
C     First find largest and smallest means per effect
      GPMAX = -999999.0
      DO 65 I=1,P
         IF (RMEAN(I) .GT. GPMAX) THEN
            GPMAX = RMEAN(I)
         ENDIF
 65   CONTINUE
      GPMIN = 999999.0
      DO 70 I=1,P
         IF (RMEAN(I) .LT. GPMIN) THEN
            GPMIN = RMEAN(I)
         ENDIF
 70   CONTINUE
      RMMAX = -999999.0
      DO 75 J=1,Q
         IF (CMEAN(J) .GT. RMMAX) THEN
            RMMAX = CMEAN(J)
         ENDIF
 75   CONTINUE
      RMMIN = 999999.0
      DO 80 J=1,Q
         IF (CMEAN(J) .LT. RMMIN) THEN
            RMMIN = CMEAN(J)
         ENDIF
 80   CONTINUE
C     Differences between group cell means at each RM level.
C     For P > 3 only the first three groups are used (see Limitations).
      DO 85 J=1,Q
         IF (P .GE. 3) THEN
            DIFFAB(J) = U(1,J) - U(2,J)
            DIFFBC(J) = U(2,J) - U(3,J)
            DIFFAC(J) = U(1,J) - U(3,J)
         ELSE IF (P .LT. 3) THEN
            DIFFAB(J) = U(1,J) - U(2,J)
            DIFFBC(J) = 0
            DIFFAC(J) = 0
         ENDIF
 85   CONTINUE
      ABMAX = -999999.0
      DO 90 J=1,Q
         IF (DIFFAB(J) .GT. ABMAX) THEN
            ABMAX = DIFFAB(J)
         ENDIF
 90   CONTINUE
      ABMIN = 999999.0
      DO 95 J=1,Q
         IF (DIFFAB(J) .LT. ABMIN) THEN
            ABMIN = DIFFAB(J)
         ENDIF
 95   CONTINUE
      INTAB = ABS(ABMAX - ABMIN)
      BCMAX = -999999.0
      DO 91 J=1,Q
         IF (DIFFBC(J) .GT. BCMAX) THEN
            BCMAX = DIFFBC(J)
         ENDIF
 91   CONTINUE
      BCMIN = 999999.0
      DO 96 J=1,Q
         IF (DIFFBC(J) .LT. BCMIN) THEN
            BCMIN = DIFFBC(J)
         ENDIF
 96   CONTINUE
      INTBC = ABS(BCMAX - BCMIN)
      ACMAX = -999999.0
      DO 92 J=1,Q
         IF (DIFFAC(J) .GT. ACMAX) THEN
            ACMAX = DIFFAC(J)
         ENDIF
 92   CONTINUE
      ACMIN = 999999.0
      DO 97 J=1,Q
         IF (DIFFAC(J) .LT. ACMIN) THEN
            ACMIN = DIFFAC(J)
         ENDIF
 97   CONTINUE
      INTAC = ABS(ACMAX - ACMIN)
      INTMAX = MAX(INTAB,INTBC,INTAC)
C     Print *, GPMAX, GPMIN
C     Print *, RMMAX, RMMIN
C     Print *, ABMAX,ABMIN,INTAB
C     Print *, BCMAX,BCMIN,INTBC
C     Print *, ACMAX,ACMIN,INTAC
C     Print *, INTMAX
C     Then calculate Cohen's d!
      GPd = (GPMAX - GPMIN)/SQRT(VARMEAN)
      RMd = (RMMAX - RMMIN)/SQRT(VARMEAN)
      INTd = INTMAX/SQRT(VARMEAN)
C     Print *, GPd, RMd, INTd
C
C******Calculate degrees of freedom for each effect******
      GPdf   = RP - 1.0
      Errdf1 = RP*(Rn - 1.0)
      RMdf   = RQ - 1.0
      Errdf2 = RP*(RQ - 1.0)*(Rn - 1.0)
      INTdf  = (RP - 1.0)*(RQ - 1.0)
C
C******WRITE Cohen's d,f & NCP values and df's to output file********
      WRITE (2,11) 'DESIGN','PATTERN','EFFECT','DF','d','f','NCP'
 11   FORMAT (2X,A6,2X,A7,3X,A6,1X,A5,3(2X,A11))
      IF (P .EQ. 1) THEN
         WRITE (2,21) P,' x ', Q, PATT, 'RM',RMdf,RMd,RMf,RMncp
         WRITE (2,31) 'RMerr', Errdf2,'--','--','--'
      ELSE IF (Q .EQ. 1) THEN
         WRITE (2,21) P,' x ', Q, PATT, 'GROUP',GPdf,GPd,GPf,GPncp
         WRITE (2,31) 'GPerr', Errdf1,'--','--','--'
      ELSE
         WRITE (2,21) P,' x ', Q, PATT, 'GROUP',GPdf,GPd,GPf,GPncp
         WRITE (2,21) P,' x ', Q, PATT, 'RM',RMdf,RMd,RMf,RMncp
         WRITE (2,21) P,' x ', Q, PATT, 'INT',INTdf,INTd,INTf,INTncp
         WRITE (2,31) 'GPerr', Errdf1,'--','--','--'
         WRITE (2,31) 'RMerr', Errdf2,'--','--','--'
      ENDIF
 21   FORMAT (2X,I1,A3,I1,1X,2X,A8,2X,A6,1X,F5.1,3(2X,F11.3))
 31   FORMAT (20X,A6,1X,F5.1,3(8X,A2,3X))
      WRITE (2,41) ' '
      WRITE (2,41) ' '
 41   FORMAT (A70)
 10   CONTINUE
      WRITE (2,41) ' '
      WRITE (2,41) ' '
 5    CONTINUE
C
C******* Print to screen and write to output file the cell, marginal &
C        grand means and the population variance.
      PRINT *, 'Cell Means             Row Marginal Means'
      DO 17 I=1,P
         PRINT 18, (U(I,J), J=1,Q), RMEAN(I)
 17   CONTINUE
      PRINT 18, (CMEAN(J), J=1,Q), GMEAN
 18   FORMAT (10(F7.2))
      PRINT *, 'Column Marginal Means  Grand Mean'
      PRINT *, 'Variance Mean =', VARMEAN
      WRITE (2,19) 'Population Means'
      DO 24 I=1,P
         WRITE (2,22) (U(I,J), J=1,Q), RMEAN(I)
 24   CONTINUE
      WRITE (2,19) ' '
      WRITE (2,22) (CMEAN(J), J=1,Q), GMEAN
      WRITE (2,19) ' '
      WRITE (2,26) 'Population Variance =', VARMEAN
 19   FORMAT (A20)
 22   FORMAT (10(F7.2))
 26   FORMAT (A22,F7.2)
      CLOSE(2)
      END
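The core arithmetic of the Appendix 3.0 listing can be restated compactly. The Python sketch below is an illustration under stated assumptions, not part of the original program: it takes a P x Q table of population cell means and SDs, an average correlation r, and a per-group sample size n, and applies the same expressions the FORTRAN code uses for GPerr, RMerr, the three noncentrality parameters, and Cohen's f. The function name `ncp_and_f` and the example means are hypothetical.

```python
from math import sqrt


def ncp_and_f(means, sds, r, n):
    """Mirror the NCP and Cohen's f formulas of the FORTRAN listing.

    means, sds: P x Q nested lists (groups x repeated measures).
    r: average correlation across RM trials; n: sample size per group.
    """
    p, q = len(means), len(means[0])
    grand = sum(sum(row) for row in means) / (p * q)
    row_means = [sum(row) / q for row in means]
    col_means = [sum(means[i][j] for i in range(p)) / p for j in range(q)]
    # Mean of the within-cell variances (VARMEAN in the listing).
    var_mean = sum(sd ** 2 for row in sds for sd in row) / (p * q)

    # Sums of squared deviations for group, RM and interaction effects.
    gp_ss = sum((m - grand) ** 2 for m in row_means)
    rm_ss = sum((m - grand) ** 2 for m in col_means)
    int_ss = sum(
        (means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(p)
        for j in range(q)
    )

    # Error terms: between-group and within-group (RM and interaction).
    gp_err = var_mean * (1.0 + (q - 1.0) * r)
    rm_err = var_mean * (1.0 - r)

    return {
        "GPncp": n * q * gp_ss / gp_err,
        "RMncp": n * p * rm_ss / rm_err,
        "INTncp": n * int_ss / rm_err,
        "GPf": sqrt(gp_ss / (p * gp_err)),
        "RMf": sqrt(rm_ss / (q * rm_err)),
        "INTf": sqrt(int_ss / (((p - 1) * (q - 1) + 1) * rm_err)),
    }


# Hypothetical 2 x 3 design, all cell SDs equal to 2, n = 10 per group.
example_means = [[10, 12, 14], [10, 10, 10]]
example_sds = [[2, 2, 2], [2, 2, 2]]
result = ncp_and_f(example_means, example_sds, r=0.4, n=10)
for name, value in result.items():
    print(f"{name:7s} {value:8.3f}")
```

For this hypothetical design, raising r from .4 to .8 shrinks the within-group error term from 2.4 to 0.8, so the RM noncentrality parameter rises from about 16.7 to 50, while the between-group error term grows from 7.2 to 10.4 and the group noncentrality parameter falls.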
Appendix 5.0
Function Used To Compute Effect Size (d) For Tests of Interaction in Two-Way (A x B) ANOVA Designs

2 x K Designs: For designs having 2 levels on one factor (A) and 3 or more on the other (B), effect size, d, was calculated using

$$ d = \frac{\left| (\mu_1 - \mu_2)_{MAX} - (\mu_1 - \mu_2)_{MIN} \right|}{\sigma} $$

where $(\mu_1 - \mu_2)_{MAX}$ is the maximum of the differences between the A1 and A2 cell means among the B levels and $(\mu_1 - \mu_2)_{MIN}$ is the minimum of the differences between the A1 and A2 cell means among the B levels. The numerator is the absolute difference between these maximum and minimum values; it is divided by σ, the average within-cell standard deviation (computed in the Appendix 3.0 program as the square root of the mean within-cell variance).

3 x K Designs: For designs having 3 levels on one factor (A) and 3 or more on the other (B), d was calculated as above, but since three values for the numerator term of this function are possible (one for differences between A1 and A2 cell means, another for differences between A1 and A3, and yet another for those between A2 and A3), only the one yielding the largest number (largest absolute difference) was used to compute effect size.
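The Appendix 5.0 rule can be sketched as a short function. This is a minimal illustration, assuming (as the FORTRAN listing does) that σ is taken as the square root of the mean within-cell variance; the function name `interaction_d` and the example cell means are hypothetical.

```python
from itertools import combinations
from math import sqrt


def interaction_d(means, sds):
    """Range-of-differences effect size d for an A x B interaction.

    means, sds: nested lists with one row per level of A (2 or 3 rows)
    and one column per level of B.
    """
    if len(means) not in (2, 3):
        raise ValueError("defined here only for 2 or 3 levels of A")
    cells = [sd for row in sds for sd in row]
    # Assumption: sigma is the square root of the mean cell variance.
    sigma = sqrt(sum(sd ** 2 for sd in cells) / len(cells))
    spreads = []
    for a, b in combinations(range(len(means)), 2):
        # Differences between two A levels at each level of B.
        diffs = [x - y for x, y in zip(means[a], means[b])]
        spreads.append(abs(max(diffs) - min(diffs)))
    # With 3 levels of A, only the largest spread is used.
    return max(spreads) / sigma


# Hypothetical 2 x 3 and 3 x 3 designs with every cell SD equal to 2.
d_2xk = interaction_d([[10, 12, 14], [10, 10, 10]], [[2, 2, 2]] * 2)
d_3xk = interaction_d(
    [[10, 12, 14], [10, 10, 10], [14, 12, 10]], [[2, 2, 2]] * 3
)
print(d_2xk, d_3xk)
```

In the 3 x K example the A1 versus A3 differences span the widest range (from -4 to 4), so that pair alone determines d.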
° ° ° § § ° 8 0 _ Z d d d d d d d d — o o o o o g g g g ^ " ' o ' d d o o d d d - m — E o TO 5 8 £ J 6 > O > d d d d d - 6 6 6 6 0 9> 0 0 n d Q. CO o *j O CD - 8 S§§8 m eg ^ § S S § 8 1 co I „, S S S S 38 d 0 d d d — ^ II d > w o o o o o g g ^ CD O g CO co 11 ^^dcicicidcici-^ CM — • — d O) o ~ u £ — §5§§?§§§§§§§8 0 0 0 0 0 0 0 0 0 0 0 0 — CO „ d . ? g n 3 — 8 « ' C <3 C d ^ddodddddociddcioci-r- ^ o g g g g g o g g g g g g g •PJO d d „ § § § § § § § § § § ? § 3 § § 8 — 2 d 8 0- II • CD X T5 c d> o> o> Q. Q. < O Q Potvin '96 161 Appendix o 3 d 53 d § O Od Od Od O d d d d d d § o d d d d d d d d d d d o o o d d d d d d o o o d d d d d d d d d d d d d d d d O O O O O O d d d d d d d Q < — < m «>» 0) »- o II " O o •* d d 0 Od O 0 d d d d e d d S d 0 o CD d d 0) > fl) < c o o o CM • x M d © O O O O * — d d d > oo " CO CD S 8 ° © co E ® ? a > c O 0 O O 0 * O T d ^ - - CO S" *- O co OT © CO d d J? CD 8 d w w w - 8 w 1 1 O 1_ d 0 O O Od O 0 O 8 O O Q O O CD 0) 3 * C co C § d • CO d S ^ d - - - - - ^ • S» d s .SSSSSSSS8 sss sss 8 s s sss ss8 s8 — O 'ST CO o > O Zz O > d d O .2 o •taw O O O O O O O O O O Q O •T^TfTteocococococoo o co O ° ° ° § S S S S § § 8 8 o "~ O d § d 8 5 d d S 8 CO co O ^ CD X TJ c £ .EP ®, Q< O Q Potvin '96 i 162 Appendix 3 § ci d in CM d 0.40 0.40 CM § d 3 3d d d 3 §d d o 33 s d 0.80 8 0.40 CM at o d d d d d d 3 §d d d 5 § d d d d o •* d d d s3 ^ w w w 3 §d § d o ° o 3 d o o -* d to d ll to d 0) * § d > CO § d d CO i *- 0.40 CO 0.40 o i o ° " c Is I -S Q. < O Q O * o o o o o o o o o o o o o o o o o - § § § o o o o •* d ° s ° •? 
o -- § o d d d o d o o d d o 3 o o co • * s o d o d d d d 3 3 33 o co © o o o o o •* d o -* d o •<* d s § d d 3 -3 -d3 d d o d o d d o o o O O O O O O d O O O r- o d d O o -3-3-3-3 d O o O o O O O O - - - - — - - J - 1- o d d d o •* d o O t S o§ o§ o§ o§ o° 3o d d o o d O d o d o -* d ° s§338 d d d d 6 -3 3 3 3 3 d d o -* d 3 3 38 o 3 38 d § d -3 -3 o o o •* •* d d d d d d d o o o o o o o - o o o § § • * •>» - * § Jd - Jd - Jd - dj - d J - dJ S d d d d d d ^ o o o o -» d d d d d o o o o o - o o o •* -» d d d d *- d d o d d - - % to CO co E CD 0c d o - - .2 38 d - d co > — 8 8 1 CD o *- c 8 co > c CD 8 jg go w 75 w CO 8 p CO E 8 c o o 8 o 3 d J5 0 tw. 8 8 1 "1118 C O o d d O o d d O o d 3 O o o d— O o s -3 § wd O o 8 °- 3 §d 3 d d O o o o o o § - * • * •» co ^ ^§ §^ CO o o d od od od od od - d - *- 1w O o d o -* wd O o o o "0- -<t d d d d o O o o •» d t dd O o 3 3 d d d o O o d s3 O o d d O o 3 333 S o o o o d O d d d d 333333S333 ° O o 3 d o -* d d O d d d d O o d d d O o o § d o d O o -* d o § d d 3s d o o § d d d - o o o o -* - dJ S e- j d- d d d o •* d d d - o o o d o -* d § d d d d it 3 TJ- d d d o § •» d d • o ©, o o o •* d §d d3 d§ §d o . 3 3d 33 3 3 3 3 3 88 d d (Ji d ^ T5 d 3 5d -3d 3 d d - co = O s U) o j CM CD X o s § d ^ - II 3 d § § d d d CM CM - Z Z< C § os § § d CM T3 CD d o -* d CO CM to _ O o •* d o -* d d CO < § d o •* d 3 9d 3 3 d d d CO CD Q 5o o3 o3 eS 3 o CM f- 5) d -» GO O) 3 d « o •* Potvin '96 163 Appendix ss 838888338838333388 CM d CO ctj Tt Cvi d o d d d d d d d d d d d 8d 3d 3d3 d8 sd sd 8 d o GO d S 8 S S 3 8 S S S S 3 8 8 8 8 S. 
3 3 8 8 8 d d d o d d d d d d d d d d d d d d d d d s §8 o co d 3S3383SSS8S883SSS88 8 3 d d d d o d d d d d d d d d d d d d — d d d d d d d 1/5 CM d d d d d 333888838333883388 3 8 3 S 3d 8 d d d d d d d d d d d d d d d d d d d — d d d 8S888SSS83838S338 3 s sd 8 3 3 d d d d co CM d d d d d d d d d d d d d d d d d ^ § § 8 333SS388883S8S83SS8 CM CM d CM d d d d d d d d d d d d d d d d d d d d - ^ - 883838838833888 3 3 3 8 8d 8 d d d d d d d d d d d d d d d d d d d - ^ 388SS83SSS8338 3 3 3 8 8d 8 d d d d d d d d d d d d d d d d d — O CM d CO 3333383833388 3 8 8d 8 3 3 d d d d d CO 833888388888 8 8d 8 3 3 3 d d d d d r- 888888888 88 3 8 3 8 8d 8 d d d d d LO 8388888 88 8 8 8 8d 8 3 d d d d d d d d d o — d d d Tt 8883888 8 8 8 8 8d 8 8 d d d d d CM "55 CD Q i < > < 2 II QC 3 >» CO 5 K3 k_ O LL II * C CO "•P CD C 0 o u 'LZ CO CM a CO X TJ §" 1 C c CD O) o>, t CO Q- O CD < OQ II SI < e> a d 8 8d 8 3 3 d d d d o GO d 883 88 8 d d d d d d 8 8d 8 d d 8 3 8 8 88 d d d d d o 00 8 8d d o o GO GO d d C eg 'CZ CO > 8 • V co > c CD CO CD w. Q. 
Appendix 7.1
Effect Size and Noncentrality Parameter Values for One-Way RM ANOVA Designs (constant r matrix pattern only)

                      K = 3             K = 6             K = 9
d    Ave r   n    f       NCP       f       NCP       f       NCP
0.2  0.4     5    0.105    0.167    0.088    0.233    0.083    0.312
0.2  0.4    10    0.105    0.333    0.088    0.467    0.083    0.625
0.2  0.4    15    0.105    0.500    0.088    0.700    0.083    0.937
0.2  0.4    20    0.105    0.667    0.088    0.933    0.083    1.250
0.2  0.4    25    0.105    0.833    0.088    1.167    0.083    1.562
0.2  0.4    30    0.105    1.000    0.088    1.400    0.083    1.875
0.2  0.8     5    0.183    0.500    0.153    0.700    0.144    0.937
0.2  0.8    10    0.183    1.000    0.153    1.400    0.144    1.875
0.2  0.8    15    0.183    1.500    0.153    2.100    0.144    2.812
0.2  0.8    20    0.183    2.000    0.153    2.800    0.144    3.750
0.2  0.8    25    0.183    2.500    0.153    3.500    0.144    4.687
0.2  0.8    30    0.183    3.000    0.153    4.200    0.144    5.625
0.5  0.4     5    0.264    1.042    0.220    1.458    0.208    1.953
0.5  0.4    10    0.264    2.083    0.220    2.917    0.208    3.906
0.5  0.4    15    0.264    3.125    0.220    4.375    0.208    5.859
0.5  0.4    20    0.264    4.167    0.220    5.833    0.208    7.812
0.5  0.4    25    0.264    5.208    0.220    7.292    0.208    9.766
0.5  0.4    30    0.264    6.250    0.220    8.750    0.208   11.719
0.5  0.8     5    0.456    3.125    0.382    4.375    0.361    5.859
0.5  0.8    10    0.456    6.250    0.382    8.750    0.361   11.719
0.5  0.8    15    0.456    9.375    0.382   13.125    0.361   17.578
0.5  0.8    20    0.456   12.500    0.382   17.500    0.361   23.437
0.5  0.8    25    0.456   15.625    0.382   21.875    0.361   29.297
0.5  0.8    30    0.456   18.750    0.382   26.250    0.361   35.156
0.8  0.4     5    0.422    2.667    0.353    3.733    0.333    5.000
0.8  0.4    10    0.422    5.333    0.353    7.467    0.333   10.000
0.8  0.4    15    0.422    8.000    0.353   11.200    0.333   15.000
0.8  0.4    20    0.422   10.667    0.353   14.933    0.333   20.000
0.8  0.4    25    0.422   13.333    0.353   18.667    0.333   25.000
0.8  0.4    30    0.422   16.000    0.353   22.400    0.333   30.000
0.8  0.8     5    0.730    8.000    0.611   11.200    0.577   15.000
0.8  0.8    10    0.730   16.000    0.611   22.400    0.577   30.000
0.8  0.8    15    0.730   24.000    0.611   33.600    0.577   45.000
0.8  0.8    20    0.730   32.000    0.611   44.800    0.577   60.000
0.8  0.8    25    0.730   40.000    0.611   56.000    0.577   75.000
0.8  0.8    30    0.730   48.000    0.611   67.200    0.577   90.000

d = effect size; Ave r = average correlation; K = levels of RM; f = Cohen's f; NCP = Noncentrality Parameter

Appendix 7.2
Effect Size and Noncentrality Parameter Values for a Two-Way 2(G) x 3(T) Mixed ANOVA Design (constant r matrix pattern only)

                      Group (G)         Trials (T)        Group x Trials
d    Ave r   n    f       NCP       f       NCP       f       NCP
0.2  0.4     5    0.075    0.167    0.105    0.333    0.075    0.083
0.2  0.4    10    0.075    0.333    0.105    0.667    0.075    0.167
0.2  0.4    15    0.075    0.500    0.105    1.000    0.075    0.250
0.2  0.4    20    0.075    0.667    0.105    1.333    0.075    0.333
0.2  0.4    25    0.075    0.833    0.105    1.667    0.075    0.417
0.2  0.4    30    0.075    1.000    0.105    2.000    0.075    0.500
0.2  0.8     5    0.062    0.115    0.183    1.000    0.129    0.250
0.2  0.8    10    0.062    0.231    0.183    2.000    0.129    0.500
0.2  0.8    15    0.062    0.346    0.183    3.000    0.129    0.750
0.2  0.8    20    0.062    0.462    0.183    4.000    0.129    1.000
0.2  0.8    25    0.062    0.577    0.183    5.000    0.129    1.250
0.2  0.8    30    0.062    0.692    0.183    6.000    0.129    1.500
0.5  0.4     5    0.186    1.042    0.264    2.083    0.186    0.521
0.5  0.4    10    0.186    2.083    0.264    4.167    0.186    1.042
0.5  0.4    15    0.186    3.125    0.264    6.250    0.186    1.562
0.5  0.4    20    0.186    4.167    0.264    8.333    0.186    2.083
0.5  0.4    25    0.186    5.208    0.264   10.417    0.186    2.604
0.5  0.4    30    0.186    6.250    0.264   12.500    0.186    3.125
0.5  0.8     5    0.155    0.721    0.456    6.250    0.323    1.563
0.5  0.8    10    0.155    1.442    0.456   12.500    0.323    3.125
0.5  0.8    15    0.155    2.163    0.456   18.750    0.323    4.688
0.5  0.8    20    0.155    2.885    0.456   25.000    0.323    6.250
0.5  0.8    25    0.155    3.606    0.456   31.250    0.323    7.813
0.5  0.8    30    0.155    4.327    0.456   37.500    0.323    9.375
0.8  0.4     5    0.298    2.667    0.422    5.333    0.298    1.333
0.8  0.4    10    0.298    5.333    0.422   10.667    0.298    2.667
0.8  0.4    15    0.298    8.000    0.422   16.000    0.298    4.000
0.8  0.4    20    0.298   10.667    0.422   21.333    0.298    5.333
0.8  0.4    25    0.298   13.333    0.422   26.667    0.298    6.667
0.8  0.4    30    0.298   16.000    0.422   32.000    0.298    8.000
0.8  0.8     5    0.248    1.846    0.730   16.000    0.516    4.000
0.8  0.8    10    0.248    3.692    0.730   32.000    0.516    8.000
0.8  0.8    15    0.248    5.538    0.730   48.000    0.516   12.000
0.8  0.8    20    0.248    7.385    0.730   64.000    0.516   16.000
0.8  0.8    25    0.248    9.231    0.730   80.000    0.516   20.000
0.8  0.8    30    0.248   11.077    0.730   96.000    0.516   24.000

f = Cohen's f; NCP = Noncentrality Parameter

Appendix 7.3
Effect Size and Noncentrality Parameter Values for a Two-Way 2(G) x 6(T) Mixed ANOVA Design (constant r matrix pattern only)

                      Group (G)         Trials (T)        Group x Trials
d    Ave r   n    f       NCP       f       NCP       f       NCP
0.2  0.4     5    0.058    0.200    0.088    0.467    0.062    0.117
0.2  0.4    10    0.058    0.400    0.088    0.933    0.062    0.233
0.2  0.4    15    0.058    0.600    0.088    1.400    0.062    0.350
0.2  0.4    20    0.058    0.800    0.088    1.867    0.062    0.467
0.2  0.4    25    0.058    1.000    0.088    2.333    0.062    0.583
0.2  0.4    30    0.058    1.200    0.088    2.800    0.062    0.700
0.2  0.8     5    0.045    0.120    0.153    1.400    0.108    0.350
0.2  0.8    10    0.045    0.240    0.153    2.800    0.108    0.700
0.2  0.8    15    0.045    0.360    0.153    4.200    0.108    1.050
0.2  0.8    20    0.045    0.480    0.153    5.600    0.108    1.400
0.2  0.8    25    0.045    0.600    0.153    7.000    0.108    1.750
0.2  0.8    30    0.045    0.720    0.153    8.400    0.108    2.100
0.5  0.4     5    0.144    1.250    0.220    2.917    0.156    0.729
0.5  0.4    10    0.144    2.500    0.220    5.833    0.156    1.458
0.5  0.4    15    0.144    3.725    0.220    8.750    0.156    2.187
0.5  0.4    20    0.144    5.000    0.220   11.667    0.156    2.917
0.5  0.4    25    0.144    6.250    0.220   14.583    0.156    3.646
0.5  0.4    30    0.144    7.500    0.220   17.500    0.156    4.375
0.5  0.8     5    0.112    0.750    0.382    8.750    0.270    2.188
0.5  0.8    10    0.112    1.500    0.382   17.500    0.270    4.375
0.5  0.8    15    0.112    2.250    0.382   26.250    0.270    6.563
0.5  0.8    20    0.112    3.000    0.382   35.000    0.270    8.750
0.5  0.8    25    0.112    3.750    0.382   43.750    0.270   10.938
0.5  0.8    30    0.112    4.500    0.382   52.500    0.270   13.125
0.8  0.4     5    0.231    3.200    0.353    7.467    0.249    1.867
0.8  0.4    10    0.231    6.400    0.353   14.933    0.249    3.733
0.8  0.4    15    0.231    9.600    0.353   22.400    0.249    5.600
0.8  0.4    20    0.231   12.800    0.353   29.867    0.249    7.467
0.8  0.4    25    0.231   16.000    0.353   37.333    0.249    9.333
0.8  0.4    30    0.231   19.200    0.353   44.800    0.249   11.200
0.8  0.8     5    0.179    1.920    0.611   22.400    0.432    5.600
0.8  0.8    10    0.179    3.840    0.611   44.800    0.432   11.200
0.8  0.8    15    0.179    5.760    0.611   67.200    0.432   16.800
0.8  0.8    20    0.179    7.680    0.611   89.600    0.432   22.400
0.8  0.8    25    0.179    9.600    0.611  112.000    0.432   28.000
0.8  0.8    30    0.179   11.520    0.611  134.400    0.432   33.600

f = Cohen's f; NCP = Noncentrality Parameter

Appendix 7.4
Effect Size and Noncentrality Parameter Values for a Two-Way 2(G) x 9(T) Mixed ANOVA Design (constant r matrix pattern only)

                      Group (G)         Trials (T)        Group x Trials
d    Ave r   n    f       NCP       f       NCP       f       NCP
0.2  0.4     5    0.049    0.214    0.083    0.625    0.059    0.156
0.2  0.4    10    0.049    0.429    0.083    1.250    0.059    0.312
0.2  0.4    15    0.049    0.643    0.083    1.875    0.059    0.469
0.2  0.4    20    0.049    0.857    0.083    2.500    0.059    0.625
0.2  0.4    25    0.049    1.071    0.083    3.125    0.059    0.781
0.2  0.4    30    0.049    1.286    0.083    3.750    0.059    0.937
0.2  0.8     5    0.037    0.122    0.144    1.875    0.102    0.469
0.2  0.8    10    0.037    0.243    0.144    3.750    0.102    0.938
0.2  0.8    15    0.037    0.365    0.144    5.625    0.102    1.406
0.2  0.8    20    0.037    0.486    0.144    7.500    0.102    1.875
0.2  0.8    25    0.037    0.608    0.144    9.375    0.102    2.344
0.2  0.8    30    0.037    0.730    0.144   11.250    0.102    2.813
0.5  0.4     5    0.122    1.339    0.208    3.906    0.147    0.977
0.5  0.4    10    0.122    2.679    0.208    7.812    0.147    1.953
0.5  0.4    15    0.122    4.018    0.208   11.719    0.147    2.930
0.5  0.4    20    0.122    5.357    0.208   15.625    0.147    3.906
0.5  0.4    25    0.122    6.696    0.208   19.531    0.147    4.883
0.5  0.4    30    0.122    8.036    0.208   23.437    0.147    5.859
0.5  0.8     5    0.092    0.760    0.361   11.719    0.255    2.930
0.5  0.8    10    0.112    1.520    0.361   23.438    0.255    5.859
0.5  0.8    15    0.112    2.280    0.361   35.156    0.255    8.789
0.5  0.8    20    0.112    3.041    0.361   46.875    0.255   11.719
0.5  0.8    25    0.112    3.801    0.361   58.594    0.255   14.648
0.5  0.8    30    0.112    4.561    0.361   70.313    0.255   17.578
0.8  0.4     5    0.195    3.429    0.333   10.000    0.236    2.500
0.8  0.4    10    0.195    6.857    0.333   20.000    0.236    5.000
0.8  0.4    15    0.195   10.286    0.333   30.000    0.236    7.500
0.8  0.4    20    0.195   13.714    0.333   40.000    0.236   10.000
0.8  0.4    25    0.195   17.143    0.333   50.000    0.236   12.500
0.8  0.4    30    0.195   20.571    0.333   60.000    0.236   15.000
0.8  0.8     5    0.147    1.946    0.577   30.000    0.408    7.500
0.8  0.8    10    0.147    3.892    0.577   60.000    0.408   15.000
0.8  0.8    15    0.147    5.838    0.577   90.000    0.408   22.500
0.8  0.8    20    0.147    7.784    0.577  120.000    0.408   30.000
0.8  0.8    25    0.147    9.730    0.577  150.000    0.408   37.500
0.8  0.8    30    0.147   11.676    0.577  180.000    0.408   45.000

f = Cohen's f; NCP = Noncentrality Parameter
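The f and NCP values in Appendices 7.1 to 7.4 can be spot-checked numerically. The sketch below is an inference from the tabulated numbers, not a formula stated in this section: it assumes unit variance, K trial means spaced evenly over a total range d, a constant correlation r between trials, and (for the mixed design) two group means d apart. The function names are hypothetical. Under those assumptions it reproduces the Trials entries of Appendix 7.1 and the Group (G) NCP entries of Appendices 7.2 to 7.4 to three decimals for the cells tried.

```python
import math


def trial_sum_sq(d, k):
    """Sum of squared deviations of k means spaced evenly over a range d."""
    step = d / (k - 1)
    means = [j * step for j in range(k)]
    grand = sum(means) / k
    return sum((m - grand) ** 2 for m in means)


def one_way_rm(d, r, n, k):
    """Return (f, NCP) for the trials effect of a one-way RM design.

    Assumes sigma = 1 and a constant correlation r, so the error
    variance of the within-subjects test is 1 - r.
    """
    error_var = 1.0 - r
    ss = trial_sum_sq(d, k)
    f = math.sqrt(ss / k / error_var)
    ncp = n * ss / error_var
    return f, ncp


def mixed_group_ncp(d, r, n, k):
    """NCP for the group effect of a 2(G) x k(T) design, n subjects per group.

    Assumes two group means d apart; the error variance is that of a
    subject's mean over k equally correlated trials.
    """
    ss_groups = 2 * (d / 2) ** 2
    error_var = (1.0 + (k - 1) * r) / k
    return n * ss_groups / error_var


if __name__ == "__main__":
    for n in (5, 10, 15, 20, 25, 30):
        f, ncp = one_way_rm(0.5, 0.4, n, 3)
        print(n, round(f, 3), round(ncp, 3))
```

Because the NCP is proportional to n under these assumptions, each NCP column in the tables is simply the n = 5 entry scaled by n / 5, which gives a quick way to check individual cells by hand.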
[Appendix table with columns Effect Size (d), Ave r and r Matrix. The tabulated values are not recoverable from the extracted text.]
Statistical power for repeated measures anova Potvin, Patrick John 1996
Title | Statistical power for repeated measures anova |
Creator | Potvin, Patrick John |
Date Issued | 1996 |
Description | Determining power a priori for univariate repeated measures (RM) ANOVA designs is a difficult and often excluded practice in the planning of experimental research. Complicated procedures and lack of accessibility to computer power programs are among the problems which have discouraged researchers from performing power analysis on these designs. Another, more serious issue has been the lack of methods available for estimating power of designs with two or more RM factors. Due to uncertainties on how to compute an appropriate error term when more than one variance-covariance matrix exists, analytical methods for approximating power are currently restricted to RM designs with only one within-subjects variable. The purpose of this study, therefore, was to facilitate the process of power determination by providing a series of power tables for ANOVA designs with one and two within-subjects variables. A secondary objective was to investigate less well known power trends among ANOVA designs having heterogeneous (nonspherical) correlation matrices or two RM factors. Power was generated using analytical and Monte Carlo simulation methods for varying experimental conditions of sample size (5, 10, 15, 20, 25 & 30), effect size (small, medium & large), alpha (.01, .05 & .10), correlation (.4 & .8), variance-covariance matrix patterns (constant, ε = 1.00 and trend, ε < .56) and levels of RM (3, 6 & 9). Examination of power results revealed that under conditions of nonsphericity (trend matrix pattern), power was greater at small effect sizes and lower at medium and large effect sizes compared to the values generated under conditions involving spherical (constant matrix) structures. Regarding designs with two RM factors, power of main effects tests was observed to be greatest for a given condition so long as the average correlation among trials of the pooled factor was equal to or below that of the main effects factor. For interaction tests of the same model, power was found to be greatest for a given condition when at least one factor had an average correlation across its trials equal to .80. From simulation results, the relationship between error variance and power across different correlation matrices of the two-way RM design was examined and approximations of the noncentrality parameter for each test of this model were derived. |
Extent | 9196462 bytes |
Genre | Thesis/Dissertation |
Type | Text |
File Format | application/pdf |
Language | eng |
Date Available | 2009-02-17 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0077309 |
URI | http://hdl.handle.net/2429/4644 |
Degree | Master of Science - MSc |
Program | Human Kinetics |
Affiliation | Education, Faculty of; Kinesiology, School of |
Degree Grantor | University of British Columbia |
Graduation Date | 1996-05 |
Campus | UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |