PRATT'S IMPORTANCE MEASURES IN FACTOR ANALYSIS: A NEW TECHNIQUE FOR INTERPRETING OBLIQUE FACTOR MODELS

by

AMERY DAI LING WU

B.A., Soochow University, 1991
M.A., Middlesex University, 1993

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES
(Measurement, Evaluation, and Research Methodology)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)

May 2008

© Amery Dai Ling Wu, 2008

Abstract

This dissertation introduces a new method, Pratt's measure matrix, for interpreting multidimensional oblique factor models in both exploratory and confirmatory contexts. Overall, my thesis, supported by empirical evidence, refutes the currently recommended and practiced methods for understanding an oblique factor model; that is, interpreting the pattern matrix or structure matrix alone or juxtaposing both without integrating the information.

Chapter Two reviews the complexities of interpreting a multidimensional factor solution due to factor correlation (i.e., obliquity). Three major complexities highlighted are (1) the inconsistency between the pattern and structure coefficients, (2) the distortion of additive properties, and (3) the inappropriateness of the traditional cut-off rules for deeming a loading "meaningful".

Chapter Three provides the theoretical rationale for adapting Pratt's importance measures from their use in multiple regression to that of factor analysis. The new method is demonstrated and tested with both continuous and categorical data in exploratory factor analysis. The results show that Pratt's measures are applicable to factor analysis and are able to resolve three interpretational complexities arising from factor obliquity.

In the context of confirmatory factor analysis, Chapter Four warns researchers that a structure coefficient could be entirely spurious due to factor obliquity as well as a zero constraint on its corresponding pattern coefficient. Interpreting such structure coefficients as Graham et al. (2003) suggested can be problematic. The mathematically more justified method is to transform the pattern and structure coefficients into Pratt's measures.

The last chapter describes eight novel contributions in this dissertation. The new method is the first attempt ever at ordering the importance of latent variables for multivariate data. It is also the first attempt at demonstrating and explicating the existence, mechanism, and implications of the suppression effect in factor analyses. Specifically, the new method resolves the three interpretational problems due to factor obliquity, assists in identifying a better-fitting exploratory factor model, proves that a structure coefficient in a confirmatory factor analysis with a zero pattern constraint is entirely spurious, avoids the debate over the choice of oblique and orthogonal factor rotation, and last but not least, provides a tool for consolidating the role of factors as the underlying causes.

Table of Contents

Abstract
List of Tables
Lists of Symbols and Abbreviations
Co-authorship Statement
Chapter One: Brief Background for Factor Analysis
    1.1 What is factor analysis and why factor analyze?
    References
Chapter Two: What Has the Literature Recommended for Interpreting Factor Models?
    2.1 Recommendations for Interpreting a Unidimensional Factor Model
    2.2 Complexities in Interpreting a Multidimensional Factor Model
    2.3 Review of Practices and Recommendations for Interpreting Multidimensional Factor Models
    References
Chapter Three: Pratt's Importance Measures in Exploratory Factor Analysis
    3.1 The Use of Pratt's Importance Measures in Linear Multiple Regression
    3.2 The Rationale for Applying Pratt's Importance Measures to Factor Analysis
    3.3 A Demonstration of Pratt's Measures in EFA for Continuous Data
    3.4 A Demonstration of Pratt's Measures in EFA for Categorical Data
    References
Chapter Four: Demonstration of Pratt's Measures in Confirmatory Factor Analysis
    4.1 Pratt's Measures in CFA with No Factorial Complexities
    4.2 Pratt's Measures in CFA with Factorial Complexity
    4.3 Comparing the Fit of the Pratt's Measures Model: Additional CFA Case Studies
    References
Chapter Five: Contribution, Limitation, and Future Research
    5.1 Recapitulation
    5.2 Novel Contributions
    5.3 Caveats and Limitations
    5.4 Suggestions for Future Research
    References
Appendices

List of Tables

Chapter Two
Table 2.1 Review of Recommendation and Practice for Interpreting a Factor Solution
Table 2.2 Vertical and Horizontal Additive Properties of the Orthogonal Factor Model
Table 2.3 Distortion of the Horizontal Additive Property in the Pattern Coefficients
Table 2.4 Distortion of the Horizontal Additive Property in the Structure Coefficients
Table 2.5 Inappropriateness of Traditional Rules for Interpreting the Pattern Matrix

Chapter Three
Table 3.1 Correlation Matrix of the Five Explanatory Variables for TIMSS Mathematics Achievement
Table 3.2 Pratt's Measures for the Five Explanatory Variables for TIMSS Mathematics Achievement
Table 3.3 Pattern, Structure, Pattern × Structure, & Pratt's Measure Matrices, and Communalities for Holzinger & Swineford's (1939) Psychological Ability Data
Table 3.4 Eigenvalues and Parallel Analysis for 2003 TIMSS Outside School Activities Data
Table 3.5 Pattern, Structure, Pattern × Structure, & Pratt's Measure Matrices, and Communalities for 2003 TIMSS Outside School Activities Data

Chapter Four
Table 4.1 Factor Solution for Case One: with No Factorial Complexities
Table 4.2 Factor Solution for Case Two: with One Factorial Complexity
Table 4.3 Comparisons of CFA Fit Indices of Models Identified by Different EFA Coefficients and Cut-offs

Lists of Symbols and Abbreviations

Symbols (in alphabetical order)

English Letters
%(F): the percentage of total variance explained by a given factor
Cov($F_1$, $F_2$): the covariance between factor one and factor two
$d_p$: the Pratt's measure for the explanatory variable/factor p
D: the Pratt's measure matrix
$F_{qp}$: the factor p for response variable q
$h^2$: the communality
L: the loading matrix
p: the number of the explanatory variables/factors
P: the pattern matrix
PS: a matrix in which the elements are the products of a given pattern coefficient and its corresponding structure coefficient
q: the number of the observed response variables
r: Pearson correlation
$r_m$: the average correlation in the observed correlation matrix
R: the correlation matrix among the factors
S: the structure matrix
Sqrt(PS): the square root of PS
$U_q$: the error term for the response variable q
Var(Y): the variance of predicted response variable Y
Var($F_1$): the variance of factor one
Var($F_2$): the variance of factor two
$X_{qp}$: the explanatory variable p for the response variable q
$Y_q$: the observed score of the response variable q
$\hat{Y}_q$: the fitted (predicted) score of the response variable q

Greek Letters
$\beta_p$: the standardized partial regression coefficient for the explanatory variable p
$\beta_{qp}$: the standardized partial regression weight (i.e., loading or pattern coefficient) of the explanatory variable/factor p on the response variable q
$\rho_p$: the simple correlation between the explanatory variable p and the response variable
$\Sigma_F D$: the sum of the elements in D for a given test across the four factors
$\Sigma_T D$: the sum of the elements in D for a given factor across the 24 tests
$\Sigma_F L^2$: the sum of the squared loadings for a given test across the factors
$\Sigma_T L^2$: the sum of the squared loadings for a given factor along the tests
$\Sigma_F P^2$: the sum of the squared pattern coefficients for a given test across the factors
$\Sigma_T P^2$: the sum of the squared pattern coefficients for a given factor along the tests
$\Sigma_F PS$: the sum of the elements in PS for a given test across the four factors
$\Sigma_T PS$: the sum of the elements in PS for a given factor across the 24 tests
$\Sigma_F S^2$: the sum of the squared structure coefficients for a given test across the factors
$\Sigma_T S^2$: the sum of the squared structure coefficients for a given factor along the tests

Abbreviations (in alphabetical order)
AIC: Akaike's information criterion
CAIC: corrected Akaike's information criterion
CFA: confirmatory factor analysis
EFA: exploratory factor analysis
FA: factor analysis
MLM: multivariate linear model
MV: multivariate
PA: parallel analysis
SEM: structural equation modeling

Co-Authorship Statement

Chapter Two will be revised into a manuscript co-authored with Dr. Bruno D. Zumbo and Dr. Anita Hubley at the University of British Columbia. As the first author, I was in charge of all aspects of this research project, including identification of the research questions, literature reviews, syntheses, critiques, and conclusions. I will also be in charge of the writing of the manuscript. Both co-authors contributed to the literature review and will assist in the preparation and revision of the manuscript.

Chapters Three and Four will be revised into two manuscripts co-authored with Dr. Bruno D. Zumbo at the University of British Columbia and Dr. Roland D. Thomas at Carleton University. As the first author, I was in charge of all aspects of these two projects, including formulating research questions, literature review, research design, and data analyses.
I will also be in charge of the writing of the manuscripts. Both co-authors contributed to the identification and design of the research projects and will assist in the preparation and revision of these manuscripts.

Chapter One: Brief Background for Factor Analysis

1.1 What is factor analysis and why factor analyze?

The major purpose of factor analysis is to identify a parsimonious number of common factors from a larger set of observed variables so that people can have a more concise conceptualization of the observed variables. There are two major uses of factor analysis. First, it is used as a measurement validation tool for test¹ development and refinement. The major purpose is to interpret the underlying construct(s)² and investigate how well each item measures the construct(s). Second, it is used to uncover the governing dimensions underlying a set of psychological domains such as personality traits.

¹ Throughout the dissertation, the terms "test" and "scale" are used interchangeably with the term "observed indicator" or "observed variable", which denote the dependent variables in a factor analysis. If one is concerned with psychological measures, for example, one may speak of "scales", whereas in educational or certification settings one speaks of "tests".

² It is crucial to point out the distinction between a latent variable (i.e., factor) and a construct. As Zumbo (2007) reminds us, although it is often confused even in the technical measurement literature, the construct is not the same as the true score or latent variable, which, in practical settings, is not the same as the observed item or task score. The essential difference is that a latent variable is a statistical and mathematical variable created by the data analyst and statistical modeler for which respondents (or examinees) could receive a predicted score based on their item responses. A construct, on the other hand, is an abstract or theoretical entity that has meaning because of its relation to other abstract variables, and a theory of the concept being studied. In short, one cannot get an empirically realized score on a construct, as one can on a latent variable. Test validity then involves an inference from the item responses to the construct via the latent variable; please see Zumbo (2007) for more details.

³ Historically, there are theorists who have strongly argued against the reification of factors as entities and interpreting them as causes underlying the data covariances (e.g., Gould, 1981). This dissertation does not intend to take part in this historical debate; rather, it recognizes the possible error of making ruthless causal interpretation, but does not refute factor analysis as a potentially useful tool for attempting causal interpretations, even in a weak form.

⁴ Throughout this dissertation, the term "independent variable" is used interchangeably with "explanatory variable" or "predictor". The term "dependent variable" is used interchangeably with "response variable".

Theoretically, the common factors are assumed to be the "hidden underlying causes" of the variation in the observed variables (Borsboom, Mellenbergh, & van Heerden, 2003; 2004; Burt, 1940; Hoyle & Duvall, 2004; Rummel, 1970; Spearman, 1904; Zumbo, 2007).³ Statistically, factor analysis extracts a set of latent variables to account for the covariances among the observed variables. In essence, the extracted factors, functioning like independent variables,⁴ partition the variance of an observed variable into two parts: the common variance (i.e., communality), which is the proportion of variance that is shared in common with the rest of the observed variables, and the unique variance, which is the proportion of variance that is specific to an observed variable and arises from random variation, and which is equal to one minus the communality.
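As a numerical illustration of this partition, the following is a minimal sketch rather than part of the original analysis. It assumes a standardized observed variable, standardized factors, and hypothetical loadings of 0.6 and 0.5 on two factors, and it uses the standard result that the variance explained by the factors is the quadratic form of the loadings with the factor correlation matrix. The function name `communality`, the loadings, and the factor correlation of 0.4 are illustrative assumptions.

```python
def communality(loadings, factor_corr):
    """Common variance of one standardized observed variable.

    loadings    -- regression weights of the variable on each factor
    factor_corr -- correlation matrix among the standardized factors
    Computes sum_i sum_j b_i * r_ij * b_j, the variance of the fitted score.
    """
    p = len(loadings)
    return sum(
        loadings[i] * factor_corr[i][j] * loadings[j]
        for i in range(p)
        for j in range(p)
    )


# Hypothetical loadings of one observed variable on two factors.
b = [0.6, 0.5]

uncorrelated = [[1.0, 0.0], [0.0, 1.0]]  # factors do not correlate
correlated = [[1.0, 0.4], [0.4, 1.0]]    # factors correlate at 0.4 (assumed)

for label, R in (("uncorrelated", uncorrelated), ("correlated", correlated)):
    h2 = communality(b, R)
    uniqueness = 1.0 - h2  # unique variance = one minus communality
    print(f"{label} factors: communality={h2:.2f}, uniqueness={uniqueness:.2f}")
```

With uncorrelated factors the communality in this sketch is simply the sum of the squared loadings (0.61); once the factors correlate, the cross-product term raises it to 0.85, so the squared loadings alone no longer add up to the common variance.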
A factor analysis can be written as a regression equation such that

$$Y_q = \beta_{q1}F_{q1} + \beta_{q2}F_{q2} + \cdots + \beta_{qp}F_{qp} + U_q, \tag{1.1}$$

or

$$\hat{Y}_q = \beta_{q1}F_{q1} + \beta_{q2}F_{q2} + \cdots + \beta_{qp}F_{qp}, \tag{1.2}$$

where
$Y_q$ is the standardized score for the observed response variable q,
$F_{qp}$ is the score of factor p for the observed response variable q,
$\beta_{qp}$ is the standardized partial regression weight of factor p on the observed response variable q,
$U_q$ is the unique term (i.e., residual) for the observed response variable q, and
$\hat{Y}_q$ is the predicted (fitted) score for the observed response variable q.

Three major differences between equation (1.1) and a typical multiple regression are: (a) q multiple observed response variables (i.e., dependent variables), $Y_q$, are regressed simultaneously, (b) the latent independent variables, $F_{qp}$, are the common factors (i.e., independent variables) that are created by accounting for the covariances among the q observed variables, and (c) the weights $\beta_{qp}$ for these factors are typically termed factor loadings in a factor analysis.

A factor solution is referred to as unidimensional if only one factor in equation (1.1) is considered sufficient to account for the covariances among the observed variables, or multidimensional if two or more factors are entailed. When a multidimensional solution is chosen, the factors can be hypothesized to be inter-correlated and referred to as an oblique solution, or uncorrelated and referred to as an orthogonal solution. In practice, the choice of obliquity or orthogonality is often dependent upon the theoretical and/or empirical grounds as well as the ease of interpretation that comes with each solution. This issue will be discussed in detail in Chapter Two.

Exploratory factor analysis (EFA) versus confirmatory factor analysis (CFA) is a common classification for a factor analysis (Jöreskog, 1969).
Conceptually, the distinction between EFA and CFA lies in whether the investigator has a firm expectation of the underlying factor structure based on theoretical and/or empirical grounds (Church & Burke, 1994; Floyd & Widaman, 1995; Henson & Roberts, 2006). CFA requires a priori model specification regarding four elements of the factor structure: (1) the number of factors, (2) the correlation among the factors, if the model is multidimensional, (3) the loadings of the factor(s) on the observed variables, and, if necessary, (4) the correlations among the unique terms (Jöreskog & Sörbom, 1999; Wu, Li, & Zumbo, 2007). EFA, in contrast, is used when the investigator has no clear hypothesis about the above four elements and aims to explore the unknown structure of the empirical data and the substantive meaning of the factors.

In statistical terms, the CFA vs. EFA distinction lies in whether any restriction is placed on the parameter estimates. Thus, CFA and EFA are also distinguished as restricted versus unrestricted factor analysis (Ferrando & Lorenzo-Seva, 2000; Jöreskog & Sörbom, 1999). CFA constrains a subset of the model parameters to some fixed values (typically zeros) according to the investigator's hypothesis. It is "confirmatory" in the sense that a CFA rejects or retains the restricted model using formal hypothesis testing (Floyd & Widaman, 1995; Zumbo, Sireci, & Hambleton, 2003). In contrast, EFA does not constrain any model parameters and allows all the parameters to be freely estimated (except in the orthogonal case, where the factor correlations are constrained to be zero).
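To make the effect of such zero constraints concrete, the sketch below builds a small restricted loading pattern and shows what it implies once the factors are allowed to correlate. It relies on the standard identity that the structure matrix equals the pattern matrix post-multiplied by the factor correlation matrix (S = PR, in the notation of the symbol list). The four items, their loadings, the factor correlation of 0.5, and the helper `matmul` are hypothetical choices made only for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Restricted (CFA-style) pattern matrix P: each item loads on one factor only;
# the zeros stand for the fixed constraints.
P = [
    [0.8, 0.0],
    [0.7, 0.0],
    [0.0, 0.6],
    [0.0, 0.9],
]

# Factor correlation matrix R for an oblique two-factor model (assumed r = 0.5).
R = [
    [1.0, 0.5],
    [0.5, 1.0],
]

S = matmul(P, R)  # structure coefficients: item-factor correlations

for item, (p_row, s_row) in enumerate(zip(P, S), start=1):
    rounded = [round(s, 2) for s in s_row]
    print(f"item {item}: pattern={p_row}, structure={rounded}")
```

In this sketch, item 1 has a pattern coefficient fixed at zero on the second factor yet a structure coefficient of 0.8 × 0.5 = 0.4 on it, a value that arises solely from the factor correlation.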
EFA results sometimes are used as the empirical groundwork for specifying a CFA hypothesis when a known theory is absent to guide the parameter constraints.

Without a doubt, the credibility of factor analytic results, whether from EFA or CFA, requires a user's craftsmanship in making a series of critical judgements about how to choose a statistically optimal and theoretically sensible model (Brown, 2006; Fabrigar, Wegener, MacCallum, & Strahan, 1999; Gorsuch, 1983; Russell, 2002). Among many others, interpretability is one of the key criteria for evaluating the credibility of a factor solution (Gorsuch, 1983; Nunnally, 1978). Confidence in a chosen factor solution is undermined if the results are mathematically or theoretically difficult to interpret. The overall purpose of this dissertation is to propose a new method for assisting in interpreting factor analyses, in particular, oblique factor models.

This dissertation has produced three manuscripts to be revised and submitted for publication. Thus, the format of this dissertation follows the guidelines of a manuscript-based dissertation required by the Faculty of Graduate Studies at the University of British Columbia. Chapter Two, the first manuscript, includes an introduction and a literature review that set the context for the other two manuscripts. It explicates the interpretational problems and complexities inherent in an oblique factor model. It also critiques the commonly recommended and often practiced methods for interpreting an oblique factor model. Chapter Three, the second manuscript, introduces the new method, the Pratt's measure matrix, which resolves the problems and complexities in interpreting an oblique factor model described in Chapter Two. It provides the theoretical rationale and two real-data demonstrations to substantiate the use of the Pratt's measure matrix in EFA.
Through the use of the Pratt's measure matrix, Chapter Four, the third manuscript, argues and demonstrates that the method suggested by Graham, Guthrie, and Thompson (2003) for interpreting an oblique CFA model can be problematic. The last chapter summarizes the improvements brought by the new method to resolving the problems of interpreting oblique factor models. Future research to improve the new method is also suggested.

References

Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2003). The theoretical status of latent variables. Psychological Review, 110, 203-219.

Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111, 1061-1071.

Brown, T. A. (2006). Confirmatory factor analysis for applied research. New York: The Guilford Press.

Burt, C. (1940). The factors of the mind. London: University of London Press.

Church, J. T., & Burke, P. J. (1994). Exploratory and confirmatory tests of the big five and Tellegen's three- and four-dimensional models. Journal of Personality and Social Psychology, 66, 93-114.

Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4, 272-299.

Ferrando, P. J., & Lorenzo-Seva, U. (2000). Unrestricted versus restricted factor analysis of multidimensional test items: Some aspects of the problem and some suggestions. Psicológica, 21, 301-323.

Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological Assessment, 7, 286-299.

Gould, S. J. (1981). The mismeasure of man. New York: W. W. Norton.

Graham, J. M., Guthrie, A. C., & Thompson, B. (2003). Consequences of not interpreting structure coefficients in published CFA research: A reminder. Structural Equation Modeling, 10, 142-152.

Henson, R. K., & Roberts, J. K. (2006). Use of exploratory factor analysis in published research: Common errors and some comment on improved practice. Educational and Psychological Measurement, 66, 393-416.

Hoyle, R. H., & Duvall, J. L. (2004). Determining the number of factors in exploratory and confirmatory factor analysis. In D. Kaplan (Ed.), The SAGE handbook of quantitative methodology for social sciences (pp. 301-315). Thousand Oaks, CA: Sage.

Jöreskog, K. G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34, 183-202.

Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.

Rummel, R. J. (1970). Applied factor analysis. Evanston, IL: Northwestern University Press.

Russell, D. W. (2002). In search of underlying dimensions: The use (and abuse) of factor analysis in Personality and Social Psychology Bulletin. Personality and Social Psychology Bulletin, 28(12), 1629-1646.

Spearman, C. (1904). General intelligence, objectively determined and measured. American Journal of Psychology, 15, 201-293.

Wu, A. D., Li, Z., & Zumbo, B. D. (2007). Decoding the meaning of factorial invariance and updating the practice of multi-group confirmatory factor analysis: A demonstration with TIMSS data. Practical Assessment Research & Evaluation, 12(2). Available online: http://pareonline.net/genpare.asp?wh=0&abt=12

Zumbo, B. D. (2007). Validity: Foundational issues and statistical methodology. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics, Vol. 26: Psychometrics (pp. 45-79).

Zumbo, B. D., Sireci, S. G., & Hambleton, R. K. (2003, April). Re-visiting exploratory methods for construct comparability and measurement invariance: Is there something to be gained from the ways of old? Paper presented at the Annual Meeting of the National Council for Measurement in Education (NCME), Chicago, Illinois.

Chapter Two: What Has the Literature Recommended for Interpreting Factor Solutions?

This chapter reviews the commonly recommended strategies for interpreting factor solutions.
In particular, it describes the difficulties in interpreting an oblique factor model. This chapter also critiques and explains why the current recommendation of juxtaposing both the pattern and structure coefficients to overcome the difficulties in interpreting an oblique factor model is insufficient and problematic.¹

¹ A version of this chapter will be submitted for publication. Wu, A. D., Zumbo, B. D., & Hubley, A. Common interpretational problems with oblique factor models.

2.1 Review of Recommendations for Interpreting a Unidimensional Factor Solution

In a factor analysis, the researchers are interested in making two distinct inferences based on the data. The first is a logical inference that involves assigning meaning to the factors and then making causal inferences about the factors (Cattell, 1952; Kim & Mueller, 1978; Rummel, 1970). The second is a statistical inference that involves generalizing the first type of inference to the population based on a given sample (Kim & Mueller, 1978).

The fundamental theory of factor analysis shown in equation (1.1) is, in essence, a directional model and implies, at least theoretically, that the factors are the "hidden underlying causes" of the variation in the observed variables (Borsboom et al., 2003; 2004; Burt, 1940; Hoyle & Duvall, 2004; Rummel, 1970; Spearman, 1904; Zumbo, 2007). In a path diagram, this directional relationship is represented by the arrows going from the factors to the observed variables. The regression weight in equation (1.1) indicates the change in the observed score per unit change in the factor score. The term loading matrix refers to such an array of regression coefficients of the p factors on the q observed variables. Unlike typical directional inferences, however, the causal interpretation is complicated by the fact that the meaning of the underlying causes, the factors, is unknown to the investigator. Factors are merely latent variables of mathematical creation, which do not automatically have substantive meaning.
To make a directional interpretation, the meaning of a factor must be fairly well-known to the researchers. Traditionally, the meaning of a factor is inferred from the common meaning of the observed variables that load "meaningfully" (or "saliently") on that factor, and accordingly, a heuristic label is assigned to the factor. In this dissertation, we refer to the task of assigning meaning to factors as vertical interpretation and the task of making causal inferences about factors as horizontal interpretation. These two interpretational perspectives will be discussed in detail later in this chapter. For both perspectives, the first interpretation challenge is to decide on what constitutes a "meaningful" loading.

As with any statistic, factor loadings can be impressive by chance (Cudeck & O'Dell, 1994; Gorsuch, 1983; Harman, 1976; Horn, 1967; Humphreys, Ilgen, McGrath, & Montanelli, 1969; Kim & Mueller, 1978). One of the approaches to deciding whether a given loading is meaningful is to use inferential statistics, that is, testing the hypothesis that the loading is significantly different from a particular value, typically zero (Archer & Jennrich, 1973; Cliff & Hamburger, 1967; Cudeck & O'Dell, 1994; Harman, 1976; Henrysson, 1950; Jennrich, 1973). The advantage of this approach is that it relies on formal hypothesis testing and allows the construction of confidence intervals for the loadings.

The use of hypothesis testing is, however, limited by its technical difficulties. The estimation of the standard errors of the loadings is a complex function of the sample size, the number of factors and observed variables, the estimation method, the rotation procedure, and the correlation among the factors (Cliff & Hamburger, 1967; Cudeck & O'Dell, 1994; Gorsuch, 1983). Take sample size, for instance: a given loading could be considered significantly different from zero with a large sample size but not significantly different from zero with a small sample size.
In other words, the limitation of using hypothesis testing as the criterion for being "meaningful" is that it can detect a trivial departure from zero if the sample size is large or fail to detect a meaningful relationship when the sample size is small. In addition, investigators often are more interested in what constitutes a "practically significant" loading than in whether the loading is different from zero.

As a response to the estimation difficulty and lack of practical usefulness of hypothesis testing, the literature has suggested several rules for the minimum cut-off, beyond which a loading is considered salient enough to be practically meaningful (Cudeck & O'Dell, 1994; Gorsuch, 1983; Rummel, 1970). A salient loading is one that is sufficiently high to assume that a meaningful relationship exists between the observed variable and the factor (Gorsuch, 1983).

Today, there are several commonly adopted rules among practitioners. Harman (1967, p. 435) provided a table of standard errors for loadings with sample sizes ranging from 20 to 500 and $r_m$ ranging from .10 to .75, where $r_m$ is the average correlation in the observed correlation matrix. In defining a "meaningful loading", Comrey and Lee (1992) used "variance explained" criteria for deciding the importance of factor loadings: loadings in excess of 0.71 (i.e., 0.71 × 0.71 ≈ 50% of common variance) are considered excellent, 0.63 (i.e., 40% of common variance) very good, 0.55 (i.e., 30% of common variance) good, 0.45 (i.e., 20% of common variance) fair, and 0.32 (i.e., 10% of common variance) poor. They concluded that only variables with loadings of 0.32 or greater should be interpreted.

A caveat for any cut-off is that the use of the same lower bound for all loadings in a factor solution may be problematic. That is, it ignores the fact that for a given sample, different loading estimates have different sampling variability, not to mention sample-to-sample variation (Cudeck & O'Dell, 1994).
For this reason, based on an empirical review, Gorsuch (1983) suggested that a minimum loading of 0.3 for an orthogonal solution is sufficient to be considered significantly different from zero for a sample size of 175, but 0.4 for a sample size of 100. Considering the sampling variation, Gorsuch (1983) also provided a rough guide for salient loadings by doubling the Pearson correlation required for statistical significance given the sample size. For example, for a sample size of 100, a factor loading of 0.4 is the necessary minimum for being considered salient because a Pearson correlation of 0.2 is the necessary minimum for being considered statistically significant for the same sample size. Despite these rules, the choice of minimum cut-off is often left to the discretion of the researchers, and the only consensus is "the greater the loading is, the better."

Table 2.1 summarizes a sample of 40 scholarly entries on the method of factor analysis from the inception of oblique factor analysis (1940s) to date. It includes 20 textbooks devoted entirely to the topic of factor analysis (including one monograph). It also includes 11 textbooks that devoted at least one chapter to the method of factor analysis; among them, nine are multivariate statistics textbooks, one is a psychometrics textbook, and one is a latent variable modeling textbook. All textbooks were borrowed from the libraries of the University of British Columbia and Simon Fraser University and were those available at the time of the search. The other nine entries are journal articles that provided reviews or guidelines on the use of factor analysis. Journal articles were retrieved from the EBSCO database by simultaneously searching the key words "factor analysis" and "review". The first column of Table 2.1 lists the author's name and the publication year. The second column specifies the type of scholarly work. The third column lists the minimum cut-off value practiced or reported by the authors.
The last three columns list other interpretational practices, which Section 2.3 of this chapter will address. From column three, one can see that among the 40 entries, 13 did not use any cut-off for loading interpretation; the minimum cut-off reported by the other 27 works ranges from 0.10 to 0.50, with values of 0.3 or 0.4 being most popular.

Table 2.1 Review of Recommendation and Practice for Interpreting a Factor Solution

| Author | Type of literature | Minimum cut-off accepted | Differentiated P & S? | P or S interpreted? | Juxtaposed P & S? |
|---|---|---|---|---|---|
| Burt (1940) | FA textbook | No | Yes | Pattern | |
| Thurstone (1947) | FA textbook | No | Yes | Pattern | |
| Cattell (1952) | FA textbook | 0.1 or 0.13 | Yes | Pattern | |
| Fruchter (1954) | FA textbook | 0.3 | Yes | Pattern | |
| Horst (1965) | FA textbook | No | No | Orthogonal | |
| Harman (1967) | FA textbook | 0.3 or 0.4 | Yes | Both | Yes |
| Rummel (1970) | FA textbook | 0.5 | Yes | Both | Yes |
| Guertin & Bailey (1970) | FA textbook | 0.3 or 0.4 | Yes | Both | |
| Lawley & Maxwell (1971) | FA textbook | No | No | Pattern | |
| Mulaik (1972) | FA textbook | No | Yes | Both | |
| Kim & Mueller (1978) | FA monograph | 0.3 | Yes | Both | Yes |
| Nunnally (1978) | Psychometrics textbook | No | Yes | Both | Yes |
| Cattell (1978) | FA textbook | 0.3 or 0.4 | No | Both | |
| Press (1982) | MV stats textbook | 0.168 | No | Pattern | |
| Gorsuch (1983) | FA textbook | 0.3 or 0.4 | Yes | Both | Yes |
| Cureton & D'Agostino (1983) | FA textbook | 0.2 | Yes | Pattern | Yes |
| McDonald (1985) | FA textbook | 0.29 | Yes | Pattern | |
| Ford et al. (1986) | Review/guideline article | 0.15 | No | No indication | |
| Yates (1987) | MV stats textbook | 0.23 | Yes | Both | Yes |
| Comrey & Lee (1992) | FA textbook | 0.3 | Yes | Both | |
| Floyd & Widaman (1995) | Review/guideline article | 0.3 or 0.4 | No | No indication | |
| Kline (1993) | FA textbook | 0.3 | Yes | Structure | |
| Ferguson & Cox (1993) | Review/guideline article | 0.4 | No | Pattern | |
| Basilevsky (1994) | FA textbook | 0.2 or 0.3 | Yes | Both | |
| Stevens (1996) | MV stats textbook | 0.4 | Yes | Orthogonal | |
| Thompson (1997) | Journal guideline article | No | Yes | Both | Yes |
| Loehlin (1998) | Latent variable textbook | No | Yes | Both | Yes |
| Fabrigar et al. (1999) | Review/guideline article | 0.3 or 0.4 | No | Pattern | |
| Cudeck (2000) | MV stats book chapter | 0.3 | Yes | Both | Yes |
| Russell (2002) | Review/guideline article | No | Yes | Both | Yes |
| Timm (2002) | MV stats textbook | No | Yes | Orthogonal | |
| Johnson & Wichern (2002) | MV stats textbook | 0.416 | No | Orthogonal | |
| Preacher & MacCallum (2003) | Review/guideline article | No | No | No indication | |
| Pett et al. (2003) | FA textbook | 0.4 | Yes | Both | |
| Conway & Huffcutt (2003) | Review/guideline article | No | No | No indication | |
| Giri (2004) | MV stats textbook | No | No | No indication | |
| Brown (2006) | FA textbook | 0.3 or 0.4 | Yes | No indication | |
| Hair et al. (2006) | MV stats textbook | 0.3 | No | Pattern | |
| Henson & Roberts (2006) | Review/guideline article | 0.4 | Yes | Orthogonal | |
| Tabachnick & Fidell (2007) | MV stats textbook | 0.3 | Yes | Pattern | |

P: pattern matrix; S: structure matrix; MV: multivariate; FA: factor analysis. The list is temporally ordered.

2.2 Complexities in Interpreting a Multidimensional Factor Model

When a factor solution is multidimensional, the approach described in Section 2.1 for interpreting factor loadings is further complicated, especially when the factors are allowed to correlate. This section explicates two broad complexities under multidimensionality. The first complexity involves the vertical vs. horizontal interpretation and factorial complexity, and the second involves factor rotation. In particular, the discussion about factor rotation details three interpretational problems that may arise from an oblique factor rotation.

Complexity One: Vertical vs. Horizontal Interpretation and Factorial Complexity

A unidimensional factor model contains a column of loadings of the single factor on the q observed variables.
In contrast, a multidimensional factor model contains a loading matrix L for the p multiple factors on the q observed variables. Properly understanding the multiple-factor, multiple-variable structure entails two perspectives in interpreting L (Burt, 1940; Cattell, 1952; 1957; Gorsuch, 1983; Rummel, 1970), which we shall term the vertical interpretation and the horizontal interpretation.

The vertical interpretation reads along the q observed variables one factor at a time. The nature of the vertical interpretation is descriptive and classificatory (Cattell, 1952; 1957; Gorsuch, 1983; Rummel, 1970). The major purpose is to (1) uncover the factor structure by summarizing and categorizing the complex interrelationships in the data and (2) understand the substantive meaning of the factors. The vertical interpretation is most germane when the task of the factor analysis is (1) to assign labels that best characterize the substantive meaning of the factors or (2) to form subscales among the set of observed items.

In contrast, the horizontal interpretation reads across the p factors one observed variable at a time. Horizontal interpretation considers factors as the underlying causes that explain the variation in the observed variables, as implied by the term "factor" used conventionally in experimental research design (Burt, 1947; Cattell, 1952; 1957; Gorsuch, 1983; Hoyle & Duvall, 2004; Rummel, 1970). Factor analysis is believed to delineate the causal nexus. The causal approach to factor interpretation is to "impute substantive form to the underlying and unknowns" (Rummel, 1970, p. 476). This interpretation perspective resonates with Spearman's (1904) original work wherein he referred to factors as 'the hidden underlying causes'. As Hoyle & Duvall (2004) stated,

[...] the sets of observed variables believed to be caused, at least in part, by one or more factors.
The patterns of association are conveyed in matrices of covariances or correlation, and to the extent that the associations among the observed variables are near zero when the influence of the factors is taken into account. (p. 301)

Many of the early applications of factor analysis intended to identify these 'underlying factors'. These historical endeavours are summarized in Burt's (1966) work. The horizontal interpretation is particularly useful for test validation purposes where the researcher examines whether the hypothesized constructs have indeed caused the variation in people's item responses (Borsboom, 2003; 2004; Zumbo, 2007). Namely, the horizontal interpretation examines whether an item measures what it purports to measure (Borsboom, 2003; 2004; Kelley, 1927; Nunnally, 1978; Zumbo, 2007).

In order to make meaningful causal statements about factors, the substantive meaning of the factors should, at the very least, be fairly well-known to the researchers. Thus, the horizontal interpretation is most appropriate when the vertical interpretation has been demonstrated to be intelligible and meaningful by the present or previous data (Burt, 1947; Gorsuch, 1983). For this reason, the vertical interpretation should, in principle, precede the horizontal interpretation for a given data set, unless the meaning of the factors is already designated and verified in previous studies.

When interpreting horizontally, an observed variable may yield salient loadings on more than one factor, a condition referred to as factorial complexity (Kim & Muller, 1978), cross loading (Gorsuch, 1983; Rummel, 1970) or cooperative factors (Cattell, 1952; 1978). That is, a given observed variable is not a pure measure of one factor but also of the others.

Factorial complexity complicates the interpretation because researchers have to determine which factors should be recognized as more important causes for the variation in an observed variable.
Factorial complexity can also make vertical interpretation less straightforward because an item may be involved in defining the meaning of multiple factors or be categorized into multiple subscales. Similarly, the task of identifying a "meaningful factor structure" from L is no longer as clear-cut as it is in the scenario without cross-loadings (Church & Burke, 1994; Kim & Muller, 1978). In order to minimize the interpretational difficulty due to factorial complexity, the initial orientation of the factors is often rotated to target the simple structure proposed by Thurstone (1947).

Thurstone's simple structure targets a solution where each factor is defined by a non-overlapping subset of observed variables that load higher relative to the rest of the measured variables and each observed variable loads on as few factors as possible (Browne, 2001; Cattell, 1952; Gorsuch, 1983; Kim & Muller, 1978; Nunnally, 1978).² An ideal realization of the simple structure is a loading matrix with no factorial complexities, where each observed variable loads on only one factor with zero loadings on the others. Although factor rotation was developed to simplify factorial complexities in a multidimensional factor solution, it inevitably creates other complexities along the way.

² In his original book, Thurstone (1947) proposed five specific criteria for the simple structure.

Complexity Two: Factor Rotation

A number of rotation procedures have been developed to capture the simple structure (see reviews in Browne, 2001; Gorsuch, 1983). These rotation techniques are often classified into two types: orthogonal or oblique. Orthogonal rotation constrains the angles among the factors to be 90 degrees in the multidimensional space, hence yielding uncorrelated factors. Today, the most dominant method for orthogonal rotation is probably the VARIMAX procedure (Kaiser, 1958).
When the factors are orthogonal, interpretation of the loadings is inherently straightforward and simple because the loadings represent both the unique effects of the factors on the observed variables and the unique bivariate zero-order correlations between the factors and the observed variables. Furthermore, the interpretation of the loading matrix is simplified by two mathematical properties (Burt, 1940; Fruchter, 1954; Nunnally, 1978), which we shall refer to as the horizontal and vertical additive properties.

The horizontal additive property affirms that the observed variance explained by a given factor is equal to the squared loading of that factor; hence, the communality of a given observed variable is equal to the sum of the squared loadings across the p factors (Gorsuch, 1983; Harman, 1976; Kim & Muller, 1978; Kline, 1993; Nunnally, 1978). Under orthogonality, calculation of the communality of an observed variable is a simple exercise of adding the squared loadings horizontally. The unique variance of an observed variable, therefore, is equal to one minus the sum of the squared loadings.

For example, if an item loads on factors F1 and F2 with loadings of 0.7 and 0.2 respectively, F1 would contribute 0.7² = 0.49 (49%) of the variance, and F2 would contribute 0.2² = 0.04 (4%) of the variance of that item. The communality would be equal to 0.49 + 0.04 = 0.53, and the unique variance would be equal to 1 - 0.53 = 0.47. Because of this additive property of calculating the communality, the contribution of a given factor to the observed variance of an item is readily attributable to its squared loading: 49% to F1 and 4% to F2. Also, the contribution of a given factor to the standardized communality (i.e., the communality rescaled to 1 by dividing it by itself) is readily attributable to the ratio of the squared loading to the communality.
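The arithmetic of this example can be checked with a minimal Python sketch. It assumes orthogonal factors and standardized variables, as in the text; the function name `horizontal_partition` is ours and is introduced only for illustration.

```python
def horizontal_partition(loadings):
    """Partition one item's variance under ORTHOGONAL factors.

    Returns the variance explained by each factor (squared loadings),
    the communality, the unique variance, and each factor's share of
    the communality.
    """
    explained = [l ** 2 for l in loadings]
    communality = sum(explained)
    uniqueness = 1.0 - communality
    shares = [e / communality for e in explained]
    return explained, communality, uniqueness, shares


# Loadings of the example item on F1 and F2.
explained, h2, u2, shares = horizontal_partition([0.7, 0.2])

print([round(e, 2) for e in explained])      # variance explained by F1, F2
print(round(h2, 2), round(u2, 2))            # communality, unique variance
print([round(100 * s, 1) for s in shares])   # percent of the communality
```

The same function applies to any row of an orthogonal loading matrix; it is not valid for pattern or structure coefficients of an oblique solution.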
Using the same example, F1 contributes 0.49/0.53 = 92.5% of the standardized communality, and F2 contributes 0.04/0.53 = 7.5% of the standardized communality.

The ease of orthogonal interpretation due to the horizontal additive property is analogous to that of a multiple regression when the independent variables are uncorrelated. The contribution of each independent variable is directly given by the square of its standardized partial regression coefficient. The R-squared value is the sum of the squared standardized partial regression coefficients across all the independent variables. Such interpretational simplicity of orthogonality transplants to factor analysis naturally because factors are, in essence, the independent variables for the observed variables.

The vertical additive property affirms that the amount of total variance explained by a given factor is equal to the sum of the squared loadings along the q observed variables (Guertin & Bailey, 1970; Kline, 1993; Nunnally, 1978). The total variance is the sum of the standardized variances of all the observed variables, which is equal to q (one for each of the q observed variables). For example, suppose two factors are extracted for a dataset with six observed variables, which load on F1 with values of 0.3, 0.4, 0.6, 0.7, 0.5, and 0.2. The amount of total variance, 6, that is accounted for by F1 will be equal to (0.3)² + (0.4)² + (0.6)² + (0.7)² + (0.5)² + (0.2)² = 1.39. The percentage of the total variance explained by F1 would be equal to 23.17% (1.39/6). In addition, the amount of total variance explained jointly by the bi-factor model can be simply calculated by summing the amount of total variance explained by F1 and F2.

The vertical and horizontal additive properties inherent in orthogonality are illustrated with real data taken from Holzinger and Swineford (1939). These data consist of a variety of 24 psychological ability tests of junior high school students.
These classic data have been used throughout the history of factor analysis and are among the most widely studied in the literature (see Appendix A for the names of the 24 tests). The data have been shown to consist of four dimensions by various experts in factor analysis (e.g., Browne, 2001; Gorsuch, 1983; Harman, 1976; Preacher & MacCallum, 2003; Tucker & Lewis, 1973). The four-factor model explained 11.032 (46%) of the standardized total variance (i.e., 24).

Column 2 of Table 2.2 shows the communalities for the 24 tests, denoted as h²; columns 3 to 6 show the orthogonal loading matrix, denoted as L; columns 7 to 10 show the corresponding squared values of the loading matrix, denoted as L², which is the amount of variance in the observed variables explained by each of the four factors. The last column shows the sum of the squared loadings across the four factors, denoted as Σ_F L². Because of the horizontal additive property, the values of Σ_F L² for each of the 24 tests are all equal to their corresponding communalities.

In addition, the second last row of columns 7 to 11 shows the sums of squared loadings along the 24 tests for each of the four factors, denoted as Σ_T L². Because of the vertical additive property, these values correspond to the amount of total variance that is explained by each of the four factors. The last row of columns 7 to 11 shows the percentage of total variance explained by each of the factors, denoted as %(F). For example, factor one accounted for 4.08 of the total variance, which is 4.08/24 = 17% of the total variance.
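The vertical additive property can be sketched in the same way. The snippet below is an illustration only: it recomputes the six-variable example given earlier and then converts the four column sums reported for Table 2.2 (4.08, 2.71, 2.18, 2.06) into percentages of the total variance of 24. The function name `vertical_partition` is hypothetical.

```python
def vertical_partition(column_loadings, n_variables=None):
    """Total variance explained by ONE orthogonal factor.

    column_loadings: loadings of all observed variables on that factor.
    n_variables: total variance q (defaults to the number of loadings).
    Returns (sum of squared loadings, percent of total variance).
    """
    q = n_variables if n_variables is not None else len(column_loadings)
    explained = sum(l ** 2 for l in column_loadings)
    return explained, 100.0 * explained / q


# Six-variable illustration: loadings on F1.
ss_f1, pct_f1 = vertical_partition([0.3, 0.4, 0.6, 0.7, 0.5, 0.2])
print(round(ss_f1, 2), round(pct_f1, 2))

# Column sums of squared loadings as reported in Table 2.2 (q = 24 tests).
table_2_2_sums = [4.08, 2.71, 2.18, 2.06]
pct_by_factor = [100.0 * s / 24 for s in table_2_2_sums]
total_explained = sum(table_2_2_sums)
print([round(p, 1) for p in pct_by_factor], round(total_explained, 2))
```

Because the factors are orthogonal, the per-factor amounts can simply be added to obtain the variance explained by the whole model.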
The percentage of total variance jointly explained by the four factors adds up to 46%.

Table 2.2 Vertical and Horizontal Additive Properties of the Orthogonal Factor Model

Test | (2) h² | (3) L F1 | (4) L F2 | (5) L F3 | (6) L F4 | (7) L² F1 | (8) L² F2 | (9) L² F3 | (10) L² F4 | (11) Σ_F L²
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
T1 | .47 | .24 | .61 | .14 | .15 | .06 | .37 | .02 | .02 | .47
T2 | .28 | .09 | .52 | -.03 | .04 | .01 | .27 | .00 | .00 | .28
T3 | .22 | .15 | .43 | .10 | -.01 | .02 | .19 | .01 | .00 | .22
T4 | .43 | .00 | .62 | .13 | .18 | .00 | .39 | .02 | .03 | .43
T5 | .70 | .81 | .13 | .15 | .01 | .66 | .02 | .02 | .00 | .70
T6 | .68 | .79 | .17 | .11 | .15 | .62 | .03 | .01 | .02 | .68
T7 | .77 | .87 | .12 | .08 | .03 | .75 | .01 | .01 | .00 | .77
T8 | .56 | .69 | .22 | .12 | .13 | .48 | .05 | .02 | .02 | .56
T9 | .72 | .80 | .24 | .09 | .11 | .65 | .06 | .01 | .01 | .72
T10 | .64 | .08 | -.10 | .78 | .14 | .01 | .01 | .60 | .02 | .64
T11 | .45 | .26 | .13 | .55 | .26 | .07 | .02 | .30 | .07 | .45
T12 | .45 | .05 | .19 | .64 | .05 | .00 | .04 | .41 | .00 | .45
T13 | .41 | .11 | .36 | .51 | .09 | .01 | .13 | .26 | .01 | .41
T14 | .45 | .14 | .03 | .01 | .65 | .02 | .00 | .00 | .43 | .45
T15 | .34 | -.05 | .13 | .03 | .57 | .00 | .02 | .00 | .32 | .34
T16 | .42 | .13 | .38 | .11 | .50 | .02 | .14 | .01 | .25 | .42
T17 | .38 | .04 | -.03 | .30 | .54 | .00 | .00 | .09 | .29 | .38
T18 | .26 | .09 | .14 | .21 | .43 | .01 | .02 | .05 | .18 | .26
T19 | .23 | .22 | .20 | .11 | .36 | .05 | .04 | .01 | .13 | .23
T20 | .38 | .30 | .46 | .03 | .27 | .09 | .22 | .00 | .07 | .38
T21 | .42 | .25 | .41 | .38 | .20 | .06 | .17 | .14 | .04 | .42
T22 | .44 | .46 | .44 | .10 | .17 | .21 | .19 | .01 | .03 | .44
T23 | .53 | .37 | .55 | .21 | .21 | .14 | .30 | .04 | .05 | .53
T24 | .42 | .40 | .21 | .37 | .28 | .16 | .05 | .14 | .08 | .42
Σ_T L² | | | | | | 4.08 | 2.71 | 2.18 | 2.06 | 11.03
%(F) | | | | | | 17.00 | 11.30 | 9.10 | 8.60 | 46.00

Note. h² denotes the communality of a given test. Σ_F L² is the sum of the squared loadings for a given test across the four factors. Σ_T L² is the sum of the squared loadings for a given factor along the 24 tests. %(F) denotes the percentage of total variance explained by a given factor.

Orthogonal rotation, however, has its limitations. First, the constraint of zero correlation among the factors appears unrealistic because there is considerable theoretical and empirical support for believing that most constructs in social and behavioural science are correlated. This belief is based on the supposition that factors possessed by an individual will typically be correlated.
For example, while reading ability and mathematical ability are distinct attributes, they are certainly correlated to some degree. Also, factors are rarely uncorrelated in real data collected from day-to-day research. Second, because an orthogonal rotation forces the factors to be oriented at right angles to one another, there is less flexibility in approaching the simple structure. Third, if there were true associations among the factors, information on these relationships would be omitted by the orthogonality constraint, which may lead to biased loading estimates.

In spite of its limitations, orthogonal rotation has been the preference of many factor analysis users, largely because of its ease of interpretation due to the additive properties (Browne, 2001; Conway & Huffcutt, 2003; Fabrigar et al., 1999; Harman, 1967; Henson & Roberts, 2006; Horst, 1965; Kieffer, 1998; Nunnally, 1978; Preacher & MacCallum, 2003; Rummel, 1970) and also because it is the default option of many popular statistical packages (Browne, 2001; Conway & Huffcutt, 2003; Fabrigar et al., 1999; Preacher & MacCallum, 2003). In fact, one can see the preference for orthogonal rotation even in cases where it is a priori tested and suggested that the factors are correlated. In this case, there is clearly a mismatch between the orthogonal factor solution and the data at hand. Many methodologists have been warning against the thoughtless or unjustified use of orthogonal rotation and have been advocating the primary use of oblique rotation (Church & Burke, 1994; Cudeck, 2000; Fabrigar et al., 1999; Floyd & Widaman, 1995; Henson & Roberts, 2006; Preacher & MacCallum, 2003; Thurstone, 1947).

Oblique rotation allows the angles among the factors to be less than or greater than 90 degrees, hence yielding correlated factors. Today, several equally popular oblique procedures are available, such as the Direct QUARTIMIN (Jennrich & Sampson, 1966) and the PROMAX (Hendrickson & White, 1964).
There are several advantages in applying oblique rotation procedures. Below, we summarize some suggested in the literature:

■ An oblique rotation is more likely to approximate the simple structure by allowing flexibility in the orientation of factors (Browne, 2001; Cudeck, 2000; Gorsuch, 1983; Thurstone, 1947).
■ Although the purpose of factor analysis is to identify distinctive factors underlying a set of observed variables, many constructs in social and behavioural science may be better characterized as a continuum of distinction rather than as independent entities (Church & Burke, 1994; Floyd & Widaman, 1995; Preacher & MacCallum, 2003).
■ Orthogonality is a proposition to evaluate, not a fact to believe. An oblique rotation allows the factor correlations to be evaluated; if the factors are virtually orthogonal, the oblique rotation will return, in essence, an orthogonal solution. In such a case, an orthogonal constraint could follow subsequently for parsimony (Henson & Roberts, 2006).
■ Factor correlations reveal the psychological relationships of the underlying traits and provide information for identifying second-order factors (Cattell, 1978).
■ Technically, the orthogonality constraint can create biased loading estimates and a problem with under-identification in confirmatory factor analysis (Floyd & Widaman, 1995; Kenny & Kashy, 1992).

In promoting the use of oblique rotation, Cattell (1978) stated,

The reason begins with the fact that we should not expect influences [factors] in a common universe to remain mutually uninfluenced and uncorrelated. To this we can add an unquestionable statistical argument, namely, that if factors are by some rules uncorrelated in the total population they would nevertheless be correlated (oblique) in the sample just as any correlation that is zero in the population has a non-zero value in any sample. (p. 128)

Three Interpretational Complexities Due to Factor Obliquity

Although allowing inter-correlations among factors is theoretically and empirically more justifiable, factor obliquity generates more interpretational complexities. Factor correlations not only make it more difficult to estimate the standard errors of the loadings for hypothesis testing (Archer & Jennrich, 1973; Jennrich, 1973) but may also invalidate the cut-off criteria that are commonly and habitually applied to infer "meaningful loadings". Three major interpretational complexities with these traditional cut-off criteria may occur when the factors are correlated.

1. Inconsistency between P and S

An oblique factor model yields two distinctive types of parameters of interpretational interest: the pattern coefficients and the structure coefficients. The pattern coefficients are the standardized partial regression weights assigned to each of the factors to yield the prediction of the observed scores (i.e., loadings). They reflect the unique and directional effect (i.e., the change in the observed score per change in the factor score) by taking into account the overlapping relationships among the factors. The structure coefficients are the zero-order correlations between the observed variables and the factors, indicating the bi-directional relationship. The structure coefficients are analogous to the bivariate Pearson correlations without isolating the overlapping relationships among the factors. The matrix of pattern coefficients is often denoted as P, that of the structure coefficients as S, and that of the factor correlations as R. Both P and S are of size q × p, and R is of size p × p. The relationship between P, S, and R is given as,

$$S_{q \times p} = P_{q \times p} R_{p \times p} \qquad (2.1)$$

Equation (2.1) indicates that the structure matrix is equal to the pattern matrix post-multiplied by the factor correlation matrix.
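A minimal Python sketch of equation (2.1) follows. The pattern matrix and the factor correlation of 0.5 are hypothetical values chosen for illustration, not estimates from any data set discussed here, and `matmul` is a small helper written for this sketch.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Hypothetical pattern matrix P (q = 3 observed variables, p = 2 factors).
P = [[0.70, 0.10],
     [0.60, 0.00],
     [0.05, 0.80]]

# Hypothetical factor correlation matrix R (p x p).
R = [[1.0, 0.5],
     [0.5, 1.0]]

# Structure matrix from equation (2.1): S = P R.
S = matmul(P, R)
for row in S:
    print([round(s, 3) for s in row])

# With uncorrelated factors (R = identity), S reduces to P.
identity = [[1.0, 0.0],
            [0.0, 1.0]]
S_orthogonal = matmul(P, identity)
```

In this made-up case the second variable has a pattern coefficient of zero on F2, yet its structure coefficient on F2 is 0.30, which arises entirely from the correlation between the two factors.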
A detailed account of the meanings and relationships among P, S, and R can be easily found in the factor analysis literature (e.g., Gorsuch, 1983; Harman, 1967; Kim & Muller, 1978; Thompson & Daniel, 1996; Rummel, 1970).

Unfortunately, the interpretational ease that comes with the additive properties of orthogonality does not automatically transplant to an oblique solution.

From equation (2.1), one can see that P will be equal to S only when R is an identity matrix, with all off-diagonal elements equal to zero, indicating no correlations among the factors. When the factors are correlated, there are always some discrepancies between P and S, depending on the magnitude of the correlations among the factors and their relationships with the observed variables. The generic and convenient term "loading" can no longer be used indiscriminately because it cannot synonymously designate the pattern and structure coefficients (Courville & Thompson, 2001; Henson, 2002; Henson & Roberts, 2006; Kim & Muller, 1978; Thompson & Borrello, 1985).

The pattern coefficient, as in the orthogonal case, still reflects the unique directional effect because it isolates the effect of other factors in the model. However, the structure coefficient no longer reflects the unique bi-directional relationship between an observed variable and a factor. This is because structure coefficients are zero-order correlations that carry overlapping relationships rendered by the inter-correlations among the factors. The structure coefficient will always be an overestimate of the unique association between a variable and a factor, say Y1 and F1, because the correlation between Y1 and F1 may be partly or solely due to both Y1 and F1 being related to F2. Once F2 is removed, the correlation between F1 and Y1 may decrease or vanish.
Gorsuch (1983) stated,

The structure coefficients do not reflect the independent contribution because the correlation between a variable and a factor is a function not only of its distinctive variance but also of all the variance of the factors that overlaps with the other factors. (p. 207)

Because pattern coefficients and structure coefficients can yield rather different values, they can lead to rather inconsistent interpretations about a factor solution.

2. Distortion of Additive Properties

Unlike orthogonal loadings that exhibit both additive properties, pattern coefficients exhibit only the vertical additive property. More adversely, the structure coefficient exhibits neither additive property (Nunnally, 1978; Rummel, 1970).

When factors are correlated, the horizontal additive property is distorted. The squared pattern coefficient no longer reflects the amount of variance explained by a given factor, and the sum of the squared pattern coefficients across the p factors is no longer equal to the communality. The distortion can be explained by the variance of the predicted score $\hat{Y}$, which itself is a linear combination of factor scores as in equation (1.1). Taking a bi-factor model for instance, $\hat{Y}$ is a linear combination such that $\hat{Y} = \beta_1 F_1 + \beta_2 F_2$, where $\beta_1$ and $\beta_2$ are the pattern coefficients for F1 and F2. The variance of the linear combination (i.e., communality), denoted as $\mathrm{Var}(\hat{Y})$ or equivalently $\mathrm{Var}(\beta_1 F_1 + \beta_2 F_2)$, is given as,

$$\mathrm{Var}(\hat{Y}) = \beta_1^2 \mathrm{Var}(F_1) + \beta_2^2 \mathrm{Var}(F_2) + 2\beta_1\beta_2 \mathrm{Cov}(F_1, F_2) \qquad (2.2)$$

$$= \beta_1^2 + \beta_2^2 + 2\beta_1\beta_2 R \qquad (2.3)$$

Because the variance of a factor score is often scaled to be one, equation (2.2) can be simplified to (2.3), where R is the Pearson correlation between F1 and F2.
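A small numerical sketch of equation (2.3) may help fix ideas. The pattern coefficients of 0.6 and 0.4 and the three factor correlations are hypothetical values chosen for illustration, and the function name is ours.

```python
def communality_two_factors(b1, b2, r):
    """Equation (2.3): variance of b1*F1 + b2*F2 for unit-variance factors
    whose correlation is r."""
    return b1 ** 2 + b2 ** 2 + 2 * b1 * b2 * r


b1, b2 = 0.6, 0.4           # hypothetical pattern coefficients
naive = b1 ** 2 + b2 ** 2   # sum of squared pattern coefficients only

# Compare the communality with the naive sum for three factor correlations.
results = {}
for r in (0.5, 0.0, -0.5):
    results[r] = communality_two_factors(b1, b2, r)
    print(f"R = {r:+.1f}: communality = {results[r]:.2f}, "
          f"sum of squared pattern coefficients = {naive:.2f}")
```

With these values the sum of squared pattern coefficients stays at 0.52 in every case, whereas the communality moves with the factor correlation and coincides with 0.52 only when R = 0.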
Assuming $\beta_1$ and $\beta_2$ are of the same sign, calculating the communality by summing the squared pattern coefficients, that is, adding only the first two terms of (2.3), will yield an underestimate if the two factors are positively correlated (i.e., R is positive) because it fails to add the value of $2\beta_1\beta_2 R$, or an overestimate if the two factors are negatively related because it fails to subtract the magnitude of $2\beta_1\beta_2 R$. The above example is a simple case of a bi-factor model. Evidently, the calculation of the communalities of a model with multiple oblique factors is even more convoluted, and the distortion of the horizontal additive property is further worsened. Thus, when the factors are correlated, the squared pattern coefficients do not actually partition the communality and their cross-factor sum does not give the communality, despite the fact that the pattern coefficients do take into account the inter-correlations among the factors.

Note that if the two factors are orthogonal, the communality will be equal to $\beta_1^2 + \beta_2^2$ because the third term in (2.3) drops out of the equation as a result of the fact that R = 0. This is why, when the factors are orthogonal, the communality is simply the sum of the squared loadings, which synonymously designate the pattern and structure coefficients (Gorsuch, 1983; Harman, 1976; Kim & Muller, 1978; Mulaik, 1972; Nunnally, 1978). Taking the 24 tests used for the orthogonal model as an example, Table 2.3 demonstrates the distorted horizontal additive property of the pattern coefficients due to factor obliquity. Column 2 shows the communalities for the 24 tests, denoted as h², which remain the same as those of the orthogonal model in Table 2.2. Columns 3 to 6 show the pattern matrix P; columns 7 to 10 show the corresponding squared pattern matrix, denoted as P². Under obliquity, P² no longer represents the amount of observed variance explained by each of the four factors.
The last column, denoted as Σ_F P², shows the sum of the squared pattern coefficients across the four factors. Because of the distortion of the horizontal additive property, these values are no longer equal to their corresponding communalities. Elements in Σ_F P² are either an under-representation or an over-representation of their corresponding communalities, depending on the correlations among the factors as well as between the factors and the tests.

Table 2.3 Distortion of the Horizontal Additive Property in the Pattern Coefficients

Test | (2) h² | (3) P F1 | (4) P F2 | (5) P F3 | (6) P F4 | (7) P² F1 | (8) P² F2 | (9) P² F3 | (10) P² F4 | (11) Σ_F P²
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
T1 | .47 | .06 | .65 | .02 | -.02 | .00 | .43 | .00 | .00 | .43
T2 | .28 | -.05 | .62 | -.12 | -.08 | .00 | .38 | .02 | .01 | .41
T3 | .22 | .03 | .49 | .04 | -.15 | .00 | .24 | .00 | .02 | .27
T4 | .43 | -.23 | .73 | .03 | .03 | .05 | .53 | .00 | .00 | .58
T5 | .70 | .90 | -.07 | .04 | -.12 | .80 | .01 | .00 | .02 | .82
T6 | .68 | .84 | -.05 | -.03 | .05 | .70 | .00 | .00 | .00 | .71
T7 | .77 | .97 | -.10 | -.04 | -.08 | .94 | .01 | .00 | .01 | .96
T8 | .56 | .71 | .05 | .00 | .02 | .51 | .00 | .00 | .00 | .51
T9 | .72 | .85 | .05 | -.06 | -.01 | .72 | .00 | .00 | .00 | .72
T10 | .64 | -.02 | -.28 | .88 | .02 | .00 | .08 | .77 | .00 | .85
T11 | .45 | .15 | -.04 | .54 | .14 | .02 | .00 | .29 | .02 | .33
T12 | .45 | -.10 | .14 | .70 | -.14 | .01 | .02 | .49 | .02 | .54
T13 | .41 | -.06 | .34 | .51 | -.10 | .00 | .12 | .26 | .01 | .39
T14 | .45 | .07 | -.15 | -.15 | .76 | .01 | .02 | .02 | .58 | .62
T15 | .34 | -.18 | .05 | -.09 | .65 | .03 | .00 | .01 | .42 | .46
T16 | .42 | -.04 | .31 | -.04 | .47 | .00 | .10 | .00 | .22 | .32
T17 | .38 | -.06 | -.21 | .25 | .58 | .00 | .04 | .06 | .33 | .44
T18 | .26 | -.02 | .03 | .13 | .43 | .00 | .00 | .02 | .18 | .20
T19 | .23 | .14 | .10 | .00 | .33 | .02 | .01 | .00 | .11 | .14
T20 | .38 | .18 | .44 | -.12 | .18 | .03 | .19 | .02 | .03 | .27
T21 | .42 | .09 | .36 | .31 | .04 | .01 | .13 | .10 | .00 | .24
T22 | .44 | .37 | .38 | -.04 | .04 | .14 | .14 | .00 | .00 | .28
T23 | .53 | .21 | .52 | .07 | .05 | .05 | .27 | .01 | .00 | .32
T24 | .42 | .31 | .05 | .29 | .17 | .10 | .00 | .09 | .03 | .22
Σ_T P² | | | | | | 4.15 | 2.73 | 2.14 | 2.01 | 11.03
%(F) | | | | | | 17.30 | 11.40 | 8.90 | 8.40 | 46.00

Note. h² denotes the communality of a given test. Σ_F P² is the sum of the squared pattern coefficients for a given test across the four factors. Σ_T P² is the sum of the squared pattern coefficients for a given factor across the 24 tests. %(F) denotes the percentage of total variance of the 24 tests explained by a given factor.

The difficulty of making the horizontal interpretation due to obliquity has been discussed in the context of multiple regression through the notion of variable importance. After finalizing a regression model, researchers are often interested in finding out which independent variable is relatively more important (i.e., practically significant). The widely practiced method for assessing the contribution of each independent variable is to order the absolute size of the standardized partial regression coefficients (i.e., beta weights). This is because the beta weights are believed to overcome the incomparability problem of the unstandardized regression coefficients (i.e., b-weights), which reflect the different metrics of the independent variables (Achen, 1982; Greenland, Schlesselman, & Criqui, 1986; Healy, 1990).

However, concerns have been raised about using the beta weights as importance measures (e.g., Bring, 1994; Healy, 1990; Thomas, Zhu, & Decady, 2007). The major concern stems from the following argument: For a regression model with p independent variables, a beta weight reflects the unique effect of each independent variable over and above the effects of all (p-1) other variables. Nonetheless, the reference subsets for independent variable A and independent variable B are different, hence making the comparison invalid (Bring, 1994; Healy, 1990; Thomas et al., 2007). As is evident from the connection between factor analysis and multiple regression, the concern about using beta weights as importance measures in multiple regression translates to using the pattern coefficients to assess the importance of factors.

Note that the vertical additive property of the pattern coefficient, however, is not distorted by the factor obliquity.
The second last row of columns 7 to 10, denoted as Σ_T P² in Table 2.3, shows the sums of the squared pattern coefficients along the 24 tests for each factor. Because of the vertical additive property, these values still correspond to the amount of total variance explained by each factor and their sum is equal to the total variance explained (11.03). The last row of columns 7 to 10, denoted as %(F), shows the percentage of total variance explained by each factor; the values add up to 46%, which is identical to that of the orthogonal solution.

The reason that the vertical additive property was not distorted by factor obliquity is that, for a factor model with q observed variables, the beta weight reflects the unique effect of each factor over and above the effects of the identical (p-1) factors. Thus, the reference subsets for the observed variables T1, T2, ..., T24 are the same, rendering pattern coefficients comparable along the observed variables. Namely, despite the distortion of the horizontal additive property, the pattern coefficients are still warranted for assigning meanings to the factors, grouping items into subscales, or uncovering the underlying structure of the data.

When the factors are oblique, neither the horizontal nor the vertical additive property holds for the structure coefficients. The structure coefficients lose both additive properties simply because they fail to account for the overlapping relationship among the factors. The structure coefficient will always overestimate the true bi-directional relationship between a factor and an observed variable or the unique effect of a factor on an observed variable. Consequently, the cross-factor sum of squared structure coefficients will always be an inflation of the communality. Likewise, the along-variable sum of squared structure coefficients will always be an inflation of the total variance explained.

Distortion of both additive properties by the structure coefficient is demonstrated in Table 2.4 using the same data.
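The inflation can also be seen in a small hypothetical case, sketched below in Python. The pattern coefficients (0.6 and 0.4) and the factor correlation (0.5) are assumed values for a single observed variable, both coefficients are positive, and the factors are positively correlated; the sketch is not meant to cover other configurations.

```python
# Hypothetical observed variable loading on two correlated factors.
pattern = [0.6, 0.4]   # assumed pattern coefficients on F1 and F2
r = 0.5                # assumed correlation between F1 and F2

# Structure coefficients for the two-factor case of S = P R.
structure = [pattern[0] + pattern[1] * r,
             pattern[0] * r + pattern[1]]

# Actual communality: variance of the linear combination of the
# two unit-variance factors.
communality = (pattern[0] ** 2 + pattern[1] ** 2
               + 2 * pattern[0] * pattern[1] * r)

# Cross-factor sum of squared structure coefficients.
sum_sq_structure = sum(s ** 2 for s in structure)

print([round(s, 2) for s in structure])
print(round(communality, 2), round(sum_sq_structure, 2))
```

Here the sum of squared structure coefficients exceeds not only the communality but also 1, the total standardized variance of the variable, which is the same pattern visible in several rows of Table 2.4.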
The communalities for the 24 tests calculated by summing the squaredstructure coefficients across the four factors, denoted as EFS 2, consistently exceed the actualvalues. Likewise, the amount of total variance explained by each factor calculated by summingthe squared structure coefficients along the 24 tests, denoted as ETS 2, exceeds its correspondingvalues of the orthogonal model in Table 2.2. As a result, as shown by the last two rows of Table2.4, the amount and the percentage of total variance explained by each factor are inflated, leadingto an overestimated amount of modeled total variance (18.16 compared to the actual value of11.03) as well as an overestimated percentage of the modeled total variance (75.7% compared tothe actual percentage of 46%).28Table 2.4 Distortion of the Horizontal and Vertical Additive Properties in the StructureCoefficientsCol.# 2 3 4 5 6 7 8^9 10 11h2 S S2 EFS2F1 F2 F3 F4 F1 F2^F3 F4Ti .47 .42 .68 .31 .35 .18 .47^.10 .12 .86T2 .28 .21 .50 .09 .16 .05 .25^.01 .03 .33T3 .22 .26 .45 .19 .14 .07 .20^.04 .02 .33T4 .43 .20 .63 .27 .33 .04 .40^.07 .11 .62T5 .70 .83 .38 .32 .22 .68 .14^.10 .05 .97T6 .68 .82 .43 .31 .35 .68 .19^.10 .12 1.08T7 .77 .87 .37 .27 .23 .75 .14^.07 .05 1.01T8 .56 .74 .45 .31 .32 .55 .20^.10 .10 .95T9 .72 .85 .48 .30 .33 .72 .23^.09 .11 1.15T10 .64 .19 .10 .76 .29 .04 .01^.58 .08 .71T11 .45 .40 .35 .65 .43 .16 .12^.42 .19 .88T12 .45 .21 .32 .66 .23 .04 .10^.43 .05 .63T13 .41 .29 .48 .58 .29 .08 .23^.34 .09 .74T14 .45 .23 .22 .18 .64 .05 .05^.03 .41 .55T15 .34 .07 .24 .16 .56 .00 .06^.03 .31 .40T16 .42 .30 .52 .30 .60 .09 .27^.09 .36 .81T17 .38 .16 .17 .41 .57 .02 .03^.17 .32 .54T18 .26 .21 .29 .33 .49 .05 .08^.11 .24 .48T19 .23 .33 .35 .26 .44 .11 .12^.07 .19 .49T20 .38 .44 .58 .22 .42 .19 .33^.05 .17 .74T21 .42 .43 .56 .52 .41 .18 .32^.27 .17 .93T22 .44 .58 .59 .29 .36 .33 .34^.09 .13 .89T23 .53 .55 .69 .40 .44 .30 .48^.16 .19 1.13T24 .42 .53 .44 .52 .46 .28 .19^.27 .21 .95ErS2 5.63 4.94^3.77 3.82 18.16%(F) 23.47 
20.57 15.71 15.91 75.66Note. h4 denotes the communality of a given test. EFS 2 is the sum of the squared structure coefficients for a given test across thefour factors. ITS2 is the sum of the squared structure coefficients for a given factor across the 24 tests. %(F) denotes the percentageof total variance of the 24 tests explained by a given factor.The difficulty of distributing factors' contribution to the observed variation by the twooblique coefficients has resulted in several alternative methods being proposed in the literature(e.g., Bentler, 1968; Cattell, 1962; White, 1966). Bentler (1968) proposed the use of a total factorcontribution matrix, which is the product of an initial factor loading matrix and the least-squares29orthonormal approximation to the general transformation matrix. Although the total factorcontribution can uniquely partition the oblique factor contribution, elements to produce such amatrix are hard to calculate and are not produced by the popular statistical packages. In addition,White (1966), by algebraically manipulating equation (1.1), noted that the product of the patterncoefficients and the structure coefficients could additively partition the observed variance.However, his work was entirely algebraic and provided no axiomatic principles to justify its use.As a result, these early attempts did not draw much attention from other scholars and users offactor analysis. Alternatively, practitioners have been avoiding these complexities by placing anorthogonal constraint to retain the additive properties and interpretational simplicity even whenthe factors are believed to be correlated by theory or shown by empirical data (Browne, 2001;Conway & Huffcutt, 2003; Fabrigar et al., 1999; Harman, 1967; Henson & Robert, 2006; Horst,1965; Kieffer, 1998; Nunnally, 1978; Preacher & MacCallum, 2003; Rummel, 1970).3. 
Inappropriateness of Cut-off Rules

Another complexity arising from factor obliquity is that the traditionally suggested rules for loading cut-offs may not be appropriate for the oblique coefficients. Most rules for a meaningful loading, such as 0.3 or 0.4, were suggested under the premise of orthogonality. For example, the rules suggested by Comrey and Lee (1992) were based on the premise of the horizontal additive property, which holds only under unidimensionality or orthogonality. Their cut-off criteria were suggested for orthogonal loadings, the squares of which represent the observed variance explained by the factors. As we have shown, the horizontal additive property is distorted by factor obliquity. A minimum factor loading of 0.32 was considered by Comrey and Lee (1992) as practically significant because it accounts for about 10% (0.32 x 0.32) of the variation in the observed variable. However, a pattern or structure coefficient of the same magnitude does not necessarily account for the same amount of the observed variance, because of the distortion of the horizontal additive property. Comrey and Lee's criteria for practical significance are therefore invalid when the factors are oblique.

Furthermore, the conventional cut-off criteria may not be appropriate, in particular, for the pattern coefficient, because most of the cut-off criteria were suggested for a correlational type of interpretation that is bounded within the range of -1 and 1. For example, the cut-offs suggested by Harman (1967) were developed based on the average of the sample Pearson correlation matrix. Also, Gorsuch's (1983) suggestion was based on doubling the critical value of statistical significance for the Pearson correlation.
Both Harman's and Gorsuch's rules were proposed for a correlational and bi-directional interpretation, which may not be appropriate for the pattern coefficient, which is, by nature, not a bi-directional correlation.

Furthermore, the pattern coefficients, like beta-weights in a multiple regression, can occasionally exceed the bounds of -1 and 1 (Guertin & Bailey, 1970; Nunnally, 1978). Under the circumstance of simple redundancy, which is often desired and assumed in a typical multiple regression (Cohen, Cohen, West, & Aiken, 2003), where the factors are slightly to moderately correlated and contain only redundant information, three conditions should follow: (1) a pattern coefficient should be of the same sign as its corresponding structure coefficient, (2) the magnitude of the pattern coefficient should be less than that of its corresponding structure coefficient because it reflects the unique effect after removing the redundant relationships, and (3) because the structure coefficient is bounded within the range of -1 and 1, by condition (2), the pattern coefficient should also be bounded within -1 and 1. However, when the factors are highly redundant (i.e., multicollinear) and/or do not follow the simple redundancy relationship (e.g., displaying a suppression effect), the conventional rules fall apart because the pattern coefficient may (1) be of the opposite sign to its corresponding structure coefficient, (2) be of a greater magnitude than its corresponding structure coefficient, and even (3) exceed the structure bounds of -1 and 1 (Nunnally, 1978; Rummel, 1970).³

Table 2.5 shows an extreme example where a simple redundancy relationship does not hold, and the conventional cut-off criteria may be inappropriate. The data were retrieved from the 2003-2005 Wisconsin Longitudinal Study of psychological well-being. The construct of psychological well-being was measured by the 31 items of Ryff's Scales of Psychological Well-Being (RPWB; Ryff, 1989; Ryff & Keyes, 1995).
Six factors were extracted to reflect the six theoretical dimensions: autonomy (AU), environmental mastery (EM), personal growth (PG), positive relations with others (PR), purpose in life (PL), and self-acceptance (SA) (see Appendix B for a detailed description of the six theoretical dimensions). First, observe that there is a distinctive discrepancy between the pattern and the structure coefficients. To make our point, pattern coefficients that exceed their corresponding structure coefficients are highlighted in italic face, those that exceed the structure bounds of -1 and 1 are highlighted in bold face, and those of opposite sign to the structure coefficients are highlighted with an underline. It is fascinating to observe that none of the 31 items actually satisfies all three conditions and strictly follows a simple redundancy relationship, as a result of the high correlations among some of the factors shown at the bottom of Table 2.5. This example demonstrates how the correlation- and orthogonality-based cut-off criteria may fall apart for interpreting the pattern coefficients.

³ The impact of multicollinearity and suppression on the partial coefficient has been deliberated in the literature of multiple regression (e.g., Cohen, Cohen, West, & Aiken, 2003). The task of resolving multicollinearity and the suppressor effect in factor analysis is beyond the focus of this dissertation, although the presence, types, and mechanism of the suppression effect will be discussed throughout this dissertation.

Table 2.5 Demonstration of Inappropriateness of Traditional Rules for Interpreting the Pattern Matrix

                  P (pattern)                         S (structure)
Item      h²    F1    F2    F3    F4    F5    F6    F1    F2    F3    F4    F5    F6
AU1      .52   .04   .13   .80  -.17  -.37   .13   .49  -.34   .58   .32  -.28   .31
AU2      .39  -.17   .12  1.05  -.26  -.14  -.18   .28  -.21   .47   .20  -.15   .02
AU3      .44  -.30   .07   .94   .07   .07  -.15   .41  -.44   .62   .48   .18   .18
AU4      .21  -.14   .12   .69  -.08   .11  -.01   .28  -.29   .42   .30   .15   .19
AU5      .27   .37   .24   .44  -.25  -.15   .15   .40  -.25   .40   .22  -.16   .29
EM1      .41   .50   .13   .11   .10  -.34   .05   .55  -.38   .47   .37  -.25   .29
EM2      .58   .84   .18  -.12   .14   .26  -.02   .72  -.66   .60   .67   .34   .47
EM3      .55  1.29   .18  -.31  -.15   .01  -.09   .69  -.50   .46   .46  -.04   .34
EM4      .31   .63   .23   .09   .04  -.13   .00   .52  -.37   .44   .37  -.10   .28
EM5      .42   .55   .06   .01   .09   .26  -.04   .59  -.58   .54   .59   .34   .39
PG1      .40   .44   .01   .00   .25  -.36  -.08   .55  -.42   .46   .42  -.21   .22
PG2      .55   .09  -.11  -.09   .73  -.23  -.05   .65  -.65   .60   .70   .11   .35
PG3      .42  -.01   .07   .06   .66  -.57  -.05   .42  -.32   .40   .37  -.32   .13
PG4      .43   .09   .17  -.22   .98  -.08  -.12   .49  -.51   .45   .63   .21   .23
PG5      .37   .16   .16  -.02   .63  -.46   .01   .47  -.36   .41   .41  -.24   .21
PR1      .64   .06 -1.15  -.12  -.29   .07  -.15   .63  -.77   .56   .63   .31   .34
PR2      .62  -.23 -1.43  -.12  -.29  -.11  -.19   .54  -.72   .49   .55   .19   .24
PR3      .39  -.13  -.69  -.12   .09  -.57   .06   .41  -.39   .33   .30  -.30   .20
PR4      .50  -.06 -1.11  -.11  -.30   .03  -.06   .54  -.68   .48   .54   .27   .34
PR5      .46  -.24 -1.08  -.24   .06  -.11  -.08   .48  -.64   .43   .54   .21   .27
PL1      .68  1.24   .18  -.12   .02  -.11  -.59   .63  -.44   .47   .43  -.19  -.03
PL2      .50   .44   .03  -.12   .47   .03  -.06   .66  -.64   .58   .68   .23   .38
PL3      .31  -.14   .29  -.01   .95  -.06  -.04   .36  -.39   .39   .52   .21   .20
PL4      .50   .21   .03  -.23   .83  -.07  -.15   .58  -.60   .52   .68   .22   .27
PL5      .13  -.17  -.01  -.13   .57   .02  -.06   .17  -.24   .18   .31   .20   .09
PL6      .24   .27  -.10   .03   .06  -.22   .09   .46  -.38   .39   .35  -.09   .29
SA1      .68  1.22   .08   .00  -.24  -.11  -.36   .72  -.52   .56   .46  -.18   .17
SA2      .35   .69  -.15  -.11  -.16  -.20  -.03   .54  -.42   .41   .35  -.15   .27
SA3      .53   .79   .16  -.25   .23   .19   .06   .68  -.62   .54   .64   .29   .48
SA4      .46  1.00   .05  -.28  -.25   .05   .17   .62  -.48   .42   .42   .05   .48
SA5      .55   .31  -.33   .14  -.20  -.41   .22   .63  -.51   .54   .41  -.25   .45

Factor correlations
          F1    F2    F3    F4    F5
F2      -.87
F3       .85  -.82
F4       .83  -.90   .82
F5       .14  -.37   .20   .43
F6       .60  -.56   .53   .54   .28

Note. h²: communality; P: pattern matrix; S: structure matrix. AU: autonomy; EM: environmental mastery; PG: personal growth; PR: positive relations with others; PL: purpose in life; SA: self-acceptance.
Note. Pattern coefficients that exceed their corresponding structure coefficients are highlighted in italic face, those that exceed the structure bounds of -1 and 1 are highlighted in bold face, and those of opposite sign to the structure coefficients are highlighted with an underline.

To our knowledge, the widely accepted cut-off rules, which are based on the premises of correlation, orthogonality, and a simple redundancy relationship, have rarely, if ever, been formally examined in terms of their appropriateness for the oblique coefficients. Neither have different cut-off criteria been suggested for the oblique coefficients.

2.3 Review of Practices and Recommendations for Interpreting Multidimensional Factor Models

Orthogonal or Oblique?

Although factor obliquity is more justified both theoretically and empirically, three major interpretational difficulties inhibit its use, as revealed in Section 2.2. To reiterate, these difficulties are: (1) the inconsistency between the pattern and structure coefficients and the choice of which to trust, (2) the distortion of the additive properties, and (3) the inappropriateness of the traditional rules suggested for orthogonal loadings. Nunnally (1978) stated,

The supposed advantages of oblique rotations are mainly conceptual rather than mathematical.
The author has mild preference for orthogonal rotations, because (1) they are so much simpler mathematically than oblique rotations, (2) there has been numerous demonstrations that the two approaches lead to essentially the same conclusion about the number and the kinds of factors inherent in a particular matrix of correlation. (p. 376)

Although orthogonality simplifies the interpretation via the additive properties, many investigators have concerns about sacrificing accuracy for the sake of mathematical simplicity. For example, Thurstone (1947) stated, "In developing the factorial methods we have insisted that the methods must not impose orthogonality on the fundamental parameters that are chosen for factorial description, even though the equations are thereby simplified in that the cross products [factor correlation] vanish" (p. 140). Also, Cattell (1952) stated:

Our tolerance of the inconveniences of oblique factors [...] depends on our belief that in explaining and predicting natural events it is actually more convenient in the long run to follow nature than attempt to force upon it some artificial over-simplication. [....] And to insist on orthogonality of factor is indeed mistaking means for ends, since these simpler mathematical devices after all are only means to discovering and expressing whatever is in nature itself. (p. 122-123)

Interpretational difficulties due to factor obliquity have very likely hindered some day-to-day practices. In particular, there has been an under-utilization of the horizontal interpretation in the recent literature as oblique rotations became more popular. We believe that such under-utilization may be due to the distortion of the horizontal additive property. As mentioned earlier, horizontal interpretation bears the fundamental theory of factor analysis and provides powerful evidence for test validation.
It would be unfortunate to see a decreasing use of horizontal interpretation as a result of the increasing use of oblique rotations.

The point to highlight here is that both orthogonal and oblique rotation methods have inherent advantages and disadvantages in interpreting a multidimensional factor solution. Researchers are often forced to choose between (a) constraining the factors to be orthogonal for interpretational simplicity even if the factors are believed to be correlated and (b) allowing obliquity and tolerating the interpretational complexities. Our stance is that if there is a true association among the factors, which is highly plausible on both theoretical and empirical bases, fixing the correlations among the factors to be zero could produce biased estimates of the coefficients and lead to misleading interpretations. From a statistical modeling perspective, this is clearly a scenario of model misspecification (i.e., fitting an orthogonal factor model when the data are in fact oblique). If so, a researcher is trying to make sense of the oblique data based on the biased parameters produced by the orthogonal model. Simply put, oblique and orthogonal models are distinct models (Gorsuch, 1983, p. 33); one should not confuse the orthogonal loadings with the oblique coefficients.

Pattern, Structure, Both, or Alternatives?

As discussed earlier, when the factors are allowed to be oblique, there is always some discrepancy between the pattern coefficient and its corresponding structure coefficient. Researchers who recognize this inconsistency often face the question: "Which coefficient should I interpret and trust if different conclusions are reached?"

Many researchers prefer the pattern coefficients (e.g., Cattell, 1952; Harman, 1967). Investigators like Thurstone always used the pattern coefficients (Gorsuch, 1983). The pattern coefficient is preferable for the following reasons.
First, the pattern coefficient is rooted in the fundamental theory of factor analysis as shown in equation (1.1); it reflects the mathematical meaning of factor analysis: i.e., the change in the observed score per unit change in a given factor score, with the effects of the other factors in the model held constant. Second, the pattern coefficient is believed to capture the simple structure better than does the structure coefficient, because it reflects the unique effect (Cattell, 1952; Rummel, 1970). Furthermore, unlike the structure coefficient, for which both additive properties are distorted, the pattern coefficient is still warranted for vertical interpretation.

Other researchers, however, believe that the structure coefficient should be the focus of interpretation (e.g., Comrey & Lee, 1992; Gorsuch, 1983; Horst, 1965). Comrey and Lee (1992) contended that the meaning of factors should be defined by how similar the observed variables and the factor are. That similarity is best indicated by the bi-directional correlation. Also, Gorsuch (1983) argued that pattern coefficients capture only the directional relationship; they do not show the relationships of variables to the factors, but rather of the factors to the variables (Gorsuch, 1983; Kim & Mueller, 1978). Gorsuch argued that, although the pattern coefficient bears the fundamental theory of factor analysis, the substantive nature of the factors should already be known to the researchers so as to make a meaningful directional interpretation. Nonetheless, such a condition may be implausible or unrealistic for many EFA contexts and purposes. Recently, a few researchers have responded with support for Gorsuch's call for interpreting the structure coefficient (Courville & Thompson, 2001; Graham, Guthrie, & Thompson, 2003; Henson & Roberts, 2006; Henson, 2002; Kieffer, 1999; Thompson, 1997; Thompson & Daniel, 1996).
The following summarizes these researchers' reasons for promoting the structure coefficients:

■ The meaning of the unknown factors is best understood by a bi-directional relationship, which shows how much the factors and the observed variable share in common.
■ Most investigators are more familiar with correlational-type coefficients, and are more accustomed to interpreting the practical importance of a correlation, the range of which is always bounded between -1 and 1.
■ It is important to interpret the inflated relationship in the structure coefficient because the correlations among the factors are of theoretical importance. The pattern coefficients systematically exclude overlap among the factors and represent only their unique contributions even when the overlap is of theoretical importance.
■ The structure coefficients yield a relatively model-independent estimate of relationships. That is, regardless of what other factors occur in the next study for a given data set, the variable should correlate at the same level with a particular factor. In contrast, the pattern coefficients are more model-dependent.
■ In some situations, a variable may be regarded as unimportant because of a small pattern coefficient, yet the structure coefficient may reveal a substantial relationship between the variable and the factor.
■ Examining the structure coefficient can detect a potential suppression effect. Such an effect cannot be detected if only the pattern coefficient is examined.

For these reasons, it is argued that interpreting the pattern coefficient by itself is insufficient for oblique factor interpretation and could lead to incorrect conclusions; a meaningful interpretation of an oblique factor solution should always examine both coefficients by juxtaposing them (Graham et al., 2003; Henson & Roberts, 2006; Kieffer, 1999; Thompson & Daniel, 1996; Rummel, 1970). Gorsuch (1983) concluded, "Indeed, proper interpretation of a set of factors can probably only occur if at least S and P are both examined" (p. 208).

Our stance on this is that oblique coefficients, whether pattern or structure, have their interpretational difficulties, as we explained and demonstrated in Section 2.2. To reiterate, these difficulties are (1) the inconsistency between the pattern and structure coefficients and the choice of which to report and trust, (2) the distortion of the additive properties, and (3) the inappropriateness of traditional cut-off rules. To our knowledge, the interpretational difficulty arising from problem (3) has not been heeded in the literature. Although the interpretational difficulties arising from problems (1) and (2) have been addressed in the backstage of the more technical literature (e.g., Gorsuch, 1983; Harman, 1976; Thurstone, 1947), most applied researchers have not attended to these problems (Graham et al., 2003; Tabachnick & Fidell, 2007; Thompson, 1997). It is only recently that researchers have brought these problems into the spotlight (Courville & Thompson, 2001; Graham et al., 2003; Henson, 2002; Henson & Roberts, 2006; Thompson, 1997).

In a review of 60 factor analyses, Henson and Roberts (2006) reported that 23 applied analyses (38%) used oblique rotation; among them, 11 reported only the pattern matrix, four reported only the structure matrix, and one reported both. Seven did not indicate which matrix was reported. This review revealed that orthogonal rotation was still preferred among applied researchers. When an oblique solution was chosen, reporting only the pattern matrix was the most common practice, followed by reporting only the structure matrix. Few, if any, attended to both coefficients and the possible inconsistency between them.

Fortunately, the more methodology-oriented investigators listed in Table 2.1 on pages 11-12 were more attentive to these issues. Column four of Table 2.1 indicates that among the 40 scholarly works, 27 (67.5%) explicitly acknowledged the inconsistency between the pattern and the structure coefficients.
Column five of Table 2.1 shows that among the 40 scholarly entries, 16 (40%) used both coefficients, 12 (30%) reported only the pattern coefficient, six (15%) did not clearly indicate which matrix was reported, five (12.5%) chose to report only the orthogonal solution, and one (2.5%) reported only the structure coefficients. The last column shows that among the 40 entries, only 11 (27.5%) actually juxtaposed both matrices.

It appears that the methodology-oriented investigators were more inclined to report either or both coefficients at their own discretion or upon the feedback of editors and reviewers, compared to the applied researchers and practitioners. Nonetheless, the pattern coefficient was still the preference if only one oblique coefficient was chosen for interpretation. It is regrettable that 13 out of the 40 scholarly entries did not explicitly acknowledge the existence of two types of coefficients despite their methodological appeal. Of those that did, some ended the discussion by simply defining and distinguishing the two types of coefficients, and provided no further description of the interpretational difficulties or solutions for resolving them. Summarizing the practice of both applied and methodological researchers, we conclude that the pattern coefficient was the preference for interpretation if factor obliquity was allowed. The pattern coefficient was occasionally accompanied by the structure coefficient.

Our stance on this matter is that the advocacy for interpreting both the pattern and structure coefficients is conceptually and technically legitimate. We agree with the recent literature that researchers should notice the different messages P and S deliver and interpret them in concert; missing either piece of information could lead to fallible conclusions. However, the currently recommended approach of juxtaposing both coefficients does not actually resolve the three problems of oblique coefficients.
Instead, by juxtaposing both, it brings the investigators back to the original problems. In a cynical sense, the problems seem to have doubled because now both coefficients should be examined. The following statement by Bentler (1968) perfectly depicts our reflection on the review in Chapter Two and gives a preview of our discussion in Chapter Three:

While the pattern matrix P may be useful in evaluating simple structure, for a given oblique solution, it cannot evaluate simple structure in the contribution of factors to variables (or vice versa) since the variables' variance accounted for by oblique factors is a complex function of the pattern and structure matrices. (p. 490)

The next chapter proposes a new method for interpreting oblique factor models. The new method not only synergizes the information in both P and S but also resolves most of the interpretational difficulties inherent in factor obliquity. In essence, the new method makes the following statement by Thurstone (1947) about the orthogonal solution true for the oblique case: "We have then the theorem that each factor loading for orthogonal factors is the square root of the variances of test j attribute to the factor m" (p. 73).

References

Achen, C. H. (1982). Interpreting and using regression. Beverly Hills, CA: Sage.
Archer, C. O., & Jennrich, R. I. (1973). Standard errors for rotated factor loadings. Psychometrika, 38, 581-592.
Basilevsky, A. (1994). Statistical factor analysis and related methods. New York: Wiley.
Bentler, P. M. (1968). A new matrix for the assessment of factor contributions. Multivariate Behavioral Research, 3, 489-494.
Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2003). The theoretical status of latent variables. Psychological Review, 110, 203-219.
Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111, 1061-1071.
Bring, J. (1994). How to standardize regression coefficients. The American Statistician, 48, 209-213.
Bring, J. (1996). A geometric approach to compare variables in a regression model. The American Statistician, 50, 57-62.
Brown, T. A. (2006). Confirmatory factor analysis for applied research. New York: The Guilford Press.
Browne, M. W. (2001). An overview of analytic rotation in exploratory factor analysis. Multivariate Behavioral Research, 36, 111-150.
Burt, C. (1940). The factors of the mind. London: University of London Press.
Burt, C. (1966). The early history of multivariate techniques in psychological research. Multivariate Behavioral Research, 1, 24-42.
Cattell, R. B. (1952). Factor analysis. New York: Harper.
Cattell, R. B. (1957). Personality and motivation: Structure and measurement. New York: World Book.
Cattell, R. B. (1962). The basis of recognition and interpretation of factors. Educational and Psychological Measurement, 72, 667-697.
Cattell, R. B. (1978). The scientific use of factor analysis in behavioral and life sciences. New York: Plenum Press.
Church, J. T., & Burke, P. J. (1994). Exploratory and confirmatory tests of the big five and Tellegen's three- and four-dimensional models. Journal of Personality and Social Psychology, 66, 93-114.
Cliff, N., & Hamburger, C. D. (1967). The study of sampling errors in factor analysis by means of artificial experiments. Psychological Bulletin, 68, 430-445.
Cohen, J. P., Cohen, S. G., West, L. S., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Comrey, A. L., & Lee, H. B. (1992). A first course in factor analysis (2nd ed.). Hillsdale, NJ: Erlbaum.
Conway, J. M., & Huffcutt, A. I. (2003). A review and evaluation of exploratory factor analysis practices in organizational research. Organizational Research Methods, 6, 147-168.
Courville, T., & Thompson, B. (2001). Use of structure coefficients in published multiple regression articles: β is not enough. Educational and Psychological Measurement, 61, 229-248.
Cudeck, R. (2000). Exploratory factor analysis. In H. E. A. Tinsley & S. D. Brown (Eds.), Handbook of applied multivariate statistics and mathematical modeling (pp. 266-296). San Diego, CA: Academic Press.
Cudeck, R., & O'Dell, L. L. (1994). Applications of standard error estimates in unrestricted factor analysis: Significance tests for factor loadings and correlations. Psychological Bulletin, 115, 475-487.
Cureton, E. E., & D'Agostino, R. B. (1983). Factor analysis: An applied approach. New York: Erlbaum Associate.
Ferguson, E., & Cox, T. (1993). Exploratory factor analysis: A user's guide. International Journal of Selection and Assessment, 1, 84-9.
Ford, J. K., MacCallum, R. C., & Tait, M. (1986). The application of exploratory factor analysis in applied psychology: A critical review and analysis. Personnel Psychology, 39, 291-314.
Fruchter, B. (1954). Introduction to factor analysis. Princeton: D. van Nostrand Company.
Giri, N. C. (2004). Multivariate statistical analysis (2nd ed.). New York: Marcel Dekker.
Gorsuch, R. L. (1983). Factor analysis (2nd ed.). Hillsdale, NJ: Erlbaum.
Graham, J. M., Guthrie, A. C., & Thompson, B. (2003). Consequences of not interpreting structure coefficients in published CFA research: A reminder. Structural Equation Modeling, 10, 142-152.
Greenland, S., Schlesselman, J. J., & Criqui, M. H. (1986). The fallacy of employing standard regression coefficients and correlations as measures of effect. American Journal of Epidemiology, 123, 203-208.
Guertin, W. H., & Bailey, J. P. (1970). Introduction to modern factor analysis. Ann Arbor: Edwards Brothers.
Hair, J. F. Jr., Babin, B., Anderson, R. E., Tatham, R. L., & Black, W. C. (2006). Multivariate data analysis (6th ed.). Upper Saddle River, NJ: Prentice Hall.
Harman, H. H. (1967). Modern factor analysis. Chicago: University of Chicago Press.
Harman, H. H. (1976). Modern factor analysis (3rd ed.). Chicago: University of Chicago Press.
Hendrickson, A. E., & White, P. O. (1964). Promax: A quick method for rotation to oblique simple structure. British Journal of Statistical Psychology, 17, 65-70.
Healy, M. J. R. (1990). Measuring importance. Statistics in Medicine, 9, 633-637.
Henryeson, S. (1950). The significance of factor loadings. British Journal of Psychological Statistics, 3, 159-165.
Henson, R. K. (2002, April). The logic and interpretation of structure coefficients in multivariate general linear model analyses. Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA.
Henson, R. K., & Roberts, J. K. (2006). Use of exploratory factor analysis in published research: Common errors and some comment on improved practice. Educational and Psychological Measurement, 66, 393-416.
Holzinger, K. J., & Swineford, F. (1939). A study in factor analysis: The stability of a bi-factor solution. Supplementary Educational Monographs. Chicago: University of Chicago.
Horn, J. L. (1967). On subjectivity in factor analysis. Educational and Psychological Measurement, 27, 811-820.
Horst, P. (1965). Factor analysis of data matrices. New York: Holt, Rinehart & Winston.
Hoyle, R. H., & Duvall, J. L. (2004). Determining the number of factors in exploratory and confirmatory factor analysis. In D. Kaplan (Ed.), The SAGE handbook of quantitative methodology for social sciences (pp. 301-315). Thousand Oaks, CA: Sage.
Humphreys, L. G., Ilgen, D., McGrath, D., & Montanelli, R. (1969). Capitalization on chance in rotation of factors. Educational and Psychological Measurement, 29, 259-271.
Jennrich, R. I. (1973). Standard errors for obliquely rotated factor loadings. Psychometrika, 38, 593-604.
Jennrich, R. I., & Sampson, P. F. (1966). Rotation for simple loadings. Psychometrika, 31, 313-323.
Johnson, R. A., & Wichern, D. W. (2002). Applied multivariate statistical analysis. New York: Prentice Hall.
Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23, 187-200.
Kelley, T. L. (1927). Interpretation of educational measurements. Yonkers, NY: World Book.
Kenny, D. A., & Kashy, D. A. (1992). Analysis of the multitrait-multimethod matrix by confirmatory factor analysis. Psychological Bulletin, 112, 165-172.
Kieffer, K. M. (1998). Orthogonal versus oblique factor rotation: A review of the literature regarding the pros and cons. Paper presented at the Annual Meeting of the Mid-South Educational Research Association, New Orleans.
Kim, J., & Mueller, C. W. (1978). Introduction to factor analysis. Beverly Hills, CA: Sage.
Kline, P. (1994). An easy guide to factor analysis. London: Routledge.
Lawley, D. N., & Maxwell, A. E. (1971). Factor analysis as a statistical method. New York: American Elsevier.
Loehlin, J. C. (1998). Latent variable models: An introduction to factor, path, and structural analysis. Hillsdale, NJ: Lawrence Erlbaum Associates.
McDonald, R. (1985). Factor analysis and related methods. Hillsdale, NJ: Erlbaum.
Mulaik, S. A. (1972). The foundation of factor analysis. New York: McGraw-Hill.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
Pett, M. A., Lackey, N. R., & Sullivan, J. J. (2003). Making sense of factor analysis: The use of factor analysis for instrument development in health care research. Thousand Oaks, CA: Sage.
Preacher, K. J., & MacCallum, R. C. (2003). Repairing Tom Swift's electric factor analysis machine. Understanding Statistics, 2, 13-43.
Press, S. J. (1982). Applied multivariate analysis: Using Bayesian and frequentist methods of inference. New York: Krieger.
Rummel, R. J. (1970). Applied factor analysis. Evanston, IL: Northwestern University Press.
Russell, D. W. (2002). In search of underlying dimensions: The use (and abuse) of factor analysis in personality and social psychology bulletin. Personality and Social Psychology Bulletin, 28(12), 1629-1646.
Ryff, C. (1989). Happiness is everything, or is it? Journal of Personality and Social Psychology, 6, 1069-1081.
Ryff, C. D., & Keyes, D. L. M. (1995). The structure of psychological well-being revisited. Journal of Personality and Social Psychology, 69, 719-727.
Spearman, C. (1904). General intelligence, objectively determined and measured. American Journal of Psychology, 15, 201-293.
Stevens, J. (1996). Applied multivariate statistics for the social sciences. Mahwah, NJ: Erlbaum.
Tabachnick, B. G., & Fidell, L. S. (2007). Using multivariate statistics (5th ed.). Boston: Allyn and Bacon.
Thomas, D. R., Zhu, P. C., & Decady, Y. J. (2007). Point estimates and confidence intervals for variable importance in multiple linear regression. Journal of Educational and Behavioral Statistics, 32, 61-91.
Thompson, B. (1997). The importance of structure coefficients in structural equation modeling confirmatory factor analysis. Educational and Psychological Measurement, 57, 5-19.
Thompson, B., & Borrello, G. M. (1985). The importance of structure coefficients in regression research. Educational and Psychological Measurement, 45, 203-209.
Thompson, B., & Daniel, L. G. (1996). Factor analytic evidence for the construct validity of scores: A historical overview and some guidelines. Educational and Psychological Measurement, 56, 197-208.
Thurstone, L. L. (1947). Multiple factor analysis. Chicago: University of Chicago Press.
Timm, N. H. (2002). Applied multivariate analysis. New York: Springer.
Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38, 1-10.
White, O. (1966). Some properties of three factor contribution matrices. Multivariate Behavioral Research, 1, 373-377.
Yates, A. (1987). Multivariate exploratory data analysis. Albany: State University of New York Press.
Zumbo, B. D. (2007). Validity: Foundational issues and statistical methodology. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics, Vol. 26: Psychometrics (pp. 45-79).

Chapter Three: Pratt's Importance Measures in Oblique Exploratory Factor Analysis

This chapter introduces a new method of Pratt's importance measures for assisting in interpreting an oblique factor model.¹ The method of Pratt's importance measures was originally used to order the importance of a set of independent variables in a linear multiple regression. In this chapter we show that the use of Pratt's measures in factor analysis can resolve three interpretational difficulties arising from factor obliquity. First, it integrates the information in both the pattern and structure coefficients; hence there is no need to choose which oblique coefficients to interpret. Second, it restores the properties of horizontal and vertical addition while allowing factors to be oblique. Third, it resolves, in part, the problem of traditional rules for assessing a meaningful relationship between an observed variable and a factor.

¹ A version of this chapter will be submitted for publication. Wu, A. D., Zumbo, B. D., & Thomas, R. D. Pratt's importance measures in exploratory factor analysis.

This chapter is organized as follows: Section 3.1 describes the original use of Pratt's importance measures in a linear multiple regression, Section 3.2 provides the rationale for adapting Pratt's importance measures to factor analysis, Section 3.3 demonstrates an example of using Pratt's measures in EFA for continuous data, and finally Section 3.4 demonstrates an example of using Pratt's measures in EFA for categorical data.

3.1 The Use of Pratt's Importance Measures in Linear Multiple Regression

Because the proposed methodology is an extended use of linear multiple regression, this section describes the role of Pratt's measures in ordering the importance of independent variables in a multiple regression. Once a regression model is chosen, the next query often is: "Which independent variable contributes more to the variation of the dependent variable?"
Numerous measures of relative importance have been proposed (see the references provided by Azen & Budescu, 2003; Bring, 1996; Kruskall, 1987; Pratt, 1987). Among these measures, Pratt (1987) used an axiomatic approach, based largely on symmetry and invariance to linear transformation, to deduce the importance of an independent variable. He showed that his measure is unique, subject, of course, to his axioms, and that it can be expressed as the product of the variable's standardized regression coefficient and its simple correlation with the dependent variable.

Thomas, Hughes, and Zumbo (1998) provided an intuitive derivation of a standardized version of Pratt's measures based on the geometry of least squares, and continued to refer to it as "Pratt's measures" in recognition of Pratt's theoretical justification. Details about the geometry of least squares, as it applies to Pratt's measures, can be found in Thomas (1992) and Thomas et al. (1998). For the purpose of this dissertation, only the procedures needed to apply Pratt's measures in multiple regression and factor analysis are described. A detailed axiomatic discussion of Pratt's measures can be found in Pratt (1987), Thomas (1992), Thomas et al. (1998), and Thomas and Zumbo (1996).

It is crucial to define what "importance" means for Pratt's measures. Pratt refers to importance as the proportion of the explained variance (R-squared) to which an independent variable contributes, relative to the other independent variables in the model; therefore, one would more accurately refer to it as relative importance. Pratt justified the rule whereby relative importance is equated to variance explained, provided that the explained variance attributed to an independent variable is the product of the population beta-weight and the population correlation of that independent variable with the dependent variable.
Despite having been criticized, this definition is still widely used in the applied literature (e.g., Green, Carroll, & DeSarbo, 1978).

As we will show below, an additional feature of Pratt's measure is that it allows the importance of a subset of variables to be defined additively, as the sum of their individual importance irrespective of the correlation among the independent variables. Other commonly used measures (e.g., the standardized beta-weights, the t-values, unstandardized b-weights) do not allow for an additive definition and are problematic with correlated independent variables.

The following description unpacks the meaning of Pratt's relative importance measures. Consider a linear multiple regression with one response variable of the form

$$Y = \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \cdots + \hat{\beta}_p X_p + U, \qquad (3.1)$$

where $\hat{\beta}_p$ is the estimate of the standardized regression coefficient for the $p$th independent variable, and $U$ is the error term that is uncorrelated with $X_p$, with $E(U) = 0$. Pratt's measure, $d_p$, for the relative importance of the $p$th independent variable included in the regression model is given by

$$d_p = \frac{\hat{\beta}_p \hat{\rho}_p}{R^2}, \qquad (3.2)$$

where $\hat{\rho}_p$ is the estimate of the simple Pearson product-moment correlation between the independent variable $X_p$ and the response variable $Y$. Because $\sum_p \hat{\beta}_p \hat{\rho}_p = R^2$, it follows that $\sum_p \hat{\beta}_p \hat{\rho}_p / R^2 = 1$, hence $\sum_p d_p = 1$, a result that was illustrated by Thomas et al.'s (1998) geometric derivation. The importance of the independent variables can then be ordered by $d_p$ accordingly.

Thomas (1992) also suggested that, as a general rule, if $d_p < 1/(2p)$, namely half the average importance, then the corresponding independent variable can be considered unimportant. In addition, if the researcher is interested in the joint importance of a subset of variables for some theoretical reason, she or he can simply sum up the individual importance of the subset because of the additive property of Pratt's measures.
For example, the joint importance measure of independent variables $X_1$ and $X_2$ is equal to $d_1 + d_2$.

The appropriateness of Pratt's measures has been criticized because they occasionally produce importance values outside the logical range of zero to one. Negative Pratt's measures can occur, which is a counterintuitive characteristic for an importance interpretation. Small out-of-bound Pratt's measures could be a result of capitalization on chance, since $\hat{\beta}_p$ and $\hat{\rho}_p$ are both sample estimates whereas Pratt's measures are population-defined. It is worth noting that large negative Pratt's measures can occur if $\hat{\beta}_p$ and $\hat{\rho}_p$ are of different signs, a scenario referred to as the "negative suppression effect" (Conger, 1974; Lancaster, 1999). Also, Thomas et al. (1998) demonstrated that negative Pratt's measures are often associated with multicollinearity. Suppression and multicollinearity are two complex situations in which all other measures of variable importance also display interpretational difficulty (Thomas et al., 1998). Thomas, Zhu, Zumbo, & Dutta (2006; in press) also reminded researchers that some regression models are so complex that no single measure of importance satisfies Pratt's axioms. For the purpose of this dissertation, the discussion will not dwell on the causes of negative Pratt's measures. Rather, the point here is to prepare readers to expect some negative values when Pratt's measures are later applied to factor analysis.

To sum up, Pratt's measure $d_p$ is additively defined; it partitions the standardized variance accounted for by a regression model into non-overlapping parts that are attributable to each independent variable. The relative importance of the independent variables can then be ordered according to the values of $d_p$.
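The calculation summarized above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in the dissertation: the function name `pratt_measures` and the three-predictor correlations are hypothetical, and the sketch assumes standardized variables, so that the standardized weights can be obtained by solving the normal equations on the correlation matrix.

```python
import numpy as np

def pratt_measures(R_xx, r_xy):
    """Return (beta, d, r2) for a standardized multiple regression.

    R_xx : (p, p) correlation matrix of the independent variables
    r_xy : (p,) correlations between each independent variable and Y
    """
    R_xx = np.asarray(R_xx, dtype=float)
    r_xy = np.asarray(r_xy, dtype=float)
    beta = np.linalg.solve(R_xx, r_xy)   # standardized regression weights
    products = beta * r_xy               # beta_p * rho_p, one term per variable
    r2 = products.sum()                  # the products add up to R-squared
    d = products / r2                    # equation (3.2)
    return beta, d, r2

# Hypothetical correlations for three predictors (not from the thesis data).
R_xx = [[1.0, 0.3, 0.2],
        [0.3, 1.0, 0.4],
        [0.2, 0.4, 1.0]]
r_xy = [0.5, 0.4, 0.3]

beta, d, r2 = pratt_measures(R_xx, r_xy)
threshold = 1.0 / (2 * len(d))           # Thomas' (1992) 1/(2p) rule
unimportant = d < threshold              # True where d_p falls below the rule
joint_12 = d[0] + d[1]                   # joint importance of X1 and X2
print(np.round(d, 3), round(r2, 3), unimportant)
```

Because the measures are additive, `joint_12` plus the third measure recovers the full standardized R-squared of 1, whatever the correlations among the predictors.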
As a preview, the additive property of Pratt's measures in a linear multiple regression is analogous to the horizontal additive property under factor orthogonality, as we explained in Chapter Two, and under obliquity, as we shall see later in this chapter.

The following example demonstrates the use of Pratt's measures in ordering the importance of five variables in explaining grade eight students' (N = 8,912) mathematics performance in TIMSS (Trends in International Mathematics and Science Study). The five independent variables are (1) Parents' Education Level, (2) Mathematics Self-confidence, (3) Extra Lessons/Tutoring Time, (4) Computer Availability, and (5) Number of Books at Home. All five independent variables are measured on a quantitative scale, and are significant predictors of students' mathematics performance. Table 3.1 shows their inter-correlations. The R-squared value for the regression model with the five independent variables is 0.38, F(5, 6929) = 865.61, p < 0.001, which indicates that the independent variables explain 38.4% of the observed variance in mathematics achievement.

Table 3.1 Correlation Matrix of the Five Independent Variables for TIMSS Mathematics Achievement

                                     A      B      C      D      E
A  Parents' Education Level       1.00
B  Mathematics Self-concept        .13   1.00
C  Extra Hour Tutoring Time       -.06   -.16   1.00
D  Computer Availability           .23    .10   -.05   1.00
E  Number of Books                 .36    .12   -.09    .23   1.00

Table 3.2 lists the three building blocks for calculating Pratt's measures: (1) the R-squared value for the model, (2) the standardized regression coefficients, $\hat{\beta}_p$, for the five independent variables, and (3) the simple correlations between each of the independent variables and the dependent variable, $\hat{\rho}_p$. Table 3.2 also shows the product term $\hat{\beta}_p \hat{\rho}_p$ and the resultant Pratt's measure $d_p$ for each independent variable. Note that the sum of the product terms $\hat{\beta}_p \hat{\rho}_p$ across the five independent variables is equal to .384, which is also the R-squared value for the regression model.
Namely, the contribution of each of the independent variables to the R-squared value can be readily attributed to its corresponding value of $\hat{\beta}_p \hat{\rho}_p$. Also, Pratt's measures partitioned the standardized R-squared value into five non-overlapping parts of 0.15, 0.31, 0.18, 0.09, and 0.27 that add up to 1.0.

Because the importance measures are now non-overlapping and additive, one can order the relative importance by the size of $d_p$. For the present example, Mathematics Self-confidence is the most important independent variable for mathematics performance relative to the other four independent variables chosen for this model. Computer Availability is the least important; in fact, this variable alone is considered unimportant because $d_4 = 0.09 < 1/(2p) = 0.1$ ($p = 5$). In addition, if a researcher would like to know for some theoretical reason how important students' educational resources, defined as the number of books and the availability of computers at home, are to their mathematics performance, he or she can simply sum up the two Pratt's measures for Number of Books and Computer Availability to yield a joint importance measure using simple addition: $d_{(4+5)} = d_4 + d_5 = 0.09 + 0.27 = 0.36$. Jointly, educational resources account for 36% of the explained variation.

Table 3.2 Pratt's Measures for the Five Independent Variables for TIMSS Mathematics Achievement

Independent Variables                 $\hat{\beta}_p$   $\hat{\rho}_p$   $\hat{\beta}_p\hat{\rho}_p$   $d_p$
1. Parents' Education Level                 .17              .34                .06                 .15
2. Mathematics Self-Confidence              .30              .40                .12                 .31
3. Extra Lesson/Tutoring Time              -.22             -.32                .07                 .18
4. Computer Availability                    .12              .27                .03                 .09
5. Number of Books                          .26              .40                .10                 .27
R-squared & standardized R-squared                                              .38                1.00

Note.
$\hat{\beta}_p$: standardized regression coefficient; $\hat{\rho}_p$: simple correlation between the response variable and the independent variable; $d_p$: Pratt's importance measure, $p = 1, \ldots, 5$.

3.2 The Rationale for Applying Pratt's Importance Measures to Factor Analysis

How can one take advantage of the desirable properties of Pratt's measures discussed above and apply them to enhance the interpretability of an oblique EFA? This question can be answered by describing the connection between multiple regression and factor analysis. The parallelism between the two statistical methods makes the adaptation of Pratt's measures to EFA possible and justifiable. Gorsuch (1983, p. 14) gave a broad-stroke description of the connection by framing both multiple regression and factor analysis under the umbrella of the multivariate linear model (MLM). Namely, both multiple regression analysis and factor analysis are special cases of the MLM. Using the notation provided by Gorsuch, the MLM with q dependent variables can be written as a general equation,

$$Y_q = \beta_{q1} X_{q1} + \beta_{q2} X_{q2} + \cdots + \beta_{qp} X_{qp} + U_q, \qquad (3.3)$$

where
$Y_q$ is the score of the dependent variable $q$,
$X_{qp}$ is the independent variable $p$ for the dependent variable $q$,
$\beta_{qp}$ is the standardized partial regression weight for the independent variable $p$ on the dependent variable $q$, and
$U_q$ is the error term for the dependent variable $q$.

One can easily see that dropping the subscript $q$ in equation (3.3) simplifies the $q$ simultaneous multiple regression equations to a single equation that looks identical to the multiple regression equation in (3.1).

In spite of the similarity in the governing conceptual framework, these two methods have several differences in technical aspects.
Gorsuch (1983) pointed out that the major difference between multiple regression and factor analysis lies in whether the scores of the three major elements of the MLM (the dependent variables, the independent variables, and the weights assigned to the independent variables) are known to the researchers prior to the analysis (i.e., observed). If the scores for the dependent and independent variables are known and only the weights are to be estimated, the modeling technique is called multiple linear regression. If only the scores for the response variables are known and both the scores for the independent variables and the weights are to be estimated, then the multivariate modeling technique is called factor analysis. Hence, a factor analysis with q observed dependent variables can be written as,

$$Y_q = \beta_{q1} F_{q1} + \beta_{q2} F_{q2} + \cdots + \beta_{qp} F_{qp} + U_q. \qquad (3.4)$$

Note that equation (3.4) is identical to the fundamental theory of factor analysis given in equation (1.1) in Chapter One.

The differences between equations (3.3) and (3.4) are that, first, the independent variables are now denoted as $F_{qp}$ and are constructed by accounting for the interrelationships of the $Y_q$, and second, the weights $\beta_{qp}$ for these factors are now termed factor loadings in the orthogonal case or pattern coefficients in the oblique case.

Bring (1996) and Thomas et al. (1998) used the geometry of least squares to interpret Pratt's measures for multiple regression. Applying the same least squares regression concept to factor analysis can help in understanding Pratt's measures in factor analysis. That is, the qth observed variable and each of the p common latent factors are represented as vectors in an N-dimensional vector space, where N is the sample size. A factor model for the qth observed variable is represented by the orthogonal projection of the qth observed variable onto the space spanned by the common latent factors.
When the observed variable and the factors are standardized to have a mean of zero and a variance of one, the qth fitted Y (i.e., $\hat{Y}_q$) is represented algebraically by the weighted sum of the factor vectors, as in equation (1.2).

The connection between multiple regression and EFA through the MLM framework makes the rationale for using Pratt's measures in EFA self-evident. That is, one can simply adapt Pratt's measures to order the importance of the factors with regard to each observed variable via the additive property of Pratt's measures, regardless of whether the factors are orthogonal or oblique. Not only that, as one will see in the following two EFA examples, Pratt's measures will also hold the vertical additive property, despite the factor model being oblique.

Recall that one needs three building blocks in order to produce Pratt's measures: (1) the standardized partial regression coefficients, (2) the correlations between the response variables and the independent variables, and (3) the R-squared values. The goal of applying the Pratt's measures technique to EFA is to produce a Pratt's measure matrix, D, in which the elements are the importance measures of the p factors for the q observed variables. What are the three corresponding blocks in EFA for building a Pratt's measure matrix?
As is evident from the discussion in Chapter Two, the three building blocks are: (1) the pattern matrix P of size $q \times p$, in which the elements are the equivalents of the standardized partial regression coefficients in a multiple regression, (2) the structure matrix S of size $q \times p$, in which the elements are the equivalents of the zero-order correlations between the dependent variable and the independent variables in a multiple regression, and (3) the vector of the communalities, in which the elements are the equivalents of the R-squared value in a multiple regression.

When an oblique rotation is selected, statistical software such as SPSS will produce an output with these matrices for calculating the Pratt's measure matrix. Also, SPSS will produce the factor correlation matrix, which indicates the correlations among the factors. Although this matrix is not needed for calculating the Pratt's measure matrix, the information in this matrix is often of great theoretical value but is omitted if an orthogonal constraint is imposed. Once the three building blocks are identified in an EFA, one can apply Pratt's technique to transform the information in P and S into D.

3.3 A Demonstration of Pratt's Measures in EFA for Continuous Data

Step by step, this section demonstrates how to use the three building blocks in factor analysis to obtain the Pratt's measure matrix by factor analyzing a continuous dataset. This demonstration will show how Pratt's measures can overcome the three interpretational problems arising from factor obliquity.
However, before the demonstrative examples, it is crucial to acknowledge that Pratt's measures, like most other importance measures or loading cut-offs, are model dependent; i.e., they are defined relative to the other factors chosen for a given model. Thus, it is crucial that a researcher has made a sound decision about the dimensionality (i.e., number of factors to retain) prior to the application of the Pratt's measures method.

As in Chapter Two, we use the data from Holzinger and Swineford's (1939) 24 psychological ability tests to demonstrate the application of Pratt's measures in EFA. The four oblique factors were obtained using the same extraction and rotation methods. Table 3.3 consists of the three building blocks required for producing the Pratt's measure matrix. Columns 1 through 4 consist of P, columns 6 to 9 of S, and the last column is a vector of the communalities $h^2$. The numbers in these three matrices are identical to those reported in Chapter Two but are now juxtaposed in one table for calculating and comparing to D.

Calculation of the Pratt's Measure Matrix

Using the three building blocks, the Pratt's measure matrix can be obtained in two simple steps. First, obtain a matrix, PS, the elements of which are the products of a given pattern coefficient and its corresponding structure coefficient, as shown in columns 11 through 14 of Table 3.3. Under a simple redundant relationship, elements in PS represent the proportion of the variance in the tests that can be directly and uniquely attributed to each factor. They were derived by simply multiplying the corresponding pattern and structure coefficients for the four factors. For example, the product term of F1 for T1 is obtained by multiplying 0.06 (pattern) and 0.42 (structure), and is equal to 0.02, indicating that 2% of the variance of T1 can be uniquely attributed to F1.
The product terms of the other three factors can be obtained by the same procedure.

Table 3.3 Pattern, Structure, Pattern×Structure, and Pratt's Measure Matrices, and Communalities for Holzinger & Swineford's (1939) Psychological Ability Data

Col.      1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19     20    21
          P                             S                            PS                             D                            h²
         F1    F2    F3    F4   ΣP²    F1    F2    F3    F4   ΣS²    F1    F2    F3    F4  ΣFPS    F1    F2    F3    F4    ΣFD
T1      .06   .65   .02  -.02   .43   .42   .68   .31   .35   .86   .02   .45   .01  -.01   .47   .05   .95   .01  -.02   1.00   .47
T2     -.05   .62  -.12  -.08   .41   .21   .50   .09   .16   .33  -.01   .31  -.01  -.01   .28  -.04  1.12  -.04  -.05   1.00   .28
T3      .03   .49   .04  -.15   .27   .26   .45   .19   .14   .33   .01   .22   .01  -.02   .22   .04  1.02   .03  -.09   1.00   .22
T4     -.23   .73   .03   .03   .58   .19   .63   .27   .33   .61  -.04   .46   .01   .01   .43  -.10  1.06   .02   .03   1.00   .43
T5      .90  -.07   .04  -.12   .82   .83   .38   .32   .22   .97   .74  -.03   .01  -.03   .70  1.06  -.04   .02  -.04   1.00   .70
T6      .84  -.05  -.03   .05   .71   .82   .43   .31   .35  1.08   .69  -.02  -.01   .02   .68  1.02  -.03  -.01   .02   1.00   .68
T7      .97  -.10  -.04  -.08   .96   .87   .37   .27   .23  1.01   .84  -.04  -.01  -.02   .77  1.09  -.05  -.01  -.02   1.00   .77
T8      .71   .04   .00   .02   .51   .74   .44   .31   .32   .95   .53   .02   .00   .01   .55   .96   .04   .00   .01   1.00   .55
T9      .85   .05  -.06  -.01   .72   .85   .48   .30   .32  1.14   .72   .02  -.02   .00   .72  1.00   .03  -.02   .00   1.00   .72
T10    -.02  -.28   .88   .02   .84   .19   .10   .76   .29   .71   .00  -.03   .67   .01   .64  -.01  -.04  1.04   .01   1.00   .64
T11     .15  -.03   .54   .14   .33   .40   .35   .64   .43   .88   .06  -.01   .35   .06   .45   .13  -.03   .76   .13   1.00   .45
T12    -.10   .14   .70  -.13   .54   .20   .32   .66   .23   .63  -.02   .04   .46  -.03   .45  -.04   .09  1.02  -.07   1.00   .45
T13    -.06   .34   .51  -.10   .39   .29   .48   .58   .29   .74  -.02   .16   .29  -.03   .41  -.05   .40   .72  -.07   1.00   .41
T14     .07  -.15  -.15   .76   .62   .23   .22   .18   .64   .55   .02  -.03  -.03   .49   .45   .04  -.07  -.06  1.09   1.00   .45
T15    -.18   .05  -.09   .65   .46   .07   .24   .16   .56   .40  -.01   .01  -.02   .36   .34  -.03   .04  -.04  1.04   1.00   .34
T16    -.04   .31  -.04   .47   .32   .30   .52   .30   .60   .81  -.01   .16  -.01   .28   .42  -.03   .39  -.03   .67   1.00   .42
T17    -.06  -.21   .25   .58   .44   .16   .17   .41   .56   .54  -.01  -.03   .10   .33   .38  -.02  -.09   .26   .85   1.00   .38
T18    -.02   .03   .13   .43   .20   .21   .29   .33   .49   .48   .00   .01   .04   .21   .26  -.02   .03   .17   .82   1.00   .26
T19     .14   .10   .00   .33   .14   .33   .35   .26   .44   .49   .05   .03   .00   .15   .23   .20   .15   .00   .65   1.00   .23
T20     .17   .44  -.12   .18   .27   .44   .57   .22   .42   .74   .08   .25  -.03   .07   .38   .20   .67  -.07   .20   1.00   .38
T21     .09   .36   .31   .04   .23   .42   .56   .52   .41   .93   .04   .20   .16   .02   .42   .09   .48   .39   .04   1.00   .42
T22     .37   .38  -.04   .04   .28   .58   .58   .29   .36   .89   .21   .22  -.01   .01   .44   .49   .51  -.02   .03   1.00   .44
T23     .21   .52   .07   .05   .32   .55   .69   .40   .44  1.13   .12   .36   .03   .02   .53   .22   .69   .05   .04   1.00   .53
T24     .31   .05   .29   .17   .22   .53   .44   .52   .46   .95   .17   .02   .15   .08   .42   .39   .06   .36   .19   1.00   .42
F2      .55
F3      .40   .43
F4      .40   .52   .47
ΣT                                                                 4.14  2.77  2.15  1.97 11.03  6.61  7.37  4.54  5.47  24.00
%(F)                                                              17.30 11.50  9.00  8.20 46.00 27.60 30.70 18.90 22.80 100.00

Note. h²: communality; P: pattern matrix; S: structure matrix; PS: a matrix in which the elements are the products of a given pattern coefficient and its corresponding structure coefficient; D: Pratt's measure matrix; ΣFPS is the sum of the elements in PS for a given test across the four factors; ΣTPS (row ΣT, columns 11 to 15) is the sum of the elements in PS for a given factor across the 24 tests; ΣFD is the sum of the elements in D for a given test across the four factors; ΣTD (row ΣT, columns 16 to 20) is the sum of the elements in D for a given factor across the 24 tests; %(F) denotes the percentage of total variance of the 24 tests explained by a given factor. Rows F2 to F4 give the factor correlations.
Note. The highest Pratt's measures are highlighted in bold. Pratt's measures that are not the most important measures and yet are not considered unimportant are underlined.

Second, calculate Pratt's importance measures by dividing the product terms by the communality, as shown in columns 16 through 19 of Table 3.3. Pratt's measure of F1 for T1, for instance, is calculated by dividing the product term of F1 by the communality value, that is, 0.02/0.47 = 0.05, indicating that 5% of the standardized common variance in T1 is uniquely due to F1.
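The two steps can be sketched in Python for three of the tests. The coefficients are copied from Table 3.3; as an assumption of this sketch, the communality is recomputed as the row sum of the products (which columns 15 and 21 of Table 3.3 show to agree) rather than read from the software output. Because the inputs are rounded to two decimals, the resulting measures can differ from the tabled values in the second decimal.

```python
import numpy as np

# Pattern (P) and structure (S) coefficients for tests T1, T8, and T21,
# copied from Table 3.3 (rounded to two decimals).
tests = ["T1", "T8", "T21"]
P = np.array([[0.06, 0.65, 0.02, -0.02],
              [0.71, 0.04, 0.00, 0.02],
              [0.09, 0.36, 0.31, 0.04]])
S = np.array([[0.42, 0.68, 0.31, 0.35],
              [0.74, 0.44, 0.31, 0.32],
              [0.42, 0.56, 0.52, 0.41]])

PS = P * S                          # step 1: element-wise products
h2 = PS.sum(axis=1)                 # row sums reproduce the communalities
D = PS / h2[:, None]                # step 2: divide each row by its communality

for name, row, comm in zip(tests, D, h2):
    print(name, np.round(row, 2), "h2 =", round(comm, 2))
```

Each row of `D` sums to one by construction, which is the horizontal additive property in code form.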
Note that the sum of the four Pratt's measures is equal to 1, as shown in column 20 of Table 3.3, indicating that Pratt's measures partitioned the standardized communality into non-overlapping parts despite the factors being moderately correlated. This is crucial evidence that Pratt's measures work for Pearson's correlation matrix for continuous data in EFA. Applying the same procedures to all the other tests will produce the Pratt's measure matrix as shown in columns 16 to 19 in Table 3.3.

Interpreting the Pratt's Measure Matrix

Before interpreting Pratt's measure matrix D shown in columns 16 to 19 of Table 3.3, note that the highest Pratt's measure for each test was highlighted in bold. Also, importance measures $d_p < 1/(2p) = 0.125$ ($p = 4$) were considered unimportant and can be ignored using Thomas' (1992) criterion. In addition, the Pratt's measures that were not the most important but were not considered unimportant were underlined. There are two approaches for interpreting the Pratt's measure matrix, depending on the purpose and stage of the factor analysis. Horizontal interpretation reads across the factors one test at a time. As explained earlier, horizontal interpretation is most appropriate when the substantive meaning of the factors is fairly well known, or when the emphasis of the interpretation is on making direct causal inferences of the factors on the tests. For the present example, the substantive meanings of the four factors have been repeatedly verified and interpreted across empirical data and labelled as "Verbal", "Spatial", "Speed", and "Memory" (e.g., Harman, 1976; Preacher & MacCallum, 2003; Russell, 2002).
The interpretation would emphasize how each factor influences the variation of a given test; hence a focus on horizontal interpretation would be more constructive and appropriate at this stage.

Take T8 for example. Pratt's measures partitioned the communality into four additive parts that sum to 1.00 (with rounding error): 0.96, 0.04, 0.00, and 0.01, which are uniquely attributable to F1 to F4, respectively. Because of the horizontal additive property, one can conclude that 96% of the communality in T8 is attributable to F1 (Verbal), 4% to F2 (Spatial), 0% to F3 (Speed), and 1% to F4 (Memory). This is a statement that cannot be made by interpreting P or S alone or by juxtaposing both. Using the $d_p < 1/(2p) = 0.125$ rule, F2, F3, and F4 could be considered unimportant because F1 dominates the contribution to the communality. Observe that Pratt's measures yielded more mutually distinctive values than the pattern coefficients of 0.71, 0.04, 0.00, and 0.02 or the structure coefficients of 0.74, 0.44, 0.31, and 0.32 alone. Specifically, Pratt's measures transformed the pattern and structure coefficients into an even closer approximation of simple structure. Also, juxtaposing the two oblique coefficients would lead one to an inconsistent interpretation if the traditional cut-off of 0.3 were used.

Vertical interpretation is most appropriate when the purpose of the factor analysis is to understand the substantive meaning of the factors or to identify subscales among a set of items. It is achieved by reading down the tests for one factor at a time. Assuming that the substantive nature of the four factors was unclear to the investigators, the meaning of unknown factors can be inferred from the common meaning of the cluster of tests that share the same factor as the most important contributor. For example, the meaning of F4 can be inferred from T14, T15, T16, T17, T18, and T19, which share F4 as their most important contributor.
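As a rough sketch of the vertical reading, the following Python fragment groups a handful of tests by their most important factor and lists any other factor that reaches Thomas' (1992) 1/(2p) cut-off. The Pratt's measures are copied from columns 16 to 19 of Table 3.3; the variable names `clusters` and `secondary` are illustrative only.

```python
# Pratt's measures (columns 16-19 of Table 3.3) for a handful of tests.
D = {
    "T5":  [1.06, -0.04, 0.02, -0.04],
    "T8":  [0.96, 0.04, 0.00, 0.01],
    "T1":  [0.05, 0.95, 0.01, -0.02],
    "T10": [-0.01, -0.04, 1.04, 0.01],
    "T14": [0.04, -0.07, -0.06, 1.09],
    "T16": [-0.03, 0.39, -0.03, 0.67],
    "T19": [0.20, 0.15, 0.00, 0.65],
}
factors = ["F1", "F2", "F3", "F4"]
cutoff = 1 / (2 * len(factors))      # Thomas' (1992) rule: 1/(2p) = 0.125

clusters = {f: [] for f in factors}  # tests grouped by dominant factor
secondary = {}                       # other factors at or above the cutoff
for test, row in D.items():
    top = max(range(len(row)), key=lambda j: row[j])
    clusters[factors[top]].append(test)
    secondary[test] = [factors[j] for j, d in enumerate(row)
                       if j != top and d >= cutoff]

print(clusters)
print(secondary)
```

Tests with a non-empty `secondary` list correspond to the entries that are neither the most important nor unimportant for their test.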
Note that the task of clustering the tests is greatly eased by examining Pratt's measures because they yielded considerably more distinctive values than the pattern or structure coefficients.

Of course, there is no reason why one cannot make both the horizontal and vertical interpretations when needed. In fact, the two-directional approach will give a more complete understanding of the factor model.

In addition to the improvement over the conventional interpretation, Pratt's measures have a unique advantage that cannot be achieved by the pattern and structure coefficients. Because Pratt's measures allow importance to be defined additively, researchers can directly add up the importance measures of theoretical interest to them, and show how much two or more factors jointly contribute to the common variance of a particular test. For instance, the joint contribution of F1 and F2 to T23 is equal to $d_1 + d_2 = 0.22 + 0.69 = 0.91$. F1 and F2 jointly explained 91% of the communality of T23.

How Pratt's Measures Resolve the Three Interpretational Problems of an Oblique Model

As we have seen in Chapter Two, interpreting P or S individually or juxtaposing both has three inherent interpretational problems caused by factor obliquity. These problems can be resolved by Pratt's method as explained below.

Problem One: The Dilemma of Choosing P or S

The inconsistency between pattern and structure coefficients can be clearly observed for T21 in Table 3.3. Observe that F1 has a very small pattern coefficient of 0.09 on T21, but a moderate structure coefficient of 0.42 with T21. As discussed earlier, interpreting either the pattern or structure coefficient on its own is problematic and insufficient. Using the traditional cut-off of 0.3 or 0.4 for a practically meaningful relationship as suggested in the literature, the pattern and structure coefficients would reach contradictory conclusions about whether a "meaningful" relationship exists between T21 and F1.
Furthermore, simply juxtaposing both coefficients, as displayed in the first 10 columns of Table 3.3 and as recommended in the current literature, cannot solve the difficulty either.

Unlike the pattern and structure coefficients alone or simply juxtaposing both, the Pratt's measure for T21 and F1, 0.09 (9%), is a correct representation of how much of the standardized variation of T21 was accounted for uniquely by F1. Namely, Pratt's measures transform the two oblique coefficients into one single index of variance explained by a factor. Thus, there is no longer a need to interpret P or S individually and encounter the dilemma of each interpretation leading to different conclusions.

Problem Two: The Distortion of Horizontal and Vertical Additive Properties

In Chapter Two, we explained and showed that, under obliquity, the horizontal additive property holds neither for the pattern nor for the structure coefficients, and the vertical additive property holds only for the pattern coefficient. Here, we see that Pratt's measures restore the horizontal and vertical additive properties while allowing factors to be oblique.

When the horizontal additive property is distorted by factor obliquity, neither the pattern coefficient nor the structure coefficient on its own can properly partition the communality of an observed variable. However, the product of the pattern and structure coefficients can. This property is demonstrated by the sums of the products across the four factors (denoted as ΣFPS in column 15 of Table 3.3), all being equal to the communalities. Take T8 for example: the sum of the four product terms (with rounding error), 0.53 + 0.02 + 0.00 + 0.01 = 0.55, is equal to the communality of T8. Alternatively, the horizontal additive property can be observed by the sum of Pratt's measures for each of the 24 tests being equal to 1.0, the standardized communality.
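Why the products, rather than either coefficient alone, partition the communality can be sketched in a short derivation. It assumes the standard oblique common factor model, in which the communality of test q is the qth diagonal element of PRP' and the structure matrix satisfies S = PR (equation 2.1), with R the symmetric factor correlation matrix. The element-wise symbols $p_{qk}$, $s_{qk}$, $r_{kl}$, and $d_{qk}$, with $k$ and $l$ indexing factors, are notation introduced here for illustration only.

```latex
\begin{aligned}
h_q^2 &= \sum_{k}\sum_{l} p_{qk}\, r_{kl}\, p_{ql}
       = \sum_{k} p_{qk} \Big( \sum_{l} p_{ql}\, r_{lk} \Big)
       = \sum_{k} p_{qk}\, s_{qk}, \\
\sum_{k} d_{qk} &= \sum_{k} \frac{p_{qk}\, s_{qk}}{h_q^2} = 1 .
\end{aligned}
```

When the factors are orthogonal, R is the identity matrix, so $s_{qk} = p_{qk}$ and the first line reduces to the familiar sum of squared loadings.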
In other words, Pratt's measures divide the standardized communalities of the observed variables into non-overlapping parts that are readily attributable to each factor, while allowing the moderate factor correlations to be revealed, as shown at the bottom of columns 1 to 3 of Table 3.3. In spite of the obliquity, the simple transformation of the pattern and structure coefficients into Pratt's measures restores the horizontal additive property held conventionally only by the orthogonal loadings.

Pratt's measures maintained the vertical additive property held by the orthogonal loadings and pattern coefficients. In Table 3.3, we can see that the amount of total variance explained by each of the four factors, denoted as ΣTPS, is calculated by adding up the elements in PS down the 24 tests. Their sum across the four factors is equal to the total explained variance of 11.03, identical to what we have seen in Chapter Two. This information is shown in the second last row of columns 11 to 14. Accordingly, the proportion of total variance explained by each factor, denoted as %(F), can also be accurately calculated, as shown in the last row of columns 11 to 14. The vertical additive property in summing up the standardized total variance can also be seen in the last two rows underneath the Pratt's measure matrix in Table 3.3. Observe that the standardized total variance, 24, is equal to the sum of the amounts explained by each factor individually; the proportion of standardized total variance, 100%, is also equal to the sum of the proportions explained by each factor individually.

Problem Three: Inappropriateness of the Traditional Rules for Being "Meaningful"

In Chapter Two, we argued that the traditional rules for a meaningful variable-factor relationship were based on the premises that (1) the horizontal additive property holds, and (2) the absolute magnitude of the relationship is bounded within the range of 0 and 1. As we have explained and demonstrated, Pratt's measures are in accord with these premises.
Thus, if desired, it is appropriate to use the traditionally suggested 0.3 or 0.4 rule as the threshold for a practically meaningful relationship when interpreting D. Alternatively, one can use the 1/(2p) criterion suggested for multiple regression by Thomas (1992) as the minimum value for being considered important.

Readers may have noticed, and might argue, that interpreting P vertically using the traditional cut-off rules would identify the same vertical pattern as does D, and hence conclude that D is redundant because it provides no new information beyond P. However, the discussion in Chapter Two should remind readers that this argument is problematic. That is, although P indeed preserves the vertical additive property, the traditional cut-off rules applied to P are in fact invalid because they were suggested for orthogonal loadings, not to mention that P itself is invalid for horizontal interpretation. The limitations of the oblique coefficients restrict the researchers' interpretational orientation. In contrast, D frees the researchers from the limitations of the oblique coefficients. It allows both vertical and horizontal interpretation to obtain a more complete understanding of the factor model when needed.

As discussed in Chapter Two, however, like other regression coefficients or importance measures, Pratt's measures work optimally when the observed variables and the factors follow the simple redundant relationship and display no multicollinearity or suppression effect. Under these complex scenarios, the cut-offs may become uninterpretable for D because non-trivial negative Pratt's measures may occur other than simply by chance, and Pratt's measures may exceed the bounds of 0 and 1. Taking T4 for example, the Pratt's measure for F1 is -0.10, which indirectly makes the Pratt's measure for F2 greater than 1 (1.06). Examining the pattern and structure coefficients between T4 and F1, one can see that they are of different signs, displaying a negative suppression effect.
Such a complex relationship, which disobeys the simple redundant relationship, should be interpreted separately using a different paradigm, as we will discuss further in Chapter Five.

3.4 A Demonstration of Pratt's Measures in EFA for Categorical Data

The second example involves the Likert-type (i.e., rating scale) item responses that are widely seen in social and behavioural science research. One way of factor analyzing categorical data is to base the analysis on the tetrachoric correlation matrix for binary data or the polychoric correlation matrix for polytomous data. The polychoric correlation is derived by estimating the linear relationship between two underlying unobserved variables, which are assumed to govern people's observed ordered categorical responses. The estimation of the polychoric correlation assumes the underlying unobserved variables are continuous and normally distributed (see Muthén, 1983; 1984).

Our example data come from the background questionnaire of the 2003 TIMSS study. The participants were 8,385 U.S. grade-eight students who answered questions regarding how much time they spent on each of nine outside-of-school activities. The question is: "On a normal school day, how much time do you spend before or after school doing each of these things?" One of the activities on the list was, for example, "I watch television and videos". The questions were measured on a 5-point Likert scale and coded as: (1) no time, (2) less than 1 hour, (3) 1-2 hours, (4) more than 2 but less than 4 hours, and (5) 4 or more hours. The purpose of choosing these data is to put Pratt's measures to an empirical test to see whether they also work for a polychoric correlation matrix derived from categorical data.

Because SPSS does not produce estimates of polychoric correlations, one has to rely on PRELIS2 (Jöreskog & Sörbom, 1999) to estimate the polychoric correlation matrix and save it as a data file [3].
Unfortunately, PRELIS does not provide output for the structure matrix, so one has to resort to SPSS EFA to obtain the building blocks for the Pratt's measure matrix. Using SPSS syntax, one can directly read in the polychoric correlation matrix and conduct an EFA. Syntax for reading the polychoric correlation matrix and running EFA in SPSS is given in Appendix C. An alternative way to obtain S is to use equation (2.1), $S_{q \times p} = P_{q \times p} R_{p \times p}$, where P and R are automatically output by PRELIS.

Unlike the previous example, as far as we are aware, these data were never factor analyzed and published; hence there is no a priori theory to help the researcher decide on the number of factors to extract, the substantive meaning of the factors, or whether the factors are correlated. We used parallel analysis with 100 replications to assist in choosing a preliminary number of factors to extract, in addition to the conventional eigenvalue-greater-than-1.0 rule. Using the above-described procedures of factor analyzing a polychoric correlation matrix and the unweighted least squares (ULS) extraction method in SPSS, the data show a three-factor structure that is supported by both the "eigenvalue greater than one" rule and the parallel analysis (see Table 3.4).

[2] Tetrachoric and polychoric correlation matrices can also be obtained from the program "FACTOR" developed by Lorenzo-Seva & Ferrando (in press). Free download of this program is available from http://psico.fcep.urv.es/utilitats/factor/.

[3] This procedure can be done in PRELIS by choosing Statistics/Factor Analysis/Output Options/Moment Matrix (choose Correlation, click Save to File, and give a .dat file name).

Table 3.4 Eigenvalues and Parallel Analysis for 2003 TIMSS Outside School Activities Data

| Factor | Eigenvalue (EFA) | Eigenvalue (PA) |
|---|---|---|
| 1 | 2.372 | 1.092 |
| 2 | 1.510 | 1.064 |
| 3 | 1.138 | 1.040 |
| 4 | 0.911 | 1.019 |
| 5 | 0.832 | 0.999 |
| 6 | 0.667 | 0.980 |
| 7 | 0.651 | 0.958 |
| 8 | 0.520 | 0.940 |
| 9 | 0.400 | 0.910 |

Note. PA: parallel analysis.
The first three factors were retained because their eigenvalues are larger than 1 and are greater than those of PA.

In this application, ULS was applied to extract the common factors rather than the weighted least squares or generalized least squares method. Our rationale for using ULS was based on Jöreskog's (2003) contention that ULS and MINRES (minimum residuals; Harman, 1976) are equivalent and give the same robust solutions. According to Jöreskog, ULS can be used even when the correlation matrix is not positive definite, an occasionally occurring scenario when a tetrachoric or polychoric correlation matrix is analyzed. Jöreskog also stated "ULS is particularly suited for exploratory factor analysis where only parameter estimates and not standard error estimates and chi-squared values are of interest" (p. 1). The pattern matrix, structure matrix, and the communalities for building the Pratt's measure matrix are shown in Table 3.5.

The resultant Pratt's measure matrix in Table 3.5 clearly shows that Pratt's measures also work empirically for categorical data based on a polychoric correlation matrix. This is illustrated by the fact that the standardized communalities for the nine items all add up to 1.0. Furthermore, the interpretability of the three-factor solution is tremendously enhanced. This is illustrated by the remarkably distinct proportions of communality explained by each of the three factors.

Table 3.5 Pattern, Structure, Pattern×Structure, & Pratt's Measure Matrices, and Communalities for 2003 TIMSS Outside School Activities Data

Pattern (P) and structure (S) matrices

| Item | P: F1 | P: F2 | P: F3 | ΣP² | S: F1 | S: F2 | S: F3 | ΣS² |
|---|---|---|---|---|---|---|---|---|
| 1. Watching TV & video | .52 | .06 | -.07 | .28 | .55 | .26 | -.05 | .37 |
| 2. Playing computer game | .79 | -.10 | .10 | .65 | .75 | .25 | .08 | .63 |
| 3. Playing/talking w/ friends | .32 | .40 | -.18 | .30 | .49 | .49 | -.07 | .48 |
| 4. Doing jobs at home | -.06 | .53 | .31 | .39 | .16 | .59 | .46 | .58 |
| 5. Working at a paid job | -.02 | .46 | .04 | .21 | .17 | .46 | .16 | .27 |
| 6. Playing sports | .02 | .44 | -.08 | .20 | .20 | .42 | .03 | .22 |
| 7. Reading a book | .09 | -.10 | .64 | .43 | .05 | .11 | .61 | .39 |
| 8. Using internet | .60 | .03 | .05 | .37 | .61 | .29 | .06 | .47 |
| 9. Doing homework | -.04 | .09 | .44 | .20 | .00 | .19 | .46 | .25 |

Pattern×Structure (PS) and Pratt's measure (D) matrices

| Item | PS: F1 | PS: F2 | PS: F3 | Σ_F PS | D: F1 | D: F2 | D: F3 | Σ_F D | h² |
|---|---|---|---|---|---|---|---|---|---|
| 1. Watching TV & video | .29 | .02 | .00 | .30 | .94 | .05 | .01 | 1.00 | .30 |
| 2. Playing computer game | .59 | -.03 | .01 | .58 | 1.03 | -.04 | .01 | 1.00 | .58 |
| 3. Playing/talking w/ friends | .16 | .20 | .01 | .37 | .43 | .54 | .03 | 1.00 | .37 |
| 4. Doing jobs at home | -.01 | .31 | .14 | .45 | -.02 | .70 | .32 | 1.00 | .45 |
| 5. Working at a paid job | .00 | .21 | .01 | .21 | -.01 | .99 | .03 | 1.00 | .21 |
| 6. Playing sports | .00 | .19 | .00 | .19 | .03 | .99 | -.02 | 1.00 | .19 |
| 7. Reading a book | .00 | -.01 | .39 | .39 | .01 | -.03 | 1.02 | 1.00 | .39 |
| 8. Using internet | .37 | .01 | .00 | .38 | .97 | .02 | .01 | 1.00 | .38 |
| 9. Doing homework | .00 | .02 | .20 | .22 | .00 | .08 | .92 | 1.00 | .22 |
| Σ_T | 1.41 | .91 | .77 | 3.09 | 3.37 | 3.29 | 2.34 | 9.00 | |
| %(F) | 15.61 | 10.16 | 8.55 | 34.31 | 37.45 | 36.58 | 25.97 | 100.00 | |

Factor correlations

| | F1 | F2 |
|---|---|---|
| F2 | .41 | |
| F3 | .01 | .27 |

Note. h²: communality; P: pattern matrix; S: structure matrix; PS: a matrix in which the elements are the products of a given pattern coefficient and its corresponding structure coefficient; D: Pratt's measure matrix; Σ_F PS is the sum of the elements in PS for a given item across the three factors; Σ_T PS is the sum of the elements in PS for a given factor across the nine items; Σ_F D is the sum of the elements in D for a given item across the three factors; Σ_T D is the sum of the elements in D for a given factor across the nine items; %(F) denotes the percentage of total variance of the nine items explained by a given factor.

Note. The highest Pratt's measures are highlighted in bold. Pratt's measures that are not the most important measures and yet are not considered unimportant are underlined.

Because the meaning of the factors is unknown, vertical interpretation is most suitable for the present stage. The vertical interpretation is warranted by the vertical additive property of Pratt's measures. When examining the Pratt's measures along the nine items, F1 contributes to almost all the common variance of "Watching TV & Videos" (94%) and "Using internet" (97%), and all the common variance of "Playing computer games" (100%), three activities that all involve some form of electronic activity.
Similarly, F2 is the major contributor to "Playing or talking with friends" (54%) and "Doing jobs at home" (70%), and is almost the sole contributor to the common variances of "Playing sports" (99%) and "Working at a paid job" (99%). By identifying these four items that share F2 as the most important contributor to their communalities, F2 can be interpreted as a dimension of involvement in social interaction and support activities. In the same sense, F3 can be interpreted as involvement in "reading or studying" because it is the sole contributor to the common variance of the item "Reading a book for enjoyment" (100%), and a major contributor to the common variance of "Doing homework" (92%).

Once the factor interpretation is realized, one can then interpret the directional relationships of the factors on the observed variables using the horizontal additive property. Another useful way that Pratt's measures can enhance the horizontal interpretation is to identify unimportant contributors using the d_p < 1/(2p) rule. When interpreted horizontally, factors with Pratt's measures less than 0.167 (p = 3) can be regarded as unimportant for the variation of a given item. For example, even though F2 explains 8% of the common variance in the item "Doing homework", its contribution could be considered unimportant and ignored. Identification of the unimportant factors helps to eliminate unnecessary complexities when interpreting the factor solutions.

It is interesting to observe that the d_p < 1/(2p) criterion clearly separates the most important factors from the "unimportant". For items with no factorial complexity (no cross-loading), the 1/(2p) criterion sorts the three factors into either "most important" or "unimportant" with no middle ground (i.e., in between most important and unimportant). For example, for item 9 "Doing homework", the factors that are not most important (F1 and F2) were both identified as unimportant.
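As a hedged illustration of the steps above, the following sketch recomputes the Pratt's measure matrix from the two-decimal pattern and structure coefficients printed in Table 3.5 and then applies the 1/(2p) rule horizontally. The function and variable names are hypothetical, and NumPy is an assumed tool (the analyses reported here were run in SPSS and PRELIS). Because the inputs are rounded, the results only approximate the tabled values.

```python
import numpy as np

# Pattern (P) and structure (S) coefficients as printed in Table 3.5
# (rows: items 1-9; columns: F1, F2, F3). Rounded inputs, so results are approximate.
P = np.array([
    [ .52,  .06, -.07],
    [ .79, -.10,  .10],
    [ .32,  .40, -.18],
    [-.06,  .53,  .31],
    [-.02,  .46,  .04],
    [ .02,  .44, -.08],
    [ .09, -.10,  .64],
    [ .60,  .03,  .05],
    [-.04,  .09,  .44],
])
S = np.array([
    [ .55,  .26, -.05],
    [ .75,  .25,  .08],
    [ .49,  .49, -.07],
    [ .16,  .59,  .46],
    [ .17,  .46,  .16],
    [ .20,  .42,  .03],
    [ .05,  .11,  .61],
    [ .61,  .29,  .06],
    [ .00,  .19,  .46],
])


def pratt_matrix(pattern, structure):
    """Hypothetical helper: divide each row of P*S by its row sum (the communality)."""
    ps = pattern * structure            # element-wise products (the PS matrix)
    h2 = ps.sum(axis=1, keepdims=True)  # communalities, one per item
    return ps / h2


D = pratt_matrix(P, S)
p = P.shape[1]
cutoff = 1.0 / (2 * p)                  # 1/(2p) rule; about 0.167 when p = 3

# For each item, list the factors whose Pratt's measure reaches the cut-off.
important = [
    [f"F{j + 1}" for j in range(p) if D[i, j] >= cutoff]
    for i in range(D.shape[0])
]

for item, factors in enumerate(important, start=1):
    print(item, np.round(D[item - 1], 2), factors)
```

Under these rounded inputs, each row of D sums to 1, and only items 3 and 4 retain more than one factor at or above the cut-off.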
The only two exceptions were items 3 and 4, which clearly displayed factorial complexity, as shown by their pattern coefficients, indicating that more than one factor was an important cause of the observed variations. For instance, although F1 is not the most important contributor for item 3 "Playing or talking with friends", it still accounts for 43% of the communality, which is greater than the unimportant criterion, and should be recognized when interpreting horizontally.

Another valuable piece of information resulting from applying Pratt's measures is that the associations among the three factors can be revealed. This correlation matrix is shown at the bottom of Table 3.5. One can see that although F1 (electronic activities) and F3 (reading and studying) are nearly orthogonal, F1 (electronic activities) and F2 (social interaction) are estimated to correlate at .41, and F2 (social interaction) and F3 (reading and studying) are estimated to correlate at .27. Information on factor correlation would have been omitted if an orthogonal rotation had been applied for mathematical simplicity.

Closing Remarks for Chapter Three

The purpose of this chapter was to introduce the use of the Pratt's measures method in factor analysis and show how it could resolve three interpretational difficulties arising from factor obliquity. First, we showed that it integrates the information in both the pattern and structure coefficients, so there is no need to choose which to trust if the individual interpretations lead to inconsistent conclusions. Second, it restores the horizontal and vertical additive properties that are conventionally warranted only for orthogonal factor models. Third, it resolves the major problems of the rules for "meaningful" cut-offs suggested for orthogonal models.
In essence, the method of Pratt's measures also resolves the dilemma of choosing between the theoretical flexibility facilitated by an oblique model and the mathematical simplicity facilitated by an orthogonal model. This historical debate is now dispensable because the advantages of both rotational methods can be achieved by the Pratt's measures method.

Readers should be warned that, as in multiple regression, the interpretation of Pratt's importance measures in EFA is model dependent. Namely, the importance of a factor is defined relative to the other factors for a particular factor solution. For this reason, it is not appropriate to compare a factor's importance across factor solutions involving different numbers of factors. Hence, dimensional specification is crucial for Pratt's measures to operate meaningfully in EFA. If the Pratt's measure matrix is not interpretable, this may suggest that alternative models with a different number of factors, or a different rotation method, should be explored.

Also, because Pratt's measures partition the explained variance into additive parts, researchers should be aware of how much explained variance they begin with in their analyses. That is, if the communalities of the observed response variables are low, application of Pratt's measures is of little value because one cannot make sense of something that has little to explain, a problem that neither rotational methods nor Pratt's measures can address. In this case, attention should be paid to the selection of the observed variables before any interpretation of the factor solution.

References

Azen, R., & Budescu, D. V. (2003). The dominance analysis approach for comparing predictors in multiple regression. Psychological Methods, 8, 129-148.

Bring, J. (1996). A geometric approach to compare variables in a regression model. The American Statistician, 50, 57-62.

Conger, A. J. (1974).
A revised definition for suppressor variables: A guide to their identification and interpretation. Educational and Psychological Measurement, 34, 35-46.

Gorsuch, R. L. (1983). Factor analysis (2nd ed.). Hillsdale, NJ: Erlbaum.

Green, P. E., Carroll, J. D., & DeSarbo, W. S. (1978). A new measure of predictor variable importance in multiple regression. Journal of Marketing Research, 15, 356-360.

Harman, H. H. (1976). Modern factor analysis (3rd ed.). Chicago: University of Chicago Press.

Holzinger, K. J., & Swineford, F. (1939). A study in factor analysis: The stability of a bi-factor solution. Supplementary Educational Monographs. Chicago: University of Chicago.

Jöreskog, K. G. (2003). Factor analysis by MINRES (Scientific Software International Technical Documentation). Retrieved March 22, 2006, from http://www.ssicentral.com/lisrel/resources.html.

Jöreskog, K. G., & Sörbom, D. (1999). LISREL 8 user's reference guide. Chicago: Scientific Software International.

Kruskal, W. (1987). Relative importance by averaging over orderings. The American Statistician, 41, 6-10.

Lancaster, B. P. (1999, January). Defining and interpreting suppressor effects: Advantages and limitations. Paper presented at the annual meeting of the Southwest Educational Research Association, San Antonio, Texas.

Lorenzo-Seva, U., & Ferrando, P. J. (in press). FACTOR: A computer program to fit exploratory factor analysis model. Behavior Research Methods.

Pratt, J. W. (1987). Dividing the indivisible: Using simple symmetry to partition variance explained. In T. Pukkila & S. Puntanen (Eds.), Proceedings of Second Tampere Conference in Statistics (pp. 245-260). University of Tampere, Finland.

Preacher, K. J., & MacCallum, R. C. (2003). Repairing Tom Swift's electric factor analysis machine. Understanding Statistics, 2, 13-43.

Russell, D. W. (2002). In search of underlying dimensions: The use (and abuse) of factor analysis in Personality and Social Psychology Bulletin.
Personality and Social Psychology Bulletin, 28(12), 1629-1646.

Thomas, D. R. (1992). Interpreting discriminant functions: A data analytic approach. Multivariate Behavioral Research, 27, 335-362.

Thomas, D. R., & Zumbo, B. D. (1996). Using a measure of variable importance to investigate the standardization of discriminant coefficients. Journal of Educational & Behavioral Statistics, 21, 110-130.

Thomas, D. R., Hughes, E., & Zumbo, B. D. (1998). On variable importance in linear regression. Social Indicators Research, 45, 253-275.

Thomas, D. R., Zhu, P. C., Zumbo, B. D., & Dutta, S. (2006, June). Variable importance in logistic regression based on partitioning an R2 measure. Paper presented at the 2006 Annual Meeting of the Administrative Sciences Association of Canada (ASAC), Banff, Alberta.

Thomas, D. R., Zhu, P. C., Zumbo, B. D., & Dutta, S. (in press). On measuring the relative importance of explanatory variables in a logistic regression. Journal of Modern Applied Statistical Methods.

Wu, A. D., Zumbo, B. D., & Thomas, D. R. (2006, April). Variable and factor ordering in factor analyses: Using Pratt's importance measures to help interpret exploratory factor analysis solutions for oblique rotation. Paper presented at the Annual Meeting of the American Educational Research Association (AERA), San Francisco, CA.

Chapter Four: Demonstration of Pratt's Measures in Confirmatory Factor Analysis

Through the use of Pratt's measures in CFA, this chapter serves two purposes. The material in Sections 4.1 and 4.2 constitutes a follow-up study to Graham, Guthrie, and Thompson (2003) [1]. The first purpose is to warn researchers that the structure coefficient of a confirmatory factor analysis can be entirely spurious due to the zero constraint on its corresponding pattern coefficient and the factor obliquity. Interpreting such a structure coefficient as advocated by Graham et al. could be misleading and problematic.
The second purpose is to compare, judging by the CFA fit indices, the fit of the EFA model identified by the Pratt's importance measures > 1/(2p) criterion with the fit of models identified by the cut-offs commonly applied to the pattern and structure coefficients as well as the orthogonal loadings.

In Graham et al.'s (2003) article, the authors used two hypothetical data sets to deliver a central message: interpreting only the pattern coefficients and ignoring the information in the structure coefficients could lead to problematic interpretation of an oblique CFA. Their first data set represents a CFA involving no factorial complexity and the second data set represents a CFA involving one factorial complexity (i.e., one item with a cross-loading). Graham et al. argued that to properly interpret a CFA, both the pattern and structure coefficients should be interpreted by juxtaposing them.

In addition to the reasons we listed in Chapter Two, we believe that the preference for interpreting only the pattern coefficients in CFA practice is due to two other reasons. First, the parameter constraint of CFA, which is the key feature that distinguishes a CFA from an EFA, is actually placed on the pattern coefficients, following the fundamental theory of factor analysis in equation (1.1), rather than on the structure coefficients. Second, the default output of most CFA statistical packages provides only the pattern coefficients.

[1] A version of this chapter will be submitted for publication. Wu, A. D., Zumbo, B. D., & Thomas, D. R. Pratt's importance measures in confirmatory factor analysis.

To fulfill our first purpose, we re-analyzed the two data sets using LISREL with the maximum likelihood estimation method. We demonstrate that the structure coefficients can be entirely or partly spurious; hence, adding the structure coefficients to the interpretation of an oblique CFA as recommended by Graham et al. (2003) is still insufficient, for the reasons we have explained and demonstrated in Chapter Two.
Next, the Pratt's measures method is applied to resolve the interpretational difficulties of an oblique CFA by integrating the information in both the pattern and structure coefficients.

4.1 Pratt's Measures in CFA with no Factorial Complexities

For the first data set in Graham et al. (2003), two factors that correlate at 0.68 were hypothesized to underlie six observed variables. Table 4.1 shows the pattern and structure matrices reported by Graham et al. The second and third columns show that F1 has partial effects only on the first three observed variables, F2 has partial effects only on the last three observed variables, and all the other pattern effects were fixed at zero, indicating no factorial complexities. This example is an ideal manifestation of simple structure and is the template specification for many CFA users. Graham et al. argued that constraining the pattern coefficients to zero does not automatically constrain the structure coefficients to zero if the factors are correlated.

Graham et al.'s (2003) point is illustrated by the pattern and structure coefficients reported in Table 4.1: despite the zero constraint on the pattern coefficients, the corresponding structure coefficients still yield substantial values, as highlighted in bold face. For example, although the pattern coefficient of F2 on variable A is constrained to be zero, its corresponding structure coefficient of 0.58 is far greater than the cut-off of 0.3 or 0.4 suggested in the literature. Using this example, Graham et al.
raised the problem of missing important information if the structure coefficient is not interpreted.

Table 4.1 Factor Solutions for Case One: with No Factorial Complexities

| Variable | P: F1 | P: F2 | S: F1 | S: F2 | L: F1 | L: F2 | PS: F1 | PS: F2 | SQRT(PS): F1 | SQRT(PS): F2 | D: F1 | D: F2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | .849 (g) | .000 (h) | .849 (i) | .580 (j) | .836 | .000 | .721 | .000 | .849 | .000 | 1.000 | .000 |
| B | .726 | .000 | .726 | .495 | .721 | .000 | .527 | .000 | .726 | .000 | 1.000 | .000 |
| C | .817 | .000 | .817 | .557 | .836 | .000 | .667 | .000 | .817 | .000 | 1.000 | .000 |
| D | .000 | .875 | .597 | .875 | .000 | .855 | .000 | .766 | .000 | .875 | .000 | 1.000 |
| E | .000 | .774 | .528 | .774 | .000 | .777 | .000 | .599 | .000 | .774 | .000 | 1.000 |
| F | .000 | .808 | .552 | .808 | .000 | .794 | .000 | .653 | .000 | .808 | .000 | 1.000 |

| R | F1 | F2 |
|---|---|---|
| F1 | 1.000 (k) | .680 (l) |
| F2 | .680 (m) | 1.000 (n) |

Note. P: pattern matrix; S: structure matrix; L: loading matrix; PS: a matrix of which the elements are the products of a given pattern coefficient and its corresponding structure coefficient; SQRT(PS): square root of PS; D: Pratt measure matrix; R: factor correlation matrix.

Graham et al.'s (2003) contention is only partially correct. They are correct in pointing out that ignoring the structure coefficients can miss important information in the bi-directional relationships. However, this statement is correct only if the bi-directional relationship is indeed true. Our forthcoming argument shows that a non-zero structure coefficient is "entirely" spurious when accompanied by a zero pattern coefficient. When the pattern coefficient is constrained to zero, the correlation between variable A and F2 in Table 4.1 is "entirely" due to the correlation between F1 and F2. That is, the correlation between A and F2 is due to A correlating with F1, which in turn correlates with F2. The substantial zero-order bivariate correlation between A and F2 would completely disappear (rather than diminish) once the correlation between F1 and F2 is removed. Namely, the substantial correlation between A and F2, as indicated by the structure coefficient, is "entirely" spurious. Interpreting such a spurious relationship as suggested by Graham et al.
is as problematic as not interpreting it at all, if not more!

The spuriously substantial structure coefficient due to the high factor correlation can be shown through equation (2.1), $S_{q \times p} = P_{q \times p} R_{p \times p}$, in Chapter Two, which says that the structure matrix is equal to the pattern matrix post-multiplied by the factor correlation matrix. Using this equation, the structure coefficient 0.58 between A and F2, denoted as (j) in Table 4.1, is given by the values in cells (g), (l), (h), and (n) such that

$$
\begin{aligned}
(j) = 0.58 &= (g) \times (l) + (h) \times (n) \\
&= (\text{pattern F1 on A}) \times (\text{correlation F1 and F2}) + (\text{pattern F2 on A}) \times (\text{correlation F2 and F2}) \\
&= 0.849 \times 0.68 + 0 \times 1
\end{aligned}
$$

Because the second product term is equal to zero, the structure coefficient (j) between F2 and A is completely attributable to the first product term: the partial effect of F1 on A (0.849) times the correlation between F1 and F2 (0.68), which has nothing to do with any relationship between F2 and A. Tracing the calculation of the structure coefficient (j) clearly affirms that, when the corresponding pattern coefficient is constrained to zero, the moderately high correlation between F2 and A, 0.58, is entirely spurious and is simply a result of the correlation between F1 and F2.

The traditional way of investigating the unique correlation that is not inflated by the factor correlation is to obtain loadings using orthogonal rotation, assuming no correlation between the factors, even if the factors are theoretically or empirically shown to be otherwise. Table 4.1 also shows the orthogonal loadings, L. As explained in Chapter Two, one can see that the loadings identically represent P and S when the factor rotation is constrained to be orthogonal. The elements represent both the unique partial effect and the unique bivariate correlation because the factors contain no overlapping information to be removed.

Comparing the pattern coefficients to the loadings in Table 4.1, one can see that the pattern coefficients remain very similar.
This is because the pattern coefficient is the unique causal effect that has accounted for the overlapping contribution of F1 and F2. However, there is a troubling difference between the structure coefficients and the orthogonal loadings. For the orthogonal solution, variables with zero pattern coefficients also yield zero loadings, showing that there is no unique bi-directional relationship between F2 and variable A when F1 and F2 are uncorrelated. Under factor orthogonality, this zero bi-directional relationship between F2 and A can be traced using the same formula for calculating the structure coefficient,

$$
\begin{aligned}
(\text{structure F2 on A}) &= (\text{pattern F1 on A}) \times (\text{correlation F1 and F2}) + (\text{pattern F2 on A}) \times (\text{correlation F2 and F2}) \\
&= 0.836 \times 0 + 0 \times 1 = 0
\end{aligned}
$$

Compared to the oblique case, not only the second product term but also the first product term drops out of the calculation, yielding a zero structure coefficient (i.e., loading). The first product term that produces the spurious bi-directional relationship in the oblique case is no longer in effect because the pattern coefficient (i.e., loading) of F1 on variable A (0.836) is multiplied by a zero correlation between F1 and F2.

Although resorting to an orthogonal solution to help detect a spurious correlation and reveal a unique correlation between an observed variable and a factor is technically straightforward and convenient, this approach contradicts the CFA rationale of testing a model that is a priori hypothesized to be oblique. Also, using this method may produce biased pattern coefficient estimates due to the orthogonal constraint when, in fact, the factors are correlated. This bias can be observed in the small inconsistency between the pattern coefficients and the loadings in Table 4.1.

As demonstrated in Chapter Three, Pratt's measures can resolve the interpretational complexities resulting from factor obliquity.
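The two calculations above can be reproduced numerically. The sketch below is only an illustration under stated assumptions: it uses NumPy (not the LISREL workflow of this chapter), the variable names are hypothetical, and for the orthogonal case it simply holds the oblique pattern coefficients fixed while setting the factor correlation to zero, whereas the loadings L in Table 4.1 were re-estimated under the orthogonal constraint.

```python
import numpy as np

# Pattern coefficients from Table 4.1 (rows: variables A-F; columns: F1, F2).
P = np.array([
    [.849, .000],
    [.726, .000],
    [.817, .000],
    [.000, .875],
    [.000, .774],
    [.000, .808],
])

R_oblique = np.array([[1.00, 0.68],
                      [0.68, 1.00]])   # factor correlation matrix from Table 4.1
R_orthogonal = np.eye(2)               # assumption: factors forced to be uncorrelated

# Equation (2.1): structure matrix = pattern matrix post-multiplied by R.
S_oblique = P @ R_oblique
S_orthogonal = P @ R_orthogonal        # reduces to P itself when R is the identity

# Pratt's measures: products of pattern and structure coefficients,
# rescaled so that each row (variable) sums to 1.
PS = P * S_oblique
D = PS / PS.sum(axis=1, keepdims=True)

print(np.round(S_oblique, 3))     # non-zero entries appear where P is zero
print(np.round(S_orthogonal, 3))  # those entries vanish without factor correlation
print(np.round(D, 3))             # zero contribution wherever P is constrained to zero
```

The oblique structure coefficients match Table 4.1 up to rounding of the printed inputs, and every Pratt's measure paired with a zero pattern constraint is exactly zero.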
Without having to impose an unjustified orthogonality constraint, Pratt's measures can additively attribute the unique contribution of each factor to the communality while still allowing the factor correlation to be freely estimated and tested. The unique bivariate relationship can be investigated without the orthogonal constraint. Columns under the heading of PS in Table 4.1, which are the products of a pattern coefficient and its corresponding structure coefficient, indicate the amount of variance explained by each factor. The columns denoted as SQRT(PS), the square roots of PS, are analogous to the unique correlation between a given factor and a variable after removing the overlapping relationship due to factor correlation. By examining the values of SQRT(PS), one can see that the substantial correlation coefficients with zero pattern constraints drop to zero once the overlapping relationships among the factors are removed, without having to impose an orthogonal constraint.

Columns under the heading of D in Table 4.1 list the Pratt's measures, which indicate the contribution of each factor to the standardized communality. By examining Table 4.1, it is clear that the large values of the structure coefficients reported by Graham et al. (2003) were totally due to the factor obliquity. This can be seen in F1's zero contribution to the communality of the last three variables, which have a zero pattern coefficient constraint, and F2's zero contribution to the communality of the first three variables, which have a zero pattern coefficient constraint.

4.2 Pratt's Measures in CFA with Factorial Complexity

For the second data set in Graham et al. (2003), two factors that correlate at 0.71 were hypothesized to underlie six observed variables. Table 4.2 shows the pattern and structure matrices reported by Graham et al. (2003). As in the last example, F1 has an effect on only the first three variables. However, F2, in addition to the last three variables, also has an effect on the third variable C.
The other effects were all fixed at 0. This model displays one factorial complexity in variable C.

Table 4.2 Factor Solutions for Case Two: with One Factorial Complexity

| Variable | P: F1 | P: F2 | S: F1 | S: F2 | L: F1 | L: F2 | PS: F1 | PS: F2 | SQRT(PS): F1 | SQRT(PS): F2 | D: F1 | D: F2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | .834 | .000 | .834 | .593 | .836 | .000 | .696 | .000 | .834 | .000 | 1.000 | .000 |
| B | .722 | .000 | .722 | .514 | .720 | .000 | .521 | .000 | .722 | .000 | 1.000 | .000 |
| C | .930 (g) | -.132 (h) | .836 (i) | .529 (j) | .802 | .111 | .777 | -.070 | .882 | --- | 1.099 | -.099 |
| D | .000 | .875 | .622 | .875 | .000 | .855 | .000 | .766 | .000 | .875 | .000 | 1.000 |
| E | .000 | .774 | .550 | .774 | .000 | .777 | .000 | .599 | .000 | .774 | .000 | 1.000 |
| F | .000 | .809 | .575 | .809 | .000 | .795 | .000 | .654 | .000 | .809 | .000 | 1.000 |

| R | F1 | F2 |
|---|---|---|
| F1 | 1.000 (k) | .710 (l) |
| F2 | .710 (m) | 1.000 (n) |

Note. P: pattern matrix; S: structure matrix; L: loading matrix; PS: a matrix of which the elements are the products of a given pattern coefficient and its corresponding structure coefficient; SQRT(PS): square root of PS; D: Pratt measure matrix; R: factor correlation matrix.

As in the first example, the structure coefficients are substantial for factors that are constrained to have zero pattern coefficients. For example, the structure coefficient between F2 and variable A is 0.593 despite the pattern effect being constrained to be zero. As shown in the first example, this type of spurious correlation is entirely due to the high correlation between F1 and F2.

What makes this example different from the first is that, for variable C, neither the pattern coefficient of F1 nor that of F2 is constrained to zero. Namely, both factors have a unique partial effect on variable C. Also, both factors yield a noticeable correlation with variable C, indicated by the structure coefficients of 0.836 and 0.529. The second set of data differs from the first in that variable C displays a suppression relationship.
For the first data set, all the variables follow a typical simple redundant relationship, where the zero-order correlations without removing the overlapping relationship are expected to be equal to (in the orthogonal case) or greater than (in the oblique case) the corresponding partial regression coefficients. However, in the second example, the pattern coefficient of F1 on variable C (0.930) is greater than its corresponding structure coefficient (0.836), a circumstance referred to as a classic suppression effect, leading to the pattern coefficient of F2 becoming negative (-0.132) while its corresponding structure coefficient remains positive (0.529), a circumstance referred to as a negative suppression effect [2] (Cohen, Cohen, West, & Aiken, 2003; Conger, 1974; Lancaster, 1999).

In this scenario, these large structure coefficients may not be simply due to the factor correlation as in the first case, and hence should not be automatically considered as entirely spurious. The true unique correlation may be complex and difficult to uncover.

Under the suppression relationships, the derivation of the structure coefficients between variable C and F1 and between C and F2 can be traced using the same method as for the first data set. The structure coefficient between F1 and C, denoted as (i) in Table 4.2, is given by

$$
\begin{aligned}
(i) = 0.836 &= (g) \times (k) + (h) \times (m) \\
&= (\text{pattern F1 on C}) \times (\text{correlation F1 and F1}) + (\text{pattern F2 on C}) \times (\text{correlation F1 and F2}) \\
&= 0.930 \times 1 + (-0.132) \times 0.71
\end{aligned}
$$

The structure coefficient of F1, (i), is equal to the first product term, the pattern coefficient of F1 (0.930 × 1), adjusted additively by the second product term, the pattern coefficient of F2 (-0.132) multiplied by the correlation between F1 and F2 (0.71). The second product term (-0.132 × 0.71), which adjusts F1's structure coefficient downward, reflects the joint influence of the unique effect of F2 on C and the correlation between F1 and F2, which has nothing to do with F1 directly.
One can also observe that the structure coefficient is adjusted downward to be less than its corresponding pattern coefficient, showing a classic suppression effect.

Similarly, the structure coefficient (j) between F2 and C in Table 4.2 is given by

(j) = 0.529
    = (g) x (l) + (h) x (n)
    = (pattern F1 on C) x (correlation F1 and F2) + (pattern F2 on C) x (correlation F2 and F2)
    = 0.930 x 0.71 + (-0.132) x 1

The structure coefficient between F2 and C, (j), is equal to the second product term, the pattern coefficient of F2 (-0.132 x 1), adjusted additively by the first product term, the pattern coefficient of F1 on C (0.930) multiplied by the correlation between F1 and F2 (0.71). The first product term (0.930 x 0.71), which adjusts F2's structure coefficient upward, reflects the joint influence of the unique effect of F1 on C and the correlation between F1 and F2, which has nothing to do with F2 directly. The structure coefficient is adjusted upward to be of opposite sign to the pattern coefficient, showing a negative suppression effect.

Footnote 2. In the language of multiple regression, a suppression effect is said to exist if the partial regression coefficient is greater than its corresponding zero-order bivariate correlation. Such a scenario is referred to as a "classic suppression effect" by Conger (1974), as shown between F1 and variable C in Table 4.2. Lancaster (1999) then distinguished the "classic suppression effect" from other types of suppression effects, such as the "negative suppression effect", as we pointed out in Chapter Two and as shown between F2 and variable C in Table 4.2. For a review of the definition and types of suppression effect in regression, please see Conger (1974) and Lancaster (1999).

Again, Table 4.2 also shows the traditional method of using orthogonal rotation to investigate the non-overlapping bivariate relationship.
Note that after the orthogonal constraint, the loadings identically represent P and S, and the suppression effect disappears because there is no factor correlation to complicate the pattern or structure relationship. For variables displaying no factorial complexity (i.e., all variables except variable C), the structure coefficients all drop to zero for variables with a zero pattern constraint. This shows that the non-zero structure coefficients yielded by the oblique solution are entirely due to the correlation between F1 and F2, and are entirely spurious.

For variable C, which displays factorial complexity, the structure coefficients (i.e., loadings) for F1 and F2 produced by the orthogonal solution were 0.802 and 0.111 respectively, compared with 0.836 and 0.529 from the oblique solution. For F2, the structure coefficient is inflated by 0.418 (i.e., 0.529 - 0.111) as a result of the correlation between F1 and F2.

Table 4.2 also shows Pratt's measures and their associated indices. The PS matrix shows the proportion of communality explained by each of the factors. The square root of the PS matrix, under the heading Sqrt(PS), is in essence analogous to the unique correlation where the overlapping relationship is removed. Note that the square root of PS for variable C and F2 cannot be calculated because the pattern coefficient and structure coefficient are of different signs and yield a negative product term due to the negative suppression effect. The last two columns of Table 4.2 display the Pratt's measures, indicating how much of the standardized communality is attributable to each factor.
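The PS, Sqrt(PS), and D entries for a given variable can be reproduced from its pattern and structure coefficients. The sketch below assumes, as the table note and the text indicate, that each Pratt's measure is the product of a pattern coefficient and its structure coefficient divided by the sum of those products for the variable (its communality). The function names `pratt_measures` and `sqrt_ps` are hypothetical, and the inputs are the rounded values for variable C in Table 4.2.

```python
import math


def pratt_measures(pattern_row, structure_row):
    """Pratt's measures for one variable: each P*S product over their sum."""
    products = [p * s for p, s in zip(pattern_row, structure_row)]
    communality = sum(products)  # assumed to be non-zero
    return [x / communality for x in products]


def sqrt_ps(pattern_row, structure_row):
    """Square root of each P*S product; None where the product is negative."""
    roots = []
    for p, s in zip(pattern_row, structure_row):
        product = p * s
        roots.append(math.sqrt(product) if product >= 0 else None)
    return roots


# Variable C from Table 4.2: (F1, F2) pattern and structure coefficients.
pattern_c = (0.930, -0.132)
structure_c = (0.836, 0.529)

d_c = pratt_measures(pattern_c, structure_c)
roots_c = sqrt_ps(pattern_c, structure_c)

print([round(d, 3) for d in d_c])  # compare with the D columns for variable C
print(roots_c)                     # the F2 entry is None: the product is negative
```

Because the measures are ratios to their own sum, they add to 1 for each variable, so a negative measure for one factor necessarily pushes another above 1.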
Note that F2 yields a noticeable negative importance measure for variable C (-0.099), showing that there is a potential suppression effect, as we discussed in Chapter Three.

To conclude our argument and findings in Sections 4.1 and 4.2, for variables that display no factorial complexity, Pratt's measures show that the substantial structure coefficients reported by Graham et al. (2003), whose corresponding pattern effects are constrained to zero, were entirely due to the high factor correlation. Unlike with the traditional orthogonal solution, the unique bi-directional relationships can be obtained from the square root of the PS product term without having to sacrifice the factor correlation information, provided the factors follow a simple redundant relationship.

For variables that display factorial complexity but no suppression relationship, which are not discussed by the two examples in Graham et al. (2003), the structure coefficient will decrease once the factor correlation is removed. The structure coefficient should be interpreted only if the inflated correlation is removed by orthogonal constraint or by taking the square root of PS if obliquity is preferred. For variables that display a suppression effect, as variable C does in the second data set, interpreting the structure coefficient can be confusing because it may turn out to be less than, or of different sign to, its corresponding pattern coefficient. Although reporting the structure coefficient may reveal the suppression relationship, which cannot be detected by interpreting the pattern coefficient alone, it by no means reflects the true unique bi-directional relationship. Such complex suppression relationships require a different interpretation paradigm and should not be interpreted blindly using the traditional paradigm, which assumes a simple redundancy relationship. The legitimate interpretation should look into the suppression effect in order to properly understand the complex relationship.
The suppression effect issue will be revisited in the closing chapter.

It is also worth noting that Graham et al. (2003) deliberately created two heuristic datasets with two dimensions that are highly correlated (R = 0.69 and 0.71). Their original intention in creating such data was to highlight that variables with zero pattern constraints may yield notable structure coefficients due to factor correlations that are often ignored by practitioners. However, in a real data analysis context, a unidimensional solution may suffice with two such highly correlated factors. A unidimensional solution may fit the data satisfactorily and be preferred for its parsimony.

More importantly, for the CFA mechanism to work with legitimacy and warrant, the highly overlapping and complex relationships that show suppression effects, as created by Graham et al. (2003), should have been explored prior to CFA. Namely, for CFA to work optimally, the researcher should have a clear understanding of the data structure based either on the prior empirical examination of EFA results or on the researcher's substantive theory, rather than using CFA as an exploratory tool to uncover the highly overlapping and complex factor structure.

4.3 Comparing the Fit of the Pratt's Measures Model: Additional CFA Case Studies

In this section, two new data sets were analyzed using CFA. The purpose is to compare the fit of the EFA model identified by the Pratt's importance measures > 1/(2p) criterion to those of models identified by the cut-offs commonly applied to the pattern and structure coefficients as well as the orthogonal loadings.
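The two kinds of selection rule being compared can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the analyses: p is taken to be the number of factors, the traditional cut-offs are applied to absolute values, and a path is kept whenever its Pratt's measure is not below 1/(2p). The function names are hypothetical, and the numbers are the rounded coefficients for variables A and C in Table 4.2, used only because they are at hand.

```python
def pratt_row(pattern_row, structure_row):
    """Pratt's measures for one variable: each P*S product over their sum."""
    products = [p * s for p, s in zip(pattern_row, structure_row)]
    communality = sum(products)  # assumed to be non-zero
    return [x / communality for x in products]


def free_by_cutoff(coefficient_rows, cutoff):
    """Traditional rule: keep a path when |coefficient| >= cutoff (0.3 or 0.4)."""
    return [[abs(c) >= cutoff for c in row] for row in coefficient_rows]


def free_by_pratt(pattern_rows, structure_rows):
    """Pratt rule: keep a path unless its measure is unimportant, d < 1/(2p)."""
    n_factors = len(pattern_rows[0])  # assumption: p is the number of factors
    threshold = 1.0 / (2 * n_factors)
    return [
        [d >= threshold for d in pratt_row(p_row, s_row)]
        for p_row, s_row in zip(pattern_rows, structure_rows)
    ]


# Variables A and C from Table 4.2 (two factors, so 1/(2p) = 0.25).
pattern = [(0.834, 0.000), (0.930, -0.132)]
structure = [(0.834, 0.593), (0.836, 0.529)]

by_pattern = free_by_cutoff(pattern, 0.3)
by_structure = free_by_cutoff(structure, 0.3)
by_pratt = free_by_pratt(pattern, structure)

print("pattern   >= 0.3 :", by_pattern)
print("structure >= 0.3 :", by_structure)
print("Pratt >= 1/(2p)  :", by_pratt)
```

With these inputs, the structure-coefficient cut-off keeps the F2 paths for both variables, whereas the pattern cut-off and the Pratt rule keep only the F1 paths.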
To be specific, the purpose of this section is to investigate how well the model suggested by the Pratt's measures > 1/(2p) rule fits the given data, as indicated by the CFA fit indices, relative to the other six models suggested by the orthogonal loadings as well as the oblique pattern and structure coefficients using the traditional cut-offs of 0.3 and 0.4.

It is important to clarify that our intention is not to undermine the legitimacy of substantive theories in guiding the CFA specification. Rather, it is to investigate, when there are no known theories, which EFA model best summarizes the given data as tested by CFA.

The first data set consists of 6,297 participants' responses to 26 items measuring the six theoretical dimensions of psychological well-being. These data were used in Chapter Two to demonstrate the problems of traditional cut-off rules for the oblique coefficients. However, the Pratt's measures method has never been applied to these data and reported. The second data set consists of 7,167 college students' responses to 10 items measuring the two dimensions of positive affect reported by the original authors (Watson, Clark, & Tellegen, 1988; see Appendix D for a description of the scale items). These two examples were chosen because of their large sample sizes, so that we can randomly split the data into two equal halves with a sufficient number in each half. The first half of the data was analyzed by EFA, and solutions were chosen according to the seven criteria described below. Based on the results of EFA, the seven models were then specified and tested by CFA using the second half of the data.

CFA Model Specification

The seven model specifications are as follows: the first two models were chosen using cut-offs of 0.3 and 0.4 for the loadings of the orthogonal EFA model. Accordingly in CFA, the loadings that were equal to or greater than 0.3 (Model 1) and 0.4 (Model 2) were free to be estimated.
All the other loadings and the covariances among the factors were fixed to be zero. The next two models were chosen using the same cut-offs for the pattern coefficients of the oblique EFA model. Accordingly in CFA, the pattern coefficients that were equal to or greater than 0.3 (Model 3) and 0.4 (Model 4) were free to be estimated, as were the covariances among the factors. All the other parameters were fixed to be zero. The fifth and sixth models were chosen using the same cut-offs but for the structure coefficients of the oblique EFA model. Accordingly in CFA, the pattern coefficients (see footnote 3) with corresponding structure coefficients that were equal to or greater than 0.3 (Model 5) and 0.4 (Model 6) were freed to be estimated, as were the covariances among the factors. All the other parameters were fixed to be zero. The last model was chosen based on the unimportance criterion (i.e., d < 1/(2p)) suggested by Thomas (1992) for Pratt's measures. Accordingly in CFA, pattern coefficients with corresponding Pratt's measures that are not unimportant (i.e., d > 1/(2p)) were free to be estimated, as were the covariances among the factors. All the other parameters were fixed to be zero.

CFA Results Comparison

The fit of the seven models is reported and compared in Table 4.3 using the following indices: the p-value from the Chi-square test, RMSEA and its confidence interval (CI for RMSEA), SRMR, and CFI, as well as two information fit indices, AIC and CAIC (see footnote 4). Although the optimal cut-offs for good fit depend on a variety of factors such as model complexity (Browne & Cudeck, 1992; Hu & Bentler, 1999; Marsh, Hau, & Wen, 2004), in broad strokes, RMSEA < 0.08, SRMR < 0.05, and CFI > 0.90 are considered as indicating good fit. AIC and CAIC, on the other hand, are data dependent, so no single satisfactory cut-off can be suggested. The judging criterion is that a model yielding a smaller AIC and CAIC is indicative of relatively better fit.
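The broad-strokes screen and the smaller-is-better reading of the information indices can be written down compactly. The sketch below is only an illustration of those two rules; the function names are hypothetical, and the figures are the RMSEA, SRMR, CFI, AIC, and CAIC values reported for the psychological well-being data in Table 4.3, with the models keyed by number.

```python
def meets_rough_cutoffs(rmsea, srmr, cfi):
    """Broad-strokes screen: RMSEA < 0.08, SRMR < 0.05, and CFI > 0.90."""
    return rmsea < 0.08 and srmr < 0.05 and cfi > 0.90


def best_by_information_index(models, index):
    """Name of the model with the smallest value of an information index."""
    return min(models, key=lambda name: models[name][index])


# Psychological well-being rows of Table 4.3.
well_being = {
    "Model 1": {"rmsea": 0.082, "srmr": 0.210, "cfi": 0.790, "aic": 9459.110, "caic": 9988.680},
    "Model 2": {"rmsea": 0.110, "srmr": 0.260, "cfi": 0.720, "aic": 11935.440, "caic": 12323.790},
    "Model 3": {"rmsea": 0.069, "srmr": 0.054, "cfi": 0.860, "aic": 6503.030, "caic": 7027.540},
    "Model 4": {"rmsea": 0.071, "srmr": 0.052, "cfi": 0.870, "aic": 5052.840, "caic": 5504.740},
    "Model 5": {"rmsea": 0.068, "srmr": 0.039, "cfi": 0.890, "aic": 5603.780, "caic": 6698.210},
    "Model 6": {"rmsea": 0.058, "srmr": 0.040, "cfi": 0.900, "aic": 4422.130, "caic": 5198.820},
    "Model 7": {"rmsea": 0.054, "srmr": 0.037, "cfi": 0.910, "aic": 4216.150, "caic": 4950.470},
}

passing = [
    name for name, fit in well_being.items()
    if meets_rough_cutoffs(fit["rmsea"], fit["srmr"], fit["cfi"])
]
best_aic = best_by_information_index(well_being, "aic")
best_caic = best_by_information_index(well_being, "caic")

print("meets all three rough cut-offs:", passing)
print("smallest AIC:", best_aic, "| smallest CAIC:", best_caic)
```

The strict inequalities follow the wording of the cut-offs above, so a CFI of exactly .900 does not pass the screen.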
In Table 4.3, the best fit indices among the seven models are highlighted in bold face.

Footnote 3. For most SEM software packages, only the pattern coefficients can be specified in CFA, not the structure coefficients.

Footnote 4. One disadvantage of using chi-square to compare model fit is that it always decreases when more parameters are added. There is therefore a possibility of choosing a model with more parameters than are really necessary. A number of measures of fit have been proposed that take model parsimony into account. These measures resolve this problem by constructing a measure which ideally first decreases as parameters are added and then has a turning point, such that it takes its smallest value for the "best" model and then increases when further parameters are added. The AIC (Akaike's Information Criterion) and CAIC are measures of this type. They can also be used to compare non-nested models provided the sample is sufficiently large.

Due to the large sample size, the Chi-square tests for all seven models, shown in column four of Table 4.3, were, as anticipated, all significant, indicating a poor data-model fit and hence providing no useful information for cross-model comparison. This finding is not uncommon in the CFA literature. When the sample size is large, a Chi-square test may easily reject a null hypothesis because of the high statistical power. Nonetheless, for the psychological well-being data, not only did the Pratt's measures > 1/(2p) model yield RMSEA, SRMR, and CFI indices that satisfied the good fit criteria suggested in the literature, but, most importantly, this model also yielded the best fit indices compared to the other six models, a finding that is further confirmed by the two information indices, AIC and CAIC. The same results were found for the positive affect data.
Six out of seven indices (all except the CAIC) suggested that the Pratt's measures > 1/(2p) rule identified the model that best fit the data.

Table 4.3 Comparisons of CFA Fit Indices for Models Identified by Different EFA Coefficients and Cut-offs

Model             df          χ²   p-value   RMSEA   CI for RMSEA    CFI   SRMR         AIC        CAIC

Psychological Well-being Data
1. L > 0.3       421    8403.910      .000    .082    .080   .083   .790   .210    9459.110    9988.680
2. L > 0.4       296   10054.780      .000    .110    .110   .110   .720   .260   11935.440   12323.790
3. P > 0.3       391    5608.540      .000    .069    .068   .071   .860   .054    6503.030    7027.540
4. P > 0.4       287    4400.020      .000    .071    .070   .073   .870   .052    5052.840    5504.740
5. S > 0.3       341    4648.910      .000    .068    .066   .069   .890   .039    5603.780    6698.210
6. S > 0.4       355    3965.680      .000    .058    .057   .060   .900   .040    4422.130    5198.820
7. D ≥ 1/(2p)    392    3746.220      .000    .054    .052   .055   .910   .037    4216.150    4950.470

Positive Affect Data
1. L > 0.3        31     760.950      .000    .080    .075   .085   .910   .110     775.920     947.780
2. L > 0.4        33    1273.410      .000    .097    .092   .100   .850   .150    1159.520    1317.050
3. P > 0.3        32     445.400      .000    .062    .057   .067   .950   .036     510.120     674.810
4. P > 0.4        19     277.500      .000    .064    .057   .070   .960   .035     323.780     445.510
5. S > 0.3         a           a         a       a         a           a      a           a           a
6. S > 0.4        27     236.660      .000    .047    .042   .053   .970   .023     294.580     495.070
7. D ≥ 1/(2p)     30     237.780      .000    .045    .039   .050   .970   .023     289.310     468.320

Note. a: The EFA results showed that the structure coefficients were all greater than 0.3 for both factors. Accordingly, to specify the S > 0.3 model in CFA, all the model parameters were freely estimated (including the covariances among the factors); hence, the model fit indices could not be computed.

Integrating the findings from multiple fit indices from the two data sets suggests that an EFA model identified by the Pratt's measures > 1/(2p) criterion best fits the given data. It is tentatively concluded that a CFA is more likely to indicate a good model fit if the model is specified using the Pratt's measures > 1/(2p) criterion in EFA.
Nonetheless, as clarified at the outset, this finding does not suggest researchers should ignore the role of their substantive theories in specifying a CFA factor model. Instead, it suggests that when there are no well-known theories for guidance, and researchers cannot help but rely on empirical EFA to assist in the model specification, the Pratt's measures > 1/(2p) rule may provide the best fitting model as tested by CFA.

Closing Remarks for Chapter Four

In this chapter, in accordance with the current literature, we agree that the information in the structure coefficient should not be ignored when interpreting a CFA model. Nonetheless, careful attention should be paid to the fact that the zero-order correlations between the observed variables and the factors are inflated by the factor correlations. By construing the calculation of the structure coefficient, we affirm that the bi-directional relationship indicated by the structure coefficient can be deceptive due to the factor obliquity, to the extent that the entire relationship is spurious when the corresponding pattern coefficient is constrained to be zero, a common practice in CFA.

Furthermore, when a CFA model is hypothesized to be oblique, juxtaposing both the pattern and structure coefficients does not resolve the inherent problems in both coefficients, nor does the method of resorting to orthogonal loadings. A more mathematically justified and sufficient algorithm is to incorporate the information in both the pattern and structure coefficients by applying the Pratt's measures method.
This contention is supported by the results of CFA investigations on two empirical data sets, indicating that the model identified by the Pratt's measures > 1/(2p) rule yields the best fit to the data when compared to the models that use only the information in the pattern, structure, or orthogonal coefficients.

Our thesis, supported by empirical evidence, refutes the currently recommended and practiced methods for understanding an oblique factor model: interpreting P or S alone, or juxtaposing both without integrating the information.

References

Browne, M. W., & Cudeck, R. (1992). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136-162). Newbury Park, CA: Sage.
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
Conger, A. J. (1974). A revised definition for suppressor variables: A guide to their identification and interpretation. Educational and Psychological Measurement, 34, 35-46.
Graham, J. M., Guthrie, A. C., & Thompson, B. (2003). Consequences of not interpreting structure coefficients in published CFA research: A reminder. Structural Equation Modeling, 10, 142-152.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.
Lancaster, B. P. (1999, January). Defining and interpreting suppressor effects: Advantages and limitations. Paper presented at the annual meeting of the Southwest Educational Research Association, San Antonio, Texas.
Marsh, H. W., Hau, K. T., & Wen, Z. (2004). In search of golden rules: Comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler's (1999) findings. Structural Equation Modeling, 11, 320-341.
Thomas, D. R. (1992). Interpreting discriminant functions: A data analytic approach. Multivariate Behavioral Research, 27, 335-362.
Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54, 1063-1070.

Chapter Five: Contribution, Limitation, and Future Research

5.1 Recapitulation

This dissertation began with a review of the recommendations and practices of interpreting a multidimensional factor model. In particular, this review highlighted three major interpretational complexities with an oblique factor model.

The first complexity arises from the inconsistency in the calculation and meaning of the two oblique coefficients: the pattern and structure coefficients. Often, researchers either proceed with the interpretation without noticing such inconsistency or choose to interpret only one coefficient with no rationale provided. The current literature has recommended addressing this problem by juxtaposing and interpreting both coefficients. This recommendation may have advanced the interpretational practice by attending to the distinctive information in each coefficient. However, mere juxtaposition does not really resolve the interpretational problems, and may actually further complicate the interpretation if the two types of coefficients lead to inconsistent conclusions.

The second complexity considers the distortion of additive properties due to factor obliquity. Additive properties simplify the interpretation of a factor solution and make the interpretation mathematically straightforward. Due to factor obliquity, neither the pattern nor the structure coefficients hold the horizontal additive property that warrants horizontal interpretation. Over the last four decades, the distortion of horizontal additive properties may have regrettably resulted in the underutilization of horizontal interpretation, which is rooted in the fundamental theory of factor analysis.
To overcome this interpretational difficulty, orthogonal constraints are often unduly forced even when the existing theory or the empirical data suggest that the factors are oblique.

The third complexity considers the inappropriateness of the traditional rules for a "meaningful" relationship between the observed variables and the factors. The traditional rules suggested for the orthogonal loadings rest on three conditions. That is, the loadings should (1) identically represent the pattern and structure relationships, (2) be horizontally additive, and (3) be bounded within -1 and 1. These commonly practiced rules are not equally applicable to the pattern and structure coefficients because these coefficients (1) represent distinctive information, (2) are not horizontally additive, and (3) may exceed the bounds of -1 and 1. As far as we are aware, the current literature has neither addressed this problem nor provided any solutions.

The dissertation adapts the method of Pratt's importance measures used in multiple regression to factor analysis and explicates how this new method can simultaneously resolve the three interpretational complexities arising from factor obliquity.

5.2 Novel Contributions

This dissertation has made eight novel contributions to the understanding and use of factor analysis. First, this dissertation systematically articulates three interpretational problems inherent in oblique factor models that have often gone unnoticed or unattended in the applied research literature. The first two have been historically discussed yet remain unresolved by the current methodology literature, and the third is articulated for the first time in this dissertation. This dissertation further highlights and critiques the inappropriateness of the conventional solutions and common practices for dealing with these problems.

Second, this dissertation is the first ever attempt at ordering the importance of latent variables for multivariate data.
Although the axiom and geometry of Pratt's importance method and its use in regression have been well documented and established since the 1980s by Pratt and subsequent scholars, the application of Pratt's importance measures has been limited to ordering the importance of observed independent variables for univariate data. Recently, Pratt's importance measures have been used as a validation tool to order the contribution of observed and/or latent independent variables to a single latent variable (Zumbo, 2007; Zumbo, Wu, & Liu, 2008). To date, this dissertation is the first development of Pratt's measures to order the importance of latent variables underlying a set of observed variables simultaneously. The real strength of the new method is its ability to order the importance of latent variables that are mutually correlated, a task that has never been accomplished with theoretical and mathematical justification as provided by the Pratt's measures method.

Third, Pratt's measures method resolves the three interpretational problems due to factor obliquity articulated by this dissertation. Chapter Three justifies and demonstrates the use of Pratt's measures in EFA. It provides theoretical rationales and two real data demonstrations to substantiate the use of Pratt's measures matrix in EFA. It demonstrates how the three interpretational problems can be easily resolved through a simple transformation of the pattern and structure coefficients into unified Pratt's measures.

The interpretational problem regarding the inconsistency between P and S is resolved by the capacity of Pratt's measures to synergize the distinctive information in each coefficient.
This avoids the traditional dilemma of choosing between the pattern and structure coefficient for interpretation when the two produce incompatible conclusions.

The interpretational problem regarding the distortion of additive properties is resolved by the capacity of Pratt's measures to restore the additive properties both horizontally and vertically. In particular, the restoration of the horizontal additive property may fuel a revival of horizontal interpretation that has almost been forgotten since oblique rotation methods became popular. Furthermore, the restoration of horizontal additive properties will provide a powerful tool for measurement validation in terms of examining the contribution of correlated constructs to the variation in the item responses.

The Pratt's measures method also partly resolves the third interpretational problem regarding the inappropriateness of the traditional rules for meaningful cut-offs. Under the circumstance of a simple redundancy relationship, Pratt's measures are bounded within 0 and 1 and uniquely represent the proportion of variation explained by the individual factors. This is consistent with the interpretation schema held by the traditional cut-off rules suggested under the premise of orthogonality.

The fourth contribution is made through the application of Pratt's measures in CFA. Chapter Four demonstrates that interpreting the structure coefficients of a CFA as advocated by Graham, Guthrie, and Thompson (2003) can be problematic. By construing the calculation of the structure coefficient, we prove that the structure coefficient can be deceptive due to the factor obliquity, to the extent that the entire relationship is spurious when the corresponding pattern coefficient is constrained to be zero. We further show that the structure coefficient could lead to varying degrees of mistaken identification of a factor's importance as the combined results of factorial complexity and obliquity.
Our thesis and proof refute the current recommendation of juxtaposing the pattern and structure coefficients for interpreting an oblique CFA. Although the thesis and proof were set in the context of CFA, the same conclusions can be drawn for EFA models because an EFA can be seen as a special case of CFA with no constraints on the pattern coefficients.

Fifth, with two empirical data examples tested in CFA, we also show that an EFA model chosen using the Pratt's measures > 1/(2p) rule fits the data better than models chosen using other commonly practiced criteria. From an empirical perspective, this finding suggests that the Pratt's measures > 1/(2p) rule depicts the oblique factor structure underlying the data better than the traditional rules. Results from these two investigations provide tentative support for the < 1/(2p) rule for unimportance originally suggested by Thomas (1992).

Sixth, at a broad level, the new method avoids the debate over the choice between oblique and orthogonal factor rotation. This claim is made because, through the method of Pratt's measures, all the mathematical advantages of an orthogonal model can now be easily achieved by an oblique model. In the literature review, we revealed that the oblique model is theoretically and empirically preferred by the current methodology literature. However, researchers are often compelled to adopt the orthogonal solution for ease of interpretation. When Pratt's method is used, researchers no longer have to sacrifice their preference for factor obliquity for mathematical simplicity. In essence, the method of Pratt's measures ends the dilemma of choosing between the theoretical flexibility of an oblique model and the mathematical simplicity of an orthogonal model.
Such historical debate is now dispensable because both advantages can be achieved by Pratt's measures.

Seventh, to our knowledge, this dissertation may be the first to demonstrate and explicate the existence, mechanism, and implications of the suppression effect in factor analyses. For multiple regression, the methodology literature has accumulated sufficient discussions about the statistical mechanism of the suppression effect. Also, the substantive literature is replete with theoretical and empirical examples of suppression effects from various fields. However, there are few, if any, for factor analysis. This dissertation articulates how suppressors complicate and invalidate the traditional interpretation rules suggested under the schema of a simple redundant relationship. Data examples demonstrate the existence and mechanism of such factor suppression effects in both exploratory and confirmatory contexts.

Eighth, by definition, the mathematics, application, and interpretation of Pratt's measures are directional both for regression and factor analysis. Taking a set of variables X1, X2, X3, and X4, for example, partitioning the explained variation of X1 by the remaining variables is mathematically and theoretically different from that of X2, X3, or X4 by the remaining set of the variables. For factor analysis, this axiom sets the framework for a directional interpretation of the factors' effect on the observed variables. When this framework is cemented with the additive partition capacity of Pratt's measures, it becomes a useful tool, even under factor obliquity, for consolidating the classic interpretation of factors, i.e., ascertaining the role of factors as the underlying causes. This is an essential and desirable interpretation that has been losing its status since the introduction of oblique factor analysis.

5.3 Caveats and Limitations

This thesis and demonstration show that the Pratt's measure matrix resolves three interpretational problems of an oblique model, which cannot be achieved by simply juxtaposing the pattern and structure coefficients as the current literature suggests. Nevertheless, it is crucial to realize that our suggestion is not to treat Pratt's measures as "oblique loadings" or to replace the use of the oblique coefficients. It should be fully realized that examining and comparing the pattern and structure coefficients can reveal a deep and rich story about a factor model, including such issues as the factor suppression effect. The real purpose of Pratt's measures, in fact, is to disentangle the interpretational problems by integrating the information in the two coefficients. To have a thorough understanding of a factor model, we suggest that researchers aid their interpretation by incorporating the Pratt's measure matrix in addition to juxtaposing the oblique coefficients.

As pointed out earlier, the method of Pratt's importance measures in factor analysis is model dependent. Namely, the importance of a factor is determined relative to the other factors extracted for the particular data. Because of this, it is erroneous and meaningless to compare factor importance across models with a different number of factors for the same data. For example, the relative importance of a factor in a three-factor model should not be compared to that of a four-factor model even if the same meaning and label are assigned to that factor. Hence, correct dimensional specification is a prerequisite for Pratt's measures to work effectively. If the Pratt's measure matrix is not interpretable, this may suggest that alternative models with a different number of factors, as well as different rotation specifications, should be explored. Also, the relative nature of Pratt's measures has a special implication when the Pratt's measures method is used for multivariate data.
That is, it is meaningless to compare the contribution of a factor based on q observed variables to that of q ± z variables (z being a positive integer) even if the q variables remain identical and the same meaning is assigned to that factor. That is, factor analyzing a set of multivariate data with q ± z variables, in essence, answers a different question from that with q variables.

Also, because Pratt's measures are defined by the explained variance accounted for by the factors, researchers should be aware of how much of the explained variance they begin with prior to the application of the method. That is, if the communalities of the observed variables are unsatisfactorily low, application of Pratt's measures is of little meaning because one cannot make sense of something that has little to explain, a problem that neither other coefficients nor rotation methods can address. In this case, attention should be paid to the screening and selection of the observed variables before any interpretation of the factor solution.

Another circumstance that may abate the capability of Pratt's measures in factor analysis is the occurrence of negative estimates, which is counterintuitive to the definition of importance for Pratt's measures. Occasionally, a negative Pratt's measure may cause the other Pratt's measure to be greater than 1 due to the fact that the sum of all Pratt's measures for a given observed variable is equal to 1. As explained earlier, small out-of-bounds Pratt's measures could be a result of chance capitalization because both the pattern coefficient and the structure coefficient are sample estimates, but Pratt's measures are population-defined. As for large negative Pratt's measures, they may result from the fact that a given factor model disobeys the simple redundant relationship and is too complex to be additively partitioned.
These complex scenarios may include, but are not restricted to, the suppression effect and multicollinearity.

In the multiple regression literature, a negative suppression effect is defined as occurring when the partial regression coefficient of an independent variable is of different sign to its Pearson correlation (Conger, 1974; Lancaster, 1999). Given the connection between multiple regression and factor analysis, this definition is naturally applicable to factor analysis. That is, a negative suppression effect is present if the pattern coefficient is of different sign to its corresponding structure coefficient. Relatively large non-chance negative Pratt's measures can occur if a negative suppression is in effect. Since the denominator of Pratt's measures, the communality, is always positive, Pratt's measures would yield negative values only if the product term in the numerator, PS, is negative, which only happens if the pattern and structure coefficients are of different signs: a definitive scenario of the negative suppression effect!

A suppression effect can occur if the pattern and structure coefficients are of different signs, but suppression is not restricted to such cases. Two other types of suppression effects have been identified for multiple regression: the classic and the reciprocal suppression effect. Conger (1974) and Lancaster (1999) gave very systematic accounts and examples for various types of suppression effects in multiple regression. Understanding and interpreting the suppression effect can be even more complex in factor analysis because multivariate data are involved. A suppression relationship may be interpretable and theoretically meaningful for one observed variable but not so for another. The suppression effect in factor analysis needs to be better understood and is a good topic for future research.

Another complex scenario that abates the use of Pratt's measures is the problem of multicollinearity.
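
As a minimal sketch of the sign argument made earlier in this section (not code from the thesis), the following Python fragment computes a Pratt's measure matrix as the product PS divided by the communality and flags cells where the pattern and structure coefficients differ in sign. The function name `pratt_matrix`, the two-variable pattern matrix, and the factor correlation of 0.5 are hypothetical values chosen only for illustration. The sketch also assumes standardized variables, so that the structure matrix equals the pattern matrix post-multiplied by the factor correlation matrix.

```python
import numpy as np


def pratt_matrix(pattern, phi):
    """Return structure matrix, communalities, Pratt's measures, and a
    boolean mask of cells showing negative suppression.

    pattern : (q, p) pattern coefficients (hypothetical input)
    phi     : (p, p) factor correlation matrix (hypothetical input)
    Assumes standardized variables, so structure = pattern @ phi.
    """
    pattern = np.asarray(pattern, dtype=float)
    phi = np.asarray(phi, dtype=float)
    structure = pattern @ phi                # structure coefficients S
    products = pattern * structure           # element-wise numerator PS
    communality = products.sum(axis=1)       # h^2 for each observed variable
    pratt = products / communality[:, None]  # each row sums to 1
    neg_supp = products < 0                  # P and S of different sign
    return structure, communality, pratt, neg_supp


# Made-up two-variable, two-factor example with factor correlation 0.5.
pattern = np.array([[0.80, -0.10],
                    [0.10, 0.70]])
phi = np.array([[1.0, 0.5],
                [0.5, 1.0]])

structure, communality, pratt, neg_supp = pratt_matrix(pattern, phi)
print(np.round(pratt, 3))
print(neg_supp)
```

In this made-up example, the second factor has a negative pattern coefficient but a positive structure coefficient for the first variable, so its Pratt's measure is negative and the first factor's measure exceeds 1, while each row still sums to 1.
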
Although there is not yet a direct proof, Thomas, Hughes, and Zumbo (1998) demonstrated that negative Pratt's measures are often associated with multicollinearity in multiple regression, where the independent variables are highly correlated, in the extreme to the extent that an independent variable is completely linearly dependent on the others (i.e., perfectly correlated). In an oblique factor model, high correlations among the factors can also create multicollinearity problems as in multiple regression. Unlike the suppression effect, which may be of important theoretical and practical interest, multicollinearity is a statistical problem that can probably be avoided technically by simply removing or combining the highly redundant factors.

Thomas et al. (1998; 2006) reminded researchers that some models are so complex that no single measure of variable importance satisfies Pratt's axioms. The suppression effect and multicollinearity are two of these complex situations, wherein all other coefficients and importance measures also encounter interpretational complexity (Thomas et al., 1998).

5.4 Suggestions for Future Research

The literature offers a rich body of work on interpreting multiple regression, where a single dependent variable is regressed on a set of observed independent variables. To some extent, the MLM connection between multiple regression and factor analysis makes part of this rich literature on multiple regression transferable to and compatible with factor analysis. Nonetheless, multiple regression is, after all, not the same as factor analysis, where multivariate dependent variables are regressed on a set of latent independent variables.

As pointed out earlier, this dissertation is the first attempt to order the importance of a set of oblique latent factors for multivariate observed data. Certainly, more work is needed to understand the theoretical and mathematical mechanism of ordering factor importance.
The following suggests four major areas for future research.

Hypothesis Testing and Confidence Intervals of Pratt's Measures

Pratt's measures, as we describe them, are descriptive statistics. A possible area for further research is to make inferences about the population parameters of Pratt's measures based on the sample estimates. This would enable researchers to test whether Pratt's measures are greater than a particular value of theoretical or practical interest, such as zero or 1/(2p). Unfortunately, the finite-sample distributional properties of Pratt's measures are technically complicated, even under the assumption of normal errors (Thomas, Zhu, & Decady, 2007). To date, the sampling distributions and the analytical calculation of the standard errors of Pratt's measures in multiple regression have yet to be identified. This implies that the sampling distribution and standard errors for calculating the p-values are hard to obtain even for univariate data. Most likely, simultaneously testing q × p Pratt's measures is an even more arduous puzzle.

Fortunately, with the increasing capacity of modern computers, bootstrapping has become a popular method for making inferences about population parameters. Bootstrapping is a computer-intensive, non-parametric technique for statistical inference where parametric inference is infeasible or involves very complicated formulas for calculating the standard errors. It re-samples from the empirical distribution of the observed data to estimate the standard errors and confidence intervals. Bootstrapping is a promising line of research for making inferences about Pratt's measures in factor analysis given that the traditional parametric inference has met with considerable difficulty in multiple regression.

In addition, Thomas et al. (2006) developed a new method for calculation of the point estimate and confidence interval for Pratt's measures in multiple regression.
This new method is based on an asymptotic analysis of the properties of Pratt's measures in the limit when the sample size approaches infinity. In this method, the asymptotic variances are estimated simultaneously for constructing the confidence intervals for the Pratt's measures of the full set of independent variables. The advantage of this method is that the calculation of these estimates requires only information routinely printed in the output of standard statistical programs. No new software is required.

Thomas et al.'s (2006) study also included a simulation to examine how this asymptotic confidence interval performs for sample sizes encountered in practice. Their results showed that the approximate variance estimate is suitably accurate for a sample size of 250 or more, and in many cases, for a sample size as small as 100. Also, the asymptotic confidence intervals provide good coverage, numerically close to the nominal 95% level, for a sample size of 250 or more. However, for smaller sample sizes, the asymptotic confidence intervals tend to be liberal. Although this line of research was set in the context of multiple regression, it hints at a potential application in factor analysis.

Cut-offs for the Pratt's Measures

Thomas (1992) suggested that, as a general rule, variables with Pratt's measures < 1/(2p) be considered unimportant. We have observed from the EFA and CFA examples that the < 1/(2p) criterion seems to work reasonably well, at least better than the traditional cut-offs. However, this rule was suggested based on intuitive reasoning; namely, variables with less than half the average importance are unimportant. No empirical studies have yet investigated the appropriateness of the < 1/(2p) rule or suggested other cut-offs, such as one for being important in addition to one for being unimportant.

These gaps suggest that there is a need to conduct a systematic review based on a large set of existing data.
The aim would be to explore the possibility of a common set of thresholds that classify Pratt's measures into easily interpretable categories, for example, (1) unimportant, (2) neither important nor unimportant, and (3) important, which are commonly applicable to different areas of the social and behavioural sciences. This type of research is similar to the studies conducted by Cohen (1988) in order to suggest effect sizes that are easily interpretable. The purpose is not to determine unanimous criteria for what should be considered "important" or "meaningful". Instead, the intention is to empirically explore whether there is a meaningful pattern of importance underlying Pratt's measures, fully acknowledging the diversity of disciplines and areas of research.

Graphical Representation of the Pratt's Measure Matrix

Bring (1996) and Thomas et al. (1998) used the geometry of least squares to interpret Pratt's measures for multiple regression. Based on these geometric foundations, it is intuitively straightforward to provide a two-dimensional graphic of the Pratt's measure matrix for users who have little knowledge of the geometry of least squares. Such a graphical depiction would help to outline the story of a factor model characterized by a Pratt's measure matrix. Developing a graphical visualization could facilitate the presentation and communication of the new method.

Interpreting the Factor Suppression Effect

As mentioned earlier, there is little to no direct discussion about the existence, mechanism, and implications of the suppression effect in factor analysis. There is certainly a need for future work to systematically understand and interpret factor suppression effects. Future work in the following directions is suggested.

First and most pressing is to acknowledge the presence of suppression effects in factor analysis. Much of current practice undertakes interpretation without even noticing the possible presence of factor suppression effects.
Second, future research can investigate the effect of the oblique rotation method and the accompanying specifications on the presence and mechanism of the suppression relationship. Different rotation methods and specifications may lead to mathematically different suppression relationships. It is worthwhile exploring rotation criteria that opt for a mathematically meaningful and interpretable factor suppression model. Third, building on the knowledge base of the suppression effect in multiple regression, future work can focus on distinguishing, classifying, and interpreting different types of suppression effects in factor analysis.

Fourth, Thomas et al. (1998) argued that because suppressor variables and non-suppressor variables contribute to the regression in entirely different ways, it is actually intuitive to assess the relative importance of the non-suppressors using Pratt's measures and separately assess the relative importance of the suppressors to the non-suppressors using the measure $(R^2 - R^2_{NS})/R^2$, where $R^2_{NS}$ denotes the variance explained by the non-suppressor variables alone. It seems reasonable to seriously consider this recommendation for assessing the suppression effects in factor analysis. However, in factor analysis, the same factor may act like a suppressor for one observed variable but not for another. Or, the same factor may display a negative suppression effect for one observed variable but other types for another. Thus, the formula $(h^2 - h^2_{NS})/h^2$ only works for individual observed variables.

In conclusion, this dissertation is the first attempt to adapt Pratt's measures developed for multiple regression to factor analyses. It has solved, in crucial ways, the major interpretational difficulties in understanding and interpreting an oblique factor model. Moreover, it has instigated and formulated a line of research that is deserving of scholarly expertise and devotion in the future.

References

Bring, J. (1996).
A geometric approach to compare variables in a regression model. The American Statistician, 50, 57-62.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

Conger, A. J. (1974). A revised definition for suppressor variables: A guide to their identification and interpretation. Educational and Psychological Measurement, 34, 35-46.

Graham, J. M., Guthrie, A. C., & Thompson, B. (2003). Consequences of not interpreting structure coefficients in published CFA research: A reminder. Structural Equation Modeling, 10, 142-152.

Lancaster, B. P. (1999, January). Defining and interpreting suppressor effects: Advantages and limitations. Paper presented at the annual meeting of Southwest Educational Research Association, San Antonio, Texas.

Thomas, D. R. (1992). Interpreting discriminant functions: A data analytic approach. Multivariate Behavioral Research, 27, 335-362.

Thomas, D. R., Hughes, E., & Zumbo, B. D. (1998). On variable importance in linear regression. Social Indicators Research, 45, 253-275.

Thomas, D. R., Zhu, P. C., Zumbo, B. D., & Dutta, S. (2006, June). Variable importance in logistic regression based on partitioning an R2 measure. Paper presented at the 2006 Annual Meeting of Administrative Sciences Association of Canada (ASAC), Banff, Alberta.

Thomas, D. R., Zhu, P. C., & Decady, Y. J. (2007). Point estimates and confidence intervals for variable importance in multiple linear regression. Journal of Educational and Behavioral Statistics, 32, 61-91.

Zumbo, B. D. (2007). Validity: Foundational issues and statistical methodology. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics, Vol. 26: Psychometrics (pp. 45-79).

Zumbo, B. D., Wu, A. D., & Liu, Y. (2008, March). Variable ordering when using regression with latent variables.
Paper presented at the Annual Meeting of the American Educational Research Association (AERA), New York.

Appendices

Appendix A: The 24 Psychological Ability Tests in Holzinger & Swineford's (1939) Data

T1: Visual perception
T2: Cubes
T3: Paper form board
T4: Flags
T5: General information
T6: Paragraph comprehension
T7: Sentence completion
T8: Word classification
T9: Word meaning
T10: Addition
T11: Code
T12: Counting dots
T13: Straight-curved capitals
T14: Word recognition
T15: Number recognition
T16: Figure recognition
T17: Object - number
T18: Number - figure
T19: Figure - word
T20: Deduction
T21: Numerical puzzles
T22: Problem reasoning
T23: Series completion
T24: Arithmetic problems

Appendix B: Definitions of Six Theory-Guided Dimensions of Psychological Well-Being

Autonomy (AU)
High scorer: Is self-determining and independent; able to resist social pressures to think and act in certain ways; regulates behavior from within; evaluates self by personal standards.
Low scorer: Is concerned about the expectations and evaluations of others; relies on judgments of others to make important decisions; conforms to social pressures to think and act in certain ways.

Environmental Mastery (EM)
High scorer: Has a sense of mastery and competence in managing the environment; controls complex array of external activities; makes effective use of surrounding opportunities; able to choose or create contexts suitable to personal needs and values.
Low scorer: Has difficulty managing everyday affairs; feels unable to change or improve surrounding context; is unaware of surrounding opportunities; lacks sense of control over external world.

Personal Growth (PG)
High scorer: Has a feeling of continued development; sees self as growing and expanding; is open to new experiences; has sense of realizing his or her potential; sees improvement in self and behavior over time; is changing in ways that reflect more self knowledge and effectiveness.
Low scorer: Has a sense of personal stagnation; lacks sense of improvement or expansion
over time; feels bored and uninterested with life; feels unable to develop new attitudes or behaviors.

Positive Relations with Others (PR)
High scorer: Has warm, satisfying, trusting relationships with others; is concerned about the welfare of others; capable of strong empathy, affection, and intimacy; understands give and take of human relationships.
Low scorer: Has few close, trusting relationships with others; finds it difficult to be warm, open, and concerned about others; is isolated and frustrated in interpersonal relationships; not willing to make compromises to sustain important ties with others.

Purpose in Life (PL)
High scorer: Has goals in life and a sense of directedness; feels there is meaning to present and past life; holds beliefs that give life purpose; has aims and objectives for living.
Low scorer: Lacks a sense of meaning in life; has few goals or aims, lacks sense of direction; does not see purpose of past life; has no outlook or beliefs that give life meaning.

Self-acceptance (SA)
High scorer: Possesses a positive attitude toward the self; acknowledges and accepts multiple aspects of self including good and bad qualities; feels positive about past life.
Low scorer: Feels dissatisfied with self; is disappointed with what has occurred in past life; is troubled about certain personal qualities; wishes to be different than what he or she is.

Appendix C: SPSS Syntax for Input Tetrachoric/Polychoric Correlation Matrix and Factor Analysis

* Replace codes that are highlighted in bold according to your data.
MATRIX DATA VARIABLES=V1 TO V9
  /FILE="C:\TIMSS_TIME\TIME.dat"
  /FORMAT=FREE LOWER
  /N=8385
  /CONTENTS=CORR.

* "FACTORS(3)" means extracting a specified number of factors, three for the present example.
* Alternatively, use "MINEIGEN(1)", the eigenvalue-greater-than-1 extraction rule.
* ULS: unweighted least squares.
* Other oblique rotation methods can be used; "(4)" means the kappa value is set to 4.
FACTOR
  /MATRIX=IN(COR=*)
  /VARIABLES v1 v2 v3 v4 v5 v6 v7 v8 v9
  /MISSING LISTWISE
  /ANALYSIS v1 v2 v3 v4 v5 v6 v7 v8 v9
  /PRINT INITIAL EXTRACTION ROTATION
  /PLOT EIGEN
  /CRITERIA FACTORS(3) ITERATE(25)
  /EXTRACTION ULS
  /CRITERIA ITERATE(25)
  /ROTATION PROMAX(4).

Appendix D: Positive Affect Negative Affect Schedule (PANAS)

Directions
This scale consists of a number of words that describe different feelings and emotions. Read each item and then circle the appropriate answer next to that word. Indicate to what extent you have felt this way during the past week.

Use the following scale to record your answers.
(1) = Very slightly or not at all   (2) = A little   (3) = Moderately   (4) = Quite a bit   (5) = Extremely

Item                Response
 1. Interested      1  2  3  4  5
 2. Distressed      1  2  3  4  5
 3. Excited         1  2  3  4  5
 4. Upset           1  2  3  4  5
 5. Strong          1  2  3  4  5
 6. Guilty          1  2  3  4  5
 7. Scared          1  2  3  4  5
 8. Hostile         1  2  3  4  5
 9. Enthusiastic    1  2  3  4  5
10. Proud           1  2  3  4  5
11. Irritable       1  2  3  4  5
12. Alert           1  2  3  4  5
13. Ashamed         1  2  3  4  5
14. Inspired        1  2  3  4  5
15. Nervous         1  2  3  4  5
16. Determined      1  2  3  4  5
17. Attentive       1  2  3  4  5
18. Jittery         1  2  3  4  5
19. Active          1  2  3  4  5
20. Afraid          1  2  3  4  5
Pratt's importance measures in factor analysis : a new technique for interpreting oblique factor models Wu, Amery Dai Ling 2008-12-31
Item Metadata
Title | Pratt's importance measures in factor analysis : a new technique for interpreting oblique factor models |
Creator |
Wu, Amery Dai Ling |
Publisher | University of British Columbia |
Date | 2008 |
Date Issued | 2008-09-22T21:09:32Z |
Extent | 6208226 bytes |
Subject |
Factor models Factor analysis Oblique factor |
Genre |
Thesis/Dissertation |
Type |
Text |
File Format | application/pdf |
Language | eng |
Collection |
Electronic Theses and Dissertations (ETDs) 2008+ |
Date Available | 2008-09-22 |
Provider | Vancouver : University of British Columbia Library |
DOI | 10.14288/1.0054580 |
URI | http://hdl.handle.net/2429/2333 |
Degree |
Doctor of Philosophy - PhD |
Program |
Measurement, Evaluation and Research Methodology |
Affiliation |
Education, Faculty of Educational and Counselling Psychology, and Special Education (ECPS), Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 2008-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |