"Medicine, Faculty of"@en . "Population and Public Health (SPPH), School of"@en . "DSpace"@en . "UBCV"@en . "Richardson, Christopher Galliford"@en . "2009-12-23T17:52:55Z"@en . "2004"@en . "Doctor of Philosophy - PhD"@en . "University of British Columbia"@en . "This dissertation is centred on the application of structural equation modeling (SEM) techniques to examine the measurement of mental well-being (MWB) and its relationship with health status using longitudinal data from the Canadian National Population Health Survey (NPHS). The first manuscript contains the results of a study (n = 1782) examining the factor structure of a 6-item version of the Rosenberg Self-Esteem Scale (RSES) included in the NPHS. Nested confirmatory factor analyses indicate that the items are best modeled as two correlated dimensions interpreted as a measure of self-competence and a measure of self-liking. The second manuscript contains the results of a study testing the age-based measurement invariance, temporal stability and moderating ability of Antonovsky's Sense of Coherence Scale (SOC) using longitudinal data from the NPHS. Results of multi-group longitudinal factor analysis support the measurement invariance of the SOC across age, and its stability over time, for the following age groups: 19 to 25 years (n = 1257); 30 to 55 years (n = 5326); and 60 plus years (n = 2213). A series of regression models was employed to test the ability of SOC to moderate (i.e., buffer) the health impacts associated with the experience of a recent stressful life event using data from the 1998-1999 and 2000-2001 NPHS. A significant moderating effect in the expected direction was found when predicting self-reported health (SRH). However, the moderating effect of SOC was not significant when predicting the number of self-reported visits to a physician during the previous year. 
The last manuscript contains the results of an investigation (n = 4842) examining the extent to which socioeconomic status (SES; income adequacy) predicts current SRH and future change in SRH, in addition to tests of the hypothesis that MWB (i.e., self-esteem and SOC) mediates the relationship between current SES and SRH. The results of a latent growth model indicate that SES is positively related to SRH assessed at the same time, but not to the linear change in SRH over the subsequent 6 years. Both SOC and self-esteem appear to partially mediate the impact of SES on SRH."@en . "https://circle.library.ubc.ca/rest/handle/2429/17195?expand=metadata"@en . "AN INVESTIGATION OF THE MEASUREMENT OF MENTAL WELL-BEING AND ITS RELATIONSHIP WITH HEALTH STATUS USING DATA FROM THE NATIONAL POPULATION HEALTH SURVEY OF CANADA
By CHRISTOPHER GALLIFORD RICHARDSON
B.Sc. (Hons.), University of Guelph, 1993
M.Sc., University of Northern British Columbia, 1999
A THESIS SUBMITTED IN THE PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES
Department of Health Care and Epidemiology
We accept this thesis as conforming to the required standard
THE UNIVERSITY OF BRITISH COLUMBIA
June 10, 2004
\u00A9 Christopher Galliford Richardson, 2004

ABSTRACT

This dissertation is centred on the application of structural equation modeling (SEM) techniques to examine the measurement of mental well-being (MWB) and its relationship with health status using longitudinal data from the Canadian National Population Health Survey (NPHS). 
The first manuscript contains the results of a study (n = 1782) examining the factor structure of a 6-item version of the Rosenberg Self-Esteem Scale (RSES) included in the NPHS. Nested confirmatory factor analyses indicate that the items are best modeled as two correlated dimensions interpreted as a measure of self-competence and a measure of self-liking. The second manuscript contains the results of a study testing the age-based measurement invariance, temporal stability and moderating ability of Antonovsky's Sense of Coherence Scale (SOC) using longitudinal data from the NPHS. Results of multi-group longitudinal factor analysis support the measurement invariance of the SOC across age, and its stability over time, for the following age groups: 19 to 25 years (n = 1257); 30 to 55 years (n = 5326); and 60 plus years (n = 2213). A series of regression models was employed to test the ability of SOC to moderate (i.e., buffer) the health impacts associated with the experience of a recent stressful life event using data from the 1998-1999 and 2000-2001 NPHS. A significant moderating effect in the expected direction was found when predicting self-reported health (SRH). However, the moderating effect of SOC was not significant when predicting the number of self-reported visits to a physician during the previous year. The last manuscript contains the results of an investigation (n = 4842) examining the extent to which socioeconomic status (SES; income adequacy) predicts current SRH and future change in SRH, in addition to tests of the hypothesis that MWB (i.e., self-esteem and SOC) mediates the relationship between current SES and SRH. The results of a latent growth model indicate that SES is positively related to SRH assessed at the same time, but not to the linear change in SRH over the subsequent 6 years. Both SOC and self-esteem appear to partially mediate the impact of SES on SRH. 

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS
CHAPTER 1: INTRODUCTION
1.1 OVERVIEW
1.2 MANUSCRIPT-BASED FORMAT
1.3 PROGRAM OF RESEARCH
1.4 REFERENCE LIST
CHAPTER 2: AN EXAMINATION OF THE FACTOR STRUCTURE OF A SIX-ITEM VERSION OF THE ROSENBERG SELF-ESTEEM SCALE
2.1 INTRODUCTION
2.2 METHODS
2.3 RESULTS
2.4 DISCUSSION
2.5 CONCLUSION
2.6 REFERENCE LIST
CHAPTER 3: A TEST OF THE AGE-BASED MEASUREMENT INVARIANCE, TEMPORAL STABILITY AND MODERATING ABILITY OF ANTONOVSKY'S SENSE OF COHERENCE SCALE
3.1 INTRODUCTION
3.2 STUDY 1: TESTING THE MEASUREMENT INVARIANCE OF SOC ACROSS AGE
3.3 METHODS
3.4 RESULTS
3.5 STUDY 2: AN EXAMINATION OF THE MODERATING PROPERTIES OF SENSE OF COHERENCE
3.6 METHODS
3.7 RESULTS
3.8 DISCUSSION
3.9 REFERENCE LIST
CHAPTER 4: AN INVESTIGATION OF THE MEDIATING ROLES OF SENSE OF COHERENCE AND SELF-ESTEEM IN THE RELATIONSHIP BETWEEN SOCIOECONOMIC STATUS AND SELF-REPORTED HEALTH
4.1 INTRODUCTION
4.2 METHODS
4.3 RESULTS
4.4 DISCUSSION
4.5 REFERENCE LIST
CHAPTER 5: CONCLUSION AND DISCUSSION
REFERENCE LIST

LIST OF TABLES

Table 2.1. Breakdown of self-esteem and income adequacy responses
Table 2.2. 
Results of the confirmatory factor analyses
Table 2.3. Standardized loadings and construct reliability for correlated factors model
Table 2.4. Correlation matrix for latent variables from the correlated factors model
Table 2.5. Examining the relationship with Negative Affect
Table 2.6. Examining the relationship with Anxiety
Table 2.7. Examining the relationship with Happiness
Table 3.1. Sample characteristics by each group
Table 3.2. Summary of invariance test results
Table 3.3. Testing SOC means and test-retest correlations
Table 3.4. Means and 95% confidence intervals for SOC
Table 3.5. Correlation between SOC over time for age groups
Table 3.6. Covariance estimates and 95% confidence intervals for age groups
Table 3.7. Sample characteristics with and without experience of a recent life event
Table 3.8. Linear regression analysis results predicting self-rated health (with multicollinearity present)
Table 3.9. Results for the linear regression model predicting self-rated health using a centred SOC variable
Table 3.10. Difference in self-reported health across presence or absence of a RLE for groups defined as above or below the mean SOC score
Table 3.11. Results for Poisson regression analysis predicting number of visits to a medical doctor
Table 4.1. Sample characteristics
Table 4.2. Results for models examining sense of coherence as a mediator
Table 4.3. Results for models examining self-esteem as a mediator

LIST OF FIGURES

Figure 3.1. Longitudinal confirmatory factor analysis model of sense of coherence
Figure 3.2. Graph of unstandardized predicted values of SRH by centred SOC
Figure 4.1. 
Mediation model examining the extent to which mental well-being mediates the relationship between income adequacy and self-reported health

ACKNOWLEDGEMENTS

I wish to thank my co-supervisors, Pam Ratner and Jim Frankish, for their invaluable mentorship and support. They consistently provided me with sound advice, great learning opportunities and support in all forms. They really made the completion of this degree a great learning experience. I also offer heartfelt thanks to Bruno Zumbo, my third committee member, who shared so generously of his time, knowledge and general enthusiasm for research. I feel very privileged to have had such an amazing thesis committee and will do my best to follow the great example they have set. I would also like to thank the Department of Health Care and Epidemiology for providing a fantastic learning environment filled with opportunities. In particular I would like to thank Virginia Anthony and Laurel Slaney for their friendship, support and never-ending help jumping through all the bureaucratic hoops associated with the completion of this degree. I also thank the UBC Research Data Centre run by Statistics Canada for providing a great facility to conduct research. I also thank my wonderful partner, Rachel Johnson, and her three children Caitlin, Cassandra and Sabrina for being a part of my life and putting up with me during many stressful days and nights. Finally, I would like to thank the Social Sciences and Humanities Research Council of Canada and the Michael Smith Foundation for Health Research for their generous fellowship support; without their funding this degree would not have been possible. In particular I would like to acknowledge the respectful and timely support of Judy Finch at the Michael Smith Foundation. 

CHAPTER 1 INTRODUCTION

1.1 OVERVIEW

The purpose of this doctoral dissertation is to present a program of research centred on the application of structural equation modeling techniques to examine the measurement of mental well-being (MWB) and its relationship with health status using longitudinal data from the Canadian National Population Health Survey (NPHS). The first section of the work outlines the structure of the dissertation and introduces the reader to the research endeavor. The second section contains a set of manuscripts that describe the results of several studies including: a) an investigation of the validity of NPHS assessments of self-esteem and sense of coherence (SOC) in terms of their psychometric structure and functioning; b) a test of the ability of SOC to moderate (i.e., buffer) the health impact associated with the experience of recent stressful events; and c) an assessment of the extent to which self-esteem and SOC mediate the relationship between socio-economic status (SES) and health status. The dissertation ends with a summary and discussion of the research findings in terms of the outlined program of research.

1.2 MANUSCRIPT-BASED FORMAT

This doctoral dissertation utilizes a manuscript-based format approved by the Faculty of Graduate Studies and Department of Health Care and Epidemiology at the University of British Columbia (UBC). The Faculty of Graduate Studies at UBC defines a manuscript-based thesis as a collection of published, in-press, accepted, submitted, or draft manuscripts. The collection must be integrated and be presented as part of a unified document that includes a concluding chapter that relates the manuscripts to each other and to the discipline or field of study (Faculty of Graduate Studies Guidelines, 2004). 
Following the recommendations of the Faculty of Graduate Studies at UBC, this dissertation begins with an introduction to the program of research including specific research objectives for each manuscript. The second section is composed of the individual manuscripts, each of which includes a literature review, statement of research objectives, discussion and conclusion. The last section of the dissertation is a concluding chapter that provides a brief review of the findings reported in the manuscripts, a discussion of the implications of the research findings in terms of the overall program of research, and some comment on areas for future research. Although the chapters in a traditional dissertation format might flow seamlessly, the manuscripts presented here are somewhat more distinct in that they do not build upon one another in a strictly vertical sense. Rather, each paper examines specific hypotheses related to the measurement of MWB and its relationship with health while at the same time making an explicit effort to demonstrate empirical advances in psychometric modeling that are relevant to the identified program of research. In addition to the distinctness of the research objectives examined in each manuscript, the data analyzed changed as the availability of the NPHS data sets increased over the course of the program of research. For example, during the preparation of the first manuscript in 1999 it was necessary to analyze individual question responses to a self-esteem scale included in the NPHS. However, the public release version of the NPHS data produced by Statistics Canada contained only the total scores for the scale and not responses to the individual self-esteem items. Access to the full set of responses to the 1994 NPHS questionnaire was gained through the British Columbia Linked Health Database (BCLHD) maintained by the Centre for Health Services and Policy Research at the University of British Columbia. 
Although the BCLHD NPHS data contained the responses to all questions in the NPHS, it only included the British Columbia portion of the sample. During the preparation of the second and third manuscripts in 2001-2002, access to the national data was made possible when Statistics Canada provided the full national longitudinal NPHS dataset to BC health researchers through the newly established British Columbia Inter-University Research Data Centre at the University of British Columbia.

1.3 PROGRAM OF RESEARCH

This dissertation is the product of a program of research focused on the application of structural equation modeling (SEM) to enhance our knowledge of the psychosocial determinants of health. I am particularly interested in issues related to the measurement of psychosocial variables (e.g., self-esteem) and the mechanisms through which they influence health. My doctoral studies included research on health-related quality of life, risk perception and biopsychosocial models of tobacco dependence; however, the manuscripts presented in this dissertation are centred on the application of SEM to investigate the psychometric structure of several components of MWB and their role as moderating (i.e., buffering) and mediating mechanisms between SES and health using longitudinal data from the NPHS.

The National Population Health Survey

The realization that universal, easily accessible medical care does not automatically eliminate health inequalities, together with the rising costs of modern health care systems, has forced health planners and evaluators to critically examine traditional approaches to achieving a healthy population (Marmor et al., 1994). Typically, these evaluations have been based on mortality and morbidity data acquired through vital statistics and hospital databases. 
However, the increasing realization that behavioural and social factors represent the keys to preventing or controlling today's leading causes of death and disability has resulted in information needs that cannot be met by existing mortality and morbidity information sources (Green, 1990; Health Canada, 1998). Within a Canadian setting, growing economic and fiscal pressures on the health care system led the government of Canada to establish a National Health Information Council (NHIC) in 1991. After examining the available sources of information on the general population that could be used to assess and ultimately improve the health status of Canadians, the NHIC concluded that existing sources of health data were unable to provide a complete picture of the health status of the general population and the multitude of factors that influence health. The council went on to recommend that an on-going national survey of population health be conducted and in April 1992, Statistics Canada received funding for the development of the NPHS (Statistics Canada, 2002). In addition to assessing the level, trend and distribution of the health status of the Canadian population, the NPHS was to provide longitudinal data on a panel of people who were followed over time to reflect the dynamic processes of health and illness (Statistics Canada, 2002). As stated by Catlin and Will (1992), the objectives of the NPHS were:
1) To aid in the development of public policies designed to improve health, by providing measures of the level, trend and distribution of the health status of the population.
2) To provide data for analytic studies that will assist in understanding the determinants of health.
3) To collect data on the economic, social, demographic, occupational and environmental correlates of health. 
4) To increase the understanding of the relationship between health status and the use of health services, not only in the traditional sense, but also in areas such as home-care, self medication and self-care.
5) To provide panel data that will reflect the dynamic process of health and illness and produce periodic cross-sectional estimates.
To accomplish these goals, the NPHS was divided into two separate components: a survey of health care institutions and a survey of households, the latter of which is examined in this thesis (Statistics Canada, 2002). As part of the household component, 17,276 persons were selected to form the longitudinal panel of respondents. Beginning with the first NPHS survey in 1994-1995, this set of respondents was to be surveyed every two years until 2014-2015 (Statistics Canada, 2002). At present, four cycles of NPHS data are available to researchers (i.e., the 1994-1995, 1996-1997, 1998-1999 and 2000-2001 cycles of the NPHS).

Definitions of Health and Mental Well-being

To meet the information requirements of the NPHS, the survey contains a large number of indicators associated with the physical, mental and social well-being of Canadians including assessments of self-esteem and sense of coherence (SOC). This multi-dimensional approach to the assessment of health status reflects the 1948 World Health Organization (WHO) definition of health as a state of complete physical, mental, and social well-being and not merely the absence of disease or infirmity (WHO, 1948). Although this definition has been criticized as being utopian and unrealistic for practical purposes (Bartlett & Coles, 1998), it serves well as an all-encompassing starting point for a discussion of the concept of MWB. What is clear from the WHO definition of health is that the concept of mental health is a fundamental component of health and that it includes both positive and negative dimensions. 
The concept of mental health is also considered to encompass individuals' state of mind, conduct and competence as primary attributes (Kozma, Stones & McNeil, 1991). More specifically, mental health is defined by the U.S. Surgeon General as the successful performance of mental function, resulting in productive activities, fulfilling relationships with other people, and the ability to adapt to change and to cope with adversity; from early childhood until late life, mental health is the springboard of thinking and communication skills, learning, emotional growth, resilience, and self-esteem (United States Department of Health and Human Services, 1999). From a Canadian perspective, Stephens, Dulberg, and Joubert (1999) defined mental health as a resource for everyday living. More precisely, they considered mental health to be a set of affective/relational and cognitive attributes that permit the individual to carry out valued functions with reserved capacity and thus to cope effectively with challenges to both mental and physical functioning (Stephens et al., 1999). Within this approach, self-esteem and sense of coherence are treated as components of positive mental health status because they are considered to contribute to reserve capacity and coping ability (Stephens et al., 1999). The concepts of self-esteem and SOC could be characterized as components of mental health; however, for the purposes of this thesis they are referred to as components of MWB. The primary reasons for using the phrase MWB in this thesis instead of 'mental health' or 'positive mental health' are to avoid confusion with the concept of mental illness, a term that refers collectively to all diagnosable mental disorders (United States Department of Health and Human Services, 1999), and to maintain consistency with the widely cited WHO definition of health as a state of complete physical, mental, and social well-being and not merely the absence of disease or infirmity (WHO, 1948). 

Mediating and Moderating Properties of Mental Well-being

In addition to providing an assessment of MWB, self-esteem and SOC have been identified as important mediators of the relationship between socio-economic status (SES) and health (e.g., Siegrist & Marmot, 2004). Baron and Kenny (1986) provided one of the most widely cited definitions of a mediator as \"the generative mechanism through which the focal independent variable is able to influence the dependent variable of interest\" (p. 1173). In other words, the predictor (i.e., SES) influences the mediator (i.e., self-esteem), which in turn influences the outcome (i.e., health). Full mediation is found when the entire effect of SES on health can be explained through the mediation pathway. If the direct effect is reduced but not completely eliminated, then there is evidence of partial mediation (Hoyle & Smith, 1994). The identification of variables that mediate the SES-health relationship has become of increasing interest to psychosocial epidemiologists as they shift their focus from the identification of health impacts associated with SES to developing an understanding of the processes by which SES influences health. Evidence of this shift towards modeling processes can be found in the recently published volume of the Annals of the New York Academy of Sciences that is dedicated solely to describing current research on the social, psychological and biological pathways linking SES and health (Adler, Marmot, McEwen, & Stewart, 1999). In addition to functioning as a mediator, SOC also has been theorized to function as a moderator (Antonovsky, 1987). In general terms, a moderator is a variable (either continuous or categorical) that affects the direction or strength of the relationship between the independent variable (i.e., SES) and dependent variable (i.e., health) (Baron & Kenny, 1986). 
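The mediator and moderator roles defined above can be made concrete with a small simulation. The following Python sketch is illustrative only: it is not the analysis reported in this dissertation (which used Mplus and NPHS data), and the simulated variables, the effect sizes and the helper functions (ols, simulate_mediation, simulate_moderation) are all assumptions introduced here. The first simulation shows that, under ordinary least squares, the total effect of a predictor equals the direct effect plus the indirect effect carried through the mediator. The second shows a buffering pattern in which the slope of the predictor depends on the level of a centred moderator.

```python
import random


def ols(predictors, y):
    '''Least-squares coefficients [intercept, b1, b2, ...] via the normal equations.'''
    n = len(y)
    cols = [[1.0] * n] + [list(p) for p in predictors]
    k = len(cols)
    # Augmented matrix [X'X | X'y].
    a = [[sum(u * v for u, v in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(u * v for u, v in zip(cols[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [u - f * v for u, v in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate_mediation(n=5000, path_a=0.5, path_b=0.4, direct=0.2, seed=7):
    '''Return (total effect, direct effect, indirect effect) estimated by OLS.'''
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]                  # predictor
    m = [path_a * xi + rng.gauss(0, 1) for xi in x]          # mediator
    y = [direct * xi + path_b * mi + rng.gauss(0, 1)         # outcome
         for xi, mi in zip(x, m)]
    total = ols([x], y)[1]                 # outcome on predictor only
    a_hat = ols([x], m)[1]                 # mediator on predictor
    _, direct_hat, b_hat = ols([x, m], y)  # outcome on predictor and mediator
    return total, direct_hat, a_hat * b_hat


def simulate_moderation(n=5000, seed=11):
    '''Return the predictor slope at a low (-1) and a high (+1) moderator level.'''
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]   # predictor (a harmful exposure)
    z = [rng.gauss(0, 1) for _ in range(n)]   # moderator, centred at zero
    xz = [xi * zi for xi, zi in zip(x, z)]    # interaction (product) term
    y = [-0.5 * xi + 0.3 * zi + 0.2 * xzi + rng.gauss(0, 1)
         for xi, zi, xzi in zip(x, z, xz)]
    _, b_x, _, b_xz = ols([x, z, xz], y)
    return b_x - b_xz, b_x + b_xz


if __name__ == '__main__':
    total, direct_hat, indirect = simulate_mediation()
    print('total:', round(total, 3), 'direct:', round(direct_hat, 3),
          'indirect:', round(indirect, 3))
    low, high = simulate_moderation()
    print('slope at low moderator:', round(low, 3),
          'slope at high moderator:', round(high, 3))
```

In the mediation output, a direct effect that is smaller than the total effect but not zero corresponds to partial mediation. In the moderation output, a weaker negative slope at the high moderator level corresponds to buffering.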
Another way of conceptualizing a moderator variable is as an interaction term between the predictor (i.e., SES) and moderator (i.e., SOC) in a regression model (Baron & Kenny, 1986). Instead of representing a process (i.e., mechanism) by which an effect is achieved (i.e., SES affects SOC which in turn affects health), a moderator describes how the effect of a predictor changes depending on the level of the moderator. In the context of the SES-health relationship, SOC would be a moderator if the effect of SES on health depended on whether survey respondents had a high or low SOC. This effect is commonly called \"buffering\" in that a high SOC might buffer or reduce the effect of SES on health.

Modern Approaches to Scale Validation

Given the high cost associated with conducting the NPHS, and the importance of measuring psychosocial attributes (e.g., self-esteem and SOC) to meet the survey objectives, it is critical that the instruments (e.g., self-esteem scales) undergo extensive validation. Traditional approaches to the establishment of the validity of a measure in a given population tended to rely on evidence from one of several separate types of validity (e.g., researchers may have cited evidence of a correlation with a known criterion or the results of a factor analysis but not both) (Hubley & Zumbo, 1996). However, more modern conceptualizations of validation as an ongoing collection of evidence demonstrating the meaningfulness, appropriateness and usefulness of the specific inferences made from scores (American Psychological Association, American Educational Research Association, & National Council on Measurement in Education, 1985) require a broader, more integrative approach based on the ongoing collection of evidence that justifies the interpretation and use of scores (Hubley & Zumbo, 1996). 
The need for extensive validation is of special concern when attempting to model complex processes (e.g., with multiple intermediate and final outcomes) where measurement error and bias associated with indicators of MWB may be compounded and hard to detect. Fortunately, the use of SEM makes it possible to test hypotheses about complex processes, including mediation and moderation, using estimates that are not contaminated by measurement error. At this time SEM is not widely taught in epidemiology graduate programs in Canada. However, it is rapidly becoming incorporated into many American and European graduate level public health training programs. For example, the Johns Hopkins Bloomberg School of Public Health dedicates an entire course to the teaching of SEM for psychosocial research in the context of public health. Additional recognition of the usefulness of SEM in the health field comes from the United States National Institute on Alcohol Abuse and Alcoholism, National Institutes of Health, which partially funded the development of the Mplus software used in the data analysis for this dissertation (Muthen & Muthen, 2004). Given the relative novelty of SEM within the field of epidemiology, a brief introduction to the core concepts of SEM is presented next.

Introduction to Structural Equation Modeling

In many types of epidemiological research, scientists are often interested in measuring and studying the relationships among theoretical constructs that cannot be directly observed. These abstract phenomena are termed latent variables or factors (Byrne, 1998). Examples of latent variables commonly encountered in health research include quality of life, pain, depression, anxiety, socioeconomic status, treatment satisfaction and health status. To measure latent variables, researchers must operationally define the construct in terms of behaviour that can be directly observed (Byrne, 1998). 
One of the most common behaviours selected for direct observation is the response to questions on a survey. Typically, it is assumed that the amount of a latent variable (e.g., pain) determines the responses to the questions in the survey (e.g., a patient might be asked to rate the amount of pain they experience on a scale from 1 to 10). One problem with this approach is that the response to the question is not a perfect reflection of the construct being assessed because responses contain measurement error (i.e., responses are influenced by things other than the construct of interest). Traditional statistical approaches assume that all constructs are measured without error. However, in the classical theory of mental tests, the observed score can be partitioned into the true score plus an error score. Within this framework, the power of statistical tests and the validity of the model coefficients both depend on the extent (high versus low) and nature (random versus systematic) of measurement error contained within the observed score (Williams, Zimmerman & Zumbo, 1995). Instead of ignoring the presence of measurement error, SEM allows the researcher to explicitly account for it in his or her statistical model. Conceptually, scores on the manifest variables can be thought of as being generated by a set of regression equations, each of which regresses a single observed indicator variable (the observed response) on the latent variable, with an added error term. The resulting regression coefficient for each indicator variable is commonly called a loading while the error term is labeled a 'residual' or 'uniqueness'. Such analyses offer considerable advantage in that the estimation of predictive relationships among the latent variables is not contaminated by measurement error (Byrne, 1998; Kelloway, 1998). 
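The consequence of ignoring measurement error can be illustrated with a short simulation of the classical true-score model (observed score = true score + error). The Python sketch below is an illustration under stated assumptions and is not part of the analyses in this dissertation: the sample size, the latent correlation of 0.5, the reliability of 0.6 and the function names (pearson, simulate_attenuation) are all hypothetical. It shows that the correlation between two error-laden observed scores is attenuated relative to the correlation between the underlying true scores, and that dividing by the reliability (known here only because the data are simulated) approximately recovers the latent value.

```python
import math
import random


def pearson(x, y):
    '''Pearson correlation of two equal-length lists.'''
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_attenuation(n=20000, true_r=0.5, reliability=0.6, seed=1):
    '''Return (correlation of true scores, correlation of observed scores).'''
    rng = random.Random(seed)
    # Reliability = var(true) / var(observed); true scores have variance 1.
    err_sd = math.sqrt((1 - reliability) / reliability)
    t1, t2, x1, x2 = [], [], [], []
    for _ in range(n):
        a = rng.gauss(0, 1)
        b = true_r * a + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1)
        t1.append(a)
        t2.append(b)
        x1.append(a + rng.gauss(0, err_sd))   # observed = true + error
        x2.append(b + rng.gauss(0, err_sd))
    return pearson(t1, t2), pearson(x1, x2)


if __name__ == '__main__':
    latent_r, observed_r = simulate_attenuation()
    print('latent correlation:', round(latent_r, 3))
    print('observed correlation:', round(observed_r, 3))
    # Correction for attenuation: divide by sqrt(rel_1 * rel_2) = 0.6 here.
    print('corrected correlation:', round(observed_r / 0.6, 3))
```

SEM achieves a comparable correction without assuming a known reliability, because the loadings and residual variances are estimated from multiple indicators of each latent variable.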
Although the testing of measurement models using confirmatory factor analysis (CFA) represents one of the most widespread uses of SEM, recent advances in both the accessibility and sophistication of SEM software packages have led to a rapid rise in the use of full structural equation models (i.e., models that include relationships among latent variables) (Kelloway, 1998). In addition to explicitly specifying a measurement model for each latent variable included in the analysis, the use of SEM enables the researcher to simultaneously test an entire system of relationships that includes multiple independent, dependent and intervening variables (Byrne, 1998). In addition to tests of statistical significance for each parameter in the model, the researcher also obtains a statistical test of the goodness-of-fit for the model as a whole (Byrne, 1998). In contrast, traditional procedures for examining complex models fail to assess or correct for measurement error and are essentially descriptive in nature and thus not amenable to hypothesis testing (Byrne, 1998). The ability to explicitly account for measurement error in models of complex relationships that include multiple independent, intervening and dependent variables makes SEM particularly appealing for psychosocial epidemiology. Many of the concepts under investigation are latent variables (e.g., socioeconomic status, stress, depression, mental well-being, quality of life and social support) and the majority of theories involve complex processes including mediation, moderation and reciprocal effects (i.e., feedback loops). For a non-technical introduction to SEM, the interested reader is referred to Kelloway (1998) and Byrne (1998).

Research Objectives

The goal of this thesis was to apply SEM techniques to examine the measurement of MWB and its relationship with health status using longitudinal data from the NPHS. 
The first manuscript contains the results of an investigation that uses confirmatory factor analysis to examine the construct validity of the self-esteem scale included in the NPHS. The second manuscript contains the results of two studies investigating the psychometric properties of the SOC scale. The first study described in the manuscript uses multi-group longitudinal confirmatory factor analysis to simultaneously test the stability of the factor structure of the SOC over time, within and between age groups. The second study contains the results of a series of regression models testing the ability of SOC to moderate (i.e., buffer) the health impact associated with the experience of a recent stressful life event using data from the 1998-1999 and 2000-2001 NPHS. The third manuscript reports the results of a series of structural equation models examining the extent to which self-esteem and SOC mediate the effect of SES on current self-reported health and future change in self-reported health over the subsequent 6 years.

1.4 REFERENCE LIST

American Psychological Association, American Educational Research Association, & National Council on Measurement in Education. (1985). Standards for educational and psychological testing. Washington, DC: American Psychological Association.

Antonovsky, A. (1987). Unraveling the mystery of health: how people manage stress and stay well. San Francisco, CA: Jossey-Bass.

Adler, N. E., Marmot, M., McEwen, B. S. & Stewart, J. (Eds.) (1999). Socioeconomic status and health in industrial nations: social, psychological and biological pathways. New York, NY: Annals of the New York Academy of Sciences.

Baron, R. M. & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J Pers Soc Psychol, 51, 1173-1182.

Bartlett, C. J. & Coles, E. C. (1998). Psychological health and well-being: why and how should public health specialists measure it? Part 1: rationale and methods of the investigation, and review of psychiatric epidemiology. J Public Health Med, 20, 281-287.

Byrne, B. M. (1998). Structural equation modeling with LISREL, PRELIS, and SIMPLIS: basic concepts, applications, and programming. Mahwah, NJ: Lawrence Erlbaum Associates.

Catlin, G. & Will, P. (1992). The National Population Health Survey: highlights of initial developments. Health Reports, 4(3), 313-319.

Faculty of Graduate Studies (2004). Masters and doctoral thesis submission: types of theses: manuscript-based format. Retrieved June 6, 2004, from http://www.grad.ubc.ca/students/thesis/index.asp?menu=001,002,000,000

Green, L. W. (1990). Community health. St. Louis: Times Mirror/Mosby College Publishing.

Health Canada (1998). Health Canada request for synthesis proposals: measuring the health of populations. Ottawa, ONT: Health Canada.

Hoyle, R. H. & Smith, G. T. (1994). Formulating clinical research hypotheses as structural equation models: a conceptual overview. J Consult Clin Psychol, 62, 429-440.

Hubley, A. M. & Zumbo, B. D. (1996). A dialectic on validity: where we have been and where we are going. The Journal of General Psychology, 123, 207-215.

Kelloway, E. K. (1998). Using LISREL for structural equation modeling: a researcher's guide. Thousand Oaks, CA: Sage Publications.

Kozma, A., Stones, M. J. & McNeil, J. K. (1991). Psychological well-being in later life. Toronto: Butterworths.

Marmor, T. R., Barer, M. L. & Evans, R. G. (1994). The determinants of a population's health: what can be done to improve a democratic nation's health status? In R. G. Evans, M. L. Barer and T. R. Marmor (Eds.), Why are some people healthy, and others not? The determinants of health of populations (pp. 217-230). New York, NY: Aldine de Gruyter.

Muthen, L. K. & Muthen, B. O. (1998-2004). Mplus user's guide. Third edition. Los Angeles, CA: Muthen & Muthen.

Siegrist, J. & Marmot, M. (2004). Health inequalities and the psychosocial environment: two scientific challenges. Soc Sci Med, 58, 1463-1473.

Statistics Canada (2002). Population Health Surveys Program: National Population Health Survey Cycle 4 (2000-2001), household component longitudinal documentation. Ottawa, ONT: Statistics Canada.

Stephens, T., Dulberg, C. & Joubert, N. (1999). Mental health of the Canadian population: a comprehensive analysis. Chronic Dis Can, 20, 118-126.

United States Department of Health and Human Services (1999). Mental health: a report of the Surgeon General, executive summary. Rockville, MD: U.S. Department of Health and Human Services, Substance Abuse and Mental Health Services Administration, Center for Mental Health Services, National Institutes of Health, National Institute of Mental Health.

Williams, R. H., Zimmerman, D. W., & Zumbo, B. D. (1995). Impact of measurement error on statistical power: Review of an old paradox. Journal of Experimental Education, 63, 363-370.

World Health Organization (1948). Constitution of the World Health Organization, in World Health Organization (1992). Basic documents (39th edn.). Geneva: World Health Organization.

CHAPTER 2

AN EXAMINATION OF THE FACTOR STRUCTURE OF A SIX-ITEM VERSION OF THE ROSENBERG SELF-ESTEEM SCALE

2.1 INTRODUCTION

An individual's self-concept represents the cognitive schema that organizes abstract and concrete memories about the self and controls the processing of self-relevant information (Campbell, 1990). The self-concept in turn is generally recognized as consisting of two related but distinct components, referred to as the evaluative and knowledge components, that may vary to differing extents across time, situations and individuals (Nezlek & Plesko, 2001).
The concept of self-concept clarity represents an elucidation of the structure of the knowledge component of the self that refers to the extent to which self-beliefs are clearly and confidently defined, internally consistent, and stable (Campbell, 1990). The evaluative component of the self is frequently measured using assessments of self-esteem (Campbell, 1990). The concept of self-esteem is one of the most widely assessed components of mental well-being (MWB) and is typically defined as "the extent to which one prizes, values, approves or likes oneself" or "the overall affective evaluation of one's worth, value or importance" (Blascovich & Tomaka, 1991, p. 115). From this perspective, self-esteem is determined by the extent to which individuals perceive themselves as matching up to a set of core values learned through the process of socialization. The smaller the discrepancy between the ideal self and the current, actual, or "real" self, the higher the self-esteem of the individual (Mruk, 1999).

In addition to self-esteem, many population health surveys (e.g., the National Population Health Survey of Canada) include assessments of other components of mental health and well-being. For example, the 1994 Canadian National Population Health Survey (NPHS), the source of the data analyzed in this study, contained questions addressing negative affect, anxiety and happiness. The relationship between self-esteem and these additional components of MWB is discussed next.

Self-esteem and Depression

The pioneering work of cognitive theorists such as Beck (1967) suggests that the construct of self-esteem should function as a direct indicator of vulnerability to major depression. Support for this perspective has been found in studies showing that poor self-esteem is a significant risk factor for the onset of clinical depression (e.g., Brown, Andrews, & Bifulco, 1990; Miller, Kreitman, & Ingham, 1989).
Numerous studies have examined the relationships among dimensions of personality and depression. In a recent review and commentary, Enns and Cox (1997) documented empirical support for the following four general classes of relationships initially described by Clark, Watson, and Mineka in 1994: (1) Vulnerability models where personality measures predispose persons to the development of depression; (2) Pathoplasty models where personality factors affect the expression of depression; (3) Complication or "scar" models in which depression leads to changes in personality function; and (4) Continuity or spectrum models where an underlying process is responsible for the occurrence of both personality and depression. Although these models do not focus specifically on the construct of self-esteem, there is evidence suggesting that the relationship between neuroticism and depression is very similar in magnitude to the relationship between self-esteem and depression. This has led some researchers to suggest that self-esteem may be measuring the same underlying concept assessed by measures of neuroticism (Judge, Erez, & Bono, 2002).

Although studies have examined the link between depression and either neuroticism or self-esteem, recent research has focused on understanding the nature of the inter-relationships among self-esteem, neuroticism and depression. For example, Roberts and Kendler (1999) concluded that when considered individually, both neuroticism and poor self-esteem are risk factors for major depression. However, neuroticism appears to be a more proximal indicator of risk because most or all of the risk explained by self-esteem is mediated by neuroticism. In contrast, Cheng and Furnham (2003) concluded that global self-esteem is a powerful predictor of depression that mediates the influence of neuroticism.
Schmitz, Kugler, and Rollnik (2003) concluded that both neuroticism and self-esteem predict depression; however, a complex nonlinear interaction exists between the concepts, suggesting that neuroticism and self-esteem should be evaluated simultaneously when analyzing depression disorders. Despite the ambiguities regarding the specific roles of self-esteem and neuroticism in predicting depression, there is empirical and theoretical support for a strong predictive relationship between measures of self-esteem and depression.

Self-esteem and Happiness

Happiness can also be considered a component of MWB and has been described as the preponderance of positive affect over negative affect (Diener, 1984). More recently, DeNeve and Cooper (1998) suggested that happiness is more of an overall affective appraisal while positive and negative affect are specifically focused on the recent occurrence of specific positive and negative emotions. Compared to the amount of research examining the relationship between depression and self-esteem, substantially less attention has been paid to examining the link between self-esteem and happiness. However, the recent rise in interest in positive psychology has led to a resurgence of research on happiness and its determinants. For example, a recent meta-analysis found that of the so-called big five personality traits (i.e., extraversion, agreeableness, conscientiousness, neuroticism and openness), neuroticism was the strongest predictor of happiness and negative affect (DeNeve & Cooper, 1998). Alternatively, a recent examination of the relationship between self-esteem and happiness published by Cheng and Furnham (2003) indicated that self-esteem directly predicts happiness in addition to mediating the effects of extraversion and neuroticism on happiness.
As with the evidence relating self-esteem to depression, a considerable grey area exists surrounding the nature of the relationships among self-esteem, additional personality factors such as neuroticism, and happiness. Nonetheless, there is clear support for the expectation of a robust predictive relationship between self-esteem and happiness.

In addition to the uncertainty concerning the nature of the relationships among self-esteem, personality assessments such as neuroticism and other assessments of MWB, there is an ongoing debate concerning the structure of self-esteem. The Rosenberg Self-Esteem Scale (RSES) is a widely used scale that purports to provide a one-dimensional assessment of global self-esteem (Rosenberg, 1965). The original version contains 10 questions (5 with a positive orientation and 5 with a negative orientation); however, a subset of 6 questions was selected for inclusion in the 1994 National Population Health Survey of Canada (NPHS) (Statistics Canada, 1995). The item selection was based on the results of a principal components analysis (varimax rotation) published in the appendix of a paper by Pearlin and Schooler (1978). The selected questions, with the original factor loadings arising from the one-factor solution published by Pearlin and Schooler (1978), are presented below.

RSES items included in the NPHS:
Q1. I feel that I have a number of good qualities (.79).
Q2. I feel that I'm a person of worth, at least on an equal plane with others (.79).
Q3. I am able to do things as well as most other people (.68).
Q4. I take a positive attitude towards myself (.65).
Q5. On the whole I am satisfied with myself (.56).
Q6. All in all, I am inclined to feel that I'm a failure (-.46).

RSES items omitted from the NPHS:
Q7. At times I think I am no good at all.
Q8. I feel I do not have much to be proud of.
Q9. I certainly feel useless at times.
Q10. I wish I could have more respect for myself.
Respondents' answers are scored using the following 5-point Likert scale: 0 = Strongly disagree, 1 = Disagree, 2 = Neither agree nor disagree, 3 = Agree, and 4 = Strongly agree. Total scores are created by summing all responses, with scores reversed for Q6. The results of Pearlin and Schooler (1978) suggest that a one-factor model might be an appropriate representation for the six indicators. However, it is well known that exploratory factor analyses (EFA), including principal components analyses, are not able to identify models that contain higher-order factor structures or multiple correlated factors. For example, Hunter and Gerbing (1982) concluded many years ago that EFA is a "poor ending point for the construction of uni-dimensional scales" (p. 273). Not only does EFA tend to 'under-factor,' but it also combines highly correlated variables into the same factor, which can be problematic when variables are correlated for reasons other than being measures of the same factor (Rubio, Berg-Weger, & Tebb, 2003), such as when variables correlate because of similar question formats (e.g., reverse coded response patterns may cause variables to correlate). Although numerous studies of the RSES have been performed on the full set of RSES items, as well as a variety of different subsets, little is known about the extent to which the one-factor model proposed by Pearlin and Schooler (1978) is supported by confirmatory factor analysis.

One position on the factor structure of the RSES that has recently gained widespread acceptance is based on the work of Marsh (1996). This perspective rests on the notion that one factor underlies the RSES, but that a method effect related to the wording of negative items contaminates the model fit in one-dimensional CFAs.
Typically, researchers have achieved acceptable model fit for one-dimensional models by adding error correlations (correlations among the unique components) across the negatively worded items, though some have elected to include an additional latent variable as a single method effect. For a comprehensive review of the various models and their origins the reader is referred to Marsh (1996) and Corwyn (2000). Although the inclusion of method effects can, at times, make theoretical sense, their intuitive appeal combined with the accompanying improvement in model fit appears to have conferred a somewhat unjustified notion of validity in terms of their interpretation. The need to include correlated error terms or additional latent method effect variables may also be interpreted as indicating that multiple dimensions of self-esteem are present (see Hayduk, Ratner, Johnson, & Bottorff, 1995). However, beyond the work of Marsh (1996) and Corwyn (2000), who reported evidence suggesting that the effects diminish with increased verbal ability, little has been done in terms of describing the specific nature of the method effects and their implications. For example, if there are method effects and they vary according to verbal ability, then is the RSES appropriate for use in populations containing important subgroups (e.g., socioeconomic groups) with different verbal abilities? However, given that the items included in the NPHS contain only one item with negative wording, a 6-item 1-factor model of self-esteem should fit the data without the addition of correlated error terms or latent method effect variables (i.e., there are no covariances among negatively worded items given that there is only one).

A second perspective on the structure of the RSES and the concept of self-esteem has been offered by Tafarodi and Swann (1995, 2001). They proposed that 'global' self-esteem is composed of two interdependent but distinct concepts.
The first, labeled 'self-competence,' is defined as the valuative experience of oneself as a causal agent, an intentional being with efficacy and power. The second dimension, labeled 'self-liking,' is defined as the valuative experience of oneself as a social object, a good or bad person according to internalized criteria. In relation to the 6 RSES items included in the NPHS, Tafarodi and Swann (2001) reported evidence suggesting that items 4 and 5 are indicators of self-liking whereas items 1, 2, 3 and 6 are indicators of self-competence.

Research Objectives

The purpose of this study was to use confirmatory factor analysis to compare the fit of two competing models of the RSES items used in the NPHS. Model 1, a one-factor model proposed by Marsh (1996), specifies that all six items load onto a single self-esteem factor. Model 2 is a correlated two-factor model developed by Tafarodi and Swann (1995, 2001), which suggests that items (i.e., questions) 4 and 5 are indicators of self-liking, and items 1, 2, 3 and 6 are indicators of self-competence. In addition to the CFAs, a series of latent variable regression analyses were employed to compare the predictive power associated with using a single versus 2-dimensional representation of the RSES to predict measures of depression and happiness. Support for using a 2-dimensional model would be indicated by an increase in the amount of outcome variance explained compared to that explained when using the single factor model. The latent variable regressions could also be considered, to some extent, as assessments of discriminant validity in that the researcher is also testing the degree to which a construct under investigation diverges from other constructs that are theoretically less related to it.
In the context of this investigation, support for a 2-dimensional representation of self-esteem would be indicated by meaningful differences in the relationships between each of the self-esteem factors in the 2-factor model and one or more of the outcomes (e.g., happiness).

2.2 METHODS

Data

The data analyzed in this study were obtained from respondents, 19 years of age and older, who participated in the British Columbia portion of the 1994 NPHS of Canada (Statistics Canada, 1995). The sampling design for the NPHS is a stratified cluster design based on an existing Labour Force Survey. Sampling weights adjusting for differing opportunity for responding were also included in the dataset provided by Statistics Canada.

Instruments

Rosenberg Self-esteem Scale

As described in the introduction of this paper, an adapted 6-item version of the RSES was used in the NPHS. Respondents' answers to the 6 questions were scored using the following 5-point Likert scale: Strongly disagree = 0, Disagree = 1, Neither agree nor disagree = 2, Agree = 3, Strongly agree = 4. After reversing the score for Q6, a total score for the RSES is created by summing the responses, with higher scores indicating greater self-esteem.

Q1. You feel that you have a number of good qualities;
Q2. You feel that you're a person of worth at least equal to others;
Q3. You are able to do things as well as most other people;
Q4. You take a positive attitude toward yourself;
Q5. On the whole you are satisfied with yourself;
Q6. All in all, you're inclined to feel you're a failure (reverse scored).

Depression and Anxiety

The questions used to measure depression in this study were originally included as part of the Distress Scale in the NPHS. Statistics Canada (1995) reported that the scale was derived from the work of Kessler and Mroczek (from Michigan University) and is based on a subset of items from the Composite International Diagnostic Interview (CIDI).
The CIDI is a structured diagnostic instrument that was designed to produce diagnoses according to the definitions and criteria of both DSM-III-R and the Diagnostic Criteria for Research of the ICD-10. In the tripartite (i.e., three-dimensional) model of depression and anxiety originally proposed by Clark and Watson (1991), depression is specifically characterized by low positive affect (anhedonia), anxiety is specifically characterized by physiological hyperarousal, and both anxiety and depression share indicators of general negative affect (Joiner & Lonigan, 2000). Based on this three-dimensional characterization of symptom expression, the NPHS Distress Scale items were grouped into a measure of general negative affect and a measure of anxiety characterized by physiological hyperarousal. All item responses were scored using the following 5-point Likert scale: All of the time = 4, Most of the time = 3, Some of the time = 2, A little of the time = 1, None of the time = 0.

Negative Affect Items

Scores on the negative affect factor had a possible range of zero to twelve with higher scores indicating greater negative affect.

Q1. During the past month, about how often did you feel so sad that nothing could cheer you up?
Q2. During the past month, about how often did you feel hopeless?
Q3. During the past month, about how often did you feel worthless?

Anxiety Items

Scores on the anxiety factor had a possible range of zero to eight with higher scores indicating greater anxiety.

Q1. During the past month, about how often did you feel nervous?
Q2. During the past month, about how often did you feel restless or fidgety?

Happiness

The construct of happiness was assessed using a single question that was included in the NPHS as part of the Health Utility Index (Feeny et al., 1992). Although it does not address multiple dimensions of happiness, it does possess reasonable face validity in that it appears to provide a crude overall affective appraisal.
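The summed scale scores described in this Instruments section (the RSES total with Q6 reversed, and the negative affect and anxiety totals) can be sketched as follows. This is a minimal illustration under the coding stated in the text (all responses coded 0 to 4); the function and variable names are hypothetical and do not come from the NPHS documentation.

```python
# Illustrative scoring of the summed scales described above.
# All names are hypothetical; only the 0-4 coding and the Q6 reversal
# are taken from the text.

RSES_ITEMS = ("Q1", "Q2", "Q3", "Q4", "Q5", "Q6")
RSES_REVERSED = {"Q6"}  # the single negatively worded item
MAX_CODE = 4            # responses are coded 0..4


def score_rses(responses):
    """Sum the six RSES responses, reversing Q6; possible range 0-24."""
    total = 0
    for item in RSES_ITEMS:
        value = responses[item]
        if not 0 <= value <= MAX_CODE:
            raise ValueError(f"{item} must be coded 0..{MAX_CODE}, got {value}")
        total += (MAX_CODE - value) if item in RSES_REVERSED else value
    return total


def score_sum(responses):
    """Sum 0-4 frequency responses (negative affect or anxiety items)."""
    if any(not 0 <= value <= MAX_CODE for value in responses):
        raise ValueError(f"responses must be coded 0..{MAX_CODE}")
    return sum(responses)


example = {"Q1": 3, "Q2": 4, "Q3": 3, "Q4": 3, "Q5": 3, "Q6": 1}
print(score_rses(example))   # Q6 contributes 4 - 1 = 3 to the total
print(score_sum([1, 0, 0]))  # three negative affect items, range 0-12
print(score_sum([2, 1]))     # two anxiety items, range 0-8
```

Keeping the reversal inside the scoring function means the raw responses stay in their original coding, which is the form needed later for the item-level factor analyses.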
To allow for some measurement error, a reliability of .80 was utilized when modeling the happiness item as a single indicator of latent happiness. Responses were reverse coded so that higher scores indicated greater happiness.

Q1. Would you describe yourself as being usually: (1) Happy and interested in life; (2) Somewhat happy; (3) Somewhat unhappy; (4) Unhappy with little interest in life; (5) So unhappy that life is not worthwhile.

Analysis Strategy

The responses to the RSES items were scored on a five-category Likert-type scale. The responses resulting from this type of scale are ordinal in that we assume that a person who endorses one category over another possesses more (or less) of the characteristic than if they endorsed a different category. We do not know how much more (or less) of the characteristic they possess because ordinal variables do not have origins or units of measurement; the means, variances and covariances of ordinal variables have no meaning. The only information contained in ordinal data is the count of cases in each cell of a multi-way contingency table (Joreskog, 2001a). Although many researchers employing SEM have traditionally treated ordinal responses as if they were continuous, recent advances in statistical modeling software packages have made other, potentially more robust, modeling strategies accessible to the applied researcher. One generally accepted approach for analyzing ordinal data in SEM analyses involves the use of polychoric correlations with diagonally weighted least squares estimation using the asymptotic covariance matrix. This approach assumes that there is an underlying continuous variable z* that represents the attitude manifested in the ordinal responses. Basically, a metric is assigned to the ordinal variable by estimating response thresholds from the proportion of cases responding in or below each category and mapping these proportions onto a standard normal distribution.
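The threshold step just described can be sketched in a few lines. This is an illustrative calculation only, not the estimation routine used by LISREL or PRELIS: it takes the percentage of respondents in each ordered category, accumulates them, and maps each cumulative proportion to a standard normal quantile. The function name is hypothetical, and the example percentages are the Q4 responses reported in Table 2.1 of this chapter.

```python
from statistics import NormalDist


def category_thresholds(percentages):
    """Return the K-1 thresholds implied by K ordered category percentages.

    Each threshold is the standard normal quantile of the cumulative
    proportion of cases responding in or below that category.
    """
    total = sum(percentages)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    cumulative = 0.0
    thresholds = []
    for pct in percentages[:-1]:    # the top category has no upper threshold
        cumulative += pct / total
        if not 0.0 < cumulative < 1.0:
            # A cumulative proportion of 0 or 1 maps to an infinite threshold.
            raise ValueError("empty extreme categories give infinite thresholds")
        thresholds.append(standard_normal.inv_cdf(cumulative))
    return thresholds


# Q4 response percentages (Strongly disagree ... Strongly agree), Table 2.1.
q4_percentages = [0.2, 3.2, 6.1, 51.0, 39.5]
taus = category_thresholds(q4_percentages)
print([round(tau, 2) for tau in taus])
```

With five response categories there are four thresholds, and they are strictly increasing because the cumulative proportions are. An item such as Q1, where no respondent chose the lowest category, would need its empty category collapsed before this calculation could be applied.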
For a more detailed explanation of this approach, see Joreskog (2001b). The SEM analyses conducted in this study were carried out using LISREL 8.54 (LISREL, 2003). All analyses were based on the analysis of polychoric correlations using diagonally weighted least squares estimation and an asymptotic covariance matrix.

Following the recommendations of Kaplan and Ferguson (1999), all analyses utilized normalized Statistics Canada weights (i.e., the original Statistics Canada weight for each respondent was divided by the average weight for their group) to take into account the probability of selection while controlling for an artificial weight-induced increase in sample size. Although the use of weights will yield unbiased estimates (i.e., estimates obtained from repeated independent samplings of the population would approximately equal the estimate obtained for the same analysis performed on the whole population), weighted estimators have the disadvantage of typically being more variable than unweighted estimators (Korn & Graubard, 1995). Research has shown that the use of weights affects univariate statistics (e.g., population means) to a larger extent than estimates of association (Korn & Graubard, 1995). Weighted and unweighted estimates of association tend to differ substantially when (a) the model is very misspecified, (b) an omitted variable has a strong (qualitative) interaction with the independent variables and is highly correlated with the weights or (c) the sampling is done at very different rates depending upon the outcome variable (Korn & Graubard, 1995). If respecifying the model does not make weighted and unweighted estimates similar, then the weights are correcting for sample selection procedures and use of the weights will yield correct estimates but incorrect standard errors.
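The weight normalization described above (each respondent's weight divided by the average weight for their group) can be sketched as follows. The function name, the example weights and the grouping labels are hypothetical; the text does not specify which grouping variable was used, so the labels here are purely illustrative.

```python
from collections import defaultdict


def normalize_weights(weights, groups):
    """Divide each sampling weight by the mean weight of its group.

    After normalization the weights in each group sum to that group's
    sample size, so the weighted sample size is not artificially inflated.
    """
    group_sum = defaultdict(float)
    group_count = defaultdict(int)
    for weight, group in zip(weights, groups):
        group_sum[group] += weight
        group_count[group] += 1
    return [
        weight / (group_sum[group] / group_count[group])
        for weight, group in zip(weights, groups)
    ]


# Hypothetical sampling weights for five respondents in two groups.
raw_weights = [100.0, 300.0, 200.0, 50.0, 150.0]
group_labels = ["a", "a", "a", "b", "b"]
normalized = normalize_weights(raw_weights, group_labels)
print(normalized)
```

The relative sizes of the weights within a group are preserved, so the correction for unequal selection probabilities is kept while the total weighted count stays equal to the number of respondents.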
Although procedures have been suggested for adjusting the standard errors in linear regression models (see for example Winship & Radbill, 1994), comparable procedures for use in an SEM environment have not been incorporated into the LISREL software used for the modeling in this investigation.

The first section of the analysis compared the fit of two competing models of the RSES items used in the NPHS. Model 1 was the one-factor model proposed by Marsh (1996), which had all six items loading onto a single self-esteem factor (SE). Model 2 was the two-factor model proposed by Tafarodi and Swann (2001), which had questions 4 and 5 functioning as indicators of self-liking (LIKE) and questions 1, 2, 3 and 6 modeled as indicators of self-competence (COMP).

In the second stage of the analysis, latent variable correlations between the self-esteem factors and negative affect, anxiety and happiness were examined for evidence of discriminant validity (i.e., Do the two self-esteem factors differ in their relationships with indicators of negative affect, anxiety and depression?). A series of latent variable regressions were then used to compare the predictive power associated with using a single versus 2-dimensional representation of the RSES to predict negative affect, anxiety and happiness (i.e., Does the variance explained in negative affect, anxiety and depression by each of the self-esteem factors overlap?).

2.3 RESULTS

Descriptive Statistics

The response rate for the survey was approximately 88% (Statistics Canada, 1995) with 1811 British Columbians 19 years of age and older participating. After listwise deletion of respondents with missing data, the effective sample size was 1782. Women made up 52.2 percent of the final sample and the mean age of the respondents was 45 years (standard deviation = 17).
The distributions of self-esteem items and household income taking into account household size (income adequacy) can be found in Table 2.1.

Table 2.1 Breakdown of self-esteem and income adequacy responses.

Self-esteem Scale Response Percentages

Question   Strongly    Disagree   Neither agree   Agree   Strongly
           Disagree               nor disagree            Agree
Q1.          0.0         0.2          1.5         58.6     39.6
Q2.          0.1         0.3          0.9         54.1     44.5
Q3.          0.3         1.8          2.8         55.1     39.9
Q4.          0.2         3.2          6.1         51.0     39.5
Q5.          0.5         4.2          5.6         55.5     34.3
Q6.         47.8        46.1          3.4          2.6      0.1

Income Adequacy Response Percentages

Category               Income                 Household Size        Percent
Lowest income          Less than $10,000      1 to 4 persons          4.6
                       Less than $15,000      5 or more persons
Lower middle income    $10,000 to $14,999     1 or 2 persons         10.4
                       $10,000 to $19,999     3 or 4 persons
                       $15,000 to $29,999     5 or more persons
Middle income          $15,000 to $29,999     1 or 2 persons         26.6
                       $20,000 to $39,999     3 or 4 persons
                       $30,000 to $59,999     5 or more persons
Upper middle income    $30,000 to $59,999     1 or 2 persons         37.9
                       $40,000 to $79,999     3 or 4 persons
                       $60,000 to $79,999     5 or more persons
Highest income         $60,000 or more        1 or 2 persons         16.2
                       $80,000 or more        3 persons or more
Unknown                Not available          Not available           4.2

Total scores on the RSES ranged from 6 to 24, had a mean value of 20 (standard deviation = 3) and were normally distributed. From Table 2.1, it can be seen that responses to items 3, 4, 5 and 6 were negatively skewed, suggesting a slight ceiling effect in this population (in the case of Q6 a floor effect is indicated). The mean negative affect score was 1.6 (standard deviation of 2.0); the mean anxiety score was 1.5 (standard deviation of 1.5); and the mean happiness score was 4.7 (standard deviation of .6). Both the negative affect and anxiety scores were positively skewed, suggesting a floor effect, while the happiness rating was negatively skewed, indicating a ceiling effect.
Confirmatory Factor Analyses of RSES Items

A routine review of the residuals and modification indexes associated with the initial 2-factor model based on the work of Tafarodi and Swann (2001) indicated that a substantially better fitting 2-factor model could be achieved by switching the loading of Q6 from the self-competence factor to the self-liking factor (see Table 2.2). While the generation of new models was not originally planned, the substantial increase in model fit (e.g., RMSEA decreased from .072 to .028 and SRMR decreased from .05 to .02) suggested that including this model in subsequent latent variable analyses might contribute to the literature. This new 2-factor model, labeled Model 3, had items 1, 2 and 3 loading onto a self-competence factor while items 4, 5 and 6 loaded onto a self-liking factor. The results in Table 2.2 show that none of the models passed the χ² goodness of fit test, though Model 3 obtained the largest p-value. In terms of nested model comparisons using the χ² statistic, both Model 2 and Model 3 yielded sizable, statistically significant improvements in fit compared to Model 1 (Δχ²(1) = 95, p < .01 for Model 1 versus Model 2; Δχ²(1) = 158, p < .01 for Model 1 versus Model 3). An examination of the supplementary indices using cutoff criteria for relatively good fit (RMSEA < .06, CFI > .95 and SRMR < .08) recommended by Hu and Bentler (1999) suggested that only Model 3 could be classified as having a good fit (the RMSEA for Model 1 and Model 2 exceeded .06).

Table 2.2. Results of the confirmatory factor analyses.

| Indices of Model Fit | Model 1¹ (1-Factor) | Model 2² (2-Factor) | Model 3³ (2-Factor) |
|---|---|---|---|
| χ² (df) | 177 (9) | 82 (8) | 19 (8) |
| p-value | <.001 | <.001 | .014 |
| RMSEA | .102 | .072 | .028 |
| 90% CI for RMSEA | .090, .116 | .058, .087 | .012, .044 |
| CFI | .99 | .99 | 1.00 |
| SRMR | .06 | .05 | .02 |

| Standardized Loadings for Items | Model 1 | Model 2 COMP | Model 2 LIKE | Model 3 COMP | Model 3 LIKE |
|---|---|---|---|---|---|
| Q1 You feel that you have a number of good qualities. | .83 | .84 | - | .86 | - |
| Q2 You feel that you're a person of worth at least equal to others. | .92 | .93 | - | .97 | - |
| Q3 You are able to do things as well as most other people. | .77 | .78 | - | .83 | - |
| Q4 You take a positive attitude toward yourself. | .82 | - | .89 | - | .88 |
| Q5 On the whole you are satisfied with yourself. | .78 | - | .83 | - | .83 |
| Q6 All in all, you're inclined to feel you're a failure (reverse coded). | .69 | .70 | - | - | .73 |
| Cronbach's Alpha | .84 | .77 | .76 | .80 | .77 |
| Construct Reliability | .92 | .89 | .85 | .92 | .86 |
| Interfactor Correlation | - | .87 | | .80 | |

Note: All loadings and inter-factor correlations were significant (p < .05) and all estimates were essentially the same when run without using the statistical weights.
¹ Single factor model of self esteem implied by Marsh (1996).
² Two factor model of self esteem composed of self-liking and self-competence as described by Tafarodi and Swann (2001).
³ Modified two factor model of self esteem composed of self-liking and self-competence.

Reliability

Cronbach's alpha is commonly reported as a measure of reliability; however, it is directly influenced by the number of items in a scale and underestimates reliability when the assumption of tau-equivalence (the items load on the same construct exclusively and have loadings equal in magnitude) is violated (see Bollen (1989) for further details). In light of the ability to check for tau-equivalence in CFA, use of an alternative coefficient based on a slightly different conceptualization of reliability that accommodates lack of tau-equivalence has been recommended. Initially derived by Joreskog (1971), this coefficient of construct reliability is based on a definition of reliability as an assessment of the variance in the indicators explained by the common underlying latent construct. Gerbing and Anderson (1988) recommended using the following formula to calculate construct reliability:

$$\text{Construct reliability} = \frac{\left[\sum(\text{std. loadings})\right]^2}{\left[\sum(\text{std. loadings})\right]^2 + \sum\left(1 - \text{std. loadings}^2\right)}$$

Although not all values of Cronbach's alpha met the recommended level of .80 (Nunnally & Bernstein, 1994), the estimates of construct reliability suggest that all three models meet the recommended standard (see Table 2.2).

Latent Variable Regressions to Examine Predictive Power and Discriminant Validity

Prior to running the latent variable regressions, a model composed of correlated constructs was tested to identify potential problems in the measurement models and to generate a correlation matrix of the latent variables (see Table 2.3 for results). This model fit the data well according to criteria for relatively good fit recommended by Hu and Bentler (1999), but did not pass the χ² goodness of fit test (χ² = 115, p < .01). The R² for the indicators was well above .50 for all but one item (Anxiety Q1). The construct reliability exceeded .80 for all constructs except Anxiety (construct reliability equals .66) and Happiness (.79).

Table 2.3. Standardized loadings and construct reliability for correlated factors model.

| Construct | Item | Loading | T-value | R² | Construct Reliability |
|---|---|---|---|---|---|
| Self-competence | Q1 | .85 | 38 | .72 | .92 |
| | Q2 | .97 | 56 | .94 | |
| | Q3 | .84 | 36 | .70 | |
| Self-liking | Q4 | .85 | 49 | .72 | .85 |
| | Q5 | .82 | 43 | .68 | |
| | Q6 | .77 | 35 | .59 | |
| Negative Affect | Q1 | .78 | 44 | .60 | .91 |
| | Q2 | .90 | 57 | .81 | |
| | Q3 | .95 | 54 | .91 | |
| Anxiety | Q1 | .66 | 23 | .44 | .66 |
| | Q2 | .74 | 27 | .55 | |
| Happiness¹ | Q1 | .89 | - | .80 | .79 |

Note 1: All loadings were significant (p < .05).
Note 2: Model fit: Satorra-Bentler Scaled Chi-Square = 115 (p < .01), RMSEA = 0.03, 90% CI for RMSEA = (0.02; 0.04), CFI = 1.0, SRMR = 0.037.
¹ The error variance for happiness was set to (1 - reliability) and the loading was fixed to the square root of reliability. A reliability of .80 was used.

Examination of the latent variable correlations (see Table 2.4) indicated that all correlations were in the expected direction (e.g., both self esteem factors have negative correlations with negative affect and anxiety and positive correlations with happiness).
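The construct reliability coefficient attributed above to Gerbing and Anderson (1988) can be checked against the reported standardized loadings. The sketch below is illustrative only: the function name `construct_reliability` is an assumption, and the inputs are the rounded loadings printed in Tables 2.2 and 2.3, so the results agree with the reported values only to two decimal places.

```python
def construct_reliability(std_loadings):
    """Construct reliability from standardized loadings.

    Computes (sum of loadings)^2 divided by
    (sum of loadings)^2 + sum of (1 - loading^2).
    """
    loadings = list(std_loadings)
    if not loadings:
        raise ValueError("at least one standardized loading is required")
    squared_sum = sum(loadings) ** 2
    # Error variance of a standardized indicator is 1 - loading^2.
    error_variance = sum(1.0 - loading ** 2 for loading in loadings)
    return squared_sum / (squared_sum + error_variance)


# Rounded loadings as printed in Table 2.2 (Model 3) and Table 2.3 (Anxiety).
model3_self_competence = [0.86, 0.97, 0.83]
model3_self_liking = [0.88, 0.83, 0.73]
anxiety = [0.66, 0.74]

print(round(construct_reliability(model3_self_competence), 2))  # 0.92
print(round(construct_reliability(model3_self_liking), 2))      # 0.86
print(round(construct_reliability(anxiety), 2))                 # 0.66
```

These three values match the construct reliabilities reported for the Model 3 factors in Table 2.2 and for Anxiety in Table 2.3.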
The self-liking factor, however, had correlations that were substantially larger in magnitude than the corresponding correlations associated with the self-competence factor (see Table 2.4).

Table 2.4. Correlation matrix for latent variables from the correlated factors model.

| | COMP | LIKE | Negative Affect | Anxiety | Happiness |
|---|---|---|---|---|---|
| COMP | 1.0 | | | | |
| LIKE | .80 | 1.0 | | | |
| Negative Affect | -.21 | -.54 | 1.0 | | |
| Anxiety | -.08 | -.36 | .75 | 1.0 | |
| Happiness | .32 | .63 | -.75 | -.53 | 1.0 |

Note: All correlations are significant (p < .05).

Each of the outcome specific latent regression tables (i.e., Tables 2.5-2.7) contains the results for four separate regressions. Each row of the table contains the standardized coefficients, the R² indicating the amount of variance in the latent outcome explained, and the model fit statistics. The first row in each table is a regression using the 6-item one factor model of self-esteem proposed by Marsh (1996). The second row contains the results for a regression containing only the self-competence factor described in Model 3, and the third row contains the results for a regression containing only the self-liking factor described in Model 3. The last row in the table contains the results of a regression with both the self-competence and self-liking factors simultaneously predicting the outcome under investigation (i.e., negative affect, anxiety and happiness).

Table 2.5. Examining the relationship with Negative Affect.

| Model | B 1-Factor | B COMP | B LIKE | R² (%) | χ² (df) | p | RMSEA | CFI | SRMR |
|---|---|---|---|---|---|---|---|---|---|
| SE | -0.41 | - | - | 16.5 | 282 (26) | <.001 | .07 | .98 | .10 |
| COMP | - | -0.21 | - | 4.4 | 17 (8) | .03 | .03 | 1.0 | .04 |
| LIKE | - | - | -0.55 | 29.9 | 33 (8) | <.001 | .04 | 1.0 | .04 |
| COMP and LIKE | - | 0.64 | -1.06 | 44.2 | 60 (24) | <.001 | .03 | 1.0 | .04 |

Note: All loadings and standardized coefficients are significant (p < .05).

Table 2.6. Examining the relationship with Anxiety.

| Model | B original | B COMP | B LIKE | R² (%) | χ² (df) | p | RMSEA | CFI | SRMR |
|---|---|---|---|---|---|---|---|---|---|
| SE | -0.23 | - | - | 5.2 | 344 (19) | <.001 | .10 | .99 | .07 |
| COMP | - | -0.08 | - | .1 | 14 (4) | .007 | .04 | 1.0 | .02 |
| LIKE | - | - | -0.36 | 12.7 | 6.58 (4) | 0.16 | .02 | 1.0 | .01 |
| COMP and LIKE | - | 0.57 | -0.81 | 24.2 | 42 (17) | <.001 | .03 | 1.0 | .02 |

Note: All loadings and standardized coefficients are significant (p < .05).

Table 2.7. Examining the relationship with Happiness.

| Model | B original | B COMP | B LIKE | R² (%) | χ² (df) | p | RMSEA | CFI | SRMR |
|---|---|---|---|---|---|---|---|---|---|
| SE | .52 | - | - | 26.6 | 316 (14) | <.001 | .11 | .99 | .08 |
| COMP | - | .32 | - | 10.3 | 4.95 (2) | .08 | .03 | 1.0 | .02 |
| LIKE | - | - | .63 | 39.7 | 11.32 (2) | .003 | .05 | 1.0 | .02 |
| COMP and LIKE | - | -0.53 | 1.05 | 49.4 | 31 (12) | .002 | .03 | 1.0 | .03 |

Note: All loadings and standardized coefficients are significant (p < .05).

Two things stand out in the results of the regression analyses. The first is that the self-liking factor consistently explained substantially more variance in the outcomes than either the 1-factor self-esteem model or the self-competence factor model. The standardized loadings for each indicator in the regression models were examined and found to change very little from the standardized loadings reported in Table 2.3. Secondly, when both self-liking and self-competence were entered into the same model, the sign of the standardized coefficient for self-competence changed to a direction incompatible with existing theory (i.e., self-competence had a negative relationship with happiness and a positive relationship with negative affect and anxiety) and the R² for the model increased substantially. It is also worth noting that the same pattern of results was found employing non-SEM linear regression analyses using simple summary scores for each construct.

2.4 DISCUSSION

The nested CFAs presented in this study suggest that the RSES items included in the NPHS are best modeled as two correlated factors.
As in the conceptualization offered by Tafarodi and Swann (2001), the first factor appears to represent an assessment of self-competence. For example, responses to Q2 (You feel that you're a person of worth at least equal to others) could be interpreted as an assessment of one's competence in terms of financial resources while Q3 (You are able to do things as well as most other people) could be viewed as measuring the respondent's current ability to perform specific actions. Question 1 (You feel that you have a number of good qualities) is somewhat more difficult to interpret, though one could imagine that the phrase "number of good qualities" leads respondents to consciously review, rate and crudely count specific qualities they believe they possess. What appears to tie the items together is the requirement of the individual to assess or rate somewhat specific competencies (e.g., the number of good qualities, the ability to do things and their sense of worth). Furthermore, both Q2 and Q3 explicitly ask for the assessment to be made in comparison to others.

In contrast to the first factor, the second appears to represent the individual's overall internalized sense of self-liking or satisfaction. For example, both Q5 (On the whole you are satisfied with yourself) and Q6 (All in all, you're inclined to feel you're a failure) explicitly emphasize the global nature of the required ratings. While some researchers (e.g., Tafarodi and Swann, 2001) categorize responses to Q6 as an assessment of competency, the global nature of this internally oriented rating of feeling like a failure may be more driven by internal self-liking. Additional support for the distinctness of the self-liking factor is found in that none of the self-liking items explicitly asks for a rating in terms of an external comparison to others.
This interpretation is also supported in the conceptualization offered by Campbell (1990) in her work on self-concept clarity, where she suggests that it is important to distinguish between (a) outer self-esteem characterized by temporary feelings of self-regard that vary over situations, roles, feedback, events and the reflected appraisals of others and (b) inner or trait self-esteem that is characterized by global judgments of worthiness that appear to form relatively early during the course of development and remain stable over time.

The correlations between the self-competence and self-liking factors and happiness, negative affect and anxiety were examined to gather additional support for the uniqueness of the two factors (i.e., if self-liking and self-competence are distinct constructs, then they should have different relationships with other variables). Despite having 64 percent of variance in common, the factors had substantially different correlations with ratings of happiness, negative affect and anxiety (see Table 2.4). One could argue that Q6 of the self-liking factor could be considered an indicator of negative affect, anxiety and to a lesser extent happiness, and that this is why this factor is more strongly correlated with the aforementioned constructs. However, Q6 had the lowest standardized loading of the three items (.73) associated with the self-liking factor. This suggests that the factor interpretation should not be determined primarily by the meaning of this item but rather by a latent concept common to all three indicators (i.e., self-liking).

The last component of this study used SEM to conduct a series of latent variable regressions to compare the predictive power associated with using a single versus 2-dimensional representation of the RSES to predict negative affect, anxiety and happiness.
Evidence of increased predictive power was found for the 2-factor model in that the self-liking factor consistently explained substantially more variance in the outcomes than either the original 6-item 1-factor model or the self-competence factor alone (see Tables 2.5, 2.6 and 2.7). However, when both self-liking and self-competence were included in the regression model together, the size of the standardized coefficient for self-liking increased, the sign of the standardized coefficient for self-competence changed to a direction incompatible with existing theory and the overall model R² increased. This same pattern emerged in all three regressions as well as in parallel non-SEM linear regression analyses using simple summary scores and suggests the presence of a suppressor effect. Cohen and Cohen (1975) stated that suppression is generally indicated when a β₁ coefficient falls outside the limits defined by r_Y1 and 0. More specifically, they define net suppression as the situation whereby the correlation between the outcome and suppressor is less than the product of the correlation between the outcome and predictor multiplied by the correlation of the predictor and suppressor (assuming all correlations are positive). Additionally, if a β₁ is of opposite sign from its r_Y1, as was the case for self-competence, then this variable is serving as a net suppressor (Cohen & Cohen, 1975). Although the βs for self-liking in the regression models predicting happiness and negative affect both exceeded 1 in absolute value, Cohen and Cohen (1975) demonstrated that while r may never exceed the limits of +1 and -1, βs may do so under conditions of suppression. The results of the latent variable regressions meet all three of the aforementioned criteria and thus suggest that self-competence may function as a suppressor variable in the regression models.
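The suppression pattern described above can be illustrated directly from the latent correlations in Table 2.4, using the textbook two-predictor formulas for standardized coefficients. This is a sketch under stated assumptions, not the estimation procedure used in the study: the function names are hypothetical, and the inputs are the rounded correlations from Table 2.4, so the output only approximates the SEM estimates in Tables 2.5 to 2.7.

```python
def two_predictor_solution(r_y1, r_y2, r_12):
    """Standardized betas and R^2 for an outcome regressed on two predictors.

    r_y1, r_y2: correlations of the outcome with predictors 1 and 2.
    r_12: correlation between the two predictors.
    """
    if abs(r_12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    denominator = 1.0 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denominator
    beta_2 = (r_y2 - r_y1 * r_12) / denominator
    r_squared = beta_1 * r_y1 + beta_2 * r_y2
    return beta_1, beta_2, r_squared


def outside_zero_to_r(beta, r):
    """True when beta lies outside the interval bounded by 0 and r,
    the general indication of suppression cited from Cohen and Cohen (1975)."""
    low, high = min(0.0, r), max(0.0, r)
    return beta < low or beta > high


R_COMP_LIKE = 0.80  # COMP-LIKE correlation from Table 2.4

# (r with COMP, r with LIKE) for each outcome, from Table 2.4.
outcomes = {
    "Negative Affect": (-0.21, -0.54),
    "Anxiety": (-0.08, -0.36),
    "Happiness": (0.32, 0.63),
}

for name, (r_comp, r_like) in outcomes.items():
    b_comp, b_like, r2 = two_predictor_solution(r_comp, r_like, R_COMP_LIKE)
    flagged = outside_zero_to_r(b_comp, r_comp)
    print(f"{name}: B_COMP={b_comp:.2f}, B_LIKE={b_like:.2f}, "
          f"R2={r2:.2f}, COMP outside (0, r): {flagged}")
```

For happiness, for example, the sketch gives coefficients of roughly -0.51 for self-competence and 1.04 for self-liking with an R² near .49, close to the -0.53, 1.05 and 49.4 percent reported in Table 2.7. In each case the self-competence coefficient takes the opposite sign from its zero-order correlation.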
In spite of its positive correlation with the outcome, the function of a suppressor in the multiple regression model is primarily in suppressing a portion of the variance of the other predictor that is irrelevant to (uncorrelated with) the outcome (Cohen & Cohen, 1975). While the negative partial coefficients will always be associated with the initially smaller validity coefficient, the phenomenon is symmetrical and can be interpreted both ways (e.g., X1 may suppress X2 or X2 may suppress X1). The decision regarding which to interpret should therefore be based on theory. In this study, it seems reasonable to conclude that the self-competence factor is suppressing irrelevant variance in the self-liking factor when predicting the outcome (i.e., higher scores on the self-liking factor imply higher levels of happiness and lower levels of anxiety and negative affect).

One explanation for this suppressor effect is that self-competence and self-liking are stable, correlated but unique traits representing two different facets of self-esteem. When examining the relationship between self-esteem and depression, self-liking is a stronger predictor than self-competence in terms of the variance explained. However, when both self-competence and self-liking are modeled together, self-competence suppresses variance in common with self-liking that is irrelevant to the prediction of depression. This increases the predictive power of the model. A hypothetical scenario producing such an effect might be that self-competence and self-liking are both determined to some extent by early childhood experiences. Depression and self-liking are both determined in part by a common biological predisposition that is also established in early childhood. Use of self-liking alone in a regression model predicting depression yields a sizable amount of variance explained because self-liking and depression have a biological connection.
However, when self-competence is introduced into the model, it suppresses the common variance it shares with self-liking that was due to early childhood experiences because the variance associated with early childhood experience in self-liking is not related to depression.

Another explanation for the functioning of self-competence as a suppressor comes from recent work by Clark, Vittengl, Kraft, and Jarrett (2003) examining the relationship between personality measures and depression. They concluded that many personality assessments tap both state affect and trait specific variance. With regard to the specific relationship between personality assessments and depression, Clark et al. reported that state-affect variance masked trait related variance in depressed patients. Although the research of Clark et al. did not directly address the concept of self-esteem, previous research has suggested that the concept of self-esteem is strongly related to, and may to some extent be subsumed within, the Big Five personality framework (i.e., extraversion, agreeableness, conscientiousness, neuroticism and openness), particularly those traits with a strong affective component such as extraversion and neuroticism (Judge et al., 2002; Robins, Tracy, & Trzesniewski, 2001). In this scenario, self-competence might be considered more of a labile state that suppresses common state-related variance in self-liking that is unrelated to depression. In this way the remaining trait-related variance in self-liking that predicts depression gains predictive power.

2.5 CONCLUSION

The nested CFAs presented in this study suggest that the RSES items included in the NPHS are best modeled as two correlated dimensions. The first dimension appears to represent a measure of self-competence while the second is interpreted as a measure of self-liking.
Subsequent tests of predictive power and discriminant validity supported this two factor interpretation in that the two factors had substantially different relationships with measures of happiness, negative affect and anxiety. In addition to differing correlations, latent variable regressions indicated that self-competence consistently suppressed some variance in self-liking when modeled to predict happiness, negative affect and anxiety. Although suppressor effects are rarely published, Maassen and Bakker (2001) demonstrated, by way of example, that the probability of their occurrence is relatively high in latent variable models when the suppressed variable is corrected for measurement error. From a theoretical perspective, Collins and Schmidt (1997) found clear support in multiple studies involving personality domains for the existence of stable and consistent suppressors. They concluded that suppressor variables not only enhanced prediction but that our understanding of certain concepts might improve by examining which invalid variance components are being partialed out.

Given that the results of this study are preliminary and have yet to be replicated, self-esteem researchers (particularly those examining the relationship between self-esteem and mental well-being) are advised to consider testing the performance of a 2-factor model defined by self-liking and self-competence. Although there was no evidence of multicollinearity (i.e., variance inflation factors < 2.0) in the parallel non-SEM linear regression models estimated in this investigation, the high correlation between the latent factors suggests that multicollinearity may influence their usefulness when predicting outcomes other than those examined in this study. In situations where multicollinearity is suspected, the researcher is advised to model the factors separately rather than combine them into a single global self-esteem measure.
Although the confirmatory factor analyses and tests of discriminant validity provide clear support for a 2-dimensional structure for the 6 items of the RSES included in the NPHS, it is important to note that the results do not necessarily apply to the full version (10 items) of the scale. Further research is needed to replicate the CFA results and, more importantly, to confirm the existence of the suppressor phenomenon. If confirmed, the accurate description and modeling of the suppressor effect presented in this study has the potential to enhance our understanding of the underlying processes driving the relationship between self-esteem and depression.

2.6 REFERENCE LIST

Blascovich, J., & Tomaka, J. (1991). Measures of self-esteem. In J. P. Robinson, P. R. Shaver, & L. S. Wrightsman (Eds.), Measures of personality and social psychological attitudes (pp. 115-160). New York: Academic.

Beck, A. T. (1967). Depression: Causes and treatment. Philadelphia: University of Pennsylvania Press.

Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.

Brown, G. W., Andrews, B., & Bifulco, A. T. (1990). Self esteem and depression: I. Measurement issues and prediction of onset. Social Psychiatry & Psychiatric Epidemiology, 25, 200-209.

Campbell, J. D. (1990). Self-esteem and clarity of self-concept. Journal of Personality and Social Psychology, 59, 538-549.

Cheng, H., & Furnham, A. (2003). Personality, self-esteem, and demographic predictions of happiness and depression. Personality & Individual Differences, 34, 921-942.

Clark, L. A., & Watson, D. (1991). Tripartite model of anxiety and depression: Psychometric evidence and taxonomic implications. Journal of Abnormal Psychology, 100, 316-336.

Clark, L. A., Watson, D., & Mineka, S. (1994). Temperament, personality, and the mood and anxiety disorders. Journal of Abnormal Psychology, 103, 103-116.

Clark, L., Vittengl, J., Kraft, D., & Jarrett, R. B. (2003). Separate personality traits from states to predict depression. Journal of Personality Disorders, 17, 152-172.

Cohen, J., & Cohen, P. (1975). Applied multiple regression/correlation analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum Associates.

Corwyn, R. F. (2000). The factor structure of global self-esteem among adolescents and adults. Journal of Research in Personality, 34, 357-379.

DeNeve, K. M., & Cooper, H. (1998). The happy personality: A meta-analysis of 137 personality traits and subjective well-being. Psychological Bulletin, 124, 197-229.

Diener, E. (1984). Subjective well-being. Psychological Bulletin, 95, 542-575.

Enns, M. W., & Cox, B. J. (1997). Personality dimensions and depression: Review and commentary. Canadian Journal of Psychiatry, 42, 274-284.

Feeny, D., Furlong, W., Barr, R. D., Torrance, G. W., Rosenbaum, P., & Weitzman, S. (1992). A comprehensive multi-attribute system for classifying the health status of survivors of childhood cancer. Journal of Clinical Oncology, 10, 923-928.

Gerbing, D. W., & Anderson, J. C. (1988). An updated paradigm for scale development incorporating unidimensionality and its assessment. Journal of Marketing Research, XXV, 2, 186-192.

Hayduk, L. A., Ratner, P. A., Johnson, J. L., & Bottorff, J. L. (1995). Attitudes, ideology, and the factor model. Political Psychology, 16, 479-507.

Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.

Hunter, J. E., & Gerbing, D. W. (1982). Unidimensional measurement, second order factor analysis, and causal models. Research in Organizational Behavior, 4, 267-320.

Joiner, T. E., & Lonigan, C. J. (2000). Tripartite model of depression and anxiety in youth psychiatric inpatients: Relations with diagnostic status and future symptoms. Journal of Clinical Child Psychology, 3, 372-382.

Joreskog, K. G. (1971). Statistical analysis of sets of congeneric tests. Psychometrika, 36, 109-133.

Joreskog, K. G. (2001a). Analysis of ordinal variables 1: A preliminary analysis. http://www.ssicentral.com/home.htm: Scientific Software International.

Joreskog, K. G. (2001b). Analysis of ordinal variables 2: Cross-sectional data. http://www.ssicentral.com/home.htm: Scientific Software International.

Judge, T. A., Erez, A., & Bono, J. E. (2002). Are measures of self-esteem, neuroticism, locus of control, and generalized self-efficacy indicators of a common core construct? Journal of Personality & Social Psychology, 83, 693-710.

Kaplan, D., & Ferguson, A. J. (1999). On the utilization of sample weights in latent variable models. Structural Equation Modeling, 6, 305-321.

Korn, E. L., & Graubard, B. I. (1995). Examples of differing weighted and unweighted estimates from a sample survey. The American Statistician, 49, 291-295.

LISREL. (2003). Version 8.54. Chicago, IL: Scientific Software International.

Maassen, G. H., & Bakker, A. B. (2001). Suppressor variables in path models: Definitions and interpretations. Sociological Methods & Research, 30, 241-270.

Marsh, H. W. (1996). Positive and negative global self-esteem: A substantively meaningful distinction or artifactors? Journal of Personality & Social Psychology, 70, 810-819.

Miller, P. M., Kreitman, N. B., & Ingham, J. G. (1989). Self-esteem, life stress and psychiatric disorder. Journal of Affective Disorders, 17, 65-75.

Mruk, C. J. (1999). Self-esteem: Research, theory and practice (2nd ed.). New York, NY: Springer Publishing Company.

Nezlek, J. B., & Plesko, R. M. (2001). Day-to-day relationships among self-concept clarity, self-esteem, daily events, and mood. Personality and Social Psychology Bulletin, 27, 201-211.

Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.

Pearlin, L. I., & Schooler, C. (1978). The structure of coping. Journal of Health & Social Behavior, 19, 2-21.

Roberts, S. B., & Kendler, K. S. (1999). Neuroticism and self-esteem as indices of the vulnerability to major depression in women. Psychological Medicine, 29, 1101-1109.

Robins, R. W., Tracy, J. L., & Trzesniewski, K. (2001). Personality correlates of self-esteem. Journal of Research in Personality, 35, 463-482.

Rosenberg, M. (1965). Society and the adolescent self-image. Princeton, NJ: University Press.

Rubio, D. M., Berg-Weger, M., & Tebb, S. S. (2003). Using structural equation modeling to test for multidimensionality. Structural Equation Modeling, 8, 613-626.

Schmitz, N., Kugler, J., & Rollnik, J. (2003). On the relation between neuroticism, self-esteem, and depression: Results from the National Comorbidity Survey. Comprehensive Psychiatry, 44, 169-176.

Statistics Canada. (1995). National Population Health Survey, 1994-1995 documentation. Ottawa, ONT: Statistics Canada.

Tafarodi, R. W., & Swann, W. B., Jr. (1995). Self-liking and self-competence as dimensions of global self-esteem: Initial validation of a measure. Journal of Personality Assessment, 65, 322-342.

Tafarodi, R. W., & Swann, W. B. (2001). Two-dimensional self-esteem: Theory and measurement. Personality & Individual Differences, 31, 653-673.

Winship, C., & Radbill, L. (1994). Sampling weights and regression analysis. Sociological Methods & Research, 23, 230-257.

CHAPTER 3

A TEST OF THE AGE-BASED MEASUREMENT INVARIANCE, TEMPORAL STABILITY AND MODERATING ABILITY OF ANTONOVSKY'S SENSE OF COHERENCE SCALE

3.1 INTRODUCTION

Over the past 25 years a vast amount of research has been devoted to the investigation of the role of individual psychosocial characteristics in the relationship between stress and health. Early studies focused primarily on the effect of differential exposure to stress.
However, researchers soon realized that substantial gains in explaining the stress-health relationship could be had by taking into account systematic differences in stress impact (i.e., differential vulnerability) (Kessler, 1979). One example of a theory that incorporates psychosocial personality characteristics to explain differential vulnerability to stress is Kobasa's concept of a hardy personality (Kobasa, 1979). According to Kobasa (1979), an individual with a hardy personality possesses a stronger commitment to self, an attitude of vigorousness toward the environment, a sense of meaningfulness and an internal locus of control, all of which buffer the impact of stressful events on health.

At about the same time as the concept of a hardy personality was gaining acceptance, Antonovsky presented his Salutogenic Model of Health as a means of explaining differential vulnerabilities to stress (Antonovsky, 1979, 1987). The central feature of the Salutogenic Model is the concept of sense of coherence (SOC), which Antonovsky defined as a global orientation that expresses the extent to which one has a pervasive, enduring though dynamic feeling of confidence that one's internal and external environments are predictable and that there is a high probability that things will work out as well as can be expected (Antonovsky, 1979). Similar to the concept of hardiness, SOC is postulated to buffer the impact of stressful events on the health of individuals (Antonovsky, 1987). The concept of SOC remains popular to this day and was selected for inclusion in Canada's longitudinal National Population Health Survey (NPHS), which was developed to provide an ongoing assessment of the health status of the general population and the multitude of factors that influence health.

The goal of this paper is to present the findings of two studies focused on testing the theory behind Antonovsky's Sense of Coherence Scale (SOC) using longitudinal data from the NPHS.
The paper begins with a brief review of Antonovsky's theory of salutogenesis and the concept of SOC, including a discussion of the evidence on the reliability and validity of the 13-item version of the SOC scale. The concept of measurement invariance is then introduced as an additional, though less commonly encountered, aspect of validity that must be demonstrated before researchers can confidently interpret and compare scores on the SOC across different age groups. In Study 1 of the investigation, multi-group longitudinal factor analysis is used to test the measurement invariance across age and stability over time of the SOC using data from three age groups of respondents to the 1994-1995 and 1998-1999 NPHS. Building on the results of the first study, Study 2 uses a series of regression models to test the ability of SOC to moderate (i.e., buffer) the health impacts of stressful life events using data from the 1998-1999 and 2000-2001 NPHS surveys. A general discussion examines the implications of both studies and identifies areas for future research.

Conceptual Definition of Sense of Coherence

Prominent among protective theories, Antonovsky's concept of SOC has been the subject of a great deal of research focused on understanding the relationships among stressors, coping and health (Antonovsky, 1993). Contrary to traditional pathogenic orientations, SOC emerged as part of a broader theory of salutogenesis, which attempts to predict and explain movement toward the healthy end of the health ease/dis-ease continuum. This salutogenic approach to the study of health is based on a conception of the world as a place in which stressors are endemic, ubiquitous and open ended (i.e., positive or negative) in regard to their consequences (Antonovsky, 1996).
Within such a paradigm, the focus of investigation shifts from the identification and investigation of particular stressors (e.g., work stress) to the study of more general salutary factors (i.e., capabilities) that enable individuals to adapt to the endless and diverse array of challenges encountered during the course of living.

Broader than a specific coping mechanism, the concept of SOC is theorized as representing a higher level distillation of the common components underlying particular coping or resistance resources (e.g., social support, educational background and financial resources) (Antonovsky, 1996). In 1987, Antonovsky redefined the concept of SOC as:

A global orientation that expresses the extent to which one has a pervasive, enduring though dynamic feeling of confidence that (1) the stimuli, deriving from one's internal and external environments in the course of living are structured, predictable and explicable; (2) the resources are available to one to meet the demands posed by these stimuli; and (3) these demands are challenges, worthy of investment and engagement (p. 19).

Antonovsky (1996) labeled these three components of SOC as comprehensibility (i.e., the extent of the belief that the problem faced by the individual is clear), manageability (i.e., the extent of the belief that not only does one understand the problem, but that the necessary resources to successfully cope with the problem are available) and meaningfulness (i.e., the extent of the belief that coping "makes sense" emotionally, that one wishes to cope). Despite identifying three distinct components, Antonovsky (1996) theorized that SOC represents a single higher order generalized orientation that is universally meaningful and cross-culturally valid. In Antonovsky's (1993) own words:

The SOC is a construct which is universally meaningful, one which cuts across lines of gender, social class, region and culture.
It does not refer to a specific type of coping strategy, but to factors which, in all cultures, always are the basis for successful coping with stressors (p. 726).

SOC Development and Mechanism of Action

The concept of SOC has been conceived of as a stable generalized orientation in relation to perceiving and controlling the environment for meaningful and appropriate action (Kivimaki, Feldt, Vahtera, & Nurmi, 2000). According to Antonovsky (1987), an individual's SOC develops through childhood into mid-to-late adolescence. Throughout this time of development, Antonovsky speculated that the individual is repeatedly exposed to tension states requiring that they actively respond to a wide variety of stressors by mobilizing appropriate available resources labeled generalized resistance resources. In addition to facilitating the selection of culturally appropriate and situationally efficacious coping resources and behaviour, Antonovsky speculated that a person's SOC increases the degree to which tension-states are perceived as comprehensible, manageable and meaningful, thereby reducing the potential for psychological distress (Antonovsky, 1987, 1996). Over time, the repeated and healthy management of tension events is thought to reinforce the individual's SOC (e.g., the individual might perceive the world as more manageable) and increase the coping resources available to the individual in the future (i.e., part of the process of successfully dealing with a stressor often involves the development of new coping skills that are added to the set of available resistance resources). Antonovsky (1987) further theorized that by about the end of an individual's third decade of life, he or she will have been exposed to a sufficiently long and consistent pattern of life experiences that their SOC becomes a stable dispositional orientation. In Antonovsky's (1987) own words: The lability and exploration of the adolescent are behind us (p. 118).
Although Antonovsky speculated that a person's SOC may change by up to 10% in the presence of challenging life events, he maintained that it typically returns to its previous normal level upon resolution of the tension-state (Antonovsky, personal communication, 1993, cited in Karlsson, Berglin, & Larsson, 2000).

The National Population Health Survey and Antonovsky's Sense of Coherence

At about the same time that Antonovsky's theory of salutogenesis was gaining widespread exposure (see Antonovsky (1993) for a review of studies using SOC), Statistics Canada received funding for the development of the NPHS (Statistics Canada, 1995). In addition to assessing the level, trend and distribution of the health status of the Canadian population, the NPHS was to provide longitudinal data from a panel of people who were followed over time to reflect the dynamic process of health and illness (Statistics Canada, 1995). In addition to a wide variety of measures of health and health determinants, Cycle 1 (1994-1995) and Cycle 3 (1998-1999) of the NPHS included the 13-item version of Antonovsky's Sense of Coherence Scale. The inclusion of the SOC in two NPHS cycles provides an excellent opportunity to test Antonovsky's theory of salutogenesis and to investigate the role SOC plays in the maintenance of individual health status in such a way as to inform future policy decisions. For example, if it can be shown that SOC buffers the health impact of certain stressors (e.g., losing one's job), then interventions focused on enhancing the SOC of at-risk individuals may be worth pursuing. However, if SOC is not malleable beyond age 30, then these interventions would have to take place early in an individual's lifetime (e.g., in elementary or high school instead of as a component of a job-retraining initiative at the age of 50).
Prior to interpreting scores on a measure such as the SOC, it is necessary to establish the reliability and validity of the SOC in the population under investigation. Tests of the internal consistency of the 13-item version of the SOC yielded Cronbach's alphas from 0.74 to 0.91 in 16 studies reviewed by Antonovsky (1993). More recent studies have further confirmed the reliability of the SOC. For example, Feldt, Leskinen, Kinnunen, and Ruoppila (2003) assessed the SOC of a sample of Finnish technical designers in 1992 and in 1997 and reported Cronbach's alphas of 0.82 and 0.87 for respondents aged 25 to 29 years (n = 210) and 0.80 and 0.86 for respondents aged 35 to 40 years (n = 285). With regard to the NPHS dataset, Hood, Beaudet, and Catlin (1996) reported a Cronbach's alpha of 0.83. Based on these findings it can be concluded that the 13-item SOC meets or exceeds the minimum reliability requirement of 0.80 recommended by Nunnally (1994). Numerous studies employing confirmatory factor analysis have demonstrated that the second order factor model of the 13-item SOC implied by Antonovsky's theory of salutogenesis is an appropriate representation. For example, investigations by Feldt et al. (2003), Feldt, Leskinen, Kinnunen and Mauno (2000), Feldt and Rasku (1998), Gana and Garnier (2001), and Kivimaki, Feldt, Vahtera and Nurmi (2000) all support the use of a second order factor model. The study by Feldt and Rasku (1998) provides particularly well documented support with data from a Finnish sample of 989 technical designers (mean age 39.4 years) and 1012 teachers (aged 45 to 59 years). Although these studies confirmed the second order factor structure of the SOC, the investigations of Feldt et al.
(2003), Feldt et al. (2000), Feldt and Rasku (1998), and Gana and Garnier (2001) all reported sizable and significant improvements in model fit associated with freeing the correlation between the measurement errors of item 2 (Comprehensibility: How often in the past were you surprised by the behaviour of people whom you thought you knew well?) and item 3 (Manageability: How often have people you counted on disappointed you?). Although theoretical explanations for the correlated errors have been limited, Frenz, Carey, and Jorgensen (1993) postulated that the factor formed by the correlation between these two items was a manifestation of interpersonal trust, though this interpretation of the error covariance has yet to be validated. The validity of the SOC scale, in terms of the theoretical relationships predicted by Antonovsky's theory of salutogenesis, also has been supported. For example, in an investigation of health-protective factors in older adults, Lutgendorf, Vitaliano, Tripp-Reimer, Harvey, and Lubaroff (1999) found that SOC moderated the relationship between being relocated to independent congregate living facilities (yes or no) and natural killer cell activity. In a cross-sectional study examining the moderating effect of SOC on the relationship between psychosocial work environment and stress symptoms in 2053 Danish employees, Albertsen, Nielsen, and Borg (2001) reported a significant moderating effect of SOC. However, the effect was significant in only 5 of the 25 environmental factor-stress symptom relationships investigated (Albertsen et al., 2001). Moderate support was also found by Feldt (1997), who reported that employees with a strong SOC appeared to be better protected from the adverse psychological effects of certain work conditions than employees with a weak SOC.
Additional support for the salutogenic effect of SOC was reported by Jorgensen, Frankowski, and Carey (1999), who employed both cross-sectional and prospective longitudinal assessments to test the moderating effect of SOC on the relationship between appraisals of negative life events and the frequency of somatic maladies in university undergraduates. They reported finding strong support for the moderating effect of SOC in that students with a strong SOC showed no link between appraisals of negative life events and frequency of somatic maladies, whereas a strong relationship was found in those with a low SOC (Jorgensen et al., 1999). Further support for the moderating role of SOC was found by Gana (2001), who used multigroup structural equation modeling (groups defined by a high versus low SOC) to test for differences in the relationship between adversity (measured by indicators of anxiety, stress and worry) and life satisfaction. The path from adversity to satisfaction was significant in the weak SOC group ($\beta = -0.48$) but non-significant in the strong SOC group ($\beta = -0.11$), suggesting that SOC was moderating the relationship (Gana, 2001). A summary of other more dated studies that provide further evidence of the significant relationships between SOC and health can be found in the review completed by Antonovsky (1993). In addition to investigations of the reliability, psychometric structure and tests of relationships with health outcomes, researchers have examined the stability of the SOC scale. The trait-like nature of SOC is central to the theory of salutogenesis, which is founded on the belief that SOC is a stable dispositional quality that serves to promote health and well-being in the face of stress and disease. Antonovsky's explicit hypothesis that SOC is a stable disposition beyond age 30 has been examined by several researchers using a wide range of time intervals.
For example, Frenz, Carey and Jorgensen (1993) reported test-retest reliabilities (T-RT) of .92 over a 1-week interval in a group of undergraduates (n = 171) and .93 over an interval ranging from 7 to 30 days in a group of social service employees (n = 36). As part of a longitudinal study on job insecurity and well-being, Feldt et al. (2000) used longitudinal confirmatory factor analysis to examine the stability of SOC in 4 groups of Finnish employees (n = 219). They found that SOC at time 1 explained 52% of the variation in SOC at time 2, one year later. Feldt et al. (2000) also examined changes in the latent means in the meaningfulness, comprehensibility and manageability factors and found no significant changes. The estimated mean changes with the corresponding t-values for the factors were 0.05 (t = 0.62) for the meaningfulness factor, 0.06 (t = 0.62) for the comprehensibility factor and 0.12 (t = 1.75) for the manageability factor. Most recently, Smith, Breslin and Beaton (2003) examined the stability of the SOC scale using the 1994-1995 and 1998-1999 NPHS datasets. They found that 35.4% of the population reported changes beyond a minimally detectable amount of change of 10.22 points (i.e., the amount of change needed before one could be 95% confident that the change between time points was not due to random variability) and that 58% reported change greater than 10% (the maximum expected amount of temporary change in SOC). The test-retest correlation over 4 years for the working population aged 18 to 30 years was 0.42 (n = 1904). For the working population aged 30 to 64 years, the test-retest correlation was 0.45 (n = 4886). Smith, Breslin and Beaton (2003) went on to conclude that the SOC had a large state component, and given this apparent lack of stability, they recommended caution when using the SOC as a stable global orientation within a causal context.
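To put the stability figures quoted above on a common footing, the short sketch below converts between a correlation and the proportion of variance it accounts for. It is illustrative arithmetic on the reported values only, not a reanalysis, and the function names are hypothetical.

```python
import math

def variance_explained(r):
    # Proportion of variance shared by two measures with correlation r.
    return r ** 2

def implied_correlation(r_squared):
    # Magnitude of the correlation implied by a proportion of explained variance.
    return math.sqrt(r_squared)

# Feldt et al. (2000): time-1 SOC explained 52% of the variation in time-2 SOC.
feldt_stability = implied_correlation(0.52)

# Smith, Breslin and Beaton (2003): 4-year test-retest correlations.
younger_r2 = variance_explained(0.42)  # working population aged 18 to 30
older_r2 = variance_explained(0.45)    # working population aged 30 to 64

print(round(feldt_stability, 2), round(younger_r2, 2), round(older_r2, 2))
```

The two sets of figures are not strictly comparable: the Feldt et al. estimate comes from a latent variable model over one year, whereas the NPHS correlations are based on observed scale scores over four years, so part of the gap may reflect measurement error and the longer interval.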
Studies of Antonovsky's SOC scale have confirmed its reliability and construct validity in terms of psychometric structure and postulated theoretical relationships with stress and health in a variety of general populations including Canadians. However, as far as this author is aware, the measurement invariance of the 13-item SOC across groups defined by age has yet to be explicitly investigated. Measurement invariance refers to \"whether or not, under different conditions of observing and studying phenomena, measurement operations yield measures of the same attribute\" (Horn & McArdle, 1992, p. 117). In the context of this investigation, the different conditions are defined by assessments on the same respondents at different times (e.g., responses in 1994-1995 versus those 4 years later in the 1998-1999 NPHS) and different respondents belonging to different age groups (e.g., responses from individuals 19 to 25 years versus 60 or more years of age). The critical issue when deciding if it is necessary to test for measurement invariance is whether the variable(s) used to define the groups (i.e., age) are either directly or indirectly related to the indicator variables associated with the construct being measured (Kaplan, 1998). Given the direct relationship between age and SOC (e.g., an individual's SOC is under continuous development until age 30, after which it putatively becomes a stable orientation), without evidence of measurement invariance, conclusions based on comparisons of SOC scores across time in the same individual or across age groups of different individuals may be erroneous because the researcher does not know if group differences are due to true SOC differences or to different psychometric responses to scale items (Cheung & Rensvold, 2002).
More specifically, if the researcher is unable to establish measurement invariance (i.e., the psychometric properties of responses to scale items cannot be shown to be equivalent across groups), then the researcher does not know if differences in SOC means are due to true differences between age groups on the underlying construct or if the differences are due to systematic biases in the way people from different age groups respond to certain items (Steenkamp & Baumgartner, 1998). Differences in relationships between scale scores (e.g., test-retest correlations of SOC scores from different age groups) could indicate real differences in structural relations between constructs or scaling artifacts, differences in scale reliability, or in the most extreme case actual nonequivalence of the constructs being assessed (Steenkamp & Baumgartner, 1998). Furthermore, this possibility applies to all results including those which find that there is no effect (e.g., when no change is found in SOC scores over time in persons over the age of 30 years) (Steenkamp & Baumgartner, 1998). Some researchers have even gone so far as to conclude that without evidence of measurement invariance, the conclusions of a study must be regarded as weak (Horn, 1991, p. 119). Given the limited discussion of measurement invariance in the health literature, a brief review of the methodology employed to test for measurement invariance (i.e., confirmatory factor analysis) followed by a description of the various degrees of invariance and their associated statistical implications are presented next. Although this summary is taken from the widely cited conceptual framework proposed by Steenkamp and Baumgartner (1998), embedded in the review are examples that demonstrate the particular relevance of each level of age-based measurement invariance to the interpretation of SOC scores in the context of the NPHS and its objectives.
Use of Confirmatory Factor Analysis to Establish Measurement Invariance Across Groups (Modified from Steenkamp and Baumgartner, 1998)

Of the different techniques available to assess different levels of measurement invariance, the use of multi-group confirmatory factor analysis models (Joreskog, 1971) represents one of the most powerful and versatile approaches (Steenkamp & Baumgartner, 1998). In a typical confirmatory factor analysis model, the observed response $x_i$ to an item $i$ ($i = 1, \ldots, p$) is represented as a linear function of a latent construct $\xi_j$ (xi or ksi) ($j = 1, \ldots, m$), an intercept $\tau_i$ (tau) and a stochastic error term $\delta_i$ (delta). Thus,

$$x_i = \tau_i + \lambda_{ij}\xi_j + \delta_i, \tag{1}$$

where $\lambda_{ij}$ (lambda) is the slope of the regression of $x_i$ on $\xi_j$. The slope coefficient, or unstandardized factor loading, defines the metric of measurement, as it shows the amount of change in $x_i$ due to a unit change in $\xi_j$. The intercept $\tau_i$, in contrast, indicates the expected value of $x_i$ when the latent construct ($\xi_j$) equals 0 (cf. Sorbom, 1974).

Assuming $p$ items and $m$ latent variables, and specifying the same factor structure in each group $g$ ($g = 1, \ldots, G$), we get the following measurement model:

$$x^g = \tau^g + \Lambda^g \xi^g + \delta^g \tag{2}$$

where $x^g$ is a p-by-1 vector of observed variables (in group $g$), $\xi^g$ is an m-by-1 vector of latent variables, $\delta^g$ is a p-by-1 vector of errors of measurement, $\tau^g$ is a p-by-1 vector of item intercepts, and $\Lambda^g$ is a p-by-m matrix of factor loadings. It is assumed that $E(\delta^g) = 0$ and that $\mathrm{Cov}(\xi^g, \delta^g) = 0$. Equation 2 shows that observed scores on the $p$ items are a function of underlying factor scores, but that observed scores may not be comparable across groups because of different intercepts ($\tau_i^g$) and scale metrics ($\lambda_{ij}^g$). To identify the model, the latent constructs have to be assigned a scale in which they are measured.
In multi-group analysis this is done by setting the factor loading of one item per factor to 1.0; the identification problem should not be solved by standardizing the variances of the $\xi_j$ (Cudeck, 1989). Items for which the $\lambda$'s (i.e., loadings) are fixed at 1.0 are referred to as marker (or reference) items. The same item(s) should be used as marker item(s) in each age group. Taking expectations of Equation 2 yields the following relation between the observed item means and the latent means:

$$\mu^g = \tau^g + \Lambda^g \kappa^g \tag{3}$$

where $\mu^g$ is the p-by-1 vector of item means and $\kappa^g$ is the m-by-1 vector of latent means (i.e., the means of $\xi^g$). Unfortunately, $\kappa^g$ and $\tau^g$ cannot be identified simultaneously (Sorbom, 1982). The addition of any constant $c$ to $\kappa_j^g$ can be compensated for by subtracting $c\lambda_{ij}^g$ from $\tau_i^g$. In other words, there is no definite origin for the latent variables. To eliminate this indeterminacy, specific constraints on the parameters are necessary. One solution is to fix the intercept of each latent variable's marker item to zero in each group. This equates the means of the latent variables to the means of their marker variables (because $\lambda = 1.0$). A second method, and the one used in this study, is to fix the vector of latent means at zero in the reference group (i.e., $\kappa^{\mathrm{ref}} = 0$) and to constrain one intercept per factor to be invariant across groups (this must be done for an item whose factor loading is invariant across groups). The latent means in the other groups are then estimated and interpreted relative to the latent means in the reference group. In addition to the mean structure defined by Equation 3, it is necessary to specify the covariance structure.
Following standard notation, the variance-covariance matrix of $x$ in group $g$, $\Sigma^g$, is given by:

$$\Sigma^g = \Lambda^g \Phi^g \Lambda^{g\prime} + \Theta^g \tag{4}$$

where $\Phi^g$ is the variance-covariance matrix of the latent variables $\xi^g$ (xi or ksi) and $\Theta^g$ is the variance-covariance matrix of $\delta^g$ (typically constrained to be a diagonal matrix unless correlated error terms are modeled). The overall fit of the confirmatory factor analysis model is based on the discrepancy between the observed variance-covariance matrices $S^g$ and the implied variance-covariance matrices, and the discrepancy between the observed vectors of means and implied vectors of means. The reader is referred to Sorbom (1974) for mathematical details.

Application of Multi-group CFA to Assess Levels of Measurement Invariance

Configural Invariance

The configural invariance approach is based on the principle of simple structure (Horn, McArdle, & Mason, 1983). In terms of factorial invariance, this implies that the items comprising the measurement instrument should exhibit the same configuration of salient (nonzero) and nonsalient (zero or near zero) factor loadings across different groups (Horn & McArdle, 1992); in other words, the same items should load onto the same latent factors across groups. If the same items do not load onto the same factors across groups, then the constructs being assessed should not be treated as equivalent. For example, in the context of the SOC scale, there may be items that are unrelated to their theorized factor in younger age groups because of a lack of related life experiences from which to draw upon when answering the question (e.g., Q2. How often in the past were you surprised by the behaviour of people whom you thought you knew well?).
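The mean and covariance structures in Equations 3 and 4, and the indeterminacy of the latent mean origin noted with Equation 3, can be illustrated numerically. The following is a minimal sketch for a single group with one factor and three items; all parameter values and function names are hypothetical and chosen only for illustration, not taken from the NPHS analyses.

```python
import numpy as np

# Hypothetical one-factor, three-item model for a single group.
Lambda = np.array([[1.0], [0.8], [1.2]])  # loadings; first item is the marker (fixed at 1.0)
tau = np.array([4.0, 3.5, 5.0])           # item intercepts
kappa = np.array([0.5])                   # latent mean
Phi = np.array([[2.0]])                   # latent variance
Theta = np.diag([1.0, 1.5, 0.9])          # diagonal error variance matrix

def implied_means(tau, Lambda, kappa):
    # Equation 3: mu = tau + Lambda kappa
    return tau + Lambda @ kappa

def implied_cov(Lambda, Phi, Theta):
    # Equation 4: Sigma = Lambda Phi Lambda' + Theta
    return Lambda @ Phi @ Lambda.T + Theta

mu = implied_means(tau, Lambda, kappa)
Sigma = implied_cov(Lambda, Phi, Theta)

# Indeterminacy: adding c to the latent mean and subtracting c * lambda
# from each intercept leaves the implied item means unchanged.
c = 3.0
mu_shifted = implied_means(tau - c * Lambda[:, 0], Lambda, kappa + c)

print(mu)
print(Sigma)
print(np.allclose(mu, mu_shifted))
```

Because the two parameterizations reproduce identical item means, the data alone cannot distinguish them, which is why a constraint such as fixing the latent means to zero in a reference group is required.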
Metric Invariance

Configural invariance does not necessarily indicate that people in different groups respond to the items in the same way, in the sense that obtained ratings can be meaningfully compared across groups. Metric invariance is more strict than configural invariance in that it introduces the concept of equal metrics or scale intervals across groups (Rock, Werts, & Flaugher, 1978). If an item satisfies the requirement of metric invariance, difference scores on the item can be meaningfully compared across groups, and these observed item differences are indicative of similar cross-group differences in the underlying construct. Since the factor loadings carry the information about how changes in latent scores relate to changes in observed scores, metric invariance can be tested by constraining the loadings to be the same across groups. Relating back to the factor analysis model described earlier, this implies that:

$$\Lambda^1 = \Lambda^2 = \cdots = \Lambda^G \tag{5}$$

Scalar Invariance

Configural and metric invariance require only information about the covariation of the items in different groups. However, in many situations it is important to conduct mean comparisons across groups. In order for such comparisons to be meaningful, scalar invariance of the items is required (Meredith, 1993). Scalar invariance implies that cross-group differences in the means of the observed items are due to differences in the means of the underlying construct(s). It addresses the question of whether there is consistency between cross-group differences in latent means and cross-group differences in observed means. Even if an item measures the latent variable with equivalent metrics in different age groups (metric invariance), scores on that item can still be systematically upwardly or downwardly biased. Meredith (1995) refers to this as additive bias. Comparisons of group means based on such additively biased items are meaningless unless bias is removed from the data (Meredith, 1993). Scalar invariance is tested by imposing the following additional constraint on the item intercepts in the model of metric invariance:

$$\tau^1 = \tau^2 = \cdots = \tau^G \tag{6}$$

Factor Variance and Covariance Invariance

Invariance may also be imposed on the factor covariances. This is tested by constraining the off-diagonal elements of the $\Phi$ (phi) matrix to be equal across groups:

$$\phi_{jk}^1 = \phi_{jk}^2 = \cdots = \phi_{jk}^G \quad (j \neq k) \tag{7}$$

Invariance of the factor variances is tested by constraining the diagonal elements of the $\Phi$ (phi) matrix to be equal across groups:

$$\phi_{jj}^1 = \phi_{jj}^2 = \cdots = \phi_{jj}^G \quad (j = 1, \ldots, m) \tag{8}$$

It is important to note that testing the assumption that correlations across age groups are equal (i.e., that the correlations between 1994-1995 and 1998-1999 SOC scores are equal across the 3 age groups) requires that both the factor variances and covariances be modeled as invariant.

Error Variance Invariance

A final form of invariance that may be imposed on the measurement model is that the amount of measurement error is invariant across groups. This is tested by constraining:

$$\theta_{ii}^1 = \theta_{ii}^2 = \cdots = \theta_{ii}^G \tag{9}$$

If items are metrically invariant, and if the error variances and factor variances are cross-group invariant, the items are equally reliable across groups.

3.2 STUDY 1: TESTING THE MEASUREMENT INVARIANCE OF SOC ACROSS AGE

Given the strong influence of age on both the development and stability of an individual's SOC, it seems necessary to establish measurement invariance across age. While Feldt et al. (2003) included a test of the impact on model fit associated with the constraints of invariant factor loadings and inter-factor correlations across two age groups (25-29 years and 35-40 years) in their longitudinal factor analysis, they did not evaluate the full set of measurement invariance criteria, including the fixing of intercepts (required to compare SOC means) and factor variances (required to compare correlations between SOC scores over time). Additionally, all members of the younger age group (25-29 years at time 1) would have been at least 30 and at most 34 years of age at the second SOC assessment, 5 years later.
Lastly, the sample was composed entirely of Finnish designers, of which 91% were male. The lack of a complete test of measurement invariance combined with the use of a predominantly male sample of employed Finns and a 'young group' that was not clearly under the developmental cut off of 30 years recommended by Antonovsky (1987) prevents this study from providing the evidence necessary to conclude that the measurement properties of the SOC are invariant across age in an adult Canadian population (i.e., Canadians 19 years of age and over). The primary objective of Study 1 was therefore to use longitudinal data from Cycle 1 (1994-1995) and Cycle 3 (1998-1999) of the NPHS to explicitly test the measurement invariance of the SOC across age in the same respondents at different times (e.g., responses in 1994-1995 versus those 4 years later in the 1998-1999 NPHS) and across different respondents belonging to different age groups (e.g., responses from individuals 19 to 25 versus 60 or more years of age). To test for invariance in a group of people less than 30 years of age and to avoid the cross-over of subjects from one age group to another during the assessment period, the following age groups were created using ages at Cycle 1 of the NPHS: Group 1 consisting of those aged 19 to 25 years (members of this group would be 30 years of age or less for both survey assessments), Group 2 consisting of those respondents 30 to 55 years of age (members of this group could be considered working-aged adults) and Group 3 consisting of those aged 60 years plus (members of this group would be close to the end or beyond normal working-age). The secondary objective of Study 1 of this investigation was to test Antonovsky's theory that SOC develops until the age of 30 years, after which it becomes a stable disposition. To test this hypothesis, the level (i.e., mean score) and stability (test-retest correlation) of each group's SOC were compared.
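The age grouping described above can be sketched as a simple classification on age at Cycle 1. This is only an illustration of the stated cut points; the function name and the use of None for respondents falling in the gaps between groups (26 to 29 and 56 to 59 years) or under 19 are assumptions made for the example.

```python
def nphs_age_group(age_at_cycle1):
    # Returns 1, 2 or 3 for the three study groups, or None for ages
    # that fall outside all three groups.
    if 19 <= age_at_cycle1 <= 25:
        return 1  # 30 years of age or less at both assessments
    if 30 <= age_at_cycle1 <= 55:
        return 2  # working-aged adults
    if age_at_cycle1 >= 60:
        return 3  # close to the end of or beyond normal working age
    return None

YEARS_BETWEEN_CYCLES = 4  # Cycle 1 (1994-1995) to Cycle 3 (1998-1999)

# The oldest member of Group 1 is still under 30 at the second assessment.
oldest_group1_age_at_cycle3 = 25 + YEARS_BETWEEN_CYCLES
print(nphs_age_group(22), nphs_age_group(27), nphs_age_group(61))
print(oldest_group1_age_at_cycle3)
```

Leaving gaps between the groups is what prevents respondents from crossing from one group's age range into the next during the 4-year interval.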
3.3 METHODS

Data

The longitudinal data analyzed in this study were collected as part of the health component of the 1994-1995 and 1998-1999 National Population Health Surveys of Canada (Statistics Canada, 1995). The sampling design for the NPHS is a multi-stage stratified cluster design based on an existing Labour Force Survey. In addition to the survey data, the NPHS also includes sample weights that reflect the probability of an individual being selected for inclusion in the initial 1994-1995 wave of the survey. The longitudinal square weights used in the analyses in this paper were generated by adjusting the existing Labour Force Survey weights using 1996 Census cross-tabulations of age groups (0-11, 12-24, 25-44, 45-64 and 65 and older) by sex within each province (Statistics Canada, 2002). Following the recommendations of Kaplan and Ferguson (1999) and Statistics Canada (2002), the longitudinal square weights for each age sub-group were normalized (i.e., the weight for each respondent was divided by the average weight for their group) to take into account the probability of selection while controlling for an artificial, weight-induced increase in sample size. The response rate for the health component of the Cycle 1 (1994-1995) NPHS was 83.6% at the Canada level and ranged from 77.8% in Ontario to 89.1% in Alberta. The Cycle 3 (1998-1999) longitudinal response rate for the health component of the NPHS was 88.2% at the Canada level and ranged from 83.9% in British Columbia to 92% in Newfoundland. The survey includes household residents in all provinces of Canada with the exception of populations on Indian Reserves and Crown Lands, residents of health-care institutions, full-time members of the Canadian Armed Forces and some remote areas in Ontario and Quebec (Statistics Canada, 2002).

Instruments

Sense of Coherence Scale

The widely-used 13-item version of the SOC was included in the National Population Health Survey.
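Two data-preparation steps described in this section, normalizing the survey weights within an age group and scoring the 13-item scale (reverse-scoring Q1, Q2, Q3, Q8 and Q13 on the 1 to 7 response scale, then summing all thirteen items), can be sketched as follows. The function names and the strict handling of missing or out-of-range responses are assumptions made for the example; the treatment of missing data in the actual analyses is not specified here.

```python
REVERSED_ITEMS = (1, 2, 3, 8, 13)

def normalize_weights(weights):
    # Divide each weight by the group's average weight so that the
    # normalized weights sum to the actual number of respondents.
    mean_weight = sum(weights) / len(weights)
    return [w / mean_weight for w in weights]

def soc13_total(responses):
    # responses: dict mapping item number (1 to 13) to a rating from 1 to 7.
    if sorted(responses) != list(range(1, 14)):
        raise ValueError('expected responses for items 1 through 13')
    total = 0
    for item, rating in responses.items():
        if not 1 <= rating <= 7:
            raise ValueError('ratings must lie between 1 and 7')
        # Reversed items are flipped so that higher always means stronger SOC.
        total += (8 - rating) if item in REVERSED_ITEMS else rating
    return total

weights = normalize_weights([100.0, 300.0, 200.0])
midpoint_score = soc13_total({item: 4 for item in range(1, 14)})
print(weights, midpoint_score)
```

Under this scoring the total ranges from 13 to 91, and a respondent choosing the midpoint on every item scores 52.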
The dimension of manageability is addressed by questions 3, 4, 8, and 10. Questions 1, 9, 11, and 13 measure meaningfulness while questions 2, 5, 6, 7 and 12 assess comprehensibility. Responses to each question are scored on a Likert-type scale ranging from 1 to 7. After reversing scores for questions Q1, Q2, Q3, Q8 and Q13, a total SOC score is created by summing responses to all thirteen items. Higher scores indicate a stronger sense of coherence.

Sense of Coherence Items

Q1 In this first question a response of 1 means very seldom or never and 7 means very often. How often do you have the feeling that you don't really care about what goes on around you?

Q2 In this question a response of 1 means it has never happened and 7 means it has always happened. How often in the past were you surprised by the behaviour of people whom you thought you knew well?

Q3 In this question a response of 1 means that it has never happened and 7 means it has always happened. How often have people you counted on disappointed you?

Q4 In this question a response of 1 means very often and 7 means very seldom or never. How often do you have the feeling you're being treated unfairly?

Q5 In this question a response of 1 means very often and 7 means very seldom or never. How often do you have the feeling you are in an unfamiliar situation and don't know what to do?

Q6 In this question a response of 1 means very often and 7 means very seldom or never. How often do you have very mixed-up feelings and ideas?

Q7 In this question a response of 1 means very often and 7 means very seldom or never. How often do you have feelings inside that you would rather not feel?

Q8 In this question a response of 1 means very seldom or never and 7 means very often. Many people - even those with a strong character - sometimes feel like sad sacks (losers) in certain situations. How often have you felt this way in the past?

Q9 In this question a response of 1 means very often and 7 means very seldom or never. How often do you have the feeling that there's little meaning in the things you do in your daily life?

Q10 In this question a response of 1 means very often and 7 means very seldom or never. How often do you have feelings that you're not sure you can keep under control?

Q11 In this question a response of 1 means no clear goals or purpose and 7 means very clear goals and purpose. Until now your life has had no clear goals or purpose or has it had very clear goals and purpose?

Q12 In this question a response of 1 means you overestimate or underestimate importance and 7 means you see things in the right proportion. When something happens, you generally find that you overestimate or underestimate its importance or you see things in the right proportion?

Q13 In this question a response of 1 means a source of great pleasure and satisfaction and 7 means a source of pain and boredom. Is doing the things you do every day a source of great pleasure and satisfaction or a source of pain and boredom?

Analysis Strategy

The assessment of measurement invariance involves comparing the fit of successively more constrained models. As the researcher proceeds from one step to another (e.g., configural to metric invariance), the researcher tests the change in model fit associated with the increased constraints (e.g., when moving from configural to metric invariance one determines whether the model fit deteriorates when the loadings are constrained to be equal across groups). When evaluating the fit of single-group measurement models, researchers typically examine the $\chi^2$ test statistic that assesses the degree of discrepancy between the observed and model implied covariance matrices. A significant $\chi^2$ indicates that there is a significant difference between the covariances implied by the model and those found in the data.
However, as sample size increases, so does the power of the χ² test, making it increasingly sensitive to small discrepancies between the observed and implied covariance matrices. The sensitivity of the χ² test to sample size has led many researchers to develop alternative measures of fit that are not so dependent on sample size. Prominent among the so-called goodness-of-fit indices are the comparative fit index (CFI), root mean squared error of approximation (RMSEA) and standardized root mean squared residual (SRMR). Based on the results of a series of simulation studies, Hu and Bentler (1999) suggested that a cutoff value close to .95 for the CFI; a cutoff value close to .06 for the RMSEA; and a cutoff value close to .08 for the SRMR are needed before one can conclude that there is a relatively good fit between the hypothesized model and the observed data. Although many researchers dismiss the χ² test as being overly sensitive to sample size in single model studies, opting instead to base their conclusions on the goodness-of-fit indices using the widely accepted fit criteria proposed by Hu and Bentler (1999), many of these same researchers frequently focus exclusively on the significance of the change in model fit provided by the difference in χ² when comparing the fit of nested models in studies of measurement invariance. However, as Kelloway (1995) demonstrated, the χ²-difference test also becomes more sensitive with increasing sample size. Evidence of this relationship can be found in the following simple proof provided to the author by Cheung (2004):

χ² = (N - 1)F    (1)

where F is the minimum value of the empirical fit function.
χ²(model 1) - χ²(model 2) = (N - 1)F(model 1) - (N - 1)F(model 2)    (2)

Since sample size (N) is equal in the constrained and unconstrained models,

(N - 1)F(model 1) - (N - 1)F(model 2) = (N - 1)[F(model 1) - F(model 2)]    (3)

From Equation 3 it can be seen that the χ²-difference is a function of sample size and that a larger N will result in greater power. It is therefore a double standard to place little emphasis on the χ² results when assessing global fit yet use them as the sole criterion when testing the change in fit associated with nested models containing successive degrees of invariance (Brannick, 1995; Kelloway, 1995). For the purposes of this investigation, both the χ²-difference test and several supplementary goodness-of-fit indices (CFI, RMSEA and SRMR) were considered with the knowledge that the large sample size in this study would result in a highly powered χ²-difference test. Following the recommendations of Cheung and Rensvold (2002), a change in the CFI smaller than or equal to -0.01 was used as a cutoff within which invariance was not rejected. While cutoff criteria have been proposed for the other goodness-of-fit indices considered in this study, the simulations by Cheung and Rensvold (2002) suggest that only the performance of the CFI can be considered to be independent of model parameters and sample size when testing for measurement invariance (SRMR was not included in their simulations). The software Mplus Version 2.14 (Muthén & Muthén, 1998) was used to conduct all of the multi-group longitudinal factor analyses to test for measurement invariance across time in the same groups and across age in different age groups. Maximum likelihood estimation was used and a mean structure was specified for the analysis of the covariance matrices.
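As a minimal numeric sketch of Equation 3, the Python snippet below holds the difference between the two minimized fit-function values fixed and varies only N. The function name and the fit-function values are hypothetical and chosen purely for illustration; they are not estimates from the NPHS models.

```python
def chi2_difference(n, f_constrained, f_unconstrained):
    # Equation 3: chi-square difference = (N - 1) * [F(model 1) - F(model 2)]
    return (n - 1) * (f_constrained - f_unconstrained)


# Hypothetical minimized fit-function values (illustrative only).
F_CONSTRAINED = 0.60
F_UNCONSTRAINED = 0.55

# The same discrepancy in F yields a larger test statistic as N grows.
for n in (501, 1001, 8796):
    diff = chi2_difference(n, F_CONSTRAINED, F_UNCONSTRAINED)
    print(f'N = {n}: chi-square difference = {diff:.2f}')
```

Because the degrees of freedom of the difference test do not change with N, the larger statistic translates directly into a greater chance of rejecting the constrained model.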
3.4 RESULTS

Demographics

The final sample of 8796 cases examined in this investigation was formed by selecting respondents aged 19 to 25 years (n = 1257); 30 to 55 years (n = 5326); and 60 plus years of age (n = 2213) at the initiation of the NPHS with no missing data on the variables examined in the study. Demographics for this sample are reported in Table 3.1.

Table 3.1. Sample characteristics by each group.

Demographic                                  Group 1        Group 2        Group 3
                                             (19-25 yrs)    (30-55 yrs)    (60 plus yrs)
Gender (% female)                            49.8           51.3           58.9
Income Adequacy Category (%) (see Table 1 in Paper 1 of this dissertation for category descriptions)
  Lowest                                     8.2            3.9            6.6
  Low-middle                                 13.5           8.8            15.9
  Middle                                     29.3           25.7           38.9
  Upper-middle                               35.3           41.1           30.6
  Highest                                    13.7           20.5           8.1
Education Level (%)
  < secondary                                14.0           17.8           45.7
  Secondary graduate                         16.6           17.5           13.5
  Some post secondary                        44.8           25.1           19.0
  College/University degree                  24.6           39.6           21.8
Mean age in years (95% C.I.)                 22 (22, 22)    41 (41, 41)    69 (69, 69)
Mean SOC in 1994-1995 (95% C.I.)             53 (53, 54)    59 (58, 59)    63 (62, 64)
Cronbach's Alpha for SOC in 1994-1995        .82            .82            .82
Mean SOC in 1998-1999 (95% C.I.)             59 (58, 60)    62 (62, 62)    64 (64, 65)
Cronbach's Alpha for SOC in 1998-1999        .84            .84            .82
1994-1995 with 1998-1999 SOC Correlation     .36            .47            .43

Longitudinal Multigroup Confirmatory Factor Analysis Model

The hierarchical structure of the SOC models can be found in Figure 3.1. In addition to the relationships described in Figure 3.1, all models included: 1) Correlated errors for each item at time 1 (1994-1995 NPHS) to the same item at time 2 (1998-1999 NPHS); 2) Correlated errors of prediction (disturbances) between time 1 and time 2 for meaningfulness and comprehensibility and a fixed error variance for manageability of 0.01 to prevent a negative estimate for this parameter.
The decision to fix the error variance was based on the advice of Byrne (1998), who suggested that negative error variances typically represent boundary parameters with values close to zero that can be dealt with by fixing their value to a small positive number (i.e., 0.01).

Note: To model SOC, the dimension of manageability (Man) is indicated by questions 3, 4, 8, and 10. Questions 1, 9, 11, and 13 tap meaningfulness (Mean) while questions 2, 5, 6, 7 and 12 assess comprehensibility (Com). The error covariance is between Q2 and Q3. Loadings for Q9, Q7, Q10 and manageability were fixed to one to scale their respective constructs. Although not indicated in the figure, all models also included: 1) Correlated errors for each item at time 1 (1994-1995 NPHS) to the same item at time 2 (1998-1999 NPHS); 2) Correlated errors of prediction (disturbances) between time 1 and time 2 for Mean and Com and a fixed error variance for manageability of 0.01.

Figure 3.1. Longitudinal confirmatory factor analysis model of sense of coherence.

Results of the Assessment of Measurement Invariance

The results of the tests of measurement invariance can be found in Table 3.2. Model 1a represents the baseline model where the pattern of salient (nonzero) and nonsalient (zero) factor loadings is constrained to be equal across the three age groups. From Table 3.2 it can be seen that the model failed the χ² test (χ²(837) = 7860, p < .01) and was below the cutoff for the CFI. The model did however pass the criteria for the RMSEA and SRMR. As found in previous studies (see, for example, Feldt et al., 2003; Feldt et al., 2000; Feldt and Rasku, 1998), the modification indices (MI) suggested a substantial improvement in fit could be had if we were to free the correlation of the error variance between Question 2 (Comprehensibility: How often in the past were you surprised by the behaviour of people who you thought you knew well?)
and Question 3 (Manageability: How often have people you counted on disappointed you?). The values of the MI for this parameter ranged from 346 to 887 depending on group and time and were 3 to 10 times larger than all the other MIs. After freeing the correlation between item 2 and item 3 in all 3 age groups and at both times, the fit of the model improved substantially (see Model 1b in Table 3.2). For example, the χ² improved significantly (Δχ²(6) = 3074, p < .01), the CFI changed from .882 to .934, and the RMSEA and SRMR improved from .053 to .040 and .042 to .035, respectively. Because of the substantial improvement in model fit and the consistency with which the need for this modification has emerged across other studies, a correlation between the residuals for item 2 and item 3 at both times and in all three age groups was included in all subsequent models.

Table 3.2. Summary of invariance test results.

Model      Constraints Added to Previous Model (a)    χ² (df) (b)    CFI     RMSEA (90% CI)       SRMR
Model 1a   Baseline                                   7860 (837)     .882    .053 (.052, .055)    .042
Model 1b   Error covariance between Q2 and Q3         4786 (831)     .934    .040 (.039, .041)    .035
Model 2    Loadings constrained                       5212 (891)     .927    .041 (.040, .042)    .042
Model 3a   Intercepts constrained                     6106 (951)     .913    .043 (.042, .044)    .044
Model 3b   1 intercept freed                          5958 (950)     .916    .042 (.041, .043)    .044
Model 3c   2 intercepts freed                         5887 (949)     .917    .042 (.041, .043)    .043
Model 4    SOC variances constrained                  5962 (954)     .916    .042 (.041, .043)    .047
Model 5    SOC covariances constrained                5987 (956)     .916    .042 (.041, .043)    .048
Model 6    Residuals constrained                      7177 (1021)    .899    .045 (.044, .046)    .055

(a) Each added constraint remains in successive models. For example, Model 2 contains an error covariance in addition to loadings that are constrained to be equal.
(b) All χ² values were significant (p < .01).
To test the metric invariance of the SOC, the factor loadings were constrained to be equal across time within groups as well as across groups in Model 2. While the χ² increased significantly (Δχ²(60) = 426, p < .01), the ΔCFI was within the cutoff of -.01 and the RMSEA and SRMR deteriorated slightly from .040 to .041 and .035 to .042, respectively. Following the recommendations of Cheung and Rensvold (2002), we classified the SOC as possessing metric invariance and proceeded to test for scalar invariance by constraining the item intercepts to be equal across time and within groups in Model 3a of Table 3.2. The fit of this model decreased significantly (Δχ²(60) = 894, p < .01), the ΔCFI of .014 was well beyond the cutoff of -.01 and the RMSEA and SRMR deteriorated slightly from .041 to .043 and .042 to .044, respectively. The results of Model 3a imply that the SOC scale does not possess scalar invariance. Although full measurement invariance (i.e., all loadings and intercepts can be constrained to be equal) represents a possible ideal, more often than not it is not achieved (Horn et al., 1991). As a compromise between full invariance and a complete lack of invariance, many researchers, including Steenkamp and Baumgartner (1998), suggest that researchers consider the concept of partial invariance. The notion of partial invariance evolved out of the work of Byrne, Shavelson, and Muthén (1989), who proposed that if the proportion of non-invariant items is small, then cross-group comparisons may still be interpretable because the low number of non-invariant items should not influence the results in a substantial manner. In situations where measurement invariance is not initially obtained, Steenkamp and Baumgartner (1998) suggested that the researcher try freeing individual constraints one at a time to attain partial measurement invariance. If possible, theoretical considerations should be used to guide the selection of which constraints to free.
However, the degree to which item-specific knowledge is available is often limited, in which case researchers are advised to judiciously free equality constraints one at a time based on both the absolute and relative sizes of modification indices (MI) and expected parameter changes (EPC) (Steenkamp & Baumgartner, 1998). As a minimum, Byrne et al. (1989) proposed that both metric and scalar partial invariance require at least 2 items per factor if the goal of the study is to compare means across groups (i.e., the item fixed to one to define the scale of the latent variable plus one additional item). However, Steenkamp and Baumgartner (1998) also recommended that the number of modifications be kept to a minimum and that only those respecifications that produce sizable improvements in model fit remain in the final model. A review of the MI for Model 3a indicated that the intercept for Q4 in 1998-1999 in group 2 had the largest MI. This MI (144.573) had an EPC of -0.236. Following the advice of Steenkamp and Baumgartner (1998), this intercept was left free to be estimated and the model was run again. The results of this model of partial scalar invariance can be found in Model 3b in Table 3.2. Despite freeing the intercept, the model still did not meet the criteria for invariance (Δχ²[Model 3b - Model 2](59) = 746, p < .01): the ΔCFI of .011 was just beyond the cutoff of -.01, and the RMSEA and SRMR were .042 and .044, respectively. A search for the largest intercept MI for Model 3b indicated that the intercept of Q6 in 1998-1999 in group 2 should be freed. After freeing this intercept the model was run again (see Model 3c in Table 3.2). The discrepancy between this model and the model with no constrained intercepts (Model 2) satisfied the minimum criteria for invariance recommended by Cheung and Rensvold (2002) (ΔCFI of .01), though the χ²-difference test remained significant (Δχ²[Model 3c - Model 2](58) = 675, p < .01).
The RMSEA remained at .042 while the SRMR was .043. Model 3c was therefore considered to possess partial scalar invariance. All subsequent models tested included freely estimated intercepts for Q4 and Q6 in the 1998-1999 assessment for group 2.

To test for partial factor variance invariance, the variances for the higher order SOC factor were constrained to be equal across time and within groups in Model 4. The ΔCFI for this model is .001, which meets the cutoff criteria for invariance proposed by Cheung and Rensvold (2002), though the χ²-difference test remained significant (Δχ²[Model 4 - Model 3c](5) = 75, p < .01). The RMSEA remained at .042 while the SRMR was .047. Since Model 4 passed the criteria for invariance proposed by Cheung and Rensvold (2002), we proceeded to test the invariance of the covariances between SOC at 1994-1995 and SOC at 1998-1999 across the 3 groups in Model 5. The ΔCFI for this model was .000, which easily meets the cutoff criteria for invariance proposed by Cheung and Rensvold (2002), though the χ²-difference test remained significant (Δχ²[Model 5 - Model 4](2) = 25, p < .01). The RMSEA remained at .042 while the SRMR was .048. Given that Model 5 passed the criteria for invariance proposed by Cheung and Rensvold (2002), we proceeded to the final stage of invariance testing and examined the invariance of item error variances across time and groups in Model 6. The results of this model did not support error level invariance (Δχ²[Model 6 - Model 5](65) = 1190, p < .01, ΔCFI of .017). While it was possible to pursue the freeing of individual error variances in an attempt to achieve error variance invariance, this level of invariance was not required to meet the objectives of this investigation (i.e., the comparison of SOC means and test-retest correlations does not require invariant error variances for the items).
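The nested comparisons reported above can be re-derived from the χ², df and CFI values in Table 3.2. The sketch below is illustrative only: the data structure and function name are assumptions introduced here and are not part of the original Mplus analysis. CFI differences are rounded to three decimals before being compared with the .01 cutoff of Cheung and Rensvold (2002), so that floating-point noise does not change the verdict at the boundary.

```python
# (chi-square, df, CFI) for each model, copied from Table 3.2.
MODELS = {
    '1b': (4786, 831, 0.934),
    '2': (5212, 891, 0.927),
    '3a': (6106, 951, 0.913),
    '3b': (5958, 950, 0.916),
    '3c': (5887, 949, 0.917),
    '4': (5962, 954, 0.916),
    '5': (5987, 956, 0.916),
    '6': (7177, 1021, 0.899),
}

# (constrained model, reference model) pairs as described in the text.
COMPARISONS = [('2', '1b'), ('3a', '2'), ('3b', '2'), ('3c', '2'),
               ('4', '3c'), ('5', '4'), ('6', '5')]

CFI_CUTOFF = 0.01  # Cheung and Rensvold (2002)


def compare(constrained, reference):
    # Return (chi-square difference, df difference, CFI drop, within cutoff?).
    chi2_c, df_c, cfi_c = MODELS[constrained]
    chi2_r, df_r, cfi_r = MODELS[reference]
    cfi_drop = round(cfi_r - cfi_c, 3)  # rounding removes float noise
    return chi2_c - chi2_r, df_c - df_r, cfi_drop, cfi_drop <= CFI_CUTOFF


for constrained, reference in COMPARISONS:
    d_chi2, d_df, d_cfi, ok = compare(constrained, reference)
    verdict = 'invariance not rejected' if ok else 'invariance rejected'
    print(f'Model {constrained} vs Model {reference}: '
          f'delta chi2({d_df}) = {d_chi2}, delta CFI = {d_cfi:.3f} -> {verdict}')
```

Under this rule Models 3a, 3b and 6 exceed the cutoff, while Models 2, 3c, 4 and 5 fall within it, which matches the sequence of decisions described in the text.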
To compare SOC means over time and within groups it is necessary to establish invariance up to the scalar level because it is possible that differences in means may be due to group-specific differences in metrics, item intercepts or both (Steenkamp & Baumgartner, 1998). Comparing group-specific test-retest correlations requires invariance up to the level of factor variances because differences in the test-retest correlations may be due to group-specific differences in factor variances (Steenkamp & Baumgartner, 1998).

Testing for Differences in SOC Means Across Time and Groups

As described in the review of measurement invariance, the latent mean of SOC at time 1 for 19 to 25 year olds was fixed to zero and served as a reference value. The latent means of the second time point in this group, as well as the latent means for both time points of the other two groups, were estimated relative to this zero reference. The reference model (Means free model in Table 3.3) for the test for differences in SOC means across time and groups is equivalent to Model 3c from Table 3.2 (partial scalar invariance). The comparison model with all means equal was formed by adding the constraint that all SOC means be equal to zero (see Means equal model in Table 3.3). Imposing the constraint of equal means resulted in a significant deterioration in model fit (Δχ²[Means equal model - Means free model](5) = 897, p < .01, ΔCFI = .015, ΔRMSEA = .004, ΔSRMR = .02). It is important to remember that the means reported in Table 3.3 are relative to group 1 (19 to 25 years of age) at time 1 (1994-1995 NPHS) and are on a 7-point scale. Because the total SOC score is the sum of 13 seven-point scales, one should multiply differences between means by 13 to obtain a sense of the magnitude of mean differences in terms of the total SOC score.

Table 3.3. Testing SOC means and test-retest correlations.
Model          χ² (df)       CFI     RMSEA (90% CI)       SRMR    Mean SOC by Group (1994, 1998)
Means free     5887 (949)    .917    .042 (.041, .043)    .043    Group 1 (.00, .53); Group 2 (.52, .77); Group 3 (.86, .98)
Means equal    6784 (954)    .902    .046 (.045, .047)    .063    All 0

To specify the difference between means, the 95% confidence intervals for each estimate are presented in Table 3.4. The confidence intervals suggest that for both group 1 and group 2, SOC scores increased significantly from time 1 to time 2. The differences in terms of the latent mean SOC for group 1 and group 2 were 0.531 and 0.253, which correspond to changes of 6.9 and 3.3 points on the total SOC score, respectively. The mean of SOC did not appear to change significantly over time in group 3.

Table 3.4. Means and 95% confidence intervals for SOC.

Age Group                   1994-1995 Mean (95% CI)    1998-1999 Mean (95% CI)
Group 1 (19 to 25 years)    0 (referent)               0.531 (0.462, 0.599)
Group 2 (30 to 55 years)    0.517 (0.448, 0.586)       0.770 (0.701, 0.839)
Group 3 (60 plus years)     0.859 (0.783, 0.935)       0.975 (0.900, 1.051)

Testing for Differences in the SOC Correlations Over Time

The SOC test-retest correlations for each age group can be seen in Table 3.5. The reference model with equal variances and correlations free to be estimated (Correlations free model in Table 3.5) is equivalent to Model 4 in Table 3.2. The comparison model with all test-retest correlations equal was formed by adding the constraint that all test-retest SOC correlations be equal (see Correlations equal model in Table 3.5). Imposing the constraint of equal correlations resulted in a significant deterioration in model fit (Δχ²[Correlations equal model - Correlations free model](2) = 25, p < .01). However, according to the cutoff criteria for invariance proposed by Cheung and Rensvold (2002), the ΔCFI implies that it is reasonable to treat the SOC correlations as equivalent.

Table 3.5. Correlation between SOC over time for age groups.
Model                 χ² (df)       CFI     RMSEA (90% CI)       SRMR    Group (Correlation)
Correlations free     5962 (954)    .916    .042 (.041, .043)    .047    1 (.38); 2 (.53); 3 (.50)
Correlations equal    5987 (956)    .916    .042 (.041, .043)    .048    All (.50)

To provide another perspective on the difference between test-retest correlations across groups, the estimated covariances with 95% confidence intervals are presented in Table 3.6. The overlapping confidence intervals for groups 2 and 3 suggest that the estimated covariances can be interpreted as equivalent. The lack of overlap in confidence intervals for the covariance for group 1 (19 to 25 years) suggests that the covariance of time 1 and time 2 is smaller than that of group 2 and group 3.

Table 3.6. Covariance estimates and 95% confidence intervals for age groups.

Model                         χ² (df)       CFI     Correlation       Covariance (95% C.I.)
Model 4 Correlations free     5962 (954)    .916    Group 1 (.38)     .316 (.264, .368)
                                                    Group 2 (.53)     .436 (.406, .467)
                                                    Group 3 (.50)     .416 (.377, .454)
Model 5 Correlation equal     5987 (956)    .916    All (.50)         .413 (.386, .441)

The discussion of results for this study has been integrated with the discussion for Study 2 (see page 88).

3.5 STUDY 2: AN EXAMINATION OF THE MODERATING PROPERTIES OF SOC

The purpose of Study 2 was to use the longitudinal data from the NPHS to test Antonovsky's hypothesis that SOC moderates (i.e., buffers) the health impact of stressful or tension-producing experiences. A review of the summary indicators of health used in the NPHS is followed by a discussion of appropriate modeling strategies to test the moderating effect of SOC. The results of a set of regression models predicting self-reported health status and self-reported number of visits to a medical doctor in the previous year are presented. After a brief conclusion specific to Study 2, the paper ends with a general discussion that addresses the findings and implications of both Study 1 and Study 2.
In addition to psychosocial assessments such as the SOC scale, each version of the NPHS included a single-item measure of self-reported health that asked respondents to rate their health as (1) excellent; (2) very good; (3) good; (4) fair; or (5) poor (Statistics Canada, 1995). At first blush, this measure may appear too broad to be of any empirical value. However, research has shown that it correlates strongly with other direct and indirect measures of health including the Sickness Impact Profile (Bergner et al., 1976) and various subscales of the Short Form 36 Health Survey (Brazier et al., 1992). Research also has shown the self-reported health item to be an efficient and valid means of assessing health status in the context of a general population health survey. For example, Mossey and Shapiro (1982) used data from the Manitoba Longitudinal Study on Aging in one of the first population-based studies examining self-reported health. They found that it predicted mortality even after controlling for objective health status (assessed using physician assessments, self-reported health conditions, and health service utilization). The impact of this study was swift and led researchers in many countries to include measures of self-reported health in their population health surveillance surveys (Idler, 1999). The resulting body of research confirmed the findings of Mossey and Shapiro (1982) and provided even more evidence of the utility of including a single question that asks respondents to rate their health as excellent, very good, good, fair or poor in population health surveys. In a widely cited comprehensive review, Idler and Benyamini (1997) examined 27 studies of the relationship between self-rated health and mortality.
They concluded that not only are self-ratings valid predictors of death, but that, based on the studies reviewed, ratings of self-reported health add incremental explained variance to more commonly used physical health assessments (Idler & Benyamini, 1997). For example, despite the wide variety of methods employed, 23 of the 27 studies found that self-reported ratings of health are associated with mortality after controlling for traditional health status assessments including physicians' assessments, official medical records, and self-reports of disability status (Idler, 1999). A follow-up study that examined an additional 19 studies yielded similar conclusions (Benyamini & Idler, 1999). In addition to ratings of self-reported health, the number of visits to a medical doctor is a frequently used measure in population health surveys because it provides an indirect measure of a person's health and is of interest to health policy decision makers concerned with predicting the need for health services. Although individual medical chart reviews are considered to be the gold standard in terms of quantifying service utilization, they are difficult to carry out, expensive and often not feasible (Ritter et al., 2001). Computerized billing databases maintained by many government ministries and health maintenance organizations represent another more efficient means of assessing utilization. However, these too have limitations in terms of ease of access to personal information and the degree to which the databases capture and classify different types of service utilization accurately (Ritter et al., 2001). As an alternative to chart reviews and the use of billing data, many health researchers have opted to rely on information from self-reported health care utilization questions included in surveys assessing the health of both general and targeted populations.
Although self-report assessments offer substantial gains in efficiency and in the ability to capture a broad range of service utilization, important questions have been raised concerning their validity. Roberts, Bergstralh, Schmidt, and Jacobsen (1996) assessed the validity of self-reported utilization of health care services in a randomly selected cohort of 500 community-dwelling men aged 40 to 79 years in Minnesota. After comparing the number of ambulatory physician visits reported in a self-administered questionnaire with the community medical records of the men, they found that 29.6% of the respondents' reports exactly matched their records, 15.5% over-reported by 1 to 3 visits, and 4.1% over-reported by 4 or more visits. In addition to over-reporting of visits, there was substantial under-reporting; 6.5% of the sample under-reported by 5 or more visits, 7.9% under-reported by 4 or 5 visits and 36.3% under-reported by 1 to 3 visits. Roberts et al. (1996) also observed a pronounced tendency towards greater under-reporting of visits as the total number of visits increased. After collapsing the data into difference categories of 0, 1-5, 6-10 and discrepancies of 11 or more, Roberts et al. (1996) reported a weighted kappa of .46. In a similar study using a sample of patients enrolled in a chronic disease management program (n = 216), Ritter et al. (2001) reported a weighted kappa of .46 between self-reported physician visits and administrative records over the previous six months. Similar to Roberts et al. (1996), Ritter et al. (2001) found that subjects tended to under-report more than over-report as the use of services increased. However, there was no association between demographic or health variables (e.g., self-rated health and depression) and the size of reporting error (Ritter et al., 2001). Given the published research findings of Roberts et al. (1996) and Ritter et al.
(2001), it seems reasonable to conclude that self-reported health-care utilization rates are not entirely accurate. However, apart from a greater tendency to under-report in respondents with a greater number of medical visits, these studies suggest that there are no consistently reported factors applicable to the general population of Canada that would influence reporting error. Other studies have however found significant predictors of reporting error. For example, Wallihan et al. (1999) examined self-reports of ambulatory out-patient physician visits over a 12-month period and found that race (being African American versus not), age (equal to or greater than age 65 versus not) and number of visits were significant predictors of reporting error while gender, education, living alone, self-rated health (fair or poor versus other) and income adequacy were not significant predictors. Cleary and Jette (1984) examined the degree of reporting error regarding self-reported physician utilization in the previous year in a representative sample of 908 members of the general population. They found that the average discrepancy was .05 visits (standard deviation of 4.1) with 90% of the sample having a discrepancy of 4 or fewer visits. After examining predictors of reporting error, Cleary and Jette (1984) reported that education, gender, income and being single were not related to the discrepancy between self-reported and actual number of physician visits during the previous year. However, contrary to Ritter et al. (2001), they found that age, actual utilization in the previous year and being part of a prepaid health plan predicted under-reporting while belief in regular check-ups, number of chronic conditions, severity of limiting illness in the past year, and psychological demoralization were associated with over-reporting.
Given that a person's SOC is thought to buffer the impact of stressful life events (i.e., it facilitates the selection of appropriate and situationally efficacious coping resources and behaviour in addition to increasing the extent to which tension-states are perceived as comprehensible, manageable and meaningful, thereby reducing the potential for psychological distress (Antonovsky, 1987, 1996)), and that in addition to assessments of SOC, the NPHS includes measures of self-reported health and physician visits in all surveys, the moderating effect of SOC was tested using several multiple regression models. More specifically, the first set of regression equations tests the ability of SOC to moderate the self-rated health impact of a traumatic recent life event experienced by either the respondent or someone in their family. The second set of regression equations tests the ability of SOC to moderate the impact of a traumatic recent life event on the number of self-reported visits to a medical doctor during the previous 12 months.

3.6 METHODS

Data

The longitudinal data analyzed in this study were collected as part of the health component of the 1998-1999 and 2000-2001 National Population Health Surveys of Canada. The sampling design for the NPHS is a multi-stage stratified cluster design based on an existing Labour Force Survey. In addition to the survey data, the NPHS also includes sample weights that reflect the probability of an individual being selected for inclusion in the initial 1994-1995 wave of the survey. The longitudinal square weights used in the analyses in this paper were generated by adjusting the existing Labour Force Survey weights using 1996 Census cross tabulations of age group (0-11, 12-24, 25-44, 45-64 and 65 and older) by sex within each province (Statistics Canada, 2002).
Following the recommendations of Kaplan and Ferguson (1999) and Statistics Canada (2002), the longitudinal square weights were normalized (i.e., the weight for each respondent was divided by the average weight for their group) to take into account the probability of selection while controlling for an artificial weight-induced increase in sample size. The response rate for the health component of the Cycle 1 (1994-1995) NPHS was 83.6% at the Canada level and ranged from 77.8% in Ontario to 89.1% in Alberta. The Cycle 3 (1998-1999) longitudinal response rate for the health component of the NPHS was 88.2% at the Canada level and ranged from 83.9% in British Columbia to 92% in Newfoundland. The survey includes household residents in all provinces of Canada with the exception of populations on Indian Reserves and Crown Lands, residents of health-care institutions, full-time members of the Canadian Armed Forces and some remote areas in Ontario and Quebec (Statistics Canada, 2002).

Instruments

Sense of Coherence Scale

The widely-used 13-item version of the SOC was included in the National Population Health Survey. The dimension of manageability is addressed by questions 3, 4, 8, and 10. Questions 1, 9, 11, and 13 measure meaningfulness while questions 2, 5, 6, 7 and 12 assess comprehensibility. Responses to each question are scored on a Likert-type scale ranging from 1 to 7. After reversing scores for questions Q1, Q2, Q3, Q8 and Q13, and rescaling responses from a 1-to-7 format to a 0-to-6 format, a total SOC score is created by summing responses to all thirteen items (possible range from 0 to 78). Higher scores indicate a stronger sense of coherence.

Recent Life Events

Included in the 2000-2001 NPHS data are three derived indices that measure recent life events based on the number of negative events the respondent or someone close to the respondent experienced in the past 12 months.
The Recent Life Events Score - All Items Index was derived from items that are relevant to all respondents. The life events include physical abuse, unwanted pregnancy, abortion or miscarriage, major financial difficulties, and serious problems at work or in school. An example of the question format is \"In the past 12 months, did you or someone in your family have a major financial crisis?\" For each positive response, respondents were awarded one point, with the total number of possible positive responses varying from 8 to 10 depending on social role (Statistics Canada, 2000). The Recent Life Events Score - All Valid Items Index expands upon the All Items Index by including questions that address the social roles of individuals. For partnered persons (i.e., married or living common-law), the index includes a question about the relationship with the respondent's partner. For persons who have children, the index includes a question about children moving back home. To create the Adjusted Recent Life Events Index, the range of scores of the second index (All Valid Items) was adjusted as if the ten questions were relevant to all the respondents (possible range 0 to 10 for all respondents). A higher score on any of the indices indicates greater exposure to traumatic life events in the previous 12 months (Statistics Canada, 2000).
Visits to a Medical Doctor
The number of visits to a medical doctor in the past 12 months was assessed in the 1998-1999 and 2000-2001 NPHS by asking, \"In the past 12 months, how many times have you seen or talked on the telephone about your physical, emotional or mental health with a family doctor, pediatrician or general practitioner?\" (Statistics Canada, 2000). The respondent was further prompted to consider visits with \"any other medical doctor (such as a surgeon, allergist, orthopedist, gynecologist or psychiatrist)?\" (Statistics Canada, 2000).
Any responses to this prompt were added to the total number of visits initially reported by the respondent (Statistics Canada, 2000).
Self-reported Health
Self-reported health was assessed in the 1998-1999 and 2000-2001 NPHS by asking respondents to rate their health as (1) excellent; (2) very good; (3) good; (4) fair; and (5) poor. Scores were reverse coded so that higher scores indicated better health (Statistics Canada, 2000).
Analysis Strategy
Although some researchers investigating self-reported health dichotomize the responses (e.g., excellent, very good and good versus fair and poor) and use logistic regression analysis to identify significant predictors, there is strong evidence from both theoretical and methodological perspectives suggesting that self-reported health should be treated as a continuous construct. From a theoretical perspective, Manderbacka, Lahelma, and Martikainen (1998) used logistic regression models to analyze the relationship of known risk factors (e.g., body mass index and frequency of exercise) and indicators of ill health (e.g., presence of a short term disability and somatic symptoms) with the following dichotomized dependent variables: (1) average versus good/excellent self-reported health and (2) poor versus good/excellent self-reported health. After finding a similar pattern of predictors for both models and then finding the same results after reanalyzing their data using average and good ratings as a reference group (as opposed to good/excellent), Manderbacka et al. (1998) concluded that self-reported health forms a continuum from poor through average to good health.
Additional support for the treatment of self-reported health as a continuous variable comes from the findings of Mackenbach, van den Bos, Joung, Mheen, and Stronks (1994), who also used logistic regression models to compare the strength and pattern of relationships in two models: (1) very good self-reported health (compared to good) predicted by socio-demographic determinants and specific risk factors and (2) less-than-good self-reported health (compared to good) predicted by the same socio-demographic determinants and specific risk factors as in model 1. They found that both the socio-demographic variables and specific risk factors had largely similar (but mirrored) patterns of association in the models and concluded that the processes by which very good or excellent health is generated are very similar to those which generate ill-health (Mackenbach et al., 1994). Although there is strong evidence suggesting that self-reported health should be treated as a single continuum, there is some debate in the literature from a methodological perspective on both the appropriateness of the use of a five-point Likert-type scale and the frequent practice of dichotomizing responses for use in logistic regression analyses. The use of rating scales in psychological assessments has been the subject of many investigations over the past 75 years. Symonds (1924) was one of the first to suggest that the reliability of scores is optimized by the use of seven categories. While the debate continued, Miller (1956) published an influential article suggesting that the human mind is capable of simultaneously considering seven items. Despite these and many other findings, the debate continues to this day. For example, Preston and Colman (2000) used empirical data (respondent ratings of restaurants) to investigate the impact of the number of response categories on the reliability, validity and discriminating power of a rating scale.
Test-retest reliability (stability) was lowest for two-point, three-point, and four-point scales and was significantly higher for scales with more response categories, with the most reliable scales employing 7, 8, 9 or 10 response categories. Internal consistency was lowest for scales with two or three response categories and highest for those with seven or more response categories. Criterion validity coefficients were lowest for scales with two, three, or four response categories and were generally higher (though the differences were not statistically significant) for scales with five or more response categories. Discriminating power was lowest for the scales with two, three or four response categories (Preston & Colman, 2000). In a similar study, Chang (1994) used confirmatory factor analysis to test a multitrait-multimethod model comparing the impact of different numbers of response categories on the reliability and criterion-related validity of 4 versus 6 response categories for 9 attitude assessment items completed by 165 graduate students. In this study criterion-related validity was not affected by the number of scale points. A factor contributing to the lack of agreement on the optimal number of scale points may be the frequent use of empirical data comprised of responses to scale items where the true reliability is not known (Velicer & Stevenson, 1978). In an attempt to control for such method effects, Lissitz and Green (1975) used a computer simulation to examine the relationship between reliability and the number of Likert-scale categories across three levels of item covariance (0.20, 0.50 and 0.80). They reported that reliability increased for all levels of covariance for each additional category up to 5 scale points, after which it leveled off.
Similar results were found by Bandalos and Enders (1996), who also used a computer simulation to examine the influence on reliability of the degree of similarity between underlying distributions and the distributions of categorized responses of varying lengths. They found that reliability increased as the degree of similarity between distributions increased as well as when the number of response categories increased. However, maximum improvement in reliability was reached with 5 or 7 scale points, after which reliability leveled off (Bandalos & Enders, 1996). More recently, Ochieng and Zumbo (2001) conducted an extensive simulation study to examine the implications of Likert scale categorization on regression models under different distributions. They reported that the degree of bias in results increased as the number of response categories decreased and that the smaller the number of response categories, the greater the loss of information. They further noted that little or no gains were associated with the use of more than 4 response categories and went on to conclude that categorization of response data, particularly dichotomization, as a means of simplifying analysis should be avoided (Ochieng & Zumbo, 2001). The simulation work of Ochieng and Zumbo (2001) reinforces the recommendations of Gardner, Mulvey, and Shaw (1995) regarding situations where researchers attempt to evaluate the significance of a set of predictors on a dependent variable composed of count data (e.g., attempting to identify predictors of the number of visits to a medical doctor over a one year period). In such situations, many researchers collapse the counts into several categories, after which they apply a chi-square test or, in the case of a dichotomized outcome, fit a logistic regression model (see for example Dunlop, Coyte and McIsaac, 2000).
Although the resulting data may be appropriate for a chi-square test or logistic regression analyses, the categorization of count data wastes information, may reduce statistical power and often causes researchers to modify the hypotheses they wish to test (Gardner et al., 1995). Given the evidence supporting the assessment of self-reported health using a single continuum, the relatively good performance of five response categories and the consistent finding that dichotomizing responses tends to dramatically reduce the performance of scales in terms of reliability and predictive power, the moderating relationship of SOC between the experience of a traumatic event and self-reported health was examined using linear regression analysis (Model 1). Given that visits to a medical doctor were recorded as a total count over the previous year, Poisson regression analysis was used to test the moderating relationship using self-reported number of visits to a medical doctor in the previous year as an outcome (Model 2). In addition to the main effects of SOC assessed in 1998-1999 and the experience of a recent traumatic event assessed in 2000-2001, a multiplicative interaction between the two variables was included in each model to specifically test the moderating effect of SOC (i.e., a significant interaction would indicate the presence of moderation). To control for prior differences in health, Model 1 included the self-reported health of the respondent from the 1998-1999 NPHS as a covariate and Model 2 included the number of visits to a medical doctor reported in the 1998-1999 NPHS as a covariate. Age and gender were also included as covariates in both models to statistically control for their effects on health.
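The two models described above can be written out as a sketch. The notation is an assumption introduced here for illustration and does not appear in the original text: subscript 1 denotes the 1998-1999 assessment, subscript 2 the 2000-2001 assessment, and the Poisson model is assumed to use the usual log link.

```latex
% Model 1: linear regression of self-reported health (SRH)
\mathrm{SRH}_{2} = \beta_0 + \beta_1\,\mathrm{Age} + \beta_2\,\mathrm{Gender}
  + \beta_3\,\mathrm{SRH}_{1} + \beta_4\,\mathrm{RLE}_{2} + \beta_5\,\mathrm{SOC}_{1}
  + \beta_6\,(\mathrm{SOC}_{1} \times \mathrm{RLE}_{2}) + \varepsilon

% Model 2: Poisson regression of the number of visits to a medical doctor
\log E[\mathrm{Visits}_{2}] = \gamma_0 + \gamma_1\,\mathrm{Age} + \gamma_2\,\mathrm{Gender}
  + \gamma_3\,\mathrm{Visits}_{1} + \gamma_4\,\mathrm{RLE}_{2} + \gamma_5\,\mathrm{SOC}_{1}
  + \gamma_6\,(\mathrm{SOC}_{1} \times \mathrm{RLE}_{2})
```

In both cases moderation is assessed through the interaction coefficient ($\beta_6$ or $\gamma_6$): a coefficient that differs significantly from zero indicates that the effect of a recent life event depends on the level of SOC.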
Measures of socio-economic status (e.g., income and education) were not included in the test of the moderation of traumatic recent life events by SOC because they form an integral part of an individual's generalized resistance resources (Antonovsky, 1987).
3.7 RESULTS
An initial sample of 6517 cases was formed by selecting respondents aged 30 years and older at the initiation of the NPHS with no missing data on the variables examined in the study. Of these 6517 cases, 12 were removed from the sample because their reported number of medical visits in the previous year exceeded 100 visits, leaving a final sample of 6505 cases. In contrast to the fairly normal distribution of self-reported health, the frequency distribution of the Adjusted Recent Life Events Index indicated that approximately 73% of the participants had a score of zero and only 10% had a score of 2 or more on the events. Due to the extreme skewing of scores and the interpretability of a dichotomized outcome (i.e., Did the respondent experience a recent life event?), the scores on the index were dichotomized into zero (no events) or one (one or more life events). Specific demographic characteristics of the respondents can be found in Table 3.7.
Table 3.7. Sample characteristics with and without experience of a recent life event.
Demographics (n = 6505)                No Recent Life Events (73.3%)   One or More Recent Life Events (26.7%)   Significance
                                       Percent (95% CI)                Percent (95% CI)
Gender (% female)                      53.1 (51.9, 54.3)               55.5 (54.3, 56.7)                        p = .097
Income Adequacy                                                                                                 p < .05
  Lowest                               2.0 (1.7, 2.3)                  5.0 (4.5, 5.5)
  Low-middle                           6.1 (5.5, 6.7)                  7.9 (7.2, 8.6)
  Middle                               21.8 (20.8, 22.8)               21.8 (20.8, 22.8)
  Upper-middle                         37.8 (36.6, 39.0)               40.4 (39.2, 41.6)
  Highest                              32.3 (31.2, 33.4)               24.8 (23.8, 25.8)
Education                                                                                                       p < .05
  Less than secondary                  23.4 (22.4, 24.4)               17.4 (16.5, 18.3)
  Secondary graduate                   15.1 (14.2, 16.0)               16.2 (15.3, 17.1)
  Some post-secondary                  23.5 (22.5, 24.5)               29.1 (28.0, 30.2)
  College/University degree            37.9 (36.7, 39.1)               37.3 (36.1, 38.5)
                                       Mean (95% CI)                   Mean (95% CI)
Age in years                           48.6 (48.2, 49.0)               43.1 (43.4, 44.4)                        p < .05
Sense of Coherence Scale (1998-1999)   64.0 (63.7, 64.2)               59.4 (58.8, 60.0)                        p < .05
MD visits (1998-1999)                  3.9 (3.7, 4.0)                  4.7 (4.4, 5.0)                           p < .05
MD visits (2000-2001)                  3.8 (3.7, 4.0)                  5.5 (5.1, 5.9)                           p < .05
Self-reported health (2000-2001)       2.7 (2.6, 2.7)                  2.5 (2.4, 2.5)                           p < .05
Self-reported health (1998-1999)       2.8 (2.8, 2.8)                  2.6 (2.6, 2.7)                           p < .05
Model 1: Self-Reported Health
The results for Model 1 in Table 3.8 represent the regression of self-reported health (2000-2001) onto the presence of a recent life event (RLE) in the previous 12 months (yes or no) assessed in the 2000-2001 NPHS, SOC (1998-1999), self-reported health (1998-1999), the interaction between SOC (1998-1999) and RLE (2000-2001), as well as the demographics age in years and gender. Although the results of this model indicate that there is a significant interaction between the experience of a RLE and SOC, a review of the tolerance and variance inflation factors (VIF) indicated that there was a problem involving collinearity between the SOC by RLE interaction term and the main effect of RLE. In Table 3.8 it can be seen that the VIF was 29 for RLE and 28 for the SOC by RLE interaction.
These VIF values are well above the cut off of 10, which suggests that multicollinearity may be influencing the coefficients in the model (Neter, Wasserman, and Kutner, 1990).
Table 3.8. Linear regression analysis results predicting self-rated health (with multi-collinearity present).
                          B       Std. Error   Beta    t       p      Lower Bound   Upper Bound   Tolerance   VIF
(Constant)                1.547   .086                 17.9    .000   1.378         1.716
Age in years              -.013   .001         -.172   -16.3   .000   -.015         -.012         .911        1.098
Gender (male reference)   -.058   .020         -.029   -2.9    .004   -.098         -.018         .995        1.005
SRH 1998-1999             .517    .012         .485    44.7    .000   .494          .540          .870        1.149
RLE (none reference)      -.402   .124         -.178   -3.2    .001   -.645         -.159         .034        29.304
SOC 1998-1999             .005    .001         .060    4.6     .000   .003          .008          .600        1.667
SOC by RLE                .004    .002         .121    2.2     .025   .001          .008          .035        28.381
One approach to dealing with the presence of multicollinearity between a continuous predictor and its interaction term is to centre the continuous predictor (West, Aiken, & Krull, 1996). Centring a variable involves subtracting the mean score from each individual's score on the predictor. The resulting variable, which now has a value of zero for its mean, is then used to model the main effect and interaction term in the model. The results for the regression model using a centred SOC variable can be seen in Table 3.9. A review of the diagnostics indicated that centring SOC had solved the multi-collinearity problem (see Table 3.9).
Table 3.9. Results for the linear regression model predicting self-rated health using a centred SOC variable.
Model                       B       Std. Error   Beta    t        p      Lower Bound   Upper Bound   Tolerance   VIF
(Constant)                  1.891   .059                 32.30    .000   1.776         2.006
Age in years                -.013   .001         -.172   -16.26   .000   -.015         -.012         .911        1
Gender (male reference)     -.058   .020         -.029   -2.87    .004   -.098         -.018         .995        1
SRH 1998                    .517    .012         .485    44.70    .000   .494          .540          .870        1
RLE (zero reference)        -.121   .024         -.054   -5.09    .000   -.168         -.075         .918        1
SOC 1998 (centred)          .005    .001         .060    4.57     .000   .003          .008          .600        2
SOC 1998 (centred) by RLE   .004    .002         .029    2.24     .025   .001          .008          .621        2
Note: A review of residuals and a plot of expected cumulative probability versus observed cumulative probability indicated that the fit of the linear regression model was adequate.
After centring SOC and re-running the model, significant effects for SOC, RLE and their interaction were found (see Table 3.9). In addition to reducing multicollinearity in models containing interactions, centring can make the interpretation of coefficients more meaningful. Since the interaction between recent life events and SOC is significant, the size of the main effect of recent life events depends on the value of SOC. When SOC is not centred, the interpretation of the recent life event coefficient in the regression output is conditional on SOC being zero. This conditional interpretation is somewhat difficult to comprehend because, as Aiken and West (1991) pointed out, psychological scales such as the SOC tend not to have meaningful zero points. However, after centring, zero represents the average SOC score and the interpretation of the recent life events coefficient in the regression output is now conditional on the average SOC score. From a slightly different perspective, the recent life event coefficient can now also be thought of as indicating the average effect of one or more recent life events on self-reported health across all values of SOC (West et al., 1996).
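The effect of centring on collinearity can be illustrated with a minimal sketch. The data below are simulated and are not the NPHS data; the distributional settings (an SOC-like score with mean 62 and standard deviation 10, and a 27% event rate) and the helper name `vif` are assumptions chosen only to mimic the general situation described above.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X, computed by
    regressing that column on the remaining columns plus an intercept."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(len(y)), others])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        r_squared = 1.0 - resid.var() / y.var()
        factors.append(1.0 / (1.0 - r_squared))
    return np.array(factors)

rng = np.random.default_rng(0)
n = 5000
# Simulated (hypothetical) predictors, not survey data.
soc = rng.normal(62.0, 10.0, n).clip(0, 78)    # SOC-like total score
rle = rng.binomial(1, 0.27, n).astype(float)   # 1 = one or more events

# Columns: SOC, RLE, SOC x RLE, first with raw SOC and then with centred SOC.
raw = np.column_stack([soc, rle, soc * rle])
soc_c = soc - soc.mean()
centred = np.column_stack([soc_c, rle, soc_c * rle])

vif_raw = vif(raw)
vif_centred = vif(centred)
print("VIF with raw SOC:    ", np.round(vif_raw, 1))
print("VIF with centred SOC:", np.round(vif_centred, 1))
```

With raw scores the product term is almost a rescaled copy of the event indicator, so the VIFs for RLE and for the interaction are large; after centring, the product term is only moderately related to either main effect.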
For the model presented in Table 3.9, it can be seen that the average effect of experiencing a RLE across all values of SOC is a decrease in SRH of .121 (SRH was assessed on a 5-point scale ranging from poor to excellent). To illustrate the interaction, the predicted values for self-reported health were plotted against SOC separately for both categories of the recent life event variable. From Figure 3.2 it can be seen that there is a sizable protective effect for those with a higher SOC score, taking into account age and gender. For those individuals at the mean SOC score or higher, there is little impact of a RLE in the previous year on SRH. However, as individual SOC scores fall below average, the negative impact of experiencing a RLE on SRH grows (see Figure 3.2).
Figure 3.2. Graph of unstandardized predicted values of SRH by centred SOC.
Another way of describing the interaction is to compare the mean self-reported health ratings for groups defined by SOC scores above or below the mean SOC score and the presence or absence of a RLE. From Table 3.10 it can be seen that the mean difference in SRH between those who did and did not experience a RLE was .24 (t = 5.8, p<.001) in individuals with a below average SOC and .04 (t = 1.0, p>.05) in individuals with a higher than average SOC score.
Table 3.10. Difference in self-reported health across presence or absence of a RLE for groups defined as above or below the mean SOC score.
SOC Category     RLE Category            Mean   t      Sig. (2-tailed)   Mean Diff.   Std. Error Diff.   95% CI of the Difference
                                                                                                         Lower Bound   Upper Bound
Below Mean (1)   None (n = 1772)         2.50   5.80   .00               .24          .04                .16           .32
                 One or more (n = 948)   2.26
Above Mean (2)   None (n = 3003)         2.77   1.03   .30               .04          .04                -.04          .11
                 One or more (n = 793)   2.73
(1) Equal variances assumed (Levene's Test for Equality of Variances = .006, p=.937).
(2) Equal variances assumed (Levene's Test for Equality of Variances = .001, p=.917).
Model 2: Visits To A Medical Doctor
One approach to the analysis of count data commonly employed in the field of epidemiology involves the use of Poisson regression models. The Poisson model is a generalized linear model that extends the regression model to the exponential family of distributions that includes both normal and Poisson distributions (Gardner, Mulvey & Shaw, 1995). However, in many situations involving individual counts, an unadjusted Poisson model will produce misleading estimates of its variance terms and misleading inferences about the regression because the observed counts have larger variances than that implied by the model. This situation is called over-dispersion and can be corrected by estimating a dispersion parameter, which is then used to rescale the standard errors in the model (Gardner et al., 1995). McCullagh and Nelder (1989) suggested that it is reasonable to estimate the dispersion parameter using the ratio of the deviance to its associated degrees of freedom. The assumption of an unadjusted Poisson model is that the dispersion parameter equals 1; a value of less than one indicates under-dispersion while a value greater than one indicates over-dispersion. The results of the Poisson regression analysis can be found in Table 3.11.
Table 3.11. Results for Poisson regression analysis predicting number of visits to a medical doctor.
                     Value    Std. Error   t-value   Corrected t-value   Final p-value
(Intercept)          0.427    0.025        16.984    8.16                <.05
MD visits in 1998    0.046    0.001        86.846    41.75               <.05
SOC centred          -0.009   0.001        -12.982   -6.24               <.05
Age in years         0.013    0.000        27.842    13.39               <.05
Gender               0.110    0.012        8.973     4.31                <.05
RLE (yes/no)         0.302    0.014        21.989    10.57               <.05
RLE by SOC centred   -0.002   0.001        -2.355    -1.13               >.05
Note: Dispersion Parameter for Poisson family taken to be 1. Null Deviance: 36056.32 on 6504 degrees of freedom. Residual Deviance: 28202.07 on 6498 degrees of freedom.
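The over-dispersion adjustment described above can be sketched as a short calculation, using the residual deviance and degrees of freedom from the note to Table 3.11 and the uncorrected t-values from the table. The variable names are illustrative assumptions; this is a sketch of the deviance-based rescaling, not the software routine used for the original analysis.

```python
import math

# Values taken from the note to Table 3.11.
residual_deviance = 28202.07
residual_df = 6498

# Deviance-based estimate of the dispersion parameter and its square root,
# which is the factor by which the standard errors are inflated.
dispersion = residual_deviance / residual_df
scale = math.sqrt(dispersion)

# Uncorrected t-values from Table 3.11.
uncorrected_t = {
    "(Intercept)": 16.984,
    "MD visits in 1998": 86.846,
    "SOC centred": -12.982,
    "Age in years": 27.842,
    "Gender": 8.973,
    "RLE (yes/no)": 21.989,
    "RLE by SOC centred": -2.355,
}

# Multiplying each standard error by the scale factor is equivalent to
# dividing each t-value by it; the coefficient estimates are unchanged.
corrected_t = {name: t / scale for name, t in uncorrected_t.items()}

print(f"dispersion = {dispersion:.2f}, scale = {scale:.2f}")
for name, t in corrected_t.items():
    print(f"{name:20s} {t:8.2f}")
```

Small differences from the corrected t-values printed in Table 3.11 can arise from rounding the scale factor before dividing.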
A check of the dispersion factor for the model yielded a value of 4.34, which indicated that the model was over-dispersed.
Actual dispersion factor = deviance/df = 28202.07/6498 = 4.34
Scale parameter = square root of dispersion factor = 2.08
Cox (1983) suggested that the dispersion parameter can be used to correct the results of a moderately over-dispersed model. More specifically, the standard errors of the estimates are multiplied by the square root of the estimated dispersion factor (the square root of the estimated dispersion factor is commonly called the scaling factor). Correcting for over-dispersion in this way does not influence parameter estimates, though it can alter the significance levels and accompanying confidence intervals for the model parameters (Gardner et al., 1995; Pedan, 2001). From Table 3.11 it can be seen that while there was a significant main effect for SOC (corrected t = -6.24, p<.05) and RLE (corrected t = 10.57, p<.05), there was no significant moderating effect for SOC (i.e., the interaction between SOC and recent life events was not significant) (corrected t = -1.13, p>.05). It is important to note that the use of a correction based on the dispersion factor is acceptable when the researcher is primarily interested in hypothesis tests of the regression coefficients in the model. If the goal of the analysis is to estimate the probability distribution of an individual count, then the researcher is advised against using this approach and should consider using an approach based on the negative binomial model (Gardner et al., 1995).
3.8 DISCUSSION
The objectives of these investigations were to test the age-based measurement invariance, temporal stability and moderating ability of Antonovsky's Sense of Coherence Scale (SOC) using longitudinal data from the Canadian National Population Health Survey (NPHS).
In Study 1, multi-group longitudinal factor analysis was used to test the measurement invariance across age and stability of the SOC over time using longitudinal data from the following three age groups: 19 to 25 years (n = 1257); 30 to 55 years (n = 5326); and 60 plus years (n = 2213). The results support the age-based measurement invariance of the scale for the general population sample examined in this study. Although it was necessary to free two intercept parameters (Q4 and Q6 in 30 to 55 year olds assessed in 1998-1999) to achieve measurement invariance, the change in these parameters when freed was very small. On a scale ranging from 1 (very seldom or never) to 7 (very often), the intercept changed from 4.9 to 4.7 for Q4 and from 5.1 to 5.2 for Q6. The tests for differences in the latent SOC means across time within age groups as well as across different age groups provided moderate support for Antonovsky's (1987) hypothesis that SOC develops until approximately age 30, after which it more or less levels off and becomes resistant to change. In line with his theory, the lowest scores were found in the 1994-1995 assessment of the 19 to 25 year olds. Four years later the mean SOC score in this age group had increased by 7 points on a scale that ranges from 0 to 78. The mean SOC score of the 1994-1995 assessment of group 2 (30 to 55 years of age) was less than 1 scale point more than the final assessment for group 1, while the mean SOC score for group 2 in 1998-1999 was only 3 points higher than either the second assessment of group 1 or the first assessment of group 2. The first-assessment scores for group 3 (60 years plus) were 4.4 scale points higher than the first assessment of group 2, while the second SOC assessment in 1998-1999 was 6 scale points higher than the first assessment of the 30 to 55 year olds.
While there were statistically significant differences in mean SOC scores across group and time (see Table 3.4 for details), the size of the differences did not exceed the 10 percent threshold identified by Antonovsky as the criterion for meaningful change beyond what could be considered normal temporary fluctuations in individuals. However, from a population perspective, Antonovsky (1996) stated that stable group changes of as little as 5 points may be meaningful and are \"not to be sneezed at\" (Antonovsky, 1996, p. 176). The direction of change found in this study suggests that, on average, the SOC of people increases slowly over time beyond age 30 and well into older ages. From Table 3.4 it can be seen that over the 4 years between assessments, 19 to 25 year olds improved the most (6.9 points), followed by 30 to 55 year olds (3.3 points) and those 60 plus years of age (1.6 points). One explanation for this trend might be that people continue to develop and refine their pool of resistance resources beyond age 30, though the gain in SOC appears to follow a pattern of diminishing improvements. It is also possible that a selection bias may in part be responsible for the average improvement in SOC scores found in this study. Older individuals with a low SOC may have been excluded from the survey at a higher than average rate because of poor health or death (e.g., the NPHS did not assess seniors living in institutions). Future releases of NPHS data containing longer periods of follow-up on individuals will reduce the reliance on cross-sectional age-based comparisons of SOC and provide the prospective cohort data required to confidently answer this question. The examination of 1994-1995 and 1998-1999 correlations indicated that the test-retest correlations differ across age groups.
The lowest correlation occurred in group 1 (19 to 25 year olds) (.38) while the correlations in group 2 (30 to 55 year olds) (.53) and group 3 (60 plus year olds) (.50) were similar. This finding supports Antonovsky's (1987) theory that more variation is found in those less than 30 years of age. However, the size of the coefficients suggests that the relative position of individuals (i.e., their ranking) changes substantially in all age groups, though less so in those over age 30. This indicates that while, on average, SOC scores of individuals do not appear to change substantially, there were significant changes in the relative ranking of individuals from 1994-1995 to 1998-1999. This finding is consistent with the findings of Smith, Breslin, and Beaton (2003), who found evidence of numerous individual changes in SOC over time, many of which exceeded the 10% threshold for normal or expected variation. Future work examining changes in SOC is required to further quantify and explain the variation in individual change and relative ranking on the SOC scale. Building on the results of the first study, Study 2 contained a series of regression models to test the ability of SOC to moderate (i.e., buffer) the health impacts associated with the experience of a recent stressful life event using data from the 1998-1999 and 2000-2001 NPHS. From Table 3.9 it can be seen that the interaction between experiencing a stressful recent life event in the previous year and the SOC score two years prior was significant (t = 2.24, p = .025). Using a centred SOC variable allowed for the interpretation of the RLE coefficient as indicating the average effect of one or more recent life events on self-reported health across all values of SOC (West et al., 1996). In this study, the average effect of experiencing a RLE across all values of SOC was a decrease in SRH of .121 (SRH was assessed on a 5-point scale ranging from poor to excellent).
To interpret the interaction, the predicted values for self-reported health were plotted against the SOC scores separately for each category of recent life events. From Figure 3.2 it can be seen that there is a sizable protective effect for those with a higher SOC score, taking into account age, gender and previous self-rated health. For individuals with a higher than average SOC, there is little impact of a RLE in the previous year on current SRH. However, for individuals with a below average SOC, the negative impact of experiencing a RLE on SRH grows as SOC decreases. From Table 3.10 it can be seen that the mean difference in SRH between those who did and did not experience a RLE was .24 (t = 5.8, p<.001) in individuals with a below average SOC and .04 (t = 1.0, p>.05) in individuals with a higher than average SOC score. These results provide strong support for the buffering capacity of SOC. However, without randomly assigning individuals to the experience of a RLE it is possible that the results are due to existing inequalities in the groups that did or did not experience a RLE. For example, the self-reported health of individuals with a low SOC who experience several RLEs may be more vulnerable to the health impact of a subsequent RLE due to their weakened status. Although it is impossible to completely control for such scenarios using statistical procedures, the addition of prior self-reported health into the regression should help minimize the impact of such biases. In the second model of Study 2, Poisson regression analysis was used to test the ability of SOC to moderate (i.e., buffer) the health impact of a RLE in terms of visits to a medical doctor in the past 12 months. Controlling for differences due to gender, age and number of medical visits in the previous year, the moderating effect of SOC was not significant (corrected t = -1.13, p>.05).
This result suggests that while SOC buffers the health impact of a RLE in terms of self-rated health, it does not buffer the impact of experiencing a RLE on number of medical visits. In addition to a lack of effect, there are several other reasons why the interaction between SOC and RLE may not have been significant. Although experimentalists frequently find significant interaction effects, non-experimentalists have typically found the detection of theoretically expected interactions to be difficult (McClelland & Judd, 1993). The difficulty is due in large part to the exacerbation of measurement errors and restricted variance (e.g., variables with a restricted range) in variables when forming the multiplicative interaction (McClelland & Judd, 1993). Another reason for the apparent lack of a significant interaction involves the tendency for under-reporting to increase as utilization increases. However, in the current study the model controlled for prior utilization and in a sense modeled the factors that influence change in utilization. If the tendency of the participant to under-report did not change substantially from time 1 to time 2, then the under-reporting bias would not be manifest in the analysis. However, if respondents shifted from low to high utilization rates in response to the experience of a stressful recent life event, then the increased tendency to under-report may have resulted in a bias towards the null hypothesis and produced overly conservative estimates of the effect of experiencing a stressful recent life event. From this perspective, the impact of the under-reporting bias associated with greater utilization may have contributed to the lack of a significant interaction in this study. However, given that the bias appears to increase as utilization increases, its effect would be substantially reduced by the fact that the model controls for prior similarly biased utilization rates.
Although there is some doubt concerning the degree to which the under-reporting bias associated with increased utilization influenced the results of this study, there is little doubt that the self-reports are inaccurate. If the errors in reporting are not associated with specific factors (i.e., they can be considered random in the context of this study), then the primary impact of the inaccuracies is an increase in measurement error accompanied by a reduction in power, and not the production of biased coefficients (Ritter et al., 2001). Although inaccuracies may have contributed to a reduction of power in this study, the large sample size (n = 6000 plus) probably went a long way toward minimizing the impact of random measurement error on the likelihood of finding a significant effect.

Overall, the results of this investigation provide strong support for the factor structure and age-based measurement invariance of the SOC. Moderate support was found for the stability of SOC over time, though the results indicated that SOC may continue to grow, albeit increasingly slowly, as one advances into older age. Although the results suggest that SOC buffers the impact of recent life events on self-reported health, the relatively low test-retest correlations indicate that future research examining individual changes is needed to clarify the stability of SOC among special populations. Such research is particularly important if researchers are to determine the feasibility of developing interventions aimed at improving the SOC of adults.

3.9 REFERENCE LIST

Aiken, L. & West, S. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.
Albertson, K., Nielsen, M. L. & Borg, V. (2001). The Danish psychosocial work environment and symptoms of stress: the main, mediating and moderating role of sense of coherence. Work and Stress, 15, 241-253.
Antonovsky, A. (1979).
Health, stress and coping: New perspectives on mental and physical well-being. San Francisco, CA: Jossey-Bass.
Antonovsky, A. (1987). Unraveling the mystery of health: how people manage stress and stay well. San Francisco, CA: Jossey-Bass.
Antonovsky, A. (1993). The structure and properties of the sense of coherence scale. Soc Sci Med, 36, 725-733.
Antonovsky, A. (1996). The sense of coherence: An historical and future perspective. Isr J Med Sci, 32, 170-178.
Bandalos, D. L. & Enders, C. K. (1996). The effects of nonnormality and number of response categories on reliability. Applied Measurement in Education, 9, 151-160.
Benyamini, Y. & Idler, E. L. (1999). Community studies reporting association between self-rated health and mortality: Additional studies, 1995 to 1998. Research on Aging, 3, 392-401.
Bergner, M., Bobbitt, R. A., Pollard, W. E., Martin, D. P., & Gilson, B. S. (1976). The sickness impact profile: validation of a health status measure. Med Care, 14, 51-61.
Brannick, M. T. (1995). Critical comments on applying covariance structure modeling. Journal of Organizational Behavior, 16, 201-213.
Brazier, J. E., Harper, R., Jones, N. M., O'Cathain, A., Thomas, K. J., Usherwood, T., & Westlake, L. (1992). Validating the SF-36 health survey questionnaire: new outcome measure for primary care. BMJ, 305, 160-164.
Byrne, B. M., Shavelson, R. J., & Muthen, B. (1989). Testing for the equivalence of factor covariance and mean structures: The issue of partial measurement invariance. Psychological Bulletin, 105, 456-466.
Byrne, B. M. (1998). Structural equation modeling with LISREL, PRELIS, and SIMPLIS: Basic concepts, applications, and programming. Mahwah, NJ: Lawrence Erlbaum Associates.
Chang, L. (1994). A psychometric evaluation of 4-point and 6-point Likert-type scales in relation to reliability and validity. Applied Psychological Measurement, 18, 205-215.
Cheung, G. W. (2004). RE: Chi-square difference test.
Structural Equation Modeling Discussion Network: SEMNET Discussion List. Posted Monday Mar 8, 2004 at http://bama.ua.edu/archives/semnet.html
Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233-255.
Cleary, P. D., & Jette, A. M. (1984). The validity of self-reported physician utilization measures. Med Care, 22, 796-803.
Cox, D. R. (1983). Some remarks on overdispersion. Biometrika, 70, 269-274.
Cudeck, R. (1989). Analysis of correlation matrices using covariance structure models. Psychological Bulletin, 105, 317-327.
Dunlop, S., Coyte, P. C., & McIsaac, W. (2000). Socio-economic status and the utilisation of physicians' services: results from the Canadian National Population Health Survey. Soc Sci Med, 51, 123-133.
Feldt, T. (1997). The role of sense of coherence in well-being at work: analysis of main and moderator effects. Work and Stress, 11, 134-147.
Feldt, T., Leskinen, E., Kinnunen, U., & Mauno, S. (2000). Longitudinal factor analysis models in the assessment of the stability of sense of coherence. Personality and Individual Differences, 28, 239-257.
Feldt, T., Leskinen, E., Kinnunen, U., & Ruoppila, I. (2003). The stability of sense of coherence: Comparing two age groups in a 5-year follow-up study. Personality and Individual Differences, 35, 1151-1165.
Feldt, T., & Rasku, A. (1998). The structure of Antonovsky's orientation to life questionnaire. Personality and Individual Differences, 25, 505-516.
Frenz, A. W., Carey, M. P., & Jorgensen, R. S. (1993). Psychometric evaluation of Antonovsky's sense of coherence scale. Psychological Assessment, 5(2), 145-153.
Gana, K. (2001). Is sense of coherence a mediator between adversity and psychological well-being in adults? Stress and Health, 17, 77-83.
Gana, K., & Garnier, S. (2001). Latent structure of the sense of coherence scale in a French sample.
Personality and Individual Differences, 31, 1079-1090.
Gardner, W., Mulvey, E. P., & Shaw, E. C. (1995). Regression analyses of counts and rates: Poisson, overdispersed Poisson, and negative binomial models. Psychol Bull, 118, 392-404.
Hood, S. C., Beaudet, M. P., & Catlin, G. (1996). A healthy outlook. Health Rep., 7, 25-35.
Horn, J. L. (1991). Comments on issues in factorial invariance. In L. M. Collins & J. L. Horn (Eds.), Best methods for the analysis of change (pp. 114-125). Washington, DC: American Psychological Association.
Horn, J. L., & McArdle, J. J. (1992). A practical and theoretical guide to measurement invariance in aging research. Exp Aging Res, 18, 117-144.
Horn, J. L., McArdle, J. J., & Mason, R. (1983). When is invariance not invariant: A practical scientist's look at the ethereal concept of factor invariance. The Southern Psychologist, 1, 179-188.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.
Idler, E. L. (1999). Self-assessments of health: The next stage of studies. Research on Aging, 21, 387-391.
Idler, E. L. & Benyamini, Y. (1997). Self-rated health and mortality: a review of twenty-seven community studies. J Health Soc Behav, 38, 21-37.
Jorgensen, R. S., Frankowski, J. J., & Carey, M. P. (1999). Sense of coherence, negative life events and appraisal of physical health among university students. Personality and Individual Differences, 27, 1079-1089.
Joreskog, K. G. (1971). Simultaneous factor analysis in several populations. Psychometrika, 36, 409-426.
Kaplan, D. (1998). RE: Use of factorial invariance. Structural Equation Modeling Discussion Network: SEMNET Discussion List. Posted Thursday Dec 10, 1998 at http://bama.ua.edu/archives/semnet.html
Kaplan, D., & Ferguson, A. J. (1999). On the utilization of sample weights in latent variable models.
Structural Equation Modeling, 6, 305-321.
Karlsson, I., Berglin, E., & Larsson, P. A. (2000). Sense of coherence: quality of life before and after coronary artery bypass surgery--a longitudinal study. J Adv Nurs, 31, 1383-1392.
Kessler, R. C. (1979). A strategy for studying differential vulnerability to the psychological consequences of stress. J Health Soc Behav, 20, 100-108.
Kelloway, E. K. (1995). Structural equation modeling in perspective. Journal of Organizational Behavior, 16, 215-224.
Kivimaki, M., Feldt, T., Vahtera, J., & Nurmi, J. E. (2000). Sense of coherence and health: Evidence from two cross-lagged longitudinal samples. Soc Sci Med, 50, 583-597.
Kobasa, S. C. (1979). Stressful life events, personality and health. Journal of Personality and Social Psychology, 37, 1-11.
Lissitz, R. W. & Green, S. B. (1975). Effect of the number of scale points on reliability: A Monte Carlo approach. Journal of Applied Psychology, 60, 10-13.
Lutgendorf, S. K., Vitaliano, P. P., Tripp-Reimer, T., Harvey, J. H., & Lubaroff, D. M. (1999). Sense of coherence moderates the relationship between life stress and natural killer cell activity in healthy older adults. Psychol Aging, 14, 552-563.
Mackenbach, J. P., van den, B. J., Joung, I. M., van de, M. H., & Stronks, K. (1994). The determinants of excellent health: different from the determinants of ill-health? Int J Epidemiol, 23, 1273-1281.
Manderbacka, K., Lahelma, E., & Martikainen, P. (1998). Examining the continuity of self-rated health. Int J Epidemiol, 27, 208-213.
McClelland, G. H. & Judd, C. M. (1993). Statistical difficulties of detecting interactions and moderator effects. Psychol Bull, 114, 376-390.
McCullagh, P., & Nelder, J. (1989). Generalized linear models. Second edition, London: Chapman and Hall.
Meredith, W. (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58, 525-543.
Meredith, W. (1995).
Two wrongs may not make a right. Multivariate Behavioral Research, 30, 89-94.
Miller, G. A. (1956). The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological Review, 63, 81-97.
Mossey, J. M. & Shapiro, E. (1982). Self-rated health: a predictor of mortality among the elderly. Am J Public Health, 72, 800-808.
Muthen, L. K. & Muthen, B. O. (1998). Mplus Version 2.14. Los Angeles, CA: Muthen & Muthen.
Neter, J., Wasserman, W., & Kutner, M. H. (1990). Applied linear statistical models (3rd ed.). Boston, MA: Irwin.
Nunnally, J. C. & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.
Ochieng, C. & Zumbo, B. D. (2001). Implications of ordinal scale categorization on regression models under different distributions and conditions: An assessment of the accuracy and information of Likert scales on regression analysis. Presented at the NCME Conference, Seattle, WA, April 12, 2001. Retrieved on March 14, 2004 from http://educ.ubc.ca/faculty/zumbo/ins2001/index.html
Pedan, A. (2001). Analysis of Count Data Using the SAS System. Paper presented at SAS Users Group International (SUGI) Convention, Center Long Beach, CA, April 22-25, 2001. Retrieved on June 5, 2004 from http://www2.sas.com/proceedings/sugi26/p247-26.pdf
Preston, C. C. & Colman, A. M. (2000). Optimal number of response categories in rating scales: Reliability, validity, discriminating power, and respondent preferences. Acta Psychologica, 104, 1-15.
Ritter, P. L., Stewart, A. L., Kaymaz, H., Sobel, D. S., Block, D. A., & Lorig, K. R. (2001). Self-reports of health care utilization compared to provider records. J Clin Epidemiol, 54, 136-141.
Roberts, R. O., Bergstralh, E. J., Schmidt, L., & Jacobsen, S. J. (1996). Comparison of self-reported and medical record health care utilization measures. J Clin Epidemiol, 49, 989-995.
Rock, D. A., Werts, C. E. & Flaugher, R. L. (1978).
The use of analysis of covariance structures for comparing the psychometric properties of multiple variables across populations. Multivariate Behavioral Research, 13, 403-418.
Smith, P. M., Breslin, F. C., & Beaton, D. E. (2003). Questioning the stability of sense of coherence--the impact of socio-economic status and working conditions in the Canadian population. Soc Psychiatry Psychiatr Epidemiol, 38, 475-484.
Sorbom, D. (1974). A general method for studying differences in factor means and factor structure between groups. British Journal of Mathematical and Statistical Psychology, 27, 229-239.
Sorbom, D. (1982). Structural equation models with structured means. In K. G. Joreskog & H. Wold (Eds.), Systems under indirect observations, pt. 1 (pp. 183-195). Amsterdam: New Holland.
Statistics Canada. (1995). National Population Health Survey, 1994-1995 documentation. Ottawa, ONT: Statistics Canada.
Statistics Canada. (2000). National Population Health Survey, 2000-2001 documentation. Ottawa, ONT: Statistics Canada.
Statistics Canada. (2002). Population Health Surveys Program: National Population Health Survey Cycle 4 (2000-2001), household component longitudinal documentation. Ottawa, ONT: Statistics Canada.
Steenkamp, J. E. M. & Baumgartner, H. (1998). Assessing measurement invariance in cross-national consumer research. Journal of Consumer Research, 25, 78-90.
Symonds, P. M. (1924). On the loss of reliability in ratings due to coarseness of the scale. Journal of Experimental Psychology, 7, 456-461.
Velicer, W. F. & Stevenson, J. F. (1978). The relation between item format and the structure of the Eysenck Personality Inventory. Applied Psychological Measurement, 2, 293-304.
Wallihan, D. B., Stump, T. E., & Callahan, C. M. (1999). Accuracy of self-reported health services use and patterns of care among urban older adults. Med Care, 37, 662-670.
West, S. G., Aiken, L. S., & Krull, J. L. (1996).
Experimental personality designs: analyzing categorical by continuous variable interactions. J Pers, 64, 1-48.

CHAPTER 4

AN INVESTIGATION OF THE MEDIATING ROLES OF SENSE OF COHERENCE AND SELF-ESTEEM IN THE RELATIONSHIP BETWEEN SOCIOECONOMIC STATUS AND SELF-REPORTED HEALTH

4.1 INTRODUCTION

The term socioeconomic status (SES) refers to the relative hierarchical placement of an individual along a gradient stratified by social and economic resources (Guppy, 2003). It is well known that SES influences the health and well-being of populations. Numerous studies have shown that not only do those in the lowest socioeconomic strata have worse health than those in the upper strata, but that the phenomenon exists across the full socioeconomic gradient (e.g., lower middle class persons have been shown to have poorer health than upper middle class persons). These disparities exist across a full range of health outcomes, from disease-specific mortality to self-reported health. Although many theories have been proposed to explain the mechanisms behind the gradient, the focus of this investigation is limited to an examination of the degree to which the psychosocial constructs of sense of coherence and self-esteem mediate the relationship between income-related socioeconomic status and self-reported health (SRH). The paper begins with a brief review of research on the SES-health gradient, with an emphasis on the theories involving self-esteem by Siegrist and Marmot (2004) and sense of coherence by Antonovsky (1987) from which the mediation hypotheses were derived. The results of a series of structural equation models testing the hypotheses are then presented, followed by a discussion of the implications of the findings.

Background

The inverse relationship between SES and health has been the focus of a great deal of attention over the past 3 decades.
Early research on the effect of SES on health tended to be limited to investigations of the effects of poverty on health using a threshold perspective that implied that there is an inverse relationship between health and SES as long as one lives in poverty. However, increasing an individual's SES was thought to have little or no influence on health once he or she surpasses the poverty threshold (Adler & Ostrove, 1999). By the mid-1980s it became clear that the relationship between SES and health is not limited to those living in poverty. Rather, SES was shown to influence health across the full range of SES (see for example Marmot et al., 1991). Growing evidence demonstrating that the influence of SES on health extends beyond the material deprivation associated with poverty, combined with research indicating that the impact on health involves not only mortality but also many different types of morbidity (e.g., osteoarthritis, chronic disease, hypertension and cancer (Adler et al., 1994)), contributed to a rapid rise in research on the health impact of SES (Adler & Ostrove, 1999). More recently, the challenge has shifted from identifying the particular health effects associated with SES to understanding the mechanisms by which SES influences health. Although Adler, Marmot, McEwen and Stewart (1999) presented a multitude of complementary theories relating SES to health, the purpose of this investigation was to test two particular mediating mechanisms involving aspects of mental well-being assessed in the National Population Health Survey of Canada (NPHS). The first model was derived from the recent work of Siegrist and Marmot (2004) and is based on their notion that one's SES is associated with a particular psychosocial environment determined in large part by one's occupation.
This environment includes a socio-structural range of opportunities that facilitate the development of a sense of self-efficacy and self-esteem in individuals (Siegrist & Marmot, 2004). Citing Bandura's description of self-efficacy as the belief a person has in his or her ability to accomplish tasks, based on a favourable evaluation of one's competence and of expected outcomes (Bandura, 1985), Siegrist and Marmot (2004) proposed that a work-related psychosocial environment conducive to self-efficacy enables people to practice their skills and to experience control in terms of successful agency. This in turn induces feelings of mastery and control beliefs. Building on research examining the demand-control model of Karasek (1979), Siegrist and Marmot (2004) suggested that lower SES psychosocial work environments place higher demands on working persons while providing less control over one's task performance. Such conditions limit the experience of self-efficacy and ultimately enhance stressful experiences with adverse long-term consequences on health through established psychobiological pathways (e.g., activation of the hypothalamic-pituitary adrenocortical stress axis) (Siegrist & Marmot, 2004). Siegrist and Marmot (2004) suggested that a similar relationship can be found with respect to self-esteem. More specifically, SES determines one's work-related psychosocial environment, which in turn influences the opportunity for positive experiences of personal self-worth (Siegrist & Marmot, 2004). They proposed that the effort-reward imbalance model, based on the work of Siegrist (1996), represents a complementary mechanism linking the work-related psychosocial environment, and self-esteem in particular, to health.
This model is built on the notion that a lack of reciprocity between costs and gains (i.e., high cost/low gain conditions) results in a state of emotional distress with a propensity to autonomic arousal and neuroendocrine stress responses (Siegrist & Marmot, 2004). Repeated violations of normal reciprocity are hypothesized to elicit a sense of being treated unfairly and suffering injustice, which affects the self-esteem of workers (Siegrist & Marmot, 2004). On the other hand, adequate approval and esteem, whether experienced in terms of money, recognition, job promotion or job stability, are thought to enhance self-esteem and satisfaction (Siegrist & Marmot, 2004). Despite the common role of self-esteem and self-efficacy in both theories, recent research has shown that the demand-control and effort-reward imbalance models provide complementary information in terms of explaining acute myocardial infarction (Peter et al., 2002), SRH, and chronic disease (Ostry, Kelly, Demers, Mustard, & Hertzman, 2003). Although Siegrist and Marmot (2004) identified the SES-related work environment as a primary determinant of one's psychosocial environment, it seems reasonable to extend this hypothesis to consider the broader influence of SES defined by household income. An individual's level of income adequacy (i.e., household income adjusted by household size) could be viewed as a determinant of an individual's sense of self-worth and success as well as a contributor to one's overall evaluation of personal competence and feelings of mastery and control. This extrapolation of findings associated with the SES-determined psychosocial work environment to a broader living environment determined by income adequacy appears particularly suited to research in North America, where economic success is idealized and the social environment is often characterized as the 'land of opportunity' with the potential for upward mobility and material wealth (e.g., Guppy, 2003).
Siegrist and Marmot (2004) recently discussed a similar expansion of the psychosocial environment beyond one's work environment. As evidence for an expanded role of the psychosocial environment, they present the findings of a study by Chandola, Kuper, Singh-Manoux, Bartley, and Marmot (2004) showing that low control at home predicts new coronary heart disease events in women and that low control at home among women may be due in part to a lack of material and psychological resources to cope with excessive household and family demands (i.e., low levels of material and psychological resources lead to a perceived lack of control at home, which in turn contributes to a greater likelihood of experiencing coronary heart disease) (Siegrist & Marmot, 2004). Additional support for the relationship between SES and self-esteem can be found in a recent meta-analytic review by Twenge and Campbell (2002), who examined 446 samples (participant N = 312,940) and found that SES had a small but significant positive relationship with self-esteem (r = .08). They concluded that their findings were consistent with a social indicator or salience model that is based on the notion that SES is an indicator of status within social groups (Twenge & Campbell, 2002). This theory predicts that when individuals aspire to success in the form of social status and wealth, and they achieve their goals, then their self-esteem should increase. Conversely, individuals who fail to achieve social status may experience reductions in self-esteem, especially in cultures (e.g., North America) where SES is commonly perceived as changeable and earned (Twenge & Campbell, 2002). Although Siegrist and Marmot (2004) focused on the maintenance of self-esteem and self-efficacy as mechanisms linking SES to health, a similar theory can be found in Antonovsky's (1987) work on sense of coherence (SOC).
The concept of SOC was theorized to represent a single global conceptualization of the common components underlying particular coping or resistance resources that individuals have at their disposal. Examples of resistance resources include social support, educational background and financial resources (Antonovsky, 1996). More specifically, the concept of SOC is defined as:

A global orientation that expresses the extent to which one has a pervasive, enduring though dynamic feeling of confidence that (1) the stimuli, deriving from one's internal and external environments in the course of living are structured, predictable and explicable; (2) the resources are available to one to meet the demands posed by these stimuli; and (3) these demands are challenges, worthy of investment and engagement. (Antonovsky, 1987, p. 19)

Antonovsky labeled these three components of SOC as comprehensibility (i.e., the extent of the belief that the problem faced by the individual is clear), manageability (i.e., the extent of the belief that not only does one understand the problem, but that the necessary resources to successfully cope with the problem are available) and meaningfulness (i.e., the extent of the belief that coping \"makes sense\" emotionally, that one wishes to cope) (Antonovsky, 1996). Despite identifying three distinct components, Antonovsky maintained that SOC represents a single higher order generalized orientation that is universally meaningful and cross-culturally valid (Antonovsky, 1996). In Antonovsky's (1993) words:

The SOC is a construct which is universally meaningful, one which cuts across lines of gender, social class, region and culture. It does not refer to a specific type of coping strategy, but to factors which, in all cultures, always are the basis for successful coping with stressors (p. 726).
The concept of SOC can be conceived of as a stable generalized orientation in relation to perceiving and controlling the environment for meaningful and appropriate action (Kivimaki, Feldt, Vahtera & Nurmi, 2000). According to Antonovsky (1987), an individual's SOC develops through childhood into mid-to-late adolescence. Throughout this time of development, the individual is repeatedly exposed to tension states requiring an active response to a wide variety of stressors through the mobilization of appropriate available resources, labeled generalized resistance resources. In addition to facilitating the selection of culturally appropriate and situationally efficacious coping resources and behaviour, a person's SOC increases the degree to which tension states are perceived as comprehensible, manageable and meaningful, thereby reducing the potential for psychological distress (Antonovsky, 1987, 1996). Over time, the repeated management of tension events in a healthy manner reinforces the individual's SOC (e.g., the individual might perceive the world as more manageable) and increases the coping resources available to the individual in the future (i.e., part of the process of successfully dealing with a stressor often involves the development of new coping skills, which are added to the set of available resistance resources). Antonovsky (1987) further theorized that by about the end of the third decade of life, people will have been exposed to a sufficiently long and consistent pattern of life experiences that their SOC becomes a stable dispositional orientation. Antonovsky (1996) also theorized that \"by and large, however, the person with a weak SOC in adulthood will manifest a cyclical pattern of deteriorating health and a weakening SOC\" (p. 176). The implications of this statement appear to be that not only is a low SOC associated with poorer health, but that a low SOC will lead to an ongoing deterioration of health over time.
Despite Antonovsky's claims that SOC becomes more or less unchangeable by age 30, the possibility that SOC mediates the relationship between one's psychosocial environment and individual health has been explored (i.e., aspects of the psychosocial environment are associated with changes in an individual's SOC). For example, Kivimaki et al. (2002) used structural equation modeling to test the mediating properties of SOC between hostility and health in a seven-year prospective study of female employees. After adjustment for baseline characteristics, SOC was found to mediate 50% of the relationship between hostility and self-rated health. Feldt, Kinnunen, and Mauno (2000) tested the mediating properties of sense of coherence between psychosocial work characteristics (e.g., organizational climate) and general and occupational well-being in a 1-year follow-up study of Finnish employees. They found that SOC mediated the relationship between the workplace psychosocial characteristics and well-being, and that changes in SOC mediated the relationship between changes in workplace psychosocial characteristics and changes in well-being. Gana (2001) also found evidence of both mediating and moderating roles for SOC between adversity (stress, worry and anxiety) and satisfaction with life.

Specific Objectives

This study had three inter-related objectives, each of which was tested in a separate model. The first was to test the hypothesis that socioeconomic status, assessed using current household income adequacy, predicts current SRH and future change in SRH. The second objective was to test the hypothesis that current SOC mediates the relationship between current income adequacy and both current SRH and future change in SRH. The third objective was to test the hypothesis that current self-esteem mediates the relationship between current income adequacy and both current SRH and future change in SRH.
4.2 METHODS

Analysis Strategy

The use of structural equation modeling (SEM) enables the researcher to simultaneously test the mediation of both current health and future change in health through the use of a growth curve model. Independent structural equation models using longitudinal data from the NPHS of Canada were used to test the extent to which current socioeconomic status, assessed using respondents' household income adequacy, predicted current SRH and change in SRH from 1994-1995 to 2000-2001, as well as the extent to which self-esteem and SOC function as mediators.

Latent Growth Curve

A simple growth model can be thought of as a structural equation model with latent variables. Instead of thinking about $\alpha_i$ (intercept) and $\beta_i$ (slope) as random parameters, they are considered latent (i.e., unobserved) variables that vary across individuals (Muthen & Khoo, 1998). The mean of the intercept represents the average value of the concept being investigated at the beginning of the process being modeled (i.e., at the time of the first survey in 1994-1995). The mean of the slope represents the average rate of change or growth; it can take both positive and negative values and may or may not be linear depending on the model specifications. Covariates are then added to the model to explain why some individuals differ in their intercepts and rates of change (Acock & Li, 2004). In the context of this investigation, this approach allowed the researcher to develop models of variables that explain why some people have poorer health at the initial assessment as well as variables that explain why the health of some people changes more or less than others over time.

Building on the work of Baron and Kenny (1986), Hoyle and Smith (1994) outlined the following procedure for testing for mediation within a SEM framework. The first step is to test the fit of a model containing only a direct effect of SES on SRH intercepts and slopes.
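Written out, the direct-effects growth model in this first step takes roughly the following form. The symbols are a conventional latent growth notation assumed here for illustration, not a transcription of the original model syntax; the time scores 0, 1, 2 and 3 for the 1994, 1996, 1998 and 2000 assessments follow the slope loadings reported in the note to Figure 4.1.

```latex
\begin{aligned}
\text{SRH}_{it} &= \alpha_i + \lambda_t\,\beta_i + \varepsilon_{it},
  \qquad \lambda_t \in \{0, 1, 2, 3\},\\
\alpha_i &= \mu_\alpha + \gamma_\alpha\,\text{SES}_i + \zeta_{\alpha i},\\
\beta_i  &= \mu_\beta + \gamma_\beta\,\text{SES}_i + \zeta_{\beta i}.
\end{aligned}
```

In this notation, $\gamma_\alpha$ and $\gamma_\beta$ are the direct effects of SES on the SRH intercept and slope, and $\zeta_{\alpha i}$ and $\zeta_{\beta i}$ are individual deviations that are allowed to covary.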
If this model fits the data, then the indirect (i.e., mediating) effects of SES on SOC and SOC on SRH intercept and slope are added to the model. If the indirect effect of SES through the mediator (SOC) is significant, then the direct effect between SES and SRH intercept and slope will be reduced, and the mediation hypothesis is supported (Hoyle & Smith, 1994). As the direct effect approaches zero, the mediator can be said to fully account for the relationship between SES and SRH intercept and slope (Hoyle & Smith, 1994). If the indirect effect is significant and the direct effect remains greater than zero, then the mediator partially accounts for the relationship between SES and the SRH intercept and slope (Hoyle & Smith, 1994). See Figure 4.1 for a diagram of the full mediation model.

[Figure 4.1: path diagram with observed indicators SRH 1994, SRH 1996, SRH 1998 and SRH 2000; legend: Direct Effect, Covariance.]

Note: Inc. = Income adequacy in 1994-1995, SRH Int. = Intercept for SRH latent growth curve, SRH Slop. = Slope for SRH latent growth curve and Med. = the latent mediating variable (e.g., self-esteem). The mediating variable was self-esteem in one set of models and SOC in the other set of models. Specific details on the measurement structure of self-esteem can be found in paper 1 of this dissertation (see p. 22) and the measurement structure of sense of coherence can be found in paper 2 of this dissertation (see p. 56). The loadings of SRH 1994, 1996, 1998 and 2000 onto the slope were fixed to 0, 1, 2 and 3, respectively. All the SRH loadings onto the intercept were fixed to 1.

Figure 4.1. Mediation model examining the extent to which mental well-being mediates the relationship between income adequacy and self-reported health.

Data

The longitudinal data analyzed in this study were collected as part of the health component of the 1994-1995 and 1998-1999 National Population Health Survey of Canada (Statistics Canada, 2002).
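The mediation logic outlined under Analysis Strategy can be sketched with the product-of-coefficients approach. This is a minimal illustration only: the path values below are hypothetical and are not results from this study, and the Sobel z statistic is one common way to test an indirect effect, not necessarily the test used in the models reported here.

```python
import math


def indirect_effect(a, b):
    # a: path from the predictor (e.g., SES) to the mediator.
    # b: path from the mediator to the outcome (e.g., SRH intercept).
    return a * b


def sobel_z(a, se_a, b, se_b):
    # First-order Sobel standard error for the product a*b.
    se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    return (a * b) / se_ab


def proportion_mediated(a, b, direct):
    # Share of the total effect (direct + indirect) carried by the mediator.
    indirect = a * b
    return indirect / (indirect + direct)


# Hypothetical path estimates and standard errors, for illustration only.
a, se_a = 0.30, 0.05
b, se_b = 0.40, 0.04
direct = 0.18

print(f'indirect effect: {indirect_effect(a, b):.3f}')
print(f'Sobel z: {sobel_z(a, se_a, b, se_b):.2f}')
print(f'proportion mediated: {proportion_mediated(a, b, direct):.2f}')
```

With a significant indirect effect and a direct effect that remains above zero, as in these made-up numbers, the pattern would be read as partial mediation in the sense of Hoyle and Smith (1994).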
The sampling design for the NPHS is a multi-stage stratified cluster design based on an existing Labour Force Survey. In addition to the survey data, the NPHS also includes sample weights that reflect the probability of an individual being selected for inclusion in the initial 1994-1995 wave of the survey. Following the recommendations of Kaplan and Ferguson (1999) and Statistics Canada (2002), the longitudinal square weights for each age sub-group were normalized (i.e., the weight for each respondent was divided by the average weight for their group) to take into account the probability of selection while controlling for an artificial weight-induced increase in sample size. The response rate for the health component of the Cycle 1 (1994-1995) NPHS was 83.6% at the Canada level and ranged from 77.8% in Ontario to 89.1% in Alberta. The Cycle 4 (2000-2001) longitudinal response rate for the health component of the NPHS was 84.8% at the Canada level and ranged from 80.5% in British Columbia to 90.9% in Saskatchewan. The survey includes household residents in all provinces of Canada with the exception of populations on Indian Reserves and Crown Lands, residents of health care institutions, full-time members of the Canadian Armed Forces and some remote areas in Ontario and Quebec (Statistics Canada, 2002). The final sample of 4842 cases examined in the analysis was formed by selecting respondents aged 30 to 60 years at the initiation of the NPHS with no missing data on the variables examined in the study. Demographics for this sample are reported in Table 4.1.

Instruments

Socioeconomic status

Socioeconomic status was assessed using the proxy indicator of income adequacy. The 5 ordinal categories of income adequacy included in the NPHS were derived by adjusting total household income by household size (Statistics Canada, 2002).
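Returning briefly to the weight normalization described in the Data section, the following is a minimal sketch of that step, assuming the weights for one age sub-group are held in a plain list. The function name `normalize_weights` and the example weights are hypothetical and are not NPHS values; the only rule taken from the text is that each respondent's weight is divided by the average weight of the group.

```python
from statistics import mean


def normalize_weights(weights):
    """Divide each sampling weight by the mean weight of its group.

    Hypothetical helper: the name and interface are illustrative only.
    """
    average = mean(weights)
    return [w / average for w in weights]


# Made-up longitudinal weights for one age sub-group (not NPHS data).
raw_weights = [1200.0, 800.0, 1500.0, 500.0]
normalized = normalize_weights(raw_weights)

# After normalization the weights sum to the number of respondents,
# so the weighted sample size matches the actual sample size.
print(normalized)
print(sum(normalized), len(raw_weights))
```

Because the normalized weights average to 1, relative selection probabilities are preserved while the effective sample size used in significance tests is not inflated.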
Details on the creation of specific income adequacy categories can be found in paper 1 of this dissertation (see p. 22).

Sense of Coherence Scale

The widely used 13-item version of the SOC scale was included in the National Population Health Survey. The dimension of manageability is addressed by questions 3, 4, 8, and 10. Questions 1, 9, 11, and 13 measure meaningfulness, while questions 2, 5, 6, 7 and 12 assess comprehensibility. Responses to each question are scored on a Likert-type scale ranging from 1 to 7. After reversing scores for questions Q1, Q2, Q3, Q8 and Q13, a total SOC score is created by summing responses to all thirteen items. Higher scores indicate a stronger sense of coherence. For more detailed information on the SOC, the reader is referred to paper 2 in this dissertation (see p. 37).

Rosenberg Self-esteem Scale

As described in the introduction section of this paper, an adapted 6-item version of the RSES was used in the NPHS. Respondents' answers to the 6 questions were scored using the following 5-point Likert scale: Strongly disagree = 0, Disagree = 1, Neither agree nor disagree = 2, Agree = 3, Strongly agree = 4. After reversing the score for Q6, a total score for the RSES is created by summing the responses, with higher scores indicating greater self-esteem. For more detailed information on the version of the Rosenberg self-esteem scale included in the NPHS, the reader is referred to paper 1 in this dissertation (see p. 18).

4.3 RESULTS

Demographics

A summary of the sample's demographic characteristics is presented in Table 4.1.

Table 4.1. Sample characteristics.
Demographic (N = 4842)                              Percent (95% CI)
Gender (% female)                                   53.0 (51.6, 54.4)
Income adequacy (%)
  Lowest                                            3.8 (3.3, 4.3)
  Low-middle                                        8.1 (7.3, 8.9)
  Middle                                            25.3 (24.1, 26.5)
  Upper-middle                                      41.5 (40.1, 42.9)
  Highest                                           21.4 (20.2, 22.6)
Education level (%)
  < Secondary                                       16.9 (15.8, 18.0)
  Secondary graduate                                15.5 (14.5, 16.5)
  Some post secondary                               26.2 (25.0, 27.4)
  College/University degree                         41.1 (39.7, 42.5)
Mean age in years (95% CI)                          42 (42, 43)
Mean self-reported health (95% CI)
  (possible range 0 to 4)                           2.9 (2.8, 2.9)
Self-reported health (%)
  Poor                                              1.3 (1.0, 1.6)
  Fair                                              6.0 (5.3, 6.7)
  Good                                              25.6 (24.4, 26.8)
  Very good                                         39.6 (38.2, 41.0)
  Excellent                                         27.5 (26.2, 28.8)
Mean self-esteem score (95% CI)
  (possible range 0 to 24)                          20 (20, 21)
Mean sense of coherence score (95% CI)
  (possible range 0 to 78)                          59 (59, 60)

SOC Mediation Results

The first model in the test for mediation by SOC contains a direct effect of income adequacy on the SRH growth curve intercept and slope as well as the effects of the covariates age and gender (see Model 1 in Table 4.2). The covariate effects consisted of a direct effect of gender and age on the intercept as well as a direct effect of age on the SRH growth curve slope. In the interest of model parsimony, a direct effect of gender on slope was not included in the model because gender was not theorized to affect the rate of decline in SRH. The model did not pass the $\chi^2$ test of exact fit ($\chi^2_{12} = 61.4$, p < .01) but had good fit according to the close-fit criteria of Hu and Bentler (1999), who suggested the following cut-offs for good fit: RMSEA < .06, CFI > .95 and SRMR < .08 (see Table 4.2). Although the direct effect of income adequacy on the SRH intercept was significant (t = 15.8, p < .05), the effect of income on slope was not significant (t = -.008, p > .05).
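The close-fit screen quoted above can be expressed as a small check. This is only a sketch: the function name `close_fit` is hypothetical, the thresholds are the Hu and Bentler (1999) cut-offs as cited in the text, and the example values are the Model 1 indices reported in Table 4.2.

```python
def close_fit(rmsea, cfi, srmr):
    """Return True when all three indices meet the cut-offs cited in the text.

    Thresholds (RMSEA < .06, CFI > .95, SRMR < .08) follow the criteria
    attributed to Hu and Bentler (1999); the function itself is illustrative.
    """
    return rmsea < 0.06 and cfi > 0.95 and srmr < 0.08


# Model 1 fit indices as reported in Table 4.2.
model_1_ok = close_fit(rmsea=0.029, cfi=0.99, srmr=0.013)
print(model_1_ok)
```

A model can fail the exact-fit test and still pass this screen, which is the situation described for Model 1.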
The results of the mediation model (see Model 2 in Table 4.2) indicate that the overall model fits well (e.g., CFI = .95, RMSEA = .039 and SRMR = .033), the effect of income on SOC is significant (t = 7.1, p < .05), and the effect of SOC on the SRH intercept is significant (t = 17.8, p < .01). Although the direct effect of income on the SRH intercept remained significant (t = 14.1, p < .05), the size of the unstandardized coefficient was reduced by about 13%, from .190 to .165 (see Table 4.2).

Table 4.2. Results for models examining sense of coherence as a mediator.

Model                                           Model 1              Model 2
Chi-square (df)                                 61.4 (12)            1362.7 (162)
CFI                                             .99                  .95
RMSEA (90% CI)                                  .029 (.022, .037)    .039 (.037, .041)
SRMR                                            .013                 .033
SRH intercept (i.e., mean SRH in 1994-1995)     2.86                 2.90
SRH slope (i.e., average rate of change of SRH) .019 (NS)            .014 (NS)
Gender - SRH intercept                          -.022 (NS)           .014 (NS)
Age - SRH intercept                             -.016                -.017
Age - SRH slope                                 -.001 (NS)           -.001 (NS)
Income - SRH intercept                          .190                 .165
Income - SRH slope                              -.008 (NS)           -.006 (NS)
Income - SOC                                    *                    .115
SOC - SRH intercept                             *                    .235
SOC - SRH slope                                 *                    -.018

Note: * indicates not applicable. All coefficients significant (p < .05) unless denoted as not significant (NS).

Self-esteem Mediation Results

Similar to the analysis for mediation by SOC, the first model in the test for mediation by self-esteem contained a direct effect of income adequacy on the SRH growth curve intercept and slope as well as the effects of the covariates age and gender (see Model 1 in Table 4.3). Added covariate effects included a direct effect of gender and age on the intercept as well as a direct effect of age on the SRH growth curve slope. The model did not pass the $\chi^2$ test of exact fit ($\chi^2_{12} = 61.4$, p < .01) but was considered to fit well according to the close-fit criteria of Hu and Bentler (1999), who suggested the following cut-offs for good fit: RMSEA < .06, CFI > .95 and SRMR < .08 (see Table 4.3).
Although the direct effect of income adequacy on the SRH intercept was significant (t = 15.8, p < .05), the effect of income on slope was not significant (t = -.008, p > .05).

To test for mediation, self-esteem was initially modeled as a single construct indicated by all questions. This model fit poorly by measures of both exact fit ($\chi^2_{60} = 1860$, p < .01) and close fit (CFI = .91, RMSEA = .079, and SRMR = .039). A review of the modification indices indicated that the primary source of misfit in this model was substantial correlation between several self-esteem item error variances, suggesting that the six items measure more than one dimension of self-esteem. Utilizing the results of paper 1 in this dissertation (see p. 27), self-esteem was then modeled as two separate mediating factors acting simultaneously (see Model 2 in Table 4.3). The first factor (SE-f1) represents a measure of self-competence and was assessed by Q1, Q2 and Q3. The second factor (SE-f2) represents a measure of self-liking and was measured using responses to Q4, Q5 and Q6. The disturbances (i.e., residuals) associated with SE-f1 and SE-f2 were allowed to correlate, suggesting that the covariance between these two dimensions arises for reasons additional to those specified in the model. In terms of the diagram in Figure 4.1, income adequacy predicted both SE-f1 and SE-f2, each of which then predicted the SRH intercept and slope. This initial model did not fit the data according to the test of exact fit ($\chi^2_{56} = 473.2$, p < .01), but could be considered a good fit by the indices of close fit (CFI = .98, RMSEA = .039, and SRMR = .026). From Model 2 in Table 4.3, it can be seen that the two-factor representation of self-esteem partially mediated the relationship between income adequacy and SRH intercept.
More specifically, the direct effect of income on the SRH intercept remained significant (t = 15.2, p < .05) and the size of the unstandardized coefficient was reduced by about 7%, from .190 to .177 (see Model 2 in Table 4.3). A suppressor effect was also found in the relationship between self-esteem and SRH intercept. The SE-f1 factor, described as a measure of self-liking in paper 1, appeared to suppress the common variance in SE-f2 (self-competence) that was not related to SRH. A more detailed review of suppressor effects can be found in paper 1 of this dissertation (see p. 28).

Table 4.3. Results for models examining self-esteem as a mediator.

Model                                           Model 1             Model 2                        Model 3 (SE-f1)     Model 4 (SE-f2)
Chi-square (df)                                 61.4 (12)           473.2 (56)                     190.9 (30)          215.0 (30)
CFI                                             .99                 .98                            .99                 .99
RMSEA (90% CI)                                  .029 (.022, .037)   .039 (.036, .039)              .033 (.029, .038)   .036 (.031, .040)
SRMR                                            .013                .026                           .022                .025
SRH intercept (i.e., mean SRH in 1994-1995)     2.86                2.87                           2.84                2.88
SRH slope (i.e., average rate of change of SRH) .019 (NS)           .015 (NS)                      .020 (NS)           .014 (NS)
Gender - SRH int.                               -.022 (NS)          -.01 (NS)                      -.014 (NS)          .007 (NS)
Age - SRH int.                                  -.016               -.017                          -.016               -.017
Age - SRH slope                                 -.001 (NS)          -.001 (NS)                     -.001 (NS)          -.001 (NS)
Income - SRH int.                               .190                .177                           .181                .176
Income - SRH slope                              -.008 (NS)          -.006 (NS)                     -.006 (NS)          -.006 (NS)
Income - SE                                     N/A                 SE-f1: .024; SE-f2: .039       .024                .038
SE - SRH int.                                   N/A                 SE-f1: -.165; SE-f2: .480      .387                .413
SE - SRH slope                                  N/A                 SE-f1: -.017 (NS); SE-f2: -.036  -.058             -.047

Note: All coefficients significant (p < .05) unless denoted as not significant (NS); Model 2 self-esteem measured using two correlated factors f1 (Q1-Q3) and f2 (Q4-Q6); Model 3 self-esteem measured using only f1 (Q1-Q3); Model 4 self-esteem measured using only f2 (Q4-Q6).

To examine the independent effect of SE-f1, the mediating properties of SE-f1 alone were then tested (see Model 3 in Table 4.3).
This model fit the data well (CFI = .99, RMSEA = .033, and SRMR = .022) and indicated that SE-f1 partially mediated the relationship between income and SRH intercept. More specifically, the direct effect of income on the SRH intercept remained significant (t = 15.3, p < .05) and the size of the unstandardized coefficient was reduced by about 5%, from .190 to .181 (see Model 3 in Table 4.3). A similar effect was found when SE-f2 was modeled as the mediator (see Model 4 in Table 4.3), though the direct effect of income on SRH intercept was reduced by about 7.4%, from .190 to .176.

4.4 DISCUSSION

The purpose of this study was to use a latent growth model to test the hypothesis that mental well-being (i.e., self-esteem and SOC) mediates the effect of SES (i.e., income adequacy) on both current SRH and future trajectory of SRH over the subsequent 6 years. The results suggested that SES assessed by income adequacy is positively related to SRH assessed at the same time but not to the linear change in SRH over the next 6 years. Despite the small effect of income adequacy on the SRH intercept (e.g., income adequacy accounted for about 7.3% of the variation in the SRH intercept), both SOC and self-esteem appeared to partially mediate the impact of income adequacy on SRH. Although the mediation was statistically significant (i.e., all pathways involving the mediators were statistically significant), the indirect effect of income adequacy on SRH through SOC only explained about 1% of the variation in the SRH intercept (i.e., 13% of 7.3%). The indirect effect of income adequacy on SRH through self-esteem explained a maximum of 0.5% of the variation in the SRH intercept. Although the extent of the mediation observed in this study was small, the results support the hypothesis that aspects of mental well-being partially mediate the effect of income-related SES on current SRH.
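The arithmetic behind these percentages can be retraced in a few lines. The sketch below is illustrative only: the helper name `percent_reduction` is hypothetical, the coefficients are the unstandardized income to SRH intercept estimates from Tables 4.2 and 4.3, and multiplying the percent reduction by the 7.3% of explained variance simply follows the shorthand used in the text ("13% of 7.3%") rather than a formal decomposition of indirect effects.

```python
def percent_reduction(direct_before, direct_after):
    """Percent drop in the direct effect once a mediator is added."""
    return 100.0 * (direct_before - direct_after) / direct_before


# Unstandardized income -> SRH intercept coefficients (Tables 4.2 and 4.3).
soc_reduction = percent_reduction(0.190, 0.165)    # SOC as mediator
se_f2_reduction = percent_reduction(0.190, 0.176)  # SE-f2 alone as mediator

# Share of SRH-intercept variance attributed to income adequacy (from the text).
explained_by_income = 7.3

# Text's shorthand: percent reduction applied to the explained variance.
soc_share = explained_by_income * soc_reduction / 100.0
se_share = explained_by_income * se_f2_reduction / 100.0

print(round(soc_reduction, 1), round(soc_share, 2))
print(round(se_f2_reduction, 1), round(se_share, 2))
```

Under these assumptions the SOC pathway corresponds to roughly 1% of the variation in the SRH intercept and the largest self-esteem pathway to roughly 0.5%, in line with the figures reported above.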
While the cross-sectional nature of the relationship between SES and SRH intercept prevents this study from proving that the SES-SRH relationship is partially mediated by mental well-being, the findings do support this proposition. Had the findings not indicated that a significant mediating relationship existed, the results would have provided strong evidence against the mediating hypothesis (i.e., the associations observed in this study provide necessary but not sufficient evidence for the hypothesized mediation relationship).

With regard to self-esteem, it is reassuring to find that the results of paper 1 in this dissertation were supported in this study of a much larger data set (British Columbia sample versus a national sample). Not only did the single-factor representation of the six items result in a significant deterioration in model fit, but the two-factor representation proposed in paper 1 fit well and the suppressor effect was found when predicting SRH. The SE-f1 factor, described as a measure of self-liking in paper 1, appeared to suppress the common variance it shares with SE-f2 (self-competence) that is not related to the SRH intercept. It is interesting that the suppressor variable switched from SE-f1 in paper 1 to SE-f2 in this study. This role switching may reflect dominance of the relationship between aspects of self-esteem more closely related to self-competence and self-efficacy (i.e., SE-f2) and SRH over the relationship between aspects of self-esteem related to self-liking (i.e., SE-f1) and SRH.

The correlation between SES and health found in the models tested in this investigation supports the hypothesis that SES influences health; however, it is also possible that the observed SES-health relationship may be explained by a social drift phenomenon based on the notion that deteriorating health causes reductions in SES (e.g., Dahl, 1993; MacIntyre, 1997).
Although the cross-sectional nature of the data analyzed in this study precludes testing of this effect, recent investigations using SEM to simultaneously examine cross-lagged panel effects suggest that, from a population perspective, the influence of SES on health is far greater than the influence of health on SES. For example, Chandola, Bartley, Sacker, Jenkinson and Marmot (2003) examined 10 years of longitudinal data from the Whitehall II study and found no evidence of an effect of mental or physical health on future changes in employment grade. When financial deprivation was used as a measure of social position, a significant effect of mental health on changes in social position among men was found, though the size of the effect was over two and a half times smaller than the effect of social position on health (Chandola et al., 2003).

In this study, current SES did not predict future changes in the trajectory of SRH. Although the lack of a significant relationship between SES and health trajectory applies, on average, to the general population of Canada, it is possible that direct and mediating effects predicting rate of change in SRH might be found in specific subpopulations with sustained deteriorations in health status (the SRH slope was not significantly different from zero in this study). For example, it would be interesting to test the models examined in this study in a population of patients with rheumatoid arthritis experiencing different rates of deterioration, or other chronic diseases that have acute exacerbations combined with a progressive decline in health status.

Another possible explanation for the small mediation effects and lack of significant relationship between SES and SRH trajectory may be the use of income adequacy as the sole indicator of SES. Instead of using SES indicators based on economic resources, many European investigators have elected to use indicators of more socially defined classes (Guppy, 2003).
For example, Marmot, Shipley, Brunner, and Hemingway (2001) used indicators of occupational position and Stronks et al. (1997) employed measures of education. Although these socially oriented assessments appear to be assessing unique aspects of SES, a similar SES-health gradient has been identified in many Canadian studies using household income adjusted by household size as a measure of SES and self-reported health as an outcome measure (e.g., McLeod et al., 2003) as well as in studies using household income to predict mortality and health service utilization (e.g., Mustard, Derksen, Berthelot, Wolfson, & Roos, 1997). Additionally, when Geyer and Peter (2000) examined the relationship between SES and mortality using income, occupational position and qualification (i.e., education), they found that income was to a large extent independent of qualification and occupational position; however, mortality related to the effects of income overrode those of the other SES indicators (Geyer & Peter, 2000). The results of this study speak only to the socioeconomic gradient related to economic resources. Further research is needed to replicate the findings of Geyer and Peter (2000) and to clarify the relationships between different aspects of SES (e.g., social versus economic gradients) and their relationships with health.

Some readers might suggest that the use of a single-item 5-point SRH scale as a measure of health status might have contributed to the small effects found in this study. However, extensive research has demonstrated both the validity and predictive utility of this measure in general population surveys. For a comprehensive review of supporting literature, the reader is referred to Idler and Benyamini (1997). The treatment of SRH responses as continuous responses is also clearly supported by simulation studies discussed in paper 2 of this thesis.
Although the size of the relationship between income adequacy and SRH (i.e., 7.3% explained variance) might be considered very modest by some, it is in line with similar nationally representative studies using more complicated assessments of health and SES (e.g., Mulatu and Schooler (2002) reported 6% explained variation).

Another plausible explanation for the results in this study may be related to the restricted age range of the sample. All respondents were aged 60 or less at the first assessment. It is possible that socioeconomic disparities in health might not fully manifest themselves until people reach older age. However, research has shown that midlife represents the period where social inequalities in health manifest themselves most strongly (Siegrist & Marmot, 2004; Mishra, Ball, Dobson, & Byles, 2004). Similar results in a Canadian setting were reported in an extensive study by Mustard et al. (1997). Not only did Mustard et al. find that the strongest relationship between SES and health was in young and middle-aged adults aged 30-64, they also reported that this relationship was largest when using income quartiles as an indicator of SES as opposed to education.

The lack of a relationship between SES and SRH trajectory may be related to the choice of a linear model to represent change over time. However, in situations where a restricted portion of the life span has been observed (such as 6 years) using only a few assessment points, the curve can be said to capture piecewise growth and is probably best modeled using linear or quadratic functions of time (Willet & Keiley, 2000). Although a growth curve representing the health of individuals over their entire life course may best be modeled with a logarithmic-type function that accommodates an increase in the rate of declining health towards the end of life, the length of follow-up was only 6 years and included respondents between 30 and 65 years of age.
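For reference, the linear specification under discussion can be written out as follows. This is a sketch in assumed notation rather than the exact parameterization estimated in the study: $y_{it}$ denotes SRH for person $i$ at wave $t$, $x_i$ denotes income adequacy, and the symbols $\mu$, $\gamma$, $\varepsilon$ and $\zeta$ are introduced here only for illustration. The time scores $\lambda_t$ follow the slope loadings reported in the note to Figure 4.1, and age, gender and the mediators are omitted for brevity.

```latex
\begin{aligned}
y_{it}   &= \alpha_i + \lambda_t\,\beta_i + \varepsilon_{it},
            \qquad \lambda_t \in \{0, 1, 2, 3\},\\
\alpha_i &= \mu_\alpha + \gamma_\alpha\,x_i + \zeta_{\alpha i},\\
\beta_i  &= \mu_\beta + \gamma_\beta\,x_i + \zeta_{\beta i}.
\end{aligned}
```

Here $\gamma_\alpha$ corresponds to the income to SRH intercept path and $\gamma_\beta$ to the income to SRH slope path; each person has their own slope $\beta_i$, but change is constrained to be linear across the four waves.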
Given the restricted age range and relatively short period of follow-up, a growth model that enabled individuals to possess different, albeit linear, slopes is probably appropriate.

The results of this investigation add yet another piece of information to the literature examining mechanisms underlying the SES-health gradient. Although the amount of explained variation in SRH was small, it is possible that larger and possibly complementary mediating effects might be observed for other aspects of mental well-being not examined in this study. Future research examining other components of SES (e.g., occupation) in conjunction with tests of the mediating properties of additional components of mental well-being is needed to fully understand the complex relationships between SES and health.

4.5 REFERENCE LIST

Acock, A. C., & Li, F. (2004). Latent growth curve analysis: A gentle introduction. Retrieved on June 5, 2004 from http://www.hhs.oregonstate.edu/hdfs/acock/pdfs-docs/lgcgeneral.pdf
Adler, N. E., Boyce, T., & Chesney, M. A. (1994). Socioeconomic status and health: The challenge of the gradient. American Psychologist, 49, 15-24.
Adler, N. E., Marmot, M., McEwen, B. S., & Stewart, J. (Eds.) (1999). Socioeconomic status and health in industrial nations: social, psychological and biological pathways. New York, NY: Annals of the New York Academy of Sciences.
Adler, N. E. & Ostrove, J. M. (1999). Socioeconomic status and health: What we know and what we don't. Ann N Y Acad Sci, 896, 3-15.
Antonovsky, A. (1996). The sense of coherence: An historical and future perspective. Isr J Med Sci, 32, 170-178.
Antonovsky, A. (1993). The structure and properties of the sense of coherence scale. Soc Sci Med, 36, 725-733.
Antonovsky, A. (1987). Unraveling the mystery of health: How people manage stress and stay well. San Francisco, CA: Jossey-Bass.
Bandura, A. (1985). Social foundations of thought and action. New Jersey: Englewood Cliffs.
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. J Pers Soc Psychol, 51, 1173-1182.
Chandola, T., Bartley, M., Sacker, A., Jenkinson, C., & Marmot, M. (2003). Health selection in the Whitehall II study, UK. Soc Sci Med, 56, 2059-2072.
Chandola, T., Kuper, H., Singh-Manoux, A., Bartley, M., & Marmot, M. (2004). The effect of control at home on CHD events in the Whitehall II study: Gender differences in psychosocial domestic pathways to social inequalities in CHD. Soc Sci Med, 58, 1501-1509.
Dahl, E. (1993). High mortality in lower salaried Norwegian men: The healthy worker effect. Journal of Epidemiology and Community Health, 47, 192-194.
Feldt, T., Kinnunen, U., & Mauno, S. (2000). A mediational model of sense of coherence in the work context: A one-year follow-up study. Journal of Organizational Behavior, 27, 461-476.
Gana, K. (2001). Is sense of coherence a mediator between adversity and psychological well-being in adults? Stress and Health, 17, 77-83.
Geyer, S. & Peter, R. (2000). Income, occupational position, qualification and health inequalities - competing risks? (comparing indicators of social status). J Epidemiol Community Health, 54, 299-305.
Guppy, N. (2003). Socioeconomic status. In James J. Ponzetti (Ed.), International encyclopedia of marriage and family (pp. 1545-1549). New York: Macmillan Reference.
Hoyle, R. H. & Smith, G. T. (1994). Formulating clinical research hypotheses as structural equation models: a conceptual overview. J Consult Clin Psychol, 62, 429-440.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.
Idler, E. L. & Benyamini, Y. (1997). Self-rated health and mortality: A review of twenty-seven community studies.
J Health Soc Behav, 38, 21-37.
Kaplan, D., & Ferguson, A. J. (1999). On the utilization of sample weights in latent variable models. Structural Equation Modeling, 6, 305-321.
Karasek, R. (1979). Job demands, job decision latitude and mental strain: Implications for job redesign. Administrative Science Quarterly, 24, 285-306.
Kivimaki, M., Feldt, T., Vahtera, J., & Nurmi, J. E. (2000). Sense of coherence and health: Evidence from two cross-lagged longitudinal samples. Soc Sci Med, 50, 583-597.
MacIntyre, S. (1997). The Black Report and beyond: what are the issues? Soc Sci Med, 44, 723-745.
Marmot, M., Shipley, M., Brunner, E., & Hemingway, H. (2001). Relative contribution of early life and adult socioeconomic factors to adult morbidity in the Whitehall II study. J Epidemiol Community Health, 55, 301-307.
Marmot, M. G., Smith, G. D., Stansfeld, S., Patel, C., North, F., Head, J., White, I., Brunner, E., & Feeney, A. (1991). Health inequalities among British civil servants: The Whitehall II study. Lancet, 337, 1387-1393.
McLeod, C. B., Lavis, J. N., Mustard, C. A., & Stoddart, G. L. (2003). Income inequality, household income, and health status in Canada: A prospective cohort study. Am J Public Health, 93, 1287-1293.
Mishra, G. D., Ball, K., Dobson, A. J., & Byles, J. E. (2004). Do socioeconomic gradients in women's health widen over time and with age? Soc Sci Med, 58, 1585-1595.
Mulatu, M. S. & Schooler, C. (2002). Causal connections between socio-economic status and health: reciprocal effects and mediating mechanisms. J Health Soc Behav, 43, 22-41.
Mustard, C. A., Derksen, S., Berthelot, J. M., Wolfson, M., & Roos, L. L. (1997). Age-specific education and income gradients in morbidity and mortality in a Canadian province. Soc Sci Med, 45, 383-397.
Muthen, B. O. & Khoo, S. (1998). Longitudinal studies of achievement growth using latent variable modeling. Learning & Individual Differences, 10, 73-101.
Ostry, A. S., Kelly, S., Demers, P. A., Mustard, C., & Hertzman, C. (2003). A comparison between the effort-reward imbalance and demand control models. BMC Public Health, 3, 1-9.
Peter, R., Siegrist, J., Hallqvist, J., Reuterwall, C., & Theorell, T. (2002). Psychosocial work environment and myocardial infarction: Improving risk estimation by combining two complementary job stress models in the SHEEP Study. J Epidemiol Community Health, 56, 294-300.
Siegrist, J. (1996). Adverse health effects of high-effort/low-reward conditions. J Occup Health Psychol, 1, 27-41.
Siegrist, J. & Marmot, M. (2004). Health inequalities and the psychosocial environment-two scientific challenges. Soc Sci Med, 58, 1463-1473.
Statistics Canada (2002). Population Health Surveys Program: National Population Health Survey Cycle 4 (2000-2001), household component longitudinal documentation. Ottawa, ONT: Statistics Canada.
Stronks, K., van de Mheen, H. D., Looman, C. W., & Mackenbach, J. P. (1997). Cultural, material, and psychosocial correlates of the socioeconomic gradient in smoking behavior among adults. Prev Med, 26, 754-766.
Twenge, J. M. & Campbell, W. K. (2002). Self-esteem and socioeconomic status: A meta-analytic review. Personality & Social Psychology Review, 6, 59-71.
Willet, J. B., & Keiley, M. B. (2000). Using covariance structure analysis to model change over time. In H. E. A. Tinsley & S. D. Brown (Eds.), Handbook of Applied Multivariate Statistics and Mathematical Modeling (pp. 665-694). Academic Press.

CHAPTER 5

CONCLUSION AND DISCUSSION

The program of research of this dissertation was centred on the application of structural equation modeling (SEM) techniques to examine the measurement of mental well-being (MWB) and its relationship with health status using longitudinal data from the Canadian National Population Health Survey (NPHS).
The first manuscript examined the validity of responses to the 6-item version of the Rosenberg Self-Esteem Scale (RSES) in terms of its underlying factor structure. The second manuscript simultaneously examined the validity of scores on Antonovsky's Sense of Coherence Scale (SOC) in terms of factor structure stability over time within and between age groups. This second manuscript tested the hypothesis that SOC moderates (i.e., buffers) the health impact associated with the experience of a recent stressful life event. The third manuscript presented the results of a series of structural equation models examining the extent to which self-esteem and SOC mediate the impact of socioeconomic status (SES) on current self-rated health and the subsequent rate of change in self-rated health over the following 6 years.

The remainder of this discussion is composed of two sections. The first examines the specific contributions of the manuscripts to the program of research. The second section offers a discussion of the broader implications of the findings and provides specific recommendations regarding future research on MWB and its relationship to health status in the context of general population health surveys.

The first manuscript presented in this dissertation used data from the British Columbia sample of the NPHS to examine the dimensionality of the 6-item version of the Rosenberg Self-esteem Scale (RSES) included in the survey. Nested confirmatory factor analyses suggested that the RSES items included in the NPHS measure two correlated dimensions of self-esteem. The first dimension appears to represent a measure of self-competence while the second is interpreted as a measure of self-liking. Subsequent tests of predictive power and discriminant validity supported the two-factor interpretation in that the two factors had substantially different relationships with the theoretically related measures of happiness, negative affect and anxiety.
Over and above differing correlations, latent variable regressions indicated that the self-competency factor consistently suppressed irrelevant variance in the self-liking factor when predicting happiness, negative affect and anxiety. The reported two-dimensional structure of the RSES also supported earlier work by Tafarodi and Swann (2001), who found evidence of a similar structure for self-esteem and for the RSES items examined in this study.

Additional support for the two-dimensional representation of self-esteem can be found in the third manuscript of this dissertation, where the mediating properties of self-esteem were tested using the much larger national NPHS data set (versus the British Columbia sample used in the first manuscript). Modeling self-esteem as a single factor (i.e., as is implied by the description of the scale in the NPHS) caused substantial misfit in the model (e.g., the mediation model as a whole did not pass the criteria for good fit proposed by Hu and Bentler (1999)). Modeling self-esteem using the two-dimensional structure reported in the first manuscript led to a substantial improvement in model fit, with the model being classified as good fitting according to the criteria of Hu and Bentler (1999). Without prior knowledge of the recommended two-dimensional approach to measuring self-esteem, many researchers would likely have resorted to reviewing various model diagnostics (e.g., residuals and modification indices) in order to locate the source of misfit in the model before interpreting any coefficients. In relatively uncomplicated models, such as the mediation model tested in the third manuscript, it is often possible to identify sources of misfit and re-specify the model accordingly.
However, re-specifying the model under such circumstances calls into question the confirmatory nature of the model testing procedure (i.e., the process becomes more exploratory and data driven) and can lead to the over-fitting of models (i.e., models are re-specified to fit unique aspects of the sample being analyzed). In more complicated models, it may not be possible to identify the source of misfit using model diagnostics and the model may have to be rejected. The lack of model fit may lead the researcher to draw inappropriate conclusions about the model being tested (e.g., that the data do not support a mediating hypothesis) or, in the worst case, cause the researcher to abandon the original hypothesis. This last point has direct implications for researchers considering the inclusion of self-esteem in models using NPHS data. These researchers should consider the conceptualization of self-esteem as two correlated but separate factors and how this might influence their hypotheses. For example, Siegrist and Marmot (2004) distinguished between mediators involving self-efficacy and self-liking for the relationship between psychosocial work environment and health status. The second manuscript contained the results of an investigation of the age-based measurement invariance, temporal stability and moderating ability of Antonovsky's sense of coherence (SOC) scale using longitudinal data from the NPHS. In Study 1 of this investigation, multi-group longitudinal factor analysis was used to test the measurement invariance across age and stability over time of the SOC scale using longitudinal data from the following three age groups: 19 to 25 years; 30 to 55 years; and 60 plus years. The results indicated that the psychometric structure of the SOC scale is invariant across age groups and support the appropriateness of comparing SOC scores among individuals 19 years of age and older. 
Small differences in the latent means and stability coefficients (i.e., test-retest correlation) across the three age groups provided moderate support for the stability of SOC in the general population of Canada over 30 years of age. However, there was some evidence that, on average, the SOC of individuals continues to develop slightly into older age, though the rate of increase appears to diminish as age increases. Building on the results of the first study, Study 2 of the second manuscript contained the results of a series of regression models testing the ability of SOC to moderate (i.e., buffer) the health impacts associated with the experience of a recent stressful life event using data from the 1998-1999 and 2000-2001 NPHS. A significant moderating effect in the expected direction was found in models using self-reported health (SRH) as a measure of health status. However, when the number of self-reported visits to a physician during the previous year was used as an outcome, the moderating effect of SOC was not significant. The presence of a significant moderation effect in the expected direction supports Antonovsky's (1987) conceptualization of SOC as a buffer of the impact of stressful life events on health as well as the more general notion that individual differences in MWB (i.e., SOC) are associated with differential vulnerabilities to the health impact of stressful events. Although the moderating properties of self-esteem were not tested in this program of research, a logical next step might be to determine if other components of MWB (e.g., self-esteem) have moderating properties similar to those of SOC and whether these properties are distinct from those attributed to SOC. 
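The buffering hypothesis tested in Study 2 can be illustrated with a small sketch. It assumes the moderation was specified through a product term in a linear regression, health = b0 + b1*stress + b2*SOC + b3*(stress*SOC), which the text does not state explicitly. The function name and the coefficient values below are hypothetical and are not estimates from the NPHS; they only show what a moderating effect "in the expected direction" would look like.

```python
def conditional_stress_effect(b_stress: float, b_interaction: float, soc: float) -> float:
    """Effect of a stressful life event on health at a given (mean-centred) SOC score.

    Under the assumed model health = b0 + b1*stress + b2*soc + b3*stress*soc,
    the effect of stress is b1 + b3*soc, so it varies with SOC whenever b3 != 0.
    """
    return b_stress + b_interaction * soc


# Hypothetical coefficients for illustration only (NOT results from the thesis).
B_STRESS = -0.30       # stress lowers self-reported health at average SOC
B_INTERACTION = 0.10   # positive product term: higher SOC weakens the stress effect

# SOC expressed in standard-deviation units around its mean.
low_soc_effect = conditional_stress_effect(B_STRESS, B_INTERACTION, -1.0)
high_soc_effect = conditional_stress_effect(B_STRESS, B_INTERACTION, 1.0)

print(f"Stress effect at low SOC:  {low_soc_effect:.2f}")
print(f"Stress effect at high SOC: {high_soc_effect:.2f}")
```

With these made-up values the stress effect is smaller in magnitude at high SOC than at low SOC, which is the pattern a buffering interpretation requires; a non-significant product term, as reported for physician visits, would mean the two conditional effects cannot be distinguished.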
The last manuscript in this thesis contained the results of an investigation examining the extent to which SES, measured in terms of current income adequacy, predicts current SRH and future change in SRH, as well as tests of the hypothesis that MWB (i.e., self-esteem and SOC) mediates the relationship between current income adequacy levels and both current SRH and future change in SRH. The results of the study suggest that SES, assessed by income adequacy, is positively related to SRH assessed at the same time, but not to the linear change in SRH over the subsequent 6 years. Despite the small effect of SES on SRH (e.g., income adequacy accounted for about 7.3% of the variation in the SRH intercept), both SOC and self-esteem appeared to partially mediate the impact of SES. Although the mediation was statistically significant (i.e., all indirect pathways involving the mediators were statistically significant), the indirect effect of SES through SOC only explained about 1% of the variation in the SRH intercept (i.e., 13% of 7.3%). The indirect effect of SES through self-esteem explained a maximum of 0.5% of the variation in the SRH intercept. Although the extent of the mediation observed in this study was small, the results support the hypothesis that aspects of MWB partially mediate the effect of income-related SES on current SRH. The findings of this study also support the work of Siegrist and Marmot (2004) and contribute yet another piece of information to the current body of literature examining mechanisms behind the SES-health gradient. Although the amount of explained variation in SRH was small, it is possible that additional mediating effects might be observed for other aspects of MWB not examined in this study (e.g., social support). 
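The "13% of 7.3%" figure quoted above can be checked with a short calculation. The sketch below uses only the two percentages reported in the text; the function and variable names are illustrative and do not come from the thesis.

```python
def variance_share_via_mediator(total_explained: float, proportion_mediated: float) -> float:
    """Share of outcome variance attributable to the indirect path.

    total_explained: proportion of variance in the outcome explained by the predictor.
    proportion_mediated: fraction of that explained variance carried by the mediator.
    """
    return total_explained * proportion_mediated


# Figures reported in the text: income adequacy explains about 7.3% of the
# variation in the SRH intercept, and about 13% of that runs through SOC.
ses_explained = 0.073
soc_fraction = 0.13

soc_share = variance_share_via_mediator(ses_explained, soc_fraction)
print(f"Variance in SRH intercept explained via SOC: {soc_share:.2%}")
```

The product is a little under one percentage point, consistent with the "about 1%" stated for the indirect effect through SOC.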
In addition to the theoretical contributions, the manuscripts serve to demonstrate the utility of Structural Equation Modeling (SEM) as a means of validating indicators of MWB (e.g., using confirmatory factor analysis to test measurement structure) and testing complex theories involving indirect effects and multiple dependent variables (e.g., simultaneously testing the degree to which SOC mediates the effect of SES on current SRH and future rate of change in SRH). Hoyle and Smith (1994) point out that the testing of mediating processes is particularly amenable to the application of SEM because SEM (a) avoids problems of over- and underestimated mediated effects related to measurement error (Baron & Kenny, 1986), (b) permits isolation of the direct effect by including problematic third variables in the model (Bollen, 1989), and (c) permits estimation of models that include multiple mediators (e.g., modeling the mediating properties of self-liking and self-efficacy components of self-esteem simultaneously) and combinations of mediated and moderated effects (Baron & Kenny, 1986). The modeling approaches presented in this dissertation improve upon previous investigations of the relationships between MWB and health using the NPHS datasets. For example, in one of the only comprehensive studies examining the mental health of the Canadian population, the numerous indicators of MWB (e.g., self-esteem) were arbitrarily dichotomized and analyzed using separate logistic regression analyses (Stephens, Dulberg, & Joubert, 1999). Although this analytical approach provides some information about the distribution and predictors of high versus low MWB components, it does little to advance our understanding of the processes underlying the often complex relationships between health and its many determinants. 
This last point is of particular relevance to research involving the NPHS because this survey was conducted to provide data that could be used to improve our understanding of the relationships between health and its many determinants. There is little doubt concerning both the practical importance (e.g., providing assessments of level, trend and distribution) and theoretical value (e.g., improving our understanding of the mechanisms underlying many determinants of population health) associated with the inclusion of comprehensive assessments of MWB in national population health surveillance programs. In line with these benefits, the Canadian government has committed substantial resources to national population health surveys that typically include assessments of psychosocial well-being. For example, at the time of the writing of this dissertation the NPHS had 5 survey cycles remaining and Statistics Canada was working on the release of the second cycle of the newly launched Canadian Community Health Survey (CCHS). Similar to the NPHS, the CCHS is being conducted by Statistics Canada to provide regular and timely cross-sectional estimates of health determinants, health status and health system utilization for 136 health regions across the country (Statistics Canada, 2003). Each two-year collection cycle of the CCHS consists of two distinct surveys: a health region-level survey in the first year with a total sample of 130,000 respondents and a provincial-level survey in the second year with a total sample of 30,000 respondents (Statistics Canada, 2003). Parallel growth in efforts to examine the relationship between psychosocial concepts and health can be found in the United States. For example, the National Institutes of Health (NIH) recently published recommendations for an expansion of research on the social and interpersonal factors that influence health (Bachrach & Abeles, 2004). 
One of the repeated themes emerging from the NIH and leaders in the field of public health (e.g., Mervyn Susser, past editor of the American Journal of Public Health) is the need for epidemiologists to move beyond assessing the relationship between exposure and disease in individuals towards more complex multilevel models that attempt to fully integrate the biopsychosocial processes that influence health (Pearce, 1996; Susser & Susser, 1996a, 1996b). The models tested in the studies presented in this dissertation are relatively simple. However, SEM is an extremely flexible alternative to more traditional epidemiological modeling approaches. For example, current SEM software supports the modeling of data containing complex multi-level structures, non-linear effects and growth over time with time-dependent covariates, feedback loops, categorical observed (e.g., probit regressions) and latent variables (e.g., latent class analyses), mixture models, models with imputed missing data as well as various types of survival analyses. The ability to embed many types of analyses commonly encountered in epidemiology into more complex models that control for measurement error and support the testing of hypotheses involving indirect effects, multiple pathways and outcomes identifies SEM as a powerful tool for future health research. While the capabilities of statistical modeling software programs have advanced rapidly, corresponding advancements in the theoretical frameworks of health and its many determinants have not kept pace. For example, within the health literature we frequently see the use of broad terms such as mental health in inferences derived from assessments of commonly used and often narrowly validated constructs (e.g., measures of depression). 
However, detailed discussions of the definitions of, and relationships between, mental health and the more specific concepts such as SOC, mental illness, emotional distress, psychological adjustment and psychological distress are conspicuously absent from much of the published literature. This gap in the literature stands in stark contrast to the relative abundance of research examining the relationships between the specific constructs and the indicators selected to assess their levels (e.g., see the examination of anxiety and depression and their indicators by Clark and Watson (1991)). The current state of health science appears to stem from the lack of a theoretical framework that serves to organize the many different but seemingly related concepts frequently used to examine mental health and well-being. From a broader perspective there needs to be a movement towards the clarification and organization of concepts and processes abstracted from social and psychological sciences for use in health-related research. For example, there appears to be substantial overlap between the resistance resources associated with Antonovsky's SOC and components of self-esteem and self-efficacy, yet the constructs are rarely examined together. The lack of a theoretical framework that organizes existing psychosocial theories in terms of their relationship to health has been recognized by epidemiologists. For example, in a recent review of key problem areas associated with recent attempts in Australia to develop the Second National Mental Health Plan: 1998-2003, Sainsbury (2000) concluded that there is a vital conceptual and practical need to develop a more sophisticated understanding of the concepts of mental health and in particular positive mental health. More recently, the European Science Foundation (ESF) sponsored a four-year 'Scientific Programme on Social Variations in Health Expectancy in Europe' (Marmot & Siegrist, 2004). 
This scientific program represented one of the first large-scale coordinated efforts that included both the Standing Committee for the Social Sciences and the European Medical Research Council. The Scientific Programme on Social Variations in Health Expectancy in Europe was a collaborative effort involving approximately 70 scientists from Europe and North America and was organized into three work groups. The first work group examined life course approaches to the study of social inequalities in health, with particular evidence on early childhood (Marmot & Siegrist, 2004). Of particular relevance to this dissertation is the work of the second work group. This group concentrated on social and psychological determinants of health in midlife with a special emphasis on psychobiological pathways linking adverse psychosocial environments with poor health (Marmot & Siegrist, 2004). The last work group focused on research examining issues involving regional disparity, income inequality and social capital (Marmot & Siegrist, 2004). The research presented in this dissertation makes specific contributions to the field of psychosocial epidemiology in terms of validating measures of mental well-being (MWB) and examining their roles as mediators and moderators of the relationship between SES and health. However, the success of future research on the measurement of MWB and its relationship to health depends in part on the development of a broad health-oriented theoretical framework that serves to organize existing psychosocial theories and constructs (e.g., MWB). One obvious way forward is to follow the lead of the European Science Foundation and formally organize multidisciplinary working groups focused on translating relevant knowledge in the psychosocial sciences for use in Canada's national health surveillance programs. 
Such an initiative would likely go a long way towards improving the efficiency of future large-scale national health surveys (e.g., NPHS and CCHS) by identifying specific gaps in the existing health literature and facilitating the development of survey content that enables scientists to meet research objectives identified by health agencies (e.g., Health Canada). REFERENCE LIST Antonovsky, A. (1987). Unraveling the mystery of health: how people manage stress and stay well. San Francisco, CA: Jossey-Bass. Bachrach, C. A. & Abeles, R. P. (2004). Social science and health research: growth at the National Institutes of Health. Am J Public Health, 94, 22-28. Baron, R. M. & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J Pers Soc Psychol, 51, 1173-1182. Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley. Clark, L. A. & Watson, D. (1991). Tripartite model of anxiety and depression: psychometric evidence and taxonomic implications. J Abnorm Psychol, 100, 316-336. Hoyle, R. H. & Smith, G. T. (1994). Formulating clinical research hypotheses as structural equation models: a conceptual overview. J Consult Clin Psychol, 62, 429-440. Hu, L. & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55. Marmot, M. & Siegrist, J. (2004). Health inequalities and the psychosocial environment. Soc Sci Med, 58, 1461. Pearce, N. (1996). Traditional epidemiology, modern epidemiology, and public health. Am J Public Health, 86, 678-683. Sainsbury, P. (2000). Promoting mental health: recent progress and problems in Australia. J Epidemiol Community Health, 54, 82-83. Siegrist, J. & Marmot, M. (2004). Health inequalities and the psychosocial environment-two scientific challenges. 
Soc Sci Med, 58, 1463-1473. Statistics Canada (2003). Canadian Community Health Survey (CCHS): Cycle 1.1: extending the wealth of health data in Canada. Accessed on June 5, 2004 from http://www.statcan.ca/english/concepts/health/cchsinfo.htm Stephens, T., Dulberg, C. & Joubert, N. (1999). Mental health of the Canadian population: a comprehensive analysis. Chronic Dis Can, 20, 118-126. Susser, M. & Susser, E. (1996a). Choosing a future for epidemiology: I. Eras and paradigms. Am J Public Health, 86, 668-673. Susser, M. & Susser, E. (1996b). Choosing a future for epidemiology: II. From black box to Chinese boxes and eco-epidemiology. Am J Public Health, 86, 674-677. Tafarodi, R. W. & Swann, W. B. (2001). Two-dimensional self-esteem: theory and measurement. Personality & Individual Differences, 31, 653-673."@en . "Thesis/Dissertation"@en . "2005-05"@en . "10.14288/1.0092379"@en . "eng"@en . "Health Care and Epidemiology"@en . "Vancouver : University of British Columbia Library"@en . "University of British Columbia"@en . "For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use."@en . "Graduate"@en . "An investigation of the measurement of mental well-being and its relationship with health status using data from the National Population Health Survey of Canada"@en . "Text"@en . "http://hdl.handle.net/2429/17195"@en .