DOES THE SCREENING VERSION OF THE PSYCHOPATHY CHECKLIST MEASURE THE SAME DISORDER IN MALES AND FEMALES?

by

Jenessa Erin Balmer-Labrecque

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ARTS in The Faculty of Graduate Studies (Measurement, Evaluation, and Research Methodology)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

July 2011

© Jenessa Erin Balmer-Labrecque, 2011

Abstract

The Psychopathy Checklist: Screening Version (PCL:SV) is a tool used to measure the construct of psychopathy in males and females. From a psychometric standpoint, the PCL:SV is administered and scored in the same manner for male and female respondents. The aim of the current study is to investigate, using scale- and item-level statistical techniques, whether the PCL:SV measures the construct of psychopathy equivalently for males and females. Given the Likert-type nature of the item responses, both Pearson and polychoric correlation techniques were employed in order to compare the more commonly used Pearson correlation with the psychometrically correct polychoric correlation. At the scale level, some PCL:SV items loaded differently for males and females, and not in a manner found in the literature. At the item level, only four items displayed DIF, and the DIF for these items was minimal. These findings suggest that the PCL:SV is measuring the same construct of psychopathy for males and females, but more clearly defines the construct of psychopathy for male respondents. This may, in part, be due to the ways in which males and females express psychopathy, in which case the construct of psychopathy itself needs to be revisited.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Chapter 1: Introduction
  Measurement of Psychopathy
  Psychopathy Checklist: Screening Version
  Gender Differences
  Psychometric Analyses of the PCL:SV
    Validity.
    Scale level analyses: Factor structure.
    Item level analyses: DIF.
Chapter 2: The Present Study
Chapter 3: Method
  Participants
  Procedure
  Statistical Analyses
    Scale level analyses: Factor analysis.
    Item level analyses: Differential item functioning.
Chapter 4: Results
  Scale Level Analyses
    Pearson correlation.
    Thresholds and eigenvalues.
    Factor loadings.
    Polychoric correlation matrix.
    Differential item functioning.
Chapter 5: Discussion
  Factor Level Discussion
    Appropriate factor model.
    Dimensional nature of PCL:SV.
    Methodological contribution: Comparing correlation matrices.
  Item Level Discussion
  Next Steps
References
Appendix A
Appendix B

List of Tables

Table 1 PCL:SV Factor Models
Table 2 Sample Demographics
Table 3 Male Factor Loading Comparison
Table 4 Female Factor Loading Comparison
Table 5 Male Polychoric Eigenvalues
Table 6 Male Univariate Parameters
Table 7 Female Polychoric Eigenvalues
Table 8 Female Univariate Parameters
Table 9 Two Factor Solution
Table 10 Three Factor Solution
Table 11 Factor Comparison for Males and Females
Table 12 Male Polychoric Correlation Matrix
Table 13 Female Polychoric Correlation Matrix
Table 14 DIF Results

List of Figures

Figure 1. Item Response Function Example of No DIF for PCL:SV Item 4 (Lacks Remorse).
Figure 2. Item Response Function Example of Non-uniform DIF for PCL:SV Item 10 (Irresponsible).
Figure 3. Item Response Function Example of Uniform DIF for PCL:SV Item 2 (Grandiose).
Figure 4. Item Response Function Example of Non-Uniform DIF for PCL:SV Item 9 (Lacks Goals).
Figure 5. Item Response Function Example of Uniform DIF for PCL:SV Item 12 (Adult Antisocial Behavior).
Figure 6. Sub-scale Response Function Example of Uniform DIF for PCL:SV Interpersonal Subscale (Items 1-6).
Figure 7. Sub-scale Response Function Example of No DIF for PCL:SV Behavioural Subscale (Items 7-12).
Figure 8. Scale Response Function Example of No DIF for PCL:SV Total Score (Items 1-12).

Chapter 1: Introduction

Screening for psychopathy is commonly done in forensic psychiatric populations. A widely used measure is the revised Psychopathy Checklist (PCL-R) (Hare, 1991; 2003), along with its screening version, the PCL:SV (Hart, Cox, & Hare, 1995).
Given that the measure is used in the same way for both males and females, there is an implicit assumption that the measure functions equivalently for male and female respondents. The primary purpose of this thesis is to investigate whether the PCL:SV measures the same latent variable in the same way for male and female respondents, using data from four risk assessment studies of men and women with severe psychiatric disorders. To my knowledge, this is the first empirical study of construct comparability across genders in this psychiatric population. A secondary purpose is to investigate statistical psychometric methods used in construct comparability research (i.e., considering the scaling of the item responses).

Although this study focuses on the statistical and psychometric characteristics of the measure, one needs to contextualize the findings within a basic understanding of the phenomenon, in this case psychopathy.

The term psychopathy has been refined since it was first introduced in the late 1800s as a broad term referring to behavioural problems unclassified by any mental illness of the time (Lykken, 2005). In the 1930s, Partridge coined the phrase “sociopathic personality” to describe individuals who could not or would not conform to societal demands (as cited in Lykken, 2005). As a result, the terms psychopath and sociopath became interchangeable, resulting in some confusion about the construct. As well, diagnoses were inconsistent and subjective. All this aside, Partridge’s work remained relevant for almost 50 years.

The term “sociopathic personality” (as coined by Partridge) became dichotomous when the Diagnostic and Statistical Manual of Mental Disorders (DSM), 3rd Edition (APA, 1980) introduced antisocial personality disorder as a mental disorder. Following this, psychopathy was thought of as its own psychiatric illness, albeit not included in any version of the DSM to date.
Psychopathy, a personality disorder, is characterized by interpersonal, affective, and behavioural symptoms (Lykken, 2005). Cleckley (1941) is considered a pioneer of the modern clinical conceptualization of psychopathy because of his synopsis of 16 key characteristics of the disorder. They are as follows:

Superficial charm and good intelligence
Absence of nervousness or psychoneurotic manifestations
Unreliability
Lack of remorse or shame
Inadequately motivated antisocial behaviour
Poor judgement and failure to learn from experience
Pathological egocentricity and incapacity for love
General poverty in major affective relations
Specific loss of insight; unresponsiveness in general interpersonal relations
Fantastic and uninviting behaviour with drink (and sometimes without)
Suicide rarely carried out
Sex life impersonal, trivial, and poorly integrated
Failure to follow any life plan

Although this list cannot directly be used for assessment, it provides an exceptional description of the latent variable of psychopathy from which to pursue measurement objectives.

When reviewing the psychometric literature on the psychopathy checklist, it is tempting to consider results from studies using other versions of the psychopathy checklist, not just those using the PCL:SV, because they all measure psychopathy and some versions have a much larger collection of published research. This is especially true because the PCL-R came out before the PCL:SV and is more commonly used; logically, more research has been published using the PCL-R, including gender equivalence research. However, the PCL:SV was not psychometrically derived from the PCL-R and is not an established equivalent shorter version of the PCL-R; therefore, direct statistical comparison of the two measures is not possible when determining PCL:SV gender equivalence.
It stands that if you cannot assume statistical/psychometric equivalence between the two measures, then you cannot assume gender equivalence for the PCL:SV, even though limited PCL-R research demonstrates gender equivalence. Another way of thinking of this is that assuming psychometric equivalence between the PCL-R and PCL:SV is like taking reliability values from a previous study and considering them valid for a current study even though the populations are different. Statistically speaking, drawing conclusions of equivalence between two measures that have not been verified empirically as being equivalent is inappropriate.

Measurement of Psychopathy

If Cleckley (1941) conceptualized the construct of psychopathy with his 16 characteristics, then Hare (1980) measured it with the development of the Psychopathy Checklist (PCL) and its future versions and iterations (Hare, 1991; 2003). The PCL is a diagnostic tool used to rate a person’s psychopathic or antisocial tendencies, as well as to predict violence and aggression (Hill, Neumann, & Rogers, 2004). Several versions of the PCL exist for use in specific populations such as youth. To date, no specific version of the PCL exists for measuring psychopathy in females. The initial PCL (Hare, 1980) was created by listing traits and behaviours related to the 16 Cleckley characteristics; these 100 potential questions were then refined into 22 items.

Two main components of psychopathy were identified early in the PCL development as making distinctive contributions to an essentially unidimensional construct¹. The core features of the disorder are interpersonal/affective (factor 1) and behavioural (factor 2). Thus, the PCL and subsequent versions, along with a supportive body of research literature, have created tools which tap both parts of the construct. Historically, there has been a correlation of at least 0.5 between the factors (Cooke & Michie, 1997).
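As a hedged illustration only, the reported factor correlation can be related to the 2:1 eigenvalue guideline for essential unidimensionality given in footnote 1. The sketch below makes a strong simplifying assumption that is not part of the thesis: it treats the two factor scores as the only two variables, so the correlation matrix is [[1, r], [r, 1]], whose eigenvalues are 1 + |r| and 1 - |r|. The function name is hypothetical.

```python
def eigenvalue_ratio_two_variables(r: float) -> float:
    """Ratio of the first to the second eigenvalue of the 2x2
    correlation matrix [[1, r], [r, 1]].

    Assumption (illustrative only): just two variables are analysed,
    so the eigenvalues are 1 + |r| and 1 - |r| in closed form.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    first = 1.0 + abs(r)   # larger eigenvalue
    second = 1.0 - abs(r)  # smaller eigenvalue
    return first / second


# A correlation of 0.5 between the two factors gives a 3:1 ratio.
ratio_at_half = eigenvalue_ratio_two_variables(0.5)
```

Under this simplification, any correlation above one third already exceeds a 2:1 ratio. An item-level analysis of all twelve items, as described in the footnote, would need the full correlation matrix and would not reduce to this closed form.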
The core features of psychopathy measured by factor one are the “predatory inclinations and deficient emotional reactivity.” Factor two measures “impulsivity, disinhibition, and early and chronic antisocial behaviour” (Hare, 2003). Current research also argues for three and four factor models of the PCL, which will be discussed in greater detail in the Factor Structure section of this thesis.

After clinical use, research, and psychometric analyses, the PCL was revised (Hare, 1991) and two items were dropped. Subsequently, the shorter 12 item Psychopathy Checklist Screening Version (PCL:SV) was developed by Hart, Cox, & Hare (1995) for two reasons. The first was to screen for psychopathy in forensic settings, but only proceed with the PCL-R if a threshold score was attained. This saved both administration time and resources. The second reason for developing the PCL:SV was so a shorter version was available to assess and diagnose psychopathy outside of forensic settings (Hart et al., 1995).

¹ Essential unidimensionality implies that there is one dominant latent variable that accounts for the covariation among the items. A latent variable is considered dominant if, for example, the largest principal component is substantially larger than the second when determining the number of factors to extract in an exploratory factor analysis. In this light, essential unidimensionality corresponds to a ratio of at least 2:1 between the first and second components.

Psychopathy Checklist: Screening Version

The PCL:SV was derived from the PCL-R as a psychopathy screening tool for forensic populations and a diagnostic tool for non-forensic populations. Using both case history information and interviews, the PCL:SV can be completed in approximately 40% less time than the PCL-R, or in about 1-1.5 hours (Hart et al., 1995). The 12 items that comprise the PCL:SV are described as follows:

1. superficial
2. grandiose
3. deceitful (manipulative)
4. lacks remorse
5. lacks empathy
6. doesn’t accept responsibility
7. impulsive
8. poor behaviour controls
9. lacks goals
10. irresponsible
11. adolescent antisocial behaviour
12. adult antisocial behaviour

Each item of the PCL:SV is scored on the same three point scale of 0/1/2 as the PCL-R. Items are scored by a rater using a combination of file review and interview. Scores are defined as follows (Hart et al., 1995):

2: The item applies to the individual; a reasonably good match in most essential respects; his/her behaviour is generally consistent with the flavour and intent of the item.
1: The item applies to a certain extent but not to the degree required for a score of 2; a match in some respects with too many exceptions or doubts to warrant a score of 2; uncertain about whether or not the item applies; conflicts between interview and file information that cannot be resolved in favour of a score of 2 or 0.
0: The item does not apply to the individual; he/she does not exhibit the trait or behaviour in question, or he/she exhibits characteristics that are the opposite of, or inconsistent with, the intent of the item.

There are three scores summed from the rated PCL:SV items. The first is a total score where items 1-12 are summed, and the score can range from 0-24. The score reveals the “degree of overall psychopathic symptomatology” (Hart et al., 1995). The second and third scores are subscale scores which are commonly known as factor 1 and factor 2. Each factor will have a score that ranges from 0-12. By summing items 1-6, factor 1 captures the interpersonal and affective components of psychopathy. Factor 2, the sum of items 7-12, is indicative of the individual’s degree of social deviance in relation to psychopathy. When scoring the PCL:SV, up to one item from each factor can be left unscored because of a lack of available information. The total and factor scores are then prorated to adjust for the missing items (Hart et al., 1995).
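The scoring rules just described can be sketched in a few lines of Python. This is a minimal illustration, not the procedure from the PCL:SV manual: the function names are hypothetical, and the proration rule (scaling the observed sum by the ratio of total items to scored items) is an assumption, since the text above does not state the formula used by Hart et al. (1995).

```python
from typing import Dict, List, Optional, Sequence

N_ITEMS = 12
FACTOR_1 = range(0, 6)    # items 1-6: interpersonal/affective
FACTOR_2 = range(6, 12)   # items 7-12: social deviance


def prorated_sum(scores: Sequence[Optional[int]], max_missing: int) -> float:
    """Sum 0/1/2 item ratings, prorating for unscored (None) items.

    Assumed proration rule: observed sum * (items in scale / items scored).
    """
    scored: List[int] = [s for s in scores if s is not None]
    if len(scores) - len(scored) > max_missing:
        raise ValueError("too many unscored items")
    if any(s not in (0, 1, 2) for s in scored):
        raise ValueError("item ratings must be 0, 1, or 2")
    return sum(scored) * len(scores) / len(scored)


def score_pclsv(items: Sequence[Optional[int]]) -> Dict[str, float]:
    """Return total (0-24) and factor (0-12) scores for 12 item ratings."""
    if len(items) != N_ITEMS:
        raise ValueError("expected 12 item ratings")
    factor1 = prorated_sum([items[i] for i in FACTOR_1], max_missing=1)
    factor2 = prorated_sum([items[i] for i in FACTOR_2], max_missing=1)
    # At most one unscored item per factor means at most two overall.
    total = prorated_sum(list(items), max_missing=2)
    return {"total": total, "factor1": factor1, "factor2": factor2}


# Example: item 2 is unscored; the remaining items are rated.
example = score_pclsv([2, None, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1])
```

Because each score is prorated on its own scale, the prorated total need not equal the sum of the two prorated factor scores when an item is missing.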
Within the literature, support has developed for three different factor models for the PCL:SV. The first is the two factor model developed by Hare (1980) with the introduction of the PCL. Factor one is commonly known as the interpersonal factor, while factor two is known as the behavioural factor. The three factor model splits factor one into two more detailed factors, and creates a third factor from Hare’s original factor two while deleting antisocial behaviour items from the PCL:SV (Cooke, Michie, Hart, & Hare, 1999). In response, Hare (2003) further developed the two factor model into a four factor model, which again split the original factor one into two factors, and splits the original second factor into factors three and four. This factor model accounts for antisocial behaviour in factor four. These competing factor models will be discussed in greater detail in the Factor Structure section of this paper.

The total PCL:SV score is either used to determine if further investigation of the individual is needed with the PCL-R or to assess and diagnose psychopathy. A score of 12 or less is generally obtained by non-psychopathic individuals. A score of 18 or larger is the cut-off score that indicates psychopathy, and a follow-up assessment with the PCL-R is required. A score between 13 and 17 is less clearly defined in terms of the presence of psychopathy and further investigation may be required (Hart et al., 1995).

There are two distinct frameworks in which the PCL:SV is used: practice and research. In practice, the PCL:SV must be administered by an individual with a large amount of graduate education who is trained in psychiatry and the use of psychodiagnostic tools (Hart et al., 1995). The authors of the PCL:SV stress that because the PCL:SV can have huge implications based on diagnoses given, the test administrator must be well qualified in all domains of psychopathy.
In research, the test administrator must still have formal training in psychopathy, but requires less education than in the practice framework. PCL:SV scores are usually kept confidential in a research framework, thus the potential harm a PCL:SV result possesses is substantially reduced (Hart et al., 1995). It is important to recognize in both the practice and research frameworks that gender differences in psychopathy can exist. This raises the question: Is the PCL:SV measuring gender differences in psychopathy?

Gender Differences

Just as equivalence between the PCL-R and PCL:SV cannot be assumed without psychometric testing, equivalence between different populations for the PCL:SV cannot be assumed. Assuming equivalence threatens the validity of the measure (Bolt, Hare, & Neumann, 2007) because the assumption is made that different populations are responding to PCL:SV questions in a similar manner, or that psychopathy is the same underlying construct in both populations (Forouzan & Cooke, 2005). As has been suggested by psychometric findings when assessing depressive symptomology with the Center for Epidemiological Studies Depression (CES-D) scale, males and females do not always endorse the same symptoms or degree of symptoms; assuming that they do when using the CES-D limits the information the measure can provide about the construct of depression in males and females (Gelin & Zumbo, 2003). It is not unreasonable to think the same of the PCL:SV.

No literature to date has examined in detail whether the construct of psychopathy, as measured by the PCL:SV, functions equivalently in males and females. The majority of past research has focused on males and used males as subjects in psychometric validations. In truth, the PCL:SV could be measuring gender differences in psychopathy between males and females because males could be more psychopathic than females.
Conversely, gender differences could actually be the result of measurement artifact, which is when the scale (PCL:SV) used to measure the latent variable of interest (psychopathy) is also concurrently measuring other unknown constructs in which gender differences exist (Gelin, Carleton, Smith, & Zumbo, 2004). In other words, the essentially unidimensional construct of psychopathy is being confounded by other extraneous dimensions (Gelin et al., 2004). For example, males may score higher on aggression items than females because, in general, males are more aggressive. Scoring low on PCL:SV aggression items may lower a female PCL:SV score enough that they do not meet the psychopathy cutoff. However, the total does not expose the proposed inability to measure the construct, but rather misrepresents true female levels of psychopathy. Thus, the inability to measure aggression equivalently for both genders is an example of measurement artifact.

Using different psychopathy cut-off scores for males and females is one way to circumvent the problem of measurement artifact due to differences such as aggression levels. However, if dimensionality differences between genders are revealed after factor analysis, then cut-off scores will not compensate for the differences and are thus an inappropriate solution.

Early on, it is important to recognize that, at least anecdotally, there is a difference between male and female psychopaths, even if they are measured on a single scale. Cleckley’s (1941) descriptions of Roberta and Anna, two females, demonstrate a tendency towards manipulation and sexual promiscuity. Clinicians recognize these traits as being part of the female psychopath persona, yet the PCL fails to formally measure them. In general, female psychopaths are thought to be better represented by the interpersonal and affective traits of psychopathy.

A review of the psychometric PCL:SV literature will direct future psychometric research with regard to gender differences.
However, it is expected that the current body of literature will serve as little more than a starting point because of the lack of female populations in prior psychopathy research. The research is still valuable in that it establishes the best statistical vantage point for measuring psychopathy in males from which female psychopathy can be compared.

In male populations, the outcomes of the psychopathy checklist have been used to predict violence and aggression. However, it is premature to make the jump to violence and aggression prediction using the PCL:SV in females because psychometric gender equivalence has not been established. In fact, in preliminary analyses, there is reason to believe that the dimensions of psychopathy captured by the PCL:SV are different in males and females; thus, PCL:SV prediction abilities in females cannot be assumed without further investigation.

Psychometric Analyses of the PCL:SV

Validity.

Past research has established the PCL:SV as being a valid measurement tool using specific statistical analyses relevant for content, criterion, or construct validity. However, this traditional validity approach has been improved upon by a modern view of validity. Within this modern view of validity, construct validity has become the overarching concept. Statistical methods at the item and scale level help build the evidential basis for establishing validity inferences made from the scores. These inferences exist along a continuum so that a degree of validity is obtained rather than a valid or not valid conclusion (Zumbo, 2005, 2007a).

Construct validity is the degree to which inferences about a latent variable can be made based on the measure of interest (Zumbo, 2005, 2007a, 2009). For example, construct validity concerns the degree to which inferences about psychopathy can be made based on the PCL:SV.
Zumbo (2005) reports that latent variable modeling (factor analysis) and item response modeling (differential item functioning) are both tools to help build evidence about valid inferences. In terms of establishing construct equivalence between genders, validity is key in that both genders must exhibit equal levels of validity on the PCL:SV in order to make the measure comparable between genders.

A simple start to testing the validity of the PCL:SV is to compare the total and factor scores on the PCL:SV to the corresponding scores on the PCL-R. Because of the abundance of research that indicates the PCL-R is a useful measure of psychopathy, it can be inferred that a strong statistical relationship between the PCL-R and PCL:SV would by proxy also indicate that the PCL:SV is a useful measure of psychopathy. However, by comparing the two measures, you are in no way establishing psychometric gender equivalence for the PCL:SV, even if it has been established for the PCL-R. Guy and Douglas (2006) found a strong statistical relationship between the total and factor scores for the PCL-R and PCL:SV, indicating on a very simple level that the PCL:SV is an adequate measure of psychopathy because the PCL-R has been established as an adequate measure of psychopathy for both genders. This is sufficient for initial investigations of the PCL:SV, but further statistical examination is needed to determine if construct equivalence between genders exists.

Construct validity of the PCL:SV has been established in more than one research study using inter-item correlations, item-total correlations, and alpha levels. These studies follow the more traditional approach of establishing validity, but because of limited psychometric research on the PCL:SV, should nonetheless be considered. Douglas, Strand, Belfrage, Fransson, & Levander (2005) were able to establish that a high level of construct validity existed in the PCL:SV total score, as well as the factor 1 and factor 2 scores.
In the four factor model, validity was lower but acceptable for the first three factors, and poor for the fourth factor. These findings hold true for both genders in their correctional and forensic populations (Douglas et al., 2005). Although this was a large sample, and including females in a study of this type is a rarity, the study as a whole is limited in its comparative value in that it uses a non-North American population.

A second study uses similar statistical analyses to test the relationship between PCL:SV items (criterion) and their respective sub-criteria. Sub-criteria, not often mentioned in the literature, laid the groundwork for the representation of the psychopathy construct in the PCL:SV (Rogers, Salekin, Hill, Sewell, Murdock, & Neumann, 2000). Each item of the PCL:SV has multiple sub-criteria that were originally considered during the PCL:SV ratings, but have since been dropped (Hart et al., 1995). Results indicate that the items of the PCL:SV function as expected in that they statistically exhibit the largest relationship with their own sub-criteria. This is followed by a moderate relationship/degree of association between the sub-criteria with a particular item, and a weak relationship between sub-criteria of one item and another item (Rogers et al., 2000).

Scale level analyses: Factor structure.

As with its predecessor, the PCL-R, there is debate in the literature as to which factor model best fits the PCL:SV (see Table 1). The factor discussion has remained active long after the PCL-R, and subsequently the PCL:SV, established itself as “the” measure of psychopathy because of the large role the psychopathy construct plays in understanding and predicting aggression and violence (Hill et al., 2004). The three competing factor models have two, three, and four factors respectively.
The two factor model (Hare, 1991, 2003) was the first model and came out of the original test data for the PCL-R, although the PCL:SV was developed under the premise of the same two factors. Factor 1 relates to selfish, remorseless, and callous personality traits. Factor 2 relates to unstable and anti-social behaviour (Hare, 1991, 2003).

Table 1
PCL:SV Factor Models

Factor Model    Reference
2               Hare, 1991; 2003
3               Cooke et al., 1999
3               Skeem, Mulvey, and Grisso, 2003
3 or 4          Vitacco, Neumann, and Jackson, 2005
4               Hare, 1991; 2003
4               Hill et al., 2004

The three factor model was developed when Cooke et al. (1999) failed to replicate the two factor model. The PCL:SV was shortened by three items that remove anti-social behaviour for this model, and factor 1 was divided into two factors, an interpersonal factor and an affective factor. Factor 3 is considered to be a behavioural factor, with anti-social behaviour being absent.

The four factor model is a continuation of Hare’s original work (Hare, 1991) and takes the original factor 1 and factor 2 and breaks them down again (Hare, 2003). This model takes back into account the three items discarded in the three factor model. The new factors are as follows: Factor 1: interpersonal; Factor 2: affective; Factor 3: impulsive and behavioural/lifestyle; Factor 4: antisocial behaviour.

Within the limited psychometric analyses of the PCL:SV, there are a number of studies that deal primarily with which factor structure best represents the PCL:SV. Hill et al. (2004) used confirmatory factor analysis to test the three competing factor models for the PCL:SV and found that, although all three performed well, the four factor model was superior.
In terms of statistical tools, the Satorra-Bentler (S-B) chi-square was used, along with a comparative fit index (CFI), robust comparative fit index (RCFI), root-mean-square error of approximation (RMSEA), and a standardized version of the root-mean-square residual (SRMR) (Hill et al., 2004). All factor and error-unique loadings were significant, as were all six factors between genders.

The four factor model was also tested by Vitacco, Neumann, and Jackson (2005) on the belief that removing the antisocial behaviour component in the three factor model of the PCL:SV was a mistake because it is a crucial part of the psychopathic personality. However, this assumption was not supported: confirmatory factor analysis (CFA) found structural validity for both the three and four factor models. Regardless of the debate between models, this study made a valuable contribution to the literature in that it is one of the very few that presents results for men and women separately. This is important because when the CFA results are presented together for men and women, it is impossible to tell if the factor structure is different for males and females.

Skeem et al. (2003) did not test the four factor model of psychopathy but found that, in comparing the two factor model and three factor model, the three factor model was superior. Unique to their study, besides the larger sample size (n = 870), was the fact that they tested factor models outside of the three conventional models that already exist. They used factor analysis to test both the two and three factor models with all twelve items, as well as with the nine items traditionally used in the three factor model. They also tested a one factor model. Among the twelve item models, the two factor model fit the data better than the one factor model, and the three factor model fit better than the two factor model.
When the nine item, three factor model was tested, it fit the data better than the twelve item model, suggesting that omitting the three items related to antisocial behaviour improved model fit. These three items, which deal with anti-social behaviour, also had higher correlations with variables related to criminality than did other items on the PCL:SV (Skeem et al., 2003). Some researchers (Cooke et al., 1999) support the deletion of these items from the PCL:SV because they shift the focus of the construct of psychopathy away from the personality disorder spectrum as originally intended by Cleckley (1976). This model varies slightly from the Cooke and Michie (2001) model in that it is simplified by dropping the superordinate factor and testlet level. Prior to this change, the three factor model had nine items for seven latent factors, which Skeem et al. (2003) thought raised questions about overfitting associated with the model. However, the search for a better model should continue because none of the models presented exhibited great fit, only adequate fit.

Psychometric research on best fitting factor models for the PCL:SV has been limited. Undoubtedly this is due to both the limited popularity of psychometric analyses and the plethora of existing factor structure research using the PCL-R. Further, within the limited body of PCL:SV factor structure research, the study sample is most often male, or the results are not broken down by gender.

When reviewing the PCL:SV literature for psychometric studies on gender difference, it is important to keep in mind that factor structure can only identify gender differences at the scale level. If items within the PCL:SV function in equal but opposite ways for males and females, the factor structure may be similar for both genders even though gender differences exist at the item level.
A simulation study of test translation by Zumbo (2003) found that item level DIF was not exhibited in scale level results, meaning that item level analyses are necessary in order to report the complete picture of gender equivalence of the PCL:SV. In order to ascertain if PCL:SV items are functioning the same for both genders, a review of literature related to differential item functioning (DIF) must take place. DIF studies using the PCL-R have taken place (Cooke et al., 1999), but it is important to remember that the conclusions drawn from this research cannot be transferred to the PCL:SV because these are two psychometrically different measures.

Item level analyses: DIF. To date, no studies have been published using the item level analysis of DIF to determine if gender differences exist in psychopathy, as measured by the PCL:SV. Within the spectrum of modern test theory (as opposed to classical test theory), one study was published which reanalyzed normative data from the PCL:SV manual using item response theory (IRT) methods (Cooke et al., 1999). This study confirms the comparability of the PCL-R and PCL:SV, but does not establish item level equivalence between genders for the construct of psychopathy.

Chapter 2: The Present Study

Based on the paucity of published psychometric literature for the PCL:SV, further research is needed to confirm if the construct of psychopathy is equivalent for males and females. Research must take place that examines if the PCL:SV is measuring the same disorder in males and females at both the scale and item level. Thus, the following research questions need to be answered:

1. Are the dimensions of the PCL:SV the same for males and females?
In order to address this applied measurement question, we also need to investigate whether the conclusions differ based on whether one uses a Pearson or polychoric correlation matrix of the items. Because the items are not continuously scored, the choice of the appropriate correlation becomes important.

2. Do the PCL:SV items, sub-scales, and the scale as a whole function the same for males and females?

In order to answer these research questions, the most representative factor model for the PCL:SV should be found and compared for each gender. Similarly, individual PCL:SV items, sub-scales, and the total scale score must be scrutinized using DIF to determine if they measure psychopathy equivalently in males and females.

Chapter 3: Method

Participants

PCL:SV ratings from a total of 993 forensic and civil psychiatric patients were analyzed. These data, as seen in Table 2, were derived by pooling data from four risk assessment studies of men and women with severe psychiatric disorders. The current study did not distinguish between civil and forensic psychiatric patients for two reasons. First, a larger sample size will produce more stable statistical estimates from which to draw conclusions. Second, labelling a patient as civil or forensic is more of a point in time issue than a distinction between two different types of mentally ill patients. That is, civil and forensic psychiatric patients are all mentally ill. The treatment services they are assigned to (civil or forensic) depend on what they are doing when identified as needing mental health services. If patients are in trouble with the law when identified as needing mental health services, then they are assigned to forensic treatment; if they are routed into mental health treatment via family members or physicians, then they are civil psychiatric patients. Often patients with severe mental illness will bounce between civil and forensic psychiatric treatment throughout their lifetime.
Seto, Harris, and Rice (2004), in a large Canadian study comparing forensic and civil psychiatric patients, found that “forensic patients were similar or lower than civil psychiatric patients in all criminogenic, clinical, and social problems.” Thus, based on personal work experience in mental health as well as on the literature, the decision was made to collapse forensic and civil psychiatric patients into one group.

Table 2
Sample Demographics

| Sample | Total N | Men N | Women N | Men age min | Men age max | Men age mean | Men age SD | Women age min | Women age max | Women age mean | Women age SD | Men PCL:SV mean | Men PCL:SV SD | Women PCL:SV mean | Women PCL:SV SD | Reference |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Civil | 279 | 167 | 112 | 17 | 86 | 36.77 | 14.00 | 20 | 89 | 42.35 | 17.13 | 8.85 | 4.36 | 6.75 | 3.73 | Nicholls, Ogloff, & Douglas, 2004 |
| Civil | 101 | 65 | 36 | 16 | 89 | 39.63 | 16.05 | 21 | 74 | 44.77 | 16.32 | 10.94 | 5.00 | 7.24 | 4.70 | Nicholls, Ogloff, & Ledwidge, 2003 |
| Forensic | 94 | 47 | 47 | 20 | 75 | 33.80 | 10.24 | 17 | 63 | 35.30 | 9.661 | 11.00 | 4.54 | 10.48 | 4.78 | Nicholls, 2001 |
| Civil | 519 | 197 | 322 | 18 | 40 | 30.46 | 6.254 | 18 | 40 | 31.39 | 6.092 | 3.58 | 3.83 | 2.17 | 3.23 | Monahan et al., 2001 |
| Total | 993 | 476 | 517 | 17.57 | 66.29 | 34.26 | 10.70 | 18.55 | 55.07 | 35.05 | 9.52 | 7.05 | 5.21 | 4.19 | 4.55 | |

Procedure

For all patients, the PCL:SV was completed by raters trained by the author of the instrument (S. Hart). All assessments were completed on the basis of extensive hospital file reviews in the absence of patient interviews, with the exception of the MacArthur study data. These data were collected using both interviews and file reviews. All of the data were combined into one data set for the purposes of the current study.

Statistical Analyses

Scale level analyses: Factor analysis. Factor analytic techniques should be performed to establish if the construct of psychopathy is essentially unidimensional for both genders. Further factor analyses should test the existing two, three, and four factor models to find the best fit for each gender.
Regardless of whether the same model fits males and females best, DIF analyses should be performed to examine if item level gender differences exist for the PCL:SV. Any item level gender differences that are found should be related back to the larger body of gender difference literature. For instance, if PCL:SV items that measure aggressiveness are found to function differently in males and females, then the literature should be scoured to see if aggressive behaviour is an established area for gender differences, or if the difference is due to measurement artefact.

If the PCL:SV is found to measure the construct of psychopathy differently in males and females, this should not be considered a shortcoming of the tool. Instead, it can alert clinicians and researchers to differences in PCL:SV scores for males and females, preventing decisions from being made on the basis of measurement error. As well, new psychometric research can take place to revise the PCL:SV so it functions equivalently for males and females.

Item level analyses: Differential item functioning. Before beginning a review of the PCL:SV literature for DIF, it is important to clarify that, although there is debate surrounding the best factor structure for the PCL:SV, the PCL:SV measures the essentially unidimensional construct of psychopathy (Cooke et al., 1999). That is, the two, three, and four factor structures discussed previously are actually models of interrelated sub-factors that compose the factor of psychopathy. This is important when reviewing the DIF literature because a central assumption of DIF statistical techniques is that the construct being measured is essentially unidimensional (Zumbo & Hubley, 2003). DIF occurs when participants from one group endorse items more frequently than participants from another group matched at the trait level (Zumbo & Hubley, 2003).
In terms of the PCL:SV, the presence of DIF would indicate that males and females obtain different item scores after being matched on overall level of psychopathy. In that case PCL:SV items, and ultimately the test itself, are functioning differently for males and females. At the item level, this is known as item bias (Zumbo, 2007b). On the other hand, item impact is when differential item functioning exists because there are true differences in psychopathy between males and females (Zumbo, 2007b).

Within what some researchers in the field call the second generation of DIF and item bias detection methods (Zumbo, 2007b), there are four inter-related reasons to examine DIF and item bias in the PCL:SV. First, DIF and item bias are strongly tied to fairness and equity in testing (Zumbo, 2007b). That is, the outcome of a PCL:SV score can help shape treatment decisions, length of stay, and policy decisions; if the PCL:SV scores for males and females are different because of DIF, then treatment decisions, lengths of stay, and policy decisions might be unjustly different for males and females as well. Second, the detection of DIF and item bias can help rule out measurement artefact as one threat to internal validity (Zumbo, 2007b). As discussed previously, measurement artefact should be ruled out as an explanation for differences in PCL:SV scores for males and females. Third, item level analyses of the PCL:SV using DIF and item bias can help in understanding why the items are responded to in a particular way, and if the responses differ for each gender. Zumbo (2007b) describes this as trying to understand the cognitive and psychosocial processes of item responding and test performance, and if these processes differ by gender. Finally, DIF and item bias detection techniques are a useful starting point for other psychometric techniques such as lack of invariance and model-data fit (Zumbo, 2007b).

Chapter 4: Results

Scale Level Analyses

Pearson correlation.
Factor analysis was done in two steps. In the first step we needed to decide on the number of factors, so we calculated eigenvalues using Principal Components Analysis. Then, based on the number of factors, minimum residual exploratory factor analysis (MINRES) was conducted. PROMAX rotation was used to allow the factors to correlate.

Under the eigenvalue greater than one rule, the factor analysis of the Pearson correlation matrix explained amounts of variance similar to those of the polychoric factor analysis. Both a two and a three factor solution were acceptable for males and females. As in the polychoric correlation solution, the third factor had an eigenvalue very close to one. Two factors explained 52.8% of the variance for males, while the third factor increased the variance accounted for to 81.2% for males. Compared to the eigenvalues of the polychoric correlation matrix, the Pearson correlation matrix explained less variance for factor 1 but almost identical variance for factors 2 and 3. Females had almost identical results to males in that 53.5% of the variance was accounted for by a two factor model, while a three factor model increased the variance accounted for to 81.9%. As with the males, the polychoric correlation matrix explained more variance than the Pearson correlation matrix for factor 1 and almost identical variance for both factors 2 and 3.

A factor loading cut-off of 0.40 was used to determine which items of the PCL:SV loaded on each factor. For males in the two factor solution, items 4-12 loaded on factor 1 while items 1-3 loaded on factor 2. No items cross loaded on both factors. In the three factor solution, items 4-10 loaded on factor 1, items 1 and 3 loaded on factor 2, and items 11 and 12 loaded on factor 3. Note that item 2 did not load on any factor in the three factor solution. As in the two factor solution, there were no cross loadings for the three factor solution.
These factor loadings are the reverse of the factor loading results of the polychoric correlation matrix and run counter to the traditional interpersonal and behavioral factors described in the literature (Hare, 2003). Recall that items 1 and 3 are thought of as interpersonal factor items in the literature and, in keeping with the literature, loaded on factor 1 for males in the polychoric correlation matrix. Exactly the opposite is true for the Pearson correlation matrix, where items 1 and 3 loaded on factor 2 equally strongly. Similarly, items 4-10 load on factor 2, the traditional behavioral factor, for the polychoric correlation matrix and on factor 1, the traditional interpersonal factor, for the Pearson correlation matrix. Items 11 and 12 load on factor 3 for both the polychoric correlation matrix and the Pearson correlation matrix, although the factor loadings for the polychoric correlation matrix tend to be slightly larger (see Table 3).

Table 3
Male Factor Loading Comparison

| Correlation Matrix | Two Factor Model: Factor 1 | Two Factor Model: Factor 2 | Three Factor Model: Factor 1 | Three Factor Model: Factor 2 | Three Factor Model: Factor 3 |
|---|---|---|---|---|---|
| Polychoric | 1-3 | 4-12 | 1, 3 | 2, 4-10, 12 | 11, 12 |
| Pearson | 4-12 | 1-3 | 4-10 | 1, 3 | 11, 12 |

The female two factor solution saw items 1-4, 6, 7, and 9 all load on factor 1. Traditionally, in the literature for a two factor solution, items 1-6 load on factor 1, the interpersonal factor. Items 8 and 10-12 loaded on factor 2, the behavioral factor, which was to be expected based on factor loadings seen in the literature (Hare, 2003). Item 5 did not load on either factor, and no items cross loaded. Using the same loading cut-off of 0.40 as was used for the two factor solution, items 8 and 10-12 loaded on factor 1 for the three factor solution. Items 1-3 and item 7 loaded on factor 2, while items 4-6 loaded on factor 3. Item 9 failed to load on any factor in the three factor solution.
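The 0.40 cut-off rule used above to assign items to factors can be sketched in a few lines. This is only an illustration, not the LISREL/PRELIS procedure used in the study: the function name `assign_items` is hypothetical, the use of the absolute value of a loading is an assumption, and the input values are the female polychoric two factor pattern loadings reported in the Two Factor Solution table later in this chapter.

```python
# Female polychoric two factor pattern loadings (interpersonal, behavioral),
# transcribed from the Two Factor Solution table in this chapter.
FEMALE_LOADINGS = {
    1: (0.703, -0.119), 2: (0.768, -0.130), 3: (0.651, 0.139),
    4: (0.555, 0.459), 5: (0.488, 0.473), 6: (0.692, 0.314),
    7: (0.554, 0.324), 8: (0.336, 0.550), 9: (0.572, 0.394),
    10: (0.433, 0.538), 11: (-0.031, 0.665), 12: (-0.066, 0.871),
}

CUTOFF = 0.40  # loading cut-off stated in the text


def assign_items(loadings, cutoff=CUTOFF):
    """Group items by the factors they load on (hypothetical helper).

    Assumption: an item "loads" on a factor when the absolute value of its
    pattern coefficient is at least the cut-off.
    Returns (items per factor, cross-loading items, items loading nowhere).
    """
    n_factors = len(next(iter(loadings.values())))
    by_factor = {f: [] for f in range(1, n_factors + 1)}
    cross_loading, no_loading = [], []
    for item in sorted(loadings):
        hits = [f for f, value in enumerate(loadings[item], start=1)
                if abs(value) >= cutoff]
        for f in hits:
            by_factor[f].append(item)
        if len(hits) > 1:
            cross_loading.append(item)
        elif not hits:
            no_loading.append(item)
    return by_factor, cross_loading, no_loading


by_factor, cross_loading, no_loading = assign_items(FEMALE_LOADINGS)
print(by_factor)
print("cross loading:", cross_loading)
```

Applied to these loadings, the rule reproduces the female polychoric entries of the factor loading comparison table, including the items that load on both factors.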
Comparing the factor loadings for females of the polychoric and Pearson correlation matrices was slightly more complicated than for males because items did not simply flip factors when correlation methods were altered. In addition, items loaded on more than one factor for females. With a two factor solution, items 1-4, 6, 7, and 9 all load on factor 1 when either the polychoric or Pearson correlation matrix is used. Items 8 and 10-12 load on factor 2 for both matrices. So, with the exception of items 4, 5, and 10, which cross load only in the polychoric solution, factor loadings of the two factor solution for the polychoric and Pearson correlation matrices are almost identical.

Comparing the three factor solutions for the polychoric and Pearson correlation matrices is next to impossible. Only items 2, 3, and 7 load on the same factor for both matrices, factor 2. The Pearson factor loadings are slightly larger than the polychoric factor loadings for these items. As seen in Table 4, there is no consistent pattern for the remaining items moving between factors depending on the correlation matrix employed.

Table 4
Female Factor Loading Comparison

| Correlation Matrix | Two Factor Model: Factor 1 | Two Factor Model: Factor 2 | Three Factor Model: Factor 1 | Three Factor Model: Factor 2 | Three Factor Model: Factor 3 |
|---|---|---|---|---|---|
| Polychoric | 1-7, 9, 10 | 4, 5, 8, 10-12 | 1 | 2-10, 12 | 11, 12 |
| Pearson | 1-4, 6, 7, 9 | 8, 10-12 | 8, 10-12 | 1-3, 7 | 4, 5, 6 |

Thresholds and eigenvalues. Factor analysis was performed using a correlation matrix that treats the items as Likert-type responses (i.e., the polychoric correlation matrix). A comparative analysis using a Pearson correlation matrix was also performed (refer to Appendix B for Pearson correlation results). Analyses were done in the software LISREL/PRELIS. The Kaiser-Guttman rule, also known as the eigenvalues greater than 1.0 rule, was used to decide the number of factors to retain. The effective sample size for females was 466 and for males was 391.
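As a minimal sketch of the Kaiser-Guttman rule just described, the snippet below counts eigenvalues greater than 1.0 and converts each eigenvalue to a percentage of variance. It assumes the eigenvalues come from a correlation matrix of 12 items, so the total variance equals 12; the function name `kaiser_guttman` is hypothetical, and the inputs are the rounded polychoric eigenvalues reported in the eigenvalue tables of this chapter, so the percentages can differ slightly from the reported ones.

```python
N_ITEMS = 12  # the PCL:SV has 12 items, so a correlation matrix has trace 12

# Rounded polychoric eigenvalues as reported in the eigenvalue tables.
MALE_EIGENVALUES = [6.36, 1.28, 1.09]
FEMALE_EIGENVALUES = [6.99, 1.28, 0.94]


def kaiser_guttman(eigenvalues, n_items=N_ITEMS):
    """Apply the eigenvalues-greater-than-1.0 rule (hypothetical helper).

    Returns the number of retained factors, the percentage of variance for
    each eigenvalue, and the cumulative percentages.
    """
    retained = sum(1 for value in eigenvalues if value > 1.0)
    percent = [100.0 * value / n_items for value in eigenvalues]
    cumulative = []
    running = 0.0
    for value in percent:
        running += value
        cumulative.append(running)
    return retained, percent, cumulative


male_retained, male_percent, male_cumulative = kaiser_guttman(MALE_EIGENVALUES)
female_retained, female_percent, female_cumulative = kaiser_guttman(FEMALE_EIGENVALUES)

print("males:", male_retained, [round(c, 2) for c in male_cumulative])
print("females:", female_retained, [round(c, 2) for c in female_cumulative])
```

The rule retains three factors for males and two for females, which is why the third male eigenvalue (just above 1.0) and the third female eigenvalue (just below 1.0) both prompted a comparison of two and three factor solutions.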
For males, as seen in Table 5, three components had eigenvalues greater than 1.0, which accounted for 72.8% of the variance. However, the third eigenvalue is very close to 1.0, so a two factor solution that accounts for 63.7% of the variance could also be considered. A four factor solution was not tried, as suggested in the procedure section, because of the eigenvalue greater than one rule that was followed. So, both the two and three factor solutions were tried and their interpretability compared. A factor analysis extracting two factors was performed, and because a polychoric correlation matrix was used, a minimum residual estimation of the parameters in the model was done (see Table 6).

The polychoric analysis also produces threshold scores (z-scores), reported as univariate marginal parameters, that estimate when the population graduates to the next scoring level on any PCL:SV item. Thresholds assume that an item has an underlying normal distribution and, because they are normalized scores (z-scores), thresholds allow for comparisons between genders or across items. Lower z-scores mean that it is easier to move across a threshold; that is, it is easier to move to the next scoring level of an item. In item 10 (irresponsible), for instance, males have a lower threshold score of -0.029 and an upper threshold score of 1.089, whereas females have a lower threshold score of 0.168 and an upper threshold score of 1.054. This means that males move from a score of zero to one more easily than females, but males and females are similar when moving from a score of one to two.

Table 5
Male Polychoric Eigenvalues

| Factor | Initial Eigenvalue (Total) | % of Variance | Cumulative % |
|---|---|---|---|
| 1 | 6.36 | 53.03 | 53.03 |
| 2 | 1.28 | 10.65 | 63.69 |
| 3 | 1.09 | 9.12 | 72.81 |

Table 6
Male Univariate Marginal Parameters

| Item | Lower Threshold | Upper Threshold |
|---|---|---|
| 1. Superficial | 0.689 | 1.950 |
| 2. Grandiose | 0.433 | 1.427 |
| 3. Deceitful | 0.512 | 1.483 |
| 4. Lacks Remorse | 0.433 | 1.376 |
| 5. Lacks Empathy | 0.497 | 1.634 |
| 6. Doesn’t Accept Responsibility | 0.249 | 1.011 |
| 7. Impulsive | -0.042 | 0.919 |
| 8. Poor Behavioral Controls | -0.093 | 0.816 |
| 9. Lacks Goals | -0.197 | 0.816 |
| 10. Irresponsible | -0.029 | 1.089 |
| 11. Adolescent Antisocial Behavior | 0.216 | 1.254 |
| 12. Adult Antisocial Behavior | -0.210 | 0.689 |

A 0.40 level was set as the cut-off for interpreting factors. For comparison purposes and in keeping with the literature, a factor analysis extracting three factors was also done. As with the two factor solution, a minimum residual estimation of the parameters in the model was done because of the polychoric correlation matrix. A cut-off of 0.40 for interpreting factors was also used for the three factor model. Promax factor rotation was used because it allows the data to determine the degree of correlation among the factors (rather than artificially forcing the factors to be uncorrelated as per Varimax rotation). A disadvantage of the three factor solution is that two of the factors are left to be defined by only two items, which is not an optimal solution.

For females, as seen in Table 7, two components had eigenvalues greater than 1.0, which accounted for a total of 68.9% of the variance. However, the third eigenvalue was just slightly smaller than 1.0, so both two and three factor solutions were tested. Just as for males, a minimum residual estimation of the parameters in the model (see Table 8) was done because of the polychoric correlation matrix. A 0.40 cut-off was also used for interpreting factors, and promax factor rotation was used. Similar to the three factor solution for males, the three factor solution for females was not ideal because two of the factors were represented by only one or two items. In contrast, the two factor solution had a more equal distribution of items across the two factors, although some items loaded on both factors. This did not occur in the male sample.
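The threshold interpretation given earlier for item 10 (irresponsible) can be made concrete with a short sketch. Under the stated assumption of a standard normal latent response variable, the two thresholds imply the expected proportions of respondents scoring 0, 1, or 2. The function name `category_proportions` is hypothetical; the threshold values are those quoted in the text for males and females.

```python
from statistics import NormalDist


def category_proportions(lower, upper):
    """Expected proportions scoring 0, 1, and 2 on a three-point item.

    Assumption: the item reflects a standard normal latent variable that is
    cut at the lower and upper thresholds (z-scores).
    """
    cdf = NormalDist().cdf
    p_zero = cdf(lower)               # latent value below the lower threshold
    p_one = cdf(upper) - cdf(lower)   # between the two thresholds
    p_two = 1.0 - cdf(upper)          # above the upper threshold
    return p_zero, p_one, p_two


# Item 10 thresholds as quoted in the text.
male_item10 = category_proportions(-0.029, 1.089)
female_item10 = category_proportions(0.168, 1.054)

for label, props in (("males", male_item10), ("females", female_item10)):
    print(label, [round(p, 3) for p in props])
```

Under these assumptions a smaller share of males than females is expected to score zero on item 10, while the expected shares scoring two are close, which is the pattern the threshold comparison describes.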
Table 7
Female Polychoric Eigenvalues

| Factor | Initial Eigenvalue (Total) | % of Variance | Cumulative % |
|---|---|---|---|
| 1 | 6.99 | 58.21 | 58.21 |
| 2 | 1.28 | 10.70 | 68.91 |
| 3 | 0.94 | 7.80 | 76.71 |

Table 8
Female Univariate Marginal Parameters

| Item | Lower Threshold | Upper Threshold |
|---|---|---|
| 1. Superficial | | |
| 2. Grandiose | 1.196 | 2.230 |
| 3. Deceitful | 1.045 | 2.025 |
| 4. Lacks Remorse | 0.820 | 1.947 |
| 5. Lacks Empathy | 0.914 | 1.913 |
| 6. Doesn’t Accept Responsibility | 1.102 | 2.116 |
| 7. Impulsive | 0.638 | 1.651 |
| 8. Poor Behavioral Controls | 0.317 | 1.277 |
| 9. Lacks Goals | 0.392 | 1.241 |
| 10. Irresponsible | 0.168 | 1.054 |
| 11. Adolescent Antisocial Behavior | 0.403 | 1.277 |
| 12. Adult Antisocial Behavior | 0.747 | 1.651 |

Factor loadings. Traditionally, for both genders, Factor 1 is the interpersonal factor of the PCL:SV and is composed of items 1-6, whereas Factor 2 is the behavioral factor of the PCL:SV and is composed of items 7-12 (Hare, 1991). In the current study, items 1-3 load on the interpersonal factor for both men and women, and items 8, 11, and 12 load on the behavioral factor for both men and women. These results parallel the current traditional factor structure of the PCL:SV. Items 4, 5, and 10 load on both the interpersonal and behavioral factors for women, whereas these same items load on only the behavioral factor for men. In the literature (Hare, 1991), only item 10 would be expected to load on the behavioral factor. The remaining three items, 6, 7, and 9, load on the interpersonal factor for females but load on the behavioral factor for males. Again, item 6 is traditionally thought of as an interpersonal item and items 7 and 9 are thought of as behavioral items. Factor loadings and unique variance for each item, presented by gender, can be seen in Table 9. The factor correlations of the two factor model for both males and females can be characterized as moderate, with males at 0.572 and females at 0.500.

The three factor solution, as mentioned previously, is not ideal for either gender because factors 1 and 3 are defined by too few items (see Table 10).
This makes the interpretation of factors difficult. Items 1 and 3 load on factor 1 for males, while only item 1 loads on factor 1 for females. Factor 3 has items 11 and 12 load on it for both males and females. The remaining items load on factor 2 for both genders, and item 12 loads on both factors 2 and 3 for both genders. One advantage of the three factor solution is comparability, in that items load almost identically for both genders (see Table 10). For both males and females, there is a moderate relationship between factors 1 and 2, as well as between factors 2 and 3. There is a weak relationship between factors 1 and 3 (see Table 11).

Polychoric correlation matrix. Within the polychoric correlation matrix one should expect stronger relationships among items that load on the same factor. For example, the correlations between items 1 and 2 (0.449) and between items 1 and 3 (0.604) are much larger than the correlation between items 1 and 8 (0.174). As expected, items 1, 2, and 3 load together in a two factor solution whereas items 1 and 8 do not (see Table 12). Differences in the strength of the relationship depending on whether items load on the same factor can also be seen in the female population (see Table 13). The next step in understanding how psychopathy is expressed in males and females is to examine the PCL:SV at the item level using DIF.

Table 9
Two Factor Solution: Promax-Rotated Factor Loadings

| Item | Males: Factor 1 Interpersonal | Males: Factor 2 Behavioral | Males: Unique Variance | Females: Factor 1 Interpersonal | Females: Factor 2 Behavioral | Females: Unique Variance |
|---|---|---|---|---|---|---|
| 1 | 0.856 | -0.179 | 0.410 | 0.703 | -0.119 | 0.575 |
| 2 | 0.511 | 0.202 | 0.579 | 0.768 | -0.130 | 0.494 |
| 3 | 0.697 | 0.188 | 0.328 | 0.651 | 0.139 | 0.466 |
| 4 | 0.133 | 0.766 | 0.279 | 0.555 | 0.459 | 0.226 |
| 5 | 0.020 | 0.813 | 0.320 | 0.488 | 0.473 | 0.308 |
| 6 | 0.331 | 0.590 | 0.318 | 0.692 | 0.314 | 0.205 |
| 7 | 0.299 | 0.464 | 0.537 | 0.554 | 0.324 | 0.408 |
| 8 | -0.007 | 0.796 | 0.373 | 0.336 | 0.550 | 0.399 |
| 9 | 0.082 | 0.682 | 0.464 | 0.572 | 0.394 | 0.291 |
| 10 | 0.101 | 0.735 | 0.364 | 0.433 | 0.538 | 0.289 |
| 11 | 0.015 | 0.459 | 0.782 | -0.031 | 0.665 | 0.577 |
| 12 | -0.161 | 0.830 | 0.438 | -0.066 | 0.871 | 0.293 |

Note: The coefficients reported in this table are the pattern coefficients, which are akin to the standardized “regression coefficients” in the factor analysis model.

Table 10
Three Factor Solution: Promax-Rotated Factor Loadings

| Item | Males: Factor 1 | Males: Factor 2 | Males: Factor 3 | Males: Unique Variance | Females: Factor 1 | Females: Factor 2 | Females: Factor 3 | Females: Unique Variance |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.924 | -0.150 | 0.050 | 0.251 | 1.026 | -0.141 | 0.105 | 0.051 |
| 2 | 0.350 | 0.503 | -0.226 | 0.527 | 0.310 | 0.529 | -0.266 | 0.531 |
| 3 | 0.594 | 0.263 | 0.109 | 0.348 | 0.399 | 0.419 | 0.058 | 0.455 |
| 4 | 0.072 | 0.710 | 0.174 | 0.292 | 0.095 | 0.803 | 0.047 | 0.227 |
| 5 | -0.026 | 0.731 | 0.179 | 0.339 | 0.005 | 0.832 | 0.016 | 0.292 |
| 6 | 0.179 | 0.787 | -0.095 | 0.267 | 0.158 | 0.824 | -0.060 | 0.202 |
| 7 | 0.214 | 0.483 | 0.110 | 0.543 | 0.243 | 0.537 | 0.150 | 0.407 |
| 8 | -0.125 | 0.849 | 0.040 | 0.345 | -0.155 | 0.832 | 0.077 | 0.361 |
| 9 | -0.076 | 0.830 | -0.049 | 0.404 | 0.018 | 0.861 | -0.031 | 0.265 |
| 10 | -0.013 | 0.783 | 0.061 | 0.352 | -0.160 | 0.952 | 0.015 | 0.218 |
| 11 | 0.150 | -0.081 | 0.811 | 0.327 | 0.125 | -0.044 | 0.986 | 0.017 |
| 12 | -0.122 | 0.432 | 0.620 | 0.267 | -0.131 | 0.495 | 0.463 | 0.404 |

Table 11
Factor Correlations for Males and Females

| Factor | Males: 1 | Males: 2 | Males: 3 | Females: 1 | Females: 2 | Females: 3 |
|---|---|---|---|---|---|---|
| 1 | 1.000 | | | 1.000 | | |
| 2 | 0.521 | 1.000 | | 0.533 | 1.000 | |
| 3 | 0.235 | 0.443 | 1.000 | 0.155 | 0.451 | 1.000 |

Table 12
Male Polychoric Correlation Matrix

| Item | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1. Superficial | 1.000 | | | | | | | | | | | |
| 2. Grandiose | 0.449 | 1.000 | | | | | | | | | | |
| 3. Deceitful | 0.604 | 0.508 | 1.000 | | | | | | | | | |
| 4. Lacks Remorse | 0.364 | 0.409 | 0.569 | 1.000 | | | | | | | | |
| 5. Lacks Empathy | 0.303 | 0.430 | 0.448 | 0.841 | 1.000 | | | | | | | |
| 6. Doesn’t Accept Responsibility | 0.440 | 0.515 | 0.575 | 0.752 | 0.683 | 1.000 | | | | | | |
| 7. Impulsive | 0.368 | 0.406 | 0.529 | 0.497 | 0.453 | 0.552 | 1.000 | | | | | |
| 8. Poor Behavioral Controls | 0.174 | 0.493 | 0.471 | 0.647 | 0.681 | 0.638 | 0.445 | 1.000 | | | | |
| 9. Lacks Goals | 0.161 | 0.512 | 0.461 | 0.560 | 0.520 | 0.585 | 0.564 | 0.618 | 1.000 | | | |
| 10. Irresponsible | 0.283 | 0.410 | 0.507 | 0.617 | 0.588 | 0.667 | 0.595 | 0.630 | 0.705 | 1.000 | | |
| 11. Adolescent Antisocial Behavior | 0.246 | 0.112 | 0.405 | 0.390 | 0.377 | 0.203 | 0.329 | 0.321 | 0.276 | 0.318 | 1.000 | |
| 12. Adult Antisocial Behavior | 0.194 | 0.227 | 0.353 | 0.594 | 0.609 | 0.503 | 0.479 | 0.539 | 0.458 | 0.566 | 0.624 | 1.000 |

Table 13
Female Polychoric Correlation Matrix

| Item | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1. Superficial | 1.000 | | | | | | | | | | | |
| 2. Grandiose | 0.501 | 1.000 | | | | | | | | | | |
| 3. Deceitful | 0.589 | 0.432 | 1.000 | | | | | | | | | |
| 4. Lacks Remorse | 0.466 | 0.488 | 0.622 | 1.000 | | | | | | | | |
| 5. Lacks Empathy | 0.417 | 0.428 | 0.485 | 0.864 | 1.000 | | | | | | | |
| 6. Doesn’t Accept Responsibility | 0.506 | 0.545 | 0.638 | 0.802 | 0.783 | 1.000 | | | | | | |
| 7. Impulsive | 0.490 | 0.471 | 0.588 | 0.629 | 0.576 | 0.658 | 1.000 | | | | | |
| 8. Poor Behavioral Controls | 0.237 | 0.431 | 0.480 | 0.660 | 0.634 | 0.660 | 0.542 | 1.000 | | | | |
| 9. Lacks Goals | 0.396 | 0.533 | 0.570 | 0.707 | 0.705 | 0.741 | 0.646 | 0.671 | 1.000 | | | |
| 10. Irresponsible | 0.261 | 0.491 | 0.549 | 0.669 | 0.702 | 0.718 | 0.681 | 0.735 | 0.794 | 1.000 | | |
| 11. Adolescent Antisocial Behavior | 0.301 | 0.071 | 0.340 | 0.413 | 0.384 | 0.361 | 0.489 | 0.441 | 0.395 | 0.427 | 1.000 | |
| 12. Adult Antisocial Behavior | 0.195 | 0.207 | 0.387 | 0.623 | 0.566 | 0.517 | 0.448 | 0.541 | 0.500 | 0.566 | 0.652 | 1.000 |

Item Level Analyses

Differential item functioning. Gender DIF analyses were done for all 12 PCL:SV items, as seen in Table 14. The three steps in Table 14 represent a hierarchical multiple regression, and test whether the group and interaction variables are statistically significant over and above the conditioning variable. This means that they are testing for uniform and non-uniform DIF simultaneously (Zumbo, 1999). Step one is the total score, or the conditioning variable. Step two is the group variable, or the difference between genders, which is the uniform DIF. Step three is the group by total score interaction, or the non-uniform DIF. When the test statistic (chi-square), or the R-squared, from step one is subtracted from that of step three, the DIF value is obtained.
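The subtraction described above can be sketched as a small worked calculation. This is an illustration under stated assumptions rather than the software used in the study: the function name `dif_test` is hypothetical, the DIF chi-square is taken to have 2 degrees of freedom (as listed in the DIF results table), and for 2 degrees of freedom the chi-square upper tail probability has the closed form exp(-x / 2). The Bonferroni corrected alpha is assumed to be 0.05 divided by the 12 items, which matches the value in the table note. Input values for items 10 and 4 are taken from the DIF results table.

```python
import math

N_ITEMS = 12
BONFERRONI_ALPHA = 0.05 / N_ITEMS  # assumed family-wise alpha of 0.05


def dif_test(chi2_step1, chi2_step3, r2_step1, r2_step3):
    """Combined uniform and non-uniform DIF test (hypothetical helper).

    Subtracts the step one results from the step three results. The
    difference in chi-square has 2 degrees of freedom, for which the upper
    tail probability is exactly exp(-x / 2).
    """
    delta_chi2 = chi2_step3 - chi2_step1
    p_value = math.exp(-delta_chi2 / 2.0)
    delta_r2 = r2_step3 - r2_step1  # effect size of the DIF
    return delta_chi2, p_value, delta_r2


# Item 10 (Irresponsible) and item 4 (Lacks Remorse), from the DIF results table.
item10 = dif_test(720.802, 745.84, 0.615, 0.629)
item4 = dif_test(623.411, 623.534, 0.626, 0.626)

for label, (chi2, p, r2) in (("item 10", item10), ("item 4", item4)):
    flagged = p < BONFERRONI_ALPHA
    print(f"{label}: chi2={chi2:.3f}, p={p:.5f}, R2 change={r2:.3f}, DIF={flagged}")
```

With these inputs, item 10 falls below the corrected alpha while item 4 does not, and the R-squared change for item 10 is 0.014, in line with the tabled results.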
Item response functions (the predicted item scores from the DIF regression) were plotted by gender using scatterplots with LOESS smoothing for all items, as well as for the subscale and total scores. Most items had no DIF. Item 4 (Lacks remorse), graphed below (see Figure 1), is an example of an item with no DIF. At a quick glance no DIF can be seen because the item lines for males and females lie on top of one another. Numerically, item 4 has no DIF because the p-value of the chi-square is 0.94035 while the Bonferroni corrected alpha needed for significance is 0.0041667. Furthermore, the R-squared value is tiny, listed in Table 14 as zero.

Those items that displayed DIF had minimal amounts. Item 10 (Irresponsible), graphed below (see Figure 2), is an example of an item with minimal non-uniform DIF. Non-uniform DIF exists when neither group has a consistently higher probability of endorsing the item across all levels of the trait. That is, there is an interaction between PCL:SV score and gender. Non-uniform DIF appears graphically when the male and female lines are not parallel to one another throughout the entire graph. DIF for item 10 occurs when item scores are one or larger and when the total score is greater than five. The DIF for this item is interesting because females actually score higher than males on this item. This suggests that in terms of irresponsibility and psychopathy, those females whose total score falls in the questionable psychopathy range will have a higher score for the irresponsibility item than their male counterparts. Perhaps this item captures female psychopathic characteristics better than some of the other PCL:SV items. However, keep in mind that the DIF creates less than a one point difference between males and females, so it is a small contribution to how males and females may differ on the PCL:SV. The p-value for item 10 is very small (zero to five decimal places) and the effect size (R-squared) is 0.014.
While 0.014 is considered a small effect size, it is the largest effect size of all 12 items. This lends support to the argument that item level differences between males and females for the PCL:SV are minimal. Items 2 (Grandiose), 9 (Lacks goals), and 12 (Adult antisocial behaviour) also display minimal DIF, and are graphed below (see Figures 3, 4, and 5). Note that item 9, like item 10, has non-uniform DIF. The interpersonal subscale DIF plot shows differences between male and female scores after a total score of 15 has been achieved, but the difference between genders is less than a point (see Figure 6). The behavioural subscale DIF plot has no DIF (i.e. the male and female lines overlap) (see Figure 7), nor does the overall PCL:SV DIF plot (see Figure 8).

Figure 1. Item Response Function Example of No DIF for PCL:SV Item 4 (Lacks Remorse).

Figure 2. Item Response Function Example of Non-uniform DIF for PCL:SV Item 10 (Irresponsible).

Figure 3. Item Response Function Example of Uniform DIF for PCL:SV Item 2 (Grandiose).

Figure 4. Item Response Function Example of Non-uniform DIF for PCL:SV Item 9 (Lacks Goals).

Figure 5. Item Response Function Example of Uniform DIF for PCL:SV Item 12 (Adult Antisocial Behavior).

Figure 6. Sub-scale Response Function Example of Uniform DIF for PCL:SV Interpersonal Subscale (Items 1-6).

Figure 7. Sub-scale Response Function Example of No DIF for PCL:SV Behavioural Subscale (Items 7-12).

Figure 8. Scale Response Function Example of No DIF for PCL:SV Total Score (Items 1-12).
Table 13
DIF Results

        Step 1                 Step 2                 Step 3                 DIF
Item    χ²        df  R²       χ²        df  R²       χ²        df  R²       χ²       df  p-value  R²
1       161.299   1   0.230    166.252   2   0.236    167.108   3   0.237    5.809    2   0.05478  0.007
2       292.220   1   0.336    305.808   2   0.349    305.878   3   0.349    13.658   2   0.00108  0.013
3       397.356   1   0.440    399.118   2   0.442    399.215   3   0.442    1.859    2   0.39475  0.002
4       623.411   1   0.626    623.449   2   0.626    623.534   3   0.626    0.123    2   0.94035  0.000
5       513.030   1   0.583    514.711   2   0.584    515.806   3   0.585    2.776    2   0.24957  0.002
6       689.024   1   0.617    689.071   2   0.617    689.090   3   0.617    0.066    2   0.96754  0.000
7       553.210   1   0.504    554.685   2   0.505    560.977   3   0.509    7.767    2   0.02058  0.005
8       635.694   1   0.553    635.869   2   0.553    638.900   3   0.555    3.206    2   0.20129  0.002
9       652.204   1   0.557    655.802   2   0.559    668.354   3   0.567    16.150   2   0.00031  0.010
10      720.802   1   0.615    732.530   2   0.621    745.840   3   0.629    25.038   2   0.00000  0.014
11      253.877   1   0.295    260.690   2   0.302    263.078   3   0.305    9.201    2   0.01005  0.010
12      508.845   1   0.470    522.604   2   0.480    522.757   3   0.480    13.912   2   0.00095  0.010

*Note: Bonferroni corrected alpha = 0.0041667. Bold text represents items with a significant p-value. Italic text represents items with a very small, but not significant, p-value. Italic and bold text represents an item with a very small, but not significant, p-value. This item has a very small R² value.

Chapter 5: Discussion

The PCL:SV is used interchangeably with males and females although, to date, little research has examined whether the measure functions differently for the two genders. The goal of this study was to determine statistically, at both the test level and the item level, whether the PCL:SV is measuring the same construct of psychopathy for both genders, as well as what inferences can be drawn from the statistical findings. This thesis also makes an important contribution to the psychometric methodological literature because it compares the results of Pearson and polychoric correlations as a prelude to factor analysis. The following research questions were answered during the course of the study:

1. Are the dimensions of the PCL:SV the same for males and females, and does the methodological approach taken alter the dimensional findings?
2. Do the PCL:SV items, sub-scales, and the scale as a whole function the same for males and females?

The answers to these research questions allow gender equivalence, or a lack thereof, to be established at an empirical level for the PCL:SV rather than assumed because of PCL-R research or anecdotal evidence.

Factor Level Discussion

Appropriate factor model.

Throughout the results and discussion sections of this thesis many references have been and will be made to a two factor solution “in the literature.” The two factor solution was the work of Hare (1991), who is also the author of the PCL:SV. Although alternative three and four factor solutions exist in the literature, the two factor solution best matched the present study’s eigenvalues and factor loading results. However, more recently, Hare (2003) has published a four factor solution that essentially takes factors one and two and splits each of them into two factors. Because the PCL:SV is tapping an essentially unidimensional construct to begin with, it is not an unreasonable stretch to say that interpreting results with a two or a four factor solution merely applies a different degree of compartmentalization to a single factor, psychopathy, in which some factors are more interrelated than others. The difference is the scale level detail applied based on the factor loadings. That being said, a much more detailed explanation of construct differences in psychopathy for males and females can take place when using a four factor model versus a two factor model. The four factor model contains the following factors: Factor 1: interpersonal; Factor 2: affective; Factor 3: impulsive and behavioural/lifestyle; Factor 4: antisocial behaviour (Hare, 2003).

Dimensional nature of PCL:SV.
The first step of an exploratory factor analysis is to calculate eigenvalues in order to deduce how many factors should be included in the model. In the current study, the optimal number of factors (two) was the same for males and females. Eigenvalues and percentage of variance accounted for were very similar for males and females. The male two factor solution accounted for 63.7% of the variance and the female two factor solution accounted for 68.9% of the variance. The three factor solution was also calculated for males and females because the third eigenvalue was slightly larger than one for males and slightly smaller than one for females. For both genders, the lack of items loading on the third factor meant that the two factor solution was preferred. It is important to remember that one cannot assume that the factor structures and loadings are the same for males and females simply because a two factor solution is best for both genders. Multiple two factor solutions can exist within the same dataset, each one unique to a particular respondent group.

After comparing the number of factors chosen based on eigenvalues, the next step in examining the factor models for males and females is to review the overall item loadings. As mentioned previously, a 0.4 cut-off was chosen for this purpose. Six out of 12 items load cleanly on the same factor for both males and females. This suggests that psychopathy is the same construct for males and females. Four out of 12 items load on the same factor for males and females, but, for females, the items also load equally on the second factor. This suggests that, when measured by the PCL:SV, the construct of psychopathy has clearer boundaries for males than for females. Two out of 12 items load on different factors for males and females. This suggests that these items are tapping a different construct of psychopathy for males and females.
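The two steps just described, judging the number of factors from eigenvalues and then sorting items by a 0.4 loading cut-off, can be sketched as follows. This is an illustrative Python sketch, not the analysis code used in the study. The eigenvalues are the female Pearson eigenvalues reported in Appendix B, which may differ from the eigenvalues discussed in the paragraphs above. The eigenvalue-greater-than-one rule is assumed as the retention criterion, and the loadings passed to `classify_item` are hypothetical values chosen only to show each outcome. All function names are hypothetical.

```python
def variance_explained(eigenvalues, n_items):
    """Percent and cumulative percent of variance for each factor.

    For a correlation matrix the eigenvalues sum to the number of items,
    so each factor explains eigenvalue / n_items of the total variance.
    """
    pct = [100.0 * ev / n_items for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for p in pct:
        running += p
        cumulative.append(running)
    return pct, cumulative


def kaiser_count(eigenvalues):
    """Number of factors with an eigenvalue greater than one (assumed rule)."""
    return sum(1 for ev in eigenvalues if ev > 1.0)


def classify_item(loadings, cutoff=0.4):
    """Label an item as clean, cross-loading, or without a salient loading."""
    salient = [i + 1 for i, value in enumerate(loadings) if abs(value) >= cutoff]
    if not salient:
        return "no salient loading", salient
    if len(salient) == 1:
        return "clean", salient
    return "cross-loading", salient


# Female Pearson eigenvalues from Appendix B (first three factors of 12 items).
female_pearson = [5.185, 1.237, 1.001]
pct, cumulative = variance_explained(female_pearson, n_items=12)
print([round(p, 2) for p in pct], round(cumulative[-1], 2))
print("factors with eigenvalue > 1:", kaiser_count(female_pearson))

# Hypothetical two factor loadings, for illustration only.
for loadings in ([0.71, 0.12], [0.48, 0.46], [0.31, 0.38]):
    print(loadings, classify_item(loadings))
```

A third eigenvalue that sits barely above one, as in this example, is the situation in which the loadings on the third factor, rather than the eigenvalue alone, decide whether the extra factor is worth keeping.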
Now that an overall picture of factor structure for males and females has been established, the next step is to compare the item loadings by gender. Items 1 - 3 load on the interpersonal factor for males and females. This is expected in the literature (Hare, 2003). Items 8, 11, and 12 load on the behavioural factor for males and females, as expected in the literature. Items 4, 5, 9, and 10 load as behavioural factor items for men and cross-load as both behavioural and interpersonal factor items for women. The cross loading is split equally between the two factors for women. Items 4 and 5 are expected to be interpersonal according to the literature (Hare, 2003), while items 9 and 10 are behavioural in the literature. Items 6 and 7 load differently for males and females in that both load on the behavioural factor for males and the interpersonal factor for females. Item 6 is expected to be interpersonal in the literature while item 7 is expected to be behavioural. In the current study, factor model comparisons that are made to the literature refer to a two factor solution, although three and four factor solutions have also been suggested. The results of the factor analyses, when compared for males and females, tell us that the PCL:SV is not consistently measuring the same construct of psychopathy for males and females, although there are overlaps.

When discussing the scale and item level results of the PCL:SV for females, especially when these results differ from those for males, it is worth remembering that the PCL:SV is a male-centric scale designed for measuring psychopathy in males (Forouzan & Cooke, 2005). This allows for a second way to interpret the results of the current study. It is not just that the PCL:SV measures a different construct of psychopathy for females; it is an indicator of how closely the disorder of psychopathy in females mimics the disorder in males.
This is an important distinction to make when female respondents are being measured on a scale designed to capture a disorder in males. What the PCL:SV does appear to do well for both genders is measure the level of interpersonal traits typically found in a sly and cunning male psychopath. Recall that the superficial, grandiose, and deceitful items of the PCL:SV loaded together as factor 1 for both males and females. However, using the size of the factor loadings as an indicator, males exhibit these features more strongly than females.

In males, only the first factor (items 1 - 3) reflects Hare’s four factor solution, while the remaining items lump together into a second factor rather than three separate factors. While some may argue that this two factor solution is an accurate representation of psychopathy in males, I purport that this two factor solution is merely evidence of the coarseness of the PCL:SV. That is, the PCL:SV is a screening tool meant to screen for psychopathy at a clinical level, to be followed up when required with an in-depth interview using the PCL-R. Based on the factor loadings, the PCL:SV is successful at identifying sly, cunning male psychopaths. The PCL:SV also successfully identified the behavioural factor of psychopathy, although not in enough detail to separate it into two factors, lifestyle choices and behavioural. Where the PCL:SV fails to capture the construct of psychopathy is in the affective factor. One may speculate that this is because the affective component of psychopathy may be less important (for males) to the disorder. In order to recognize the limits of my scope of knowledge and practice, the reader should note that these, and other, clinical speculations are primarily founded in psychometric theory and not clinical theory or practice.

What we observe happening with the female population is two-fold. First, like the males, females require a more precise measure of psychopathy than is captured by the PCL:SV.
Items 7-12 lump together into a lifestyle choices/behavioural factor. The impulsive item (item 7) comes close to cross loading on both the interpersonal and behavioural factors, but falls just short of the 0.4 cut-off. One must raise the question of whether the female results would continue to mimic those of the males if the PCL:SV were a more precise measure of the construct of psychopathy. Or rather, if precision allowed items to fall within a four factor model as Hare suggested, then the behavioural items would suggest a different construct of psychopathy for males and females.

The second observation made about females and the PCL:SV results concerns the cross-loading of the factor two (of four) affective items. This phenomenon did not occur in the male respondents. It appears that the affective component of the construct of psychopathy in females is challenging to capture. This is one reason for the cross loading of the lacks remorse, lacks empathy, and doesn’t accept responsibility items on both the affective and behavioural factors. Traditionally, the affective symptoms of psychopathy play a large role in this disorder in females. Perhaps, then, a weakness of a measure designed to capture the construct of psychopathy in males is its inability to adequately measure affective psychopathic traits in females. For this reason items 4-6 bounce between two factors, not really fitting well with either. This line of reasoning lends the biggest support to the argument that the PCL:SV is not capturing the same disorder in males and females.

Methodological contribution: Comparing correlation matrices.

The Pearson correlation is commonly used in factor analysis, including in studies involving Likert-type data, although technically the use of the polychoric correlation matrix is more statistically appropriate because it takes into account the ordinal nature of the data (Holgado-Tello, Chacón-Moscoso, Barbero-García, & Vila-Abad, 2010).
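One way to see why the ordinal nature of the data matters is a small simulation. The sketch below does not estimate a polychoric correlation; it only illustrates, under stated assumptions, how a Pearson correlation shrinks when two correlated continuous latent traits are cut into three ordered scores (0, 1, 2). The latent correlation of 0.6, the thresholds of 0.0 and 1.0, the sample size, and all function names are hypothetical choices made for illustration, not values from the study.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def to_item_score(z, thresholds=(0.0, 1.0)):
    """Cut a continuous latent value into an ordinal 0/1/2 item score."""
    return sum(z > t for t in thresholds)


def simulate(rho=0.6, n=20000, seed=1):
    """Return (Pearson r on latent values, Pearson r on 0/1/2 scores)."""
    rng = random.Random(seed)
    latent_x, latent_y = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        latent_x.append(u)
        # Build a second standard normal variable correlated rho with the first.
        latent_y.append(rho * u + math.sqrt(1.0 - rho ** 2) * v)
    item_x = [to_item_score(z) for z in latent_x]
    item_y = [to_item_score(z) for z in latent_y]
    return pearson(latent_x, latent_y), pearson(item_x, item_y)


latent_r, item_r = simulate()
print(f"latent r = {latent_r:.3f}, Pearson r on ordinal scores = {item_r:.3f}")
```

The same direction of difference appears in the matrices reported in this thesis: for females, items 4 and 5 correlate 0.864 in the polychoric matrix (Table 12) but .669 in the Pearson matrix (Appendix B). The polychoric correlation aims to recover the latent association, which the Pearson correlation on coarse scores understates.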
Due to the discrepancy between the prevalence of the Pearson correlation and the correctness of the polychoric correlation, this study employed both statistical methods when approaching factor analysis. Comparing and contrasting the results will hopefully encourage better acceptance of the polychoric correlation matrix and related statistical techniques in the research community, because one can see that this technique produces results that are just as useful as those from the Pearson correlation. One could argue that polychoric results are just as easy to obtain and that the results are actually easier to interpret because they are truer to the data, and hence more trustworthy.

Recall that items 1 and 3 are thought of as behavioral factor items in the literature (Hare, 1991) and, in keeping with the literature, loaded on factor 1 for males in the polychoric correlation matrix. Exactly the opposite is true for the Pearson correlation, where items 1 and 3 loaded equally strongly on factor two. Similarly, items 4 - 10 load on the traditional interpersonal factor known as factor 2 for the polychoric correlation matrix and on the behavioral factor known as factor 1 for the Pearson correlation matrix. Items 11 and 12 load on factor 3 for both the polychoric correlation matrix and the Pearson correlation matrix, although the factor loadings for the polychoric correlation matrix tend to be slightly larger. Recall that Table 3 in the Results section presents these differences.

Comparing the factor loadings for females of the polychoric and Pearson correlation matrices was slightly more complicated than for males because items did not simply flip factors when correlation methods were altered. In addition, items loaded on more than one factor for females. With a two factor solution, items 1 - 4, 6, 7, and 9 all load on factor 1 when either the polychoric or Pearson correlation matrices are used. Items 8 and 10 - 12 load on factor 2 for both matrices.
So, with the exception of items 4 and 5, factor loadings of the two factor solution for the polychoric and Pearson correlation matrices are very similar.

Comparing the three factor solutions for the polychoric and Pearson correlation matrices is next to impossible. Only items 2, 3, and 7 load on the same factor for both matrices, factor 2. The Pearson factor loadings are slightly larger than the polychoric factor loadings for these items. As seen in Table 4 in the Results section, there is no consistent pattern for the remaining items moving between factors depending on the correlation matrix employed.

Item Level Discussion

The gender differences in psychopathy revealed by the factor analyses were counteracted by the DIF results. At the item level, essentially no DIF appears for males and females. This is true for all PCL:SV items as well as for factor and test level DIF. Tests of DIF were done only for a two factor model, not a four factor model, because although considering a more detailed four factor model aided the conclusions drawn from the factor analyses, this was an after-thought arrived at when trying to draw meaningful interpretations from the results. What little DIF can be observed is at the subscale level, with the interpersonal subscale. This minimal DIF does lend support to the argument that the PCL:SV is capturing a construct difference in psychopathy between males and females, particularly in the interpersonal areas of the disorder.

Factor analyses lend some compelling evidence that the construct of psychopathy in males and females is not the same, or at least is not measured the same by the PCL:SV. However, the DIF results counteract that by demonstrating that even if the construct of psychopathy has measurable gender differences, at the item level gender differences are almost completely negated and the use of the PCL:SV to measure psychopathy is acceptable.
Although this result is frustrating because there is no clear-cut answer as to whether continuing to measure psychopathy in females with the PCL:SV is appropriate, it is also a relief because there is a measurable similarity in construct between males and females. In short, one can conclude that the constructs are not equivalent but that, perhaps by psychometric good fortune, the scores, when obtained using the procedures in the technical manual, are comparable across genders. What is needed now is to “tweak” the measurement of psychopathy in females so that the PCL:SV can capture with greater and equal clarity the disorder of psychopathy in both males and females.

Next Steps

In general, the construct of psychopathy is being measured the same for males and females, albeit with much more clarity for males. This difference in clarity became obvious when discussing the male and female factor analysis results and how the female PCL:SV items loaded across factors whereas the male items did not. Rather than looking at just the measure and redefining the items, norms, etc. to account for gender differences, we need to go back to the construct itself and determine whether the construct of psychopathy is the same for males and females as measured by the PCL:SV. As Forouzan and Cooke (2005) state, an essential task is to map the domain of symptoms of the disorder and their expression in females. They go on to conclude that this can be accomplished with two types of studies. First are clinical studies that map a range of potential symptoms. Second are psychometric studies that model interrelationships amongst symptoms and diagnostic significance. The PCL:SV is a tool developed to capture psychopathy in males and may not flush out traits of the disorder more common in females. As the results tell us, changes may need to be made in order to accurately measure the affective component of psychopathy in females.
Reexamining the construct is important because if it is the same construct for both genders then an adjustment needs to be made to the measurement of psychopathy, but if the construct is not comparable across genders then a redefinition of the construct needs to take place (Zumbo, 2007). If we simply delete items to erase gender differences, then important information about the construct of psychopathy may not be captured. The construct needs to be defined correctly because of the implications for use in terms of diagnosis and treatment of psychopathy.

If historically the construct of psychopathy has been inappropriately assumed to be equivalent for males and females when a measure such as the PCL:SV is used, then the potential problems of rater and gender bias also need to be addressed. That is, raters may be introducing a bias into their scoring when completing the PCL:SV for female respondents because of preconceived notions of psychopathy in females. This is a threat to validity known as measurement artefact. A talk-aloud protocol, whereby raters think out loud while scoring the PCL:SV, is one way to investigate rater bias. The rater verbalizations are recorded and analyzed for common themes that would not be intuitively obvious from examining scale and item level analyses (Ericsson & Simon, 1984). Results of the talk-aloud protocol should be examined to see if the gender of the rater has an impact on validity. PCL:SV scores and the gender of the rater should also be examined to investigate whether scores differ depending on the gender of the rater. After rater bias is examined, construct equivalence or comparability must be reviewed again in the validation process.

Examining the psychometric properties of the PCL:SV, as was done in the current study, is ultimately a utility exercise in the validation process (Hubley & Zumbo, 1996).
That is, construct comparability in psychopathy, as measured by the PCL:SV, needs to be established psychometrically for males and females so that accurate and relevant conclusions can be drawn at the clinical level. Even if gender comparisons are not being made at the clinical level, the same tool, including the same cut-off scores, is being used in the same way for both genders. Furthermore, the same conclusions based on test scores are being drawn and the construct is implicitly being treated as comparable. If construct comparability is ignored, over- and under-identification in the screening process for psychopathy can occur. False positive and false negative psychopathy screening results can skew clinical findings. This can be detrimental not only on a case-by-case basis where follow-up with the PCL-R occurs, but also in situations where case results are combined into a database of results. These results can then be used for research purposes that help guide healthcare funding decisions, resource allocation, intervention selection, and ultimately future research focus. Thus, the final but ongoing step, after psychometrically establishing that the construct of psychopathy is essentially equivalent for males and females as measured by the PCL:SV, is to consider the clinical implications of using the PCL:SV.

I would like to close this thesis with a reflection on applied psychometrics. The primary foci of psychometrics are the statistical methods and results. However, inherent in applied psychometrics is the inevitable tension one experiences at the edge of the scope of knowledge and professional practice. That is, when focusing on the statistics and mathematics, a psychometrician is fully within their scope. However, when interpreting the findings, one may find themselves, at times, at the limit of their knowledge of the content domain (e.g. psychopathy).
It is important for one to acknowledge their scope of knowledge and practice and to recognize and state when they are on the edge of that boundary. Of course, interpretation of the results will inevitably lead one to that uncomfortable domain, and in broader practice outside of a thesis, one would fully collaborate with a content specialist (Zumbo & Rupp, 2004).

References

American Psychiatric Association. (1980). Diagnostic and statistical manual of mental disorders (3rd ed.). Washington, DC.
Bolt, D. M., Hare, R. D., & Neumann, C. S. (2007). Score metric equivalence of the Psychopathy Checklist – Revised (PCL-R) across criminal offenders in North America and the United Kingdom. A critique of Cooke, Michie, Hart, and Clark (2005) and new analyses. Assessment, 14, 44-56.
Coenders, G. & Saris, W. E. (1995). Categorization and measurement quality. The choice between Pearson and polychoric correlations. In Saris, W. E. & Münnich, Á. (Eds.) The multitrait-multimethod approach to evaluate measurement instruments. Eötvös University Press, Budapest: 125-144.
Cleckley, H. M. (1941). The mask of sanity. St. Louis: Mosby.
Cooke, D. J. & Michie, C. (1997). An item response theory analysis of the Hare Psychopathy Checklist – Revised. Psychological Assessment, 9, 3-14.
Cooke, D. J., Michie, C., Hart, S. D., & Hare, R. D. (1999). Evaluating the screening version of the Hare Psychopathy Checklist – Revised (PCL:SV): An item response theory analysis. Psychological Assessment, 11, 3-13.
Cooke, D. J. & Michie, C. (2001). Refining the construct of psychopathy: Towards a hierarchical model. Psychological Assessment, 13, 171-188.
Douglas, K. S., Strand, S., Belfrage, H., Fransson, G., & Levander, S. (2005). Reliability and validity of evaluation of the Psychopathy Checklist: Screening Version (PCL:SV) in Swedish correctional and forensic psychiatric samples. Assessment, 12, 145-161.
Ericsson, K. A. & Simon, H. A. (1984). Protocol analysis: Verbal reports as data. Cambridge: The MIT Press.
Flora, D. B. & Curran, P. J. (2004). An empirical evaluation of alternative methods of estimation for confirmatory factor analysis with ordinal data. Psychological Methods, 9, 466-491.
Forouzan, E. & Cooke, D. J. (2005). Figuring out la femme fatale: Conceptual and assessment issues concerning psychopathy in females. Behavioral Sciences and the Law, 23, 765-778.
Gelin, M. N., & Zumbo, B. D. (2003). DIF results may change depending on how an item is scored: An illustration with the Center for Epidemiological Studies Depression (CES-D) scale. Educational and Psychological Measurement, 63, 65-74.
Gelin, M. N., Carleton, B. C., Smith, M. A., & Zumbo, B. D. (2004). The dimensionality and gender differential item functioning of the Mini Asthma Quality of Life Questionnaire (MINIAQLQ). Social Indicators Research, 68, 91-105.
Guy, L. S. & Douglas, K. S. (2006). Examining the utility of the PCL:SV as a screening measure using competing factor models of psychopathy. Psychological Assessment, 18, 225-230.
Hare, R. D. (1980). A research scale for the assessment of psychopathy in criminal populations. Personality & Individual Differences, 1, 111-119.
Hare, R. D. (1991). The Hare Psychopathy Checklist – Revised. Toronto, ON, Canada: Multi-Health Systems.
Hare, R. D. (2003). The Hare Psychopathy Checklist – Revised (2nd ed.). Toronto, ON, Canada: Multi-Health Systems.
Hart, S., Cox, D., & Hare, R. D. (1995). Manual for the Psychopathy Checklist: Screening Version (PCL:SV). Toronto: Multi-Health Systems.
Hill, C. D., Neumann, C. S., & Rogers, R. (2004). Confirmatory factor analysis of the Psychopathy Checklist: Screening Version in offenders with Axis 1 disorders. Psychological Assessment, 16, 90-95.
Holgado-Tello, F., Chacón-Moscoso, S., Barbero-García, I., & Vila-Abad, E. (2010). Polychoric versus Pearson correlations in exploratory and confirmatory factor analysis of ordinal variables. Quality & Quantity: International Journal of Methodology, 44, 153-166.
Hubley, A. M. & Zumbo, B. D. (1996). A dialectic on validity: Where we have been and where we are going. The Journal of General Psychology, 123, 207-215.
Lykken, D. T. (2006). Psychopathic personality: The scope of the problem. In C. J. Patrick (Ed.), Handbook of psychopathy (pp. 3-13). New York: Guilford Press.
Monahan, J., Steadman, H., Silver, E., Appelbaum, P., Robbins, P., Mulvey, E., Roth, L., Grisso, T., & Banks, S. (2001). Rethinking risk assessment: The MacArthur study of mental disorder and violence. New York: Oxford University Press.
Nicholls, T. (2004). Violence risk assessments with female NCRMD acquittees: Validity of the HCR-20 and PCL-SV. Dissertation Abstracts International, 64, Retrieved from EBSCOhost.
Nicholls, T. L., Ogloff, J. R. P., & Ledwidge, B. (2003). Clinical assessments of violence risk with male and female psychiatric patients. In J. R. P. Ogloff (Chair), Inpatient aggression, risk for violence, and community adjustment in forensic and civil psychiatric patients. The 2nd biennial Psychology-Law International Interdisciplinary Conference, Edinburgh, Scotland.
Nicholls, T. L., Ogloff, J., & Douglas, K. (2004). Assessing risk for violence among male and female civil psychiatric patients: the HCR-20, PCL:SV, and McNiel & Binder's screening measure. Behavioral Sciences and the Law, 22, 127-158.
Rogers, R., Salekin, R. T., Hill, C., Sewell, K. W., Murdock, M. E., & Neumann, C. S. (2000). The Psychopathy Checklist – Screening Version: An examination of criteria and subcriteria in three forensic samples. Assessment, 7, 1-15.
Seto, M. C., Harris, G. T., & Rice, M. E. (2004). The criminogenic, clinical, and social problems of forensic and civil psychiatric patients. Law & Human Behavior, 28, 577-587.
Silver, E., Mulvey, E. P., & Monahan, J. (1999). Assessing violence risk among discharged psychiatric patients: Toward an ecological approach. Law and Human Behavior, 23, 237-256.
Skeem, J. L., Mulvey, E. P., & Grisso, T. (2003). Applicability of traditional and revised models of psychopathy to the Psychopathy Checklist: Screening Version. Psychological Assessment, 15, 41-55.
Slocum, S. L., Gelin, M. N., & Zumbo, B. D. (in press). Statistical and graphical modeling to investigate differential item functioning for rating scale and Likert item formats. In B. D. Zumbo (Ed.) Developments in the theories and applications of measurement, evaluation, and research methodology across the disciplines, Volume 1. Vancouver: Edgeworth Laboratory, University of British Columbia.
Steadman, H. J., Mulvey, E. P., Monahan, J., Robbins, P. C., Appelbaum, P. S., Grisso, T., Roth, L. H., & Silver, E. (1998). Violence by people discharged from acute psychiatric inpatient facilities and by others in the same neighborhoods. Archives of General Psychiatry, 55, 393-401.
Steadman, H., Silver, E., Monahan, J., Appelbaum, P., Robbins, P., Mulvey, E., Grisso, T., Roth, L., & Banks, S. (2000). A classification tree approach to the development of actuarial violence risk assessment tools. Law and Human Behavior, 24, 83-100.
Uebersax, J. S. (2006). The tetrachoric and polychoric correlation coefficients. Statistical Methods for Rater Agreement web site. Available at: http://john-uebersax.com/stat/tetra.htm. Accessed November 03, 2008.
Vitacco, M. J., Neumann, C. S., & Jackson, R. L. (2005). Testing a four-factor model of psychopathy and its association with ethnicity, gender, intelligence, and violence. Journal of Consulting and Clinical Psychology, 73, 466-476.
Zumbo, B. D. (2009). Validity as contextualized and pragmatic explanation, and its implications for validation practice. In R. W. Lissitz (Ed.) The concept of validity: revisions, new directions and applications (pp. 65-82). Charlotte: Information Age Publishing, Inc.
Zumbo, B. D. (2007a). Validity: foundational issues and statistical methodology. In C. R. Rao and S. Sinharay (Eds.) Handbook of statistics, Vol. 26: Psychometrics (pp. 45-79). The Netherlands: Elsevier Science B.V.
Zumbo, B. D. (2007b). Three generations of differential item functioning (DIF) analyses: Considering where it has been, where it is now, and where it is going. Language Assessment Quarterly, 4, 223-233.
Zumbo, B. D. (2005). Structural equation modeling and test validation. In B. S. Everitt & D. C. Howell (Eds.), Encyclopedia of Statistics in Behavioral Science: Volume 4 (pp. 1951-1958). Chichester: John Wiley & Sons, Ltd.
Zumbo, B. D. (2003). Does item-level DIF manifest itself in scale-level analyses? Implications for translating language test. Language Testing, 20, 136-147.
Zumbo, B. D. & Hubley, A. M. (2003). Item bias. In R. Fernandez-Ballesteros (Ed.), Encyclopedia of Psychological Assessment: Volume 1 (pp. 505-509). London: Sage Publications Ltd.
Zumbo, B. D., & Rupp, A. A. (2004). Responsible modeling of measurement data for appropriate inferences: Important advances in reliability and validity theory. In David Kaplan (Ed.), The SAGE Handbook of Quantitative Methodology for the Social Sciences (pp. 73-92). Thousand Oaks, CA: Sage Press.
Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa, ON: Directorate of Human Resources Research and Evaluation, Department of National Defense.

Appendix A
Polychoric Correlation

It is common in factor analysis to use Pearson correlations and maximum likelihood estimation techniques. However, Pearson correlations require interval level data, and social science research measures, such as the PCL:SV, often employ ordinal level data (Uebersax, 2006).
Using Pearson correlations with ordinal data mismatches the assumptions underlying the statistical model and the empirical characteristics of the data to be analyzed (Flora & Curran, 2004). An alternative is to use polychoric correlations and minimum residual estimation. The polychoric correlation assumes a continuous, normally distributed latent variable that is observed through ordinal-level data. In terms of the current study, the polychoric correlation respects that psychopathy is a continuous variable and accommodates the ordinal scoring metric of the PCL:SV. Another way to think of polychoric correlations is as a special case of latent trait modeling for ordinal data (Uebersax, 2006).

The use of an incorrect correlation matrix for factor analysis (as would be the case with Pearson correlations and the PCL:SV) may mean that factor loadings are less representative of the latent variable. This is an important point when analyzing the PCL:SV because so much effort has gone into determining how many factors represent the essentially unidimensional construct of psychopathy. The competing factor models, if analyzed using a less than ideal correlation matrix, may result in validity problems in that the conclusions drawn from the empirical data are not representative of the theoretical model (Flora & Curran, 2004).

A limiting feature of polychoric correlations is the requirement of normally distributed data, which may lead some researchers to believe they must use Pearson correlations instead. However, Flora and Curran (2004), replicating the earlier results of Quiroga (1992, as cited in Coenders & Saris, 1995), concluded that the polychoric correlation is robust to moderately non-normal data. Coenders and Saris (1995) further support the use of polychoric correlations, but caution that the measure of association used must be recognized and that factor loadings from Pearson and polychoric correlations must be interpreted differently.
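As a rough illustration of the idea behind the polychoric correlation described above, the following Python sketch uses a two-step approach: it estimates thresholds from the marginal proportions of a contingency table, then searches for the latent correlation that maximizes the likelihood of the cell counts. The function names, the clipping of infinite thresholds to ±8, and the 3 x 3 table of counts are all assumptions made for illustration. None of them come from the thesis or from any particular software package, and the counts are invented rather than PCL:SV data. The sketch also assumes every score category is observed at least once.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
CLIP = 8.0  # stand-in for +/- infinity; an assumption of this sketch


def bvn_cdf(h, k, rho, steps=100):
    """P(X <= h, Y <= k) for standard bivariate normal X, Y with correlation rho.

    Uses Phi2(h, k; rho) = Phi(h) * Phi(k) + integral from 0 to rho of the
    bivariate normal density at (h, k), approximated with Simpson's rule.
    """
    def density(r):
        s = 1.0 - r * r
        return math.exp(-(h * h - 2.0 * r * h * k + k * k) / (2.0 * s)) / (
            2.0 * math.pi * math.sqrt(s))

    width = rho / steps  # steps must be even for Simpson's rule
    total = density(0.0) + density(rho)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * density(i * width)
    return STD_NORMAL.cdf(h) * STD_NORMAL.cdf(k) + total * width / 3.0


def thresholds(marginal_counts):
    """Normal quantiles of the cumulative marginal proportions, with clipped ends."""
    n = sum(marginal_counts)
    cuts, cumulative = [-CLIP], 0
    for count in marginal_counts[:-1]:
        cumulative += count
        cuts.append(STD_NORMAL.inv_cdf(cumulative / n))
    return cuts + [CLIP]


def neg_log_likelihood(rho, table, a, b):
    """Negative multinomial log-likelihood of the table for a given latent rho."""
    total = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            p = (bvn_cdf(a[i + 1], b[j + 1], rho) - bvn_cdf(a[i], b[j + 1], rho)
                 - bvn_cdf(a[i + 1], b[j], rho) + bvn_cdf(a[i], b[j], rho))
            total -= count * math.log(max(p, 1e-12))
    return total


def polychoric(table):
    """Two-step estimate: fix thresholds, then golden-section search over rho."""
    a = thresholds([sum(row) for row in table])
    b = thresholds([sum(col) for col in zip(*table)])
    lo, hi = -0.99, 0.99
    golden = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(40):
        x1 = hi - golden * (hi - lo)
        x2 = lo + golden * (hi - lo)
        if neg_log_likelihood(x1, table, a, b) < neg_log_likelihood(x2, table, a, b):
            hi = x2
        else:
            lo = x1
    return (lo + hi) / 2.0


def pearson(table):
    """Pearson correlation treating the category codes 0, 1, 2, ... as interval scores."""
    n = sum(map(sum, table))
    rows = [sum(row) for row in table]
    cols = [sum(col) for col in zip(*table)]
    mx = sum(i * c for i, c in enumerate(rows)) / n
    my = sum(j * c for j, c in enumerate(cols)) / n
    sxy = sum((i - mx) * (j - my) * c
              for i, row in enumerate(table) for j, c in enumerate(row))
    sxx = sum((i - mx) ** 2 * c for i, c in enumerate(rows))
    syy = sum((j - my) ** 2 * c for j, c in enumerate(cols))
    return sxy / math.sqrt(sxx * syy)


# Invented counts for two items scored 0, 1, 2; NOT data from the thesis.
hypothetical_counts = [
    [40, 15, 5],
    [15, 20, 10],
    [5, 10, 20],
]
print("Pearson:   ", round(pearson(hypothetical_counts), 3))
print("Polychoric:", round(polychoric(hypothetical_counts), 3))
```

With only a few score categories, the Pearson coefficient computed on the raw category codes typically comes out smaller in magnitude than the polychoric estimate. This attenuation is one reason the two sets of factor loadings should not be read on the same footing.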
A lack of software for performing polychoric correlations has undoubtedly allowed Pearson correlations to remain the dominant correlation technique, regardless of how the latent variables of interest are being measured.

Appendix B
Pearson Correlation

Table 1
Female Pearson Eigenvalues

          Initial Eigenvalues
Factor    Total    % of Variance    Cumulative %
1         5.185    43.210           43.210
2         1.237    10.311           53.521
3         1.001     8.338           61.859

Table 2
Female Univariate Marginal Parameters

Variable                            Thresholds
1. Superficial                      .235   .229
2. Grandiose                        .208   .250
3. Deceitful                        .327   .342
4. Lacks Remorse                    .577   .537
5. Lacks Empathy                    .503   .430
6. Doesn’t Accept Responsibility    .549   .626
7. Impulsive                        .429   .442
8. Poor Behavioral Controls         .439   .468
9. Lacks Goals                      .532   .554
10. Irresponsible                   .579   .580
11. Adolescent Antisocial Behavior  .290   .287
12. Adult Antisocial Behavior       .385   .514

Table 3
Female Pearson Correlation Matrix

Item                                    1      2      3      4      5      6      7      8      9     10     11     12
1. Superficial                      1.000
2. Grandiose                         .294  1.000
3. Deceitful                         .375   .248  1.000
4. Lacks Remorse                     .252   .283   .415  1.000
5. Lacks Empathy                     .227   .222   .294   .669  1.000
6. Doesn’t Accept Responsibility     .313   .344   .451   .611   .559  1.000
7. Impulsive                         .296   .293   .417   .434   .373   .491  1.000
8. Poor Behavioral Controls          .133   .265   .315   .463   .419   .496   .423  1.000
9. Lacks Goals                       .239   .337   .388   .498   .456   .571   .517   .531  1.000
10. Irresponsible                    .150   .306   .375   .470   .471   .559   .541   .593   .653  1.000
11. Adolescent Antisocial Behavior   .163   .038   .206   .244   .217   .231   .348   .302   .275   .297  1.000
12. Adult Antisocial Behavior        .105   .132   .259   .442   .367   .368   .333   .411   .378   .438   .478  1.000

Table 4
Male Pearson Eigenvalues

          Initial Eigenvalues
Factor    Total    % of Variance    Cumulative %
1         5.083    42.355           42.355
2         1.249    10.405           52.759
3         1.091     9.091           61.850

Table 5
Male Univariate Marginal Parameters

Variable                            Thresholds
1. Superficial                      .261   .403
2. Grandiose                        .285   .299
3. Deceitful                        .391   .497
4. Lacks Remorse                    .588   .554
5. Lacks Empathy                    .542   .507
6. Doesn’t Accept Responsibility    .526   .543
7. Impulsive                        .350   .355
8. Poor Behavioral Controls         .459   .502
9. Lacks Goals                      .456   .430
10. Irresponsible                   .497   .513
11. Adolescent Antisocial Behavior  .301   .157
12. Adult Antisocial Behavior       .445   .448

Table 6
Male Pearson Correlation Matrix

Item                                    1      2      3      4      5      6      7      8      9     10     11     12
1. Superficial                      1.000
2. Grandiose                         .307  1.000
3. Deceitful                         .441   .369  1.000
4. Lacks Remorse                     .246   .287   .419  1.000
5. Lacks Empathy                     .194   .298   .322   .692  1.000
6. Doesn’t Accept Responsibility     .315   .380   .424   .608   .536  1.000
7. Impulsive                         .253   .300   .386   .367   .328   .428  1.000
8. Poor Behavioral Controls          .119   .361   .336   .501   .525   .509   .351  1.000
9. Lacks Goals                       .119   .380   .337   .423   .385   .456   .455   .506  1.000
10. Irresponsible                    .196   .304   .373   .467   .440   .527   .479   .514   .583  1.000
11. Adolescent Antisocial Behavior   .159   .072   .298   .284   .267   .151   .249   .237   .206   .240  1.000
12. Adult Antisocial Behavior        .124   .156   .253   .450   .452   .389   .382   .433   .373   .460   .492  1.000

Table 7
Two Factor Solution

        Males                         Females
        Factor 1       Factor 2       Factor 1       Factor 2
Item    Interpersonal  Behavioral     Interpersonal  Behavioral
1       -.219           .752           .656          -.278
2        .127           .457           .645          -.217
3        .094           .642           .601          -.021
4        .692           .079           .476           .301
5        .728          -.025           .399           .298
6        .531           .274           .680           .139
7        .417           .236           .428           .277
8        .736          -.044           .168           .547
9        .618           .057           .446           .346
10       .680           .055           .269           .536
11       .413          -.026          -.256           .705
12       .783          -.207          -.298           .918

Table 8
Three Factor Solution

        Males                                    Females
        Factor 1       Factor 2                  Factor 1       Factor 2
Item    Interpersonal  Behavioral   Factor 3     Interpersonal  Behavioral   Factor 3
1       -.133           .726         .056        -.198           .678        -.072
2        .399           .325        -.200        -.170           .594         .031
3        .139           .589         .122         .026           .594        -.005
4        .660           .051         .093        -.032           .011         .842
5        .679          -.043         .087        -.061          -.108         .910
6        .725           .160        -.126         .001           .389         .460
7        .419           .187         .095         .360           .499        -.110
8        .784          -.112        -.012         .465           .079         .192
9        .727          -.037        -.060         .321           .364         .137
10       .714          -.010         .025         .495           .211         .127
11      -.134           .124         .774         .779          -.121        -.190
12       .396          -.138         .537         .783          -.300         .147

Table 9
Male Two Factor Correlations

            Factor 1    Factor 2
Factor 1    1.000
Factor 2    0.572       1.000

Table 10
Female Two Factor Correlations

            Factor 1    Factor 2
Factor 1    1.000
Factor 2    0.764       1.000

Table 11
Male Three Factor Correlations

            Factor 1    Factor 2    Factor 3
Factor 1    1.000
Factor 2    0.527       1.000
Factor 3    0.508       0.238       1.000

Table 12
Female Three Factor Correlations

            Factor 1    Factor 2    Factor 3
Factor 1    1.000
Factor 2    0.703       1.000
Factor 3    0.729       0.734       1.000
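
The "% of Variance" columns in Tables 1 and 4 can be checked by hand. A correlation matrix of 12 items has a trace of 12, so each eigenvalue divided by 12 gives the proportion of total variance attributed to that factor. The short Python sketch below redoes that arithmetic with the eigenvalues reported in the two tables. The function name is hypothetical, and small differences in the third decimal place are to be expected because the tabled eigenvalues are already rounded.

```python
N_ITEMS = 12  # the PCL:SV has 12 items, so the correlation matrix has trace 12

# Eigenvalues as reported in Table 1 (females) and Table 4 (males).
FEMALE_EIGENVALUES = [5.185, 1.237, 1.001]
MALE_EIGENVALUES = [5.083, 1.249, 1.091]


def variance_explained(eigenvalues, n_items=N_ITEMS):
    """Return (eigenvalue, % of variance, cumulative %) for each factor."""
    rows, cumulative = [], 0.0
    for eigenvalue in eigenvalues:
        percent = 100.0 * eigenvalue / n_items
        cumulative += percent
        rows.append((eigenvalue, percent, cumulative))
    return rows


for label, values in (("Female", FEMALE_EIGENVALUES), ("Male", MALE_EIGENVALUES)):
    print(label)
    for eigenvalue, percent, cumulative in variance_explained(values):
        print(f"  {eigenvalue:.3f}  {percent:7.3f}  {cumulative:7.3f}")
```

In both groups the first three eigenvalues exceed 1 and together account for roughly 62% of the total variance, which matches the cumulative figures in the tables.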
Item Metadata
Title | Does the screening version of the Psychopathy Checklist measure the same disorder in males and females? |
Creator | Balmer-Labrecque, Jenessa Erin |
Publisher | University of British Columbia |
Date Issued | 2011 |
Genre | Thesis/Dissertation |
Type | Text |
Language | eng |
Date Available | 2011-07-06 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0105094 |
URI | http://hdl.handle.net/2429/35907 |
Degree | Master of Arts - MA |
Program | Measurement, Evaluation and Research Methodology |
Affiliation | Education, Faculty of Educational and Counselling Psychology, and Special Education (ECPS), Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2011-11 |
Campus | UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |