UBC Theses and Dissertations
Content validation of the Quality of Life for Homeless and Hard-to-House Individuals (QoLHHI) Health… Russell, Lara Beate 2013

Full Text

CONTENT VALIDATION OF THE QUALITY OF LIFE FOR HOMELESS AND HARD-TO-HOUSE INDIVIDUALS (QOLHHI) HEALTH AND LIVING CONDITIONS IMPACT SECTIONS: IMPLICATIONS FOR INSTRUMENT USE AND CONTENT VALIDATION METHODOLOGY

by

Lara Beate Russell

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Measurement, Evaluation, and Research Methodology)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

October 2013

© Lara Beate Russell, 2013

Abstract

Evidence based on content is one of the key sources of validity evidence described in the Standards for Educational and Psychological Testing. This dissertation used three groups of content experts to gather validity evidence based on content for two sections of the Quality of Life for Homeless and Hard-to-House Individuals (QoLHHI) Survey. One goal of this dissertation was to examine the implications of this evidence for the validity of inferences made from the QoLHHI; a second goal was to determine whether judgmental studies using content experts, a popular method for content validation, can be improved.

Quantitative and descriptive validity evidence was collected from 11 subject matter experts (researchers working with individuals who are homeless and vulnerably housed (HVH)), 16 experiential experts (individuals who were HVH), and 8 practical experts (individuals who had administered the QoLHHI). These experts independently rated the relevance and technical quality of the items, response scales, administration instructions, and other content elements of the QoLHHI. Content Validity Indices were computed and used to identify elements that were endorsed by the experts. The experts also provided descriptive feedback in the form of comments and suggestions for improving the QoLHHI content. This feedback was used to develop recommendations for revising the content.
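The Content Validity Index referred to above is, in its most widely used form (Lynn, 1986; Polit & Beck, 2006), a simple proportion-of-agreement statistic. The sketch below illustrates that standard computation in Python; the 4-point scale, the endorsement cutoff of 3, and the sample ratings are illustrative assumptions for exposition, not the exact procedure or data of this dissertation.

```python
# Item-level Content Validity Index (I-CVI), in its common form:
# the proportion of experts who rate a content element as relevant,
# i.e., give it a 3 or 4 on a 4-point relevance scale.
# All ratings below are invented for illustration.

def i_cvi(ratings, endorse_threshold=3):
    """Proportion of experts whose rating meets the endorsement threshold."""
    endorsed = sum(1 for r in ratings if r >= endorse_threshold)
    return endorsed / len(ratings)

def s_cvi_ave(all_ratings):
    """Scale-level CVI (averaging approach): mean of the I-CVIs."""
    return sum(i_cvi(r) for r in all_ratings) / len(all_ratings)

# Example: 11 experts rate three content elements on a 1-4 scale.
element_ratings = [
    [4, 4, 3, 4, 3, 4, 4, 3, 4, 4, 2],  # 10 of 11 endorse
    [3, 4, 4, 4, 4, 3, 3, 4, 4, 4, 4],  # 11 of 11 endorse
    [2, 3, 1, 4, 2, 3, 2, 4, 3, 2, 2],  # 5 of 11 endorse
]
for i, ratings in enumerate(element_ratings, start=1):
    print(f"Element {i}: I-CVI = {i_cvi(ratings):.2f}")
print(f"S-CVI/Ave = {s_cvi_ave(element_ratings):.2f}")
```

In practice, an I-CVI at or above some cutoff (often 0.78 for larger panels, per Lynn's tables) is taken as endorsement; the cutoffs actually applied in this dissertation are reported in Chapters 3 and 4.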
In terms of evidence based on content for the QoLHHI, over 85% of the content elements were endorsed, indicating good evidence for validity. The recommended revisions are relatively straightforward to implement. Overall, the content of these two sections of the QoLHHI appears likely to produce scores that lead to supportable and appropriate inferences.

With regard to methodology, a new approach to content validation is proposed. Traditionally, content validation studies have focused on quantitative assessments, with descriptive data given a secondary role. It is proposed here that content experts should be viewed as an advisory board rather than a representative sample of a larger population, that members' feedback should be given individual attention, and that judgmental studies should focus primarily on the experts' descriptive feedback, which is where their knowledge, experience, and insight are best expressed.

Preface

This dissertation is original, unpublished, independent research by the author, Lara Beate Russell. The research studies reported in Chapters 3-5 were covered by UBC Ethics Certificate numbers H11-03080 and H11-02904.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgments
Chapter 1: Introduction
Chapter 2: Background
    A Brief History of Validity
    The Role of Instrument Content within the Unified View of Validity
    Aspects of Instrument Content that Affect Validity
    Additional Considerations in Content Validation: The Context of Measurement
    Aspects of Content and the QoLHHI
        Description of the QoLHHI
        Content domain and content domain definition
        Content coverage
        Content relevance
        Technical quality
        The larger context: The purpose of measurement and target population
    Collecting Evidence Based on Instrument Content: The Content Validation Study
        Judgmental studies using content experts
        Data for a content validation study
        Collecting and analyzing the quantitative data
        Collecting and analyzing descriptive data
Chapter 3: Going Beyond the Numbers: Content Validation of a Quality of Life Measure Using Subject Matter Experts
    Background
    Methods
        Ethical approval
        Recruitment
        Measures and materials
        Procedures
        Analyses
    Results
        Participants
        Quantitative results
        Descriptive feedback
    Discussion
        Implications for the QoLHHI
        Implications for content validation
    Conclusion, Study Limitations, and Future Directions
Chapter 4: Going Beyond Subject Matter Experts: Content Validation of a Quality of Life Measure with a Sample of Experiential Experts and Practical Experts
    Introduction
    Methods
        Ethical approval
        Participants and recruitment
        Measures and materials
        Procedures
        Analyses
    Results
        Sample
        Quantitative data
        Descriptive data
        "Impact" versus "Effect"
        Suggestions for topics to add to the QoLHHI
    Discussion
        Validity implications for the QoLHHI
        Implications for content validation
    Study Limitations and Directions for Future Research
    Conclusion
Chapter 5: SME, PE and EE Feedback: Why Compare Data from Different Groups of Experts?
    The Quantitative Data: Many Similarities, Some Important Differences
    Descriptive Feedback: A Rich Source of Information on Content Quality
Chapter 6: Conclusion
    Summary of the Research Findings
        Validity evidence based on the content of the QoLHHI
        Implications for content validation using judgmental methods
    Novel Contributions and a New Perspective on Content Validation Methodology
        How we think about content experts
        Types of data
        Analyzing the data
    Limitations and Future Directions
References
Appendix A: Quality of Life for Homeless and Hard-to-House Individuals (QoLHHI) Survey: Health Impact Section and Living Conditions Impact Section
Appendix B: QoLHHI Impact Response Card
Appendix C: Descriptive Feedback From Content Experts and Author Comments
    Impact Elements
    Health Impact Section Elements
    Living Conditions Impact Section Elements
    Elements from the QoLHHI Administration and Scoring Manual
    Elements from the QoLHHI Forms

List of Tables

Table 2.1: Important Aspects of Content Identified in Selected Publications on Content Validity and Content Validation
Table 3.1: Content Validity Indices (CVI) for Impact Elements
Table 3.2: Content Validity Indices (CVI) for the Relevance of Aspects of Health
Table 3.3: Content Validity Indices (CVI) for the Clarity of the Health Impact Section Items
Table 3.4: Content Validity Indices (CVI) for Skip Patterns in the Health Impact Section
Table 3.5: Content Validity Indices (CVI) for the Relevance of Aspects of Living Conditions
Table 3.6: Content Validity Indices (CVI) for the Clarity of the Living Conditions Impact Section Items
Table 3.7: Content Validity Indices (CVI) for the Administration Elements of the QoLHHI
Table 4.1: Content Elements Rated by Practical Experts and Experiential Experts
Table 4.2: Content Validity Indices (CVI) for Impact Elements
Table 4.3: Content Validity Indices (CVI) for the Relevance of Aspects of Health
Table 4.4: Content Validity Indices (CVI) for the Relevance of Aspects of Living Conditions
Table 4.5: Content Validity Indices (CVI) for the Clarity of the Health Impact Section Items
Table 4.6: Content Validity Indices (CVI) for the Clarity of the Living Conditions Impact Section Items
Table 4.7: Content Validity Indices (CVI) for Skip Patterns in the Health Impact Section
Table 4.8: Content Validity Indices (CVI) for the Administration Elements of the QoLHHI (Practical Experts only)
Table 4.9: Suggestions for Topics to Add to the Health Impact and Living Conditions Impact Sections of the QoLHHI
Table 5.1: Agreement on Endorsement between Groups of Content Experts: Relevance Ratings
Table 5.2: Agreement on Endorsement between Groups of Content Experts: Clarity Ratings
Table 5.3: Agreement on Non-Endorsement between Groups of Content Experts
Table C.1: Content Experts' Suggestions for Revisions to the QoLHHI Items

List of Figures

Figure B.1: QoLHHI Impact Response Card

List of Abbreviations

EE: Experiential Expert
HVH: Homeless or vulnerably housed
PE: Practical Expert
QoL: Quality of Life
QoLHHI: Quality of Life for Homeless and Hard-to-House Individuals Inventory
SME: Subject Matter Expert
SQoL: Subjective Quality of Life

Acknowledgments

I wish to express my heartfelt gratitude to my supervisor, Dr. Anita Hubley, for her support, encouragement, guidance, humour, and seemingly endless patience. Dr. Hubley embodies the concept of "mentor" and exemplifies the kind of scholar and teacher I hope one day to become. Many thanks are also due to my other committee members, Dr. Bruno Zumbo and Dr. Anita Palepu, for their support and encouragement, and also their willingness to be tough when it was warranted. I would like to recognize the financial support provided to me by the Social Sciences and Humanities Research Council of Canada, the University of British Columbia, and the Faculty of Education at the University of British Columbia. Though the contents of this doctoral dissertation represent my own work, I cannot claim to have done this alone. It takes a village to support a PhD student, it seems. To everyone who encouraged, cajoled, badgered, commiserated, cheered, nagged, let me cry, let me rant, pretended to care about validation while I ranted, provided food, provided wine, sent money, waited patiently when I dropped out of sight for weeks at a time, and welcomed me back when I re-appeared: Thank you, thank you, thank you!

Chapter 1: Introduction

Although housing is enshrined as a component of basic human rights in the United Nations Universal Declaration of Human Rights ("The Universal Declaration of Human Rights," n.d.), at least 100 million individuals world-wide are homeless, and 1 billion have inadequate housing (UN Commission on Human Rights, 2005).
In Europe, the number of individuals who are homeless is estimated to be between 600,000 and 3 million, depending on the definition of homelessness used (UN-HABITAT, 2011), while in the United States, between 2 and 3 million individuals make use of services for homeless persons each year (Caton et al., 2005). In Canada, an estimated 30,000 individuals are homeless on any given night, and 200,000 are homeless in any given year (Gaetz, Donaldson, Richter, & Gulliver, 2013). These figures paint a grim picture; yet even so, it is one that likely underestimates the extent of homelessness and housing vulnerability. Many individuals who are homeless are never included in enumeration attempts. This may be due to variations in the methodologies used in these attempts, or to the fact that many individuals who are homeless find temporary shelter with friends and family or stay in less visible locations, such as campgrounds or cars, where they are easily missed (National Coalition for the Homeless, 2007). In Canada, it has been estimated that as many as 80% of those who are homeless fall into the category of "hidden homeless" ("Population: Hidden homeless," n.d.). Because homelessness is so complex and can take many different forms, its "very nature" makes an accurate count difficult (Human Resources and Skills Development Canada, 2013).

The association between homelessness and a number of physical, psychological, and social difficulties has been well-documented.
For example, individuals who are homeless experience high mortality rates, poor overall physical and mental health, and a wide range of health conditions including seizures, arthritis, hypertension, respiratory tract infections, tuberculosis, diabetes, HIV and AIDS, hepatitis, nutritional deficits, mental health problems such as schizophrenia and depression, drug and alcohol addiction, and neurological and cognitive impairments (Boivin, Roy, Haley, & Galbaud du Fort, 2005; Fischer & Breakey, 1991; Hwang, 2001; Hwang et al., 2011; Solliday-McRoy, Campbell, Melchert, Young, & Cisler, 2004; Spence, Stevens, & Parks, 2004). Many individuals who are homeless are unemployed or underemployed and face numerous barriers to finding and keeping employment (Long, Rio, & Rosen, 2007). They may have little to no income or rely on social assistance as their sole source of income (Aubry, Klodawski, Hay, & Birnie, 2003; Halifax Regional Municipality, 2005; SPARC BC et al., 2009), have been incarcerated or otherwise involved with the criminal justice system (Lee & Greif, 2008; Metraux, Caterina, & Cho, 2008), experience food insecurity (Greater Vancouver Regional Steering Committee on Homelessness, 2012; Lee & Greif, 2008), be frequent victims of crime and assault (Boivin et al., 2005; Lee & Schreck, 2005), and experience high levels of loneliness and isolation (Rokach, 2005a, 2005b) and suicidal thoughts (Yoder, Whitbeck, & Hoyt, 2008).

Less is known about the subjective quality of life (SQoL) of individuals who are homeless.
In a recent review of the literature, Hubley, Russell, Palepu and Hwang (2012) found that individuals who are homeless report both low levels of satisfaction in many life areas such as safety, finances, and living situation, and lower SQoL compared to the general population, but they noted that these findings were based on a relatively small number of studies.[1] The findings from this review might seem self-evident given the physical, mental, and social challenges faced by many individuals who are homeless, as described above. Yet the relationship between adverse life circumstances and SQoL is not necessarily so straightforward. Evidence from the health field indicates that disability or severe illness does not necessarily lead to lower SQoL, something that has been referred to as the "disability paradox" (Albrecht & Devlieger, 1999) and may be explained by a phenomenon called "response shift" (Sprangers & Schwartz, 1999).

[1] Slightly more research has considered homelessness and what is called health-related quality of life (HRQoL). Studies of HRQoL generally employ measures such as the SF-36 (e.g., Kertesz et al., 2005; Tsui, Bangsberg, Ragland, Hall, & Riley, 2007), SF-12 (e.g., Savage, Lindsell, Gillespie, Lee, & Corbin, 2008), or EQ-5D (Sun, Irestig, Burström, Beijer, & Burström, 2012). These measures are more correctly classified as measures of health status rather than measures of QoL. Though health status and QoL are often treated as interchangeable (Anderson & Burckhardt, 1999), they are distinct concepts (Anderson & Burckhardt, 1999; Moons, 2004; Smith, Avis, & Assmann, 1999). HRQoL focuses specifically on those aspects of QoL that are related to health ("Health-related quality of life (HRQOL)," n.d.), and therefore even the equation of HRQoL with the much broader construct of general QoL (including SQoL) is problematic.
Discrepancies between objective life circumstances and subjective assessments of QoL, as evidenced by the phenomena of the "happy poor" and the "unhappy rich," have been the subject of discussion in QoL research beyond the health field as well (Phillips, 2006; Sirgy, 2012). The potential for gaps between objective circumstances and subjective assessments of those same circumstances has implications for research, service provision, and policy aimed at individuals who are homeless or vulnerably housed (HVH), for it suggests that attempts to improve negative conditions and circumstances may not necessarily translate neatly into improved SQoL. A better understanding of the SQoL of individuals who are HVH would, therefore, be helpful.

In order to properly measure the SQoL of individuals who are HVH, it is essential to have a measurement tool that is appropriate for this target population. A SQoL measure developed for the general population may not meet this need. Individuals who are HVH experience many difficulties and circumstances that are not experienced by the general population, and the causes and outcomes of homelessness and housing vulnerability are complex and multi-faceted. A population-specific measure developed for a different vulnerable population may also not be appropriate. For example, one QoL measure that has been used in research with individuals who are HVH is the Quality of Life Interview (QOLI; Lehman, 1988). This instrument was developed for use with individuals with mental illness, based on reviews of pre-existing measures of QoL as well as research in the area of mental illness. Although some individuals who are HVH do experience mental health issues, many do not, and so it is not clear whether the QOLI is appropriate for use with all individuals who are HVH. Key facets of the experiences of individuals who are HVH may be missing from the QOLI, while others may be included unnecessarily or overrepresented.
In addition, the homeless population is very diverse. In order to be effective, a measure of SQoL for individuals who are HVH must both reflect the unique circumstances entailed by homelessness and housing vulnerability and be flexible enough to be responsive to variations within this population (for example, by allowing components to be easily dropped or adapted). The Quality of Life for Homeless and Hard-to-House Individuals (QoLHHI) Inventory (Hubley, Russell, Gadermann, & Palepu, 2009) was recently developed to meet the need for such a population-specific measure of SQoL.

While the development of a measure specific to this population is significant, it is also of fundamental importance that the validity of inferences from this measure be examined so that it can be used with confidence for research or other purposes. Validity is defined as "the degree to which evidence and theory support the interpretation of test scores entailed by proposed uses of tests" (American Educational Research Association, American Psychological Association, National Council on Measurement in Education, & Joint Committee on Standards for Educational and Psychological Testing (U.S.) [AERA, APA & NCME], 1999, p. 9), and it is what gives meaning to measurement (Hubley & Zumbo, 1996). Evidence for validity may come from a number of sources, including the content of an instrument. Content validation is the process of gathering such content-based evidence. Although content validation may be carried out at any point throughout an instrument's lifetime, it is often of greatest use during the development stage or soon after the instrument's release, when the findings can be used to make changes to the content in order to increase the likelihood that the instrument will produce scores, and thus inferences, that are meaningful and appropriate.

This dissertation has two main goals.
The first goal is to present the findings from two studies that assessed evidence based on content for two key sections of the QoLHHI. The implications for validity are discussed, and recommendations are made for revising the QoLHHI content in light of the study findings. The second goal of the dissertation focuses on methodology. The two studies reported in this dissertation were conducted using a popular approach to content validation, namely, judgmental studies using content experts. The study procedures and types of data collected are assessed in order to determine whether changes to the typical methodology of such studies might serve to enhance content validation.

The structure of the dissertation is as follows. Chapter 2 provides some important background information on validity, the role of content in validity, and content validation studies. Throughout the chapter, the ways in which these apply to the content validation of the QoLHHI specifically will be discussed. Chapter 3 presents the first of two content validation studies. For this study, content-based validity evidence was collected from a sample of subject matter experts (i.e., individuals with experience in conducting research with individuals who are HVH). These subject matter experts provided both quantitative and descriptive feedback on the QoLHHI content. The implications of the findings for both the validity of inferences made from the QoLHHI and for the practice of content validation are discussed. Chapter 4 presents the findings from a second content validation study, this one conducted with a sample composed of experiential experts (i.e., individuals who were HVH) and practical experts (i.e., individuals who had experience in administering the QoLHHI in a research setting). The implications of the findings for the validity of the inferences from the QoLHHI are discussed, and the implications for the practice of content validation are explored further.
Chapter 5 presents a brief comparison of the findings from the three different groups of content experts. This type of comparison is rare in the content validation literature. Once again, the implications both for the use of the QoLHHI as a measurement tool and for the practice of content validation are discussed. Finally, Chapter 6 summarizes the findings from the two studies, concludes with a discussion of some of the contributions and limitations of the studies presented, and provides recommendations for future research.

Note: Chapters 3 and 4 are written in manuscript format. In order to meet the requirements of this format, there is some overlap in content between these two chapters and Chapter 2, particularly in the introduction and methods sections.

Chapter 2: Background

A Brief History of Validity

Validity is "the degree to which evidence and theory support the interpretation of test scores entailed by proposed uses of tests" (AERA, APA, & NCME, 1999, p. 9). This definition, now widely adopted in many fields including psychology and education, is the result of many shifts in thinking over the past three-quarters of a century.

Prior to the 1950s, validity was generally treated as a very straightforward concept, as exemplified by Kelley's (1927) statement that "the problem of validity is that of whether a test really measures what it purports to measure" (p. 14). Evidence for validity was typically empirical in nature and primarily took the form of correlations between the scores obtained from an instrument and some external variable deemed to be relevant to what was being measured. Examples of such variables might be age (for assessing the validity of intelligence tests) and psychiatric history (for measures of personality; Anastasi, 1986).
However, concerns eventually arose over how to establish the validity of these variables in turn, leading to a push away from purely statistical approaches to validity and to the increasing application of theory to the process of validation (Sireci, 1998b). There was a proliferation of "types" of validity during this time, leading to a "confusing array of names ... ranging from face validity, validity by definition, intrinsic validity, and logical validity to empirical validity and factorial validity" (Anastasi, 1986, p. 2).

In 1954, the Technical Recommendations for Psychological Tests and Diagnostic Techniques (American Psychological Association, American Educational Research Association, & National Council on Measurement in Education, 1954) narrowed the list of validity "types" by proposing four categories of validity: predictive validity, concurrent validity, content validity, and construct validity. Predictive and concurrent validity were later combined into a single category of criterion validity, resulting in what has been called a "tripartite" conceptualization of validity (Anastasi, 1986). Under the tripartite view, in addition to being of various "types" (i.e., construct, content, and criterion-related validity), validity was seen as a property of the measurement instrument itself, as having a dichotomous designation (i.e., an instrument was either "valid" or "not valid"), and as tied to the procedures used in validation (Hubley & Zumbo, 1996; Zumbo, Gelin, & Hubley, 2002). Anastasi (1986) has suggested that the tripartite view had an unfortunate consequence in that it created the impression that validity could be established by instrument developers as long as they could address the (usually three) types of validity and "tick them off in checklist fashion" (p. 2), after which no further validation was needed.

Even as the tripartite conceptualization of validity took hold, however, other ideas were beginning to emerge.
In 1955, Cronbach and Meehl published their seminal paper on construct validity, in which they emphasized the interplay between the theory around a construct and the evidence gathered to support that theory. Though Cronbach and Meehl did not, at that time, argue that construct validity should necessarily supersede other types of validity, their paper highlighted the importance of construct validation, and introduced the idea that other types of validity could provide evidence of construct validity. Soon after, Loevinger (1957) argued that "construct validity is the whole of validity from a scientific point of view" (p. 636). The idea of construct validity as being of primary importance was increasingly championed over the next three decades. Today, many (though by no means all) theorists in the field of validity support a "unified" view of validity that treats construct validity as the main type (or even the only true type) of validity (Anastasi, 1986; Angoff, 1988; Messick, 1995). It is no longer instruments themselves that are valid; instead, validity is tied to the inferences made from the scores obtained when the instrument is used. Validity is not a dichotomous "valid/not valid" designation, but instead exists along a continuum, and is established through multiple sources of evidence and within a theoretical framework (Hubley & Zumbo, 1996; Zumbo et al., 2002). The purpose and context of measurement are also important, and validity in one context will not necessarily carry over to another context or if the instrument is used for a different purpose (AERA et al., 1999).

Under the unified view of validity, validation is viewed as an ongoing process of building a "scientifically sound validity argument to support the intended interpretation of test scores and their relevance to the proposed use" (AERA et al., 1999, p. 9). The unified view of validity thus represents a fundamental shift away from the checklist approach that had so worried Anastasi (1986).
Rather than focus on certain methods (e.g., correlations of scores with external variables), validation now takes the form of an ongoing interplay between theory and evidence, with a constant eye towards the implications of changing contexts. This process continues throughout the lifetime of an instrument, and any one study will serve only to provide one piece, or at most a few pieces, of evidence in support of the use of an instrument for a particular purpose in a particular context.

Evidence for validity may come from various sources, including instrument content, response processes, internal structure, relations to other variables, and the consequences of instrument use (AERA et al., 1999). Some of these sources, such as instrument content or the relationship between the scores obtained from an instrument and other variables, were previously considered types of validity but have now been recast as types of evidence. Other sources of evidence, in particular the consequences of instrument use, extend far beyond anything proposed under the tripartite view.

There are two main threats to validity under the unified view: construct underrepresentation and construct-irrelevant variance. Construct underrepresentation occurs when a measurement instrument fails to assess certain important aspects of the construct of interest. Construct-irrelevant variance results from the influence of unrelated constructs or of variables such as response sets, guessing, or task difficulty, if these are unintended and not relevant to the measurement of the construct of interest (Messick, 1995).

The history of validity does not end with the development of the unified approach. Debates about validity theory and practice continue into the present day.
Many (though not all) are based on the unified view of validity but take these debates in new directions; one example is Kane's (2006, 2013) argument-based or interpretation/use argument (IUA) approach to validity, which aims to bridge the gap between theory and the practice of validation. However, much of this current debate does not address in detail the part that instrument content may play in validity. Thus, it will not be reviewed here. Instead, the next section of this dissertation will focus more closely on the role of content in validation.

The Role of Instrument Content within the Unified View of Validity

Content validity was listed as one of four types of validity in the 1954 Technical Recommendations for Psychological Tests and Diagnostic Techniques. But later, as the focus shifted more and more towards construct validity and the inferences made from scores, the idea of content validity came under particular fire. Of the "non-construct" types of validity, content validity was seen as perhaps the most problematic because it focused on "test forms rather than test scores, upon instruments rather than measurements" (Messick, 1975, p. 960, emphasis in original). Nevertheless, even among those who believed that there is no such thing as "content validity", there was still a recognition that instrument content is important (Yalow & Popham, 1983). Content was recast from a type of validity to a form of evidence for construct validity. Still, some have suggested that it is a weak form of evidence precisely because it is, in essence, focused on instruments rather than scores. Without scores, there can be no inferences, and without inferences, there is no basis for validity under the unified view. Messick (1989), for example, argued that evidence based on content cannot, by itself, provide evidence for the validity of inferences made from instrument scores (though he did consider instrument content to have a role to play in validity).
However, while it is certainly true that content cannot be the sole source of validity evidence for an instrument, this is really no different from the argument that should be made about any type of evidence under the unified view of validity: no one type or single source of evidence is sufficient for the purposes of validation. It can only serve as one piece of evidence in building a comprehensive case for validity. Therefore, content should still be considered an important target for validation. As Sireci (1998b) said, "how can we evaluate score-based inferences without first evaluating the assessment instrument itself? Obviously we cannot, and should not, evaluate test scores without first verifying the quality and appropriateness of the tasks and stimuli from which the scores are derived" (p. 103).

Some researchers have expressed concern that the unified view of validity will result in a lack of attention to the importance of instrument content (e.g., Sireci, 1998b; Yalow & Popham, 1983), but this does not necessarily appear to be the case. For example, Cizek, Rosenberg and Koons (2010) found that evidence based on content was provided for 48% of the measures reviewed in the 2005 edition of the Mental Measurements Yearbook. The only sources of validity evidence that appeared more often were construct-related validity evidence (58%) and concurrent criterion-related validity evidence (51%);2 no other sources of evidence came close to being reported as often as content. Thus, it appears that instrument content is still considered important to validity in practice as well as in theory.

2 This last category does not correspond exactly to any of the sources of evidence listed in the 1999 AERA, APA and NCME Standards, but appears to fall under "relationships to other variables".
Having made a case for the importance of content-based evidence to validity, the remainder of this section will focus on the aspects of instrument content that can affect validity, and on the process of content validation. The QoLHHI will be used as an example throughout this discussion and, at the same time, the discussion of instrument content and content validation will serve to set up the subsequent chapters that report on the validation studies of the QoLHHI. First, however, it is necessary to make a few remarks regarding the use of language and terminology in this dissertation.

Much of the literature on content and validity uses the term "content validity". In some cases, this is because the work predates the unified view of validity and the reframing of content as a type of evidence for validity. However, the term "content validity" continues to be used even today. For example, a search in the PsycInfo database using the search term "content validity" and limiting the results to articles published in peer-reviewed journals in the year 2012 returned 147 results. A similar search in the PubMed database for the same time period returned 280 results. Even while acknowledging that some of these results will be duplicates, these findings suggest that the term "content validity" is still widely used. In fact, it would seem that language reflecting the tripartite conceptualization of validity more generally is still common. Cizek et al. (2010) noted that fewer than 10% of the sources in their review of validity information provided in the 2005 Mental Measurements Yearbook explicitly referred to a unified conceptualization of validity. Nevertheless, they were doubtful that this reflected a lack of understanding of the unified view of validity. Rather, they felt that it was more likely that researchers have retained the language of the tripartite approach while adhering to the theory of the unified approach. Perhaps this is due to the convenience of this language.
For example, Sireci (1998b), who does subscribe to the unified view of validity, has argued that the term "content validity" should be retained for the simple reason that it is familiar to many researchers and can be used to summarize a set of ideas and procedures that are specific to content validation.

There is perhaps a larger debate to be had about the language used in discussions of validity. There is no doubt that the language of the tripartite view of validity is still in circulation, and Sireci's (1998b) point that the term "content validity" provides a tidy way to summarize one aspect of validity and a set of procedures is also quite compelling. Nevertheless, the term "content validity" will be avoided in this dissertation, except when summarizing or citing other works where it has been used. Instead, the term "evidence based on content", which is adapted from the AERA, APA and NCME Standards, will be used.3 The term "content validation" will be used to refer to the process of collecting and analyzing evidence based on content.

3 The term used in the Standards is "evidence based on test content". The authors of the Standards acknowledge that the word "test" connotes an instrument for which responses are judged for correctness or quality, but go on to say that the word "test" is used more broadly within the Standards to refer to any "evaluative device or procedure in which a sample of an examinee's behavior in a specified domain is obtained and subsequently evaluated and scored using a standardized procedure" (p. 3). Despite this disclaimer, the popular understanding of the word "test" carries a strong suggestion of educational or achievement measurement, and its use risks creating the impression that the discussion of validity is not applicable outside of these fields. In this dissertation, therefore, the more general term "evidence based on content" will be used; the word "instrument" will also be used in favour of "test", except when directly quoting sources that use the latter word.

Aspects of Instrument Content that Affect Validity

The next section will consider some of the aspects of content that have been discussed in the literature on validity evidence based on content and content validation. Although the use of language and terminology is not entirely consistent across authors, a closer look reveals that there are four aspects of instrument content that are mentioned most frequently.

Sireci (1998b), in his overview of the literature on content validity, noted four aspects of instrument content that have historically been identified as important to validity: domain definition, domain or content relevance, domain or content representation, and proper instrument development procedures.4

The domain definition is an "operational definition of the content domain" (Sireci, 1998a, p. 300), content domains being "important and testable aspects of the construct of interest" (Sireci, 1998b, p. 105). Content relevance is the extent to which the instrument's content is relevant to the content domain, and content representation is the extent to which the instrument content reflects the domain definition (Sireci, 1998a). Though somewhat heavy on terminology, the basic idea behind these terms is fairly straightforward: constructs are by their nature intangible, and also often quite broad. In order to measure them, it is necessary to decide what aspects of the construct to measure (i.e., delineate the content domain) and translate these into operational terms (i.e., domain definition). Content validation is then concerned with how well an instrument's content fits with this definition (i.e., its relevance and representativeness).

At first glance, the list of aspects of content identified by Fitzpatrick (1983) in her discussion of the literature on content validity appears somewhat different from Sireci's (1998b) list.
Fitzpatrick's list includes domain sampling, domain relevance, domain clarity, and the technical quality of items. However, on closer inspection, it is clear that the differences are primarily in labeling. Domain sampling is essentially domain representation. Domain clarity is about how well the content domains are defined, and is thus related to domain definition. This leaves item quality as the main difference between the aspects of content identified by Sireci and Fitzpatrick. Fitzpatrick did not define or describe what exactly is meant by item "quality", other than to provide the example of an ambiguous item that may tap abilities other than the construct of interest.

4 Sireci (1998a) did not discuss instrument development in detail in his article, other than to mention that it has been identified as one aspect of content validity. Presumably, proper instrument development procedures will help ensure that content is relevant, representative, and reflects an appropriate domain definition.

Messick (1989) identified the same aspects of content as Fitzpatrick (1983), though again with some difference in terminology. His list includes domain specification, content relevance, and content representativeness. Messick also addressed the technical quality of items or tasks, but as a sub-topic under content relevance, arguing that technically flawed items may introduce difficulty that is not part of the content domain (i.e., not relevant) into the response task. Technical quality might include "readability level, freedom from ambiguity and irrelevancy, appropriateness of keyed answers and distractors, relevance and demand characteristics of the task format, and clarity of instructions" (p. 39).

Table 2.1 (p. 44) lists the aspects of content described in some additional sources on content validity and content validation.
Across all sources, there appears to be agreement that some combination of five aspects of instrument content is relevant to validity (regardless of whether they are addressed under the label of "content validity" or "evidence based on content"). Instrument content should include:

- A clear conceptualization of the construct of interest, defined in measurable terms (content domain and content domain definition)5
- Content that, across the entire instrument, taps all of the important aspects of the content domain (domain sampling/content representation; this is sometimes called content coverage (Anastasi, 1986), which is perhaps a more intuitively understandable term)
- Content that, at the individual item or element level, reflects the content domain (i.e., content relevance)
- Content that is of sound technical quality, as content that is unclear, ambiguous, or taps extraneous variables may lead to scores that reflect more than just the construct of interest
- Finally, following proper procedures at the instrument development stage will help ensure that the content domain is properly defined, that content coverage and relevance are adequate, and that the content is of good technical quality

5 The distinction between "content domain" and "domain definition" is not always clear. According to Sireci (1998b), the domain definition is the detailed, operational description of the content domain. The APA, AERA, and NCME Standards (1999) define "content domain" as "the set of behaviors, knowledge, skills, abilities, attitudes or other characteristics to be measured by a test, represented in a detailed specification" (p. 174). While this appears to make a distinction between the content domain and the operational description ("detailed specification") of this domain, the term "domain definition" is not used. In contrast, Fitzpatrick (1983) used the term "content domain" "to refer to any definition that is given to the procedures used to measure a behavior of interest. In its most detailed form, this definition might describe the content and structure of a test procedure and the rules for scoring the responses that are obtained" (p. 4). This passage does not clearly distinguish between a content domain and the operational description of that domain. In order to acknowledge that content domains are still at a level of abstraction that is not directly measurable (Sireci, 1998a), but in recognition of the fact that "domain definition" is not widely used, the term "content domain definition" will be used here, as it is descriptive of the aspect of content in question.

Discussions of these aspects of content often focus on items, but there are many other elements of instrument content that can influence validity. For example, if the response scales do not fit the items or the construct being measured, the resulting scores (and therefore the inferences made from those scores) will be affected. Another example is the scoring procedures: the way in which scores are computed can also influence the inferences that will be made. Content validation should therefore target all elements of the instrument that might affect the scores that will be obtained, including, in addition to items, such things as response formats, response scales, administration method, administration instructions, presentation of the items, and scoring (Haynes, Richard, & Kubany, 1995; Netemeyer, Bearden, & Sharma, 2003).

Additional Considerations in Content Validation: The Context of Measurement

All validity evidence must be considered within the larger context of an instrument's purpose and use. Evidence is not necessarily portable, and evidence of validity for one particular use of an instrument will not necessarily support validity claims if the instrument is used for another purpose (AERA et al., 1999). This holds for content validation as well.
Although content validation focuses on the instrument itself, rather than on scores, any claims regarding validity that are based on content are also tied to the purpose of measurement (Sireci, 1998b). For example, there may be evidence to support the use of an instrument for screening purposes, but additional validation will be needed if the instrument is to be used for diagnostic purposes. The target population of the instrument is also an important consideration. Evidence based on content may support the use of an instrument with one population, but not necessarily another (Haynes et al., 1995).

Aspects of Content and the QoLHHI

Description of the QoLHHI. The QoLHHI consists of a number of sections that correspond to major life areas such as health, living conditions, work, finances, and relationships. Each life area is measured by two sets of items. One set is based on Michalos' (1985) Multiple Discrepancies Theory (MDT) and contains 10 items that measure respondents' perception of the overall life area as a series of discrepancies (e.g., between self and others, self and ideal). The MDT sections of the QoLHHI are described in more detail in the QoLHHI Administration and Scoring Manual (Hubley et al., 2009).

The focus of this dissertation is on the second set of QoLHHI items, which evaluate the impact of each life area on the respondent's subjective quality of life (SQoL). For example, the Health Impact section includes items that ask about the impact of physical health, mental health, quality of sleep, and level of stress. The Living Conditions Impact section asks about the impact of housing, neighbourhood, clothing, personal hygiene, and food. Responses to these impact items are collected using a 7-point Likert-type scale with the response options "Large negative impact", "Moderate negative impact", "Small negative impact", "No impact", "Small positive impact", "Moderate positive impact", and "Large positive impact".
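The structure of this bipolar response scale can be sketched in code. The mapping below assumes a symmetric -3 to +3 numeric coding, which is a common convention for bipolar 7-point scales but is only an illustration here; the QoLHHI Administration and Scoring Manual, not this sketch, defines the instrument's actual scoring.

```python
# Hypothetical numeric coding for the QoLHHI 7-point impact response scale.
# The -3..+3 values are an assumption for illustration; the actual scoring
# is specified in the QoLHHI Administration and Scoring Manual.
IMPACT_SCALE = {
    "Large negative impact": -3,
    "Moderate negative impact": -2,
    "Small negative impact": -1,
    "No impact": 0,
    "Small positive impact": 1,
    "Moderate positive impact": 2,
    "Large positive impact": 3,
}

def code_response(label: str) -> int:
    """Map a verbatim response option to its (assumed) numeric code."""
    return IMPACT_SCALE[label]
```

A bipolar coding like this keeps "No impact" as a true neutral midpoint, so negative and positive impacts are distinguished by sign rather than collapsed onto a single satisfaction-style continuum.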
In addition to the impact items, each section includes items that gather descriptive information about the life area and can be used to provide context for the impact ratings. For example, respondents are asked to indicate their stress level (low, medium or high) before rating the impact of this stress.

The QoLHHI is intended to be administered in face-to-face interviews with the instructions, items, and, if necessary, response options, read out loud by an administrator, who also records the responses. Graphical representations of both the MDT and Impact response scales are available to assist with administration, particularly for individuals with low literacy skills (see Appendix A, p. 209, for the QoLHHI items and forms, and Appendix B, p. 216, for the Impact Response Card).

Content domain and content domain definition. The more clear and concrete the construct that an instrument is intended to measure, the more straightforward it is to define its content domain. For example, Murphy and Davidshofer (2001) provided a description of the content domain for a test of knowledge of world history, as taught in grade 7. This description includes three types of issues (social, political, and cultural), several geographic areas (North America, Europe, and Africa and Asia; these last two areas are grouped together in Murphy and Davidshofer's example), and two time periods (18th and 19th centuries). These can be laid out in a grid, and the relative importance of each intersection of topics (e.g., social issues in Europe in the 19th century) is specified in the form of time allotted to it in lectures and readings. When a content domain is this concrete and can be described with this level of specificity, it is also fairly straightforward to assess content relevance and coverage, as well as the adequacy of the content domain definition (e.g., grade 7 History teachers within a school district could be asked to review this definition).
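A grid of this kind can be sketched as a weighted table of cells. The uniform weights below are invented placeholders; in Murphy and Davidshofer's example the weight of each cell would instead come from the time allotted to it in lectures and readings.

```python
# A sketch of a content-domain grid in the style of the Murphy and
# Davidshofer world-history example (issue type x region x period).
# Weights are invented placeholders for illustration only.
from itertools import product

issues = ["social", "political", "cultural"]
regions = ["North America", "Europe", "Africa and Asia"]
periods = ["18th century", "19th century"]

# Every intersection of the three facets is one cell of the grid.
cells = list(product(issues, regions, periods))

# Placeholder: equal weight per cell; a real specification would assign
# each cell its own proportion, with the proportions summing to 1.0.
weights = {cell: 1 / len(cells) for cell in cells}

def coverage(item_cells):
    """Proportion of the domain (by weight) touched by a set of items,
    where each item is tagged with the grid cell it targets."""
    return sum(weights[c] for c in set(item_cells))
```

Once items are tagged with the cells they target, content coverage becomes a simple computation: the share of the domain's total weight that the item set reaches, with unreached cells flagging possible construct underrepresentation.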
Defining content domains is more challenging for constructs that are less easily defined, or that can be conceptualized in multiple ways. Quality of life (QoL) is a good example of such a construct, as many different conceptualizations of QoL abound. Some treat QoL as an objective concept, something that can be evaluated externally based on whether needs are being met in key areas such as health, housing, education, income and environment (Phillips, 2006; Sirgy, 2012). Subjective approaches, in contrast, view QoL as driven by individuals' own assessments of their circumstances (Cummins, 2010). Some approaches to QoL treat it as a very narrow construct, while others assume that QoL is quite broad and encompasses many aspects of a person's life. An example of a narrow conceptualization of QoL is found in the health field, where health status and QoL are often treated as interchangeable (Anderson & Burckhardt, 1999; Gill & Feinstein, 1994; Muldoon, Barger, Flory, & Manuck, 1998). Yet other research suggests that individuals with significant health problems factor life areas other than health into their assessments of their QoL (Anderson & Burckhardt, 1999), which lends support to a broader definition of QoL. Another debate centers on whether QoL is best measured by asking general questions about overall QoL, or by collecting and then summing separate ratings for individual life domains (Cummins, 1996).

With so many possible ways of conceptualizing the construct, the content for different measures of QoL may vary considerably. In fact, it would likely be impossible to come up with a "definitive" content domain definition that would apply to all measures of QoL, or upon which everyone would agree. This has implications for content validation, for how does one assess content coverage or representativeness if the content domain cannot be definitively defined?
This difficulty may be why some validity theorists consider content validation to be appropriate only for certain types of instruments for which the content is directly tied to the score meaning (e.g., education and achievement tests, where the scores reflect mastery of certain knowledge or skills; Anastasi & Urbina, 1997; Lawshe, 1975). It has also been suggested that, in the case of instruments intended to measure theoretical constructs, evidence based on instrument content can provide at most only very limited evidence (Kane, 2006). These are legitimate hesitations, in light of the point made above about the impossibility of establishing a definitive content domain definition for such constructs. But it could also be argued that content validation is especially important for constructs that are harder to define or can be defined in multiple ways, as measures of these constructs carry with them a higher risk of poor content definition, irrelevant content, and inadequate content coverage.

Even so, it is necessary to keep in mind the particular conceptualization of the construct that underlies an instrument, as it is reflected in the content domain definition, when conducting a content validation study. It would not be appropriate, for example, to criticize the content of a measure of subjective QoL on the grounds that it does not include objective assessments of QoL. The question of whether QoL is an objective or subjective construct is a separate (though important) issue from the question of whether an instrument's content reflects the particular content domain definition. Content validation should focus on the latter question. At the same time, it is important to recognize the limitations imposed by a given definition, which may restrict the applicability of validity evidence to only certain uses of the instrument or only certain measurement contexts.

The content domain for the QoLHHI was based on several sources.
A review of the literature on QoL, combined with the QoLHHI authors' own expertise in this area, led to the selection of the World Health Organization's (WHO) definition of QoL as the basis for the instrument. According to the WHO, QoL reflects:

Individuals' perception of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards and concerns. It is a broad ranging concept, incorporating in a complex way individuals' physical health, psychological state, level of independence, social relationships, personal beliefs and their relationships to salient features of the environment ("The World Health Organization Quality of Life assessment (WHOQOL)," 1995, p. 1405, italics in original).

Using this definition as a starting point for instrument development meant that the QoLHHI would focus on subjective measurement of QoL, and would incorporate a broad range of life areas or domains. In order to identify and define important domains for the target population, focus groups were conducted with a total of 140 individuals who were HVH (Palepu, Hubley, Russell, Gadermann, & Chinni, 2012). Additional information was gathered from 14 individuals working for organizations that provide services to individuals who are HVH (Hubley et al., 2009). The data from the focus groups and interviews were the primary source of information for the content of the QoLHHI. Other aspects of the QoLHHI content, such as the use of the MDT (Michalos, 1985) and the focus on "impact" for the Impact sections, were based on the QoL literature and the authors' own expertise in the area of QoL research.

The content domain for the QoLHHI thus includes general life areas such as "health", "living conditions", "finances", and "work and education" that were identified by focus group and interview participants as being relevant to SQoL.
It also includes sub-topics for each life area, such as "physical health", "mental health", and "substance use" for the general life area of health; and "cost", "cleanliness", and "privacy" for the area of housing. Finally, the content domain for the QoLHHI includes the type of assessments, such as impact ratings (as opposed to, for example, satisfaction ratings or status ratings), that respondents are asked to make for these life areas and sub-areas.

Content coverage. For a construct such as SQoL, the idea of proportional representation of the content domain, similar to the breakdown included in Murphy and Davidshofer's (2001) grade 7 History example, is not particularly relevant. It would be difficult to argue, for example, that the life area of "health" should take up 25% of the instrument content for a QoL measure, but the life area of "work and education" should only take up 15%. However, it is still important to determine if the instrument content fully reflects the content domain. For the QoLHHI, questions of content coverage should focus primarily on whether all important aspects of the content domain are represented, or if important aspects of SQoL are missing from the instrument's content.

Content relevance. Compared to content coverage, assessing content relevance is fairly straightforward even for constructs like SQoL. However, because the construct of SQoL can be defined in various ways and the content domain will only reflect one of these definitions, it is important to be prepared for a certain amount of disagreement on the relevance of individual content elements, compared to what might be the case for a construct that is more definitive. In addition, in the case of the QoLHHI, it is not only the content domain, but also the target population that must be factored into considerations of content relevance. The target population for the QoLHHI is quite diverse.
The instrument content was designed to be flexible in order to accommodate this diversity (e.g., using terms such as "the place where you live or stay" instead of "housing" in order to accommodate both homeless and vulnerably housed respondents, and including a "not applicable" response option). Nevertheless, it is possible that the content will be perceived by some to have differing levels of relevance for different subgroups within the HVH population.

In addition to item relevance, the idea of "impact" should be assessed for relevance. This is one of the central concepts for the Impact sections of the QoLHHI. In these sections, respondents are asked to rate the impact, either negative or positive, that each life area has on them. This is not a typical approach to measuring QoL; most instruments assess QoL via satisfaction, status, and/or importance ratings (e.g., Burckhardt, Woods, Schultz, & Ziebarth, 1989; Cummins, McCabe, Romeo, & Gullone, 1994; Frisch, Cornell, Villanueva, & Retzlaff, 1992; Lehman, 1988). Because "impact" is part of the content domain of the QoLHHI, and at the same time a more unusual concept in the measurement of SQoL, it will be important to obtain feedback on the relevance of impact ratings.

Technical quality. For many of the content elements of the QoLHHI, clarity will be the most important aspect of technical quality to assess. Unclear content in the items, instructions or instrument manual risks introducing construct-irrelevant variance into the administration, responses and scoring of the QoLHHI. This may in turn affect scores. For some elements, it will make more sense to ask about different aspects of technical quality. For example, the manual for the QoLHHI includes suggestions for dealing with various difficult administration scenarios.
While the clarity with which these scenarios and suggestions are presented is certainly important, an even more informative question will be whether these suggestions are helpful (i.e., do they serve their intended purpose of facilitating the administration of the QoLHHI?).

The larger context: The purpose of measurement and target population. The QoLHHI was designed to be used for multiple purposes, including research, program evaluation, service provision, and policy development. Ultimately, evidence for validity with regard to a particular purpose will require obtaining instrument scores. For example, the effectiveness of the QoLHHI in identifying group differences among individuals who are HVH cannot be established without administering the instrument to different groups and evaluating the resulting scores. Therefore, there is a limit to the information that can be provided about the purpose of measurement through a content validation study. Nevertheless, the intended uses of the QoLHHI should be kept in mind during the content validation process, as it may be possible, even in the absence of scores, to identify content that has the potential to make the instrument less effective for some purposes (or, conversely, particularly effective for some purposes). Identifying such content early will save researcher and participant time and effort in later validation studies.

The definition of the target population is also relevant to the content validation of the QoLHHI. Like the construct of QoL, this population can be defined in various ways. Most research on homelessness has tended to focus on individuals who are absolutely homeless or using services for individuals who are homeless (Phelan & Link, 1999; Toro, 2007; Toro et al., 2007). This is a very narrow definition of the target population.
In contrast, a broad definition of homelessness might include individuals and families who are unsheltered (living in public spaces or spaces not intended for human habitation), emergency sheltered (staying in emergency shelters), provisionally accommodated (which includes those living in transitional housing, living with others on a temporary basis, living in short-term rental housing such as motels or rooming houses but without security of tenure, being in institutional care without certainty of permanent housing upon release, or residing in housing provided for immigrants and refugees), or at risk of homelessness (due to precarious finances or unsafe housing; Canadian Homelessness Research Network, 2012). Depending on the definition used (i.e., narrow versus broad), the target population will be quite different, and may require different content in order to measure SQoL.

The target population of the QoLHHI includes both individuals who are homeless and those who are insecurely/unstably housed (Hubley et al., 2009), and so is closer to the broad definition of homelessness cited above. As part of content validation, it will therefore be important to determine if the instrument has adequate content representation and relevance for the different sub-groups within the target population. The technical quality of elements is also a factor. For example, if certain content is only meant to be administered to some sub-groups, the instructions to interviewers must clearly indicate this and make it easy to identify what content to administer and what content to skip.

Collecting Evidence Based on Instrument Content: The Content Validation Study

Judgmental studies using content experts. Content validation may be carried out using either statistical or judgmental methods, but judgmental approaches are by far the most common. These typically involve asking individuals with relevant expertise ("subject matter experts" or SMEs) to evaluate the instrument's content.
The most commonly cited qualification for an SME is some form of experience related to what the instrument is intended to measure. For example, Beck and Gable (2001) noted that "content validity experts are expected to have extensive knowledge in the construct being measured" (p. 209), while Lynn (1986) proposed that SMEs should have "been determined to have expertise in the content/domain area(s) of the instrument" (p. 384). DeVellis (1991) provided the example of "colleagues who have worked extensively with the construct in question or related phenomena" (p. 75). Additional criteria may include relevant professional experience with the target population (Davis, 1992). Individuals with expertise in measurement or instrument construction can also be considered because, although these SMEs may be less qualified to judge an instrument's content as it relates to the construct of interest, they will be able to assess its technical aspects (Di Iorio, 2006).

As with content validation more generally, context is important when establishing criteria for selecting SMEs. It is not only the construct of interest that should be considered, but also the target population, the goal of measurement, and any other aspect of the measurement context that might help determine whose feedback would be most useful. For example, if the measurement instrument under investigation is to be used as a research tool, it will make sense to seek out SMEs with relevant research experience. When an instrument is meant for clinical use, SMEs with clinical experience will, of course, be important. Some instruments are designed for multiple purposes, in which case it may be desirable to seek feedback from several groups of SMEs, each with a different type of applicable experience. In the case of the QoLHHI, it was decided to focus initially on its use as a research instrument and draw on SMEs with research backgrounds.
In addition, it was decided to seek out researchers who had experience with individuals who are HVH, rather than in the area of QoL, even though the QoLHHI is a measure of SQoL. The reasoning was that QoL researchers in other fields might not necessarily be able to evaluate how well the QoLHHI captures QoL for individuals who are HVH, but that researchers who have worked with individuals who are HVH would, if provided with a definition of the construct, be able to assess the potential of the QoLHHI to measure SQoL.

In addition to "traditional" clinical or research-focused SMEs, judgments about content may also be solicited from members of the target population of the instrument. These are sometimes called lay experts or experiential experts. Experiential experts (EEs) can be especially helpful in identifying unclear item language and judging if an abstract concept has been translated into understandable items (Tilden, Nelson, & May, 1990). This will be particularly salient if the target population for the instrument is quite different from the SMEs and/or the construct of interest is subjective, as with the QoLHHI. In these cases, SMEs may not be able to fully represent the views of the target population. Involving EEs in content validation has also been recommended as a way to promote "research ideals of participation, inclusion and collaboration" (Schilling et al., 2007). Again, this consideration is particularly relevant for the QoLHHI, which is aimed at a group that is frequently marginalized. The use of EEs as content experts would represent an opportunity to give individuals who are HVH a voice in the research process. Although EEs are acknowledged as valuable potential contributors to content validation (e.g., Grant & Davis, 1997; Haynes et al., 1995), there has been little discussion of the practical considerations that may arise when they are included in a study.
Depending on the individuals involved, EEs can present some challenges that are not necessarily found with SMEs. In order to evaluate the content of an instrument effectively, content experts need to understand the study language and task, something that can be challenging for some EEs. EEs should also evaluate the content based not just on their own experiences, but as representatives of the broader target population. Again, this can be difficult for some EEs (Stewart, Lynn, & Mishel, 2005). Some EEs may also struggle with the level of abstraction required to answer questions about an instrument, rather than responding to the items themselves (Schilling et al., 2007). At the very least, the language used for study materials will likely need to be simplified and stripped of technical jargon. Other aspects of the study, such as the method of data collection, may need to be adapted as well.

Stewart and colleagues (2005) described a number of modifications that they implemented for a content validation study using children aged 8-16 years as EEs. One was to adapt the study materials to the language level of the participants. Another was to supplement written materials with a verbal orientation to the validation tasks. Finally, the EEs' data were collected in one-on-one interview-type sessions, rather than through self-administered forms as is common with SMEs. This allowed the researchers to provide clarification about the study if needed. It also led to the researchers identifying two participants who always rated an item poorly if it did not apply to them personally. Because of the interactive nature of the data collection, the researchers were able to address this issue. The two participants were asked to rate the instrument items a second time, this time pretending to be a different child who was not exactly like them. In the second round, the EEs were more positive in their ratings.
Not all EEs will necessarily need all or even some of these kinds of modifications in order to be able to participate effectively in a content validation study (although it could be argued that using straightforward, jargon-free language is always desirable). In the case of the QoLHHI, however, some adjustments will likely be needed in order to include EEs in content validation studies. In particular, some EEs might have cognitive impairments or lower literacy levels that, while not significant enough to disqualify them as EEs, might nevertheless make it advisable to adjust the study materials and procedures.

SMEs and EEs are the most common types of experts reported for content validation studies, but other groups may also be able to provide useful feedback on instrument content. The QoLHHI, for example, is designed to be administered by an interviewer, rather than self-administered. Interviewers would be particularly well qualified to comment on administration-related content elements such as the administration and scoring instructions. In addition, these "practical experts" (PEs) may have received feedback on the items from the HVH individuals to whom they have administered the QoLHHI. This feedback, which could be passed on through a content validation study, could supplement feedback obtained directly from EEs. PEs therefore represent a potentially invaluable (but easily overlooked) source of information on the content of the QoLHHI, and should be included in the content validation of the instrument.

Data for a content validation study. Content validation can employ both quantitative and qualitative approaches (Haynes et al., 1995; McKenzie, Wood, Kotecki, Clark, & Brey, 1999). For a judgmental study with content experts, quantitative data generally consist of ratings of content coverage, relevance, and technical quality.
The qualitative or descriptive component generally consists of comments from the content experts, including feedback on specific elements and suggestions for content to add.6

6 Open-ended feedback (i.e., comments and suggestions) is often referred to as "qualitative" feedback (e.g., Haynes, Richard, & Kubany, 1995). Although qualitative in the sense of not involving numeric data, this feedback is rarely, if ever, collected and analyzed using qualitative research methods or theory. Therefore, in order to avoid confusion with research conducted using such theories or methods, open-ended feedback will be referred to in this dissertation as "descriptive feedback" or "descriptive data".

Collecting and analyzing the quantitative data. In a content validation study, quantitative ratings are usually analyzed using a measure of inter-rater agreement. Deciding on a measure or index can be challenging, as there are many from which to choose and each has its champions and detractors. Should one use the Content Validity Ratio (Lawshe, 1975) or the Content Validity Index (CVI; Lynn, 1986), both of which have been proposed specifically for content validation? Or the multi-rater kappa statistic (Fleiss, 1971), which has been suggested as a chance-corrected alternative to the CVI (Wynd, Schmidt, & Schaefer, 2003), or one of its variants? Or perhaps a measure of inter-rater agreement that focuses on the variability of the ratings, such as the ADm (Burke, Finkelstein, & Dusig, 1999) or rwg (James, Demaree, & Wolf, 1984)? Although the options can seem overwhelming, the selection of an appropriate measure is made simpler by considering the purpose for collecting the ratings in the first place. In content validation, the ultimate purpose for obtaining evaluations of instrument content from SMEs, EEs, or other content experts is to determine if there is evidence that using an instrument is likely to produce valid inferences.
Or, to approach the issue from a different direction, a content validation study can be used to identify potential sources of invalidity in the instrument's content. A content validation study can also provide information to guide changes to content in order to reduce potential sources of invalidity. In this, evidence based on content can serve a function that other types of validity evidence do not: it is not only descriptive, but potentially prescriptive as well. That is, the information collected as part of a content validation study can be used to determine if content elements should be retained, revised, or removed from the instrument. This may be why content validation is often discussed as part of the instrument development stage (e.g., Beck & Gable, 2001; DeVellis, 1991; Grant & Davis, 1997; Wynd et al., 2003), as it is at this point in an instrument's lifetime that such changes will be easiest to make. However, findings regarding content-based evidence for an instrument cannot be assumed to be stable over time. Shifts in theory and the accumulation of new knowledge can affect the definition and understanding of a construct, which in turn can have validity implications. Therefore, evidence based on content should be re-assessed periodically (Haynes et al., 1995). Although changes to the instrument's content may be easiest to make during the development stage, there is no reason they cannot also be implemented at a later time (subject to copyright and other restrictions). What is needed, therefore, is a measure or index of instrument content that can serve both descriptive and prescriptive functions.
Regardless of whether experts are rating a content element for its relevance, its clarity, or some other characteristic, what is needed in order to assess the implications for validity and inform the decision to retain, revise or remove that element is not the level of expert agreement on the ratings for the element, but rather the level of expert endorsement of the element. Many measures of overall inter-rater agreement, such as the multi-rater kappa or ADm, do not distinguish between high agreement that an item is relevant and high agreement that an item is not relevant. In other words, they do not measure endorsement, which is the information needed to make decisions about revising the instrument content. If these measures are used to collect data for a content validation study, additional information will be needed in order to distinguish between endorsed and non-endorsed elements. A more parsimonious measure will be one that directly identifies endorsement. Ease of computation and interpretation, though not essential, are also helpful characteristics. Based on these criteria, the CVI becomes a viable first choice.

The Content Validity Index. The CVI measures the proportion of raters who endorse an element as relevant, clear, etc. Lynn (1986) suggested collecting ratings from experts using a 4-point scale and then collapsing the resulting data into two categories by combining the bottom two response options and the upper two response options. The CVI is then calculated as the proportion of experts who assigned an element a rating of either 3 or 4 (i.e., endorsed an element). In addition to item or element-level CVIs (E-CVIs), it is also possible to compute a scale-level CVI (S-CVI).7 There is some debate as to how the S-CVI should be calculated, but Polit and Beck (2006) recommended computing the average of the item-level CVIs for all of the items in the instrument. In the context of a content validation study based on experts'
ratings, the CVI measures exactly what the researcher wants to know, namely, the extent to which the experts endorsed an element as relevant, clear, etc. In addition, the CVI is simple to compute and intuitive to interpret.

7 The abbreviation I-CVI has been used to refer to item-level CVIs (e.g., Polit, Beck, & Owen, 2007) to distinguish these from scale-level CVIs. E-CVI (for element-level CVI) will be used in this dissertation instead, since CVIs may be computed for other content elements besides items.

However, the CVI has been criticized on a number of grounds, in particular that it does not account for the possibility of chance agreement among raters (Beckstead, 2009; Polit & Beck, 2006; Wynd et al., 2003). This concern has long been raised in connection with proportion agreement indices more generally (e.g., Cohen, 1960; Tinsley & Weiss, 1975; Waltz & Bausell, 1981). The CVI has also been criticized for collapsing response categories (e.g., from a 4-point scale to a 2-point scale), resulting in a loss of information (Beckstead, 2009). These concerns will be addressed in more detail next.

The CVI and the risk of chance agreement. The most common criticism of the CVI is that it does not adjust for the possibility of chance or non-true agreement between raters. Chance agreement could lead to inflated estimates of agreement/endorsement. This is a legitimate concern but, unfortunately, one that is not easy to resolve. In order to incorporate an adjustment for chance agreement into an index or measure, it is first necessary to establish a model for the chance agreement component. There are two main ways to do this. The first is to use observed data. This is the approach taken, for example, with the multi-rater kappa statistic, where the distribution of observed ratings is treated as "if the raters made their assignments purely at random" (Fleiss, 1971, p.
379) and forms the basis of the chance agreement component of the computation. The use of observed data for this purpose has been criticized on the grounds that it is unlikely that raters would respond completely at random for every rating they provide, making observed data a poor model of chance agreement (Ubersax, n.d.). This criticism seems especially salient for content validation studies, as the most plausible explanation for completely random responses would be guessing, yet guessing seems particularly unlikely for a sample of experts (who, by definition, would have better-than-average knowledge of the subject they are rating). So, while it is possible that the observed distribution of ratings in a study reflects some amount of random responding, it is unlikely to reflect completely random responding, and there is no way to know just how much randomness is involved.

The second approach to adjusting for chance agreement is to establish an a priori model for the chance agreement component, as is done with measures such as the rwg index (James et al., 1984). James et al. demonstrated the calculation of the rwg index using a uniform (i.e., completely random) distribution for chance agreement; however, they noted that this is not the only possible pattern for chance agreement and suggested computing rwg with a number of different distributions. The results could then be used to establish a range within which the true agreement for a given set of data is assumed to lie. Unfortunately, this approach would place a heavy and probably unrealistic burden on researchers who are conducting a content validation study, who would need to use theory and/or previous evidence to decide on at least two defensible models of chance agreement. The options are numerous.
For example, in addition to a uniform distribution, they could propose a leniency bias (i.e., a bias in favour of endorsement) under the hypothesis that the experts who agree to participate in a content validation study are likely to be positively predisposed towards the instrument. Alternatively, they could hypothesize that the experts would have a strong sense of responsibility to ensure the quality of the measurement instrument, and so would be particularly critical of its content. To add a potential complication, it is possible that different groups of experts (e.g., SMEs versus EEs) would present with different biases, or that the mode of data collection (e.g., in-person versus self-administered) could trigger different response sets. No doubt additional scenarios could be proposed. Given this, it is not clear to what extent it is possible to select a correct or accurate a priori model of chance or non-true agreement among content experts.

In sum, the possibility for chance agreement does exist for data obtained through expert ratings, but the difficulty lies in determining how to address this. Even if it were possible to correct for this chance, the theoretical and data requirements would change the focus from assessing rater agreement to modeling rater agreement (Ubersax, n.d.), a completely different exercise that shifts the focus away from content validation. With so many challenges inherent in trying to guard against the possibility of chance agreement, it is better to acknowledge the risk but avoid applying complicated corrections that may simply introduce a new layer of error. Therefore, the CVI, despite not accounting for the possibility of chance agreement, remains a viable choice of measure for a content validation study.

The CVI and collapsing categories.
A key step in the computation of the CVI is the collapsing of the obtained ratings into dichotomous categories (e.g., on a 4-point scale, ratings of 1 and 2 are combined, as are ratings of 3 and 4). This practice, for which neither Lynn (1986) nor her source, Waltz and Bausell (1981), provides a rationale, has been criticized for leading to a loss of important information (Beckstead, 2009). While this criticism is valid, there is, in fact, a good argument for combining categories, one that lies in recognizing the distinction between measurement scales and scales of analysis.

The measurement scale is employed at the point of data collection, that is, when the content experts provide their ratings. The purpose of these ratings is to quantify each element of the measurement instrument in terms of some characteristic such as relevance, clarity, helpfulness, etc. The rating (measurement) scale with which the experts are provided in order to make these judgments is of critical importance. Beckstead (2009) noted that:

The purpose in using a rating scale ... is to allow a rater or group of raters to demonstrate their perceptual discriminations among a set of stimuli (items in this case) ... The rating scale may be thought of as having a certain capacity for transmitting information about the items from the expert rater to the researcher (p. 1277).

A longer scale (such as the 4-point relevance scale proposed by Lynn, 1986) will, of course, allow for greater discrimination than a shorter scale (e.g., a 2-point scale). But given that the response scale is collapsed into dichotomous categories in order to calculate the CVI, it is reasonable to ask why raters should not be provided with a dichotomous scale at the outset. In rating relevance, for example, experts could be asked to simply indicate if they think an item is "relevant" or "not relevant".
These are, after all, the categories that are obtained when the scale is collapsed at the point of computing the CVI. The danger in using such a dichotomous scale at the point of measurement is that it may not provide experts with a response option that matches their assessment of a content element. In order to provide ratings, individuals must map the response in their mind onto one of the options offered in the response scale. If there is no option available that matches their intended response, they will be unable to express their opinion. Dichotomous scales allow respondents to express extreme opinions, but do not provide options that fit more moderate views (Krosnick & Presser, 2010). In a content validation study, a dichotomous scale would force experts to make judgments by rating elements as either absolutely "relevant" ("clear", etc.) or absolutely "not relevant" ("not clear", etc.), a judgement that may be too extreme for some respondents or for some elements. Providing a rating scale with more response options instead permits experts to express a range of opinions about the elements. This is important not only for conveying information to the researcher (as noted by Beckstead, 2009), but also in order to make the rating process meaningful for the experts.8

8 There is no reason why the response scale for the measurement stage must have four response options. For example, Lynn (1986) acknowledged that a 3- or 5-point scale might be used instead, but felt that a 4-point scale would provide sufficient discrimination while avoiding an "ambivalent" midpoint (p. 384). Other researchers might disagree, and prefer a scale with a different number of response options. Different elements or characteristics may also be best measured on different scales.
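The mechanics of the CVI described above (collect ratings on a 4-point scale, dichotomize at the 3/4 boundary, take the proportion of endorsements for each element, and average across elements for the scale level) can be sketched in a few lines of code. This is an illustrative sketch only; the function names and example ratings are invented, not taken from the QoLHHI materials.

```python
def e_cvi(ratings, endorse_cut=3):
    """Element-level CVI: the proportion of experts whose rating
    meets the endorsement cut-off (e.g., 3 or 4 on a 4-point scale)."""
    return sum(r >= endorse_cut for r in ratings) / len(ratings)

def s_cvi_ave(all_ratings, endorse_cut=3):
    """Scale-level CVI using the averaging approach recommended by
    Polit and Beck (2006): the mean of the element-level CVIs."""
    e_cvis = [e_cvi(ratings, endorse_cut) for ratings in all_ratings]
    return sum(e_cvis) / len(e_cvis)

# Hypothetical ratings from five experts on three elements (4-point scale).
ratings_by_element = [
    [4, 3, 3, 2, 4],   # 4 of 5 endorsements -> E-CVI = 0.80
    [4, 4, 4, 3, 3],   # 5 of 5 endorsements -> E-CVI = 1.00
    [2, 3, 1, 2, 4],   # 2 of 5 endorsements -> E-CVI = 0.40
]
print([e_cvi(r) for r in ratings_by_element])    # [0.8, 1.0, 0.4]
print(round(s_cvi_ave(ratings_by_element), 2))   # 0.73
```

Note that the `endorse_cut` parameter makes the dichotomization explicit: moving it to 4 implements the "conservative" grouping discussed later, and moving it to 2 the "liberal" one, so the consequences of different groupings for the proportion of endorsed elements can be inspected directly.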
Having argued that a response scale with more than two options allows content experts to express their true opinion of the content elements better than a 2-point scale, the question then becomes: why collapse the response categories at the point of analysis, since by doing so, a degree of the detail in the opinions provided by the experts is lost? The answer is that there is a fundamental difference in the purpose for which the ratings will be used at the point of analysis (i.e., when the ratings obtained with the measurement scale are used to compute a CVI). At the data collection stage, as already noted, the goal is to gather information about the degree of relevance, clarity, etc. of content elements. These are judgments that rest with the various content experts (e.g., SMEs, EEs, PEs) participating in the content validation study. At the point of analysis, the purpose of the ratings is to provide evidence for validity, and perhaps also to determine whether changes should be made to the instrument. The task is now to decide what ratings reflect endorsement and what ratings do not. Essentially, the question is no longer "how relevant (clear, etc.) is this item?", but rather "is this item relevant (clear, etc.) enough for the purposes of content validation?" These are judgments that rest with the researcher(s) conducting the content validation study, rather than the experts who provided the ratings. The researchers should base their decisions on both the meaning of the scale points, and on the outcomes (i.e., retain, revise, or remove an element) attached to each grouping of responses.

Lynn (1986) proposed that ratings of 3 and 4 on a 4-point scale be taken as indicative of endorsement and, as an outcome, that items that do not obtain an acceptable CVI be revised or removed from the instrument. But Lynn does not explain why she recommends this particular grouping of the scores, and other groupings are presumably possible.
For example, on a 4-point scale consisting of the categories "not at all clear", "somewhat clear", "mostly clear" and "very clear", a researcher could choose to take a very conservative approach and only categorise ratings of "very clear" as endorsements. In contrast, a very liberal approach might be to accept all ratings of "somewhat clear", "mostly clear" or "very clear" as indicative of endorsement. The most probable outcome of either of these approaches would be a shift in the percentage of elements that are endorsed, and therefore different outcomes in terms of the number of elements that are retained, revised or removed from the instrument. In this example, both the liberal and conservative approaches are problematic. A rating of only "somewhat clear" suggests that an element has some significant flaws in the opinion of an individual who is considered an expert. To treat this rating as an endorsement would seem to dismiss the content expert's concerns, and undermine the point of seeking his or her feedback in the first place. In contrast, the conservative approach of treating only ratings of "very clear" as indicative of endorsement is unnecessarily stringent. In this case, an item that received a number of "mostly clear" ratings might be removed from the instrument even though it is probably not, in fact, a bad item, and the goal of improving instrument content would be poorly served by this decision.

This is not to suggest that response scales should always be collapsed down the middle. In some cases, uneven groupings might be more appropriate. For example, if the above response scale for clarity were changed so that the response options were "very unclear", "somewhat unclear", "somewhat clear" and "very clear", the meaning of the third response option would be different (i.e., changed from "mostly clear" to only "somewhat clear").
In this case, it would make sense, and seem to be in keeping with the meaning of the response options, to treat only ratings of "very clear" as endorsements. The most important consideration in collapsing response categories should not be some sort of arbitrary rule, such as splitting the response options down the middle or into even groupings, but rather the meaning of the ratings in the context of establishing what constitutes an endorsement.

Ultimately, the decision of whether or not to collapse categories on the rating scale should be based on careful judgments on the part of the researcher(s) conducting the content validation study. These judgments will be informed by the meaning of each scale point and the outcome attached to them. If each point on the measurement scale corresponds to a different decision about the implications for validity and/or about retaining, revising or removing an element, then all of the points should be retained for the analysis scale. If, however, two or more points will lead to the same conclusion and outcome (e.g., they will all result in an element being retained, though perhaps with revisions), then collapsing categories becomes not only acceptable, but even desirable from the point of view of tidying up the data and simplifying the interpretation of the results.

As the above discussion shows, the two main criticisms of the CVI do not, in fact, disqualify it as an index for a content validation study. Although it is true that the CVI does not account for the possibility of chance agreement among raters, it is unclear if there is any effective way to guard against this possibility. Concerns about collapsing the rating scale categories in order to compute the CVI are resolved by recognizing that there is a difference between the measurement and data analysis stages, and that the different goals at each stage may be best served by collapsing categories between stages.
The CVI has the advantage of measuring endorsement, information that is directly applicable to the task of content validation, and it is easy to compute and interpret. For these reasons, the CVI will be used for the content validation of the QoLHHI. Measurement and analysis scales can be established on a per-element basis in order to meet the goals of each stage of the study.

The CVI and choosing a cut-off point for endorsement at the sample level. The preceding discussion of scales of analysis applies to the identification of scores that represent endorsement at the level of the individual rater. For a content validation study, it is also necessary to establish a minimum level of acceptable endorsement at the group or sample level. One way to approach this is to base the minimum on levels of statistical significance. For the CVI, Lynn (1986) provided guidelines for the number of endorsements needed in order to establish evidence of endorsement beyond the p < .05 level, for samples of between 2 and 10 raters. The main reason to apply a cut-off based on statistical significance would be to establish that the obtained data are not a chance finding, but rather are representative of some larger population of experts; in other words, the application of inferential statistics. But both Lynn's guidelines for the CVI and the practice of using inferential statistics in content validation studies with expert raters more generally have been criticized.

Lynn (1986) based her proposed guidelines for the CVI on the standard error of proportions; this is discussed in more detail in Polit et al. (2007).[9] According to Polit et al., Lynn first set 0.50 as the probability of chance agreement, based on the dichotomous structure of the CVI. In determining the number of endorsements needed for a particular sample size of raters, she then selected the smallest proportion of raters for which the lower bound of the confidence interval of the standard error of the proportion would be greater than 0.50. In some cases, this lower bound is quite close to 0.50. Polit et al. used the example of the confidence interval of .54 to 1.00 around a CVI of 0.83 for six experts (5/6 endorsements). As Polit et al. noted, "knowing that the 'true' population proportion could be as low as .54 is hardly reassuring about the item's relevance, even if this value is greater than ratings at random" (p. 464).

The main problem here is one of small sample sizes, a problem that is not unique to studies employing the CVI. Indeed, similar concerns have been raised more generally about the high chance of sampling error inherent in the small samples typical of content validation studies. As Beckstead (2009) concluded, "There is no escaping the law of large numbers; the quality (i.e., precision) of an estimate is a function of the sample size" (p. 1281). Certainly, if the goal is to make inferences about the larger population of all possible experts, then the use of samples in the range of 5 to 10 individuals (fairly standard for a content validation study) is worrisome. The fact that samples of experts are rarely randomly selected, but rather depend on availability and willingness (Lynn, 1986), adds to the concern, since non-random samples can have a negative impact on the accuracy of statistical inferences.

[9] Lynn does not explain in detail how she arrived at her proportions in her 1986 article. The procedure is explained in Polit, Beck and Owen (2007), who cite personal communication with Lynn as the source of their information on this subject.
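The imprecision that Polit et al. highlight can be reproduced with the standard (Wald) confidence interval for a proportion. This is an illustrative sketch of the calculation, not necessarily the exact procedure Lynn used:

```python
import math

# 5 of 6 experts endorse an item, giving a CVI of 5/6 (about 0.83).
endorsements, n = 5, 6
p = endorsements / n

# 95% Wald confidence interval for a proportion, truncated to [0, 1].
se = math.sqrt(p * (1 - p) / n)
lower = max(0.0, p - 1.96 * se)
upper = min(1.0, p + 1.96 * se)

print(round(p, 2), round(lower, 2), round(upper, 2))  # 0.83 0.54 1.0
```

The lower bound of roughly .54 matches the figure Polit et al. cite, showing how little a CVI of 0.83 constrains the "true" proportion when only six raters are available.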
Rather than try to address this problem, however, it is worth asking if representativeness should even be the goal when convening a sample of content experts for a content validation study. Instead of thinking of experts' data in terms of estimating the levels of endorsement that exist in the population of all possible experts, it may be far more useful to treat a content validation study as an opportunity to consult with an advisory group consisting of individuals who have expertise that is relevant to the construct of interest, target population, or purpose of measurement. In planning a content validation study, researchers might even deliberately try to include raters who are not likely to be representative of the "population" of experts, for the very reason that these experts may be able to provide valuable counterpoints to the feedback from more typical experts. Considered in this light, statistically significant cut-off values for the E-CVI (or any other quantitative measure or index of expert endorsement) are not particularly appropriate or useful. Instead, the aim should be to establish a level of endorsement that instills confidence that a particular group of experts felt that a content element was, on the whole, relevant, clear, or otherwise of good quality.

In order to do this, an acceptable level of sample-level endorsement can be set a priori. This level could be based on previous studies or the literature on content validation. The most commonly-cited minimum for the E-CVI in the literature is 0.80 (e.g., Pierce, 1995; Waltz, Strickland, & Lenz, 1991); however, no justification for this cut-off is provided. For the content validation of the QoLHHI, a somewhat different approach is proposed. The first factor to be considered is sample size. Since this cannot be predicted in advance, the minimum number of raters that is generally deemed acceptable for a content validation study can be used as a starting point. There are no firm guidelines for this minimum, but five experts is one figure that is frequently cited in the literature (e.g., Lynn, 1986; Netemeyer et al., 2003; Osterlind, 1997). Next, the highest number of allowable disagreements must be established. It would be unrealistic to expect that all experts will endorse every content element but, with such a small sample, allowing more than one dissenting voice per element would be too lenient. Therefore, assuming a sample size of 5, a minimum of 4 experts should endorse an element (however endorsement is defined) in order for that element to be considered to have an acceptable level of endorsement at the sample level. This sets the minimum level for the CVI at 0.80 (4/5 endorsements). In terms of actual raters, this allows for only a single disagreement for any sample of between 5 and 9 individuals, and thus represents a fairly stringent criterion even at larger sample sizes.[10]

Setting a minimum level of endorsement for the scale-level CVI (S-CVI) for the content validation of the QoLHHI is more difficult because the S-CVI is an average of the CVIs for individual elements, and there is no generally accepted minimum number of items or elements for a scale in the way that there is a generally accepted minimum sample size. Therefore, for the content validation of the QoLHHI, it is recommended that a cut-off of 0.90, proposed by Polit et al. (2007), be used for the scale-level analyses. In the absence of a logically defensible alternative, adopting this quite stringent standard appears to be a reasonable approach.

[10] For example, for a sample of nine raters, one disagreement would result in a CVI of 0.88 and therefore a decision of endorsement, but two disagreements would result in a CVI of 0.78 (non-endorsement).

Collecting and analyzing descriptive data.
Relatively little has been said in the content validation literature about qualitative or descriptive feedback from content experts. Most general discussions of content validation describe one or more quantitative procedures in some depth, but make little or no mention of non-quantitative feedback (e.g., Davis, 1992; Lynn, 1986; Rubio, Berg-Weger, Tebb, Lee, & Rauch, 2003; Sireci, 1998b; Waltz et al., 1991). When descriptive feedback is mentioned, it is often treated as a secondary source of information, and its collection as optional. For example, DeVellis (1991) suggested that content experts can be asked to "point out awkward or confusing items and suggest alternative wordings, if they are so inclined" (p. 76, emphasis added), and that "you might invite your experts to comment on individual items as they see fit" (p. 75, emphasis added).

This tentative approach to collecting descriptive feedback from content experts typifies the discussion of content validation. Yet many of these same discussions also suggest that items or elements that do not obtain acceptable ratings from experts (however "acceptable" is defined) should be revised or removed from the instrument (e.g., Lynn, 1986; Rubio et al., 2003; Waltz et al., 1991). What is not made clear is how the substance of these changes will be determined. For example, if an item is not endorsed for clarity, how will the instrument authors or revisers know exactly what to do in order to make it clearer? The obvious place to look for guidance is in the descriptive feedback provided by content experts. An informal review of reports of the practice of content validation suggests that this is indeed exactly how many researchers go about making changes to instrument content, but the process is rarely described in detail.

Typical examples of how reports of content validation studies describe the application of descriptive data to the task of revising an instrument include "together with the CVI, these comments informed the modifications made to the CNAQ" (Halliday, Porock, Arthur, Manderson, & Wilcock, 2012, p. 218) and "some phrases were changed according to patients' suggestions" (Can, Durna, & Aydiner, 2010, p. 317). Examples of somewhat more detailed discussions can be found in Hubley and Palepu (2007) and Schilling et al. (2007), though even for these reports, the focus is still primarily on quantitative findings. It would appear that researchers are using descriptive feedback from content experts to inform their decisions about revisions to an instrument, but without describing the process in detail in published reports. This may be due to a bias in favour of quantitative data or considerations of manuscript length (i.e., detailed discussions of descriptive analyses generally require more space than discussions of quantitative analyses). Regardless of the reason, these brief summaries do not provide readers with enough information to permit a critical evaluation of the researchers' conclusions regarding content-based validity evidence, or of their recommendations for changes to an instrument. The fact that the collection of descriptive data appears to be optional in many studies is also a concern, as this means that content experts may not realize the importance of providing this feedback. In order to meet the goals of a content validation study (i.e., evaluate content-based validity evidence, and perhaps also guide revisions to an instrument), descriptive data should be requested of content experts not as an option, but as a matter of course. The analysis and implications of these data should also be described in detail when the study findings are reported.
For the content validation of the QoLHHI, all experts will be asked to provide explanations for their quantitative assessments, in addition to other descriptive feedback such as suggestions for topics that may be missing from the QoLHHI. The hope is that, by asking for this feedback, it will be possible to identify not only which elements are not endorsed, but why they are not endorsed, and not only which elements would benefit from revision, but also what revisions to make.

Table 2.1: Important Aspects of Content Identified in Selected Publications on Content Validity and Content Validation

Anastasi & Urbina (1997)
- Text from source: "1. Does the test cover a representative sample of the specified skills and knowledge? 2. Is test performance reasonably free from the influence of irrelevant variables?" (p. 116)
- Aspects of content addressed(a): 1. Content coverage; 2. Technical quality

DeVellis (1991)
- Text from source: "Content validity concerns item sampling adequacy - that is, the extent to which a specific set of items reflects a content domain" (p. 43); "You can ask your panel of experts ... to rate how relevant they think each item is to what you intend to measure" (p. 75, italics in original); "Reviewers also can evaluate the items' clarity and conciseness" (p. 75, italics in original); "A third service that your expert reviewers can provide is pointing out ways of tapping the phenomenon that you have failed to include" (p. 76, italics in original)
- Aspects of content addressed(a): 1. Content coverage; 2. Content relevance; 3. Technical quality

Furr & Bacharach (2008)
- Text from source: "One type of validity evidence relates to the match between the actual content of a test and the content that should be included in the test. If a test is to be interpreted as a measure of a particular construct, then the content of the test should reflect the important facets of the construct" (p. 172, italics in original)
- Aspects of content addressed(a): 1. Content coverage

Haynes, Richard & Kubany (1995)
- Text from source: "Content validity is the degree to which elements of an assessment instrument are relevant to and representative of the targeted construct for a particular assessment purpose" (p. 238); "Carefully define the domain and facets of the construct and subject them to content validation before developing other elements of the assessment instrument" (p. 244); "Every element of an assessment instrument ... should be judged ... on applicable dimensions such as relevance, representativeness, specificity, and clarity" (p. 244)
- Aspects of content addressed(a): 1. Content domain definition; 2. Content relevance; 3. Content coverage; 4. Technical quality

Lynn (1986)
- Text from source: "Content validity is the determination of the content representativeness or content relevance of the elements/items of an instrument" (p. 382); "... the full content domain must be identified" (p. 383)
- Aspects of content addressed(a): 1. Content domain definition; 2. Content coverage; 3. Content relevance

Murphy & Davidshofer (2001)
- Text from source: "The basic procedure for assessing content validity consists of three steps: 1. Describe the content domain 2. Determine the areas of the content domain that are measured by each test item 3. Compare the structure of the test with the structure of the content domains" (p. 150)
- Aspects of content addressed(a): 1. Content domain definition; 2. Content coverage

Waltz, Strickland & Lenz (1991)
- Text from source: "The focus [for content validity] is on determining whether or not the items sampled for inclusion on the tool adequately represent the domain of content addressed by the instrument" (p. 172); "These experts are then asked to (1) link each objective with its representative item, (2) assess the relevancy of the items to the content addressed by the objectives, and (3) judge if they believe the items on the tool adequately represent the content or behaviors in the domain of interest" (pp. 172-173)
- Aspects of content addressed(a): 1. Content coverage; 2. Content relevance

(a) The cited texts do not necessarily refer to these aspects of content by these names; rather, this column represents an attempt to summarize information from a variety of sources using a common set of terms.

3. Chapter 3: Going Beyond the Numbers: Content Validation of a Quality of Life Measure Using Subject Matter Experts

Background

The association between homelessness and a range of negative physical, mental and social circumstances, including poor physical and mental health, substance abuse, high unemployment, and low income, has been well-documented (e.g., Aubry et al., 2003; Fazel, Khosla, Doll, & Geddes, 2008; Fischer & Breakey, 1991; Hwang, 2001; Long et al., 2007; Solliday-McRoy et al., 2004). Less is known about the quality of life (QoL), and in particular the subjective quality of life (SQoL), of individuals who are homeless or vulnerably housed (HVH).[11] In a recent review of the literature, Hubley et al. (2012) found that individuals who are homeless report lower SQoL compared to the general population, and also have low levels of satisfaction in many life areas such as safety, finances, and living situation. However, the authors also noted that these findings are based on a fairly small number of studies, and that the factors at play in the SQoL of individuals who are homeless are as yet poorly understood.

It is known that objective and subjective evaluations of life circumstances do not always correspond. In the health field, the phenomenon of individuals who experience severe disabilities but who nevertheless rate their QoL as high has been referred to as the "disability paradox" (Albrecht & Devlieger, 1999). Beyond the health field, the discrepancy between objective circumstances and subjective assessments of those same circumstances has been labeled the "happy poor and unhappy rich" phenomenon (Phillips, 2006).
The potential for such discrepancies has important implications for research, service provision, and policy initiatives aimed at individuals who are homeless or vulnerably housed (HVH), for it means that efforts to improve objective circumstances may not necessarily result in improvements in how people feel about their lives. Assessments of SQoL can provide information about the subjective impact of these efforts and, more generally, about people's feelings about their circumstances.

In order to measure the SQoL of individuals who are HVH effectively, however, it is first necessary to have a measurement tool that is appropriate for this population. A SQoL measure that was developed for the general population may not meet this requirement, as it will not reflect the unique circumstances entailed by homelessness and housing vulnerability, or be responsive to variations within this population (e.g., individuals who are living in shelters versus those in insecure housing, single individuals versus families).

[11] Much of the research that claims to report on the QoL of individuals who are homeless has focused on what is called health-related quality of life (HRQoL). These studies for the most part employed measures such as the SF-36 (e.g., Kertesz et al., 2005; Tsui et al., 2007), SF-12 (e.g., Savage et al., 2008), or EQ-5D (Sun et al., 2012) to measure QoL. However, these measures are more correctly labeled as measures of health status, rather than measures of QoL, and their narrow focus on health makes them unsuitable as measures of QoL more generally. In addition, they do not adequately capture the subjective aspects of QoL. This paper and the research presented therein are concerned with the broader concept of general SQoL.
Recognizing the need for a population-specific measure, a team of researchers recently developed the Quality of Life of Homeless and Hard-to-House Individuals (QoLHHI) Inventory (Hubley et al., 2009). This measure is grounded in the World Health Organization's definition of QoL as:

"Individuals' perception of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards and concerns. It is a broad ranging concept, incorporating in a complex way individuals' physical health, psychological state, level of independence, social relationships, personal beliefs and their relationships to salient features of the environment" ("The World Health Organization Quality of Life assessment (WHOQOL)," 1995, p. 1405, italics in original).

The QoLHHI is composed of multiple sections, each of which focuses on a different life area such as health, finances, and social support. There are two sets of items for each life area. One set is based on Michalos' Multiple Discrepancies Theory (Michalos, 1985). The other set of items measures the impact (negative, positive, or neutral) of each life area on the respondent. The life areas and content for each section were developed from focus groups conducted with 140 HVH individuals in four Canadian cities (Hubley et al., 2009; Palepu et al., 2012). The QoLHHI is intended to be administered via an interview (i.e., it is not self-administered). It may be used for multiple purposes including research, policy development, and to support the provision of services to individuals who are HVH and, because of this, the instrument is designed to be highly flexible. Sections can be administered or dropped as needed, depending on the purpose for which the instrument is used (Hubley et al., 2009).
The purpose of the present study was to examine validity evidence based on content for two of the most frequently-used sections of the QoLHHI: the Health Impact section and Living Conditions Impact section. Validity is "the degree to which evidence and theory support the interpretation of test scores entailed by proposed uses of tests" and is "the most fundamental consideration in developing and evaluating tests" (AERA et al., 1999, p. 9). A strong case for validity will rest on evidence from a number of sources, including the internal structure of an instrument, the response and scoring processes, relationships between the obtained scores and scores obtained from other instruments, theory that explains the observed scores, and the consequences that arise from the interpretation and use of those scores (AERA et al., 1999; Messick, 1995). Another potential source of validity evidence is the instrument's content. Although content by itself cannot guarantee validity, an instrument that contains well-designed and properly selected content has a greater chance of producing supportable score interpretations (Anastasi & Urbina, 1997; Osterlind, 1997). Content validation, the process of gathering and evaluating validity evidence based on instrument content, can therefore play an important role in building up a body of evidence to support the use of a measurement instrument. Specifically, content validation is concerned with collecting evidence of content representativeness (i.e., how well the instrument content covers all facets of the content domain) and content relevance (i.e., how well an element reflects the content domain or aspect of the content domain; Messick, 1989; Sireci, 1998b). The technical quality of the content is also important, because content that is flawed (e.g., unclear or overly difficult) can introduce variability into the scores that is unrelated to the construct of interest (Messick, 1989).

In the context of validation, "content" includes not only items, but also any other content elements that might affect scores. This might include such things as the item type(s), the response scale(s), the administration instructions, the sequence of presentation of the items, the timeframes referred to in the items, the administration mode, and scoring (Haynes et al., 1995; Netemeyer et al., 2003).

A common approach to content validation is to obtain feedback on instrument content from Subject Matter Experts (SMEs). These are individuals who have "worked extensively with the construct in question or related phenomena" (DeVellis, 1991, p. 75), and are often academic or clinical professionals. SMEs are asked to judge the quality of the elements of the instrument content using criteria that generally reflect some combination of content representativeness, content relevance, and technical quality. The samples for these studies tend to be fairly small; although there are no strict guidelines, the number of SMEs typically ranges from three to ten (Hubley & Palepu, 2007) and five is generally the recommended minimum (Lynn, 1986; Netemeyer et al., 2003).

For the study reported here, SMEs were asked to provide quantitative feedback on the items, response scales, administration instructions, scoring instructions, and other content elements from the QoLHHI Health and Living Conditions Impact sections. They were also asked to provide descriptive feedback in the form of explanations for their quantitative ratings, suggestions for improving the content elements, and suggestions for content to add to the instrument. The combined quantitative and descriptive data were then used to (a) describe the content-based evidence for these two sections of the QoLHHI and the implications of this evidence for validity, and (b) generate suggestions for revising the content in order to enhance validity.

Methods

Ethical approval.
This study was approved by the Behavioural Research Ethics Board of the University of British Columbia in Vancouver, Canada.

Recruitment. Subject Matter Experts (SMEs) were defined as researchers affiliated with a Canadian or American university or hospital with a minimum of 4 years of experience conducting research with individuals who are HVH. The sample was limited to researchers with Canadian or American experience due to concerns that regional or cultural differences in how homelessness is defined and addressed might make it difficult to interpret data from other countries. Researchers who had been directly involved in the development of the QoLHHI were excluded from participating in this study as SMEs. Potential SMEs were identified through the QoLHHI authors' professional contacts. Individuals who met the criteria of a SME were sent an email message outlining the study procedures and inviting them to contact the author of this dissertation (LR) if they were interested in participating. A copy of the informed consent form was attached to this initial email so potential SMEs could view it before deciding whether to participate.

Measures and materials. The SMEs provided feedback on content elements of the QoLHHI Health and Living Conditions Impact sections, an abridged version of the QoLHHI Administration and Scoring Manual, and the Impact Response Card. These materials are described in more detail below.

Health and Living Conditions Impact sections of the QoLHHI (Hubley et al., 2009). The Health Impact section of the QoLHHI consists of 39 items that assess 13 aspects of a respondent's health (i.e., physical health, mental or emotional health, quality of sleep, stress, physical activity, physical pain, emotional pain, alcohol use, marijuana use, street drug use, chronic illnesses, prescription medication, and special (e.g., medically indicated) diet).
The impact on the respondent of each aspect of health is rated on a 7-point scale with the response options "large negative impact", "moderate negative impact", "small negative impact", "no impact", "small positive impact", "moderate positive impact", and "large positive impact". Items requesting additional descriptive information about some of the health aspects provide context for the impact ratings. For example, before rating the impact of their current level of stress, respondents are asked to indicate whether their current stress level is low, medium or high. Due to response-dependent skip patterns, not all respondents answer all of the descriptive or impact items. In addition to items, the Health Impact section contains the following content elements that might have an impact on validity: an introduction (that is read out to the respondent by the interviewer), and administration instructions for the interviewer.

The Living Conditions Impact section of the QoLHHI assesses five aspects of a respondent's living conditions: the place where they live or stay, their neighbourhood, food, clothing, and personal hygiene. Due to the length of this section, only the place where you live or stay, neighbourhood, and food were included in the present study. Descriptive information about each aspect is gathered through a series of yes/no and open-ended questions (17 items for the place where you live or stay, 11 items for neighbourhood, and 8 items for food), and the impact of each aspect is rated on the same 7-point impact scale used for the Health Impact section. Like the Health Impact section, the Living Conditions Impact section contains introductions (one for each of the aspects) and instructions to the interviewer.

QoLHHI Response Card. A response card that depicts the 7-point impact scale in a visual format was developed to facilitate administration of the QoLHHI to individuals with lower literacy levels.
Negative impact is represented by minus signs and positive impact by plus signs, while the "no impact" option is represented by a zero. All symbols are depicted in a single line and increase in size and darken in shade to represent larger impact (see Figure A.1, p. 216).

QoLHHI Administration and Scoring Manual (Hubley et al., 2009). The manual describes the development and purpose of the QoLHHI and provides both general and section-specific administration instructions as well as scoring instructions. For this study, SMEs were provided with an abridged version that included general content and all content specific to the Health and Living Conditions Impact sections, but none of the content related to other sections of the QoLHHI.

Demographic questionnaire. SMEs were asked some basic demographic questions including gender, age, education, length of time involved in conducting research with individuals who are HVH, and experience with administering the QoLHHI.

Procedures. SMEs who agreed to participate in the study were given the choice of reviewing either one or both of the Health and Living Conditions Impact sections of the QoLHHI. They were sent a copy of the relevant materials by email. SME feedback was collected using forms developed specifically for this study and administered via an online survey hosted by FluidSurveys (http://fluidsurveys.com/). These forms included a brief description of the construct of SQoL that stressed its subjective and broad nature; the WHO definition was also available to SMEs as it is included in the QoLHHI manual. SMEs were asked to rate various elements of the QoLHHI Health and Living Conditions Impact sections for relevance, clarity, helpfulness, and ease of use. The forms also had space for comments and suggestions. The consent form was presented again at the beginning of the online feedback forms and SMEs indicated their consent to participate in the study at that point.
All SMEs were offered a CDN $10 gift card for the Starbucks Coffee Company for participating.

Quantitative feedback.

Relevance. The following elements were rated for relevance on a 4-point scale with the response options "not at all relevant", "somewhat relevant", "mostly relevant", and "completely relevant":

- Impact ratings. A sample impact question was presented (e.g., "Now I want to know about the kind of impact/effect that different aspects of your health have on you") and SMEs were asked to indicate the relevance of such impact ratings to measuring SQoL for individuals who are HVH. The focus was on the relevance of "impact", rather than on "health" or "living conditions"
- Each of the 13 aspects of Health (e.g., current level of physical health, chronic illnesses or conditions)
- Each of the 3 aspects of Living Conditions (i.e., the place where you live or stay, neighbourhood, and food) and 36 descriptive questions (e.g., "Do you feel that the place where you live or stay is affordable?")

Clarity. The following elements were rated for clarity on a 4-point scale with the response options "not at all clear", "somewhat clear", "mostly clear", and "very clear":

- The introduction to the Health Impact section
- All items from the Health Impact section
- The introductions in the Living Conditions Impact section (one introduction per aspect, i.e., the place where you live or stay, neighbourhood, food)
- All items from the Living Conditions Impact section
- The general and section-specific (i.e., Health Impact and Living Conditions Impact) administration instructions from the QoLHHI Administration and Scoring Manual
- The instructions and notes provided on the forms for the Health Impact and Living Conditions Impact sections

Ease of understanding. The following elements of the QoLHHI were rated for how understandable they are on a 4-point scale with the response options "not at all easy to understand", "somewhat easy to understand", "mostly easy to understand", and "very easy to understand":

- The Impact response scale
- The Yes/No response scale (used for the descriptive items from the Living Conditions Impact section)

Ease of use. Several elements of the QoLHHI were rated on a 4-point scale for how easy they are to use or apply, using the response options "not at all easy to use", "somewhat easy to use", "mostly easy to use", and "very easy to use" or "not at all easy to follow", "somewhat easy to follow", "mostly easy to follow", and "very easy to follow". The elements were:

- The Impact response scale
- The Yes/No response scale (used for the descriptive items in the Living Conditions Impact section)
- The scoring instructions for both the Health and Living Conditions Impact sections

Helpfulness. The helpfulness of the following elements was rated on a 4-point scale with the response options "not at all helpful", "somewhat helpful", "mostly helpful", and "very helpful":

- The general and section-specific (Health and Living Conditions Impact) administration instructions from the QoLHHI manual (helpfulness to the interviewer)
- The Impact Response Card (helpfulness to respondents to the QoLHHI)

Skip patterns in the Health Impact section. Certain items from the Health Impact section are meant to be skipped if a respondent indicates that they have never applied or do not currently apply to him or her; for example, if the respondent says that he or she does not have any chronic illnesses or conditions or has never used street drugs. The present study presented an opportunity to assess if this is indeed the best approach, or if it would be valuable to ask all respondents to answer these impact questions (e.g., a respondent might indicate that not having any illnesses or conditions has a positive impact).
The SMEs were asked to rate the skip patterns for the following impact items: experience of physical pain, experience of emotional pain, use of alcohol, use of marijuana, use of street drugs, and having one or more chronic illnesses or conditions. The response scale was a 3-point scale with the options "should not be skipped", "unsure if should be skipped", and "should be skipped".

Descriptive feedback. Descriptive feedback in the form of explanations, comments, and suggestions was collected in three ways:

- For all quantitative assessments, SMEs were asked to explain any rating of less than "completely relevant" or "very" clear, helpful, etc.
- SMEs were asked which of two words, "impact" or "effect", they felt is more understandable, and why. The instructions for administering the QoLHHI suggest that either of these words may be used, at the interviewer's discretion
- SMEs were asked to list any additional aspects of Health or Living Conditions that they felt should be added to the QoLHHI in order to measure SQoL

Analyses.

Quantitative data. All quantitative analyses were conducted using Microsoft Excel 2010. The SMEs' ratings were assigned numeric values, with ratings of "not at all" relevant, clear, etc. coded as 1, "somewhat" relevant, clear, etc. coded as 2, "mostly" relevant, clear, etc. coded as 3, and "completely relevant" or "very" clear, etc. coded as 4. For the skip patterns from the Health Impact section, ratings of "should not be skipped" were coded as 1, ratings of "unsure if should be skipped" were coded as 2, and ratings of "should be skipped" were coded as 3. The scores were then used to compute a Content Validity Index (Lynn, 1986) for each element and for several scales.

The CVI measures the proportion of experts who endorse an element (e.g., rate it as relevant or clear). It is computed by dividing the number of experts who endorsed an element by the total number of experts who rated that element.
Following the approach proposed by Lynn (1986), the 4-point rating scales in the present study were collapsed into two categories, with ratings of 4 and 3 treated as endorsements and ratings of 2 and 1 treated as non-endorsements. For the skip patterns for the Health Impact section, which were rated on a 3-point scale, only ratings of 3 were treated as endorsements.

Once a CVI has been computed, it is necessary to establish whether it indicates an acceptable level of endorsement for that element at the sample level. Lynn (1986) proposed cut-offs based on the standard error of proportions, but the use of cut-offs based on inferential statistics can be problematic with the small sample sizes that are typical of content validation studies (Beckstead, 2009). Alternatively, a minimum level for acceptable endorsement can be set a priori. Some suggestions for minimum acceptable scores in the literature include 0.70 (House, House, & Campbell, 1981) and 0.80 (Waltz et al., 1991), but such rules of thumb are often applied with little justification or reference to the context of a particular study. For the present study, several factors were considered in establishing a minimum level for the CVIs for elements (E-CVIs): the probable sample size, the level of endorsement that could reasonably be expected for that sample size, and the minimum level of endorsement that would represent an acceptable level of rigour for the sample. Because it was not known in advance how many SMEs would participate, the estimated sample size was based on the generally accepted minimum of five SMEs for a content validation study. It seemed unrealistic to expect 100% endorsement from five SMEs for all elements, but in order to maintain a reasonable level of rigour, it was decided that only one non-endorsement (i.e., one score of 1 or 2) per element would be allowed. This set 0.80 (i.e., 4/5 endorsements) as the minimum acceptable level for the E-CVIs.
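The dichotomization of ratings, the E-CVI itself, and the scale-level S-CVI/Ave can be sketched in code as follows. This is a minimal illustration only: the study's analyses were carried out in Microsoft Excel, and the function names here are hypothetical, not part of the QoLHHI materials.

```python
def e_cvi(ratings, endorse_min=3):
    """Element-level CVI: the proportion of raters whose rating counts as
    an endorsement. On the 4-point scales, ratings of 3 and 4 endorse; on
    the 3-point skip-pattern scale only a rating of 3 does, so a >= 3
    threshold covers both cases."""
    return sum(r >= endorse_min for r in ratings) / len(ratings)

def classify(ratings, cutoff=0.80, top=4):
    """Sort an element into the study's three categories: a Perfect E-CVI
    (all raters gave the top rating), an acceptable E-CVI (>= 0.80), or a
    non-endorsed E-CVI (below 0.80)."""
    if all(r == top for r in ratings):
        return "perfect"
    return "acceptable" if e_cvi(ratings) >= cutoff else "non-endorsed"

def s_cvi_ave(element_ratings):
    """Scale-level CVI (S-CVI/Ave): the mean of the element E-CVIs."""
    cvis = [e_cvi(r) for r in element_ratings]
    return sum(cvis) / len(cvis)

# Five raters with one non-endorsement: 4/5 = 0.80, the study minimum.
print(e_cvi([4, 3, 4, 2, 4]))      # 0.8
print(classify([4, 3, 4, 2, 4]))   # acceptable
```

Note that with the default threshold of 3, the same `e_cvi` function handles both the 4-point quality ratings and the 3-point skip-pattern ratings, since on the latter scale only the maximum value of 3 counts as an endorsement.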
Although a single non-endorsement actually produces E-CVIs above 0.80 as the sample size increases (e.g., 5/6 endorsements results in a CVI of 0.83), in terms of actual raters, the 0.80 cut-off does not allow for more than one non-endorsement for sample sizes of up to 9 (e.g., two non-endorsements at 7/9 results in a CVI of 0.78, below the cut-off).

In addition to the standard E-CVI, a score called the Perfect CVI was established for this study. This is an E-CVI of 1.00 obtained when all raters assign the highest possible rating to an element (e.g., all ratings of 4 on the 4-point scales). The obtained E-CVIs could therefore fall into one of three categories: a Perfect E-CVI, an acceptable E-CVI (0.80 or higher and still indicative of endorsement), or an unacceptable E-CVI below 0.80 (indicative of non-endorsement). Any individual element that did not obtain a minimum E-CVI of 0.80 was flagged as problematic and potentially in need of revision or perhaps even removal from the QoLHHI.

In addition to the E-CVIs, it is also possible to compute a scale-level CVI (Lynn, 1986; Polit & Beck, 2006). Polit and Beck (2006) noted that a scale-level CVI can be calculated in a number of different ways, but recommended using the average of the CVIs of individual elements (what they refer to as the S-CVI/Ave) and setting 0.90 as the minimum acceptable score. For the present study, the following elements were treated as "scales" and an S-CVI/Ave computed: (a) the relevance of the 13 aspects of Health, (b) the relevance of the three aspects of Living Conditions, (c) the relevance of the descriptive questions for the Place Where You Live or Stay, Neighbourhood, and Food (as three separate scales and as a total scale), (d) the clarity of all of the items of the Health Impact section, and (e) the clarity of the items for the place where you live or stay, neighbourhood, and food in the Living Conditions Impact section (as three separate scales and as a total scale).

Descriptive data. The SMEs'
descriptive feedback in the form of comments and suggestions for all non-endorsed elements was reviewed and common themes were identified. This information was used to (a) gain a better understanding of the reasons why an element was not endorsed, and (b) develop recommendations for making changes to the QoLHHI content in order to increase the likelihood that it will produce valid score inferences. It was recognized that endorsed elements might also benefit from revision; therefore, the descriptive feedback for these elements was reviewed as well. Analyses of the descriptive data were conducted using ATLAS.ti (version 6.2).

Results

Participants. A total of eleven SMEs took part in the study; six were female, four were male, and one did not provide gender information. The average age was 48.27 years (SD = 9.12 years, range = 35-60 years). Six SMEs had PhDs, three were MDs (two of whom also had a Master's degree in Public Health), one had a Master's degree, and one had completed secondary school. The SMEs had been involved in research with individuals who are HVH for an average of 12 years (SD = 5.44 years, range = 5-20 years), and two had prior experience administering the Living Conditions Impact and the Health Impact sections of the QoLHHI in a research setting.

Four SMEs reviewed the Health Impact section, four reviewed the Living Conditions Impact section, and three reviewed both sections of the QoLHHI. One SME who rated the Health Impact section and one SME who rated the Living Conditions Impact section did not complete the entire rating survey. There were some additional missing data as SMEs skipped some ratings for unknown reasons; however, there were no discernible patterns in these data (e.g., there was never more than one SME who missed rating a particular element).

Quantitative results. The CVIs for all content elements and scales are presented in Tables 3.1-3.7 (pp. 75-89).
Of the 174 elements rated by the SMEs, 156 (90%) were endorsed, including 54 (31%) with Perfect E-CVIs. Only 18 elements (10%) did not obtain the minimum acceptable E-CVI of 0.80. In addition, all nine of the scale-level CVIs (S-CVI/Ave) were above the minimum recommended level of 0.90.

Non-endorsed elements. While impact ratings themselves were endorsed as relevant, the Impact Response Scale was not endorsed either for how easy it is to understand or for how easy it is to use/apply (see Table 3.1, p. 75). The following seven elements of the Health Impact section were not endorsed for their clarity: the overall introduction to the section (which describes impact ratings); the impact items for physical activity or exercise, quality of sleep, and physical pain; the items for following and for not following a special diet; and the skip instructions attached to the item "Are you taking this medication?" (see Table 3.3, p. 77 and Table 3.7, p. 89). The nine non-endorsed elements of the Living Conditions Impact section were the relevance of "neighbourhood" and of the item asking about the disruptiveness of neighbours; the relevance and clarity of the items asking about feeling like part of the community in your neighbourhood, feeling stuck in your neighbourhood, and whether the food you eat is nutritious; and the clarity of the impact item for the place where you live or stay (which also serves to introduce the idea of "impact" for the Living Conditions Impact section; see Tables 3.5 and 3.6, pp. 82-85).

Descriptive feedback. Since the SMEs had been asked to provide an explanation for any rating that was less than "completely relevant" or less than "very" clear, easy to understand, etc., there were numerous comments and suggestions for most of the non-endorsed elements. Because many endorsed elements had been given less-than-perfect ratings by at least some SMEs, these elements also had descriptive feedback attached.
In addition, many SMEs commented on elements to which they had given the highest possible rating. As a result, there were few elements for which there was no descriptive feedback at all.

Descriptive feedback on non-endorsed elements. Most of the descriptive feedback on non-endorsed elements could be grouped into the following themes: (a) words, terms, and items that should be clarified or simplified; (b) items or elements with too narrow a focus; (c) high cognitive demand; (d) questions that people cannot really answer; and (e) limited relevance due either to the HVH context or to diversity within the HVH population. There were also some comments and suggestions that were very specific to particular elements. Some of the descriptive feedback on the clarity of health impact items focused primarily on the "impact" aspect of these items; this feedback will be addressed together with other SME comments on impact in the discussion of endorsed elements.

Words, terms, and items that should be clarified or simplified. The SMEs singled out a number of words and terms that they felt should be clarified, including "quality" of sleep, "community", "feeling stuck" in your neighbourhood, and "nutritious". In some cases, SMEs felt that entire items, such as "Are other people living or staying there too disruptive?" and "I'd like you to rate the kind of impact that having/no longer having physical pain has on you", could be made clearer. The introduction to the Health Impact section and the impact item for the "place where you live or stay" were also singled out as needing to be simplified and shortened. Because these last two elements incorporate an introduction to the idea of rating impact, they are quite long and include a number of different concepts, including impact itself, direction (negative and positive impact), intensity (small, medium, and large impact), and outcome (making things better or worse for the respondent).
Several SMEs felt that it would be difficult for respondents to keep all of these different ideas clear in their minds. In many cases, SMEs offered specific suggestions for improving the non-endorsed elements through changes in wording or by adding examples. For example, the item asking about nutritious food could be reworded with terms such as "healthy" and "good for you" instead of "nutritious", or by asking about access to vegetables, fruits, and grains. For the term feeling "stuck" in your neighbourhood, SMEs suggested adding probes around affordability and services (which might affect someone's ability to move out of the neighbourhood), and clarifying whether "stuck" means emotionally or physically trapped. SMEs also made suggestions for rewording entire items; for example, instead of asking "Are other people living or staying there too disruptive?", the item could be changed to "Do other people living or staying there bother or disturb you?". The introduction to the Health Impact section could be shortened by cutting down the examples of negative, positive, and neutral impact.

Items or elements with too narrow a focus. For the item asking about "quality of sleep", it was noted that quantity of sleep is also important, particularly if the quality is poor. The focus on quality alone might therefore be too narrow.

High cognitive demand. When rating the impact of physical pain, respondents rate either the impact of pain that they are experiencing in the present, or the impact of the absence of pain that they used to experience in the past. Several SMEs commented that the thought processes required to rate the impact of past pain (i.e., I had pain in the past; I no longer have pain; what is the impact of no longer having pain?) might be too challenging for some respondents.

Questions that people cannot really answer.
For several items dealing with food (how nutritious the food is and the impact of following or not following a special diet), it was suggested that respondents simply would not have the information they would need to answer these questions (i.e., they would not know the exact nutritional content of their food, and would not be in a position to judge whether the diet is having an effect or not).

Limited relevance due either to the HVH context or to diversity within the HVH population. A number of the SMEs noted that the overall concept of "neighbourhood" can be problematic for some individuals who are HVH, because those who are transient may not have a strong sense of their neighbourhood. In addition, individuals who are HVH often do not make clear distinctions between "neighbourhood", "the place where you live or stay", and "community". These reservations about the term "neighbourhood" affected the perceived relevance of a number of both endorsed and non-endorsed elements, such as neighbourhood safety and feeling like part of the community in the neighbourhood.

The relevance of "neighbourhood" was questioned because of the specific context of being HVH. In contrast, the relevance of asking whether other people where you are living or staying are too disruptive was seen as limited by the diversity within the HVH population, as some SMEs thought this item would not be relevant to respondents who are living alone.

Comments specific to individual elements. One SME commented that the impact item for following a special diet is only clear if the respondent is fully (not just partially) following the diet.

The Impact Response Scale was not endorsed either for how easy it is to understand or for how easy it is to use/apply. Some SME comments on the scale focused on concerns with impact ratings; these are addressed together with other comments about impact under "endorsed elements".
Other comments about the response scale suggested that a 7-point scale might be difficult for respondents to use (one SME proposed using a 5-point scale instead). Opinions on the scale gradations of "small", "moderate", and "large" were mixed, with some SMEs indicating that these are basically straightforward, but others commenting that they are difficult for some respondents to distinguish. Several SMEs suggested changing the word "moderate" (e.g., to "somewhat"), and one SME noted that the alternate wording suggestions for the response options that are provided in the QoLHHI Administration and Scoring Manual ("a lot better", "a lot worse") might be easier for respondents to understand, and could be incorporated directly into the QoLHHI response scale.

The final element that was not endorsed was the instructions for when to skip the impact items for taking/not taking prescribed medication. The SMEs who thought that these instructions are unclear did not explain their ratings, but one possible explanation lies in the way the QoLHHI items are labeled on the Health Impact section forms. These labels use the letters of the alphabet; however, there are more items (28) in this section than there are letters. Therefore, the final items in the section are labeled AA and BB. The skip pattern instructions that were flagged as unclear refer to both single-letter and double-letter labels (e.g., "Go to Z but skip AA and BB"). It is possible that some SMEs felt that it is not clear that AA and BB refer to the items following item Z (rather than, for example, sub-items under items A and B).

Endorsed elements.
The SME comments on the endorsed elements fell under many of the same themes as those for the non-endorsed elements, including (a) words, terms, and items that should be clarified or simplified; (b) items or elements with too narrow a focus; (c) high cognitive demand; (d) questions that people cannot really answer; and (e) limited relevance due either to the HVH context or to diversity within the HVH population. Additional themes for the endorsed elements included: (f) items or elements that focus on the wrong thing; (g) problems and suggestions related to administration; (h) need to clarify the purpose of the QoLHHI; and (i) problems with impact ratings.

Words, terms, and items that should be clarified or simplified. As with the non-endorsed elements, the SMEs singled out a number of words, terms, or items in need of clarification. For example, "physical health", "mental or emotional health", "stress", "amenities", having "control" over your own space, and "restrictions" are all words or terms that might not be clear to all respondents. In some cases, SMEs provided suggestions for examples that could be added to items, such as "mood", "anxiety", and "how you feel about yourself" for mental or emotional health, or "having a key for your room" and "being able to come and go as you please" for having control of your space.

Several SMEs commented that it is not clear whether occasional or social substance use should be considered "current use" for the items "Do you currently drink alcohol?", "Do you currently use marijuana?", and "Do you currently use other street drugs?". For chronic illnesses and conditions, it was noted that it is not clear whether this includes only conditions that have been diagnosed by a health professional, or also those that are self-diagnosed. Similarly, it is not clear whether being "supposed to" follow a special diet for health reasons refers only to recommendations from a health professional.
Some SMEs felt that the non-specific timeframes employed in the QoLHHI, such as "lately" and "currently", should be replaced with more specific ones (e.g., the last 30 days).

Items or elements with too narrow a focus. Although the aspect of special (medically indicated) diet was endorsed, it was noted that diet more generally and food security are also important. This suggests that the focus on special diet alone may be too narrow.

High cognitive demand. For the impact item for emotional pain, respondents rate either the impact of emotional pain that they are experiencing in the present, or the impact of no longer experiencing emotional pain that they once had. This is similar to the impact ratings for physical pain and, as with physical pain, SMEs noted that it might be challenging to rate the impact of the absence of emotional pain.

The response scale for the descriptive items from the Living Conditions Impact section has three response options: "yes", "no", and "yes/no". This last option is intended to cover situations where the response might be mixed. For example, when asked if they are able to get enough to eat, respondents might say "yes" for weekdays when more free meal services are available, but "no" for weekends and holidays. In this case, a "yes/no" response option would be recorded. One SME commented that this mixed (yes/no) response option will require judgments that are too difficult for respondents to make in some cases.

Questions that people cannot really answer. Much as with asking about the impact of following or not following a special diet, it was felt that respondents might not know whether taking or not taking medication is having an impact, unless the medication has obvious psychoactive effects.

Limited relevance due either to the HVH context or to diversity within the HVH population. SMEs noted that some words and concepts may have a particular meaning for individuals who are HVH.
Elements that use these terms might be less relevant, or may have a different meaning, than if they were used in a measure intended for the general population. Examples include asking if someone's housing is "affordable" (which may mean something very different for individuals who are HVH compared to the general population), "stress" (which may be so normalized for individuals who are HVH that they will specifically need to be reminded to think of a time when they experienced more or less stress), "exercise or physical activity" (individuals who are HVH may be quite active, but are unlikely to engage in structured exercise such as working out at a gym), and "special diet" (individuals experiencing psychosis may have unfounded beliefs around diet and food).

As with some non-endorsed elements, the relevance of certain endorsed elements was seen as limited to a subset of respondents within the HVH population. For example, the affordability of the place where you live or stay was not seen as relevant to people who are homeless, while asking about the safety and cleanliness of bathing facilities would not be relevant for people who do not have to share these facilities with others.

Items or elements that focus on the wrong thing. Comments on some elements suggested that SMEs felt the focus of the element should be shifted. For example, one SME noted that, in asking about "bad influences" in the neighbourhood, the examples used (drugs, crime) are things that might not necessarily influence respondents, even if they are unpleasant to witness. Instead, respondents could be asked if things like drugs and crime are present in their neighbourhood and, if so, what effect they have (e.g., feeling afraid to go out alone).

Another example is the administration guidelines in the QoLHHI Administration and Scoring Manual. For the most part, these were seen as clear and helpful.
However, one SME commented that such manuals are often awkward in how they present suggestions for talking to respondents (the QoLHHI manual discusses topics such as seating arrangements, dealing with respondent fatigue, and asking sensitive questions). Instead, the most important consideration in training interviewers should be comfort with, and understanding of, the instrument's items.

Comments and suggestions related to administration. Some of the administration elements, particularly the skip patterns for several items, were considered confusing. One SME suggested using arrows or similar visuals to illustrate the skip patterns. Other suggestions for administration elements included highlighting parts of the scoring instructions using bold font, providing an example of completed score calculations, and creating a response card for the yes/no scale (as it is difficult for some respondents to keep even just three response options in mind). Although most SMEs thought that the Impact Response Card is helpful, it was noted that using "smiley" and "frowny" faces might be preferable to the current plus and minus symbols.

Need to clarify the purpose of the QoLHHI. Some SMEs were concerned that certain terms are very abstract (e.g., "home") or open to varying interpretations (e.g., "stress", "restrictions"). Several SMEs were also concerned that respondents may incorrectly self-diagnose that they have a chronic illness or condition, or mislabel an illness or condition (e.g., call any breathing problem "asthma"). Comments such as these suggest that the purpose of the QoLHHI, namely to measure SQoL, may not be entirely clear. (It is also possible that the SMEs taking part in this study may not have fully understood the construct of SQoL, since they were not necessarily experts in QoL, or they may have equated SQoL with health-related quality of life.) Given the focus on impact, rather than on
objective circumstances, it does not in fact matter if two respondents have different definitions of "restrictions" or whether an illness is identified correctly. What is important is the self-perceived impact of these restrictions or illnesses on the respondent.

Problems with impact ratings. Many of the SME comments regarding impact ratings fall under the theme of high cognitive demand but, as impact is so central to the QoLHHI, they (and other comments related to impact) merit a separate discussion.

Although impact ratings were endorsed overall, many of the SMEs were concerned that the concept of impact is too abstract, and that it will be difficult for respondents to translate their concrete experiences into ratings of impact. Another potential problem with impact ratings is that the impact that something has could be different for the physical, emotional, mental, social, and other aspects of a person. Other comments included that the impact that something has should generally be straightforward (e.g., poor health will have a negative impact, low stress will have a positive impact), and that it therefore seems strange to even ask the impact questions. SMEs suggested that it might be easier and more intuitive for people to instead rate states (e.g., rate their health as poor, fair, good, etc.), or even to rate the importance, rather than the impact, of the various aspects of life areas.

As noted previously, some of the descriptive feedback on the non-endorsed elements focused primarily on impact, rather than on the non-impact aspects of the element (e.g., almost all of the comments on the clarity of the impact item for stress were about "impact", not "stress"; this pattern was repeated for a number of other impact items). It is therefore possible that concerns with the concept of impact were at least partly responsible for the lower E-CVIs for some of the non-endorsed elements.

Word preference: impact or effect. When asked which word, "impact"
or "effect", they felt is more understandable, six of 11 SMEs indicated "effect". The most frequent explanation for this was that "effect" is a more everyday word that individuals who are HVH will find easier to understand. One SME also commented that "impact" has negative connotations and might therefore bias responses. Two SMEs thought that "impact" and "effect" are equally understandable, and that each might work better with different respondents. Three SMEs thought that neither word is understandable, because of the complex and abstract idea they represent.

Topics to add to the QoLHHI. SME suggestions for topics to add to the Health Impact section included social housing, housing quality, HIV and Hepatitis B and C status, interference in functioning from psychiatric symptoms (given that the term "mental or emotional health" may not address all aspects of mental health), the type and ingestion mode for alcohol and drugs, and whether physical or mental health was a cause of homelessness or was made worse by homelessness. Suggestions for topics to add to the Living Conditions Impact section included the types of resources and supports available in the neighbourhood and whether these are accessible, social supports, social interactions, and pest infestations.

Discussion

Implications for the QoLHHI. The results from this study indicate that the content of the QoLHHI Health and Living Conditions Impact sections is, overall, relevant to measuring the SQoL of individuals who are HVH. In addition, this content is for the most part clear, helpful, and easy to apply. In terms of the quantitative feedback, 90% of the E-CVIs were at or above the 0.80 cut-off for individual elements and 100% of the S-CVIs were at or above the 0.90 cut-off for scales. The SMEs endorsed 100% of the Health Impact section elements for relevance and 85% for clarity. They endorsed 87% of the Living Conditions Impact section elements for relevance, and 90% for clarity.
Ninety-one percent of the remaining elements (administration guidelines, skip instructions, response scales, etc.) were endorsed.

Based on the descriptive feedback on the non-endorsed elements, the majority of the SME concerns with these elements that are reflected in the quantitative ratings can be addressed effectively. The comments on the endorsed elements show that many of these could also be improved by making revisions to their content but, again, most of the concerns raised by the SMEs will be straightforward to address. The types of revisions that are recommended include: clarifying concepts, defining words and timeframes more precisely, adding examples, simplifying item language, ensuring that items and elements reflect the HVH context and target population, clarifying certain administration elements, making the purpose of the QoLHHI (i.e., to measure SQoL) clearer, and revisiting the concept of impact and how it is presented in the QoLHHI. This last is particularly important because the idea of impact is so central to the Impact sections of the QoLHHI.

The SMEs had relatively few recommendations for topics to add to the Health and Living Conditions Impact sections of the QoLHHI, suggesting that content representativeness is generally adequate. Of the topics that were recommended, several would be very relevant to the SQoL of individuals who are HVH (e.g., access to resources, housing quality, social supports). Some of these topics should be considered as possible additions to the QoLHHI, while others are already addressed in other sections of the instrument. For example, housing quality (which was recommended as a topic to add to the Health Impact section, but without any specific reference to the impact of housing on health) is addressed in detail in the Living Conditions Impact section, and there is an entire section of the QoLHHI devoted to social supports (which was recommended as a topic for the Living Conditions Impact section).
Other suggested additions to the QoLHHI, such as the type(s) of substances used and ingestion mode(s) for alcohol and drugs, are not relevant to measuring SQoL specifically, and so could be left out of the instrument without threatening validity.

Implications for content validation. Content validation studies employing SMEs tend to focus on quantitative data. Although many studies do include opportunities for SMEs to provide descriptive feedback on the instrument's content elements, this is rarely where the emphasis is placed. In some cases, descriptive feedback is entirely optional. Yet it is in the descriptive feedback that the real value of SMEs is to be found. The quantitative ratings, and the characterization of elements as either endorsed or non-endorsed, provide an overall summary of content representativeness, relevance, and quality. But the descriptive feedback is what helps to (a) explain those ratings, and (b) identify where any problems in the content lie and how they can be fixed.

These problems are not necessarily confined to non-endorsed elements. Indeed, in the present study, there was nothing that clearly distinguished the descriptive feedback on endorsed elements from that on non-endorsed elements. There was significant overlap in the themes of the descriptive feedback between the two groups of elements and, though there were more themes for the endorsed elements, this is not surprising given that the endorsed elements outnumbered the non-endorsed ones by a ratio of about 9 to 1. In some cases, the comments on endorsed elements revealed very important concerns. The most striking example of this is the concept of impact ratings. Though the relevance of these ratings was endorsed by the SMEs overall, the comments on the relevance of impact, as well as on several individual impact items and elements, clearly indicate that the concept of impact and its execution within the QoLHHI was considered problematic by many of the SMEs.
Given that the impact ratings form the core of the Impact sections of the QoLHHI, it is essential that the SMEs' concerns be addressed.

Sometimes, a single SME will raise a concern or make a suggestion or comment that is at odds with other SMEs. This dissenting voice can provide valuable information, such as when it suggests an interpretation of an item's wording that the instrument authors had not anticipated. This was the case with the item about "bad influences" in the neighbourhood. The QoLHHI authors had intended for this item to refer specifically to things that affect a respondent in a negative way, hence the use of the words "bad influences there for you" (emphasis added). As described in the discussion of items or elements that focus on the wrong thing under "endorsed elements", one SME commented that the examples of bad influences (drugs, crime) are things that may not in fact have any effect on a respondent. This suggests that the intent of the item (as reflected in the words "for you") may be lost. If the intent is unclear, the item will be difficult to answer. Although only one SME raised this possibility, consideration should still be given to whether the item can be reworded in such a way as to clarify its intended meaning.

SMEs are chosen for a content validation study for the experience, insight, and knowledge they bring to the task of evaluating the content of an instrument. Focusing on quantitative evaluations of item content, particularly if these evaluations are analyzed using inferential approaches, would be appropriate if the sample of SMEs were meant to be representative of all SMEs. But given that the true value of their expertise cannot be captured through numbers alone, it is more appropriate, and more effective, to view the sample of SMEs as similar to an advisory board. Seen in this light, all of the experts' feedback, but especially their more detailed descriptive feedback, is important.
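The quantitative ratings discussed above are Content Validity Indices (CVIs). As a rough illustration of the arithmetic only (the ratings below are hypothetical, and the exact rating scale and endorsement cutoff used in this study may differ), an item-level CVI (I-CVI) is the proportion of experts rating an element favourably, for example 3 or 4 on a 4-point scale, and a scale-level average (S-CVI/Ave) is the mean of the item-level CVIs:

```python
# Illustrative CVI computation with hypothetical ratings (not the study's data).
# Each element is rated by each expert on a 4-point scale; ratings of 3 or 4
# are counted as endorsing the element.

def item_cvi(ratings, cutoff=3):
    """I-CVI: proportion of experts rating an element at or above the cutoff."""
    return sum(r >= cutoff for r in ratings) / len(ratings)

def scale_cvi_ave(ratings_by_element):
    """S-CVI/Ave: mean of the item-level CVIs across all elements."""
    cvis = [item_cvi(r) for r in ratings_by_element]
    return sum(cvis) / len(cvis)

# Hypothetical panel of 11 experts rating three elements:
elements = [
    [4, 4, 3, 4, 3, 4, 4, 3, 4, 4, 4],  # all 11 endorse: I-CVI = 1.00
    [4, 3, 4, 2, 4, 3, 4, 4, 2, 4, 3],  # 9 of 11 endorse: I-CVI = 0.82
    [3, 2, 4, 2, 3, 2, 4, 3, 2, 2, 4],  # 6 of 11 endorse: I-CVI = 0.55
]

for e in elements:
    print(round(item_cvi(e), 2))
print(round(scale_cvi_ave(elements), 2))
```

With 11 raters, a reported CVI of 0.82 corresponds to 9 of 11 experts endorsing the element, which is how values such as those in the tables below can arise.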
Conclusion, Study Limitations, and Future Directions

The findings of this content validation study are generally supportive of the quality of the content of the Health and Living Conditions Impact sections of the QoLHHI. Although it is clear that improvements can be made, most of the content was deemed relevant, clear, helpful, easy to use, etc. by the sample of 11 SMEs, and the majority of their concerns about the instrument's content can be addressed. This study therefore represents one piece of evidence to support the use of the QoLHHI. This study also highlights the value of descriptive feedback in content validation. Such feedback is often treated as a supplement to quantitative ratings of instrument content but, in fact, it is a valuable form of data in its own right.

The intended purpose for which an instrument is used is an important consideration for validity. Validity evidence, whether based on content or on some other source, may only be applicable to certain contexts or intended uses (AERA et al., 1999; Sireci, 1998b). The SMEs who took part in the research described here had all conducted research with individuals who are HVH, and were thus qualified to assess the QoLHHI as a research instrument. It cannot be assumed, however, that their feedback on the QoLHHI content will be applicable to other potential uses of the instrument, such as service provision, or program or policy evaluation. Similarly, the feedback obtained here may not be relevant to research conducted outside of Canada and the United States. As previously noted, the sample was deliberately limited to researchers from these two countries in order to avoid the potentially confounding influence of different definitions and approaches to homelessness globally. There is wide variability in the way in which homelessness is defined both within and across countries (e.g., Busch-Geertsema, 2010; Cordray & Pion, 1991; Gabbard et al., 2007).
Although the content of the QoLHHI was generally deemed to be relevant and representative for the purposes of measuring the SQoL of individuals who are HVH by the SMEs who participated in this study, the extent to which this would be the case if the instrument were to be used outside of Canada or the United States is unknown. Further validation work would need to be undertaken before the QoLHHI could be used for research in other countries.

After the recommended revisions have been made to the QoLHHI content, further content validation work should be carried out in order to ensure that the SMEs' concerns have been addressed. Other types of SMEs could be included in these studies in order to address the question of validity when the QoLHHI is used for non-research purposes. Additional content validation should also be carried out with other types of experts. Although SMEs are the most frequent kind of expert included in content validation studies, other groups can also provide valuable insights on instrument content. The potential value of seeking feedback from individuals who are HVH should be obvious, given the many unique circumstances faced by the QoLHHI's target population. In addition, the SMEs who participated in the study reported here were concerned that some of the concepts and language employed in the QoLHHI would be difficult for the instrument's target population to understand. One way to determine if these concerns are justified is to seek feedback directly from members of that population. Another potentially valuable group of experts would be individuals who have administered the QoLHHI. This group would be particularly qualified to comment on the administration elements of the QoLHHI, such as administration and scoring instructions. Content validation with different types of experts will also provide opportunities to investigate further the contribution that experts'
descriptive feedback, not just their quantitative feedback, can make to a content validation study.

Table 3.1: Content Validity Indices (CVI) for Impact Elements

Element | CVI
How relevant do you think impact/effect ratings are to measuring quality of life for individuals who are homeless or vulnerably housed? | 0.82
How easy to understand is the impact response scale for the impact questions? | 0.70
How easy to use (apply) is the impact response scale for the impact questions? | 0.73
How helpful is the QoLHHI Impact Response Card? | 0.90
Note: Elements that were not endorsed are in bold font.

Table 3.2: Content Validity Indices (CVI) for the Relevance of Aspects of Health

Element | CVI
Current level of physical health | Perfect
Current level of mental or emotional health | Perfect
Current level of physical activity or exercise | 1.00
Quality of sleep | 1.00
Current level of stress | 0.86
Physical pain | Perfect
Emotional pain | Perfect
Using alcohol | 1.00
Using pot/marijuana | 1.00
Using street drugs | 1.00
Chronic illnesses or conditions | 0.86
Special (medically recommended) diet | 1.00
Prescription medication | 1.00
Relevance of all aspects of health (S-CVI-Ave) | 0.98

Table 3.3: Content Validity Indices (CVI) for the Clarity of the Health Impact Section Items

Element | CVI
Now I want to know about the kind of impact/effect that different aspects of your health have on you. You could tell me, for example, that your physical health has no impact on you at all. Or you could say that it has a positive impact/effect and makes things better for you. Or, maybe it has a negative impact/effect and makes things worse for you | 0.71
I'd like you to rate the kind of impact/effect that your current level of physical health has on you | 0.86
I'd like you to rate the kind of impact/effect that your current level of mental or emotional health has on you | 0.86
I'd like you to rate the kind of impact/effect that your current level of physical activity or exercise has on you | 0.57
I'd like you to rate the kind of impact/effect that the quality of sleep that you've been getting lately has on you | 0.71
Would you describe your current level of stress as low, medium, or high? | 1.00
Given your (low/medium/high) stress level, I'd like you to rate the kind of impact/effect that this has on you | 0.86
Have you been experiencing physical pain lately? | Perfect
I'd like you to rate the kind of impact/effect that (having/no longer having) physical pain has on you | 0.71
Have you been experiencing emotional pain lately? | 1.00
I'd like you to rate the kind of impact/effect that (having/no longer having) emotional pain has on you | 0.83
Do you currently drink alcohol? | 1.00
I'd like you to rate the kind of impact/effect that (drinking/no longer drinking) has on you | 0.86
Do you currently use pot (marijuana)? | 1.00
I'd like you to rate the kind of impact/effect that (using/no longer using) pot has on you | 0.86
Do you currently use other street drugs, such as cocaine, heroin, or crystal meth for example? | 1.00
I'd like you to rate the kind of impact/effect that (using/no longer using) street drugs has on you | 0.86
Do you have one or more chronic illnesses or conditions (for example: diabetes, allergies, a disability, hepatitis)? | 1.00
I'd like you to rate the kind of impact/effect that this has on you | 0.86
Are you supposed to follow a special diet because of a health condition? | 1.00
Are you following this special diet? | 1.00
I'd like you to rate the kind of impact/effect that following this diet has on you | 0.71
I'd like you to rate the kind of impact/effect that not following this diet has on you | 0.71
If you are NOT following or only partially following this special diet, why not? | Perfect
"because the food you need for this diet is too expensive" | Perfect
"because it's too difficult for you to get the food you need for this diet" | 0.86
"because you don't have any way to prepare or store the food you need for this diet" | Perfect
"because you are not willing to give up certain foods as part of this diet (for example: salt, red meat, sweets)" | 1.00
Other | Perfect
Are you currently supposed to be taking medication that was prescribed by a doctor? | 1.00
Are you taking this medication? | 1.00
I'd like you to rate the kind of impact/effect that taking this medication has on you | 1.00
I'd like you to rate the kind of impact/effect that not taking this medication has on you | 1.00
If you are NOT taking the medication prescribed to you, why not? | 1.00
"because the medication is too expensive" | Perfect
"because it's too difficult for you to store the medication" | 1.00
"because you're not able to take the medication as recommended (for example: with food, 3 times a day)" | Perfect
"because you don't like the side effects" | Perfect
"because you don't believe in taking medication" | Perfect
Other | Perfect
Clarity of all Health Impact section items (S-CVI-Ave) | 0.92
Note: Elements that were not endorsed are in bold font.

Table 3.4: Content Validity Indices (CVI) for Skip Patterns in the Health Impact Section

Element | CVI
Should the impact question for physical pain be skipped? | 0.86
Should the impact question for emotional pain be skipped? | 0.86
Should the impact question for using alcohol be skipped? | 0.86
Should the impact question for using pot/marijuana be skipped? | 0.86
Should the impact question for using street drugs be skipped? | 0.86
Should the impact question for chronic illnesses or conditions be skipped? | Perfect

Table 3.5: Content Validity Indices (CVI) for the Relevance of Aspects of Living Conditions

Element | CVI
Aspect of Living Conditions: Place where you live or stay | Perfect
Aspect of Living Conditions: Neighbourhood | 0.71
Aspect of Living Conditions: Food | 1.00
Place where you live or stay:
Affordability | 0.86
Amenities | 0.86
Access to bathing facilities | 1.00
Cleanliness of bathing facilities | 0.86
Safety of bathing facilities | 0.86
Overall cleanliness | 0.86
Feeling of control over your own space | 1.00
Disruptiveness of others | 0.71
Privacy | Perfect
Restrictions | 1.00
Worries about catching illnesses from others | 1.00
Security of possessions | Perfect
Treatment by others | Perfect
Feeling of home | 1.00
Worst thing about the place where you live or stay | 0.86
Best thing about the place where you live or stay | 0.86
Anything else you want to say about the place where you live or stay | 1.00
Neighbourhood:
Feeling safe in your neighbourhood | 1.00
Reasons for feeling safe/unsafe | 1.00
Feeling safe at night versus during the day | 1.00
Reasons for feeling safe/unsafe at night versus during the day | 1.00
Feeling like part of the community in your neighbourhood | 0.71
Feeling stuck in your neighbourhood | 0.71
Bad influences in your neighbourhood | 1.00
Resources in your neighbourhood | 1.00
Worst thing about your neighbourhood | 1.00
Best thing about your neighbourhood | 1.00
Anything else you want to say about your neighbourhood | 1.00
Food:
Ability to get food that you like | Perfect
Is food nutritious | 0.71
Quality of food | 0.86
Feeling stuck eating same thing every day | Perfect
Able to get enough to eat | Perfect
Worst thing about food | Perfect
Best thing about food | Perfect
Anything else you want to say about food | Perfect
Relevance of all aspects of Living Conditions (S-CVI-Ave) | 0.90
Relevance of all aspects of Place where you live or stay (S-CVI-Ave) | 0.92
Relevance of all aspects of Neighbourhood (S-CVI-Ave) | 0.95
Relevance of all aspects of Food (S-CVI-Ave) | 0.95
Note: Elements that were not endorsed are in bold font.

Table 3.6: Content Validity Indices (CVI) for the Clarity of the Living Conditions Impact Section Items

Element | CVI
Place where you live or stay:
I'd like to know about the place where you currently live or stay | 0.86
Do you feel that the place where you live or stay is affordable? | 1.00
Does the place where you live or stay have the amenities that are important to you (like a fridge, stove, own bathroom, elevator)? | 1.00
Do you have access to bathing facilities (such as a shower)? | Perfect
Do you feel that these bathing facilities are clean enough to use? | Perfect
Do you feel safe using these bathing facilities? | Perfect
Overall, do you feel that the place where you live or stay is clean enough? | 1.00
Do you feel like you have control over your own space? | 1.00
Are the other people living or staying there too disruptive? | 1.00
Do you have enough privacy there? | 1.00
Do you feel there are too many restrictions placed on you there? | 1.00
Are you always worrying that you'll catch some illness from other people living there? | 1.00
Do you feel your stuff is safe there? | Perfect
Do you feel that you're treated well there (for example: by landlord, shelter staff, other residents)? | Perfect
Does it feel like a home to you? | 0.86
What is the worst thing about the place where you currently live or stay? | Perfect
What is the best thing about the place where you currently live or stay? | Perfect
Anything else you want to tell me about the place where you live or stay? | 1.00
You've talked about some things that describe the place where you currently live or stay. Now I want to know what kind of impact/effect that the place where you live or stay has on you. You could tell me that the place where you live or stay has no impact/effect on you at all. Or you could say that it has a positive impact/effect and makes things better for you. Or, maybe it has a negative impact/effect and makes things worse for you. I'd like you to rate the impact/effect that the place where you currently live or stay has on you | 0.43
Neighbourhood:
Now I have some questions about your neighbourhood. By "neighbourhood", I mean the neighbourhood of the place where you are currently living or staying, even if you haven't been there very long | 1.00
Do you feel safe in your neighbourhood? | Perfect
Why is that? | 1.00
Do you feel differently about safety in your neighbourhood at night than during the day? | Perfect
Why is that? | Perfect
Do you feel that you're part of the community in your neighbourhood? | 0.71
Do you feel stuck in your neighbourhood? | 0.71
Do you feel that there are a lot of bad influences there for you (for example: too many drugs, too much crime)? | 1.00
Do you think that there are enough resources there? (for example: food bank, health care, support workers) | 1.00
What is the worst thing about your neighbourhood? | Perfect
What is the best thing about your neighbourhood? | Perfect
Anything else you want to tell me about your neighbourhood? | Perfect
You've talked about some things that describe your neighbourhood. Now I want to know what kind of impact/effect that your neighbourhood has on you | 0.86
Food:
Now I have some questions about the food you eat | Perfect
Are you usually able to get food that you like? | Perfect
Would you say that the food you eat is nutritious? | 0.71
Are you usually able to get good quality food? | 0.86
Do you find that you get stuck eating the same thing almost every day? | Perfect
Do you have trouble getting enough to eat? | Perfect
What is the worst thing about the food you eat? | Perfect
What is the best thing about the food you eat? | Perfect
Anything else you want to tell me about the food you eat? | Perfect
You've talked about some things that describe the food you eat. Now I'd like you to rate the impact/effect that the food you eat has on you | 0.83
Clarity of all aspects of Place where you live or stay (S-CVI-Ave) | 0.95
Clarity of all aspects of Neighbourhood (S-CVI-Ave) | 0.95
Clarity of all aspects of Food (S-CVI-Ave) | 0.94
Note: Elements that were not endorsed are in bold font.

Table 3.7: Content Validity Indices (CVI) for the Administration Elements of the QoLHHI

Element | CVI
Health Skip Instructions:
Clarity of instructions for the skip pattern for item about experiencing physical pain | 1.00
Clarity of instructions for the skip pattern for item about experiencing emotional pain | 1.00
Clarity of instructions for the skip pattern for item about using alcohol | 1.00
Clarity of instructions for the skip pattern for item about using marijuana | 1.00
Clarity of instructions for the skip pattern for item about using street drugs | 1.00
Clarity of instructions for the skip pattern for item about having a chronic illness or condition | 1.00
Clarity of instructions for the skip pattern for item about whether supposed to be following a special diet | 1.00
Clarity of instructions for the skip pattern for item about whether or not following this special diet | 0.83
Clarity of instructions for the skip pattern for item about whether supposed to be taking medication | 1.00
Clarity of instructions for the skip pattern for item about whether or not taking this medication | 0.67
Administration Guidelines/Instructions:
How clear are the section-specific administration guidelines for the Health Impact Section? | Perfect
How helpful are the section-specific administration guidelines for the Health Impact Section? | Perfect
How clear are the section-specific administration guidelines for the Living Conditions Impact Section? | Perfect
How helpful are the section-specific administration guidelines for the Living Conditions Impact Section? | Perfect
How clear is the note (in italics) regarding "Yes/No" responses for the Living Conditions Impact Section? | Perfect
How clear are the instructions (in bold capital letters) for when to skip items D and E in the "place where you live or stay" section of the Living Conditions Impact Section? | 1.00
How clear is the Exception note (in italics) regarding the respondent's "usual neighbourhood" for the Living Conditions Impact Section? | 1.00
How clear are the general administration guidelines for the overall QoLHHI? | 1.00
How helpful are the general administration guidelines for the overall QoLHHI? | 1.00
How clear are the general administration guidelines for the Impact Sections? | Perfect
How helpful are the general administration guidelines for the Impact Sections? | 1.00
Yes/No Scale (Living Conditions Impact section):
How easy to understand is the yes/no response scale for the descriptive questions? | 0.86
How easy to use (apply) is the yes/no response scale for the descriptive questions? | 0.86
Scoring Instructions:
How easy to follow are the instructions for calculating and interpreting the basic health impact total score (impacthealth5)? | 0.83
How easy to follow are the instructions for calculating and interpreting the enhanced health impact total score (impacthealthvar)? | 0.83
How easy to follow are the instructions for scoring and using the additional questions from the Health Impact Section? | 0.83
How easy to follow are the instructions for calculating an overall rating of the quality of where one is living or staying (qollive14)? | Perfect
How easy to follow are the instructions for calculating an overall rating of the quality of the neighbourhood (qolneigh5)? | Perfect
How easy to follow are the instructions for calculating an overall rating of the quality of food (qolfood5)? | Perfect
How easy to follow are the instructions for calculating and interpreting the overall impact of living conditions on a person (impactlive3)? | Perfect
Note: Elements that were not endorsed are in bold font.
Chapter 4: Going Beyond Subject Matter Experts: Content Validation of a Quality of Life Measure with a Sample of Experiential Experts and Practical Experts

Introduction

Not much is known about the relationship between homelessness and subjective quality of life (SQoL). Some research has looked at the health-related QoL (HRQoL) of individuals who are homeless, but these studies have, for the most part, employed measures such as the SF-36 (e.g., Kertesz et al., 2005; Tsui, Bangsberg, Ragland, Hall, & Riley, 2007), SF-12 (e.g., Savage, Lindsell, Gillespie, Lee, & Corbin, 2008), or EQ-5D (e.g., Sun, Irestig, Burström, Beijer, & Burström, 2012). These measures, with their focus on health, provide only limited information about overall QoL. One recent review of the literature on SQoL (Hubley et al., 2012) found that individuals who are homeless report low levels of satisfaction for many life areas, and also report lower quality of life compared to the general population. However, these findings were based on a fairly small number of studies. The authors also noted that there is wide variation in the instruments used to measure SQoL among individuals who are homeless. Greater consistency in the measures used in research would help provide a more accurate picture of SQoL in this population (Hubley et al., 2012). There is also a need for a measure that reflects the unique and complex issues and circumstances faced by individuals who are homeless and vulnerably housed (HVH), as measures developed for the general population may omit some important influences on the SQoL of individuals who are HVH, and include other topics that are less relevant.

The Quality of Life for Homeless and Hard-to-House Individuals (QoLHHI) Inventory (Hubley et al., 2009) was recently developed to meet the need for a population-specific measure of SQoL.
It is based on information provided by individuals who were HVH about what affects their SQoL, and is composed of multiple sections, each of which addresses a different life area such as health, housing, living conditions, finances, and personal relationships. The instrument is designed to be administered by an interviewer (i.e., it is not self-administered) and can be used for research and in community settings. The measure is also designed to be highly flexible; depending on the purpose of measurement, only some life areas may be relevant and so only some sections may be administered.

Validity, which is "the degree to which evidence and theory support the interpretation of test scores entailed by proposed uses of tests" (p. 9), is one of the most important considerations in the development and use of any measurement instrument (AERA et al., 1999). A strong case for validity will rest on evidence from a number of sources, including the content of an instrument. Content validation is concerned with establishing the relevance and representativeness of an instrument's content, within the context of the purpose for which the instrument is used (Haynes et al., 1995). The technical quality of content elements is also an important consideration, as this may affect content relevance (Messick, 1989).

One of the most common approaches to content validation is to have Subject Matter Experts (SMEs) assess the quality of the instrument content. Although there are no universal criteria for SMEs, commonly cited qualifications include clinical or research expertise related to the construct of interest (Davis, 1992; Grant & Davis, 1997). Such "traditional" SMEs will be able to judge the content of an instrument with an eye towards its clinical or research applications, based on insights gained from a range of sources including their own research and/or clinical experience, the work of colleagues, and contact with the instrument's target population.
SMEs may also have at least a basic understanding of the concept of validity and the goals of a content validation study. A content validation study of the two most frequently used sections of the QoLHHI, the Health Impact and Living Conditions Impact sections, was recently conducted with a sample of 11 SMEs. This study found generally favourable content-based evidence for validity for these two sections (see Chapter 3). The majority of the content elements were considered by the SMEs to be relevant to measuring the SQoL of individuals who are HVH, and the technical quality of the majority of the elements was also acceptable. The relatively small number of suggestions for topics to add to the QoLHHI indicated a high level of content representativeness. In addition, an analysis of the SME comments from this study revealed that most of the experts' concerns about the QoLHHI content could be addressed through relatively minor revisions to the items and other elements.

While SMEs are commonly used to evaluate instrument content, they may not have personal experience of the construct of interest and, no matter how closely they have worked with the target population, they may not be able to fully assess the instrument content from the perspective of this group, particularly if the group is significantly different from themselves (e.g., much younger or older, from a different ethnic or socio-economic group). For this reason, it can be of great value to include other types of content experts in content validation.

One such type of content expert is members of the instrument's target population, sometimes called lay experts or "experiential experts" (Schilling et al., 2007, p. 362).
Experiential experts (EEs) can be particularly helpful in assessing whether an instrument's content reflects the target population's conceptualization of the construct of interest and in identifying unclear terms and language (Fleury, 1993; Tilden et al., 1990), and it has been recommended that EEs be included at both the instrument development stage and as part of the content validation process (Haynes et al., 1995). Introducing EEs into the content validation process can present some challenges, however. Some EEs may struggle with the language of research studies, or may find it difficult to evaluate instrument content from beyond their own personal experiences (Stewart et al., 2005). Also, some EEs may not fully understand the goals of a content validation study. For example, a typical study task is to rate the relevance of an item to the construct of interest. This is a more abstract task than responding to the item itself, and can be quite difficult. Also, EEs may not always understand the larger context of an instrument's intended uses (e.g., research versus clinical applications). Despite these potential challenges, including EEs in addition to SMEs in content validation can provide alternative perspectives on an instrument's content elements, and identify problems with the content that might otherwise be overlooked. The potential value of these perspectives is particularly high for an instrument such as the QoLHHI, which measures a highly subjective construct and whose target population is likely quite different from most SMEs.

Content experts need not be limited to SMEs and EEs. For example, because the QoLHHI is administered by interviewers, another group of content experts might be individuals who have administered the instrument for research or service provision purposes. The perspective of these "practical experts" (PEs) would likely be informed not only by their own views, but by feedback from the HVH individuals that they have interviewed.
PEs are therefore potentially valuable representatives of a broad range of members of the target population and, at the same time, would be ideally suited to assessing such content elements as the clarity of administration and scoring instructions.

The first goal of the present study was to evaluate the evidence based on content for the Health Impact and Living Conditions Impact sections of the QoLHHI, using both PEs and EEs as content experts. This information will (a) serve to supplement and complement the results from the content validation study conducted with a sample of SMEs, (b) allow a comparison of feedback provided by different expert groups, and (c) permit an evaluation of the utility of obtaining content validation evidence from such a varied and challenging population as individuals who are HVH.

A second goal of this study was to assess the value of different types of feedback from content experts. In a typical content validation study, experts are asked to provide some form of quantitative feedback on the instrument's content, usually in the form of ratings of relevance, clarity, etc. Many content validation studies also collect descriptive feedback in the form of comments and suggestions, but this feedback is generally paid less attention than the quantitative data. An important finding from the recent content validation study of the QoLHHI with SMEs, however, was that descriptive feedback can be at least as informative as quantitative feedback. In fact, for the purposes of identifying why a content element might negatively affect validity and how the element might be improved, the comments and suggestions were even more helpful than the quantitative ratings. The present study presented a further opportunity to evaluate the contribution that these different types of information can make to an assessment of validity evidence based on instrument content.

Methods

Ethical approval.
This study was approved by the Behavioural Research Ethics Board at the University of British Columbia, Vancouver, Canada.

Participants and recruitment. Practical Experts (PEs) were defined as individuals who had administered the QoLHHI Health Impact section and/or Living Conditions Impact section, for research purposes, to individuals who were HVH. Potential PEs were identified through their involvement as research assistants/interviewers for studies that had included the administration of one or both QoLHHI sections, and were sent an email inviting them to contact the author of this dissertation (LR) if they were interested in taking part in the present study.

Experiential experts (EEs) were defined as individuals who had been HVH within the previous 12 months. In addition, they needed to be capable of understanding and carrying out the study tasks; this meant that individuals with significant cognitive impairments, for example, would not be eligible. Being homeless was defined as living on the streets or a similar location, in a shelter, with friends, or being in hospital or prison with no permanent address to go to upon release. Vulnerably housed was defined as living in insecure or temporary housing such as a single room occupancy hotel (SRO) or rooming house. Individuals in transitional, supportive, or group housing were also eligible to participate in this study. Potential EEs were identified through their participation in the Vancouver, Canada cohort of a Canadian Institutes of Health Research (CIHR)-funded longitudinal multi-site study on health and housing transitions (Health and Housing in Transition (HHiT) Study). Two research assistants from the HHiT Study identified participants who met the 12-month eligibility criteria and who they felt would be capable of evaluating the QoLHHI content effectively.
The research assistants then contacted these potential EEs and arranged meetings with the author of this dissertation (LR) for those who were interested. Potential EEs were informed that their involvement in the HHiT Study would not be affected by their decision to participate or not participate in the present study. Measures and materials. Study participants were asked to evaluate the content elements from the following QoLHHI measures and materials: the Health Impact section and Living Conditions Impact section, the Impact Response Card, and the QoLHHI Administration and Scoring Manual. In addition, all participants completed a brief demographics form. Quality of Life for Homeless and Hard-to-House Individuals (QoLHHI) Health Impact section (Hubley et al., 2009). This section of the QoLHHI measures the impact on the respondent of various aspects of health. There are 28 main items and 11 sub-items in this section, some of which are skipped due to response-dependent skip patterns. The forms that are used to administer the Health Impact section of the QoLHHI also contain instructions to the person administering the QoLHHI (interviewer), such as instructions for skipping items. The impact of 13 aspects of health is measured on a 7-point scale ranging from "Large negative impact" to "Large positive impact", with a neutral "No impact" option as the mid-point. The aspects of health are physical health, mental health, physical pain, emotional pain, physical activity, sleep, stress, use of alcohol, use of marijuana, use of street drugs, chronic illnesses or conditions, special diet, and prescription medication. The interviewer can choose between two words, "impact" and "effect", to use for the administration of the impact items. A number of descriptive questions provide context for the impact ratings. For example, the impact item for stress is preceded by an item asking the respondent to rate their stress level as low, medium, or high.
Other descriptive questions are used to determine if further questions need to be asked. For example, respondents are asked if they have any chronic illnesses or conditions, and the impact item for chronic illnesses is skipped if the answer is negative. Finally, some descriptive questions determine the wording of the impact items (e.g., the impact of "drinking alcohol" versus the impact of "no longer drinking alcohol"). Quality of Life for Homeless and Hard-to-House Individuals (QoLHHI) Living Conditions Impact section (Hubley et al., 2009). The impact of the respondent's living conditions can be measured by asking about the impact of the place where they live or stay, neighbourhood, food, clothing, and personal hygiene. The most commonly used sections are for the place where they live or stay, neighbourhood, and food. Given this and the length of the entire Living Conditions Impact section, only these three aspects were included in the present study. There are 17 descriptive items and one impact item addressing the place where you live or stay (two of these items may be skipped based on responses to a previous item), 11 descriptive items and one impact item addressing neighbourhood, and eight descriptive items and one impact item that address food. As with the Health Impact section, the forms that are used to administer the Living Conditions Impact section of the QoLHHI contain instructions to the interviewer, such as instructions for skipping items. The impact items are rated on the same 7-point response scale used for the Health Impact section. Descriptive information is collected about each aspect of living conditions. For example, for the place where you live or stay, there are questions about cleanliness and privacy, whereas, for the aspect of neighbourhood, there are questions about resources and safety. Response options for most of the descriptive items are "yes", "no", and "yes/no".
This last option covers any mixed responses (e.g., "sometimes"), and there is space on the QoLHHI form to record comments on these, or any other, responses. There are also several open-ended questions for each aspect of Living Conditions (e.g., "What is the best thing about your neighbourhood?" and "What is the worst thing about your neighbourhood?"). QoLHHI Impact Response Card. This card presents visually the 7-point impact response scale using minus signs and plus signs for negative and positive impact, respectively (see Figure A.1, p. 216). These are displayed in a single row with progressively larger and darker symbols as the scale increases from small to large impact. The "no impact" mid-point is represented by a zero. The response options are printed in text below each symbol, but there are no numbers on the card. Quality of Life for Homeless and Hard-to-House Individuals (QoLHHI) Administration and Scoring Manual (Hubley et al., 2009). This manual describes the background, development, and purpose of the QoLHHI. It contains several sets of administration guidelines: general guidelines for the overall QoLHHI, general guidelines for the Impact sections, and guidelines specific to the individual Impact sections. The manual also contains scoring instructions. An adapted version of the manual, from which content that did not apply to the Health and Living Conditions Impact sections had been removed, was used for this study. Demographics form. All PEs and EEs were asked to provide basic demographic information including age, gender, and years of education. PEs were asked to indicate how many times they had administered the Health and Living Conditions Impact sections of the QoLHHI, and whether they had any non-QoLHHI experience working with individuals who are HVH.
EEs were asked if they had any chronic illnesses or conditions, to rate their health on a 5-point scale, and to describe their current housing or living situation (e.g., living on the street, in a shelter, with friends, in supportive housing or private sector housing). They were also asked to indicate how long they had been in their current housing or living situation. Procedures. All PEs were given the choice of rating either one or both of the QoLHHI Impact sections. The QoLHHI forms (either one or both sections) and the QoLHHI Administration and Scoring Manual were sent to participants by email. The informed consent form, rating forms, and demographics form for the PEs were administered online using a web-based survey hosted by Fluid Surveys (http://fluidsurveys.com/). All PEs were offered a $10.00 (Canadian) gift card for the Starbucks Coffee Company for participating. Data were collected from the EEs through face-to-face interviews with the author of this dissertation (LR). These interviews took place in cafes, restaurants, or public spaces such as parks and lasted approximately 1 to 1.5 hours. Each EE provided feedback on one section of the QoLHHI (either Health or Living Conditions); the sections were alternated from one participant to the next. All EEs were offered a $20.00 (Canadian) cash honorarium for participating. The EEs were provided with the informed consent form at the outset of the interview and given an opportunity to ask questions about the study. Once they had signed the consent form, LR gave a brief oral introduction to the purpose of the study (i.e., content validation) in lay language. EEs were then given a copy of the relevant section of the QoLHHI and the Impact Response Card and a printed version of the response scales for rating the elements of the QoLHHI content (e.g., the options "not at all relevant", "somewhat relevant", "mostly relevant", and "completely relevant").
LR read out the questions from the rating forms (which were the same as for the PEs) and wrote down the responses. The PEs and EEs rated the various content elements of the QoLHHI Impact sections using forms that were developed for this study. Table 4.1 (see p. 133) shows the different content elements of the QoLHHI that were rated, the scale used for each element, and which group of experts (i.e., PEs, EEs) rated each element. In addition to the elements listed in Table 4.1, the PEs and EEs were asked to provide feedback on the skip patterns for six aspects of health. Currently, interviewers are instructed to skip the impact items for physical and emotional pain, having a chronic illness or condition, and use of alcohol, marijuana, and street drugs if the respondent says that he or she does not currently experience or has never experienced these conditions, or does not currently engage in or has never engaged in these activities (e.g., does not have a chronic illness or has never used alcohol). The PEs and EEs were asked to indicate if they thought that the impact items should indeed be skipped in these cases, using a 3-point scale with the response options "should not be skipped", "unsure if should be skipped", and "should be skipped". PEs and EEs were asked to provide explanations for any rating other than "completely relevant" or "very" clear, helpful, etc. For PEs, space for these explanations was provided in the online rating forms. For EEs, LR asked for and wrote down this information. It was not always possible to follow up with the EEs on all low ratings. In some cases, this was because an interview was taking longer than anticipated and it became necessary to skip some probes in order to have time to complete the rating forms. In other cases, participants were unable to explain their ratings even after several prompts and were becoming frustrated. Skipping one or more follow-up questions allowed the interview to move forward again.
PEs and EEs were also asked to suggest any topics that they thought should be added to the QoLHHI in order to better capture health and living conditions (in the context of measuring SQoL). Finally, both PEs and EEs were asked to indicate whether they thought the word "impact" or "effect" is more understandable, and to provide an explanation for their answer. Analyses. Quantitative data. All ratings were assigned a numeric score as follows: "not at all" (relevant, clear, etc.) = 1, "somewhat" = 2, "mostly" = 3, and "completely" or "very" = 4. For the ratings of the skip patterns in the Health Impact section, ratings of "should not be skipped" = 1, "unsure if should be skipped" = 2, and "should be skipped" = 3. The resulting numeric scores were used to compute a Content Validity Index (CVI; Lynn, 1986), which is the proportion of raters who endorse an element. As recommended by Lynn (1986), the 4-point rating scales were first collapsed into two categories. Ratings of 3 and 4 were treated as endorsements and ratings of 1 and 2 were treated as non-endorsements. The CVI was then calculated as the number of endorsements divided by the total number of ratings. For the rating scale for the skip patterns, only the highest rating (3, or "should be skipped") was treated as an endorsement. All quantitative analyses were conducted using Microsoft Office Excel 2010. A CVI of 0.80 was set as the minimum acceptable sample-level of endorsement for the purposes of content validation.13 Elements with CVIs below 0.80 were flagged as problematic from a validity perspective. In terms of guidelines for making changes to the instrument content, these elements were identified as being in need of revision or potentially even removal from the QoLHHI. A category of the CVI called a Perfect CVI was also established for this study. This is a CVI of 1.00 when all raters in a group (i.e., all PEs or all EEs) have assigned the highest possible score (usually a 4) to an element.
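The CVI computation described above is simple enough to sketch in a few lines. The following is an illustrative example only (the study itself used Microsoft Office Excel 2010, not Python); the function and variable names are hypothetical:

```python
def cvi(ratings, endorse_min=3):
    """Content Validity Index (Lynn, 1986): the proportion of raters
    who endorse an element. On the study's 4-point scales, ratings of
    3 and 4 count as endorsements (endorse_min=3); for the 3-point
    skip-pattern scale, only a rating of 3 counts, so the same cutoff
    applies there as well."""
    endorsements = sum(1 for r in ratings if r >= endorse_min)
    return endorsements / len(ratings)

# Five raters, one non-endorsement: exactly the 0.80 minimum set for this study.
print(cvi([4, 4, 3, 2, 4]))  # 0.8

# A "Perfect CVI": every rater assigns the highest possible score.
print(cvi([4, 4, 4, 4, 4]))  # 1.0
```

As the footnote to this section notes, the 0.80 cutoff tolerates at most one non-endorsement for samples of up to nine raters: with six raters, for example, cvi with five endorsements gives 0.83, but four endorsements gives only 0.67.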
These elements might still benefit from revision, but it was assumed that these revisions would be minor. Descriptive data. Quantitative ratings of instrument content are informative up to a point. They can be used to summarize validity evidence based on content by, for example, indicating that an element was deemed relevant or not relevant overall. However, these ratings are limited in that they do not explain why an element was not endorsed, nor do they help determine how an element may be improved. In the present study, PEs and EEs were asked to provide explanations for any ratings below a 4 (3 in the case of the skip patterns for certain Health Impact items). Their explanations were reviewed in order to identify specific problems with the content elements, and, if possible, establish recommendations for revisions. This was done for both endorsed and non-endorsed elements, for two reasons. First, although the endorsed elements were generally assumed to be acceptable overall, it was thought that they might still be improved using the feedback from the content experts. Second, looking at all descriptive feedback provided an opportunity to investigate if there were any differences in the descriptive feedback for elements that had been classified differently (i.e., endorsed versus non-endorsed) in the quantitative analyses. The analysis of the descriptive feedback was conducted using Atlas.ti (version 6.2).

13 A CVI of 0.80 was chosen as the minimum level for endorsement because it was felt that this represents a reasonable and acceptable level of endorsement under the conditions of a typical content validation study. There is no single accepted minimum sample size for such a study, but 5 is a number that is often found in the literature (e.g., Lynn, 1986; Netemeyer, Bearden, & Sharma, 2003; Osterlind, 1997). It would be unrealistic to expect perfect levels of endorsement for all elements with 5 raters, but in order to set a reasonably rigorous standard, it was decided that no more than one non-endorsement per element should be allowed. Four endorsements from among five raters will result in a CVI of 0.80. Even at larger sample sizes, setting an acceptable level for the CVI at 0.80 will still only allow for one non-endorsement. For example, with 6 raters, 5/6 endorsements produces a CVI of 0.83, but 4/6 endorsements produces a CVI of only 0.67. In terms of individual raters, a minimum of 0.80 allows for only one non-endorsement for samples of up to 9 content experts.

Results Sample. Practical Experts. Eight PEs completed rating forms. Seven (87.5%) were female and the average age was 30.13 years (SD=8.03 years, range 24-49 years). Five PEs had an undergraduate university degree, two had graduate degrees, and one had completed secondary school. Six PEs rated both sections of the QoLHHI, one PE rated only the Health Impact section, and one PE rated only the Living Conditions Impact section. The PEs had administered the Health Impact section an average of 153.13 times (SD=84.10 times) and the Living Conditions Impact section an average of 146.43 times (SD=88.49 times). Six of the PEs reported that they had experience working or conducting research with individuals who are HVH in addition to their research work involving the QoLHHI. Experiential Experts. A total of 22 EEs were interviewed; however, data from six EEs were not included in the analyses.
The reasons for removing these data included: the EE being under the influence of substances or suffering from a mental health condition at the time of the interview that appeared to be interfering with his/her ability to answer the study questions, not being able to explain the reasons for ratings or giving explanations that clearly showed a lack of understanding of the study tasks, declaring that the rating tasks were not meaningful, or only being able to provide feedback based on one's own life experiences despite frequent prompting to adopt a broader perspective. The 16 EEs whose data were retained for analysis included 13 males (81%) and 3 females. Their ages ranged from 27 to 64 years with a mean age of 46.75 years (SD=10.68 years); years of education ranged from 7 to 18 years with a mean of 13.06 (SD=2.86 years). Health was rated as "fair" by 1 EE (6%), "good" by 6 EEs (38%), "very good" by 7 EEs (44%), and "excellent" by 2 EEs (12%). Most (63%) said that they did not have any chronic illnesses or conditions. Eight EEs (50%) reported that they were living in single room occupancy (SRO) hotels, 4 (25%) were living in private market rental housing, 3 (19%) were living in supportive housing, and 1 (6%) was living in a shelter. The EEs who were living in private market housing had previously lived in shelters, supportive housing, and on the street. The EEs had been in their current housing or living situation for an average of 22.88 months (SD=29.61 months, range = 1-120 months).14 Of the EEs whose data were retained in the analyses, eight provided feedback on the Health Impact section and eight provided feedback on the Living Conditions Impact section. Quantitative data. Impact elements. The PEs did not endorse either the relevance of rating "impact" or the impact response scale for how easy it is to understand or apply. The PEs did endorse the Impact Response Card as helpful (see Table 4.2, p. 135).
14 The 6 EEs who were not included in the analyses included 4 males (67%) and had an average age of 44.50 years (SD=5.13 years, range = 39-53 years). Their mean years of education was 9.50 (SD=2.59 years). Seventeen percent rated their health as "poor", 33% as "fair", and 50% as "good"; 67% reported that they had no chronic illnesses or conditions. Thirty-three percent were living in SROs, and another 33% were living on the street. The remainder were evenly distributed between living in rooming houses and shelters. The EEs in this group had been in their current housing or living situation for an average of 25.67 months (SD=19.31 months). This group was similar to the EEs who were included in the analyses in terms of age (although they fell within an older age range) and time in current housing, but included a higher percentage of females, had lower education on average, and tended to be in poorer health. They also included a higher percentage of individuals who were homeless, as opposed to vulnerably housed.

Unlike the PEs, the EEs endorsed the relevance of impact ratings and the impact response scale for how easy it is to understand and apply, but did not endorse the helpfulness of the Impact Response Card (see Table 4.2). Health and Living Conditions Impact section elements. The PEs endorsed all 52 of the Health and Living Conditions Impact section elements that they rated for relevance. These ratings included 23 Perfect CVIs (see Tables 4.3 and 4.4, pp. 136-138). Four of the 40 elements of the Health Impact section that were rated for clarity were not endorsed, but the endorsed elements included 23 that obtained Perfect CVIs (see Table 4.5, p. 142). The PEs endorsed all 42 elements of the Living Conditions Impact section that were rated for clarity, including 24 elements with Perfect CVIs (see Table 4.6, p. 148).
The EEs endorsed seven of the 13 elements of the Health Impact section and 23 of the 39 elements from the Living Conditions Impact section that were rated for relevance (Tables 4.3 and 4.4). There was one Perfect CVI for relevance. All 82 elements of both sections that were rated for clarity were endorsed by the EEs, including 59 Perfect CVIs (Tables 4.5 and 4.6). Skips on Health Impact items. The skip patterns for the six impact items from the Health Impact section were not endorsed by either the PEs or the EEs (Table 4.7, p. 153). Administration elements. The PEs endorsed all 30 of the administration elements (administration guidelines, scoring instructions, etc.) for clarity, helpfulness, etc. These ratings included 22 Perfect CVIs (Table 4.8, p. 154). Descriptive data. Non-endorsed elements: Practical Experts. Most of the PEs' comments on the elements that they did not endorse fell under two main themes: concerns with the concept of impact and concerns with item language. In addition, one non-endorsed element (impact of stress) was singled out as involving a concept that is difficult for some respondents to grasp. Comments on impact. Although the PEs did not endorse the relevance of impact ratings, most PE comments on impact did not focus on relevance per se. In fact, 75% of the PEs rated the relevance of impact ratings as 3 ("mostly relevant") or 4 ("completely relevant"), indicating that impact ratings were seen by the majority as relevant to measuring the SQoL of individuals who are HVH. Based on the PEs' explanations for lower relevance ratings for impact, as well as their comments on the Impact response scale and on several of the impact items from the Health and Living Conditions Impact sections, the problem may not be so much with relevance as with clarity. These comments indicate that PEs feel that many respondents struggle to understand what the QoLHHI impact items are asking.
For example, PEs noted that both the idea of impact and the Impact response scale often require additional explanation on the part of the interviewer, that respondents find the Impact Response Scale confusing, that probing will often reveal that respondents have provided a rating that is the opposite of what they meant (e.g., they provided a rating of "large negative impact" when in fact they meant "large positive impact"), that the impact items are particularly challenging for respondents with cognitive impairments, and that some respondents will become quite angry about the impact items because they feel the items do not make sense. The PEs did have some suggestions for making the impact elements clearer. One PE proposed breaking the task of rating impact into multiple steps; for example, by asking "Does your neighbourhood make things better or worse for you, or does it not make a difference to you? [worse]. Ok, then would you say it makes things extremely worse, moderately worse, or just a little bit worse?" Several PEs commented that incorporating more examples of how impact might be felt would also help. An example for the impact of physical health might be "Your physical health limits your walking, and therefore has a negative impact". Some PEs' comments on impact ratings were more directly about relevance, including that making these ratings is not intuitive for respondents because people do not necessarily think in terms of how things impact their lives, and that impact ratings can be very subjective and so are dependent on an individual respondent's situation or perspective. For example, two respondents might rate the impact of the same living situation completely differently, depending on how long they have been there or where they were living before. For the Impact Response scale, it was noted that the scale cannot accommodate situations in which something has both negative and positive impacts at the same time.
Respondents are forced to choose either a negative or a positive rating, and the "no impact" mid-point rating does not reflect mixed impact. Comments on language. PEs' comments on the language of non-endorsed elements included that the terms "physical activity and exercise" and "emotional pain" should be more clearly defined, and that the word "level", applied to physical activity (i.e., "your current level of physical activity"), can be confusing for respondents. The item about physical activity or exercise would also benefit from the addition of examples that are specific to a HVH context, such as "dumpster diving" or "collecting bottles". Other PEs' comments on item language included that the phrase "impact of different aspects of health" can be confusing or unclear for some respondents, and that the introduction to the Health Impact section is too long. The language of the Impact Response Scale was described by one PE as quite clinical. PEs also noted that the scale anchors of "small", "moderate", and "large" can be hard for some respondents to understand, particularly "moderate". It was suggested that "moderate" should be replaced with "medium", and that terms such as "a little", "a lot", and "very" could be used in place of the current language. Difficulty understanding the concept of stress. The impact item for stress was not endorsed for clarity. Judging by the descriptive feedback on this element, the problem is with the clarity of the concept rather than the clarity of the item language. Several PEs noted that respondents who rated their stress level as "low" then sometimes struggled to assign an impact rating (what is the impact of low stress?). It can also be hard for respondents to imagine that stress can potentially have a positive impact, for example by acting as a motivator. The Experiential Experts' perspective: Comments on elements not endorsed by the Practical Experts.
In contrast to the PEs, the EEs as a group did endorse the relevance of asking about impact. Comments supporting the relevance of impact ratings included that "impact is what it's all about" and that these ratings provide an understanding of what people are going through. However, several EEs did say that they felt that the idea of impact is convoluted and hard to pin down, because the impact that something has can be short-term or long-term, vary considerably, and extend beyond any one life area. One EE noted that once he understood the basic concept of impact, he felt that it was very relevant, but so complicated that the impact items may not be worth asking. Other EEs thought that impact ratings are unnecessary because the question of impact will be largely answered by the responses to the descriptive items (this applies to the Living Conditions Impact section, which contains a large number of descriptive items), or that satisfaction ratings might be more relevant than impact ratings. Although the Impact Response Scale was endorsed as both easy to understand and easy to use by EEs overall, several also expressed concerns about the scale. Comments included that there are too many response options on the scale and too many different words (e.g., positive and negative; small, moderate, and large; impact and effect), though one EE had a different view and stated that the response scale has just the right number of response options. One EE noted that the scale cannot be used to capture a situation where something has both a positive and a negative impact at the same time, a point that was also raised in the PE comments. The EEs' comments on the remaining elements that were not endorsed by the PEs were quite different from the PE comments. For example, there were no EE comments to the effect that the term "physical activity or exercise" might be misinterpreted by respondents or that examples that are more relevant to individuals who are HVH are needed.
Instead, one EE stated that this item is not entirely clear because asking about physical activity or exercise incorrectly implies that these terms refer to different things. And although no EEs commented that it would be hard to rate the impact of low stress, or questioned the potential for stress to have a positive impact, one EE thought that it can be hard to characterize stress as low, medium, or high. EEs' comments on the clarity of the introduction to the Health Impact section (which was not endorsed by the PEs) were somewhat contradictory: one EE thought that there were too many ideas in this introduction, while another felt that there was not enough detail. One EE noted that the word "maybe" is used in the introduction to describe negative impact, but not positive impact. This EE felt that this is potentially leading and that the same sentence structure should be used for both positive and negative impact. The PEs' comments on the Health Impact section introduction tended to focus more on problems with conveying the idea of impact, rather than on such subtle language use. Non-endorsed elements: Experiential Experts. With the exception of the Impact Response Card, which was not endorsed for helpfulness, all of the non-endorsements from the EEs were for relevance. The relevance of elements was seen as limited for a range of reasons. Limited relevance due to relative (un)importance. Several aspects of health, such as physical activity or exercise, physical pain, and use of alcohol, were considered by some EEs to be simply less important than other aspects (in particular, physical and mental health). In addition, some EEs thought that the open-ended questions in the Living Conditions Impact section are unnecessary because the other descriptive questions in this section will provide all of the necessary information about a respondent's situation. Other EEs, however, thought that these questions could elicit valuable information.
Elements/items that do not apply to most or all respondents. Some elements/items were considered less relevant because they would only apply to a small subset of respondents (e.g., prescription medication as an aspect of health would only apply to those people who have been prescribed medication) or because they were not seen as applying to anyone at all (e.g., there is no need to ask about feeling "stuck" in your neighbourhood since people are always free to move whenever they want). Other elements were given lower relevance ratings because they were not important to an EE personally (e.g., one EE did not consider food to be relevant to SQoL because he does not care much about food; other EEs commented that they rarely or never worry about catching illnesses from other people). Elements/items that are not relevant because negative situations are the result of one's own choices. Some elements were considered less relevant because EEs thought that any negative impact could be avoided if a respondent were willing to make different choices. This applied primarily to elements related to food. Several EEs commented that there are many places to get food that is both free and varied, so if someone is not getting enough to eat or is eating the same thing every day, it would be because of that person's own choices and priorities. As a result, these EEs did not think it was relevant to ask about food quantity and variety. Items with obvious answers. EEs thought that some items were less relevant because the responses to these items will always be the same. For example, they did not think that it would be informative to ask if there are bad influences in a respondent's neighbourhood; the answer will always be that there are. And although asking about access to bathing facilities was considered highly relevant, it seemed pointless to some EEs to then ask if respondents feel safe using these bathing facilities.
They argued that such facilities are always unsafe and that people will use them regardless of how unsafe they are. Lack of negative impact. Several EEs suggested that using marijuana does not have a negative impact and that it may, in fact, have a positive impact. As a result, asking about marijuana use was not viewed as relevant to measuring SQoL. Other reasons for lower relevance. Some of the explanations for lower relevance only applied to single elements. One EE felt that the relevance of asking about feeling stuck in your neighbourhood was limited by the fact that people might feel this way for very personal reasons. For the aspect of prescription medication, one EE thought that this topic is too sensitive to be included in the QoLHHI. Asking about alcohol use was seen as less than completely relevant because many people will not be truthful about their use out of a sense of shame, resulting in incorrect information (interestingly, this EE thought that this problem would not apply in the case of street drug use, because by the time people are using these drugs heavily, they "are beyond shame"). Finally, one EE thought that physical activity or exercise should not be included as an aspect of health because exercise can be a hobby or a passion, and so its impact extends beyond just health. Comments on the Impact Response Card. The EEs as a group did not endorse the Impact Response Card for helpfulness. Explanations for low helpfulness ratings included that the Impact Response scale itself is clear enough, so that the card is not needed. However, a number of EEs did note that they liked the card, which they saw as catering to all respondents including those who cannot read. The card was also seen as effective for conveying the response scale, and as likely to speed up administration of the QoLHHI. The Practical Experts' perspective: Comments on elements not endorsed by the Experiential Experts.
All elements that were not endorsed by the EEs were endorsed overall by the PEs in this study. Nevertheless, the PEs did comment on many of these elements. In some cases, the comments were similar to those of the EEs. For example, like the EEs, several PEs noted that, although the open-ended questions in the Living Conditions Impact section can elicit interesting information from some respondents, they are probably unnecessary. The PEs also echoed the EE statements about marijuana as an aspect of health, commenting that most respondents do not see marijuana use as being relevant to SQoL because the impact it has is generally positive (or at least not negative). In other cases, the PEs' explanations for lower ratings were different from those provided by the EEs. For example, for the safety of bathing facilities, PEs noted that this item would not be relevant to people who have access to private facilities. When asking about bad influences in the neighbourhood, rather than focus on how the answer would always be "yes", one PE noted that not everyone is affected by their neighbourhood, while another commented that the example of drugs as a bad influence may be flawed because, for some respondents, living close to a source of drugs can be a good thing (e.g., they may use drugs for pain management).

In contrast to the EEs, the PEs endorsed the Impact Response Card for helpfulness. One PE noted that it would take twice as long to administer the QoLHHI without it. However, some PEs did note concerns with the card, in particular that the current symbols look like lines and hospital crosses (rather than minus and plus signs), and that "smiley faces" and "frowny faces" would be easier to understand. One PE commented that the card is most useful with literate respondents, as those with more limited literacy prefer to have the response options read out to them for each item. This contrasts with the EEs'
comments that the card will be helpful for people who have difficulty reading (and is also quite different from what the QoLHHI authors had anticipated when they developed the card).

Non-endorsed elements: Skips on Health Impact items. Currently, interviewers administering the Health Impact section are instructed to skip the impact items for physical pain, emotional pain, substance use, and chronic illnesses and conditions if a respondent says that he or she has never or does not currently experience this condition or engage in this behaviour. In the present study, both PEs and EEs were asked if the impact items should in fact be skipped under these conditions, or if instead they should be asked of all respondents. This was one area where the PEs and EEs agreed, as neither group endorsed the existing skip patterns. There was also some agreement between the PEs and EEs on the reasons why they felt certain impact items should not be skipped. For emotional pain, there were both PEs and EEs who commented that everyone has experienced emotional pain at some point, and so it is unnecessary to have a skip pattern at all for this item (i.e., everyone will either be experiencing current emotional pain or the absence of past emotional pain). For the impact item for alcohol use, there were both PEs and EEs who noted that alcohol use is so common that respondents who report never drinking have likely made a very conscious choice to avoid alcohol, and that therefore the impact of this choice should be explored.

There were both PEs and EEs who thought that the impact ratings for these six aspects of health could be useful for comparative purposes. However, the EEs tended to focus on comparisons to others: for example, respondents could rate the impact of not experiencing pain or not using substances by comparing themselves to those who do. This would encourage respondents to think about positive aspects of their lives. In contrast, the PEs'
emphasis was more on the usefulness of collecting the impact information from the same respondents over multiple administrations of the QoLHHI, for example, in the context of a longitudinal research study. Asking the impact questions of all respondents would permit the tracking of changes in a respondent's situation, and could also serve as a veracity check of sorts (by looking for inconsistencies in responses over time).

Endorsed elements: Practical Experts and Experiential Experts. The almost complete absence of overlap in non-endorsed elements between the PEs and EEs suggests that these two groups of experts assessed the content of the QoLHHI very differently. In fact, though, the PEs and EEs showed considerable agreement in that they endorsed the majority of the elements of both the Health and Living Conditions Impact sections of the QoLHHI (Tables 4.3-4.6). However, because not all of these elements received the highest possible ratings from all PEs and EEs, there were many comments and suggestions for these elements. A closer look at this descriptive feedback revealed that there were both similarities and differences in the reasons that the two groups of experts gave for the lower quantitative assessments of these elements. Within each group, there were also similarities and differences in the types of comments made on the endorsed elements versus those elements that were not endorsed.

Similarities and differences across groups of experts. There were some similarities in the descriptive feedback of the PEs and EEs on endorsed elements. In some cases, the general theme of the comments was similar, but the specific content of the comments differed. In other cases, both the theme and the content were similar. For example, there were both PEs and EEs who did not like the vague timeframes used in the QoLHHI; experts in both groups singled out the terms 'lately' (for experiencing pain) and 'currently'
(for use of alcohol and drugs) as problematic and suggested that these should be replaced with more specific timeframes. An example of a similar theme but different specifics is that experts from both groups felt that certain words or terms need to be more clearly defined, but singled out different elements as being in need of clarification. For example, the EEs singled out the terms 'quality' (of sleep) and 'chronic' (illnesses and conditions) as unclear. They also felt that 'mental or emotional health' should not be addressed within a single item, arguing that these are two different things and so should be addressed separately. In contrast, the PEs had concerns about the terms 'physical health', feeling 'stuck' in your neighbourhood, and food that is 'nutritious' and that you 'like'. Like the EEs, the PEs flagged 'chronic illnesses' as being potentially difficult to understand, but while the EEs' comments focused on the meaning of 'chronic', the PEs' comments were about whether 'chronic conditions' includes mental health conditions as well as physical health conditions. Unlike the EEs, there were no PEs who commented on the term 'quality' with regards to sleep, but some PEs were unsure about the use of the word 'quality' to describe food, and wondered how this differs from 'nutritious'. One term that experts from both groups found unclear was 'control' (over your own space). There were also both PEs and EEs who did not like the use of the word 'level' (e.g., 'level of physical health'). Another theme found in both the PEs' and EEs' comments on endorsed elements was that some items focus on the wrong aspect of the item's subject or topic.
The PEs singled out the items about amenities where you live (for many people, the goal is to secure any housing at all, regardless of amenities), food that you like (most people are just trying to get the food that they need), and resources in the neighbourhood (a more important question is whether the resources are accessible, and whether people feel comfortable using these resources). The EEs thought that, rather than asking about the safety of a respondent's possessions, the QoLHHI should include an item about personal safety, which is a greater concern. For the question about whether respondents feel that they are part of the community in their neighbourhood, one EE noted that some people do not wish to be part of the community and that this is something that cannot be captured by the item as it currently appears in the QoLHHI.

Similarities and differences across endorsed and non-endorsed elements. Within each group of experts, there were both similarities and differences in the themes of the descriptive feedback across endorsed and non-endorsed elements. The greatest similarity for the PEs was the large number of comments about item language. Many of their comments on both endorsed and non-endorsed elements focused on words or terms that they felt need to be more clearly defined. They also had suggestions for re-wording items to make the language simpler or clearer.

Other themes in the PEs' descriptive feedback only appeared in the comments on either endorsed or non-endorsed elements, but not both. For example, some elements that were endorsed by the PEs as a group were nevertheless flagged by individual PEs as having limited relevance to some individuals within the target population (e.g., the question about affordability of housing was not seen as relevant to people living on the street or in shelters, while the question about cleanliness of bathing facilities was not seen as relevant to individuals who have access to private facilities).
The PEs did not express concerns about limited relevance for any of the elements that they did not endorse. Another theme that only appeared in the PEs' comments on endorsed elements was that lack of knowledge or insight can limit respondents' ability to provide accurate impact ratings. This was perceived as reducing the relevance of some elements (e.g., the relevance of chronic illnesses as an aspect of health is affected if respondents cannot remember whether they have any chronic illnesses or conditions).

There was also some overlap in the EEs' descriptive feedback for endorsed and non-endorsed elements. Similarities included that some elements/items do not apply to most or all respondents (e.g., asking about quality of sleep or feeling safe in your neighbourhood were not seen as universally relevant, mainly because they did not matter much to an EE personally), and that some items have obvious answers (e.g., it is not necessary to ask about the cleanliness of bathing facilities as these will always be dirty, nor is it informative to ask about the neighbourhood because everyone already knows about the neighbourhood [Vancouver's Downtown Eastside]). Another similarity between both endorsed and non-endorsed elements was that some EEs felt that certain elements were less relevant because people always have choices. These aspects were primarily related to food: getting nutritious food, good quality food, and food that you like.

One theme in the EEs' comments that was unique to elements they had endorsed was that some items are unclear because respondents will not have the information they need to respond. This applied to the impact of taking medication and the impact of following a special diet: because of the time lag between when someone starts the medication or diet and when the effects are felt, it would be hard for respondents to provide an accurate impact rating.

Endorsed elements: Administration elements (Practical Experts only).
As noted previously, the PEs endorsed all of the administration elements (such as instructions to interviewers and scoring instructions) in both sections of the QoLHHI. They offered some suggestions for improving these elements and making the administration of the QoLHHI easier, such as combining some items, adding a completed example of the scoring to the QoLHHI Administration and Scoring Manual, and making minor changes to the administration guidelines contained in the manual.

'Impact' versus 'Effect'. When administering the QoLHHI, interviewers may choose to use either the word 'impact' or the word 'effect' for the impact items. The instructions to the interviewer recommend starting with one word, but switching to the other if it appears that a respondent is having trouble understanding the meaning. Both PEs and EEs were asked which word they felt is more understandable, and why.

Seven PEs chose 'effect' and one PE chose 'impact'. Most PEs indicated that, in their experience, respondents did not understand the word 'impact' as well as they did the word 'effect', perhaps because 'effect' is more common in everyday language. However, one PE commented that using 'impact' seems to draw out more concrete examples from respondents, and another noted that word preference among respondents was very individual.

Among the EEs, five felt that the word 'effect' is more understandable, seven felt that 'impact' is more understandable, two felt that the two words are equally understandable, one felt that neither word is understandable, and one did not provide a response. Comments on 'impact' included that it is more cut and dried, more final, more direct, more immediate, easier to quantify, sounds very clinical and hard, has a more negative connotation, and that although it might be the correct word, it is unpleasant to hear. 'Effect'
is a softer and friendlier word, sounds broader and more long-term, is harder to respond to, and is more about how people feel than about facts.

Suggestions for topics to add to the QoLHHI. The PEs' and EEs' suggestions for topics to add to the QoLHHI are listed in Table 4.9 (see p. 157). There was little overlap in the topics suggested by the two groups. In some cases, the suggested topics are relevant to SQoL, but perhaps not to the sections of the QoLHHI included in this study (e.g., relationships may not fit well under the topic of health or living conditions, but are relevant to SQoL and are addressed in a separate section of the QoLHHI). In other cases, the topics are already covered to some extent by the existing content in the section, though a more specific item could be created (e.g., crime rates/crime prevention in the neighbourhood could be seen to fall under 'feeling safe in your neighbourhood', but feeling safe is quite broad and so a question specifically about crime could be added). And finally, there were topics that are not currently included in the QoLHHI and that the authors should consider adding, such as personal safety, pest infestations, dental/oral health, and questions about having choices (for example, about where you live).

Discussion

Validity implications for the QoLHHI. The findings from this study indicate that the content of the Health and Living Conditions Impact sections is, on the whole, relevant, clear, helpful, etc. The PEs and EEs who took part in this content validation study endorsed the majority of the elements from these sections. The PEs endorsed 91% of the elements that they rated, while the EEs endorsed 80%. Forty-eight percent of the ratings across both groups were Perfect CVIs; that is, elements that had received the highest possible rating from all content experts in a group.
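As a rough illustration of the endorsement statistics reported here: an item-level Content Validity Index (I-CVI) is commonly computed as the proportion of experts who rate an element in the top categories of the rating scale, and a "Perfect CVI" corresponds to every expert giving the highest possible rating. The sketch below assumes a 4-point scale, a rating of 3 or 4 counting as endorsement, and a 0.78 cut-off; these follow a common convention in the content validation literature and are not necessarily the exact criteria used in this study, and the ratings shown are hypothetical.

```python
# Illustrative I-CVI sketch. The 4-point scale, the >= 3 endorsement rule,
# and the 0.78 threshold are common conventions, not this study's criteria.

def item_cvi(ratings):
    """Proportion of experts giving an item a rating of 3 or 4."""
    return sum(1 for r in ratings if r >= 3) / len(ratings)

def is_perfect_cvi(ratings, max_rating=4):
    """'Perfect CVI': every expert gave the highest possible rating."""
    return all(r == max_rating for r in ratings)

# Hypothetical relevance ratings from eight experts for one element
ratings = [4, 4, 3, 4, 2, 4, 3, 4]
print(item_cvi(ratings))        # 0.875 -> endorsed at a 0.78 cut-off
print(is_perfect_cvi(ratings))  # False (one expert rated below 4)
```

Computing the percentage of elements whose I-CVI clears the cut-off, separately per expert group, would yield summary figures analogous to the 91% and 80% endorsement rates reported above.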
Despite these findings, it is clear that making some revisions to the QoLHHI content may increase the likelihood that its use will lead to valid score interpretations. Not all of the elements were endorsed by the content experts. Interestingly, there was little agreement between the two groups of experts on their non-endorsement of elements. The EEs were generally more critical than the PEs when it came to relevance, while the PEs had more concerns than the EEs about item clarity. Other than the skip patterns for the six impact items from the Health Impact section, there were no non-endorsed elements in common between the two groups. In fact, in some cases, the two groups had opposing views. For example, the EEs endorsed the relevance of impact ratings and also endorsed the Impact Response Scale both for how easy it is to understand and to use, but did not feel that the Impact Response Card was helpful. The PEs' ratings were exactly the opposite in terms of endorsement and non-endorsement of the impact elements.

Turning to the descriptive feedback, most of the PEs' concerns about the clarity of non-endorsed elements can be addressed by more clearly defining what certain words or terms mean, adding examples to illustrate their meaning, or simplifying item language. The lack of endorsement for rating impact is more of a concern, as impact is a fundamental component of the Impact sections. However, although certain of the PEs' comments do suggest that the concept of impact itself is problematic (e.g., it is not intuitive for respondents), many of the comments focus more on the way in which impact is presented (i.e., issues of clarity rather than relevance). Rather than abandoning the concept of impact, the QoLHHI authors should look for ways to make it more understandable. Some of the PEs' suggestions, such as using very concrete examples of impact or breaking the rating process down into multiple steps, could be helpful in this regard.
The fact that almost all of the EEs' non-endorsements were for relevance is a concern; relevance is, in many ways, a more important consideration than clarity, because lack of clarity can generally be fixed, but it is harder to 'improve' relevance. However, a closer look at some of the reasons the EEs gave for lower relevance ratings shows that these ratings must be evaluated carefully. For example, some of the comments suggest that certain ratings were based on a fairly narrow perspective. Sometimes this perspective was limited by very personal opinions on the part of the EE, while, in other cases, it reflected the fact that the EEs were all living in one city (Vancouver, Canada) and, for the most part, in one particular neighbourhood. This neighbourhood, known as the Downtown Eastside, has a very high concentration of services and resources for individuals who are HVH, a large percentage of the city's poorest housing stock, and high rates of crime and substance use. These factors were evident in many EEs' comments, such as 'all bathing facilities are unsafe', 'there are always bad influences in the neighbourhood', and 'food can always be found for free'. Although these comments accurately reflect the reality for many individuals in the Downtown Eastside, they will not necessarily apply to HVH individuals living in other areas or other cities (or even to all residents of the Downtown Eastside). This does not mean that the EEs' feedback is not relevant, but it does mean that the applicability of some of their ratings, comments, and suggestions to the wider target population of individuals who are HVH should be considered before changes are made to the QoLHHI content.

It is also worth noting that nine of the items that were not endorsed for relevance by the EEs were open-ended items from the Living Conditions Impact section.
The responses to these items are purely descriptive, not quantitative, and therefore are not included in any QoLHHI score calculations. The relevance of these items is therefore arguably less important to validity, since validity is about the 'interpretation of test scores' (AERA et al., 1999, p. 9, emphasis added).

There were comments from both the PEs and EEs that suggest that the purpose of the QoLHHI (i.e., to measure SQoL) is not always clear. For example, there were experts in both groups who were concerned that some ratings would be too subjective, but this is not actually a problem when the intention is in fact to measure subjective quality of life. The PEs' and EEs' comments to the effect that the topic of marijuana use is not relevant because the impact of use will only be positive also suggest a lack of clarity around the goal of measuring SQoL and the use of impact ratings in this measurement. The intent of the QoLHHI authors was never to measure only negative impact; positive impact is also assumed to affect SQoL, though of course by enhancing rather than lowering it. Finally, the fact that certain elements were perceived as less relevant because they will only apply to some respondents suggests that the target population for the QoLHHI (i.e., a broad and diverse population) needs some clarification. In particular, anyone administering the QoLHHI must have clear instructions for dealing with situations where an element or item is deemed 'not applicable' to a particular respondent.

Based on the descriptive feedback from both groups of experts on both endorsed and non-endorsed elements, the QoLHHI authors should consider the following general revisions to the QoLHHI content:

- Revise the way in which impact is presented. In particular, add concrete examples, and consider breaking the task of rating impact into several steps. The question of how to address mixed impact should also be reviewed and clear instructions provided for interviewers
- Define certain terms and words more clearly and make changes to the language of some items. This includes clarifying the timeframes (e.g., 'lately') used in several items
- Clarify the purpose of the QoLHHI (i.e., to measure subjective quality of life by capturing negative and also positive impact)
- Ensure that interviewers have clear instructions for what to do or how to respond when an item or element is not relevant to a particular respondent
- Review items/elements that were identified as focusing on the wrong topic (e.g., on amenities instead of securing housing, on the presence of resources in the community instead of access to resources) and consider whether the original items/elements should be revised or even removed
- Review the suggestions for topics to add to the QoLHHI and incorporate these into existing items, or add new items as appropriate
- The current skip pattern for the six Health Impact section items (which directs interviewers to skip an impact item under certain conditions) was not endorsed. However, the reasons for non-endorsement varied so much across items and groups of experts that it is difficult to make a definitive recommendation regarding changes to these skip patterns. An additional consideration in favour of keeping the current skip patterns is that some respondents might find it abstract (and therefore challenging) to rate the impact of something they have never experienced. Although this possibility was not raised by the PEs or EEs, it was noted by the SMEs who took part in a separate content validation study of the QoLHHI, and should be kept in mind when deciding what to do about these skip patterns
- There was no clear preference regarding the words 'impact' and 'effect'. Although the PEs preferred 'effect', and were basing their decision on their experiences with individuals who are HVH, the EEs were clearly divided. At this point, the best approach would be to keep both words

Implications for content validation.
This study provides two important lessons for content validation. The first is that there is valuable information to be gained from including more than one kind of content expert in the validation process. The second is that the experts' descriptive data can be a particularly rich source of information about an instrument's content. Indeed, for some purposes, descriptive feedback is even more useful than quantitative data.

The value of multiple groups of experts. The PEs and EEs in the present study made quite different assessments of the QoLHHI content. With regards to their quantitative assessments, there was very little agreement between the two groups on the non-endorsement of content elements. With the exception of the skip patterns from the Health Impact section, if the PEs as a group did not endorse an element, the EEs did, and vice versa. In addition, most of the PEs' non-endorsements were for item or element clarity, while most of the EEs' non-endorsements were for item or element relevance. Nevertheless, based on the quantitative data alone, the two groups appear mostly to have been in agreement, in that they endorsed the majority of the QoLHHI content elements. It is in the descriptive data that the differences between the groups become fully apparent. While there was some overlap in both the general themes and in the specific content of their descriptive data, for the most part the two groups of experts raised different points and made different suggestions about the content elements. The resulting feedback is far richer than would have been produced by a single type of expert. The inclusion of individuals who are HVH as EEs did, however, require some adjustments to the study procedures, and created some challenges that were not present with the PEs (or with the SMEs who participated in a separate content validation study).
Some of the language and procedures used for the study were adapted for the EEs, in recognition of the fact that the population of individuals who are HVH includes individuals with low literacy levels and varying levels of cognitive ability. For example, while the PEs were not provided with a definition or explanation for 'relevance', the idea of content relevance was explained at some length to the EEs. This was due to concerns that the concept of relevance would otherwise be too abstract for some EEs to understand. To explain 'relevance', the EEs were told to imagine that they were conducting an interview to find out about how someone's health or living conditions might affect their QoL. They were then asked to think about whether it would be important to ask about each of the aspects of health and living conditions in order to meet this goal. To make the results of the quantitative analyses comparable across groups of content experts, however, the same response scale had to be used for all three groups. Thus, although the concept of relevance was described to the EEs using different words than for the PEs (and also the SMEs), the response scale (which used the word 'relevance') was identical for all three groups.

Data were collected from the EEs in one-on-one interview sessions. This differed from the online data collection method used with the PEs. The change in procedures was driven by several considerations. Requiring EEs to have internet access in order to participate in the study would have excluded a large number of potential participants. A self-administered paper-and-pencil survey would not have been ideal either, as many individuals who are HVH do not have facilities for storing personal effects, and so it might have been difficult for some EEs to hold on to a paper survey long enough to return it to the researchers.
A self-administered survey, whether online or in a paper-and-pencil format, might also have been challenging for participants with lower literacy levels. Finally, the interview format allowed the interviewer to gauge EEs' understanding of the study tasks and procedures and provide clarification if necessary. This proved to be very helpful, as in fact there were several instances where it was obvious that an EE was having difficulty with the study tasks. In some cases, this problem was solved through additional explanation and clarification on the part of the interviewer. In other cases, it became clear that even though an EE was providing data, these data were unreliable and needed to be removed from the analyses. Without the in-person interaction between the interviewer and the EEs, it would not have been possible to identify these cases. Nevertheless, even many of the EEs who struggled with some of the study tasks were helpful in highlighting problematic QoLHHI content, in part because the interview format of the data collection sessions allowed the interviewer to probe and follow up on the EEs' ratings and comments.

The value of descriptive feedback. Quantitative ratings provide a tidy summary of experts' assessments of an instrument's content. However, as the present study shows, it is through descriptive data that experts can share their knowledge in depth and in a way that will most serve to further the goals of content validation. Even if that goal is simply to describe the content-based evidence for an instrument and what this evidence says about the impact that the content might have on validity, the descriptive feedback from experts will help to explain in detail why an element may be problematic (or, for that matter, effective). If the goal is also to revise the instrument content in order to improve it, the descriptive feedback can inform this process.
Descriptive feedback can also identify instances where quantitative ratings should be interpreted with caution, as was the case in the present study when several content experts gave 'marijuana use' low ratings for relevance. As previously described, an analysis of their comments showed that the low ratings were due to a misunderstanding of the idea of impact. In fact, it is not that the topic of marijuana use should be removed from the QoLHHI, but that the importance of positive impact in measuring SQoL must be made explicitly clear.

An interesting finding in the present study was that, although there were differences in the feedback between endorsed and non-endorsed elements, from a content validation perspective, these differences were not significant. Although there were themes that appeared in the descriptive feedback on endorsed elements that were not present in the feedback on non-endorsed elements, this can be explained by the fact that the proportion of endorsed elements was much higher. Individual experts provided comments on, and suggestions for revisions to, many of the endorsed elements. This meant that potential threats to validity (i.e., lack of relevance, lack of representativeness, or poor item quality) were identified for both endorsed and non-endorsed elements.

This last point highlights the importance of considering all of the descriptive feedback collected in a content validation study. With quantitative feedback, the focus tends to be on identifying the view of the majority of experts, but this is less meaningful with descriptive feedback. Instead, the researchers conducting a content validation study should treat the participants as individual experts. Drawing on different 'types' of experts will help to ensure a broader range of perspectives, and looking for patterns in the descriptive feedback may help to identify strengths and weaknesses in the instrument's content.
But in the end, each comment and suggestion has the potential to provide valuable information, regardless of whether it agrees with other comments and suggestions, and regardless of whether it is attached to an endorsed or non-endorsed element. Indeed, it may be the dissenting voice within a group that helps to build a more complete picture of how the instrument content might affect validity.  The above argument in favour of the importance of descriptive feedback should not be taken to mean that every suggestion provided by a content expert must be followed. Neither should the quantitative results from a content validation study be treated as binding. Content experts may make suggestions that would ultimately be detrimental to an instrument, for example, if the resulting changes would negatively affect its psychometric properties. In the end, it is the researcher or scale developer who must decide how to put the information provided by content experts to use (DeVellis, 1991) Based on the results from the present study, the following general points should be considered when undertaking a content validation study using content experts:  129  ? Quantitative data can provide a useful summary of experts? assessments of instrument content. However, descriptive data should be given as much, if not more, attention as quantitative data  ? At the study design stage, great care should be taken in developing the questions to solicit descriptive feedback. These questions should be designed to make the most possible use of the content experts? relevant knowledge and experience ? At the data collection stage, the importance of descriptive feedback needs to be stressed to the study participants. Even though it is more work for experts to provide detailed comments and suggestions, the value of these and the fact that they are not secondary to any quantitative data should be emphasized ? 
At the data analysis stage, it is important to take the time to conduct a thorough analysis of the descriptive data. These data are, in many ways, more difficult to work with than quantitative data: they require a great deal of thought and judgment, and the ability to reconcile contradictory opinions and suggestions. It is not possible to attach clear outcomes (e.g., endorsed/not endorsed) to the analysis of descriptive data in the way that it is with quantitative data. However, it is by considering the descriptive feedback in its entirety that the value of content experts will be fully realized.

Study Limitations and Directions for Future Research

There were several limitations to the present study, some related to the design of the research and some related to sampling issues. At the research design level, the collection of descriptive data was planned around the collection of quantitative data. Specifically, the content experts were asked to explain their rating for an element if this rating was less than "completely relevant" or less than "very" clear, helpful, etc. This meant that the descriptive data were driven by the quantitative data. As a result, the experts' comments and suggestions were often repetitive. For example, if an expert felt that a particular word was unclear, this observation was often repeated for any subsequent items that used that word. Tying the comments and suggestions to the quantitative ratings may also have limited the scope of the descriptive data. As this study showed, the usefulness of descriptive feedback extends far beyond simply serving as a supplement to quantitative feedback. The question of how best to collect descriptive data in order to fully capitalize on the value of this aspect of expert feedback should be addressed through future methodological work on content validation, and then applied to any future content validation of the QoLHHI.
At the sampling level, there were limitations to both the EE and PE samples. The EE sample was composed of individuals who had been homeless for a considerable length of time, none of whom were caring for dependent children, and who were, for the most part, living in a single neighbourhood (Vancouver's Downtown Eastside) that has a very distinct character. The EEs were also predominantly male. The sample was selected in part because the face-to-face data collection method for the EEs made it necessary to draw on individuals in the dissertation author's own city; in addition, these individuals were accessible through the QoLHHI authors' professional contacts. However, as a result of these restrictions, the sample did not reflect the full range of individuals included in the QoLHHI's target population. Future content validation of the QoLHHI should include a broader range of EEs from the HVH population, including homeless families, women, and those living in a greater diversity of geographic locations and neighbourhood types.

Including individuals who were HVH as EEs in the content validation of the QoLHHI also had an unanticipated effect in that it necessitated balancing representativeness with effectiveness in selecting the EEs. The EEs who proved most able to understand and carry out the study tasks were those with the highest levels of cognitive functioning. These individuals were not necessarily the most representative of the broader population of individuals who are HVH, which includes individuals with severe mental health issues and cognitive impairments. The more effective EEs also tended to be those with more years of education and, consequently, may not have been the best judges of item clarity and wording for the broader target population.

One of the criteria for the definition of PEs for this study was that they had to have administered the QoLHHI for a research study.
However, the QoLHHI has other potential applications, such as program evaluation or serving as an aid to service provision. These applications may not have been represented in the feedback from the PEs in this study. It should be noted, though, that PEs were included in the content validation of the QoLHHI for two main reasons: first, it was believed that they could evaluate the content based not only on their own perspectives, but also on the feedback they had received from the HVH individuals to whom they had administered the QoLHHI. Second, it was felt that the PEs would be particularly helpful in evaluating the administration elements of the QoLHHI (e.g., administration and scoring instructions). It seems unlikely that the feedback from PEs in these areas would differ significantly between research and non-research settings, and so the extent to which the PE sample in the present study posed a limitation is unclear. Nevertheless, future content validation of the QoLHHI might benefit from the inclusion of a broader range of PEs.

Conclusion

This study had two main goals. One was to collect and evaluate validity evidence based on content for the Health and Living Conditions Impact sections of the QoLHHI from two groups of content experts. A second goal was to evaluate the contribution of different types of data to the process of content validation. Overall, the content-based evidence from this study suggests that these QoLHHI sections reflect the construct of SQoL for individuals who are HVH, and do so through content that is relevant and well-constructed. Although some revisions to the instrument are recommended, these are for the most part straightforward and will be easy to implement. The most important revision will be to change the way in which the concept of impact is presented.

One of the strengths of this study is that it included two groups of content experts, including one unusual type of expert, namely the PEs.
A comparison of the data from these two groups of experts revealed that, while there were some similarities in their assessments of the QoLHHI content, there were also many differences. This underscores the value of including different types of experts in the content validation process.

This study also included an in-depth analysis of the descriptive feedback provided by the content experts. A careful review of the comments and suggestions allowed for a deeper understanding of the experts' quantitative ratings, and also provided valuable information about the QoLHHI content that went beyond the questions addressed by the quantitative data. Content validation studies using content experts have traditionally focused on quantitative data, with descriptive data treated as a secondary, sometimes even optional, source of data. This study suggests this approach needs to be rethought and descriptive data given a much more central role in content validation.

Table 4.1: Content Elements Rated by Practical Experts and Experiential Experts

Element | Rating scale used | Rated by Practical Experts | Rated by Experiential Experts
Concept of impact (measuring the impact of various life areas on the respondent) | Relevance (a) | Yes | Yes
Impact response scale | Easy to understand (b); Easy to use (c) | Yes; Yes | Yes; Yes
Impact response card | Helpfulness (d) | Yes | Yes
Aspects of Health (e.g., physical health, stress, substance use) | Relevance | Yes | Yes
Health Impact section items | Clarity (e) | Yes | Yes
Aspects of Living Conditions (i.e., place where you live or stay, neighbourhood, food) | Relevance | Yes | Yes
Living Conditions Impact section items | Relevance; Clarity | Yes; Yes | Yes; Yes
Skip instructions for health items | Clarity | Yes | No
Administration instructions - Living Conditions Impact form, Health Impact form | Clarity; Helpfulness | Yes; Yes | No; No
Yes/No response scale | Easy to understand; Easy to use | Yes; Yes | No; No
Section-specific administration guidelines (Health & Living Conditions) - QoLHHI Manual | Clarity; Helpfulness | Yes; Yes | No; No
General administration guidelines for QoLHHI - QoLHHI Manual | Clarity; Helpfulness | Yes; Yes | No; No
General administration guidelines for Impact Sections - QoLHHI Manual | Clarity; Helpfulness | Yes; Yes | No; No
Scoring instructions (Health & Living Conditions) | Easy to follow (f) | Yes; Yes | No; No

Note: (a) 4-point scale with response options "Not at all relevant", "Somewhat relevant", "Mostly relevant", and "Completely relevant". (b) 4-point scale with response options "Not at all easy to understand", "Somewhat easy to understand", "Mostly easy to understand", and "Very easy to understand". (c) 4-point scale with response options "Not at all easy to use", "Somewhat easy to use", "Mostly easy to use", and "Very easy to use". (d) 4-point scale with response options "Not at all helpful", "Somewhat helpful", "Mostly helpful", and "Very helpful". (e) 4-point scale with response options "Not at all clear", "Somewhat clear", "Mostly clear", and "Very clear". (f) 4-point scale with response options "Not at all easy to follow", "Somewhat easy to follow", "Mostly easy to follow", and "Very easy to follow".

Table 4.2: Content Validity Indices (CVI) for Impact Elements

Element | Practical Experts CVI | Experiential Experts CVI | PE/EE disagreement on endorsement (a)
How relevant do you think impact/effect ratings are to measuring quality of life for individuals who are homeless or vulnerably housed? | 0.75 | 0.88 | X
How easy to understand is the impact response scale for the impact questions? | 0.50 | 0.94 | X
How easy to use (apply) is the impact response scale for the impact questions? | 0.75 | 0.93 | X
How helpful is the QoLHHI Impact Response Card? | 0.88 | 0.69 | X

Note: CVIs for elements that were not endorsed are in bold font.
(a) An X indicates disagreement.

Table 4.3: Content Validity Indices (CVI) for the Relevance of Aspects of Health

Element | Practical Experts CVI | Experiential Experts CVI | PE/EE disagreement on endorsement (a)
Current level of physical health | 1.00 | 1.00 |
Current level of mental or emotional health | 1.00 | 0.86 |
Current level of physical activity or exercise | 0.86 | 0.57 | X
Quality of sleep | 1.00 | 1.00 |
Current level of stress | Perfect | 1.00 |
Physical pain | Perfect | 0.75 | X
Emotional pain | 0.86 | 1.00 |
Using alcohol | 1.00 | 0.63 | X
Using pot/marijuana | 0.86 | 0.63 | X
Using street drugs | 1.00 | 1.00 |
Chronic illnesses or conditions | 1.00 | 1.00 |
Special (medically recommended) diet | 0.86 | 0.75 | X
Prescription medication | Perfect | 0.63 | X

Note: CVIs for elements that were not endorsed are in bold font. (a) An X indicates disagreement.

Table 4.4: Content Validity Indices (CVI) for the Relevance of Aspects of Living Conditions

Element | Practical Experts CVI | Experiential Experts CVI | PE/EE disagreement on endorsement (a)
Aspect of Living Conditions: Place where you live or stay | Perfect | 1.00 |
Aspect of Living Conditions: Neighbourhood | Perfect | 0.88 |
Aspect of Living Conditions: Food | 1.00 | 0.63 | X

Place where you live or stay
Affordability | 0.86 | 1.00 |
Amenities | 1.00 | 0.88 |
Access to bathing facilities | Perfect | Perfect |
Cleanliness of bathing facilities | 1.00 | 0.88 |
Safety of bathing facilities | 1.00 | 0.75 | X
Overall cleanliness | Perfect | 1.00 |
Feeling of control over your own space | 1.00 | 1.00 |
Disruptiveness of others | Perfect | 1.00 |
Privacy | Perfect | 1.00 |
Restrictions | Perfect | 0.88 |
Worries about catching illnesses from others | Perfect | 0.75 | X
Security of possessions | Perfect | 0.88 |
Treatment by others | Perfect | 1.00 |
Feeling of home | Perfect | 1.00 |
Worst thing about the place where you live or stay | 0.86 | 0.63 | X
Best thing about the place where you live or stay | 1.00 | 0.75 | X
Anything else you want to say about the place where you live or stay | 0.86 | 0.63 | X

Neighbourhood
Feeling safe in your neighbourhood | Perfect | 1.00 |
Reasons for feeling safe/unsafe | Perfect | 1.00 |
Feeling safe at night versus during the day | Perfect | 0.88 |
Reasons for feeling safe/unsafe at night versus during the day | Perfect | 0.88 |
Feeling like part of the community in your neighbourhood | 0.86 | 0.88 |
Feeling stuck in your neighbourhood | Perfect | 0.75 | X
Bad influences in your neighbourhood | 0.86 | 0.63 | X
Resources in your neighbourhood | 1.00 | 0.88 |
Worst thing about your neighbourhood | 0.86 | 0.71 | X
Best thing about your neighbourhood | 0.86 | 0.71 | X
Anything else you want to say about your neighbourhood | 1.00 | 0.57 | X

Food
Ability to get food that you like | 0.86 | 0.88 |
Is food nutritious | 1.00 | 0.88 |
Quality of food | 1.00 | 1.00 |
Feeling stuck eating same thing every day | Perfect | 0.75 | X
Able to get enough to eat | Perfect | 0.75 | X
Worst thing about food | Perfect | 0.63 | X
Best thing about food | Perfect | 0.63 | X
Anything else you want to say about food | 1.00 | 0.75 | X

Note: CVIs for elements that were not endorsed are in bold font. (a) An X indicates disagreement.

Table 4.5: Content Validity Indices (CVI) for the Clarity of the Health Impact Section Items

Element | Practical Experts CVI | Experiential Experts CVI | PE/EE disagreement on endorsement (a)
Now I want to know about the kind of impact/effect that different aspects of your health have on you. You could tell me, for example, that your physical health has no impact on you at all. Or you could say that it has a positive impact/effect and makes things better for you. Or, maybe it has a negative impact/effect and makes things worse for you | 0.71 | 1.00 | X
I'd like you to rate the kind of impact/effect that your current level of physical health has on you | 0.86 | Perfect |
I'd like you to rate the kind of impact/effect that your current level of mental or emotional health has on you | 1.00 | 0.88 |
I'd like you to rate the kind of impact/effect that your current level of physical activity or exercise has on you | 0.71 | Perfect | X
I'd like you to rate the kind of impact/effect that the quality of sleep that you've been getting lately has on you | 1.00 | 0.88 |
Would you describe your current level of stress as low, medium, or high? | 1.00 | 0.88 |
Given your (low/medium/high) stress level, I'd like you to rate the kind of impact/effect that this has on you | 0.57 | 0.88 | X
Have you been experiencing physical pain lately? | 1.00 | Perfect |
I'd like you to rate the kind of impact/effect that (having/no longer having) physical pain has on you | Perfect | 1.00 |
Have you been experiencing emotional pain lately? | 0.71 | Perfect | X
I'd like you to rate the kind of impact/effect that (having/no longer having) emotional pain has on you | Perfect | 1.00 |
Do you currently drink alcohol? | 1.00 | Perfect |
I'd like you to rate the kind of impact/effect that (drinking/no longer drinking) has on you | Perfect | 1.00 |
Do you currently use pot (marijuana)? | 1.00 | Perfect |
I'd like you to rate the kind of impact/effect that (using/no longer using) pot has on you | Perfect | 1.00 |
Do you currently use other street drugs, such as cocaine, heroin, or crystal meth for example? | 0.86 | Perfect |
I'd like you to rate the kind of impact/effect that (using/no longer using) street drugs has on you | Perfect | 1.00 |
Do you have one or more chronic illnesses or conditions (for example: diabetes, allergies, a disability, hepatitis)? | 1.00 | 0.88 |
I'd like you to rate the kind of impact/effect that this has on you | 1.00 | Perfect |
Are you supposed to follow a special diet because of a health condition? | Perfect | Perfect |
Are you following this special diet? | Perfect | Perfect |
I'd like you to rate the kind of impact/effect that following this diet has on you | Perfect | 1.00 |
I'd like you to rate the kind of impact/effect that not following this diet has on you | Perfect | 1.00 |
If you are NOT following or only partially following this special diet, why not? | 0.86 | 1.00 |
...because the food you need for this diet is too expensive | Perfect | Perfect |
...because it's too difficult for you to get the food you need for this diet | Perfect | Perfect |
...because you don't have any way to prepare or store the food you need for this diet | Perfect | Perfect |
...because you are not willing to give up certain foods as part of this diet (for example: salt, red meat, sweets) | Perfect | Perfect |
...Other | Perfect | 1.00 |
Are you currently supposed to be taking medication that was prescribed by a doctor? | 0.86 | Perfect |
Are you taking this medication? | Perfect | Perfect |
I'd like you to rate the kind of impact/effect that taking this medication has on you | Perfect | 1.00 |
I'd like you to rate the kind of impact/effect that not taking this medication has on you | Perfect | 1.00 |
If you are NOT taking the medication prescribed to you, why not? | 1.00 | Perfect |
...because the medication is too expensive | Perfect | Perfect |
...because it's too difficult for you to store the medication | Perfect | Perfect |
...because you're not able to take the medication as recommended (for example: with food, 3 times a day) | Perfect | Perfect |
...because you don't like the side effects | Perfect | Perfect |
...because you don't believe in taking medication | Perfect | Perfect |
...Other | Perfect | 1.00 |

Note: CVIs for elements that were not endorsed are in bold font. (a) An X indicates disagreement.

Table 4.6: Content Validity Indices (CVI) for the Clarity of the Living Conditions Impact Section Items

Element | Practical Experts CVI | Experiential Experts CVI

Place where you live or stay
I'd like to know about the place where you currently live or stay | 1.00 | Perfect
Do you feel that the place where you live or stay is affordable? | 1.00 | Perfect
Does the place where you live or stay have the amenities that are important to you (like a fridge, stove, own bathroom, elevator)? | 0.86 | Perfect
Do you have access to bathing facilities (such as a shower)? | Perfect | Perfect
Do you feel that these bathing facilities are clean enough to use? | Perfect | Perfect
Do you feel safe using these bathing facilities? | Perfect | Perfect
Overall, do you feel that the place where you live or stay is clean enough? | 1.00 | Perfect
Do you feel like you have control over your own space? | Perfect | 1.00
Are the other people living or staying there too disruptive? | 1.00 | Perfect
Do you have enough privacy there? | Perfect | Perfect
Do you feel there are too many restrictions placed on you there? | Perfect | Perfect
Are you always worrying that you'll catch some illness from other people living there? | 1.00 | Perfect
Do you feel your stuff is safe there? | Perfect | Perfect
Do you feel that you're treated well there (for example: by landlord, shelter staff, other residents)? | 1.00 | Perfect
Does it feel like a home to you? | Perfect | Perfect
What is the worst thing about the place where you currently live or stay? | Perfect | Perfect
What is the best thing about the place where you currently live or stay? | Perfect | Perfect
Anything else you want to tell me about the place where you live or stay? | Perfect | Perfect
You've talked about some things that describe the place where you currently live or stay. Now I want to know what kind of impact/effect that the place where you live or stay has on you. You could tell me that the place where you live or stay has no impact/effect on you at all. Or you could say that it has a positive impact/effect and makes things better for you. Or, maybe it has a negative impact/effect and makes things worse for you. I'd like you to rate the impact/effect that the place where you currently live or stay has on you | 0.86 | 0.88

Neighbourhood
Now I have some questions about your neighbourhood. By "neighbourhood", I mean the neighbourhood of the place where you are currently living or staying, even if you haven't been there very long | Perfect | Perfect
Do you feel safe in your neighbourhood? | Perfect | Perfect
Why is that? | 0.86 | Perfect
Do you feel differently about safety in your neighbourhood at night than during the day? | 1.00 | Perfect
Why is that? | 1.00 | Perfect
Do you feel that you're part of the community in your neighbourhood? | Perfect | Perfect
Do you feel stuck in your neighbourhood? | 1.00 | 1.00
Do you feel that there are a lot of bad influences there for you (for example: too many drugs, too much crime)? | 1.00 | Perfect
Do you think that there are enough resources there? (for example: food bank, health care, support workers) | Perfect | Perfect
What is the worst thing about your neighbourhood? | Perfect | Perfect
What is the best thing about your neighbourhood? | Perfect | Perfect
Anything else you want to tell me about your neighbourhood? | Perfect | Perfect
You've talked about some things that describe your neighbourhood. Now I want to know what kind of impact/effect that your neighbourhood has on you | 1.00 | 0.88

Food
Now I have some questions about the food you eat | Perfect | Perfect
Are you usually able to get food that you like? | 1.00 | Perfect
Would you say that the food you eat is nutritious? | Perfect | Perfect
Are you usually able to get good quality food? | Perfect | Perfect
Do you find that you get stuck eating the same thing almost every day? | Perfect | Perfect
Do you have trouble getting enough to eat? | Perfect | Perfect
What is the worst thing about the food you eat? | 1.00 | Perfect
What is the best thing about the food you eat? | 1.00 | Perfect
Anything else you want to tell me about the food you eat? | Perfect | Perfect
You've talked about some things that describe the food you eat. Now I'd like you to rate the impact/effect that the food you eat has on you | 0.86 | 0.88

Note. There was no disagreement on endorsement between PEs and EEs on any element in this table.

Table 4.7: Content Validity Indices (CVI) for Skip Patterns in the Health Impact Section

Element | Practical Experts CVI | Experiential Experts CVI
Should the impact question for physical pain be skipped? | 0.57 | 0.43
Should the impact question for emotional pain be skipped? | 0.57 | 0.25
Should the impact question for using alcohol be skipped? | 0.57 | 0.50
Should the impact question for using pot/marijuana be skipped? | 0.57 | 0.63
Should the impact question for using street drugs be skipped? | 0.57 | 0.25
Should the impact question for chronic illnesses or conditions be skipped? | 0.71 | 0.57

Note. There was no disagreement on endorsement between PEs and EEs on any element in this table. CVIs for elements that were not endorsed are in bold font.
Table 4.8: Content Validity Indices (CVI) for the Administration Elements of the QoLHHI (Practical Experts only)

Element | CVI

Health Skip Instructions
Clarity of instructions for the skip pattern for item about experiencing physical pain | Perfect
Clarity of instructions for the skip pattern for item about experiencing emotional pain | Perfect
Clarity of instructions for the skip pattern for item about using alcohol | Perfect
Clarity of instructions for the skip pattern for item about using marijuana | Perfect
Clarity of instructions for the skip pattern for item about using street drugs | Perfect
Clarity of instructions for the skip pattern for item about having a chronic illness or condition | Perfect
Clarity of instructions for the skip pattern for item about whether supposed to be following a special diet | Perfect
Clarity of instructions for the skip pattern for item about whether or not following this special diet | 1.00
Clarity of instructions for the skip pattern for item about whether supposed to be taking medication | Perfect
Clarity of instructions for the skip pattern for item about whether or not taking this medication | 0.86

Administration Guidelines/Instructions
How clear are the section-specific administration guidelines for the Health Impact Section? | Perfect
How helpful are the section-specific administration guidelines for the Health Impact Section? | 0.86
How clear are the section-specific administration guidelines for the Living Conditions Impact Section? | Perfect
How helpful are the section-specific administration guidelines for the Living Conditions Impact Section? | Perfect
How clear is the note (in italics) regarding "Yes/No" responses? (Living Conditions Impact section) | Perfect
How clear are the instructions (in bold capital letters) for when to skip items D and E in the "place where you live or stay" section? (Living Conditions Impact section) | 1.00
How clear is the Exception note (in italics) regarding the respondent's "usual neighbourhood"? (Living Conditions Impact section) | Perfect
How clear are the general administration guidelines for the overall QoLHHI? | 1.00
How helpful are the general administration guidelines for the overall QoLHHI? | 0.88
How clear are the general administration guidelines for the Impact Sections? | Perfect
How helpful are the general administration guidelines for the Impact Sections? | 0.86

Yes/No Scale (Living Conditions Impact section)
How easy to understand is the yes/no response scale for the descriptive questions? | Perfect
How easy to use (apply) is the yes/no response scale for the descriptive questions? | Perfect

Scoring Instructions
How easy to follow are the instructions for calculating and interpreting the basic health impact total score (impacthealth5)? | Perfect
How easy to follow are the instructions for calculating and interpreting the enhanced health impact total score (impacthealthvar)? | 0.86
How easy to follow are the instructions for scoring and using the additional questions on the Health Impact Section? | Perfect
How easy to follow are the instructions for calculating an overall rating of the quality of where one is living or staying (qollive14)? | Perfect
How easy to follow are the instructions for calculating an overall rating of the quality of the neighbourhood (qolneigh5)? | Perfect
How easy to follow are the instructions for calculating an overall rating of the quality of food (qolfood5)? | Perfect
How easy to follow are the instructions for calculating and interpreting the overall impact of living conditions (impactlive3)? | Perfect

Table 4.9: Suggestions for Topics to Add to the Health Impact and Living Conditions Impact Sections of the QoLHHI

Practical Experts' Suggestions and Experiential Experts' Suggestions

Health
- Relationships are an important component of well-being. What is the effect on the respondent of the quality of his/her relationships?
- Thoughts about future health changes
- Nutrition and intake of things like antioxidants & vitamins, general healthiness of diet
- Air quality/quality of environment (mould, toxins in environment, smoke)
- The problem of maintaining a balanced lifestyle while homeless: finding time for exercise, a social life, etc. when you're in a shelter
- Your ability to connect with people, and the basis of those relationships. Are they about getting basic needs met or can you get emotional/social support?
- The psychological effect of hygiene, and also how it affects your ability to move on/out. For example, dental care: having no teeth affects how people see you and how you feel about yourself
- Why people are using street drugs
- Self-care, also yoga, massage, taking care of yourself
- Dental health: oral disease and the impact it can have on life expectancy and health
- Abuse: it's the cause of so much of the mental health and addiction issues
- Whether or not a respondent has shelter

Living Conditions
- How is the respondent getting their food?
- Quality of friendships
- Size of respondent's housing. Many are living in very small spaces and this information often gets lost
- Recreation facilities
- Access to health care
- Health
- Personal safety (especially for women)
- Home support
- Having connections/conversations that are not just about homelessness and drugs
- Comfort level
- How you're treated in the environment
- Staff at shelters
- Neighbours, people in your environment
- Programs where you're living; they have to be compatible with your headspace

Place where you live or stay
- A question about living alone or with others. Many respondents talk about having roommates, while others talk about feeling alone in single apartments
- Do you feel comfortable there?
- Personal safety where the respondent is living or staying (as opposed to safety of possessions)
- Do you feel safe there?
- Do you have a say in the way things are run: rules, regulations (negotiation)?
- Are you happy with the socioeconomic mix in your area ("ghettoization")?
- Are you worried about gentrification?
- Noise and chaos levels
- Some sort of focus on the future
- Bugs, insects, rats, mice, roaches

Neighbourhood
- Do you think there is a sense of community in your neighbourhood?
- Crime rate in your neighbourhood
- Are you comfortable living in your neighbourhood?
- Do you choose to live in this neighbourhood?
- Is there enough crime prevention in your neighbourhood?
- The reputation of the neighbourhood; in particular, does the reputation of the neighbourhood create problems or challenges for you? For example, "Have you ever been denied a job because of your address/where you live?"
- Are there places where people can help out, places to connect with people and talk about something other than drugs (especially relevant to neighbourhoods with fewer resources)
- Housing stock
- Groups of people in the neighbourhood: is there a community? Are there children, parents, workers; is it more than just a concrete jungle? What is missing to make it a community? What is your concept of community?

Food
- Food questions need to be more specific
- Where are respondents getting the food that they eat?
- This whole section needs work. Things like: where is a person eating, how often, what does a typical meal look like, do they cook for themselves, can they store food in their rooms, how long do they wait for a meal, how often have they been sick after eating, are there enough meal programs that offer food that fits their diet, do they get seconds, are they eating snacks, and when was the last time they went grocery shopping?
- Are there food services in your neighbourhood? Access to food
- Food for pets
- Dietary considerations and food for medical conditions
- Do you know where to go to get quality food? There are lots of people in this population with compromised immune systems; do they know where to go?
- With regards to the quantity of food: there is enough food to be had in some ways, but in other ways there is not. For example, people often are not able to get second helpings at meals. In a sense, it's a question of the quality of the quantity. There seems to be a fear of enabling if too much is offered, but people are enabled already

5. Chapter 5: SME, PE and EE Feedback: Why Compare Data from Different Groups of Experts?

Studies that employ a judgemental approach to content validation have traditionally drawn on subject matter experts (SMEs) with academic or professional experience relevant to the construct being measured or the target population. The benefits of including other types of content experts, particularly experiential experts (EEs), in content validation are now also acknowledged. Interestingly, however, reports of content validation studies conducted with both SMEs and EEs often combine the data from all experts, and provide no discussion of any differences (or lack of differences) that might have been found between the groups (see, for example, Clemson, Cumming, & Heard, 2003; Halliday et al., 2012). Reporting only combined data runs the risk of masking any diversity in the experts' assessments, particularly for quantitative analyses. For example, two groups might have completely opposing views on an element, with one group giving the element high ratings and the other group giving it low ratings. If these ratings are combined, the scores from each group will balance each other out, resulting in mid-range scores that do not accurately reflect the views of either group.
Such an approach seems to run counter to the purpose of having multiple groups of experts in the first place, namely to assess instrument content from a variety of perspectives. The findings from multiple (usually two) groups of experts are sometimes reported separately, but this appears to be primarily because the groups assessed different aspects of the instrument's content or provided different types of data (e.g., ratings versus comments and suggestions). For example, in a study of the development of a Turkish version of a quality of life scale, SMEs were asked to rate item relevance while EEs were asked to comment on the items and make recommendations for improvement (Can et al., 2010). Wilkinson, Roberts and While (2010) also reported a division of tasks in the validation of a measure of information technology skills: SMEs reviewed the fit between items and the content domain, while EEs provided feedback on item clarity and suggestions for topics to add. The reasons for assigning different tasks to SMEs and EEs in studies such as these are not clear. It is true that different types of experts may bring different strengths to a content validation study (for example, EEs can be particularly helpful in identifying item language that is problematic for the target population; Tilden et al., 1990), and it may not make sense to ask all content experts to assess all elements (for the content validation of the QoLHHI, for example, there was little point in asking EEs to evaluate scoring instructions, as these elements are not used or even seen by respondents). However, decisions to divide study tasks must be made carefully, for researchers may be missing an opportunity to glean valuable information about an instrument if they only ask some groups of experts to rate certain aspects of content. For example, SMEs may have insightful comments on item language, while EEs may provide helpful feedback on element relevance.
Assigning different rating tasks to different groups of experts also limits opportunities to make direct comparisons between them. Only rarely do reports from content validation studies include a comparison of the feedback from multiple groups of experts, yet similarities and differences in this feedback can yield valuable insights. For example, in a study of a measure of diabetes self-management (Schilling et al., 2007), CVIs were computed for both SMEs' and EEs' ratings of content relevance. The results for SMEs and EEs as a single group were compared to the results for EEs alone, and it was found that the EEs had endorsed fewer elements compared to the total sample. Another study that compared two groups of experts was a content validation study for a measure of satisfaction with treatment for sexual arousal disorder (Corty, Althof, & Wieder, 2011). Both SMEs and EEs rated elements for how important they were to measuring the construct. In this case, it was the EEs who rated a greater number of items as important.

Although they represent only two examples of comparisons of the data from multiple groups of content experts, these two studies serve to illustrate that different types of experts can differ in their assessments of an instrument's content, even when rating the same aspect of that content (e.g., relevance). In order to capture all of the experts' information, it is therefore desirable to analyse their data separately, and to make some comparison of the findings from the different groups. The comparison of practical experts' (PEs') and EEs' data in Chapter 4 of this dissertation provides another example of the merit of such an approach. Combining all of the experts' data together for analysis would have obscured some interesting differences between these two groups. For example, although the PEs and EEs both endorsed the majority of the content elements of the QoLHHI, there was little agreement between them on those elements that were not endorsed.
The PEs and EEs also differed in that the PEs were more critical of the clarity of elements, while the EEs were more critical of the relevance of elements. For the descriptive feedback, although there were some areas of agreement between the two groups, there were also many differences. All of this suggests that the PEs and EEs brought different perspectives to their evaluations of the QoLHHI content. While Chapter 4 included a comparison of the content validation results for the PEs and EEs, the findings of the SMEs stood alone in Chapter 3. The remainder of this chapter provides a brief discussion and comparison of the quantitative and descriptive data from all three groups of experts.

The Quantitative Data: Many Similarities, Some Important Differences

A total of 144 elements of the QoLHHI were rated by all three groups of experts (another 30 administration elements, rated only by the SMEs and PEs, are discussed separately below). Of these, 96 (67%) were endorsed overall by all three groups, and there were no elements that were unendorsed by all three groups. The SMEs endorsed 88% of these 144 elements, while the PEs endorsed 91% and the EEs endorsed 80%. The rate of three-way agreement on endorsement was lower for the relevance of elements (49% of all elements rated for relevance; see Table 5.1) than for the clarity of elements (85% of all elements rated for clarity; see Table 5.2). The EEs were less likely to endorse elements for relevance than were the SMEs (as already noted in Chapter 4, the EEs also endorsed fewer elements for relevance compared to the PEs).
The highest number of two-way agreements on endorsement was between the SMEs and the PEs; these groups agreed on the endorsement of the relevance of 21 elements, which represents 40% of all elements rated for relevance (see Table 5.1), and the helpfulness of the Impact response card (not included in a table). Note that all two-way agreements are in addition to the three-way agreements; for example, the SMEs agreed with the PEs and EEs on the endorsement of the relevance of 49% of elements, and with only the PEs on the relevance of another 40% of elements. There were a total of 12 agreements on endorsement between the PEs and the EEs. Four of these were for relevance (representing 7% of all elements rated for relevance; see Table 5.1) and eight were for clarity (representing 10% of all clarity ratings; see Table 5.2). The groups with the least overlap were the SMEs and EEs, who agreed on endorsement for only three out of the 144 elements, although one of these was the relevance of rating "impact", which is central to the QoLHHI Impact sections. Overall, the PEs appear to have fallen in between the SMEs and EEs in their assessments, in that they sometimes agreed with the former group and sometimes with the latter. The higher rates of agreement between the PEs and EEs, compared to between the SMEs and EEs, may be due to the PEs incorporating the views of the HVH individuals to whom they had administered the QoLHHI into their assessments of the instrument's content (this was reflected in the PEs' descriptive feedback).

There were 11 two-way agreements (out of 144 elements) on non-endorsement of elements (Table 5.3), six of which were the skip patterns from the Health Impact section, which were not endorsed by either the PEs or the EEs, as discussed in Chapter 4.
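The two-way and three-way tallies reported in this section can be sketched as a short procedure. The group labels, element names, and endorsement flags below are invented placeholders, not the actual QoLHHI results:

```python
# Hypothetical group-level endorsement decisions (True = endorsed).
endorsements = {
    "SME": {"stress level": True,  "feeling stuck": False, "quality of sleep": True},
    "PE":  {"stress level": True,  "feeling stuck": True,  "quality of sleep": True},
    "EE":  {"stress level": False, "feeling stuck": False, "quality of sleep": True},
}

elements = endorsements["SME"].keys()

# Three-way agreements: elements endorsed by every group.
three_way = [e for e in elements
             if all(endorsements[g][e] for g in endorsements)]

# Two-way agreements are counted in addition to the three-way agreements,
# mirroring the reporting convention used in this chapter.
pairs = [("SME", "PE"), ("SME", "EE"), ("PE", "EE")]
two_way = {p: [e for e in elements
               if endorsements[p[0]][e] and endorsements[p[1]][e]
               and e not in three_way]
           for p in pairs}

print(three_way)  # elements all three groups endorsed
print(two_way)    # remaining pairwise overlaps
```

Running the same tally separately for relevance and clarity ratings would reproduce the kind of breakdown shown in Tables 5.1 and 5.2.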
Four of the remaining five agreements on non-endorsement were between the SMEs and PEs, who did not endorse the impact response scale (either for how easy it is to understand or to apply), the clarity of the introduction to the Health Impact section, or the clarity of the impact item for physical activity or exercise. The final two-way agreement on non-endorsement was between the SMEs and EEs, who did not endorse the relevance of asking if people feel stuck in their neighbourhood.

In addition to the 144 elements that were rated by all three groups of experts, there were 30 administration elements, such as instructions to interviewers and scoring instructions, which were rated by only the SMEs and PEs. Twenty-nine (97%) of these elements were endorsed by both groups. The single exception was one set of instructions for skipping items related to prescription medication, which was endorsed for clarity by the PEs but not the SMEs.

Overall, the quantitative feedback indicates a fairly high level of endorsement of the QoLHHI content. Despite the diversity of the experts who participated in the studies reported in Chapters 3 and 4, just over two-thirds of the QoLHHI content elements rated by all three groups were endorsed, and 77% of the two-way agreements on these elements were also endorsements. Ninety-seven percent of the administration elements, rated by only the SMEs and PEs, were endorsed by both groups. Nevertheless, the fact that many elements did not obtain a three-way endorsement for relevance is initially worrying, because lack of relevance is, arguably, harder to "fix" than lack of clarity. Closer inspection, however, shows that this is an example of why it is important to consider the data from different groups of experts separately. Most of the non-endorsements for relevance were provided by the EEs. As described in Chapter 4, the EEs' descriptive feedback suggests that some of the low ratings for relevance may have been influenced by the EEs'
personal experiences and opinions; therefore, the extent to which these ratings reflect the views of a wider range of individuals who are HVH is unclear. This does not mean that the EEs' ratings should be automatically dismissed, but it does suggest that these particular non-endorsements should be treated cautiously, particularly as they conflict with the feedback from the other two groups of experts. This example also illustrates, once again, the importance of considering not only the quantitative information provided by content experts, but their descriptive feedback as well. The descriptive data can provide context for the quantitative ratings, and can serve to clarify confusing or contradictory findings.

Descriptive Feedback: A Rich Source of Information on Content Quality

As noted in Chapters 3 and 4, the content experts who participated in these content validation studies provided a great deal of descriptive feedback on the QoLHHI content. Because individual experts within a group sometimes commented on elements that were endorsed by the group overall, descriptive feedback was not limited to the non-endorsed elements. As a result, almost all elements had at least some descriptive feedback attached to them, and the volume of data was quite large. In approaching the task of analyzing the descriptive feedback for all three groups of experts, it was tempting initially to look for patterns of agreement in the comments and suggestions (e.g., produce a tally of similar comments for each element) in order to identify the feedback that should be given the most weight. Ultimately, this approach (which would have essentially been an attempt to quantify the descriptive data) would not have represented the best use of the experts' feedback. In reviewing the descriptive feedback for the three groups together, it became clear that there were many different patterns in the data.
Some elements were endorsed by all three groups of experts, but for some of these elements the details of the descriptive feedback differed across the groups, while for other elements there was more agreement in the comments. In some cases, there were similar comments across groups even when the groups disagreed overall on endorsement. There were also instances of single experts making helpful comments or suggestions, and instances when the experts within a group provided very different suggestions on a single element. As noted in Chapter 3, content validation will be most effective if content experts are viewed as an advisory group of individuals with very particular expertise related to the construct of interest or the target population. Under such an approach, the descriptive feedback collected from experts becomes as important as, if not more important than, their quantitative feedback. In order to make full use of all of the descriptive information provided by the content experts in the studies reported here, it was necessary to consider all of their comments and suggestions as unique pieces of information.

This does not necessarily mean that every suggestion made by the experts must be followed. Decisions about changes to content ultimately rest with an instrument's developer(s) or revisers (e.g., DeVellis, 1991; McKenzie et al., 1999). It is they who will have the most comprehensive understanding of the instrument, from its earliest conception to its present form, and so will be in the best position to review all of the data, judge the implications of these data for the validity of inferences, and judge what revisions to the instrument content are needed. An example of how descriptive feedback might be utilized to make a detailed assessment of instrument content and guide revisions to the content is provided in Appendix C.
All of the descriptive feedback from SMEs, PEs, and EEs is presented there, organized by element (the experts' comments and suggestions in Appendix C have been edited for spelling and grammar, and converted to a third person voice for consistency). For many elements, the content experts' feedback is followed by my response, which provides some information about the development of the element/item, and presents the perspective of the QoLHHI authors as I recall it from our many test development meetings and discussions. For example, several content experts commented that the term "physical activity and exercise" does not apply to individuals who are HVH, who generally do not engage in structured fitness activities like working out at a gym. In my response to these comments, I note that the QoLHHI authors had specifically intended for the term "physical activity or exercise" to include any type of physical activity, but also acknowledge that, given the experts' comments, this intent may not be clear. Finally, for many elements, I have made recommendations for revisions to the QoLHHI content. For example, I recommend adding some HVH-specific examples of physical activity, like collecting bottles for recycling, to the physical activity item. These recommendations take into account both the content experts' comments and suggestions, and the QoLHHI authors' position as I understand it. If the recommendation is to leave an item unchanged, the reasoning behind this recommendation is explained.

One drawback to increasing the emphasis placed on descriptive feedback is that this will increase both the amount of data that must be analyzed and the complexity of analyses. The amount of descriptive data provided by content experts can be quite extensive, as evidenced by the length of Appendix C, and the effort involved in making sense of these data is significant as well. Nevertheless, this is an important step in assessing validity evidence based on instrument content, as well as the best guide to revisions and improvement of instrument content.

Table 5.1: Agreement on Endorsement between Groups of Content Experts: Relevance Ratings

Endorsed by all groups:
- Current level of physical health
- Current level of mental or emotional health
- Quality of sleep
- Current level of stress
- Emotional pain
- Using street drugs
- Chronic illnesses or conditions
- Aspect of Living Conditions: Place where you live or stay
- Affordability of the place where you live or stay
- Amenities where you live or stay
- Access to bathing facilities
- Cleanliness of bathing facilities
- Overall cleanliness of the place where you live or stay
- Feeling of control over your own space
- Privacy where you live or stay
- Restrictions where you live or stay
- Security of possessions where you live or stay
- Treatment by others where you live or stay
- Feeling of home where you live or stay
- Safety of neighbourhood
- Reasons for feeling safe/unsafe in your neighbourhood
- Safety at night versus day in your neighbourhood
- Reasons for feeling safe/unsafe at night versus in the day
- Resources in your neighbourhood
- Ability to get food that you like
- Quality of food

Endorsed by SMEs and PEs:
- Current level of physical activity or exercise
- Physical pain
- Using alcohol
- Using pot/marijuana
- Special (medically recommended) diet
- Prescription medication
- Aspect of Living Conditions: Food
- Safety of bathing facilities where you live or stay
- Worries about catching illnesses from others where you live or stay
- Worst thing about the place where you live or stay
- Best thing about the place where you live or stay
- Anything else to say about the place where you live or stay
- Bad influences in your neighbourhood
- Worst thing about your neighbourhood
- Best thing about your neighbourhood
- Anything else to say about your neighbourhood
- Able to get enough to eat
- Feeling stuck eating the same food every day
- Worst thing about food
- Best thing about food
- Anything else to say about food

Endorsed by SMEs and EEs:
- Relevance of the concept of "impact"

Endorsed by PEs and EEs:
- Aspect of Living Conditions: Neighbourhood
- Disruptiveness of others where you live or stay
- Feeling part of the community in your neighbourhood
- Is food nutritious

Table 5.2: Agreement on Endorsement between Groups of Content Experts: Clarity Ratings

Endorsed by all groups:
- I'd like you to rate the kind of impact/effect that your current level of physical health has on you
- I'd like you to rate the kind of impact/effect that your current level of mental or emotional health has on you
- Would you describe your current level of stress as low, medium, or high?
- Have you been experiencing physical pain lately?
- I'd like you to rate the kind of impact/effect that (having/no longer having) emotional pain has on you
- Do you currently drink alcohol?
- I'd like you to rate the kind of impact/effect that (drinking/no longer drinking) has on you
- Do you currently use pot (marijuana)?
- I'd like you to rate the kind of impact/effect that (using/no longer using) pot has on you
- Do you currently use other street drugs - such as cocaine, heroin, or crystal meth for example?
- I'd like you to rate the kind of impact/effect that (using/no longer using) street drugs has on you
- Do you have one or more chronic illnesses or conditions (for example: diabetes, allergies, a disability, hepatitis)?
- I'd like you to rate the kind of impact/effect that this has on you
- Are you supposed to follow a special diet because of a health condition?
- Are you following this special diet?
- If you are NOT following or only partially following this special diet, why not?
- ...because the food you need for this diet is too expensive
- ...because it's too difficult for you to get the food you need for this diet
- ...because you don't have any way to prepare or store the food you need for this diet
- ...because you are not willing to give up certain foods as part of this diet (for example: salt, red meat, sweets)
- ...Other
- Are you currently supposed to be taking medication that was prescribed by a doctor?
- Are you taking this medication?
- I'd like you to rate the kind of impact/effect that taking this medication has on you
- I'd like you to rate the kind of impact/effect that not taking this medication has on you
- If you are NOT taking the medication prescribed to you, why not?
- ...because the medication is too expensive
- ...because it's too difficult for you to store the medication
- ...because you're not able to take the medication as recommended (for example: with food, 3 times a day)
- ...because you don't like the side effects
- ...because you don't believe in taking medication
- ...Other
- I'd like to know about the place where you currently live or stay
- Do you feel that the place where you live or stay is affordable?
- Does the place where you live or stay have the amenities that are important to you (like a fridge, stove, own bathroom, elevator)?
- Do you have access to bathing facilities (such as a shower)?
- Do you feel that these bathing facilities are clean enough to use?
- Do you feel safe using these bathing facilities?
- Overall, do you feel that the place where you live or stay is clean enough?
- Do you feel like you have control over your own space?
- Are the other people living or staying there too disruptive?
- Do you have enough privacy there?
- Do you feel there are too many restrictions placed on you there?
- Are you always worrying that you'll catch some illness from other people living there?
- Do you feel your stuff is safe there?
- Do you feel that you're treated well there (for example: by landlord, shelter staff, other residents)?
- Does it feel like a home to you?
- What is the worst thing about the place where you currently live or stay?
- What is the best thing about the place where you currently live or stay?
- Anything else you want to tell me about the place where you live or stay?
- Now I have some questions about your neighbourhood. By "neighbourhood", I mean the neighbourhood of the place where you are currently living or staying - even if you haven't been there very long
- Do you feel safe in your neighbourhood?
- Why is that?
- Do you feel differently about safety in your neighbourhood at night than during the day?
- Why is that?
- Do you feel that there are a lot of bad influences there for you (for example: too many drugs, too much crime)?
- Do you think that there are enough resources there? (for example: food bank, health care, support workers)
- What is the worst thing about your neighbourhood?
- What is the best thing about your neighbourhood?
- Anything else you want to tell me about your neighbourhood?
- You've talked about some things that describe your neighbourhood. Now I want to know what kind of impact/effect that your neighbourhood has on you
- Now I have some questions about the food you eat
- Are you usually able to get food that you like?
- Are you usually able to get good quality food?
- Do you find that you get stuck eating the same thing almost every day?
- Do you have trouble getting enough to eat?
- What is the worst thing about the food you eat?
- What is the best thing about the food you eat?
- Anything else you want to tell me about the food you eat?
- You've talked about some things that describe the food you eat. Now I'd like you to rate the impact/effect that the food you eat has on you

Endorsed by PEs and EEs:
- I'd like you to rate the kind of impact/effect that the quality of sleep that you've been getting lately has on you
- I'd like you to rate the kind of impact/effect that (having/no longer having) physical pain has on you
- I'd like you to rate the kind of impact/effect that following this diet has on you
- I'd like you to rate the kind of impact/effect that not following this diet has on you
- You've talked about some things that describe the place where you currently live or stay. Now I want to know what kind of impact/effect that the place where you live or stay has on you. You could tell me that the place where you live or stay has no impact/effect on you at all. Or you could say that it has a positive impact/effect and makes things better for you. Or, maybe it has a negative impact/effect and makes things worse for you. I'd like you to rate the impact/effect that the place where you currently live or stay has on you
- Do you feel that you're part of the community in your neighbourhood?
- Do you feel stuck in your neighbourhood?
- Would you say that the food you eat is nutritious?

Endorsed by SMEs and EEs:
- Given your (low/medium/high) stress level, I'd like you to rate the kind of impact/effect that this has on you
- Have you been experiencing emotional pain lately?

Table 5.3: Agreement on Non-Endorsement between Groups of Content Experts

Not endorsed by SMEs and PEs:
- How easy to understand is the impact response scale?
- How easy to use (apply) is the impact response scale?
- Now I want to know about the kind of impact/effect that different aspects of your health have on you. You could tell me, for example, that your physical health has no impact on you at all. Or you could say that it has a positive impact/effect and makes things better for you. Or, maybe it has a negative impact/effect and makes things worse for you (Clarity)
- I'd like you to rate the kind of impact/effect that your current level of physical activity or exercise has on you (Clarity)

Not endorsed by SMEs and EEs:
- Feeling stuck in your neighbourhood (Relevance)

Not endorsed by PEs and EEs:
- Should the impact question for physical pain be skipped?
- Should the impact question for emotional pain be skipped?
- Should the impact question for alcohol use be skipped?
- Should the impact question for pot/marijuana use be skipped?
- Should the impact question for street drug use be skipped?
- Should the impact question for chronic illnesses or conditions be skipped?

6. Chapter 6: Conclusion

This dissertation consists of two content validation studies that used judgmental methods with different types of content experts.
This research had two main goals. The first was to collect, assess, and describe validity evidence based on content for two sections of a new measure of subjective quality of life (SQoL) for individuals who are homeless and vulnerably housed (HVH), and to provide recommendations for revising this instrument. The second goal of this dissertation was to evaluate the use of judgmental studies using content experts - a common content validation methodology - and propose ways to make it more effective in meeting the goals of content validation.

Summary of the Research Findings

Validity evidence based on the content of the QoLHHI. The research findings summarized in Chapters 3 to 5 indicate that the content experts who participated in this research were generally favourable in their assessments of the QoLHHI content. The SMEs and PEs rated 174 individual content elements, and the EEs rated 144 individual elements, on characteristics such as relevance, clarity, ease of use, and helpfulness. Despite the large number of elements and the diversity of content experts, over 85% of these quantitative assessments reflected endorsements at the group level. In addition, 42% of the element-level ratings were Perfect CVIs, which are obtained when all experts in a group give an element the highest possible rating. The descriptive feedback from the content experts suggests that they did have some concerns about the QoLHHI content, but most of these can be addressed through relatively minor revisions to various content elements, and the addition of some topics/items. Overall, the findings from this study suggest that the content of the QoLHHI is appropriate and useful for measuring SQoL in individuals who are HVH.

The most significant recommendation to come out of the content validation studies in this dissertation is that the way in which the concept of "impact" is presented in the QoLHHI should be reviewed.
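As a point of reference, the "Perfect CVI" defined in the summary above (all experts in a group giving an element the highest possible rating) amounts to a one-line check. The 4-point maximum used here is an assumption for illustration:

```python
# A 'Perfect CVI' occurs when every expert in a group gives the element
# the highest possible rating (assumed here to be 4 on a 4-point scale).
def is_perfect_cvi(ratings, max_rating=4):
    return all(r == max_rating for r in ratings)

print(is_perfect_cvi([4, 4, 4, 4]))  # True: unanimous top ratings
print(is_perfect_cvi([4, 4, 3, 4]))  # False: one expert rated below the maximum
```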
Although the majority of content experts felt that measuring impact is relevant to the measurement of SQoL, some also felt that this concept might be difficult for some respondents to grasp, and that some of the items addressing impact were unclear as a result. Given that the concept of impact is central to the Impact sections of the QoLHHI, the QoLHHI authors should carefully review the in-depth descriptive feedback from all content experts, as well as my recommendations in Appendix C, and use these to make revisions to the content elements that relate to impact. Other potential revisions to the QoLHHI content include making changes to the wording of various items and content elements in order to increase their clarity, adding examples to certain items, and either adding new items or expanding on existing ones. The purpose of the QoLHHI overall (i.e., to measure subjective QoL) and the goals of certain items could also be made clearer.

Implications for content validation using judgmental methods. The most common approach to content validation is to conduct a judgmental study with content experts, who are generally subject matter experts (SMEs) with research or clinical experience that is relevant to the construct of interest or the target population. Members of the target population, or experiential experts (EEs), may be included as content experts as well. Depending on the instrument and measurement context, other types of experts can also be asked to evaluate the instrument content. The inclusion of a less common type of expert, practical experts (PEs), was reported in Chapter 4. Although the value of including different types of experts in a content validation study is generally recognized, the data from all experts are usually combined at the point of analysis, and reports of content validation studies rarely include a discussion of any similarities or differences in the data from different groups.
As seen in Chapters 4 and 5, the research reported in this dissertation benefited from the inclusion of several types of experts. In some cases, the different groups provided quite different feedback on the QoLHHI content, resulting in far richer validity evidence than would have been obtained from any single group. I proposed that samples of content experts should therefore be selected with the goal of collecting as many diverse, though of course relevant, perspectives as possible. I also noted that combining the data from all experts at the point of analysis can be a disservice to the data, particularly when the groups disagree in their assessments. In such cases, differences between the groups may be masked and the results will not fully reflect the views of any of the groups. It is only by reviewing the findings separately that any differences can be properly recognized and evaluated.

Quantitative feedback tends to be the main focus of judgmental studies of instrument content, with descriptive feedback treated as a relatively minor supplement to the quantitative ratings. The use of inferential statistics to analyze the quantitative data is not uncommon, despite concerns that have been raised about using inferential approaches with the small sample sizes typical of content validation studies. I acknowledged the legitimate concern about the use of inferential statistics with small samples, but argued that this is resolved by treating content experts not as a representative sample of all possible experts, but as an advisory panel to the developers or users of an instrument, the members of which are chosen specifically for the relevant and perhaps even unique experiences that they can bring to bear on the task of evaluating the instrument's content. Under this approach, the use of inferential statistics becomes meaningless. Instead, a cut-off for quantitative data should represent a level of expert endorsement that instills confidence in the quality of the instrument.
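One way such an endorsement cut-off might be operationalized is sketched below. The 4-point input scale, the dichotomization rule, and the 0.75 threshold are all placeholder assumptions for illustration, not values recommended in this dissertation:

```python
# Collect ratings on a 4-point scale, then collapse to a 2-point
# endorse/not-endorse scale for analysis.
def endorsement_index(ratings, endorse_min=3):
    """Proportion of experts whose rating falls in the 'endorse' half
    of the collapsed 2-point scale (here, a rating of 3 or 4)."""
    collapsed = [1 if r >= endorse_min else 0 for r in ratings]
    return sum(collapsed) / len(collapsed)

def decision(ratings, cutoff=0.75):
    """Map the endorsement level to an outcome for the element."""
    return "retain" if endorsement_index(ratings) >= cutoff else "revise or remove"

print(decision([4, 4, 3, 3, 2]))  # 4 of 5 experts endorse: 'retain'
print(decision([1, 1, 2, 4, 4]))  # 2 of 5 experts endorse: 'revise or remove'

# Note that unanimous low ratings such as [1, 1, 1, 1] would yield perfect
# rater agreement but an endorsement index of 0.0: endorsement, not
# agreement, is what signals a sound element.
print(endorsement_index([1, 1, 1, 1]))
```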
I also argued that it is not especially helpful to focus on expert agreement when selecting a measure or index for analyzing experts' quantitative ratings. This is because indices of rater agreement may be high even if all experts agree that an element is not relevant, clear, etc. Additional steps would then be needed to distinguish high agreement on poor elements from high agreement on good elements. I suggested that a more parsimonious index is one that assesses the level of expert endorsement of an element.

Moreover, I proposed that quantitative ratings of content serve different purposes at different stages of a content validation study. At the data collection stage, the purpose of these ratings is to communicate the content experts' assessment of the instrument. At the analysis stage, the goal is to determine what these assessments say about the instrument content. Do they reflect an acceptable level of endorsement, or do they instead suggest that an element is problematic from a validity perspective? Because of these different goals, the data collection and analysis stages of a study may require different scales of measurement. For example, a four-point scale may be most appropriate for data collection, but a two-point scale (obtained by combining response categories) may best serve the goals of analysis. Researchers planning a content validation study should give careful thought to the response scales, the meaning of response options, and the outcomes (i.e., retain, revise, or remove an element) attached to different response options or groupings of response options.

Finally, one of the most important findings arising from the research presented in this dissertation is the recognition of the contribution that descriptive feedback can make in meeting the goals of content validation. While the quantitative data collected for the studies reported in Chapters 3 to 5 were helpful in summarizing the content experts'
assessments of the QoLHHI, it was the descriptive data that provided the most information. These data helped explain the quantitative ratings and were invaluable in developing recommendations for making changes to the instrument. Although there is certainly value in quantitative feedback, the emphasis in content validation studies that employ judgmental methods should be shifted to experts' descriptive feedback. Working with descriptive data requires a great deal of creativity, flexibility, judgment, and willingness to deal with contradictions and ambiguities. To meet the goals of content validation, however, analysis of descriptive data makes the best use of experts' time, knowledge, and effort.

Detailed reporting of descriptive data is important because it is not only the developers, but also the users, of an instrument who are responsible for ensuring validity in their measurement endeavors (AERA et al., 1999). In order to make informed decisions about using an instrument in a particular context for a given purpose, users need information about validity evidence; for judgmental content validation studies, this means having access to more detailed reports on the experts' descriptive data.

Novel Contributions and a New Perspective on Content Validation Methodology

It is expected that a dissertation will make a novel contribution to a field or literature. This dissertation makes several novel contributions both to (a) the validation of a quality of life measure for HVH individuals and (b) the area of validation methodology. It provides the first content validation evidence for the QoLHHI measure and relies not only on the input and expertise of SMEs, but also of EEs and PEs. Few content validation studies have made use of experts other than SMEs and EEs, and even fewer have relied on multiple groups of experts and conducted any analysis of similarities and differences among the expert groups.
Perhaps even more importantly, this dissertation presents a new way of thinking about and conducting content validation research and, in particular, studies that use judgmental approaches with content experts. Specifically, I have argued for a shift in how we think about the experts in such studies, the types of data that we collect from these experts, and how we analyze their data.

Regardless of the details of a content validation study (i.e., whether the focus is on content relevance and representativeness, on relevance and the technical quality of elements, on items alone or on all content elements, or on some other aspects of content), the overarching goal of the research is to obtain feedback on the content of an instrument and, from this, to determine the implications that content might have for the validity of inferences made from the instrument. Most content validation studies have, as an additional goal, to gather data to use in revising and improving the instrument content.

The studies presented in Chapters 3 to 5 were conducted within the general framework of judgmental studies using content experts, but differed from the typical approach to such studies in several key ways that reflect some of the novel contributions of this dissertation. These differences and contributions are presented below.

How we think about content experts. In this dissertation, I propose that the goals of content validation will be better served if we think of content experts not as a sample drawn from, and representing the views of, some larger population of experts, but as an advisory panel that has been convened to assist with the task of assessing and revising the content of an instrument. This formulation emphasizes the contribution of the individual experts, and acknowledges that the reason they are invited to take part in a content validation study is that they bring relevant and valuable experience and expertise to the task of evaluating an instrument.
The advisory panel approach allows us to focus on identifying the most effective ways to tap this experience and expertise. It also eliminates the temptation to use inappropriate inferential techniques to analyze the data, or to make inappropriate inferences about the representativeness of the findings.

Types of data. Most content validation studies conducted with content experts focus primarily on collecting quantitative data, generally in the form of ratings of content relevance, technical quality, etc., and treat descriptive feedback as secondary or supplemental to the quantitative data. For the first goal of content validation, that is, assessing the implications of an instrument's content for validity, quantitative data may be sufficient. As I demonstrated in the studies presented in Chapters 3 to 5, quantitative feedback can be helpful in providing summaries of experts' evaluations of content. It can also be used to compare how different types of experts view the content of an instrument. Such comparisons are rarely presented in the literature but, as I demonstrated in Chapter 4, they can yield useful insights.

Ultimately, however, the information provided by quantitative data will be limited because the reasons for the quantitative ratings will not be known. This, in turn, limits the extent to which it is possible to understand why an element is considered problematic, and makes it difficult to fully assess the impact that content may have on validity. In the studies presented here, I demonstrated that descriptive data can be used to explain and provide context for quantitative assessments, resulting in a fuller and more nuanced understanding of experts' feedback. When it comes to the second goal of content validation, that of revising an instrument's content, descriptive feedback becomes essential.
As can be seen from Chapters 3 to 5, although quantitative data can be used to identify items or content elements that would benefit from revision, it is only by carefully and thoroughly considering descriptive feedback (as I did in Appendix C) that it is possible to determine how to revise them.

The research presented in this dissertation highlights that descriptive data must play a much more prominent role in content validation studies employing content experts than is currently the case. In order to better meet both goals of content validation, we should be designing studies that allow content experts to share much more of their knowledge and experience, rather than asking them to reduce their expertise to a series of quantitative ratings.

Analyzing the data. There is little guidance in the literature on content validation on how to analyze the descriptive data provided by content experts. Even studies that include such data tend to report the descriptive findings in a few sentences at most. For those trying to evaluate the validity evidence for a particular instrument (for example, when selecting an instrument for a research study), such limited descriptions impede their ability to make fully informed decisions. Researchers planning their own content validation studies may be discouraged from including descriptive data in their research by this lack of guidance and examples. In Chapters 3 to 5, and particularly in Appendix C, I provide a comprehensive discussion and demonstration of how descriptive data may be utilized, both to assess validity evidence and to guide revisions to an instrument's content. Most significantly, I demonstrated that, for this latter goal in particular, the descriptive feedback from each content expert should be considered at an individual level.
This research can serve as an example of how a new approach to content experts, data, and analysis can better meet the dual goals of a content validation study and make the most effective use of content experts.

Limitations and Future Directions

Limitations of the individual studies were discussed in Chapters 3 and 4. These include that the sample of SMEs was limited to experts with research backgrounds, drawn from only two countries (Canada and the United States), and that the EEs came from a single neighbourhood in Vancouver, Canada. Given the importance to validity of the context within which an instrument is used (e.g., the purpose of measurement, the target population), these limited samples mean that the validity evidence provided by these studies may only apply to certain contexts.

More generally, content validation has been criticized as subjective, and as vulnerable to a confirmatory bias, especially when carried out by the developers of an instrument (Kane, 2006). The studies reported in this dissertation did rely on subjective data, and so are subject to any risks this might entail. The fact that some EEs in particular took a very personal and subjective approach to their assessments of the QoLHHI content has already been noted. Subjectivity in the experts' data would be a significant concern if the intent of the studies had been to use samples to represent the views of all possible experts. However, subjectivity may be less of a concern if the experts are viewed primarily as an advisory panel. Under these conditions, the experts' feedback is acknowledged from the outset to be subjective. This should help to guard against any temptation to treat it as anything else.

The content validation studies reported here were carried out by one of the QoLHHI authors, so the possibility of a positive bias in the analyses, in particular the analysis of the descriptive data, must be acknowledged.
A final risk of confirmatory bias now rests with the QoLHHI authors as a whole. They will be undertaking the task of revising the instrument, and have a responsibility to assess and apply the expert feedback to this task as objectively as possible.

Another potential source of bias in the studies presented here is acquiescence. The SMEs and PEs provided their feedback via an online survey, but this process was not anonymous, as contact information was needed in order to provide honoraria (although participants were assured that their identifying information would be separated from their study data prior to analysis). The EEs provided their feedback in face-to-face interviews and so were even more readily identifiable than the SMEs and PEs. It is therefore possible that the experts who participated in this research felt constrained in their ability to be critical of the QoLHHI content. The extent to which the lack of anonymity might have affected the experts' responses is impossible to determine. However, it should be noted that the EEs, who were the least anonymous group, were also the least positive overall in their assessments of the QoLHHI content, and that although the quantitative ratings across all three groups of experts were positive on the whole, there were elements that they did not endorse. The descriptive feedback also included many comments about content that could be improved. Thus it appears that the content experts felt able to express critical opinions about the QoLHHI.

The next step for the QoLHHI will be to implement the revisions to the content recommended in Appendix C. Further content validation of the revised instrument may then be required. It would also be helpful to conduct additional content validation studies with more diverse samples of SMEs and EEs. SMEs who represent some of the non-research applications of the QoLHHI, such as policy development or service evaluation, should be consulted.
For EEs, it will be important to draw on individuals who live outside of Vancouver's Downtown Eastside, as well as a greater number of women and homeless families.

However, content validation is only one aspect of the overall validation process, and any one study contributes only one portion of validity evidence. Future validation efforts should address some of the other sources of validity evidence listed in the AERA, APA, and NCME Standards (1999), such as response processes, internal structure, relationships between scores obtained with the QoLHHI and other variables, and the consequences of using the QoLHHI. In light of the concerns raised in the research reported here with regard to some of the language and more abstract concepts in the QoLHHI content, an investigation of response processes, conducted with individuals who are HVH, would be particularly useful. This would provide an opportunity to learn more about how respondents from the target population deal with these potentially problematic content elements. Given the broad scope of the QoLHHI and its inclusion of an atypical approach to measuring QoL (the use of impact ratings), validation research that looks at the relationship between the QoLHHI and other variables would also be valuable. In particular, validation studies with convergent measures would help to establish whether the QoLHHI is capturing SQoL as intended.

There is also a need for more methodological research in the area of content validation. For judgmental studies with content experts, future research should investigate more effective and efficient ways of collecting descriptive feedback. For example, in the studies reported in this dissertation, ratings for relevance and clarity were collected separately. The experts' comments on these ratings, however, did not always fall neatly into these two areas. Perhaps the distinction between characteristics such as relevance and clarity is less useful when soliciting descriptive feedback.
An important methodological question to address is: "What are the most effective questions to ask of experts when the collection of descriptive feedback is no longer designed around the quantitative ratings?"

Including different groups of content experts can greatly enhance content validation research, but can also present challenges. For example, in the studies reported here, there were EEs whose data were removed from the analyses because they had difficulties with the study tasks. These EEs were likely the most representative of those members of the target population who would struggle to understand the QoLHHI items, while the EEs who were best able to complete the fairly abstract study tasks were not necessarily representative of this segment of the QoLHHI's target population. This illustrates one type of challenge that can arise when trying to include EEs who reflect the entire range of an instrument's target population. More thought should be given to the advantages and disadvantages of aiming for representativeness when selecting experts, and to the advantages and disadvantages of including different types and multiple groups of experts in content validation. In the research reported in this dissertation, for example, it was the PEs, rather than the EEs, who provided feedback that reflected a broad segment of the target population. Future methodological research should also seek to establish effective methods for analyzing and comparing quantitative and descriptive data from different groups of experts.

Future research should also look beyond judgmental approaches to explore methodological issues in content validation more generally. For example, content validation can serve to identify potential sources of invalidity in an instrument's content by identifying problematic elements, as demonstrated by the research presented in Chapters 3 to 5 and Appendix C.
In practice, it is often those content elements that need revising that receive the most attention in a content validation study. Interestingly, though, the link between sources of invalidity (i.e., construct underrepresentation and construct-irrelevant variance) and instrument content is rarely made explicit. Messick (1989) noted that items of poor technical quality may introduce irrelevant sources of difficulty (that is, construct-irrelevant variance) into the measurement process, and the AERA, APA, and NCME Standards (1999) also mention irrelevant sources of difficulty arising from instrument content. Furr and Bacharach (2008), in their coverage of validity evidence based on content, devote most of their discussion to sources of invalidity. For the most part, however, discussions of content-based validity evidence and content validation do not refer specifically to invalidity or its sources, and focus instead on content relevance, coverage, etc. There is nothing inherently wrong in this, but reframing content validation as an exercise in identifying possible sources of invalidity in an instrument's content might provide a more intuitive fit with the actual practice of content validation, that is, with the tendency to focus on content that needs improving.

Validity is of fundamental importance to measurement, and instrument content is of fundamental importance to validity. The more confident developers and users of instruments are in their understanding of the concepts of validity and validation, the more likely it is that the quality of measurement endeavors will improve. Both the studies presented in this dissertation and the recommendations for future research have the potential to advance the theory and methodology of content validation. However, it is far too easy for measurement and validity specialists to propose complicated theories and methods that are difficult for other researchers to apply and implement.
Therefore, a final important consideration going forward will be ensuring that any advances in theory and methodology are translated into clear, accessible, and usable guidelines for the practice of content validation.

References

Albrecht, G. L., & Devlieger, P. J. (1999). The disability paradox: High quality of life against all odds. Social Science & Medicine, 48(8), 977–988.
American Educational Research Association, American Psychological Association, National Council on Measurement in Education, & Joint Committee on Standards for Educational and Psychological Testing (U.S.). (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
American Psychological Association, American Educational Research Association, & National Council on Measurement in Education. (1954). Technical recommendations for psychological tests and diagnostic techniques. Psychological Bulletin, 51(2:2), 1–38.
Anastasi, A. (1986). Evolving concepts of test validation. Annual Review of Psychology, 37(1), 1–16. doi:10.1146/annurev.ps.37.020186.000245
Anastasi, A., & Urbina, S. (1997). Psychological testing (7th ed.). Upper Saddle River, NJ: Prentice Hall.
Anderson, K. L., & Burckhardt, C. S. (1999). Conceptualization and measurement of quality of life as an outcome variable for health care intervention and research. Journal of Advanced Nursing, 29(2), 298–306.
Angoff, W. H. (1988). Validity: An evolving concept. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 19–32). Lawrence Erlbaum.
Aubry, T., Klodawski, F., Hay, E., & Birnie, S. (2003). Panel study on persons who are homeless in Ottawa: Phase 1 results (pp. 1–51). Ottawa, ON: Centre for Research on Community Services, Faculty of Social Sciences, University of Ottawa.
Retrieved from http://www.homelesshub.ca/Resource/Frame.aspx?url=http%3a%2f%2fintraspec.ca%2fPanel_Study_-_Phase_1.pdf&id=34772&title=Panel+Study+on+Persons+Who+Are+Homeless+in+Ottawa%3a+Phase+1+Results&owner=121
Beck, C. T., & Gable, R. K. (2001). Ensuring content validity: An illustration of the process. Journal of Nursing Measurement, 9(2), 201–215.
Beckstead, J. W. (2009). Content validity is naught. International Journal of Nursing Studies, 46(9), 1274–1283. doi:10.1016/j.ijnurstu.2009.04.014
Boivin, J.-F., Roy, E., Haley, N., & Galbaud du Fort, G. (2005). The health of street youth: A Canadian perspective. Canadian Journal of Public Health/Revue Canadienne de Santé Publique, 96(6), 432–437.
Burckhardt, C. S., Woods, S. L., Schultz, A. A., & Ziebarth, D. M. (1989). Quality of life of adults with chronic illness: A psychometric study. Research in Nursing & Health, 12(6), 347–354. doi:10.1002/nur.4770120604
Burke, M. J., Finkelstein, L. M., & Dusig, M. S. (1999). On average deviation indices for estimating interrater agreement. Organizational Research Methods, 2(1), 49–68. doi:10.1177/109442819921004
Busch-Geertsema, V. (2010). Defining and measuring homelessness. In E. O'Sullivan, V. Busch-Geertsema, D. Quilgars, & N. Pleace (Eds.), Homelessness research in Europe. Brussels: FEANTSA.
Can, G., Durna, Z., & Aydiner, A. (2010). The validity and reliability of the Turkish version of the Quality of Life Index [QLI] (Cancer version). European Journal of Oncology Nursing, 14(4), 316–321. doi:10.1016/j.ejon.2010.03.007
Canadian Homelessness Research Network. (2012). Canadian definition of homelessness. Retrieved from http://www.homelesshub.ca/ResourceFiles/06122012CHRNhomelessdefinition.pdf
Cizek, G. J., Bowen, D., & Church, K. (2010). Sources of validity evidence for educational and psychological tests: A follow-up study. Educational and Psychological Measurement, 70(5), 732–743. doi:10.1177/0013164410379323
Clemson, L., Cumming, R. G., & Heard, R. (2003).
The development of an assessment to evaluate behavioral factors associated with falling. The American Journal of Occupational Therapy, 57(4), 380–388.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46. doi:10.1177/001316446002000104
Cordray, D. S., & Pion, G. M. (1991). What's behind the numbers? Definitional issues in counting the homeless. Housing Policy Debate, 2(3), 585–616. doi:10.1080/10511482.1991.9521065
Corty, E. W., Althof, S. E., & Wieder, M. (2011). Measuring women's satisfaction with treatment for sexual dysfunction: Development and initial validation of the Women's Inventory of Treatment Satisfaction (WITS-9). The Journal of Sexual Medicine, 8(1), 148–157. doi:10.1111/j.1743-6109.2010.01977.x
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281–302. doi:10.1037/h0040957
Cummins, R. A. (1996). The domains of life satisfaction: An attempt to order chaos. Social Indicators Research, 38(3), 303–328. doi:10.1007/BF00292050
Cummins, R. A. (2010). Fluency disorders and life quality: Subjective wellbeing vs. health-related quality of life. Journal of Fluency Disorders, 35(3), 161–172. doi:10.1016/j.jfludis.2010.05.009
Cummins, R. A., McCabe, M. P., Romeo, Y., & Gullone, E. (1994). The Comprehensive Quality of Life Scale (ComQol): Instrument development and psychometric evaluation on college staff and students. Educational and Psychological Measurement, 54(2), 372–382. doi:10.1177/0013164494054002011
Davis, L. L. (1992). Instrument review: Getting the most from a panel of experts. Applied Nursing Research, 5(4), 194–197. doi:10.1016/S0897-1897(05)80008-4
DeVellis, R. F. (1991). Scale development: Theory and applications. Newbury Park, CA: Sage.
Di Iorio, C. K. (2006). Measurement in health behavior: Methods for research and evaluation. San Francisco, CA: Jossey-Bass.
Fazel, S., Khosla, V., Doll, H., & Geddes, J. (2008).
The prevalence of mental disorders among the homeless in Western countries: Systematic review and meta-regression analysis. PLoS Med, 5(12), e225. doi:10.1371/journal.pmed.0050225
Fischer, P. J., & Breakey, W. R. (1991). The epidemiology of alcohol, drug, and mental disorders among homeless persons. The American Psychologist, 46(11), 1115–1128.
Fitzpatrick, A. R. (1983). The meaning of content validity. Applied Psychological Measurement, 7(1), 3–13. doi:10.1177/014662168300700102
Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5), 378–382. doi:10.1037/h0031619
Fleury, J. (1993). Preserving qualitative meaning in instrument development. Journal of Nursing Measurement, 1(2), 135–144.
Frisch, M. B., Cornell, J., Villanueva, M., & Retzlaff, P. J. (1992). Clinical validation of the Quality of Life Inventory: A measure of life satisfaction for use in treatment planning and outcome assessment. Psychological Assessment, 4(1), 92–101. doi:10.1037/1040-3590.4.1.92
Furr, R. M., & Bacharach, V. R. (2008). Psychometrics: An introduction. Los Angeles, CA: Sage Publications.
Gabbard, W. J., Snyder, C. S., Lin, M. B., Chadha, J. H., May, J. D., & Jaggers, J. (2007). Methodological issues in enumerating homeless individuals. Journal of Social Distress and the Homeless, 16(2), 90–103.
Gaetz, S., Donaldson, J., Richter, T., & Gulliver, T. (2013). The state of homelessness in Canada 2013. Toronto, ON: Canadian Homelessness Research Network Press.
Gill, T. M., & Feinstein, A. R. (1994). A critical appraisal of the quality of quality-of-life measurements. JAMA: The Journal of the American Medical Association, 272(8), 619–626.
Grant, J. S., & Davis, L. L. (1997). Selection and use of content experts for instrument development. Research in Nursing & Health, 20(3), 269–274.
Greater Vancouver Regional Steering Committee on Homelessness. (2012). One step forward: Results of the 2011 Metro Vancouver Homeless Count.
Vancouver, BC: Author.
Halifax Regional Municipality. (2005). Homeless in HRM: Portrait of streets and shelters. Halifax, NS: Author.
Halliday, V., Porock, D., Arthur, A., Manderson, C., & Wilcock, A. (2012). Development and testing of a cancer appetite and symptom questionnaire. Journal of Human Nutrition and Dietetics, 25(3), 217–224. doi:10.1111/j.1365-277X.2012.01233.x
Haynes, S. N., Richard, D. C. S., & Kubany, E. S. (1995). Content validity in psychological assessment: A functional approach to concepts and methods. Psychological Assessment, 7(3), 238–247. doi:10.1037/1040-3590.7.3.238
Health-related quality of life (HRQOL). (n.d.). Retrieved June 8, 2013, from http://www.cdc.gov/hrqol/concept.htm
House, A. E., House, B. J., & Campbell, M. B. (1981). Measures of interobserver agreement: Calculation formulas and distribution effects. Journal of Behavioral Assessment, 3(1), 37–57. doi:10.1007/BF01321350
Hubley, A. M., & Palepu, A. (2007). Injection Drug User Quality of Life Scale (IDUQOL): Findings from a content validation study. Health and Quality of Life Outcomes, 5, 46. doi:10.1186/1477-7525-5-46
Hubley, A. M., Russell, L. B., Gadermann, A. M., & Palepu, A. (2009). Quality of Life for Homeless and Hard-to-House Individuals (QoLHHI) Inventory administration and scoring manual. Retrieved from http://educ.ubc.ca/faculty/hubley/qolhhi/QoLHHI%20Manual.pdf
Hubley, A. M., Russell, L. B., Palepu, A., & Hwang, S. W. (2012). Subjective quality of life among individuals who are homeless: A review of current knowledge. Social Indicators Research, 1–16. doi:10.1007/s11205-012-9998-7
Hubley, A. M., & Zumbo, B. D. (1996). A dialectic on validity: Where we have been and where we are going. The Journal of General Psychology, 123(3), 207–215. doi:10.1080/00221309.1996.9921273
Human Resources and Skills Development Canada. (2013, April 6). Understanding homelessness and the Homelessness Strategy | HRSDC.
Retrieved June 19, 2013, from http://www.hrsdc.gc.ca/eng/communities/homelessness/understanding.shtml
Hwang, S. W. (2001). Homelessness and health. Canadian Medical Association Journal, 164(2), 229–233.
Hwang, S. W., Aubry, T., Palepu, A., Farrell, S., Nisenbaum, R., Hubley, A. M., … Chambers, C. (2011). The health and housing in transition study: A longitudinal study of the health of homeless and vulnerably housed adults in three Canadian cities. International Journal of Public Health, 56(6), 609–623. doi:10.1007/s00038-011-0283-3
James, L. R., Demaree, R. G., & Wolf, G. (1984). Estimating within-group interrater reliability with and without response bias. Journal of Applied Psychology, 69(1), 85–98. doi:10.1037/0021-9010.69.1.85
Kane, M. T. (2006). Validation. In R. L. Brennan, National Council on Measurement in Education, & American Council on Education (Eds.), Educational measurement (4th ed., pp. 17–64). Westport, CT: Praeger Publishers. Retrieved from http://www.loc.gov/catdir/toc/ecip0614/2006015706.html
Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73. doi:10.1111/jedm.12000
Kelley, T. L. (1927). Interpretation of educational measurements. Yonkers-On-Hudson, NY & Chicago, IL: World Book.
Kertesz, S. G., Larson, M. J., Horton, N. J., Winter, M., Saitz, R., & Samet, J. H. (2005). Homeless chronicity and health-related quality of life trajectories among adults with addictions. Medical Care, 43(6), 574–585.
Krosnick, J. A., & Presser, S. (2010). Question and questionnaire design. In P. V. Marsden & J. D. Wright (Eds.), Handbook of survey research (pp. 263–313). Bingley, UK: Emerald Group Publishing.
Lawshe, C. H. (1975). A quantitative approach to content validity. Personnel Psychology, 28(4), 563–575. doi:10.1111/j.1744-6570.1975.tb01393.x
Lee, B. A., & Greif, M. J. (2008). Homelessness and hunger. Journal of Health and Social Behavior, 49(1), 3–19.
doi:10.1177/002214650804900102
Lee, B. A., & Schreck, C. J. (2005). Danger on the streets: Marginality and victimization among homeless people. American Behavioral Scientist, 48(8), 1055–1081. doi:10.1177/0002764204274200
Lehman, A. F. (1988). A quality of life interview for the chronically mentally ill. Evaluation and Program Planning, 11(1), 51–62. doi:10.1016/0149-7189(88)90033-X
Loevinger, J. (1957). Objective tests as instruments of psychological theory. Psychological Reports, 3(3), 635–694. doi:10.2466/pr0.1957.3.3.635
Long, D., Rio, J., & Rosen, J. (2007). Employment and income supports for homeless people. Presented at the 2007 National Symposium on Homelessness Research, Washington, DC. Retrieved from http://www.homelesshub.ca/(S(0z1x4h45i2wcnkzlnjequzz0))/Resource/Frame.aspx?url=http%3a%2f%2faspe.hhs.gov%2fhsp%2fhomelessness%2fsymposium07%2flong%2freport.pdf&id=32931&title=Employment+and+Income+Supports+for+Homeless+People&owner=48
Lynn, M. R. (1986). Determination and quantification of content validity. Nursing Research, 35(6), 382–385. doi:10.1097/00006199-198611000-00017
McKenzie, J. F., Wood, M. L., Kotecki, J. E., Clark, J. K., & Brey, R. A. (1999). Establishing content validity: Using qualitative and quantitative steps. American Journal of Health Behavior, 23(4), 311–318. doi:10.5993/AJHB.23.4.9
Messick, S. (1975). Meaning and values in measurement and evaluation. American Psychologist, 30, 955–966.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). New York, NY: American Council on Education & Macmillan Publishing Company.
Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741–749. doi:10.1037/0003-066X.50.9.741
Metraux, S., Caterina, R., & Cho, R. (2008). Incarceration and homelessness. In D. Dennis, G. Locke, & J.
Khadduri (Eds.), Toward understanding homelessness: The 2007 National Symposium on Homelessness Research. Washington, DC: US Department of Housing & Urban Development. Retrieved from http://works.bepress.com/metraux/1
Michalos, A. C. (1985). Multiple discrepancies theory (MDT). Social Indicators Research, 16(4), 347–413. doi:10.1007/BF00333288
Moons, P. (2004). Why call it Health-Related Quality of Life when you mean perceived health status? European Journal of Cardiovascular Nursing, 3(4), 275–277. doi:10.1016/j.ejcnurse.2004.09.004
Muldoon, M. F., Barger, S. D., Flory, J. D., & Manuck, S. B. (1998). What are quality of life measurements measuring? BMJ: British Medical Journal, 316(7130), 542–545.
Murphy, K. R., & Davidshofer, C. O. (2001). Psychological testing: Principles and applications. Upper Saddle River, NJ: Prentice Hall.
National Coalition for the Homeless. (2007). How many people experience homelessness? NCH Fact Sheet #2. Washington, DC: Author.
Netemeyer, R. G., Bearden, W. O., & Sharma, S. (2003). Scaling procedures: Issues and applications. Thousand Oaks, CA: SAGE Publications.
Osterlind, S. J. (1997). Constructing test items: Multiple-choice, constructed-response, performance and other formats (2nd ed.). Hingham, MA: Kluwer Academic Publishers. Retrieved from http://site.ebrary.com/lib/ubc/docDetail.action?docID=10052638
Palepu, A., Hubley, A. M., Russell, L., Gadermann, A., & Chinni, M. (2012). Quality of life themes in Canadian adults and street youth who are homeless or hard-to-house: A multi-site focus group study. Health and Quality of Life Outcomes, 10(1), 93. doi:10.1186/1477-7525-10-93
Phelan, J. C., & Link, B. G. (1999). Who are "the homeless"? Reconsidering the stability and composition of the homeless population. American Journal of Public Health, 89(9), 1334–1338.
Phillips, D. (2006). Quality of life: Concept, policy and practice. Taylor & Francis. Retrieved from http://lib.myilibrary.com/?ID=34728
Pierce, A. G. (1995).
Measurement. In L. Talbot (Ed.), Principles and practice of nursing research (pp. 265–291). St. Louis, MO: Mosby.
Polit, D. F., & Beck, C. T. (2006). The content validity index: Are you sure you know what's being reported? Critique and recommendations. Research in Nursing & Health, 29(5), 489–497. doi:10.1002/nur.20147
Polit, D. F., Beck, C. T., & Owen, S. V. (2007). Is the CVI an acceptable indicator of content validity? Appraisal and recommendations. Research in Nursing & Health, 30(4), 459–467. doi:10.1002/nur.20199
Population: Hidden homeless. (n.d.). Retrieved June 22, 2013, from http://www.homelesshub.ca/Topics/Hidden-Homeless-260.aspx
Rokach, A. (2005a). Private lives in public places: Loneliness of the homeless. Social Indicators Research, 72(1), 99.
Rokach, A. (2005b). The causes of loneliness in homeless youth. The Journal of Psychology, 139(5), 469–480. doi:10.3200/JRLP.139.5.469-480
Rubio, D. M., Berg-Weger, M., Tebb, S. S., Lee, E. S., & Rauch, S. (2003). Objectifying content validity: Conducting a content validity study in social work research. Social Work Research, 27(2), 94–104. doi:10.1093/swr/27.2.94
Savage, C. L., Lindsell, C. J., Gillespie, G. L., Lee, R. J., & Corbin, A. (2008). Improving health status of homeless patients at a nurse-managed clinic in the Midwest USA. Health & Social Care in the Community, 16(5), 469–475. doi:10.1111/j.1365-2524.2007.00758.x
Schilling, L. S., Dixon, J. K., Knafl, K. A., Grey, M., Ives, B., & Lynn, M. R. (2007). Determining content validity of a self-report instrument for adolescents using a heterogeneous expert panel. Nursing Research, 56(5), 361–366. doi:10.1097/01.NNR.0000289505.30037.91
Sireci, S. G. (1998a). Gathering and analyzing content validity data. Educational Assessment, 5(4), 299–321. doi:10.1207/s15326977ea0504_2
Sireci, S. G. (1998b). The construct of content validity. Social Indicators Research, 45(1–3), 83–117. doi:10.1023/A:1006985528729
Sirgy, M. J. (2012).
The psychology of quality of life: Hedonic well-being, life satisfaction, and eudaimonia (2nd ed.). Guildford, UK: Springer London. Retrieved from http://link.springer.com.ezproxy.library.ubc.ca/book/10.1007/978-94-007-4405-9/page/1
Smith, K. W., Avis, N. E., & Assmann, S. F. (1999). Distinguishing between quality of life and health status in quality of life research: A meta-analysis. Quality of Life Research, 8(5), 447–459.
Solliday-McRoy, C., Campbell, T. C., Melchert, T. P., Young, T. J., & Cisler, R. A. (2004). Neuropsychological functioning of homeless men. The Journal of Nervous and Mental Disease, 192(7), 471–478.
SPARC BC, Eberle Planning and Research, Jim Woodward & Associates Inc., Graves, J., Huhtala, K., Campbell, K., … Goldberg, M. (2009). Still on our streets… Results of the 2008 Metro Vancouver Homeless Count. Vancouver, BC: Greater Vancouver Regional Steering Committee on Homelessness.
Spence, S., Stevens, R., & Parks, R. (2004). Cognitive dysfunction in homeless adults: A systematic review. Journal of the Royal Society of Medicine, 97(8), 375–379.
Sprangers, M. A., & Schwartz, C. E. (1999). Integrating response shift into health-related quality of life research: A theoretical model. Social Science & Medicine, 48(11), 1507–1515.
Stewart, J. L., Lynn, M. R., & Mishel, M. H. (2005). Evaluating content validity for children's self-report instruments using children as content experts. Nursing Research, 54(6), 414–418.
Sun, S., Irestig, R., Burström, B., Beijer, U., & Burström, K. (2012). Health-related quality of life (EQ-5D) among homeless persons compared to a general population sample in Stockholm County, 2006. Scandinavian Journal of Public Health, 40(2), 115–125. doi:10.1177/1403494811435493
The Universal Declaration of Human Rights. (n.d.). Retrieved June 20, 2013, from http://www.un.org/en/documents/udhr/
The World Health Organization Quality of Life assessment (WHOQOL): Position paper from the World Health Organization. (1995).
Social Science & Medicine, 41(10), 1403–1409.
Tilden, V. P., Nelson, C. A., & May, B. A. (1990). Use of qualitative methods to enhance content validity. Nursing Research, 39(3), 172–175.
Tinsley, H. E., & Weiss, D. J. (1975). Interrater reliability and agreement of subjective judgments. Journal of Counseling Psychology, 22(4), 358–376. doi:10.1037/h0076640
Toro, P. A. (2007). Toward an international understanding of homelessness. Journal of Social Issues, 63(3), 461–481. doi:10.1111/j.1540-4560.2007.00519.x
Toro, P. A., Tompsett, C. J., Lombardo, S., Philippot, P., Nachtergael, H., Galand, B., … Harvey, K. (2007). Homelessness in Europe and the United States: A comparison of prevalence and public opinion. Journal of Social Issues, 63(3), 505–524. doi:10.1111/j.1540-4560.2007.00521.x
Tsui, J. I., Bangsberg, D. R., Ragland, K., Hall, C. S., & Riley, E. D. (2007). The impact of chronic hepatitis C on health-related quality of life in homeless and marginally housed individuals with HIV. AIDS and Behavior, 11(4), 603–610. doi:10.1007/s10461-006-9157-8
Uebersax, J. (n.d.). The myth of chance-corrected agreement. Retrieved May 19, 2013, from http://www.john-uebersax.com/stat/kappa2.htm
UN Commission on Human Rights. (2005). Report of the Special Rapporteur on adequate housing as a component of the right to an adequate standard of living, Miloon Kothari (No. E/CN.4/2005/48). Retrieved from http://www.refworld.org/docid/42d66e8a0.html
UN-HABITAT. (2011). Affordable land and housing in Europe and North America (No. HS/074/11E). Retrieved from www.unhabitat.org/pmss/getElectronicVersion.aspx?nr=3220&alt=1
Waltz, C. F., & Bausell, R. B. (1981). Nursing research: Design, statistics, and computer analysis. Philadelphia, PA: F.A. Davis Co.
Waltz, C. F., Strickland, O., & Lenz, E. R. (1991). Measurement in nursing research (2nd ed.). Philadelphia, PA: F.A. Davis Co.
Wilkinson, A., Roberts, J., & While, A. E. (2010).
Construction of an instrument to measure student information and communication technology skills, experience and attitudes to e-learning. Computers in Human Behavior, 26(6), 1369–1376. doi:10.1016/j.chb.2010.04.010
Wynd, C. A., Schmidt, B., & Schaefer, M. A. (2003). Two quantitative approaches for estimating content validity. Western Journal of Nursing Research, 25(5), 508–518.
Yalow, E. S., & Popham, W. J. (1983). Content validity at the crossroads. Educational Researcher, 12(8), 10–21. doi:10.3102/0013189X012008010
Yoder, K. A., Whitbeck, L. B., & Hoyt, D. R. (2008). Dimensionality of thoughts of death and suicide: Evidence from a study of homeless adolescents. Social Indicators Research, 86(1), 83–100.
Zumbo, B. D., Gelin, M. N., & Hubley, A. M. (2002). The construction and use of psychological tests and measures. In Encyclopedia of life support systems (EOLSS). Oxford, UK: EOLSS Publishers.

Appendix A: Quality of Life for Homeless and Hard-to-House Individuals (QoLHHI) Survey: Health Impact Section and Living Conditions Impact Section (Hubley et al., 2009)

Living Conditions Impact (QoLHHI Impact: Living Conditions)

I'd like to know about the place where you currently live or stay.

(Note: Yes/No = sometimes, depends, or any other mixed response. Use the Comments section to expand on any responses.)

Items A through N are coded 1 = Yes, 2 = No, 3 = Yes/No (mixed); for each item, the interviewer may also record Comments, Don't know, N/A, or No answer.

A. Do you feel that the place where you live or stay is affordable?
B. Does the place where you live or stay have the amenities that are important to you (like a fridge, stove, own bathroom, elevator)?
C. Do you have access to bathing facilities (such as a shower)?
D. IF YES TO C: Do you feel that these bathing facilities are clean enough to use?
E. IF YES TO C: Do you feel safe using these bathing facilities?
F. Overall, do you feel that the place where you live or stay is clean enough?
G. Do you feel like you have control over your own space?
H. Are the other people living or staying there too disruptive?
I. Do you have enough privacy there?
J. Do you feel there are too many restrictions placed on you there?
K. Are you always worrying that you'll catch some illness from other people living there?
L. Do you feel your stuff is safe there?
M. Do you feel that you're treated well there (for example: by landlord, shelter staff, other residents)?
N. Does it feel like a home to you?

Open-ended items (recorded verbatim):

O. What is the worst thing about the place where you currently live or stay?
P. What is the best thing about the place where you currently live or stay?
Q. Anything else you want to tell me about the place where you live or stay?

R. You've talked about some things that describe the place where you currently live or stay. Now I want to know what kind of impact/effect the place where you live or stay has on you. You could tell me that the place where you live or stay has no impact/effect on you at all. Or you could say that it has a positive impact/effect and makes things better for you. Or, maybe it has a negative impact/effect and makes things worse for you. I'd like you to rate the kind of impact/effect that the place where you currently live or stay has on you. [REFER TO SCALE]

1 = Large negative impact/effect
2 = Moderate negative impact/effect
3 = Small negative impact/effect
4 = No impact/effect
5 = Small positive impact/effect
6 = Moderate positive impact/effect
7 = Large positive impact/effect
(Don't know and No answer are also recorded.)

Now I have some q