UBC Theses and Dissertations

The test of language competence : a validity study with language disabled and normal children Ainsworth, Cheryl Anne 1989

THE TEST OF LANGUAGE COMPETENCE: A VALIDITY STUDY WITH LANGUAGE DISABLED AND NORMAL CHILDREN

By CHERYL ANNE AINSWORTH
B.A., The University of British Columbia, 1974

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ARTS in THE FACULTY OF EDUCATION (Department of Educational Psychology and Special Education)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
October 1989
© Cheryl Anne Ainsworth, 1989

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
1956 Main Mall
Vancouver, Canada V6T 1Y3

Abstract

This study investigated the validity and related psychometric characteristics of the Test of Language Competence (TLC), published in 1985 by Wiig and Secord. The TLC was developed as a measure of higher order language functioning in children and adolescents between the ages of nine and eighteen years. Evidence concerning the psychometric characteristics of the TLC is reported in the test manual; however, to date, no studies addressed primarily to the subject of TLC validity have been reported in the literature. Moreover, no information is available concerning the effectiveness of its use with local school children. This study endeavored to examine the technical characteristics of the TLC using data obtained from 23 language disordered (LLD) and 23 control subjects sampled from the local school population.
At the same time, the criterion-related validity of an informal language sample analysis was investigated. Item analysis statistics, including indices of item difficulty, item discrimination, internal consistency, and interrater reliability, were prepared for the TLC. Discriminant function analyses were used to assess the criterion validity of the TLC, with and without corrections in TLC scores for Verbal IQ. Because of the multiethnic nature of the sample, English as a second language (ESL) and English as a first language (EFL) group means were tested for significant differences on six variables. LLD and control group performance on the language sample analysis was tested for significant differences, using Wilcoxon rank sum tests.

Results of the item analyses indicated support for the internal consistency of the TLC subtests and the test composite, with the exception of Subtest Two (Making Inferences), which obtained an internal consistency coefficient below the designated .8 criterion. Subtest Two and Subtest Three (Recreating Sentences) were found to contain items of questionable validity, and all four subtests contained items that were misordered in terms of difficulty. Subtests Two and Three exhibited satisfactory criterion validity; however, Subtests One (Understanding Ambiguous Sentences) and Four (Understanding Metaphoric Expressions) failed to discriminate between LLD and control groups in a stepwise analysis. The language sample analysis discriminated between the two groups. Possible explanations for the findings, along with implications for clinical practice and recommendations for further research, are discussed.
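The item statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the procedure used in the thesis. It assumes dichotomously scored items (1 = pass, 0 = fail), takes Cronbach's alpha as the internal consistency coefficient and a corrected item-total correlation as the discrimination index, and runs on a small invented score matrix. All function names and data are hypothetical.

```python
from statistics import mean, pvariance

# Hypothetical data: rows are examinees, columns are items (1 = pass, 0 = fail).
SCORES = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]


def column(scores, j):
    """Return the scores of every examinee on item j."""
    return [row[j] for row in scores]


def item_difficulty(scores):
    """Proportion of examinees passing each item (higher = easier)."""
    return [mean(column(scores, j)) for j in range(len(scores[0]))]


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def item_discrimination(scores):
    """Correlation of each item with the total of the remaining items."""
    result = []
    for j in range(len(scores[0])):
        rest = [sum(row) - row[j] for row in scores]
        result.append(pearson(column(scores, j), rest))
    return result


def cronbach_alpha(scores):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = len(scores[0])
    item_vars = [pvariance(column(scores, j)) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


difficulty = item_difficulty(SCORES)
discrimination = item_discrimination(SCORES)
alpha = cronbach_alpha(SCORES)

# An item is "misordered" here if it is easier than the item before it.
misordered = [j for j in range(1, len(difficulty)) if difficulty[j] > difficulty[j - 1]]

print("difficulty:", [round(p, 2) for p in difficulty])
print("discrimination:", [round(r, 2) for r in discrimination])
print("alpha:", round(alpha, 3), "meets .8 criterion:", alpha >= 0.8)
print("misordered items:", misordered)
```

For this invented matrix, alpha works out to 32/45 (about .71), so the toy scale would fall below a .8 criterion of the kind the abstract applies to the TLC subtests. The thesis may have used a different reliability estimate or discrimination index.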
TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
Acknowledgements
Dedication

I Introduction
    Background of the Problem
    Purpose of the Study
    Significance of the Study
    Summary

II Literature Review
    Introduction
    The Assessment Process
    Test Validity
    Theoretical Background
    Validation Research
    Assessing ESL Populations
    The Test of Language Competence
    Theoretical Background
    Standardization
    Content Validity
    Criterion-Related Validity
    Construct Validity
    Internal Consistency
    Interrater Reliability
    Informal Assessment
    Summary
    Research Questions

III Methodology
    Sampling Procedures
    Measuring Instruments
    Test of Language Competence
    Subtest One
    Subtest Two
    Subtest Three
    Subtest Four
    WISC-R
    Informal Language Sample
    Method of Data Analysis
    Item and Test Analysis
    Correlational Analyses
    Interrater Reliability
    Regression Analysis
    Stepwise Discriminant Function Analysis
    Adjusted Discriminant Function Analysis
    Hotelling's T-Square
    LARSP
    Wilcoxon Rank Sum Tests
    Method

IV Results
    Descriptive Statistics
    Item and Test Analysis
    Item Difficulty
    Item Discrimination
    Internal Consistency
    Interrater Reliability
    Correlational Analysis
    Discriminant Function Analysis
    Simple Regression Analysis
    Adjusted Discriminant Function Analysis
    Hotelling's T-Square
    Wilcoxon Rank Sum Tests

V Discussion
    Purpose of the Study
    Summary and Discussion of Results
    Internal Characteristics of the TLC
    Item Difficulty
    Item Discrimination
    Internal Consistency (TLC Subtests)
    Internal Consistency (TLC Composite)
    Intercorrelations
    Interrater Reliability
    Relationship Between TLC and VIQ
    Language Disabled and Control Group Discrimination (TLC)
    Test Differences in ESL/EFL Group Performance
    Language Disabled and Control Group Discrimination (Language Sample Analysis)
    Conclusions
    Limitations of the Study
    Recommendations for Clinical Practice
    Recommendations for Further Research

References
Appendix A: Parents/Student Information
Appendix B: Item Analysis Data
Appendix C: Stimulus Items Used to Elicit Language Samples
Appendix D: Language Sample Transcripts
Appendix E: LARSP Summary Sheets
Appendix F: Computational Procedure for Wilcoxon Rank Sum Tests

LIST OF TABLES

Sample Characteristics: Age, Sex, English Second Language (ESL), English First Language (EFL)
Cultural Background of Sample
WISC-R Means, Standard Deviations (SD) and Standard Error of Measurement
TLC Means, Standard Deviations (SD) and Standard Errors of Measurement (SEM)
Scaled Scores and Standard Scores for the Mean Age Group (13 years) on the TLC Subtests and Composite
TLC Subtest Internal Consistency Reliabilities and Descriptive Statistics: Combined Sample
Correlations Between TLC and WISC-R for the Combined Sample
Intercorrelations Among TLC Subtests for the Combined Sample
Summary of Discriminant Function Analysis: Stepwise Selection of TLC Subtests (Total Scores)
Summary of Discriminant Function Analysis: Classification of Language Disabled (LLD) and Control Groups
Summary of Simple Regression Analyses: TLC Subtests Regressed on VIQ
Hotelling's T²: Significance of Multivariate and Univariate Differences Between ESL and EFL Group Means
Wilcoxon Rank Sum Tests: Sum of Ranks (Rj) by Stage of LARSP Analysis

Acknowledgements

I would like to thank my faculty advisor and committee chairperson, Dr. Julianne Conry, for her assistance in planning and completing this study. Thanks also to Dr. Robert Conry, second member of the committee, for contributing his time and expertise to the data analysis and interpretation. Further acknowledgements are extended to the staff and students of School District #39 (Vancouver) who participated in the data collection; to Joanna Neilson for re-scoring the TLC protocols; to Cathy Margetts who donated her secretarial skills to preparing the final draft; and to the many friends, relatives and colleagues whose interest encouraged me to persevere. Finally, but not least in importance, I would like to thank my husband, John, without whose patience and support this thesis could not have been finished; and my children, Christopher and Kimberly, who have waited many hours for mommy to be finished working.

For my father

Chapter One

Introduction

Wiig and Semel (1984) describe language assessment as a hierarchical process. The first level of language assessment involves screening for language disability. This is followed by more in-depth diagnostic assessment to identify specific areas of weakness, and goals for intervention. Finally, assessment is concerned with monitoring and evaluating progress. The choice of methods used within the assessment process depends upon the level or purpose of the assessment, and the theoretical orientation of the examiner. Although theoretical perspectives may vary, current practice generally includes a combination of formal and informal assessment techniques.
Formal assessment, by definition, involves the use of standardized evaluation procedures. Among the suggested advantages of standardized language tests are their objectivity, their replicability, and controlled administration procedures which help to eliminate unwanted sources of variance. These features, together with the fact that standardized tests yield quantitative data, make them useful for diagnostic and research purposes. Despite the appeal of standardized tests, a number of issues surround their use. These concern the technical and practical merits of many commonly-used instruments.

Background of the Problem

Sommers, Erdige and Peterson (1978) predicted that the impact of PL 94-142 (U.S. Office of Education, 1975) would be to stimulate the development and widespread use of formal language tests; a forecast which has come true. The years since PL 94-142 was enacted have given rise to a plethora of standardized language tests. Standardized test results are now required by law in the United States for placement and funding purposes. As Stephens and Montgomery (1985) have pointed out, speech-language clinicians do not have a choice as to whether they will use standardized tests, but only which tests they will use.

Despite their widespread use, standardized language tests have become the subject of increasing criticism. Particular dissatisfaction has gathered around "the proliferation of published tests and materials being marketed with insufficient information concerning their effectiveness and psychometric characteristics" (American Speech-Language-Hearing Association [ASHA], 1988, p. 75). Theoretical and technical information reported in test manuals is frequently inadequate (Lieberman, Heffron, West, Hutchinson & Swem, 1987; McCauley & Swisher, 1984a; Stephens & Montgomery, 1985).
Moreover, a review of the professional literature reveals a paucity of research concerning the psychometric characteristics of even the most commonly used instruments. Finally, critics maintain that the validity of standardized language tests used to identify language disorders among those who speak English as a second language (ESL) is highly questionable (Damico, in press; Evard & Sabers, 1979; Vaughn-Cooke, 1983).

The increased dissatisfaction with standardized language tests has had two effects. First, there has been a movement away from formal assessment procedures toward the use of informal, descriptive techniques, including language sampling and analysis. Proponents of this view argue that in addition to their technical inadequacies, standardized tests offer no practical utility. Apart from determining the existence of a language disorder, standardized tests do little to describe the nature of the problem or how to fix it. Descriptive procedures, on the other hand, take no more time to administer than a battery of standardized tests, and ultimately yield information which is useful at all levels of the assessment process (Damico, in press, 1988; Muma, Lubinski, & Pierce, 1982; Simon, 1984).

A second movement resulting from the dissatisfaction with standardized tests has been toward improving standards for test use and test construction. A recent example of this effort appeared in the form of the ASHA Guidelines on Instrument Evaluation (ASHA, 1988). The ASHA Guidelines were modeled after the American Psychological Association (APA) Standards For Educational and Psychological Tests (American Psychological Association, 1985) and were intended "as general criteria for judging the adequacy of measurement and intervention instruments or procedures" (p. 76).
The Guidelines raise issues concerning the significance of theory in test development, standardization procedures, and test reliability; however, no direct reference is made to test validity, which according to APA standards "is the most important consideration in test evaluation" (APA, 1985).

Regardless of divergent professional opinion concerning the relative merits of formal assessment, standardized tests continue to be used extensively by speech-language pathologists for assessment purposes. Lieberman and Michael (1986) and McCauley and Swisher (1984a; 1984b) have drawn attention to the fact that serious errors in diagnosis and remediation can occur if the tests used by clinicians fail to meet adequate technical standards. They maintain that in order to ensure accuracy within the assessment process, and to influence the quality of future instruments, clinicians have a responsibility to scrutinize the technical characteristics of the instruments they use. This view is consistent with APA (1985) recommendations for test use, and implies that clinicians must have access to technical information concerning the various instruments available, and that the information provided should include evidence of test validity.

Purpose of the Study

The first purpose of the present study was to investigate the validity and related psychometric characteristics of the Test of Language Competence (TLC) (Wiig and Secord, 1985). A second purpose of the study was to determine if significant differences in the performance of language disabled (LD) and control subjects would be observed on an informal language sample analysis. Each of these objectives is discussed separately below.

The Test of Language Competence (TLC) is an individually administered measure of language competence developed for use with older children and adolescents between the ages of nine and eighteen years.
Language competence is defined as "the appropriate understanding and/or expression of language content and a responsiveness to the communicative demands of a specific situation" (Wiig & Secord, 1985, p. 1). The TLC is intended to complement other formal and informal methods of language assessment, and is recommended for use with measures of receptive vocabulary and language sample analyses.

The TLC Technical Manual presents information concerning the theoretical background and development of the test. Evidence supporting TLC validity and reliability is reported on the basis of data obtained during standardization. Separate investigations using LD and control subjects are also reported; however, sampling procedures employed in these latter investigations are not well described, and intelligence test data are available for the language disabled subjects only. Santos (1987) included the TLC in a study of variance in reading comprehension; however, no studies addressed primarily to the subject of TLC validity or reliability have as yet been reported in the literature.

The primary objective of this research was to examine the technical characteristics of the TLC, and in particular, to evaluate the validity of the instrument. This was accomplished first by examining the internal characteristics of the TLC, including item difficulty, item discrimination, internal consistency, and interrater reliability. Second, two discriminant function analyses were calculated to determine the capacity of the TLC to discriminate between language-learning disabled (LLD) and control subjects both before and after the effects of verbal intelligence had been removed. Verbal intelligence was operationally defined as the Verbal Intelligence Quotient (VIQ) on the Wechsler Intelligence Scale for Children-Revised (WISC-R) (Wechsler, 1974).
Third, in order to determine if subjects who spoke English as a second language (ESL) performed differently from subjects who spoke English as their first language (EFL), group means on the TLC and WISC-R (VIQ, PIQ, FSIQ) were tested for significant differences.

A second objective of this research was to compare the performance of LD and control subjects on an informal language sample analysis. Language samples were obtained from five LD and five control subjects. These were analyzed using the Language Assessment, Remediation and Screening Procedure (LARSP) (Crystal, Fletcher & Garman, 1976). Results of the LARSP analysis for both groups were then tested for significant differences.

Significance of the Study

Data for this research were collected in School District 39 (Vancouver). Vancouver is a broadly multicultural district with a high proportion (approximately 46%) of students who speak English as a second language (ESL). Although the TLC is used by speech-language pathologists within the district, little is known about the effectiveness of its use with Vancouver students. The present study investigated the psychometric characteristics of the TLC using data obtained from within this culturally diverse population. In so doing, it contributes to existing evidence of TLC validity and reliability, and is of interest to speech-language pathologists in culturally diverse educational jurisdictions who are using or who may consider including the TLC in their assessment batteries. In the same way, this study contributes to evidence concerning the criterion-related validity of an informal language sampling procedure for the local population.

Summary

Language assessment is a hierarchical process that includes formal and informal methods of assessment. Although standardized tests are widely used, there is growing dissatisfaction concerning their technical and practical merits.
The Test of Language Competence (TLC) is a recently developed measure of language suitable for use with older children and adolescents between the ages of nine and eighteen years. Information concerning the theoretical background and development of the TLC is presented in the test manual. Research evidence of TLC validity and reliability is limited to that reported in the test manual, and one other unpublished investigation. The present study investigated the validity and related technical characteristics of the TLC using data obtained from a local and culturally diverse population. At the same time, the criterion-related validity of an informal language sample was investigated.

Chapter Two

Review of the Literature

Introduction

This chapter will review the literature concerning the validity of standardized language tests, and in particular, those developed for use with older children and adolescents. Three issues relevant to this discussion were identified in the previous chapter, and will be discussed here in the following order: theoretical considerations during test construction, the lack of test validation research, and the validity of standardized language tests applied in multicultural settings. The TLC will be reviewed within the context of this discussion, followed by a discussion of informal language sampling as an adjunct to formal assessment. The chapter opens with an overview of the assessment process, and a definition of test validity.

The Assessment Process

Language assessment is a hierarchical process that involves screening, diagnosis, program planning, and evaluation (Wiig & Semel, 1984). The first level of assessment concerns screening for possible language disorder. Screening data may be obtained from a variety of sources, including observation, student records, informal, clinician-prepared tasks, and standardized tests (Larson & McKinley, 1987; Tibbits, 1982; Wiig & Semel, 1984).
Few screening tests suitable for use with adolescents have been developed. Those available include the Screening Test of Adolescent Language (STAL; Prather, Beecher, Stafford & Wallace, 1980), and the Clinical Evaluation of Language Functions-Screening Tests (CELF; Semel & Wiig, 1980), currently under revision. A short form of The Test of Language Competence (TLC), which is the subject of this study, is intended for use as a screening instrument.

The second level of language assessment concerns diagnosis. The purpose of assessment at this level is to confirm the existence of a language disorder. Typically, this objective is met by the administration of standardized tests (Larson and McKinley, 1987; Tibbits, 1982; Wiig and Semel, 1984). A number of standardized language tests have been developed for use at this level with older children and adolescents. The Test of Adolescent Language, now the TOAL-2 (Hammill, Brown, Larsen & Wiederholt, 1987), may be used with subjects between the ages of 12 and 18. The Test of Language Development-Intermediate, revised as the TOLD-2-I (Hammill & Newcomer, 1988), was developed for older children between the ages of 8 and 12 years. Other instruments include The Fullerton Language Test for Adolescents (FLTA) (Thorum, 1980), which is intended for use with subjects between the ages of 11 and 18 years. The Clinical Evaluation of Language Fundamentals-Revised (CELF-R; Semel, Wiig, & Secord, 1987) includes norms for subjects between the ages of 5 and 16 years. Other measures of specific language skills and abilities that have not been developed exclusively for use with adolescents but may be suitable for use with this population are described by Larson and McKinley (1987), Tibbits (1982), and Wiig and Semel (1984).
Data obtained from standardized tests may be used at the following two levels of assessment for describing the nature of the language problem and planning intervention goals. To accomplish these objectives, clinicians may examine differences among subtest scores to determine areas of strength or weakness. Other methods include conducting error analyses on the basis of item responses, and altering task formats to observe where student performance breaks down.

McCauley and Swisher (1984b) have cautioned against use of the above methods for planning therapy objectives, claiming that they may lead to "a mistaken understanding of a client's problem, to inappropriate and fruitless therapy programs, or to inaccurate conclusions regarding the efficacy of therapy" (p. 338). Errors associated with response analysis stem from the fact that no single test covers an exhaustive range of skills; therefore some skills that need to be addressed in therapy may be overlooked in the assessment. A second problem is that an incorrect response to a specific test item may not represent a true deficit in the skill represented. Third, subject responses obtained under standardized conditions may be unrepresentative of the individual's language in other contexts. Finally, altering test items or teaching to specific items invalidates a test for future purposes. Profile analysis is considered an acceptable method for determining strengths and weaknesses if appropriate statistical procedures governing the interpretation of significant differences are observed; however, McCauley and Swisher point out that the information required to do this is frequently omitted from test manuals.

Although standardized tests may contribute to the diagnostic profile, information is often obtained at this level through the use of informal assessment procedures. The term informal assessment refers to interviews, observations, questionnaires, and other non-standardized procedures, including language sample analyses.
Informal procedures are, for some, the preferred method of language assessment. Proponents argue that standardized tests collapse what is essentially a complex process (language) into a few meaningless test scores, whereas informal procedures yield descriptive information that may be translated into instructional objectives (Damico, in press; Muma et al., 1982; Leonard, Perozzi, Prutting & Berkley, 1978). Others maintain that informal procedures enable the clinician to sample behavior in a variety of contexts, leading to a more accurate portrayal of language functioning (Bloom & Lahey, 1978; Larson & McKinley, 1987; Lund & Duchan, 1983). Moreover, informal procedures permit the direct application of theory "thereby bridging the gap between some less timely standardized tests and what is currently understood about the nature of receptive and expressive competence" (Simon, 1984, p. 84). Finally, standardized tests are viewed as suffering from many technical inadequacies; this, it has been suggested, "forces the clinician and teacher to use clinical judgement in the diagnostic process" (Cupples & Lewis, 1984, p. 131).

Stark, Tallal and Mellits (1982) have enumerated the pitfalls of relying on clinical judgement in language assessment. Clinical judgement, they argue, is based on inexplicit criteria; therefore it cannot be replicated, it is subject to bias, and is not suitable for research purposes. Clinical judgement cannot be used independently to assess the language of children from other cultures, and it cannot be used to distinguish between a language disorder and a more global intellectual impairment. McCauley and Swisher (1984b) advise that the reliability and validity of informal procedures in general require further clinical and research attention.

The final level of language assessment concerns progress evaluation. Evaluation is an ongoing part of the assessment process, and may include the use of formal or informal procedures.
Hammill et al. (1987) comment that "the use of criterion-referenced enroute objectives does not obviate the need to be sure that the enroute objectives do in fact lead to the desired, general integrated language goals" (p. 3), and recommend retesting students with the same, or similar instruments, that were used to identify them for special programmes in the first place. McCauley & Swisher (1984b) cite three reasons against the use of standardized tests for monitoring progress. First, standardized tests are designed to compare individuals, and may be insensitive to intraindividual changes over time. Second, changes in test scores may be related to the unreliability of the instrument. Finally, repeated administration of standardized tests may result in practice effects which invalidate test results.

While there are extremes of opinion concerning formal and informal assessment procedures, the more broadly-held view is that the two approaches are complementary (Blau, Lahey, & Oleksiuk-Velez, 1984; Kelly & Rice, 1986; Launer & Lahey, 1981; McCauley & Swisher, 1984b; Stephens & Montgomery, 1985). Larson and McKinley (1987) and Tibbits (1982) consider a combined approach essential for assessing the language of older children and adolescents.

Test Validity

Test validity concerns the appropriateness of inferences made from test scores. Traditional definitions of test validity have distinguished between content, concurrent and construct validity. More recently, validity has been defined as a unitary concept that includes all three types of evidence (APA, 1985). For the sake of clarity, each will be defined separately at this point.

A preliminary step in the process of test development is the selection of a theoretical model which defines the trait or behavior to be measured, and provides a rationale for item selection (APA, 1985; Cronbach, 1971).
Content validity concerns the adequacy of content sampling from within this theoretical framework, or the extent to which test items represent the behavior of interest in its proper proportion. Nunnally (1978) maintains that content validity "rests mainly on appeals to reason"; however, in some situations empirical methods may be employed to enhance content validity. These include, for example, using item analysis procedures during test development, or obtaining correlations between the test of interest and measures of the same trait or behavior. Detailed discussions of item development and content validity are located in Anastasi (1982), Cronbach (1971), Henrysson (1971) and Nunnally (1978).

Criterion-related validity is of primary interest when the test under investigation is intended for classification or decision-making purposes (Anastasi, 1982; Cronbach, 1971). The extent to which test scores predict performance on one or more outcome criteria is a measure of criterion-related validity. Anastasi (1982) suggests "a test may be validated against as many criteria as there are specific uses for it" (p. 138); typical criteria are academic achievement, group membership, diagnostic classification, and so forth. Criterion-related evidence may be concurrent or predictive. Concurrent validity is examined when data are obtained for the test of interest and the outcome criteria at the same point in time. Predictive validity is examined when data pertaining to the outcome criteria are obtained at some point in the future; however, the term may be used to refer to prediction at any time (Anastasi, 1982).

Although test validation should employ all three types of evidence (APA, 1985), construct validity is of critical significance when the test of interest is a proposed measure of some unobservable trait, or construct.
Construct validity concerns how well a test measures the construct it is intended to measure, and "any data throwing light on the nature of the trait under consideration and the conditions affecting its development and manifestations are grist for this validity mill" (Anastasi, 1982, p. 144). Thus, content and criterion-related evidence may contribute to evaluations of construct validity. It is on this point that Guion (1977) has argued "all validity is at its base some form of construct validity" (p. 410). Similarly, Messick (1980) defines construct validity as "the unifying concept of validity that integrates criterion and content considerations into a common framework for testing rational hypotheses about theoretically relevant relationships" (p. 1015). Messick's point can be traced to earlier discussions concerning the significance of the "nomological net" (Cronbach & Meehl, 1955) or theoretical framework within which constructs are defined in relation to other constructs and observable behaviors.

Construct validation is a process of testing hypotheses or predictions made on the basis of test performance. The extent to which an hypothesized relationship is supported is evidence of construct validity. Examples of construct validation studies include correlating the test of interest with measures of the same trait, or measures of different traits, examining item and subtest intercorrelations, and factor analysis. In situations where the proposed relationship is not supported, one may assume that the test, the research methodology, or the theoretical framework is unsound (Cronbach & Meehl, 1955).

To summarize, test validity is a unitary concept comprising content, criterion-related and construct validity. All three types of evidence are bound by a common theoretical framework which provides a rationale for test score interpretation. This definition assumes several points.
First, in order to demonstrate validity, a test must be based on a theoretical framework or model that provides a rationale for item selection and test interpretation. Second, evidence of construct validity supports not only the validity of the test under investigation, but the theory on which the test is based. Finally, test validation is an ongoing process of accumulating evidence which may ultimately be used in an evaluation of construct validity.

Theoretical Background

It may be concluded from the foregoing discussion that validity is built into a test from the outset through the articulation of a sound theoretical model. As stated, the function of a test model is to provide a rationale for item selection and the interpretation of test scores, a factor which bears upon content and construct validity. Despite the significance of theory in test construction, many language tests are not theory-driven (Muma, 1985; McCauley & Swisher, 1984a). For example, Stephens and Montgomery (1985) reviewed six tests of adolescent language (STAL, WORD Test, TOLD-I, FLTA, CELF, TOAL) and concluded only the TOAL and the TOLD-I were constructed with reference to any theoretical model.

Lieberman & Michael (1986) evaluated the content relevance and content coverage of three standardized language tests (CELF, CELI, TOLD), two of which are suitable for use with older children. Content relevance was evaluated according to five criteria, including the existence of a theoretical model. Only one test (TOLD) was judged to be adequate in this area. Content coverage in the same study was evaluated by analyzing the grammatical requirements of each test item using the Language Assessment, Remediation and Screening Procedure, or LARSP (Crystal et al., 1976), which is based on a developmental model of grammar.
Results of the LARSP analysis led the researchers to conclude that for all three tests, content coverage was incomplete and unrepresentative of the grammatical domain. Each instrument, for example, was found to overrepresent the earlier stages of grammatical development, indicating that it might be too easy to identify language problems in older children.

In a later study, Lieberman et al. (1987) compared the performance of 30 randomly-selected sixth-graders (11.6 to 12.5 years) on four adolescent language tests (FLTA, TOAL, CELF, STAL). The findings indicated that 21 subjects obtained scores below the designated cutoff point on the FLTA, compared to 22 on the TOAL, 18 on the CELF, and 6 on the STAL (a screening test). The researchers attributed the observed differences in group performance in part to the atheoretical nature of the tests, adding: "it is possible that neither the content nor the procedures of these tests may represent the essential forms, features, and systems of adolescent language in their proper proportion and balance" (pp. 260-61).

Some language tests have been constructed according to models that are not supported in research. For example, Muma (1984) was critical of the CELF (Semel & Wiig, 1980) because rather than proposing a broadly-based theoretical model, the test authors cited "various domains of presumed deficits that have been reported in the special education literature" (p. 101). Noting the methodological flaws inherent in much of the learning disabilities research, Muma concluded that the CELF authors had "managed to stack together several strawmen in the components of the CELF" (p. 102). Elsewhere, Lieberman et al. (1987) argued that the results of existing research into adolescent language development are "incomplete and fragmentary", adding that "until researchers broaden this language base and authors use it in test construction, the development of adolescent language tests seems premature and especially susceptible to problems of test inadequacy" (p. 263).

To summarize, content validity depends upon the existence of a theoretical framework. Without a well-defined theoretical domain, "assessment becomes a circular endeavor of merely claiming a domain, attaching a label, and constructing presumed tasks with their attendant responses, scores, norms, and results" (Muma, 1984, p. 102). Furthermore, by definition, construct validity assumes the existence of a theoretical, or conceptual, framework. The evidence presented would suggest that many standardized language tests may be inadequate with regard to content and construct validity because they are weak on the level of theory.

Validation Research

Test developers have a responsibility to provide evidence of test validity in test manuals (APA, 1985). Reviewers often criticize the adequacy of technical information reported in language test manuals. For example, reviewers of the WORD Test, a measure suitable for use with older children up to the age of 11 years, concur that the test authors offer minimal evidence of validity and reliability (Stephens & Montgomery, 1985; Donahue, 1985; Raju, 1985). In their review of the CELF, Stephens & Montgomery (1985) referred to the reported evidence of validity and reliability as "singularly unimpressive" (p. 36). Sommers (1985) criticized the criterion-related evidence reported for the STAL as "inappropriate" because it used the DTLA as a criterion measure, and "there is no reason to believe that the four subtests from the DTLA measure language processing either" (p. 1332).
Following a review of 30 language and articulation test manuals, McCauley & Swisher (1984a) concluded "those criteria that require the application of considerable psychometric expertise, time, and money--criteria related to empirical evidence of validity and reliability--were met least often" (p. 40).

Test validation refers to the process of gathering evidence to support specific inferences made from test scores (APA, 1985). This definition implies that test validation continues beyond the initial research reported in test manuals. Nevertheless, few test validation studies are reported in the literature. A number of studies have been reported which raise questions concerning the criterion-related validity of several adolescent language tests. These are touched upon briefly below.

Stephens and Montgomery (1985) reported that clinicians surveyed found the STAL, a screening test for adolescent language disorder, "too easy", adding that the STAL manual reported a false negative rate of 32% in students passing the STAL but falling below the designated cutoff score on the DTLA. Lieberman et al. (1987) observed that the STAL identified 6 students out of 30 as being at risk. Three other measures in the same study (TOAL, CELF, and FLTA) each identified no fewer than 18 students as being language disordered. Considering the latter three as criterion measures, the criterion-related validity of the STAL seems questionable. This low failure rate on the STAL might be attributable to the fact that it has a lower cutoff score (10th percentile) than the other measures.

Other researchers have concluded, however, that the TOAL and FLTA may, in fact, overidentify individuals as being language disordered. Caskey and Franklin (1986) and Aram, Ekelman and Nation (1984) concur that the TOAL appears to be too difficult for evaluating the lower range of language functioning in adolescents.
As an example, Caskey & Franklin found that in a sample of 20 "gifted" students (WISC-R IQ of 128 or higher), 10 obtained adolescent language quotients on the TOAL more than 15 standard score points (1 standard deviation) below their IQ scores, thus qualifying them for services as learning disabled (Caskey & Franklin, 1986). These results might be explained in part by item ordering on the TOAL. Caskey & Franklin observed that when all TOAL items were administered, some individuals achieved several basals after reaching a ceiling. The researchers concluded that items on the TOAL are not well-ordered with respect to difficulty.

Lieberman et al. (1987) addressed the question of construct validity in their study of four adolescent language tests. Differences in group performance were observed among the four measures; however, these were not significant in three out of four instances, and intertest correlations were moderately high. These results were thought to support the theory of a general language construct underlying many language tests, including those claiming to measure distinct skills and abilities. Studies reported by Damico and Damico (Damico, personal communication, August, 1989), Schery (1985) and Sommers et al. (1978) have resulted in similar conclusions. Although the data suggested some redundancy among the different measures, Lieberman et al. concluded that further research into the factor structure of adolescent language tests is necessary, and that the substitution of one adolescent language test for another is inadvisable at the present time.

The fact that few validation studies are reported in the speech-language literature is disconcerting for two reasons. First, there is insufficient evidence to support the use of any test as a criterion against which to measure the construct validity of new instruments. Second, there is limited information on which to base the revision of existing tests.
This point is particularly relevant in view of the fact that a number of standardized language tests have recently been revised. A case in point is the TOAL, which served as a criterion instrument in TLC validation research, and has now been revised as the TOAL-2 (Hammill et al., 1987). Tests should be revised "when new research data, significant changes in the domain represented, or new conditions of test use and interpretation make the test inappropriate for its intended use" (APA, 1985). An inspection of the TOAL-2 Manual revealed no explicit purpose or rationale for test revision. Validity and reliability data are reported on the basis of research using either the TOAL or the TOAL-2, because, the test authors explain, "The two versions of the test are essentially the same" (p. 47). The main difference between the two tests appears to be in the range of item difficulty. On the basis of user "comments" that the TOAL was too difficult, a number of "easy" items have been added to seven of the eight subtests. No reference is made in the TOAL-2 manual to independent research investigations of TOAL item difficulty, and it appears that no attempt has been made to correct for suggested problems of item ordering. Thus, although the TOAL-2 may represent an improvement over the previous edition, it appears that the authors have been less than thorough in their efforts toward improving the test.

In summary, validation evidence presented in language test manuals is frequently inadequate, and few studies investigating the validity of standardized language tests are reported in the professional literature. Consequently, insufficient evidence exists to support the use of any one test as a criterion instrument in validation research. Moreover, there is little empirical justification for the revision of existing instruments.
Assessing ESL Populations

In its position paper on social dialects, the Committee on the Status of Racial Minorities (ASHA, 1983) maintained that "no dialectal variety of English is a disorder or a pathological form of speech or language" (p. 23). Bernstein (1989) allows that distinguishing between a communication difference and a communication disorder in language assessment "is not an easy task". Two kinds of errors are possible. The first is to misclassify children with language differences as language disordered. A second type of error is to overlook children with language disorders because of an assumption that they have had insufficient opportunity to learn the language.

Few standardized tests have been developed to identify language disorders among the ESL population. As a result, clinicians rely on tests that have been developed for use with populations whose first language is Standard English. Among the threats to validity associated with using tests under these circumstances are the unrepresentativeness of test norms, the possibility of culturally-biased test items, lack of test-taking skills among children from minority backgrounds, examiner effects on the test behavior of culturally different children, and motivational factors, all of which may lead to errors in assessment (Evard and Sabers, 1979; Sattler, 1988; Vaughn-Cooke, 1983).

To address these issues, alternatives to existing instruments and procedures have been proposed. Examples include developing norms for distinct linguistic groups, including a percentage of minority groups in standardization samples, modifying test items, administering tests in the subjects' native language(s), and developing new tests (Evard and Sabers, 1979; Sattler, 1988; Vaughn-Cooke, 1983). Others have advocated the use of informal and criterion-referenced procedures to improve the quality of language assessment (Bernstein, 1989; Damico, in press; Holland & Forbes, 1986).
Regarding the assessment of ESL adolescents, Larson & McKinley (1987) support the development of new instruments, but maintain that informal language samples should be included in the assessment process.

The Test of Language Competence

A relatively recent publication, the TLC has arrived on the heels of considerable criticism concerning the technical adequacy of standardized language tests. It would appear that TLC authors Wiig and Secord (1985) have been sensitive to this criticism. The test manual reports extensively on the theoretical background and technical characteristics of the test. These are summarized below in relation to the foregoing discussion.

Theoretical Background. The TLC authors define language competence as a single construct requiring both the understanding of language content, and a responsiveness to the context in which communication occurs. Each of the four TLC subtests is intended to measure a unique aspect of language competence. The subtests are categorized into a model which features semantics (word meaning), syntax (grammar) and pragmatics (rules governing social/verbal communication). Within these three levels, the content of each subtest is further divided into (a) propositions in narrow contexts and (b) propositions in communicative contexts. In the former context, subtest content deals primarily with semantic or syntactic meaning, while in the latter context, subtest content includes pragmatic considerations. This model of communicative competence represents what the test authors view to be a shift away from the assessment of specific language skills such as phonology, vocabulary or syntax, to the assessment of linguistic processes or strategies.

The model was derived from an extensive review of the literature concerning linguistic strategy development. Four general areas were selected on the basis of that review and are represented by each of the TLC subtests.
A second literature review led to the selection of specific models within each area from which the subtests were developed. The TLC is then an integrated model based on not one, but several theories of language processing. Each subtest is proposed as a measure of the broader construct, language competence, and is based on existing theory or research. The same criticism raised by Muma (1984), regarding the patchwork of theory and research that went into the making of the CELF, may also apply to the design of the TLC. Little contemporary research evidence is reported to support the theoretical design of each subtest. Moreover, relationships among these various bodies of theory and research have not been established. Further examination of the technical aspects of the TLC should contribute to judgements concerning the adequacy of the test model.

Standardization. The TLC was standardized on 1,796 students from three geographic regions in the United States. The cultural characteristics of the sample are described as 86.2% "white", 8.6% "black", and 3.6% "other", while the proportion of distinct linguistic groups, other than Spanish, is not described. Given these considerations, the appropriateness of TLC norms for use with students in the Vancouver school district, where data for this research were collected, seems questionable. In 1982, for example, 24,524 Vancouver students were found to speak English as a second language (ESL). This figure represented approximately 46% of the total district population. Of these, Chinese, East Indian and Italian were the most common language groups (La Torre, 1983). More recently, 27% of students surveyed in Vancouver reported that they had learned at least two different languages simultaneously as native languages (Watson-Russell, 1986).
In their discussion of TLC development and standardization, authors Wiig and Secord explain that separate norms for separate races or ethnic groups were not considered because test validity is unrelated to the representativeness of a norming sample; however, the authors advise caution in interpreting the test scores of minority subjects and have provided information to assist in the development of local norms.

Content Validity. Evidence of TLC content validity is claimed on the basis of four criteria outlined by Kretschmer and Kretschmer (1978). These criteria focus on the role of theory in test construction, specifying that tests must be based on a theoretical definition, that there should be contemporary research support for this theoretical framework, and that sufficient information concerning the development of test items should be provided in test manuals to permit the generation of new test items. In response to these criteria it might be argued that there is insufficient contemporary research evidence to support the theoretical scaffolding on which the TLC is based.

In addition to the theoretical evidence offered in support of content validity, item analysis procedures were employed during TLC development. These are not discussed in detail in the Technical Manual, but are reported to have included internal consistency coefficients using Cronbach's Alpha to increase the homogeneity of the subtests, and studies of item difficulty (Secord, personal communication, August, 1989).

Criterion-Related Validity. Evidence of criterion-related validity is reported in the Technical Manual on the basis of correlations obtained between the TLC and three criterion measures: the TOAL, WISC-R, and the Educational Abilities Series (EAS; Thurstone, 1978) for a sample of 28 LLD and 28 controls. LLD subjects were so identified on the basis of school referral procedures which are not described.
Controls were described as normally achieving; again, how this determination was made is unclear. Correlations between the TLC, the TOAL and the EAS were calculated separately for both groups. TLC and WISC-R correlations were calculated for the LLD group only. WISC-R data were not reported for the control group.

Results of a discriminant function analysis using TLC and TOAL scores for the same 56 subjects indicated that 96% were correctly classified as language-disabled, while 93% of controls were correctly classified. A subsequent stepwise discriminant function procedure indicated that Subtest Four (Understanding Metaphoric Expressions) contributed most to group discrimination, followed by Subtest Three (Recreating Sentences), and Subtest Two (Making Inferences). Subtest One (Understanding Ambiguous Sentences) did not account for a significant proportion of variance and was not entered into the discriminant function.

In a later study, Santos (1987) investigated the variance in reading comprehension among a combined sample of 20 reading disabled and 20 control subjects (ages 15-17 years). A significant relationship was hypothesized between each of the TLC subtests and the Durrell Analysis of Reading Difficulty. Results indicated that 16 of the 20 reading disabled subjects obtained TLC composite scores more than 1 SD below the mean. In contrast, 19 of 20 control subjects scored at or above the 50th percentile. Accepting group membership as a criterion, these results might be claimed to support the concurrent validity of the instrument.

Construct Validity. Evidence of TLC construct validity is offered on the basis of correlations between the TLC and WISC-R scores for the same 28 LLD subjects described above. Higher (convergent) correlations were observed between scores on the TLC subtests and VIQ (.48 to .78), while lower (divergent) correlations were observed between scores on the TLC subtests and PIQ (.18 to .53).
While this is a reasonable interpretation, it may once again be pointed out that these correlations are available for the language disabled group only.

Further evidence of construct validity is reported on the basis of intersubtest correlations obtained from the standardization sample at various age intervals. These range from .17 to .50. Moderate correlations are explained by the fact that each subtest represents a different content domain. This is reasonable; however, more support for this interpretation could have been demonstrated if subtest-to-total-test correlations had been reported. Assuming that the total test score is an overall measure of language competence, and assuming that each subtest measures some aspect of language competence, subtest-to-total-test correlations should be higher (Anastasi, 1982).

In a separate investigation, subtest intercorrelations were calculated using data obtained from the same group of 28 LLD subjects and 28 controls described above. Correlations ranged from .24 to .57 for the LLD subjects and from -.1 to .39 for the controls. In contrast, Santos (1987) reported a considerably higher range of subtest intercorrelations (.47 to .75) for a mixed sample of reading disabled/control subjects. The higher range of correlations observed in Santos' investigation is explained by the heterogeneity of the subject sample involved.

Factor analysis results are reported on the basis of subtest intercorrelations obtained from the standardization sample. The percentage of variance explained by the first unrotated factor at all but two age levels was greater than 90%. A subsequent oblique rotation factor analysis used TLC item intercorrelations and yielded four factors at the 9-11 and 12-17 year age groups. These results are claimed to support both the existence of an underlying language factor, as well as the specificity of the TLC subtests.
Thus, "the TLC emerges as an attractive blend of both worlds--a strong general factor supported by four specific subgroups of items" (Wiig and Secord, 1985, p. 47).

Internal Consistency. The validity of a test is a function of its reliability. One method of estimating test reliability that is useful in the evaluation of test validity is to examine the degree of association among test items, or the internal consistency of the test. Internal consistency coefficients (Cronbach's Alpha) are reported for the four TLC subtests and the total test at each age level in the standardization sample. Because the TLC was intended to measure language competence across different content areas, greater homogeneity was expected within subtests than across items (Wiig and Secord, 1985, pp. 2, 48). In fact, the range of coefficients reported for the TLC subtests is from .52 to .79, while the range of coefficients reported for the TLC composite is from .75 to .82. The lower range of coefficients observed for the TLC subtests is explained by the test authors as the effect of test length on estimates of reliability. That is, the four TLC subtests "were designed to be as short as possible" in order to reduce administration time (Wiig and Secord, 1985, p. 48). In limiting the length of each subtest, internal consistency estimates have likewise been reduced. The combined length of the total test is considered to have resulted in a higher range of internal consistency coefficients than observed for the TLC subtests.

Interrater Reliability. Another consideration in the evaluation of test reliability is the extent of interrater agreement on subjectively scored items. Interrater reliability figures are reported in the Technical Manual for Subtests Three and Four, which require some judgement in scoring. Interrater agreement was defined as the percentage of agreements observed between sixteen raters and the test authors on one protocol.
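As a rough illustration of the percentage-agreement definition just given, the following Python sketch counts how often raters' item scores match a reference scoring and pools the result across raters. The function name, the data layout, and the sample scores are hypothetical, and pooling across raters is an assumption; the Technical Manual's exact computation is not described here, so this is one plausible reading and not the test authors' procedure.

```python
def percent_agreement(ratings_by_rater, reference):
    """Pooled percentage of item scores that match the reference scoring.

    ratings_by_rater: list of score lists, one per rater (hypothetical layout).
    reference: reference scores for the same items, in the same order.
    """
    if not ratings_by_rater or not reference:
        raise ValueError("need at least one rater and one scored item")
    agreements = 0
    comparisons = 0
    for scores in ratings_by_rater:
        if len(scores) != len(reference):
            raise ValueError("each rater must score every item")
        # Count the items on which this rater matches the reference score.
        agreements += sum(s == r for s, r in zip(scores, reference))
        comparisons += len(reference)
    return 100.0 * agreements / comparisons


# Invented scores for three raters on a four-item protocol (not TLC data).
reference = [3, 2, 1, 0]
raters = [[3, 2, 1, 0], [3, 2, 1, 1], [3, 1, 1, 0]]
print(round(percent_agreement(raters, reference), 1))
```

A simple percentage of this kind does not correct for agreement expected by chance, which is worth bearing in mind when reading high agreement figures.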
The final estimates were 97% for Subtest Three and 98% for Subtest Four. The above figures were obtained following a three-step procedure during which all sixteen raters were trained in the application of scoring criteria by author Secord. Intermediate measures of agreement were obtained, followed by additional instruction. Not surprisingly, an increase in the percentage of agreements was observed between the intermediate and final calculations. This method of estimating interrater agreement illustrates the effect of direct training on scorer reliability. Sample protocols and scoring criteria identical to those used in the training procedure are included in the Administration Manual. The magnitude of the interrater reliability coefficients obtained in this study will give some indication as to the adequacy of the scoring guidelines.

To summarize, the TLC represents a significant improvement over many other standardized language tests currently available for use with older children and adolescents in several respects. First, the test is based on a clearly stated theoretical model. Second, the test manual includes extensive information concerning the technical characteristics of the test. The information as reported, however, raises a number of questions. In particular, there has as yet been no attempt to examine the relationship between TLC performance and VIQ for control subjects. Second, there is some evidence to suggest that at least one of the subtests does not contribute to LLD/control group discrimination; however, the test authors have not addressed this issue. Finally, the validity of using the TLC with ESL subjects has not been explored.

Informal Assessment

As the general dissatisfaction with standardized language tests continues to increase, more extensive use is being made of informal assessment procedures.
Informal language assessment involves the use of interviews, questionnaires, observational techniques, and other non-standardized procedures, including informal language sampling and analysis. Language sampling is the process of eliciting, recording and transcribing a sample of spontaneous language, and then analyzing it to determine areas of strength and weakness. The focus of discussion here will be on the informal language sample.

Elicitation procedures used in the collection of language samples vary. These include picture stimuli, where the subject is shown a picture and is asked to describe it. Other methods include prompting statements, such as "tell me about..", or direct questions. Unstructured conversation between the clinician and subject is the preferred method of obtaining a language sample (Larson & McKinley, 1987; Atkins & Cartwright, 1982); however, this is not always practical due to situational and time constraints (Simon, 1984).

Opinion varies as to the length of the sample required to ensure adequate coverage of the subject's language. The minimum sample required to calculate Mean Length of Utterance (MLU) (Nice, 1925), for example, is 50 utterances. Crystal et al. (1976) recommend continuous sampling of 15 - 30 minutes. Muma et al. (1982) suggest sampling over time in a variety of situations. Little research evidence exists to help resolve the issue. The consensus appears to be that 50 - 100 utterances is an acceptable minimum (Darley & Spriestersbach, 1978; Wiig & Semel, 1984).

The advantages and disadvantages of language sampling have been widely discussed (Bloom & Lahey, 1978; Larson & McKinley, 1987; Muma et al., 1982; Wiig and Semel, 1984). Often cited among the advantages are the flexible administration procedures, and the opportunity for observing language behaviors in various contexts.
Moreover, because language sampling is a descriptive procedure, proponents argue that it provides valuable information for planning intervention strategies. Among the disadvantages of language samples is that they do not yield standardized scores and therefore are not useful for making classification decisions. They lack objectivity, and because administration procedures are unstandardized, they are not replicable, and results may vary from sample to sample. More practical disadvantages include the time required to obtain and transcribe language samples, and the expertise required to interpret them.

Larson & McKinley (1987) have suggested that the greatest difficulty with language samples is knowing how to analyze them. This is particularly the case in analyzing the spontaneous language of older children and adolescents. Numerous language sampling and analysis procedures have been developed (Crystal et al., 1976; Lund & Duchan, 1983; Miller & Chapman, 1983; Muma, 1981; Tyack & Gottsleben, 1974). Many of these are referenced to a developmental framework of early language development, and are considered inappropriate for use with older children and adolescents. There is at present little empirical evidence concerning the critical stages of language development within this age group, or the sequence in which these occur (Hammill et al., 1987; Larson & McKinley, 1987; Lieberman et al., 1987); however, this does not preclude the use of language samples with these groups. Despite its limitations, language sampling yields valuable descriptive information concerning social-verbal skills and expressive language characteristics at any age level (Larson & McKinley, 1987; Tibbits, 1982).

The expressive language characteristics of language-disabled youth have been described in some detail by Larson and McKinley (1987), Tibbits (1982) and Wiig and Semel (1984).
These include word-finding difficulties such as overuse of fillers (um, uh), pronouns, and circumlocutions. In addition, language disabled youth demonstrate patterns of deficient syntactic development. As a result, they have difficulty mastering the rules which govern the construction of complex sentences, and as a group show lower MLU than their non-disabled peers. It might be hypothesized that language disabled adolescents as a group would demonstrate fewer complex sentence structures in natural conversation than nonhandicapped adolescents, and that this difference would be demonstrated by an informal language sample analysis.

Summary

Language assessment is a hierarchical process that involves the use of formal and informal assessment procedures. Issues surround the use of both; however, they are generally viewed as complementary procedures. The Test of Language Competence (TLC) is a relatively new test designed for use with children and adolescents between the ages of 9 and 18 years. It represents a significant improvement over many tests currently available for use with this age group; however, evidence of TLC validity is limited to that reported in the Technical Manual, and one other unpublished investigation. The purpose of the present study was to further investigate the technical characteristics of the TLC. A second purpose was to compare the performance of language disabled and control subjects using an informal language sample.

Research Questions

This study compared data obtained by a mixed sample of LLD and control group subjects on the TLC and an informal language sample. The following research questions were addressed:

1. What are the internal characteristics of the TLC with respect to:
   a. item difficulty indices
   b. item discrimination indices
   c. internal consistency of subtests and composite
2. What is the interrater reliability of the subjectively-scored sections of the TLC?
3. What is the relationship between the TLC and VIQ?
4. How effectively does the TLC discriminate between language-disabled and control groups? To what extent does this discrimination reflect individual differences in IQ?
5. Will scores obtained by ESL subjects differ significantly from scores obtained by EFL subjects on six variables (VIQ, PIQ, TLC Subtests One through Four)?
6. Do language disabled and control subjects differ in characteristics of their informal language?

Chapter III

Methodology

This chapter presents a description of research methodology. Sampling procedures and sample characteristics are described, followed by a discussion of data collection procedures and data analysis.

Sampling Procedures

Data were collected in School District 39 (Vancouver). Members of the language disabled group (LLD) were selected from classrooms for the communicatively disordered located in two schools, one elementary and one secondary, situated on the city's east side. Children are declared eligible for placement in these classes on the basis of psychoeducational and speech-language assessment data. Eligibility criteria allow a significant discrepancy between WISC-R VIQ and PIQ in favour of Performance, although this may not always occur. Children must demonstrate language delays of two or more years in the primary grades and three years in the intermediate/secondary grades, and their language deficits must not be the result of physical, intellectual, or emotional impairments. Although children in these classes may speak English as a second language, this is not considered to be the primary cause of their language difficulties (Education Services Group, 1985).
For the purpose of this study, children were required to have no greater than average verbal ability as measured on the WISC-R Verbal Scale (VIQ < 115), and to have at least average nonverbal ability as measured on the WISC-R Performance Scale (PIQ > 85).

Each student in both communications classes received an information package to be taken home and read by parents. The package contained a letter explaining the purpose of the study, parent/student consent forms, and a brief questionnaire (Appendix A) concerning family linguistic background, area of residence and educational history. A total of 25 students agreed, with parent permission, to participate in the study. Existing WISC-R scores were then obtained from student records. Students whose WISC-R data were more than two years old were retested. Those who obtained PIQs below 85 were eliminated. No subjects were eliminated on the basis of VIQ. The final number of students in the LLD group was 23.

Control group members (controls) were selected from two regular education classes, one in the same elementary school as LLD subjects, and one in a neighboring high school. Teachers and counsellors were provided with lists of matching criteria; these included the age, sex and linguistic background of each subject in the LLD group. Staff were asked to nominate students who met these criteria, and who were judged to be of average classroom achievement on the basis of school grades. Each nominated student received an information package addressed to parents. Students who, with parent permission, agreed to participate were administered the WISC-R. Students of average verbal ability (VIQ 85-115) and at least average nonverbal ability (PIQ > 85) were retained. Students receiving learning assistance, or instruction in English as a second language, were excluded, as were those whom school counsellors considered to be of above average achievement according to school records.
A total of 23 subjects were retained.

The final sample consisted of 46 subjects ranging in age from 11 to 15 years. Subjects were matched for age, sex and linguistic background. A summary of sample characteristics is provided in Tables 1 and 2.

Table 1
Sample Characteristics: Age, Sex, English Second Language (ESL), English First Language (EFL)

                            Number in Each Group
Characteristic          Language Disabled   Controls        Combined
Sex
  Male                         13              12              25
  Female                       10              11              21
  Total                        23              23              46
Language
  ESL                           9              11              20
  EFL                          13              12              25
  Unknown                       1               0               1
  Total                        23              23              46
Age Statistics (in years/months)
  Range                   11-3 to 15-5    11-3 to 15-1
  Mean                        13.4            13.5

Table 2
Cultural Background of Sample

                  No. Language Disordered   No. Controls   No. Combined
Croatian                    1                    -               1
Japanese                    1                    -               1
Portuguese                  -                    1               1
Greek                       1                    1               2
Native Indian               1                    1               2
Filipino                    1                    1               2
Punjabi                     1                    1               2
Vietnamese                  -                    2               2
Hindi/Fijian                2                    2               4
Chinese                     7                    6              13
English                     8                    8              16
Total                      23                   23              46

Measuring Instruments

The Test of Language Competence (Wiig and Secord, 1985)

The Test of Language Competence (TLC) was developed for use with older children and adolescents between the ages of 9.0 and 18.11 years. It is intended to assess delays in the development of language competence. "Language competence" is defined by the test authors as a unitary construct consisting of (a) the understanding and expression of language and (b) sensitivity to the social context within which the communication occurs. The TLC consists of four subtests, each intended to measure a unique aspect of language competence. A brief description of each subtest follows:

Subtest One (Understanding Ambiguous Sentences)

The subject is presented with one sentence that may be interpreted in two ways (e.g., "I saw the girl take his picture"). The task is to verbalize both interpretations. The subtest includes thirteen scoreable items, and two training items. The entire test is given; no basal or ceiling levels are established.
The subtest may be discontinued if the subject fails to respond three times in a row; however, this is not a requirement. The subject is allowed a maximum response time of "10-20" seconds. Scoring is objective, and weighted as follows: two correct responses = 3 points; one correct response = 1 point; no correct responses = 0 points. The range of raw scores possible is from 0 to 39.

Subtest Two (Making Inferences)

The subject is presented with two sentences that describe the beginning and ending of a situation (e.g., "Jack went to a Mexican restaurant"; "He left without giving a tip"). The task is to verbalize two inferences that describe what might have transpired between these two instances. The subtest includes twelve scoreable items, and one training item. No basal or ceiling rules apply; the subtest may be discontinued if the subject fails to respond three consecutive times. The response time allowed per item is 60 seconds. Scoring is objective and weighted as in Subtest One. Possible raw scores range from 0 to 36.

Subtest Three (Recreating Sentences)

The subject is presented with three stimulus words and a picture depicting a social situation (e.g., a picture of people hiking is accompanied by the words 'fall', 'leg', 'and'). The words are read aloud to the subject, whose task is to combine the three stimulus words in a sentence that might have been spoken by someone in the picture. The subtest contains 13 scoreable items and 2 trial items. The maximum allowable response time is 60 seconds. Items are scored twice. They are scored first for correctness on the basis of scoring criteria provided in the test manual. This system entails some subjectivity on the part of the examiner, and rates responses on a scale of 0, 1, or 3. Items are scored a second time for the number of stimulus words included in each sentence (three stimulus words = 3 points, two stimulus words = 1 point, and 1 or 0 stimulus words = 0 points).
These two sets of scores are then combined to obtain the subtest raw score. Possible scores for each item are 0 to 4, or 6. The range of scores possible for the subtest total is from 0 to 78.

Subtest Four (Understanding Metaphoric Expressions)

The subject is presented with a common metaphor (e.g., "She sure casts a spell over me"). Each item is presented verbally, then in print. The task is first to explain what the metaphor means. Second, the subject is to choose the correct interpretation from among four alternatives. The subtest includes 12 items plus two trials. Fifteen seconds is allowed for the first part of the task, and 45 seconds for the second. No basal or ceiling rules apply; the subtest may be discontinued after three consecutive failures to respond. Points per item are awarded as follows: 3 points for two correct responses, 1 point for one correct response, or 0 points. Possible total raw scores range from 0 to 36.

The TLC package includes a Technical Manual, an Administration Manual, and a Stimulus Manual. Subtest One and Two items are presented verbally by the examiner, and then in print using the Stimulus Manual. In this way, the subject first hears and then reads the items. Subtest Three items are read aloud by the examiner while the subject looks on. For Subtest Four, the first part of each item is presented verbally, and then in print using the Stimulus Manual. The second part of each item is presented visually while the examiner reads the four options. All responses are verbal, and recorded on the test protocol by the examiner.

The TLC yields scaled scores for each of the four subtests (mean = 10; SD = 3), and a standard score for the TLC Composite (mean = 100; SD = 15). A Partial test score, consisting of Subtests Three and Four, may be calculated. The Partial composite is recommended for screening purposes. Scaled scores and standard scores may be converted to age equivalent or percentile scores. Confidence intervals are provided.
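The raw-score weighting described above for the four subtests can be illustrated with a short sketch. It is written in Python purely for illustration; the function and constant names are hypothetical and are not part of the published test materials, and the sketch covers only the raw-score arithmetic, not the conversion to scaled or standard scores.

```python
# Illustrative sketch of the TLC raw-score weighting described in the text.
# All names here are hypothetical; they do not come from the test manual.

TWO_RESPONSE_WEIGHTS = {0: 0, 1: 1, 2: 3}       # Subtests One, Two and Four
WORD_COUNT_WEIGHTS = {3: 3, 2: 1, 1: 0, 0: 0}   # Subtest Three, word-count scoring
HOLISTIC_RATINGS = (0, 1, 3)                    # Subtest Three, holistic scoring


def score_two_response_item(n_correct):
    """Points for an item on which up to two correct responses are possible."""
    return TWO_RESPONSE_WEIGHTS[n_correct]


def score_recreating_item(holistic_rating, n_stimulus_words):
    """Combined Subtest Three item score: holistic rating plus word-count score."""
    if holistic_rating not in HOLISTIC_RATINGS:
        raise ValueError("holistic rating must be 0, 1 or 3")
    return holistic_rating + WORD_COUNT_WEIGHTS[n_stimulus_words]


def subtest_raw_score(item_scores):
    """A subtest raw score is the sum of its item scores."""
    return sum(item_scores)


# Maximum raw scores implied by the item counts given in the text.
max_subtest_one = subtest_raw_score([score_two_response_item(2)] * 13)
max_subtest_two = subtest_raw_score([score_two_response_item(2)] * 12)
max_subtest_three = subtest_raw_score([score_recreating_item(3, 3)] * 13)
max_subtest_four = subtest_raw_score([score_two_response_item(2)] * 12)

# Every combined item score that the two Subtest Three scorings can produce.
possible_recreating_scores = sorted(
    {score_recreating_item(h, w) for h in HOLISTIC_RATINGS for w in range(4)}
)
```

Under these assumptions the maxima come to 39, 36, 78 and 36, matching the raw-score ranges stated for each subtest, and a combined Subtest Three item score of 5 cannot occur.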
A detailed discussion of TLC validity and reliability is located in Chapter Two.

Wechsler Intelligence Scale for Children-Revised (WISC-R) (Wechsler, 1974)

The WISC-R is an individually-administered test of general intelligence suitable for use with children between the ages of six and sixteen years. The test comprises twelve subtests organized into two scales. The Verbal Scale contains six subtests and provides a measure of verbal reasoning ability. The Performance Scale is made up of the remaining six subtests and provides a measure of nonverbal reasoning ability. Subtests yield scaled scores with a mean of ten and standard deviation of three. Ten of the subtest scaled scores are combined to form the Verbal, Performance and Full Scale IQs, each with a mean of 100 and a standard deviation of 15. The validity and reliability of the WISC-R are well supported in research. Sattler (1988) and Kaufman (1979) provide detailed discussions of the WISC-R and its psychometric properties.

Informal Language Sample

Language samples were obtained by tape-recording a fifteen-minute conversation between the examiner and each subject. An imperative format (e.g., "Tell me about...") was used to elicit the language samples. The imperative format was chosen for three reasons. First, this format is replicable (Atkins and Cartwright, 1982). Second, there is evidence to suggest that imperatives elicit more fluent and more complex language than other procedures (Wiig and Semel, 1984). Third, this method was appealing for use with language disabled adolescents whose spontaneous conversation might be limited. Stimulus items used to elicit language samples are presented in Appendix C.

Five language samples were randomly selected from each group of subjects. Fifty utterances from the middle of each sample were transcribed for analysis using the LARSP procedure (Crystal et al., 1976), described below.
Utterances were excluded from the analysis if they were partially or completely unintelligible, if they were repetitions of earlier responses, or if they were unfinished (e.g., "I went to the..."). Single word utterances, and starters and fillers (like, um, you know), were not included. Repetitions due to dysfluency were treated as one utterance (e.g., "I went to the...to the store").

Methods of Data Analysis

Item and Test Analysis. The reliability and validity of a test are largely a function of its item characteristics. Item analysis is a term applied to the examination of item characteristics. Typically it includes measures of item difficulty, item discrimination, and item homogeneity.

Item difficulty is defined as the proportion of persons passing or failing a test item. It is related to the total distribution of test scores, and to test reliability. For dichotomously scored items, item difficulty is expressed in terms of p values, or the percentage of persons passing each item. An index of item difficulty for items scored on a continuous scale, such as is the case with the TLC, is the mean score for each item (Nunnally, 1978).

Measures of item discrimination indicate the extent to which a test item differentiates among individuals on the behavior being measured (Anastasi, 1982). Item discrimination may be evaluated on the basis of correlations between each item and an external criterion, or between each item and the total test score. Point biserial correlations are appropriate for use when test items are scored dichotomously. When items are scored on a multipoint scale, as in the case of the TLC, product moment correlations are appropriate. An item demonstrates adequate discrimination when it reaches a level of .3 or better when correlated with the total test score (Nunnally, 1978).

Estimates of internal consistency describe the degree of association among test items.
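The two item statistics described above for multipoint items (item difficulty as the item mean, and item discrimination as the product moment correlation between an item and the total score) can be sketched as follows. This is a minimal illustration in Python, not the LERTAP procedure used in the study; the function names and the small score matrix are hypothetical and are included only so that the sketch runs.

```python
from math import sqrt

# Hypothetical illustration of item difficulty and item discrimination for
# multipoint items. Names and data are assumptions, not taken from the study.


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Product moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_analysis(scores, criterion=0.3):
    """scores: one row per subject, one column per item (e.g. 0, 1 or 3).

    Difficulty is the item mean; discrimination is the correlation of the
    item with the total score (the total here includes the item itself).
    """
    totals = [sum(row) for row in scores]
    results = []
    for j in range(len(scores[0])):
        item = [row[j] for row in scores]
        r = pearson_r(item, totals)
        results.append(
            {
                "item": j + 1,
                "difficulty": mean(item),
                "discrimination": r,
                "adequate": r >= criterion,
            }
        )
    return results


# Made-up scores for six subjects on three items, for demonstration only.
example_scores = [
    [3, 3, 1],
    [3, 1, 0],
    [1, 1, 0],
    [1, 0, 3],
    [0, 1, 0],
    [0, 0, 1],
]
example_results = item_analysis(example_scores)
```

Each entry of `example_results` reports the item mean, the item-total correlation, and whether that correlation meets the .3 criterion cited from Nunnally (1978).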
The Kuder-Richardson formulas (K-R 20 and K-R 21) are among the most commonly used methods of calculating internal consistency. Cronbach's Alpha is a generalization of the K-R 20 formula suitable for use with items scored on a multipoint scale. Hoyt's Analysis of Variance is a less frequently used procedure that produces the same results as K-R 20. Internal consistency estimates of .8 or better are considered to indicate acceptable test reliability (Nunnally, 1978).

In this study, TLC item characteristics were analyzed using LERTAP (Nelson, 1974), an extensive test analysis program. In order to study the effectiveness of the subjective scoring criteria during this analysis, Subtest Three (Recreating Sentences) was coded as three subtests: Subtest Three (H), Subtest Three (W), and Subtest Three (T). These correspond to the holistic, word count, and total subtest scores respectively. TLC items are scored on a multipoint scale; therefore, the mean and standard deviation of each item were inspected for relative degrees of difficulty. Likewise, product moment correlations were obtained between items and subtest/total test scores to evaluate item discrimination. In addition, correlations were calculated between each TLC item and an external criterion (VIQ). Hoyt coefficients were obtained for each of the TLC subtests and the composite. Cronbach's Stratified Alpha was calculated for the test composite only.

Correlational Analyses. Pearson product-moment correlation coefficients (Pearson r) were obtained between TLC subtests and the test composite. TLC and WISC-R scores were correlated to study the relationship between the TLC and VIQ. Correlations were obtained using the computer program SPSSX (Lai, 1986).

Interrater Reliability. In order to measure the extent of interrater agreement on the subjectively scored sections of the TLC, reliability coefficients (Pearson r) were calculated between scores obtained from two independent raters. Rater One was the graduate student researcher, and Rater Two was a speech-language pathologist who uses the TLC in her employment with the Vancouver school district.

Regression Analysis. Regression analysis is a method by which scores on a dependent or criterion variable are predicted from scores on an independent variable (Pedhazur, 1982). The extent to which an observed criterion score deviates from its predicted score is the residual, or error of estimate, for that individual. Residual scores represent that proportion of variance which is unique to the criterion and unaccounted for by variance in the independent variable. In this study, a series of simple regression analyses was calculated using Verbal IQ as the independent variable, and each of the four TLC subtests as dependent variables. The purpose was to obtain standardized residual scores representing that proportion of variance unaccounted for by Verbal IQ, and unique to the TLC, for each individual. Residual scores were standardized to have a mean of 0 and standard deviation of 1. These analyses were conducted using the computer program SPSSX.

Stepwise Discriminant Function Analysis. Discriminant function analysis is an extension of regression analysis suitable for use with multiple variables when the criterion is group membership (Pedhazur, 1982). In his discussion, Klecka (1980) divides discriminant analysis into two levels of activity: interpretation and classification. At the level of interpretation, one is concerned with obtaining the canonical discriminant functions. A discriminant function is a composite of variables that has maximum potential for discriminating between groups.
The number of functions possible is one less than the number of groups, or the number of discriminating variables, whichever is smaller. Discriminant functions are applied at the second level of activity to predict group membership.

In stepwise discriminant function analysis, the variable contributing most to group discrimination is entered first into the equation. That variable is then paired with each of the remaining variables, and the most discriminating of the remaining variables is entered. This stepwise selection of variables continues until all variables in the function have been entered, or until the remaining variables provide no significant contribution to group discrimination. Wilks' lambda (U) is a measure of residual discrimination employed in stepwise procedures. Values of U range from 0 to 1, with 0 indicating maximum group discrimination, and 1 indicating no discrimination. The U statistic may be converted to an F statistic and tested for significance. By inspecting the U and F statistics, it is possible to distinguish the discriminating variables from those which do not contribute substantially to group discrimination (Klecka, 1980).

In this study, the computer program BMDP-P7M (Dixon, 1988) was used to calculate a forward stepwise discriminant function analysis. Each of the four TLC subtests was employed as a discriminating variable, and the criterion was group membership. The purpose of the analysis was to observe first the relative contribution of each subtest to group discrimination, and second, the capacity of the test to predict group membership. This was followed by a second discriminant function analysis using Subtest Three (Holistic Scoring) in place of the Subtest Three total, together with the remaining three subtests as discriminating variables. The purpose was to determine if removal of the Word Count scoring would significantly alter the results.

Adjusted Discriminant Function Analysis. The standardized residual scores obtained by the regression analyses represented that proportion of variance unique to the TLC and unaccounted for by VIQ. In order to observe the relative contribution of each TLC subtest to group discrimination after the effects of Verbal IQ had been removed, these standardized residual scores were entered into a second discriminant function analysis using the computer program BMDP-P7M.

Hotelling's T-Square. Hotelling's T-square (T²) is a multivariate technique for measuring the distance between group means on two or more variables simultaneously. It can be converted to an F statistic and evaluated for significance. In this study, a total of 22 subjects were known to speak English as a second language (ESL). In order to rule out the effect of ESL on test performance, individuals were assigned to one of two groups (ESL or non-ESL). Group means were then tested for equality on six variables: WISC-R Verbal IQ, WISC-R Performance IQ, and each of the four TLC subtests. The computer program used for this analysis was BMDP-3D.

LARSP. LARSP, described in The Grammatical Analysis of Language Disability (Crystal et al., 1976), is a syntactic analysis procedure. A computerized version of LARSP (Bishop, 1985) was used in this study. The program analyzes sentences at three levels. First, sentences are broken into words, and each word is classified as a part of speech. At the next level, sentences are divided into noun, verb or adverbial phrases. Relationships between clauses are analyzed at the final stage of the program. Structures at each level are assigned to one of seven developmental stages as frequency counts. Mean length of utterance (MLU) is also calculated. Results are printed out on the LARSP summary sheet.
The seven developmental stages represented in the LARSP analysis are as follows:

Stage I: One Word Sentences
Stage II: Two Word Sentences
Stage III: Three Word Sentences
Stage IV: Sentences of Four Words or More
Stage V: Recursion
Stage VI: System Completion
Stage VII: Discourse structure, syntactic comprehension and style

Stage one (single word) utterances were not included in this analysis. Stages six and seven are not handled by the Bishop version of LARSP and were not dealt with in this study. The analysis concentrated on stages two, three, four, and five. Stage five was of principal interest in this study because it represents that point at which children begin to use complex patterns of sentence structure. Stage Five is defined as follows:

    Essentially what the child has to learn here is a set of connecting devices which can be used to interrelate clauses, and the transformational processes whereby one can be used within ('embedded within') another. Once these devices have been learned, of course, the process can continue indefinitely, longer and more complex sentences being built up as a result. It is this feature of language, to take a basic structure and use it repeatedly to produce extensive sequences, which is the primary characteristic of the creativity of language...It is accordingly a stage of great significance in normal development, as at this point the range of expression available to the child is enormously increased. (Crystal et al., 1976, p. 76).

If LLD adolescents do, in fact, use fewer complex sentence structures than non-LD youngsters, some divergence in the performance of LLD and control group subjects might be expected at the Stage Five level.

The LARSP system was selected over other available language analysis procedures for several reasons. First, it is suitable for use with both children and adults, and with speakers of nonstandard English (Crystal et al., 1976; Holland & Forbes, 1986; Tibbits, 1982).
Second, it is based on a descriptive framework of English grammar (Quirk, Greenbaum, Leech, & Svartvik, 1972), and thus has adequate content validity (Lieberman et al., 1987). Finally, the LARSP procedure has been found to discriminate among groups of language disabled and control subjects at different age levels (Hawkins & Spencer, 1985; Kearns & Simmons, 1983; Penn, 1983; Penn & Behrmann, 1986).

Wilcoxon Rank Sum Test. The Wilcoxon rank sum test is a nonparametric test of significance of group differences, appropriate for use with two independent samples when data are in the form of frequency counts. Data from two groups are combined and assigned ranks. The ranks are then summed, and a value (Rj) is obtained for each group. The probability of this value is then tested in relation to the theoretical distribution of Rj.

In this study, rank sum tests were calculated using data obtained from the LARSP procedure, described above. The purpose was to determine whether LARSP would discriminate between LLD and Control group members, particularly at the Stage Five level. Frequency data were converted to percentages in order to adjust for fluency. Rank sum tests were calculated using procedures outlined by Ferguson (1976).

Method

Subjects were administered the Test of Language Competence (TLC), the Wechsler Intelligence Scale for Children-Revised (WISC-R), and an informal language sample. Testing was conducted over a six-month period by two graduate students trained in the administration, scoring and interpretation of standardized tests. The WISC-R was not re-administered to LLD group students who had been tested within the previous two years. In these instances, WISC-R testing had been conducted by school psychologists employed in School District 39.

Chapter IV

Results

This chapter presents the results of the data analysis.
Descriptive and item analysis statistics are presented first, followed by results of the correlational analyses. Results of the discriminant function analyses are described next. Finally, results of Hotelling's T-square and Wilcoxon Rank Sum Tests for equality of group means are summarized.

Descriptive Statistics

Group descriptive statistics are summarized in Tables 3 and 4. These include the mean, standard deviation and standard error of measurement for each group (LLD and Control) as well as the combined sample on the TLC and on the WISC-R Verbal and Performance Scales. TLC results are reported here in raw scores.

The mean subtest and composite scores obtained by each group and the combined sample were converted to scaled scores and standard scores respectively using TLC norms. The mean age for each group (13 years) was used to make this conversion. Results are located in Table 5. These indicate that LLD and Control subjects obtained generally lower than average scores on all but two subtests.

Table 3
WISC-R Means, Standard Deviations (SD) and Standard Error of Measurement (SEM)

                                      WISC-R Intelligence
                          Verbal             Performance           Full Scale
Group                Mean    SD   SEM     Mean    SD    SEM     Mean    SD    SEM
Language Disabled    74.8   8.9   1.9    101.0   10.7   2.2     86      8.3   1.7
Control             103.0   9.1   1.9    111.0   11.0   2.3    107.3   10.4   2.1

Table 4
TLC Means, Standard Deviations (SD) and Standard Errors of Measurement (SEM)

                                       LLD                   Controls
Subtest       No. of Items     Mean    SD    SEM       Mean     SD    SEM
Ambiguous          13           7.4    6.8   1.4       19       8.0   1.7
Inferences         12          20.9    5.0   1.0       29.7     3.3    .7
Recreate           13          46.8   10.1   2.1       63.3     6.9   1.4
Metaphors          12           7.3    5.5   1.1       21.8     8.5   1.8
TLC Total          50          82.4   20.8   4.3      133.7    19.8   4.1

Table 5
Scaled Scores and Standard Scores for the Mean Age Group (13 years) on the TLC Subtests and Composite

                                 TLC Scaled Scores
TLC Subtests                LLD    Controls    Combined
Ambiguous Sentences          03       06          04
Making Inferences            04       08          06
Recreating Sentences         03       08          06
Metaphoric Expressions       03       07          05

                                 TLC Standard Scores
Composite                    65       83          69

Note. Scaled score mean = 10; SD = 3. Standard score mean = 100; SD = 15.

Item and Test Analysis

Table 6 presents results of the LERTAP analysis. These include the means, standard deviations, and standard errors of measurement for each of the TLC subtests, and the total test, using the combined sample. In this instance, Subtest Three (Recreating Sentences) was analyzed as three subtests (Holistic, Word Count, Total); however, only the subtest composite (Subtest Three - Total) was included in the total test statistics. Individual item statistics (mean, standard deviation, item correlations) for each subtest are summarized in Appendix B. Appendix B also includes the percentage distribution of subjects on each item.

Table 6
TLC Subtest Internal Consistency Reliabilities and Descriptive Statistics: Combined Sample

Subtest          Mean      SD     SEM    Hoyt     r
Ambiguous       13.17    9.37    2.97    .89
Inferences      25.30    6.11    3.08    .72
Recreate(H)     23.76    7.29    3.27    .78    .92
Recreate(W)     31.35    7.85    3.10    .83
Recreate(T)     55.04   11.98    4.94    .82
Metaphors       14.54   10.16    3.15    .89    .99
Composite      108.07   32.76    9.27    .95

Note. n = 46 for Tables 6 through 11.
r = Interrater reliability.
H = Holistic Scoring; W = Word Count; T = Subtest Total.
Coefficient Alpha for the test composite = .87.

Item Difficulty. Item means for the combined sample were compared for evidence of item difficulty. The range of scores possible for each item was 0, 1 or 3; thus item means falling near 1.5 (or 3 for Subtest Three - Total) represented the mid-range of difficulty.

Results indicated that, for this sample, Subtest One (Understanding Ambiguous Sentences) was the most difficult of the TLC subtests. Item means ranged from .56 to 1.6, and although items were arranged toward increasing levels of difficulty, some variation within this pattern was observed.
For example, 19 subjects (41%) obtained scores of 0 on item 13 (mean = .8), as compared to 28 subjects (60%) who scored 0 on item 10 (mean = .56), indicating item 13 to be an easier item.

All item means for Subtest Two (Making Inferences) fell above the mid-range of difficulty (1.7 to 2.5), indicating this to be the least difficult subtest, with the exception of Subtest Three (Holistic), described below. A general trend toward increasing item difficulty was noted; however, as in Subtest One, there was some variation within this pattern.

Item means for Subtest Three (Holistic) ranged from 1.3 to 2.5. The range of item means for Subtest Three (Word Count) was 1.8 to 2.7. This was indicated to be the least difficult of the TLC subtests. The higher range of item means observed for the Word Count subtest stems from the fact that most subjects received credit for attempting to include all three stimulus words in their responses. Item means for Subtest Three (Total) ranged from 3.5 to 5.2. Items did not appear to be arranged in order of difficulty on this subtest, nor on either of the two separate scoring systems.

Subtest Four (Understanding Metaphoric Expressions) was the second most difficult subtest. The range of item means observed for this subtest was .72 to 1.8. Items 1 to 9 were not ordered in terms of difficulty. Items 10, 11 and 12 obtained lower mean scores relative to the other items (mean < 1.0); thus more difficult items were located at the end of the subtest.

Because some discrepancies in item ordering were noted among the different subtests, item order difficulty correlations were calculated using Spearman's rank order correlation coefficient (Rho).
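A rank order correlation between item position and item difficulty, of the kind referred to above, can be sketched as follows. This is an illustrative Python sketch and not the procedure actually run in the study: the function names are hypothetical, tied values are given average ranks, and it is an assumption of the sketch that difficulty is taken as the inverse of the item mean, so that a subtest ordered perfectly from easiest to hardest yields Rho = 1.

```python
from math import sqrt

# Hypothetical sketch of an item-order / item-difficulty rank correlation.
# Names, and the sign convention for difficulty, are assumptions.


def average_ranks(values):
    """Rank values from smallest (rank 1) upward, averaging ranks for ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's Rho computed as Pearson r on the (average) ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def item_order_rho(item_means):
    """Correlate item position with difficulty (lower item mean = harder)."""
    positions = list(range(1, len(item_means) + 1))
    difficulty = [-m for m in item_means]
    return spearman_rho(positions, difficulty)
```

With this convention, `item_order_rho` returns values near 1 when item means fall steadily across a subtest, and lower values when easier items appear after harder ones.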
Coefficients ranged from .53 to .80: .53 (Subtest Three - Holistic), .65 (Subtest Three - Total), .66 (Subtest Three - Word Count), .76 (Subtest One - Understanding Ambiguous Sentences), .77 (Subtest Two - Making Inferences), and .80 (Subtest Four - Understanding Metaphoric Expressions). These results indicate that some TLC items are not well-ordered in terms of difficulty.

Item Discrimination. Each item in the TLC was correlated with its respective subtest, the TLC composite, and an external criterion (VIQ). Item to subtest correlations of .3 or greater indicated that the item was discriminating adequately among subjects.

All items in Subtest One (Understanding Ambiguous Sentences) and Subtest Four (Understanding Metaphoric Expressions) met the .3 criterion for adequate discrimination when correlated with their respective subtests.

Four items in Subtest Two (Making Inferences) failed to meet the .3 criterion when correlated with the subtest total. These included the following:

Item two: Tim stopped on his way to school to play a video game. At the locker, he realized he had to hurry back in order to be in class on time. Tim had to go back because...

Item three: The sun was shining, when the Robertson's started out for the picnic. Unfortunately they had the picnic in the livingroom. They had the picnic in the livingroom because...

Item five: Bob and Ray rode on a crowded bus to the shopping mall. They told the story of Bob's bad luck to a policeman. They talked to a policeman because...

Item nine: Lori took the bus downtown because it was her mother's birthday. She left the fashionable stores with tears in her eyes. Lori cried because...

The results suggest that these four items discriminate poorly among individuals on the specific behavior of interest (inferential thinking). Moreover, item two failed to discriminate adequately among individuals on a broader verbal construct represented by the TLC composite.
Three items (two, eight, twelve) in Subtest Three (Recreating Sentences-Holistic), and six items (one, two, three, five, eight) in Subtest Three (Recreating Sentences-Word Count) did not meet the .3 criterion when correlated with their respective subtests. The most discriminating set of items was produced by Subtest Three (Total). Only two items were shown to be unacceptable. These included item eight (without, difficult, again) and twelve (fresh, nor, here). Item eight failed to reach the level of .3 when correlated with either the TLC composite or VIQ, suggesting that this item discriminates poorly among individuals on both the general and specific factors it was intended to measure.

Internal Consistency. Hoyt estimates of internal consistency for the TLC subtests are located in Table 6. These range from .72 to .89, indicating a strong degree of association among items within each subtest. Hoyt and Alpha coefficients obtained for the TLC composite were .95 and .87 respectively. Alpha coefficients reported in the Technical Manual across eight age intervals for the TLC composite ranged from .77 to .82.

Interrater Reliability. Interrater reliability coefficients for the subjectively scored sections of the TLC are presented in Table 6. Coefficients ranged from .92 to .99, indicating close agreement between raters.

Correlational Analyses

Correlations between the TLC and the WISC-R based on the combined sample are presented in Table 7. Coefficients above .34 are significant at α = .01. A correlation of .82 was observed between the TLC composite and FSIQ. This compares to a correlation of .75 reported in the TLC manual for a group of 28 language disabled youngsters. Correlations of .90 and .50 were observed between the TLC composite and the WISC-R Verbal and Performance IQ's respectively. This is compared to correlations of .78 and .53 reported in the Technical Manual.
Correlations between the TLC composite and the WISC-R verbal subtests ranged from .76 to .86. The range of correlations between the TLC composite and the WISC-R Performance subtests was .13 to .40. Corresponding values were not reported in the TLC manual.

Correlations between each of the four TLC subtests and VIQ ranged from .74 to .83. The TLC manual reports a corresponding range of correlations from .40 to .79 for a group of 28 language disabled subjects. The range of correlations between the TLC subtests and PIQ was .34 to .59. This compares to a range of .18 to .45 reported in the test manual.

Correlations between the four TLC subtests and five WISC-R verbal subtests ranged from .53 to .78. Correlations between the TLC subtests and five WISC-R performance subtests were predictably lower, from .00 to .47. Corresponding values were not reported in the Technical Manual.

TLC subtest intercorrelations are reported in Table 8. These ranged from .56 to .77. Subtest intercorrelations reported in the TLC manual ranged from .24 to .57 for a group of 28 language disabled subjects, and from -.11 to .39 for 28 nonhandicapped individuals. Subtest intercorrelations obtained by the standardization sample are reported in one year intervals between the ages of nine to fourteen, and two year intervals above age fourteen. These ranged from .17 to .50.

Table 7
Correlations Between TLC and WISC-R for the Combined Sample

                                TLC
WISC-R             AMBS   INFS   RECR   METS   TOTAL
Information         68     68     61     78     79
Similarities        69     70     71     76     82
Arithmetic          53     73     65     76     76
Vocabulary          76     76     76     71     86
Comprehension       67     64     74     66     79
P. Completion       17     20     15     37     26
P. Arrangement      25     10    -00     12     13
Block Design        34     38     34     46     44
Obj. Assembly       27     24     12     47     31
Coding              32     27     40     33     40
Verbal IQ           74     79     78     84     90
Performance IQ      43     39     34     59     50
Full Scale IQ       69     69     66     81     82

Note.
Coefficients have been rounded to two significant figures and decimals omitted; r > .34 is significant at α = .01.

Table 8
Intercorrelations Among TLC Subtests for the Combined Sample

                 AMBS   INFS   REC(H)  REC(W)  REC(T)  METS
Ambig. Sents.     100   (-05)    --      --     (24)   (08)
Making Infs.       57    100     --      --     (08)   (32)
Recreate (H)       64     54    100      --      --     --
Recreate (W)       45     47     25     100      --     --
Recreate (T)       68     64     77      81     100    (16)
Metaphors          65     77     66      46      70    100
Total              84     82     77      65      89     89
                  (53)   (57)    --      --      --    (65)

Note. Correlations rounded to two figures and decimals omitted; "Total" variables not corrected for overlap. H = Holistic Scoring, W = Word Count, T = Subtest Total. Values in parentheses represent intercorrelations among VIQ-corrected residuals.

Discriminant Function Analysis

Results of a forward stepwise discriminant function analysis are presented in Table 9. Subtest Two (Making Inferences) and Subtest Three (Recreating Sentences) were selected in the first two steps of the analysis. No other subtest made a further significant contribution to the discriminant function for differentiating between LLD and Control groups.

Table 10 presents the group classifications produced by the discriminant analysis. Of the 46 subjects, 19/23 (83%) were correctly classified as language disabled, while 21/23 subjects (91%) were correctly classified as controls. A total of four language disordered students (9% of the sample) were misclassified as controls (false negatives), and two controls (4% of the sample) were misclassified as language disordered (false positives). These results did not change when the analysis used Subtest Three (Holistic Scoring) in place of the total subtest score.

Results of the discriminant function analysis reported in the TLC manual indicated that Subtest Four (Understanding Metaphoric Expressions) followed by Subtest Three (Recreating Sentences) were the most discriminating subtests.
Subtest Two accounted for sufficient variance to be entered into the analysis, but its contribution to group discrimination was considered minimal, as indicated by a relatively small increase in the squared canonical correlation which resulted when Subtest Two was entered into the equation (Wiig & Secord, 1985). Subtest One did not contribute significantly to group discrimination. Results of the classification function indicated that the TLC correctly identified 27/28 language disabled students (96%) and 26/28 controls (92%).

Table 9
Summary of Discriminant Function Analysis: Stepwise Selection of TLC Subtests (Total Scores)

Step  Variable Entered                  U       F       p
1.    Making Inferences                0.47   49.36   <.01
2.    Recreating Sentences (Total)     0.38   34.83   <.01

Note. U = Wilks' lambda.

Table 10
Summary of Discriminant Function Analysis: Classification of Language Disabled (LLD) and Control Groups

                Number of Cases Classified
Group           LLD     Controls    % Correct    % Incorrect
LLD              19         4           83          17 (a)
Controls          2        21           91           9 (b)
Total            21        25           87          13

Note. (a) = percentage of false negatives; (b) = percentage of false positives.

Simple Regression Analysis

The purpose of this analysis was to obtain a set of standardized residual scores representing that proportion of variance unique to the TLC and unaccounted for by VIQ. These adjusted scores were entered into a second discriminant function analysis (discussed below) to determine the capacity of the TLC to discriminate between LLD and Control groups after the effects of VIQ had been removed. Table 11 presents the results of four simple regression analyses using Verbal IQ as the independent, or predictor variable, and each of the TLC subtests as dependent, or criterion variables.
It may be seen that Verbal IQ accounted for 55% of the variance in Subtest One (Understanding Ambiguous Sentences), 62% of the variance in Subtest Two (Making Inferences), 60% of the variance in Subtest Three (Recreating Sentences-Total) and 70% of the variance in Subtest Four (Understanding Metaphoric Expressions). These results confirm a substantial relationship between VIQ and each of the TLC subtests.

Table 11
Summary of Simple Regression Analyses: TLC Subtests Regressed on VIQ

Subtest        r     r-Square      F        p
Ambiguous     .74      .55       54.32    <.01
Inferences    .79      .62       72.47    <.01
Recreate      .78      .60       66.89    <.01
Metaphors     .84      .70      101.55    <.01

Adjusted Discriminant Function Analysis

A second discriminant function analysis was undertaken using the standardized residual scores obtained in the regression analyses, discussed above. Not one of the TLC subtests was entered into the analysis, indicating group differences are almost wholly explained by VIQ. Intercorrelations among the VIQ-corrected residuals for each subtest are located in Table 8. These ranged from -.05 to .32, and none reached significance.

Hotelling's T-Square

The purpose of this analysis was to determine if significant differences between ESL and EFL group performance existed on six variables: VIQ, PIQ, TLC Subtest One (Understanding Ambiguous Sentences), Two (Making Inferences), Three (Recreating Sentences), or Subtest Four (Understanding Metaphoric Expressions). Results are located in Table 12. No significant differences were indicated on the multivariate test (T² = 12.83), or on any one variable.

Wilcoxon Rank Sum Test

Language samples were analyzed using the LARSP procedure. Language Sample transcripts are located in Appendix D. Summary sheets of the LARSP analysis are located in Appendix E. Rank sum tests were calculated using results of the LARSP analyses at phrase and clause level, stages two, three, four and five.
The purpose of this analysis was to determine if significant differences would be observed between the expressive language characteristics of LLD and Control group members at one or more levels. Results are presented in Table 13. Significant differences were observed between groups at both phrase and clause level, stages three and five. LLD group members demonstrated a significantly higher percentage of phrase and clause usage at stage three than control group members. Conversely, control group members used significantly fewer stage three utterances than LLDs at clause level, and a higher percentage of stage five level phrases and clauses than members of the LLD group.

Table 12
Hotelling's T²: Significance of Multivariate and Univariate Differences Between ESL and EFL Group Means

Variable        ESL Mean    SD     EFL Mean    SD       T²       p
Multivariate                                          12.83    >.01
VIQ               88.3     16.4     88.67     17.6      .26    >.01
PIQ              107.4     13.5    105.1      10.5     -.66    >.01
TLC 1             11.8      8.8     14.3       9.9      .90    >.01
TLC 2             23.9      6.1     26.5       5.9     1.46    >.01
TLC 3             54.1     10.7     56.0      13.7      .52    >.01
TLC 4             11.9      7.2     16.8      11.8     1.62    >.01

Note. ESL (n = 20); EFL (n = 25).

Table 13
Wilcoxon Rank Sum Tests: Sum of Ranks (Rj) by Stage of LARSP Analysis

                    Phrase               Clause
LARSP Stage      LLD    Control       LLD     Control
II               27       28          30        25
III              36*      21*         37*       18*
IV               22       33          24.5      30.5
V                17*      38*         15**      40**

Note. * p < .05. ** p < .01.

Chapter V
Discussion

This chapter reviews the purpose of the study, research methodology, and results of data analysis. Implications of the research findings are discussed, and limitations of the study considered. Finally, clinical applications of the TLC and suggestions for future research will be proposed.

Purpose of the Study

The purpose of this study was first to investigate the validity and related psychometric characteristics of the Test of Language Competence (TLC) for use with language learning disabled and control subjects.
Five research questions were addressed concerning the internal structure of the TLC, the concurrent validity and reliability of the instrument, and the viability of its use within a multilingual/multicultural population. A second purpose of the study was to investigate the criterion-related validity of an informal language sample.

The principal methods of analysis used to investigate the internal structure of the TLC were measures of item difficulty, item discrimination, and internal consistency. Correlational analyses yielded additional information concerning internal consistency and interrater reliability. TLC criterion-related validity was studied on the basis of a discriminant function analysis to determine the capacity of the TLC to predict LLD and control group membership. A second discriminant function analysis examined the criterion-related validity of the TLC with the effects of Verbal IQ removed.

An important consideration during this investigation was the possible influence of ESL on TLC or WISC-R performance. Hotelling's T² was used to test the equality of group means on the TLC subtests, VIQ and PIQ.

In order to determine if an informal language sample would differentiate among LLD and control subjects, language samples were obtained from five members of each group. These were analyzed using the LARSP procedure, and the results tested for significant differences using Wilcoxon Rank Sum tests.

Summary and Discussion of Results

Internal Characteristics of the TLC

Item Difficulty. Present results suggest some variation in item difficulty among the four TLC subtests for the combined sample. Subtest One (Understanding Ambiguous Sentences) appears to be the most difficult subtest, followed by Subtest Four (Understanding Metaphoric Expressions), Subtest Three (Recreating Sentences-Total), and Subtest Two (Making Inferences).
Further variation in item difficulty was observed between the two scoring systems which are included in Subtest Three (Total), with Subtest Three (Word Count) obtaining the highest mean raw score relative to any other subtest.

Conversion of the mean TLC subtest and composite scores to scaled scores using norms provided in the test manual for the mean age group (thirteen years) indicated that LLD subjects obtained scaled scores below average (>1 SD) on each of the four subtests. Controls obtained scaled scores below average on Subtests One (Understanding Ambiguous Sentences) and Four (Understanding Metaphoric Expressions). Scaled scores for the combined group were below average for all four subtests.

Although it might be argued that control subjects in this study obtained lower than average scaled scores because, unlike the TLC standardization sample, the current sample included a high proportion of ESL subjects, this argument is not upheld by results of Hotelling's T², which indicated no significant differences between ESL/EFL subjects on the TLC subtests. A second argument to explain the lower scaled scores obtained by the current sample might be that the sample included a disproportionate number of language disabled subjects. Given the high correlation observed between the TLC and VIQ, however, combined with the fact that control group subjects were known to be of average verbal intelligence, this explanation is rejected. It is therefore reasonable to conclude that TLC norms may not accurately represent the performance of local children, and consequently may overidentify individuals as language disordered. Moreover, these results would suggest caution in the use of profile analysis based on subtest scaled score comparisons.

Results of Spearman's rank order correlation coefficient (Rho) indicated that TLC items are generally not well-ordered in terms of difficulty.
This observation would contraindicate the practice of discontinuing a subtest after three consecutive failures to respond, as subjects may respond correctly to subsequent items that are less difficult.

Item Discrimination. The data indicate that all items in Subtest One (Understanding Ambiguous Sentences) and Four (Understanding Metaphoric Expressions) discriminated effectively on the specific behavior they were intended to measure. A total of four items (33%) in Subtest Two (Making Inferences) and two items (15%) in Subtest Three (Recreating Sentences-Total) failed to meet the required .3 criterion when correlated with their respective subtests. It was noted that fewer items in Subtest Three met the .3 criterion when correlated with either the Holistic or Word Count totals; therefore the most discriminating set of items was produced by the subtest composite.

The fact that some TLC items fail to discriminate adequately among individuals may be related to several factors, including item difficulty. Extreme values of item difficulty tend to reduce item discrimination (Nunnally, 1978). All items in Subtest Two (Making Inferences) were observed to fall above the mid-range of difficulty. It might be argued that items which fail to discriminate among individuals in this subtest do so because they are too easy. Items in Subtest Three (Recreating Sentences-Total) likewise fell above the mid-range of difficulty; therefore the same argument may apply.

Inadequate item discrimination may also be the result of poor content sampling. For example, of the six TLC items which did not discriminate among individuals on the specific behaviors they were intended to measure, five showed inadequate discrimination on a broad verbal factor represented by the TLC composite. The results suggest a minimal relationship between these items and either a general or specific verbal construct.

Internal Consistency (TLC Subtests).
Internal consistency coefficients were calculated using Hoyt's Analysis of Variance procedure and Cronbach's Alpha. Coefficients of .8 or better were considered acceptable. Hoyt's estimate of reliability was .89 for Subtest One (Understanding Ambiguous Sentences), .83 for Subtest Three (Recreating Sentences-Word Count), .82 for Subtest Three (Total) and .89 for Subtest Four (Understanding Metaphoric Expressions). Subtest Two (Making Inferences) and Subtest Three (Recreating Sentences-Holistic) fell below .8 (.72 and .78 respectively).

Several explanations may account for these results. First, test reliability is a function of item difficulty and item discrimination. Items which are far removed from the mid-range of difficulty may reduce the size of the reliability coefficient. Low item to subtest correlations have the same effect. The lower estimate of internal consistency observed for Subtest Two (Making Inferences) is consonant with the observation that this subtest contains the least acceptable combination of items in terms of item difficulty and item discrimination.

A second explanation for the low internal consistency estimates obtained by Subtest Two, and by Subtest Three (Holistic), may be that at least some items in each subtest measure dissimilar constructs, a point raised earlier. The effect of this would be to reduce the inter-item correlations, and estimates of internal consistency as a result.

One final explanation for these results may be related to the length of the TLC subtests. Subtest Two (Making Inferences) is one of the shorter subtests, consisting of only twelve items. Subtest Three includes the maximum thirteen items. It might be pointed out, however, that Subtests One and Four obtained reliability coefficients above .8, despite the fact that they contain thirteen and twelve items respectively. Thus, although subtest length is a possible explanation for lower estimates of reliability, it is an unlikely one.
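The internal consistency and test length arguments above can be illustrated with a short Python sketch. It computes coefficient alpha in its usual item-variance form and applies the Spearman-Brown formula to project reliability for a lengthened test. The score matrix is hypothetical, the function names are assumptions introduced here, and nothing below reproduces the Hoyt analysis actually used in the study.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Coefficient alpha for a subjects-by-items score matrix."""
    k = len(item_scores[0])
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def spearman_brown(reliability, length_factor):
    """Projected reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical scores for six subjects on three items (not TLC data).
scores = [
    [0, 0, 2],
    [1, 0, 3],
    [1, 2, 1],
    [2, 2, 2],
    [3, 2, 1],
    [3, 3, 2],
]
alpha = cronbach_alpha(scores)

# Projecting the .72 reported for a twelve-item subtest to thirteen items.
projected = spearman_brown(0.72, 13 / 12)
```

Under the Spearman-Brown assumption of parallel items, adding a single item to a twelve-item subtest with a reliability of .72 raises the projected value only to about .74, which is in line with the conclusion above that subtest length is an unlikely explanation.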
Internal Consistency (TLC Composite). Given that each of the TLC subtests is intended to measure a specific skill or ability, it might be expected that higher estimates of internal consistency would be observed for individual subtests than for the total test. Hoyt's estimate of reliability for the TLC composite contradicts this interpretation. A higher estimate was observed for the TLC composite than for any of the subtests (.95). Cronbach's Alpha for the test composite was likewise substantial (.87). These results are similar to internal consistency data reported in the TLC manual for the standardization sample. TLC authors Wiig and Secord observed a higher range of internal consistency coefficients (Cronbach's Alpha) for the TLC composite across ages than for individual subtests. The test authors attributed this difference to the effect of increased test length on estimates of reliability when all the items were combined. This interpretation is reasonable, and receives some support here in view of the fact that Cronbach's Alpha, which is lowered when subtests are not highly correlated, is somewhat lower than Hoyt's estimate for the composite. This difference is small, however, and both results suggest a high degree of association among items in the total test.

Intercorrelations. A pattern of moderate, positive intercorrelations was observed among TLC subtests (.56 to .77). The correlation between Subtest Three (Holistic) and Subtest Three (Word Count) did not reach significance (r = .25 in Table 8, below the .34 criterion), suggesting that these two scoring systems yield different results. A higher range of subtest to total test correlations was observed (.82 to .89), lending support to the internal consistency of the subtests; however, higher correlations might also be explained by the effects of increased test length, and the fact that correlations were not corrected for overlap.
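The overlap issue raised above can be sketched as follows. A subtest-to-total correlation is inflated because the subtest's own score is part of the total; one common correction, assumed here and not applied in the study, is to correlate each subtest with the total of the remaining subtests. The data and names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def subtest_total_correlations(subtests):
    """Return {name: (uncorrected r, overlap-corrected r)} for each subtest.

    subtests maps a subtest name to a list of scores, one per subject.
    """
    names = list(subtests)
    n = len(subtests[names[0]])
    totals = [sum(subtests[name][i] for name in names) for i in range(n)]
    result = {}
    for name in names:
        own = subtests[name]
        # Total with this subtest's own contribution removed.
        rest = [t - s for t, s in zip(totals, own)]
        result[name] = (pearson_r(own, totals), pearson_r(own, rest))
    return result

# Hypothetical scores for five subjects on three subtests (not TLC data).
data = {
    "A": [2, 4, 5, 7, 9],
    "B": [1, 3, 6, 6, 8],
    "C": [3, 2, 6, 5, 9],
}
correlations = subtest_total_correlations(data)
```

For these illustrative scores every corrected coefficient is lower than its uncorrected counterpart, which is the direction of change the discussion anticipates.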
The range of correlations between each subtest and the TLC composite was relatively higher than that observed between each subtest and the external criterion (VIQ). These ranged from .74 to .83. This difference might be interpreted as supporting the uniqueness of the measured construct, although higher subtest to total test correlations could again be explained by the effects of overlap. It might be argued that correcting for overlap would result in comparable correlations between each subtest and the test composite or VIQ.

Interrater Reliability. Interrater reliability coefficients for Subtest Three (Recreating Sentences-Holistic) and Subtest Four (Understanding Metaphoric Expressions) were .92 and .99 respectively. These results indicate very close agreement between raters, and support the adequacy of the subjective scoring criteria.

Relationship Between TLC and VIQ

The range of correlations observed between the TLC subtests and subtests on the WISC-R Verbal Scale for the combined sample was .53 to .78 (correlations above .34 are significant at α = .01). A more pronounced relationship, which may be explained by the effect of increased test length on correlations, was observed between VIQ and the TLC composite (.90). This value is considerably higher than that reported in the Technical Manual for a sample of 28 language-disabled subjects (.78). Likewise, the range of values observed in the present study between the four TLC subtests and VIQ (.74 to .83) is considerably higher than that reported in the Technical Manual (.40 to .79).

The lower range of values reported in the TLC manual might be explained by the effect of restricted range on correlation. That is, homogeneous samples produce lower correlation coefficients than heterogeneous samples (Anastasi, 1982).
Correlations reported in the Technical Manual were obtained using a homogeneous sample of language disabled youngsters, whereas the subject sample in the present study included handicapped and nonhandicapped individuals. The greater variability in the present sample has likely resulted in higher correlations.

Results of the simple regression analyses indicated that VIQ accounted for 55% of the variance in Subtest One (Understanding Ambiguous Sentences), 62% of the variance in Subtest Two (Making Inferences), 60% of the variance in Subtest Three (Recreating Sentences-Total) and 70% of the variance in Subtest Four (Understanding Metaphoric Expressions). Intercorrelations among the subtest residuals failed to reach significance; thus, the proportion of variance remaining that could be considered unique to the TLC was minimal.

In total, these results raise the issue of what Sommers (1985) referred to as "the troublesome distinction between a language disorder and cognitive abilities" (p. 1087). The same issue has been addressed elsewhere by Oller (1978) and Gunnersson (1978), who challenge the view that language and intelligence tests measure different constructs. Likewise, present results do not support a distinction between the proposed construct (language competence) and VIQ.

Language Disabled and Control Group Discrimination (TLC)

Results of the discriminant function analysis indicated that Subtest Two (Making Inferences) was the most discriminating subtest, followed by Subtest Three (Recreating Sentences-Total). The remaining two subtests, Subtest One (Understanding Ambiguous Sentences) and Subtest Four (Understanding Metaphoric Expressions), did not contribute significantly to group discrimination and were not entered into the equation.
Present results are inconsistent with those originally reported by the test authors, who observed that Subtest Four (Understanding Metaphoric Expressions) accounted for the major proportion of variance in group membership, followed by Subtests Three and then Two.

In terms of classification, present results are somewhat inconsistent with those reported by the test authors. In this study, the TLC correctly classified 83% of the LLDs, as opposed to 93% originally reported. Ninety-one percent of controls were correctly classified, compared to 93% reported in the test manual. The data would suggest that the TLC does not discriminate as effectively among local students at the lower end of language functioning, thus resulting in increased numbers of false negatives.

Although current results do not agree with the original findings reported by the test authors, the two studies are not directly comparable. Criteria for group membership in the present study were defined as average or better nonverbal ability as measured on the WISC-R, and delays of two or more years as determined by speech-language pathologists on the basis of objective test data, which were not the result of other handicapping conditions or ESL. Moreover, this research employed a multiethnic subject sample matched for age, sex, and linguistic background. In contrast, group selection criteria used in the original study were not reported in the TLC Technical Manual, intelligence test data were not available for the control group, and no information was provided regarding the cultural makeup of the sample involved. It is possible that disparate findings observed between the two studies are the result of different grouping criteria.
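The classification percentages discussed above follow directly from the counts in Table 10. The short Python sketch below recomputes them; the dictionary layout and function name are assumptions made for illustration.

```python
def classification_summary(table):
    """Percent correctly classified per group and overall.

    table[actual][predicted] holds the number of cases.
    """
    summary = {}
    correct_total = 0
    case_total = 0
    for group, row in table.items():
        n_group = sum(row.values())
        summary[group] = round(100 * row[group] / n_group)
        correct_total += row[group]
        case_total += n_group
    summary["overall"] = round(100 * correct_total / case_total)
    return summary

# Counts taken from Table 10 of the present study.
table_10 = {
    "LLD": {"LLD": 19, "Control": 4},
    "Control": {"LLD": 2, "Control": 21},
}
rates = classification_summary(table_10)
# rates == {"LLD": 83, "Control": 91, "overall": 87}
```

The four misclassified LLD subjects are the false negatives and the two misclassified controls are the false positives referred to in the text.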
Several explanations may account for the observation that Subtest One (Understanding Ambiguous Sentences) and Subtest Four (Understanding Metaphoric Expressions) were not included in the discriminant function reported here, despite the fact that each demonstrated adequate item characteristics and internal consistency. First, the results may have been distorted by sampling error. For example, LLD group members in this study were selected from within a population of language disabled students already identified as such by qualified speech-language pathologists. It is possible that of those individuals retained in the final sample, some were originally misclassified.

A second explanation may be that neither subtest measures a unique aspect of language competence which is not already accounted for by the other two subtests. This conclusion would challenge the specificity of the TLC subtests, each of which is intended to measure a unique aspect of a broad verbal factor. The test authors claimed support for subtest specificity on the basis of an oblique rotation factor analysis (Wiig & Secord, 1985, pp. 42-47). This method has been found to demonstrate the existence of a general language factor common to different language tests which may be divided into subcomponents, each possessing its own share of reliable variance (Oller & Damico, in press). Current results, however, offer limited support for TLC subtest specificity.

One further point regarding the discriminant function analysis concerns the number of false positives and false negatives observed in the classification function. It was noted in earlier discussion that Subtest Two (Making Inferences), which explained the largest proportion of variance between groups, was found to be one of the least difficult subtests, to contain several items which fail to discriminate adequately among individuals on either a general or specific language factor, and to demonstrate inadequate reliability.
The number of false negatives observed in the present analysis may be a function of the poor internal characteristics associated with this subtest.

The results described above for the present study did not change when the discriminant function analysis was re-calculated using Subtest Three (Recreating Sentences-Holistic) in place of the subtest total, which uses the combined scoring system. This means that Subtest Three (Word Count) does not contribute substantially to group discrimination; however, its exclusion from the battery for classification purposes is not advised. Several items in Subtest Three (Holistic) were found to discriminate poorly among individuals. Moreover, the Holistic scoring obtained an internal consistency coefficient below the acceptable .8 minimum. Both item discrimination and internal consistency were greatest for the combined scoring system. Exclusive use of the Holistic scoring system for classification purposes might result in increased numbers of false positives or false negatives due to the poor internal qualities of that subtest.

A third discriminant function analysis was intended to determine how well the TLC would discriminate between groups after the effects of VIQ had been removed. In fact, insufficient variance remained to calculate the additional analysis. Allowing for sampling error, the remaining variance which could be considered to account for a unique construct (i.e., language competence) is insignificant; thus, VIQ explains the major proportion of variance between LLD and control groups.

Test for Differences in ESL/EFL Group Performance

Significant differences were not observed between groups on the six variables of interest (VIQ, PIQ, TLC Subtests One, Two, Three, Four). These results indicate that ESL was not a factor in LLD or control group performance in this study.
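As a rough check on the univariate ESL/EFL comparisons, a two-sample statistic can be recomputed from the means, standard deviations and group sizes reported in Table 12. The sketch below assumes that the univariate values are pooled-variance t statistics (EFL mean minus ESL mean); the thesis does not state the formula, so this is an assumption, and the function name is hypothetical.

```python
from math import sqrt

def pooled_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Two-sample t statistic (group 2 minus group 1) with a pooled variance."""
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    standard_error = sqrt(pooled_var * (1 / n_1 + 1 / n_2))
    return (mean_2 - mean_1) / standard_error

# Table 12 summary values for TLC Subtest Four: ESL (n = 20), EFL (n = 25).
t_tlc4 = pooled_t(11.9, 7.2, 20, 16.8, 11.8, 25)
```

With these inputs the statistic comes out at roughly 1.63, close to the 1.62 tabled for TLC 4, which is consistent with the assumption but does not confirm how the original values were computed.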
Language Disabled and Control Group Discrimination (Language Sample Analysis)

Results of the Wilcoxon Rank Sum Tests indicated that control subjects demonstrated a lower frequency of stage three clause level utterances, and a higher frequency of stage five phrase and clause level utterances than LLDs on the LARSP analysis. Conversely, language-disabled students used a higher proportion of stage three level utterances and a lower percentage of stage five level utterances (phrase and clause). These results support the capacity of the LARSP analysis to discriminate between LLD and control groups on the basis of language complexity, and further, demonstrate that language disabled adolescents as a group tend to produce less complex sentence structures in spontaneous speech than their nonhandicapped peers.

Conclusions

The results obtained in this study suggest a number of conclusions concerning the validity and internal characteristics of the TLC, as well as the criterion-related validity of the LARSP analysis. These have implications for the diagnostic and practical utility of both measures.

Conclusions regarding the technical characteristics of the TLC are based on the results of item analyses, estimates of internal consistency and interrater reliability coefficients. The results indicate that Subtests One (Understanding Ambiguous Sentences), Three (Recreating Sentences) and Four (Understanding Metaphoric Expressions) demonstrate adequate item discrimination and internal consistency. Interrater reliability for Subtests Three and Four is very high, supporting the adequacy of the subjective scoring criteria. Subtest Two (Making Inferences) demonstrates a relatively high percentage of items (33%) which fail to discriminate adequately among individuals, and the internal consistency estimate for this subtest is below the desired .8. These results are consistent with the observation that this is the least difficult of the subtests.
All four subtests contain items which are not well-ordered in terms of difficulty.

TLC criterion-related validity was judged on the basis of the discriminant function analysis, which indicated that Subtest Two (Making Inferences) and Subtest Three (Recreating Sentences) discriminate between LLD and control groups. These results do not imply that Subtests One and Four lack the capacity to discriminate between groups; an inspection of LLD and control group means for each subtest indicated sizeable differences in the performance of both groups on all four subtests. From this it may be concluded that the variance in Subtests One and Four was accounted for by Subtests Two and Three in the discriminant function analysis. The criterion-related validity of Subtests One and Four, independent of the other two subtests, might be the subject of further investigation.

TLC content and construct validity were evaluated on the basis of the above, together with results of item and subtest intercorrelations, correlations between the TLC and VIQ, the discriminant function analyses, and the extent to which these results were consistent with the stated theoretical design of the test. The TLC is based on a model of language competence which assumes a broad verbal factor subdivided into four specific content areas represented by each of the TLC subtests. A pattern of moderate, positive intercorrelations was observed among the TLC subtests, indicating some support for subtest specificity. This interpretation was not upheld by the high internal consistency coefficients, which indicated that TLC items measure a common construct; nor by the results of the discriminant function analysis, which suggested that most of the variance in Subtests One and Four was accounted for by Subtests Two and Three. In total, limited support has been demonstrated for the view that each subtest measures a unique aspect of language competence.
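The internal consistency coefficients referred to above can be illustrated with a short sketch. It assumes coefficient alpha as the estimate; the estimate actually used in the study is not named in this section, so this is a stand-in rather than a reproduction of the analysis. The function name `cronbach_alpha` and the item scores (on the 0/1/3 scale) are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for `scores`: one row per subject, one column per item."""
    k = len(scores[0])                       # number of items
    items = list(zip(*scores))               # regroup the scores item by item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical scores for five subjects on four items (illustration only).
scores = [
    [3, 3, 1, 3],
    [1, 1, 0, 1],
    [0, 1, 0, 0],
    [3, 1, 3, 3],
    [1, 0, 1, 0],
]
alpha = cronbach_alpha(scores)
```

A value near 1 indicates that the items rank subjects in much the same way, which is why high coefficients are read above as evidence that the items measure a common construct; the .8 figure cited in this chapter is the minimum the study treats as acceptable.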
Correlations between the TLC and VIQ support the TLC as a measure of a verbal construct; however, the magnitude of these correlations suggests that the TLC and VIQ are measuring a common factor. This interpretation is supported by the results of the adjusted discriminant function analysis, which indicated that LLD and control group discrimination was explained by VIQ. These findings are consistent with research conducted by Damico (Damico, personal communication, August, 1989), Schery (1985), and Sommers et al. (1978), which supports the view that many language tests claimed to measure a unique language factor are, in fact, measuring a common construct. These results would further suggest that this common factor may be VIQ.

Results of the Wilcoxon Rank Sum tests support the validity of the LARSP procedure for distinguishing between LLD and control groups; however, the analysis does not allow for comparisons between individuals. Moreover, the descriptive power of the LARSP analysis is lost when individual performance is reduced to a set of numbers. Clearly the TLC and LARSP might be viewed as complementary procedures. The TLC has been demonstrated to possess adequate criterion-related validity; however, the proposed interpretive rationale for TLC results is not strongly supported. Conversely, the LARSP system may not discriminate among individuals, but it does provide extensive descriptive information. These results would lead to the conclusion that the two instruments may be most effective if used together for the identification of language disorders, and the identification of intervention goals.

Limitations of the Study

The present study was intended to investigate the technical characteristics of the TLC using a locally selected sample of language disabled and control subjects. The generalizability of current findings is limited by several factors.
First, current findings were obtained from within a distinctly multiethnic population, and are not considered representative of populations in other regions. Second, the age range of the present sample was limited to 11 through 15; thus the representativeness of these results is confined to that age range. Third, the selection of average achieving students for the control group was limited to the opinion of school personnel and school records, either of which may not have been objective. It might be argued, however, that as achievement is highly dependent upon verbal ability, and as all control group subjects were found to be of average verbal ability, it is unlikely that control group subjects were of above average achievement.

Another limitation of the present study is related to the collection, transcription and coding of the language sample analyses. First, the language sample was obtained under structured conditions (imperatives), and not obtained across a number of settings. It might be suggested that these factors limited the representativeness of the samples collected. Secondly, language sample transcription and analysis is a complex procedure, and the limited experience of the researcher in conducting these analyses may have increased the possibility for error. Although Bishop's computerized version of LARSP helped to direct the analyses somewhat, it is an interactive program requiring user judgement as the analysis proceeds. Moreover, the program was found to have limited capacity for analyzing the longer and more complex sentences generated by control subjects. For example, the program will not analyze sentences beyond 25 words. Clearly the program is effective for use with young children, or subjects in the lower range of language functioning.

Recommendations for Clinical Practice

1.
The observation that subtest items are not well-ordered in terms of difficulty would contraindicate the procedure of discontinuing a subtest after three consecutive failures to respond. If a subject fails to respond because three consecutive items are too difficult, he/she may respond to subsequent items which are less difficult.

2. The TLC short form, which consists of Subtest Three (Recreating Sentences) and Subtest Four (Understanding Metaphoric Expressions), is not recommended for screening purposes at this time because, of these two, only Subtest Three has been shown to discriminate adequately between LLD and control subjects. Moreover, the content of Subtest Four may be too specific to North American culture to discriminate among LLD and non-LLD subjects from multicultural backgrounds.

3. Although Subtest One (Understanding Ambiguous Sentences) and Subtest Four (Understanding Metaphoric Expressions) did not contribute to group discrimination in this study, their exclusion from the battery for the purpose of classifying LLD subjects is not recommended without further research.

4. Because VIQ explains a significant proportion of variance in TLC performance, the administration of both instruments for the purpose of classifying language disabled subjects seems time-consuming and redundant; however, substitution of one instrument for the other is not recommended without further research.

5. Cautious interpretation of TLC results based on the test norms is recommended, as these may tend to overidentify individuals as language disabled.

6. The TLC authors recommend the use of profile analysis for determining individual strengths and weaknesses; however, this practice is not recommended, first because TLC norms may not be representative of the performance of local children, and second, because limited support has been demonstrated for TLC subtest specificity.

7.
Detailed suggestions for developing individual education plans (IEP's) on the basis of TLC data are provided in the test manual. Current results, however, lend little support to this practice. First, there is no evidence offered by the test authors to suggest that remediation based on TLC results has an effect on actual performance over time. Second, IEP's are based on the assumption that each subtest measures a unique aspect of language competence; however, current results fail to support the specificity of the TLC subtests.

8. It is suggested that the TLC be used only in conjunction with other language measures which provide reliable diagnostic information. These might include an informal language sample, such as the LARSP analysis.

Recommendations for Further Research

1. Suggestions for further research would include additional criterion-related validity studies at different age levels to observe test characteristics and possible developmental patterns in TLC performance.

2. Given that current results were obtained from a multilingual/multicultural sample, future studies might focus on the performance of distinct ethnic groups, including EFL.

3. Further investigation of the criterion-related validity of the TLC short form (Subtests Three and Four) for use with local subjects is suggested. The validity of using Subtests Two and Three as a short form for local children might be investigated.

4. Further investigation of the relationship between VIQ and the TLC is suggested to determine the relative efficacy of each instrument for classifying LLD subjects, and whether one instrument might be substituted for another to avoid redundancy.

5. Further investigations of TLC subtest specificity using factor analysis are suggested.

6. Experimental research to determine the effectiveness over time of instructional objectives based on TLC performance might be considered.

7. Further examination of the appropriateness of TLC norms for local children is suggested.
The development of local norms might be considered.

References

American Psychological Association (1985). Standards for educational and psychological testing. Washington, DC: American Psychological Association.
American Speech-Language Hearing Association (1988). Report of the ad hoc committee on instrument evaluation. ASHA, 23, 75-76.
Anastasi, A. (1982). Psychological testing (5th ed.). New York, NY: MacMillan.
Aram, D., Ekelman, B., & Nation, J. (1984). Preschoolers with language disorders: 10 years later. Journal of Speech and Hearing Research, 27, 232-244.
Atkins, C.P., & Cartwright, L.R. (1982). Preferred language elicitation procedures used in five age categories. ASHA, 24, 321-323.
Bernstein, D.K. (1989). Assessing children with limited English proficiency: Current perspectives. Topics in Language Disorders, 9, 15-20.
Bishop, D. (1985). Language assessment, remediation and screening procedure (LARSP) by David Crystal, Paul Fletcher & Mike Garman: Computerized version for Apple II and IIe. (Available from Department of Speech, University of Newcastle upon Tyne, Great Britain)
Blau, A., Lahey, M., & Oleksiuk-Velez, A. (1984). Planning goals for intervention: Can a language test serve as an alternative to a language sample? Journal of Childhood Communication Disorders, 7, 27-37.
Bloom, L., & Lahey, M. (1978). Language development and language disorders. New York, NY: John Wiley & Sons.
Caskey, W., & Franklin, L. (1986). The Test of Adolescent Language (TOAL) and WISC-R scores: A caveat. Language, Speech, and Hearing Services in Schools, 17, 307-311.
Committee on the Status of Racial Minorities (1983). Social dialects (position paper). ASHA, 25, 23-25.
Cronbach, L. (1971). Test validation. In R. Thorndike (Ed.), Educational measurement. Washington, DC: American Council on Education.
Cronbach, L., & Meehl, P. (1955). Construct validity in psychological tests.
Psychological Bulletin, 52, 281-302.
Crystal, D., Fletcher, P., & Garman, M. (1976). The grammatical analysis of language disability. London, Great Britain: Edward Arnold.
Cupples, W.P., & Lewis, M.E.B. (1984). Language-based learning disabilities: Differential diagnosis and implementation. Learning Disabilities, 3, 129-140.
Damico, J.S. (1988). The lack of efficacy in language therapy: A case study. Language, Speech, and Hearing Services in Schools, 19, 51-66.
Damico, J.S. (in press). Descriptive assessment of communicative ability in limited English proficient students. In E. Hamayan & J.S. Damico (Eds.), Nonbiased assessment of limited English proficient special education children. San Diego, CA: College Hill Press.
Darley, F.L., & Spriestersbach, D.C. (1978). Diagnostic methods in speech pathology (2nd ed.). Reading, MA: Addison-Wesley.
Dixon, W.J. (Ed.) (1988). BMDP statistical software manual (Vol. 1). Berkeley, CA: University of California Press.
Donahue, M. (1985). Review of the WORD Test. In J.V. Mitchell (Ed.), Ninth mental measurements yearbook. Lincoln, NE: Buros Institute of Mental Measurement, University of Nebraska.
Education Services Group (1985). Student services handbook. Vancouver, B.C.: Vancouver School Board.
Evard, B., & Sabers, D. (1979). Speech and language testing with distinct ethnic-racial groups: A survey of procedures for improving validity. Journal of Speech and Hearing Disorders, 44, 271-281.
Ferguson, G.A. (1976). Statistical analysis in psychology and education. Toronto, Canada: McGraw-Hill.
Guion, R.M. (1977). Content validity: Three years of talk-What's the action? Public Personnel Management, 6, 407-414.
Gunnarsson, B. (1978). A look at the content similarities between intelligence, achievement, personality, and language tests. In J.W. Oller Jr. (Ed.), Language in education: Testing the tests. Rowley, MA: Newbury.
Hammill, D.
, & Newcomer, P. (1988). Test of Language Development - 2 - Intermediate (TOLD-2-I). Austin, TX: Pro-Ed.
Hammill, D., Brown, V., Larsen, S., & Wiederholt, J. (1987). Test of Adolescent Language - 2 (TOAL-2). Austin, TX: Pro-Ed.
Hawkins, P., & Spencer, H. (1985). Imitative versus spontaneous language assessment: A comparison of CELI and LARSP. British Journal of Disorders of Communication, 20, 191-200.
Henrysson, S. (1971). Gathering, analyzing, and using data on test items. In R. Thorndike (Ed.), Educational measurement. Washington, DC: American Council on Education.
Holland, A., & Forbes, M. (1986). Nonstandardized approaches to speech and language assessment. In O.L. Taylor (Ed.), Communication disorders in linguistically diverse populations (pp. 49-66). San Diego, CA: College-Hill Press.
Kaufman, A.S. (1979). Intelligent testing with the WISC-R. New York, NY: Wiley.
Kearns, K., & Simmons, N. (1983). A practical procedure for the grammatical analysis of aphasic language impairments: The LARSP. In R. Brookshire (Ed.), Clinical aphasiology conference proceedings. Minneapolis, MN: BRK.
Kelly, D.J., & Rice, M.L. (1986). A strategy for language assessment of young children: A combination of two approaches. Language, Speech, and Hearing Services in Schools, 17, 83-92.
Klecka, W.R. (1980). Discriminant analysis. Beverly Hills, CA: Sage.
Kretschmer, R.R., & Kretschmer, L.W. (1978). Language development and intervention for the hearing impaired. Baltimore, MD: University Park Press.
Lai, C. (1986). U.B.C. SPSSX Statistical Package for Social Sciences - extended version. Vancouver, Canada: University of British Columbia.
Larson, V., & McKinley, N. (1987). Communication assessment and intervention strategies for adolescents. Eau Claire, WI: Thinking Publications.
Launer, P.B., & Lahey, M. (1981).
Passages: From the fifties to the eighties in language assessment. Topics in Language Disorders, 1, 11-29.
LaTorre, R.A. (1983). The 1982 survey of pupils in Vancouver schools for whom English is a second language (Report No. 83-02). Vancouver: Board of School Trustees, Evaluation and Research Services, Program Resources.
Leonard, L.B., Perozzi, J.A., Prutting, C.A., & Berkley, R.K. (1978). Nonstandardized approaches to the assessment of language behaviors. ASHA, 20, 371-379.
Lieberman, R., Heffron, A., West, S., Hutchinson, E., & Swem, T. (1987). A comparison of four adolescent language tests. Language, Speech, and Hearing Services in Schools, 18, 250-266.
Lieberman, R.J., & Michael, A. (1986). Content relevance and content coverage in tests of grammatical ability. Journal of Speech and Hearing Disorders, 51, 71-81.
Lund, N., & Duchan, J. (1983). Assessing children's language in naturalistic contexts. Englewood Cliffs, NJ: Prentice Hall.
McCauley, R.J., & Swisher, L. (1984a). Psychometric review of language and articulation tests for preschool children. Journal of Speech and Hearing Disorders, 49, 34-42.
McCauley, R.J., & Swisher, L. (1984b). Use and misuse of norm-referenced tests in clinical assessment: A hypothetical case. Journal of Speech and Hearing Disorders, 49, 338-348.
Messick, S. (1980). Test validity and the ethics of assessment. American Psychologist, 35, 1012-1027.
Miller, J.F., & Chapman, R. (1983). SALT: Systematic analysis of language transcripts. Madison, WI: University of Wisconsin, Language Analysis Laboratory, Waisman Center.
Muma, J. (1981). Language primer for the clinical fields. Lubbock, TX: Natural Child Publishing.
Muma, J.R. (1984). Semel and Wiig's CELF: Construct validity? (Letter to the editor). Journal of Speech and Hearing Disorders, 49, 101-104.
Muma, J.R. (1985). "No news is bad news": A response to McCauley and Swisher (Letter to the editor). Journal of Speech and Hearing Disorders, 50, 290-292.
Muma, J.R., Lubinski, R., & Pierce, S. (1982). A new era in language assessment: Data or evidence. In N.J. Lass (Ed.), Speech and language: Advances in basic research and practice. Toronto, Canada: Academic.
Nelson, L.R. (1974). Guide to LERTAP use and interpretation. Dunedin, New Zealand: University of Otago.
Nice, M.M. (1925). Length of sentences as a criterion of a child's progress in speech. Journal of Educational Psychology, 16, 370-379.
Nunnally, J.C. (1978). Psychometric theory. Toronto, Canada: McGraw-Hill.
Oller, J.W., Jr. (1978). How important is language proficiency to IQ and other educational tests? In J.W. Oller Jr. (Ed.), Language in education: Testing the tests. Rowley, MA: Newbury.
Oller, J.W., Jr., & Damico, J. (in press). Theoretical considerations in the assessment of LEP students. In E. Hamayan & J.S. Damico (Eds.), Nonbiased assessment of limited English proficient special education children. San Diego, CA: College Hill Press.
Pedhazur, E.J. (1982). Multiple regression in behavioral research. New York, NY: Holt, Rinehart and Winston.
Penn, M.A.C. (1983). Syntactic and pragmatic aspects of aphasic language. Unpublished doctoral dissertation, University of Witwatersrand, South Africa.
Penn, C., & Behrmann, M. (1986). Towards a classification scheme for aphasic syntax. British Journal of Disorders of Communication, 21, 21-38.
Prather, E., Beecher, S., Stafford, M., & Wallace, E. (1980). Screening Test of Adolescent Language. Seattle, WA: University of Washington Press.
Public Law 94-142. Education for all Handicapped Children Act. Federal Register, 1975, p. 42478.
Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1972). A grammar of contemporary English. London, Great Britain: Longman.
Raju, N.S.
(1985). Review of the WORD Test. In J.V. Mitchell (Ed.), Ninth mental measurements yearbook. Lincoln, NE: Buros Institute of Mental Measurement, University of Nebraska.
Santos, O.B. (1987). The variance in reading comprehension in terms of language skills and cognitive processes (Doctoral dissertation, Boston University, 1987). Dissertation Abstracts International, 48, 8716098.
Sattler, J. (1988). Assessment of children (3rd ed.). San Diego, CA: Jerome M. Sattler.
Schery, T.K. (1985). Correlates of language development in language-disordered children. Journal of Speech and Hearing Disorders, 50, 73-83.
Semel, E., & Wiig, E. (1980). Clinical Evaluation of Language Functions. Columbus, OH: Merrill.
Semel, E., Wiig, E.H., & Secord, W. (1987). Clinical Evaluation of Language Fundamentals - Revised (CELF-R). San Antonio, TX: Psychological Corporation.
Simon, C.S. (1984). Functional-pragmatic evaluation of communication skills in school-aged children. Language, Speech, and Hearing Services in Schools, 15, 83-98.
Sommers, R.K. (1985). A review of the Screening Test of Adolescent Language. In J.V. Mitchell (Ed.), Ninth mental measurements yearbook. Lincoln, NE: Buros Institute of Mental Measurement, University of Nebraska.
Sommers, R.K., Erdige, S., & Peterson, M.K. (1978). How valid are children's language tests? The Journal of Special Education, 12, 393-407.
Stark, R.E., Tallal, P., & Mellits, E.D. (1982). Quantification of language abilities in children. In N.J. Lass (Ed.), Speech and language: Advances in basic research and practice. Toronto, Canada: Academic Press.
Stephens, M., & Montgomery, A. (1985). A critique of recent relevant standardized tests. Topics in Language Disorders, 5, 21-45.
Thorndike, R. (Ed.) (1971). Educational measurement. Washington, DC: American Council on Education.
Thorum, A. (1980). Fullerton Language Test for Adolescents.
Palo Alto, CA: Consulting Psychologists Press.
Thurstone, T.G. (1978). Educational Abilities Series. Palo Alto, CA: Science Research Associates.
Tibbits, D.F. (1982). Language disorders in adolescents. Lincoln, NE: Cliffs Notes.
Tyack, D., & Gottsleben, R. (1974). Language sampling, analysis and training. Palo Alto, CA: Consulting Psychologists Press.
Vaughn-Cooke, F. (1983). Improving language assessment in minority children. ASHA, 25, 29-34.
Watson-Russel, A. (1986). Student ethnic survey (Research Report No. 86-07). Vancouver: Board of School Trustees, Student Assessment and Research, Education Services Group.
Wechsler, D. (1974). Wechsler Intelligence Scale for Children - Revised. New York, NY: Psychological Corporation.
Wiig, E.H., & Secord, W. (1985). Test of Language Competence. Toronto, Canada: Merrill.
Wiig, E., & Semel, E. (1984). Language assessment and intervention for the learning disabled. Toronto, Canada: Merrill.

Appendix A
Parent/Student Information

Letter to Parents of Control Subjects

Dear Parents:

__________'s school has agreed to participate in a research project: "Validity of the Test of Language Competence". The Test of Language Competence (TLC) is a testing instrument used by speech/language specialists in the Vancouver School District to appraise children's use of language. The purpose of this study is to investigate how well the TLC does the job for which it is intended with secondary school students.

For this project to be successful, 60 students in the Vancouver School District will take 3 tests: the TLC, an ability test, and an informal language sample. The language sample is gathered by audiotaping an informal conversation between each student and a graduate student examiner. These conversations centre around non-threatening real-life topics such as "What would you do if you won a million dollars?".
The sole purpose of these conversations is to obtain a representative sample of each student's usual language. In addition, parents of participating students will be asked to complete a brief questionnaire. The researcher seeks to determine if the TLC is an effective measure of language functioning, and also how it compares to another commonly used procedure, the language sample analysis.

The research project is being undertaken as a master's thesis in the Department of Educational Psychology at the University of British Columbia. It has been approved by the Vancouver School Board's Student Assessment and Research office, by the Principal of your son's/daughter's school, and by Faculty research specialists at the University.

__________'s name has been chosen as a possible participant in this research. If you and your son/daughter agree to participate, he/she will be asked to take part in two individual testing sessions of approximately 60 minutes and 2 hours respectively. A trained graduate student will do the testing in the school. Such tests are common in schools. Students usually find them interesting and enjoyable. Parents will also be asked to complete a brief questionnaire on family background.

The results of the tests will be strictly confidential; your child's name will not appear on the test forms. No individual test results will be released. The purpose is not to test any one child's performance, but rather to evaluate the usefulness of the Test of Language Competence. Parents interested in receiving a copy of the group results should request this on the consent form.

I wish to emphasize that participation is voluntary. Participation in or withdrawal from the project at any time will not in any way influence your son's/daughter's class standing. I would, however, greatly appreciate your cooperation in this research. Please complete the Parent Consent Form and questionnaire and return it in the envelope provided as soon as possible. Thank you.
Feel free to contact me for any further information at __________, or __________ for messages.

Sincerely,
C. Ainsworth

Letter to Parents of LLD Subjects

Dear Parents:

__________'s school has agreed to participate in a research project: "Validity of the Test of Language Competence". The Test of Language Competence (TLC) is a testing instrument used by speech/language specialists in the Vancouver School District to appraise children's use of language. The purpose of this study is to investigate how well the TLC does the job for which it is intended with secondary school students.

For this project to be successful, 60 students in the Vancouver School District will take 3 tests: the TLC, an ability test, and an informal language sample. The language sample is gathered by audiotaping an informal conversation between each student and a graduate student examiner. These conversations centre around non-threatening real-life topics such as "What would you do if you won a million dollars?". The sole purpose of these conversations is to obtain a representative sample of each student's usual language. In addition, parents of participating students will be asked to complete a brief questionnaire. The researcher seeks to determine if the TLC is an effective measure of language functioning, and also how it compares to another commonly used procedure, the language sample analysis.

The research project is being undertaken as a master's thesis in the Department of Educational Psychology at the University of British Columbia. It has been approved by the Vancouver School Board's Student Assessment and Research office, by the Principal of your son's/daughter's school, and by Faculty research specialists at the University.

__________'s name has been chosen as a possible participant in this research. If you and your son/daughter agree to participate, he/she will be asked to take part in one individual testing session of approximately 60 minutes.
A trained graduate student will do the testing in the school. Such tests are common in schools. Students usually find them interesting and enjoyable. In order to avoid re-testing children for whom test data is already available, your permission for the investigator to obtain such existing information is requested. Parents will also be asked to complete a brief questionnaire on family background.

The results of the tests will be strictly confidential; your child's name will not appear on the test forms. No individual test results will be released. The purpose is not to test any one child's performance, but rather to evaluate the usefulness of the Test of Language Competence. Parents interested in receiving a copy of the group results should request this on the consent form.

I wish to emphasize that participation is voluntary. Participation in or withdrawal from the project at any time will not in any way influence your son's/daughter's class standing. I would, however, greatly appreciate your cooperation in this research. Please complete the Parent Consent Form and the questionnaire and return it to the school as soon as possible. Thank you.

Feel free to contact me for any further information at __________, or __________ for messages.

Sincerely,
C. Ainsworth

Parent Consent Form (Controls)

VALIDITY STUDY OF THE TEST OF LANGUAGE COMPETENCE
PARENT CONSENT FORM

I am willing / not willing to give my consent for __________'s participation in the research study at school. I am aware that this will involve testing sessions totalling approximately three hours duration. I understand that confidentiality of test results will be maintained and that no individual scores will be released. I also understand that participation in this project is voluntary and may be terminated at any time.
Name          Signature          Date

I would / would not like a copy of the group results to be mailed to:

Parent Consent Form (LLD Group)

VALIDITY STUDY OF THE TEST OF LANGUAGE COMPETENCE
PARENT CONSENT FORM

I am willing / not willing to give my consent for __________'s participation in the research study at school. I am aware that this will involve testing sessions totalling approximately 60 minutes duration. I further consent to the release of test data which has been obtained for my son/daughter to the principal investigator in this research. I understand that confidentiality of test results will be maintained and that no individual scores will be released. I also understand that participation in this project is voluntary and may be terminated at any time.

Name          Signature          Date

I would / would not like a copy of the group results to be mailed to:

Student Consent Form (Controls)

VALIDITY STUDY OF THE TEST OF LANGUAGE COMPETENCE
STUDENT CONSENT FORM

I am willing / not willing to participate in the research study at school. I am aware that this will involve testing sessions totalling about three hours in length. I understand that my test results will be kept confidential. My name will not appear on any of the test papers or in the final report. I also understand that my participation in this project is voluntary, and that I can quit at any time without affecting my school grades.

Name          Signature          Date

Student Consent Form (LLD Group)

VALIDITY STUDY OF THE TEST OF LANGUAGE COMPETENCE
STUDENT CONSENT FORM

I am willing / not willing to participate in the research study at school. I am aware that this will involve a testing session of about an hour's length. I also consent to the release of my test scores that the school has on file to the researcher.
I understand that these test scores will be kept confidential, and that my name will not appear on any of the new test papers or on the final report. I also understand that my participation in this project is voluntary, and that I can quit at any time without affecting my school grades.

Name          Signature          Date

VALIDITY STUDY OF THE TEST OF LANGUAGE COMPETENCE
QUESTIONNAIRE

Your assistance in providing the following information would be very helpful in making this a meaningful study:

1. What language do adults speak in the home?

2. What language do children speak in the home?

3. How often do adults speak English in the home?
   always     3/4 of the time     1/2 of the time     1/4 of the time     never

4. How often do children speak English in the home?
   always     3/4 of the time     1/2 of the time     1/4 of the time     never

5. In which area of the city do you live?
   Downtown (west-end)
   Vancouver west of Main Street
   Vancouver east of Main Street and north of 41st Avenue
   Vancouver east of Main Street and south of 41st Avenue

6. What is your son's/daughter's birthdate and age?
   Age:     year     month     day

7. How would you describe your son's/daughter's school achievement?
                 Below Average     Average     Above Average
   Reading
   Writing
   Spelling
   Arithmetic

8. Has your son/daughter ever received special assistance for learning difficulties? (yes) (no)

QUESTIONS ADDRESSED TO THE MOTHER

9a. What is your occupation?

9b. Which category below best describes your completed level of education?
   Less than High School Completion
   High School Completion
   Post-Secondary, no degree
   University or college degree

10. QUESTIONS ADDRESSED TO THE FATHER

10a. What is your occupation?

10b. Which category below best describes your completed level of education?
   Less than High School Completion
   High School Completion
   Post-Secondary, no degree
   University or college degree

Thank you for your cooperation.
Appendix B
Item Analysis Data

Summary Item Statistics: Subtest One: Understanding Ambiguous Sentences

                          Correlations                 % Distribution
Item   Mean     SD   Subtest   Total Test   VIQ        0       1       3
  1.   1.63   1.21      63         53        44     19.6    39.1    41.3
  2.   1.23   1.30      78         70        63     41.3    26.1    32.6
  3.   1.02   1.20      47         35        25     45.7    30.4    23.9
  4.   1.06   1.23      58         64        60     21.7    37.0    41.3
  5.   1.15   1.13      76         63        58     32.6    43.5    23.9
  6.   1.00   1.09      66         62        55     39.1    41.3    19.6
  7.   1.02    .95      66         52        50     28.3    56.5    15.2
  8.    .78   1.05      61         51        48     52.2    32.6    15.2
  9.    .89   1.14      60         57        47     50.0    30.4    19.6
 10.    .56    .88      51         42        42     60.9    30.4     8.7
 11.    .73   1.06      52         56        40     56.5    28.3    15.2
 12.    .71    .93      47         63        60     50.0    39.1    10.9
 13.    .80    .91      35         54        46     41.3    47.8    10.9

Note. n = 46
Item order difficulty correlation: Rho = .76

Summary Item Statistics: Subtest Two: Making Inferences

                          Correlations                 % Distribution
Item   Mean     SD   Subtest   Total Test   VIQ        0       1       3
  1.   2.61    .80      32         22        22        0    19.6    80.4
  2.   2.19   1.09      17         25        32      6.5    30.4    63.0
  3.   2.37   1.04      15         35        30      6.5    21.7    71.7
  4.   2.58    .86      54         49        46      2.2    17.4    80.4
  5.   2.13   1.00      28         35        39      0.0    43.5    56.5
  6.   2.04   1.13      58         72        72      8.7    34.8    56.5
  7.   1.84   1.03      55         47        54      2.2    54.3    43.5
  8.   1.69   1.03      43         40        40      4.3    58.7      37
  9.   2.17    .99      16         32        22        0    41.3    58.7
 10.   1.78   1.05      47         46        32      4.3    54.3    41.3
 11.   1.93   1.10      36         49        37      6.5    43.5    50.0
 12.   1.93   1.10      34         34        34

Note. n = 46
Item order difficulty correlation: Rho = .77

Summary Item Statistics: Subtest Three: Recreating Sentences (Holistic)

                          Correlations                 % Distribution
Item   Mean     SD   Subtest   Total Test   VIQ        0       1       3
  1.   2.50    .91      38         42        40      2.2    21.7    76.1
  2.   1.63    .97      21         25        23      2.2    65.2    32.6
  3.   2.13   1.07      40         37        35      4.3    37.0    58.7
  4.   2.09   1.13      43         55        54      8.7    32.6    58.7
  5.   2.10   1.04      33         34        42      2.2    41.3    56.5
  6.   1.83   1.12      64         69        63      8.7    45.7    45.7
  7.   1.52   1.15      59         54        55     17.4    47.8    34.8
  8.   1.56   1.05      18         06       -01      8.7    58.7    32.6
  9.   2.17   1.06      33         44        38      4.3    34.8    60.9
 10.   1.30    .96      56         50        45     13.0    65.2    21.7
 11.   1.61   1.06      39         33        29      8.7    56.5    34.8
 12.   1.41   1.13      28         24        19     19.6      50    30.4
 13.   1.89   1.16      60         51        40     10.9    39.1      50

Note. n = 46
Item order difficulty correlation: Rho = .53

Summary Item Statistics: Subtest Three: Recreating Sentences (Word Count)

                          Correlations                 % Distribution
Item   Mean     SD   Subtest   Total Test   VIQ        0       1       3
  1.   2.74    .68      31         28        29        0      13      87
  2.   2.54    .89      20         19        15      2.2    19.6    78.3
  3.   2.69    .81      34         12        03      4.3     8.7      87
  4.   2.24   1.14      37         36        41     10.9    21.7    67.4
  5.   2.67    .87      47         28        27      6.5     6.5      87
  6.   2.52   1.00      51         37        30      8.7    10.9    80.4
  7.   2.52   1.07      71         49        35     13.0     4.3    82.6
  8.   2.28   1.13      34         17        08     10.9    19.6    69.6
  9.   1.80   1.20      58         63        62     15.2      37    47.8
 10.   2.30   1.20      55         41        32     17.4     8.7    73.9
 11.   2.48   1.09      66         54        42       13     6.5    80.4
 12.   2.13   1.29      38         44        31     21.7    10.9    67.4
 13.   2.41   1.08      76         50        32     10.9      13    76.1

Note. n = 46
Item order difficulty correlation: Rho = .664

Summary Item Statistics: Subtest Three: Recreating Sentences (Total Test)

                          Correlations                               % Distribution
Item   Mean     SD   Subtest   Total Test   VIQ        0       1       2       3       4       5       6
  1.   5.24   1.21      34         47        47        0     2.2     2.2       0    28.3       0    67.4
  2.   4.17   1.39      33         30        26      2.2       0      13       0    58.7       0    26.1
  3.   4.83   1.44      35         34        27      4.3       0       0       0    45.7       0      50
  4.   4.33   1.94      41         53        56      6.5       0    17.4     6.5    19.6       0    50.0
  5.   4.78   1.41      33         43        48      2.2       0     4.3     4.3    39.1       0    50.0
  6.   4.35   1.69      57         68        59      6.5       0     6.5     4.3    43.5       0    39.1
  7.   4.08   1.81      66         63        54      8.7     4.3     2.2     4.3    47.8       0    32.6
  8.   3.85   1.55      13         16        05      8.7       0     6.5     2.2    65.2       0    17.4
  9.   3.87   1.69      58         70        63      4.3     2.2    17.4    10.9      37       0    28.3
 10.   3.61   1.68      63         58        49       13     2.2     2.2     2.2    67.4       0      13
 11.   4.04   1.67      55         60        50      8.7     2.2     2.2     2.2    58.7      --    26.1
 12.   3.53   1.96      28         42        32       13    10.9       0     4.3      50      --    21.7
 13.   4.35   1.83      72         62        46      6.5     4.3     4.3     4.3      37      --    43.5

Note. n = 46
Item order difficulty correlation: Rho = .65

Summary Item Statistics: Subtest Four: Understanding Metaphoric Expressions

                          Correlations                 % Distribution
Item   Mean     SD   Subtest   Total Test   VIQ        0       1       3
  1.   1.80   1.20      37         53        50     15.2    37.0    47.8
  2.   1.22   1.26      69         73        64     39.1    30.4    30.4
  3.   1.26   1.24      67         61        62     34.8    34.8    30.4
  4.   1.67   1.33      50         49        45     28.3    23.9    47.8
  5.   1.37   1.27      49         43        35     32.6    32.6    34.8
  6.   1.15   1.25      74         69        63     41.3    30.4    28.3
  7.   1.37   1.27      63         68        69     32.6    32.6    34.8
  8.   1.22   1.26      48         52        55     39.1    30.4    30.4
  9.   1.15   1.44      76         68        64     58.7     4.3    37.0
 10.    .76   1.23      69         65        58     67.4    10.9    21.7
 11.    .72   1.07      56         57        50     58.7    26.1    15.2
 12.    .85   1.03      76         80        72     45.7    39.1    15.2

Note. n = 46
Item order difficulty correlation: Rho = .80

Appendix C
Stimulus Items Used to Elicit Language Samples

1. What was the last movie you saw? Tell me about it.
2. Describe what you usually do on Saturdays from the time you get up until the time you go to bed.
3. Tell me about the funniest thing that has ever happened at your house.
4. Tell me about your favourite singer/actor.
5. What is your favourite TV show? What happened on that show the last time you saw it?
6. What's the best book/story you've read? Tell me about it.
7. Tell me about your favourite teacher.
8. Tell me how you would spend a million dollars.
9. Tell me about the best time you can remember having with your family or friends.
10. What's the funniest/most embarrassing/exciting thing that's happened to you? Tell me about it.
11. Where do you live? Tell me how to get there from here.
12. Describe your home.
13. Tell me about your hobbies.
14. Describe how your family celebrates Christmas.
15. How will you spend your Easter vacation?

Appendix D
Language Sample Transcripts

Language Sample Transcript: Control Case One

There was this guy and he went around looking for all these jobs. He wouldn't fit into any of them. Finally he went to a department store. Some accident happened. A sign came falling. There was a lady under it. He grabbed on to it and started swinging up in the air. He liked working with mannequins. The lady asked him what kind of job he wanted. He explained to her what he did. She gave him a job. There was these people that did not agree and disliked the person with the job. He finally made this mannequin. One day he turned around and she was alive. All these happened.
Everyone starts noticing this change in him. They're wondering why he's talking to a mannequin. At the end she turns into real life. She stays like that. They showed these scenes right at the start. It was in Asia. She was in a tomb. She wanted a different life but she didn't know how she was gonna get it. That's how it ended out. It was good. I liked it a lot. There was this family. Mack is old. Someone said that she'd died. All of a sudden he found out that he had a daughter. She came there. A whole bunch of scenes were happening. Life was changing in the house. Everybody was getting into arguments. Then Ann came along. He thought that she was dead. Now she moved into the house next door and his wife doesn't agree with it. They always argue. At the end he went over there and they started dancing. They were having good times. Karen's starting to get thoughts about what's going on between them again. She's married to a man named Ben now. She had twins. She found out they were Gary's. They weren't Ben's. Ben doesn't approve of it. Val and Gary are good friends. While she was being a teacher she was being a friend too. Girls could go and talk to her.

Language Sample Transcript: Control Case Two

It would be my teacher because she had a farm and we got to go to her farm and feed her goats. It is about teenagers. They went to find a dead body. They went along train tracks. When they found it there was this older group that wanted to find it. They stook up for it. They never did say who got it. They just brought it. His brother had died. I guess when he saw this dead one he might have thought of him. Actually it was really good. It was about friendship. They always stood by each other. They were trying to figure out what girls were like. He went through puberty or whatever. He saw his babysitter and I guess he liked her. He was trying to get to know her. Her sister came in. He said that he was a college man. He lied. She found out.
They got mad. They asked their dad what it was about girls. That is the plot. My dad says I don't need braces. So he won't get them. I guess I would get them. It's got grey carpet. My room has a balcony. We have different rooms. In the middle there is a bathroom. We just walk through doors and we go there. I guess it was Sunday. I went down to Metrotown with Joy. We just had fun. We went to those picture booths and we took a couple of pictures. It was great. Usually we just sit around. My dad makes popcorn. We just watch TV. Yesterday Lesley came over. She said my stairs remind her of Psycho. Me and my sister are in the upstairs. We are the only ones there. My sister was babysitting. Then she left. I was remembering Psycho when I was laying there. It was really scary.

Language Sample Transcript: Control Case Three

He's working at this store. He makes this mannequin. Then he gets fired. He sees the mannequin in a store. One day he's working on something and she comes alive. Near the end somehow she gets into this machine where she's going to be chopped up. He saves her. At the end everyone else can see her. The son Theo wanted to take flying lessons. The little girl brought home a boyfriend and she was ordering him around and telling him what to do. He used to stay after school and help you with problems. Spend a lot of time with you. First put it in the bank. When I get older buy a car. Probably move to Hawaii. We went there for my dad's convention. He was mostly at the convention. My mom used to go shopping and me and my sister just go to the beach. Walk around. I went to the Polynesian Cultural Centre, Pearl Harbour, the zoo. We walked around quite a bit. Met some new people there. It was pretty sad when they showed the movie and all the people dying. It's kind of just like a normal boat. Then you go on a memorial. There's a big sign of all the people who died. Some people throw flowers into the water.
Some of them are sticking out from the water. In the evenings we'd all go out for dinner and then go shopping or see a movie. This mother takes her two kids to live with her parents but her parents don't know that she has the kids. She leaves them locked away in an attic. She leaves them there for a really long time. She used to come and visit. She got married to someone else. The kids would get really bored. One of the kids died because it got sick. They took her to the hospital but there wasn't enough time. At the end all the kids sneak out of the window and run away. It was just lying there. One day its mom finally came in. The brother and sister told the mom. Once a long time ago I put my pants on backwards. I had the holes at the back. I came out and we had some guests over. I like their songs and their drummer is really good. I like Sylvester Stallone and Harrison Ford.

Language Sample Transcript: Control Case Four

He was being killed. He asked a friend to help. He must have a revenge on that friend. Now he started to have his revenge but it's not the end yet. It's only one chapter a day. It's a Chinese way. I don't know which one to pick. It's about a hero that helps people. There's lots of girls liked him. Girls are following him around. He knows Kung Fu. He knows some other friends that knows Kung Fu too. They're all heroes. It was about people finding out what's happening. There's a water flooding. People would go over there and check what's happening. They would go and help them. She's a nice teacher. She gives out candies every week on Friday. He's real funny. He jokes a lot. He doesn't give us hard work. I would buy a new house. Go on field trips with my parents. Buy lots of things. We went to a restaurant to eat dinner. We just go on special days like mother's day or when something comes to Canada. Then we go to a restaurant to eat. Some live in Hong Kong. I don't know. I read all different kinds of books.
It's about the mixed-up twins. It's about a twin that gets mixed-up. They're the same. Policeman and friends get all mixed-up with those two. One time when they were lost policemen were trying to find them. When the policeman find one of them the other one ran away. When the policeman find that one again the policeman got all mixed-up. Sometime they do something wrong and the it's real funny. Sometimes they make mistakes. There are Chinese. They're singers and actors. They sing. Their songs are excellent. I go shopping with my mother sometimes. Go to my grandma's house. Do my homework. Play with my friend.

Language Sample Transcript: Control Case Five

The last movie I saw was Star Trek Four. I don't know if you're into science fiction. It was really weird. It was different from all the other ones because all the other ones were really science fiction. This one is more comical because instead of the group being in their spaceship they were coming back in the twenty-fifth century. There was a great big probe or a spaceship or something. It was terrorizing the earth and was planning on destroying it. It was sending off these messages that only whales could hear. They had to go back to our century to find these whales and bring them back. It was really comical. It was good. It left you in suspense for a fifth part coming out. The last time I saw it was probably last Thursday. He was telling everybody how nobody could fool him with all these practical jokes they were trying to play on him. The whole family made up this really elaborate joke to play on him. He overheard them talking about it. It backfired on them and he got them instead. He was really too smart for them to actually play practical jokes on. My favourite teacher was probably my grade seven teacher. He wasn't old. He was in his forties. He knew where his students stood. He wasn't all old-fashioned. He knew all the terms we used and everything they meant.
We could talk to him as if he was just another student. We didn't have to talk to him as if he was a teacher. He did a lot of things with us. When we went places he let us make suggestions of where to go and then he would pick the best places. We went camping two or three times with him. We went one time for a whole week. We missed a whole week of school. He took the whole class on a camping trip. If I won a million dollars I'd probably put it in the bank for a year and let the interest grow. My parents were talking about this just the other day. They were saying that they'd put it in the bank for a year. Then they'd take half of it out and use it for a downpayment on a house. They wouldn't take the whole thing out and use it at one time. Every day there was all sorts of things to do because I met a whole bunch of new friends. We did everything together.

Language Sample Transcript: LLD Case One

I would put it in my bank. Going vacation. I forget. Buy car. Visit my aunt and cousin. I did not watch any movies. We have a part at my house. We celebrate. Everybody came to our house. Put all the dishes in the other side. Wash the other side. Put my clothes together and opened the suitcase. It is big. There is a living room there. They have a kitchen living room and one bedroom. I would say you got the wrong number. I will not give it to them. I will not open the door. I might call the police. I will mail it back. Give it to the teacher. Put a bandaid. Keep looking for the library book. I would give it to the police. I would phone the fire department. Go to the neighborhood. Ask them to phone the police. Who did it. Give it back. Tell them to stop. They are strong. It is cold. They rob something. They kill someone. It is too hard. It is smaller. It is rough. Do not tell anyone. She teach me new things. She help us math. Do you like school? It is fun. It is small. There is a blue creature. Do you want to be a teacher? Do you want to go to college?
I ran out of questions. How old is your sister? It is not a doll. It is a stuffed animal. I call them my cute cub. I got dog.

Language Sample Transcript: LLD Case Two

She's kind and helpful. She's mean. Get angry easily. Go around places. Buy a new house. I go to Chinese school. Learn Chinese. Came back at 3:30 and help my dad. We get memorize the words and have dictation. It's a Chinese movie. There's a twin prince. Got mixed-up. They fight the bad guys. Long time I know but forget. Some people going out and found a moon. The moon was dead. No live there. There's a animal. It was an elephant. It was dead. Then went into another moon. Somebody was deep sleep. The master try to scaped last. The god kill him. A bird flew in the house. My dad go open the door. Scare him out. Clean the tanks. He's a famous singer. Have a Christmas tree and dinner. I always help my father the most. My youngers have all sorts of spare time. Doesn't help. My father's a carpenter. I help him in the roof. We have somewhere chop down trees. My sister fights. They fight with the other small ones. They keep trying to hit the little ones. Get a nice job. I like to be myself. Wouldn't talk to them. Just hang up. Call the police. There's a skytrain to Main Street. Then walk down. Pay for it. Tell somebody. I don't know. It's too far to walk. My feet get tired.

Language Sample Transcript: LLD Case Three

A ninja was fighting this boy. There is a man in the show. The man is a police officer and the kid is a karate guy. The man always catch robbers and the little kid always helps him. The other school doesn't even let us go to the washroom. She does not teach us not to mark it. When you ask her some question she would say go back to your desk. She is nice. She helps people a lot. She does not scream at us. She let us go to the washroom. I will keep it for college. Help the family pay their insurance.
My friend invite me to his house and then he ask me to stay over for a night. We had a time. He showed me all his videos. We were playing computer games. We have a tree. We decorate our house. We have lots of friends. We have turkey dinners. We go out for dinner. We invite lots of friends over. They are nice. Bird came inside my house. We had the screen door open. We had a barbecue out. This bird came in. We did not know. Then we went back to the house. We closed the door. The bird and the cat got trapped. The cat was under my mom's room. The bird was in the flowers. Me and my father were finding something. I was finding something under my mom's room. We saw this cat. Dad was finding something around the plants. Then he found a bird. We opened the door and then we chased it around the house. We did the same thing. They are good. They have good songs. I can't name them all. I forgot. It's pretty interesting.

Language Sample Transcript: LLD Case Four

I'd buy a Lamborghini. I didn't really get it. It had Michael J. Fox in it. Guy has to go back in the future. He has to stop this guy from shooting him. Then he has to go in the future to change his kids. Bring em back. He explains stuff good. There was a snake in their house. Bill Cosby was scared of it. They had a string. They were trying to catch it but I forget what happens. It's football. You have all the equipment. It's tackle football. It's small people playing. There's coaches. They're a rock group. The best song I like is Walk this Way. They want kids to go to school. They don't want gangs around. ACDC is like Highway to Hell. I watch cartoons till 12:00. Watch wrestling till 4:00. I go out and play football. All my weeks are different. I have different things. I call it an arcade cause that's the closest real arcade we have. They get different ones each week. I play 1942. It's a wargame. You shoot down airplanes. I like football more. I don't read books. I read my own books. I make books up.
There's this boy. It's the night before Halloween. He's on this island. There's a skeleton. The skull is all set to kill him. An axe is coming towards him. You wake up. You're floating down the river. You go back on this island. This happened now. It's two pages long. I wrote a 37 page book. It's not a book. It's paper.

Language Sample Transcript: LLD Case Five

It's a situation comedy. It's about that family. It's a family. It has a housekeeper. The housekeeper has a daughter. They all live together. The housekeeper is a man. He seems to be always solving the problems. I like comedy shows. His name's Tony. Her daughter is Samantha. She's good in basketball. He wanted her daughter to join basketball. She didn't join it. He was pretty upset. Later she joined the team. She had a boyfriend named Todd. I think they broke up. Last part was she had a new boyfriend. He happened to be around a eighteen-year old boy and she was only about thirteen or fourteen. That was pretty funny. I would give half to my parents. Buy some things I would like. Keep it for my future or maybe for college. She's really nice. I guess we were kinda close together. She was easy to talk to. She was really nice and helpful. I liked her. I go and visit her. I added everything wrong. I thought I was going to get four hundred dollars. My dad goes you added all wrong. He put it on the wall so everybody could see. It was really funny and I was so embarrassed. I was adding one of my cheques. Everybody was laughing at me. They've been my favourite group since grade five. I still like them. They play rock and roll. It's not hard and it's not soft. It's just in the middle. They broke up but now they're back together again. They lost one person. I'm glad they're back together because I like them. She seems to be getting songs that are kind of normal. She seems to be dressing up normal. Most of the times my mom works on the weekends. She's a nurse. I end up cleaning the house.
Appendix E L A R S P Summary Sheets LARSP Summary Sheet Control Case One C O N N C T V Y COMMAND QUESTION S T A G E I C O M M V Q V N OTHER I S T A G E II C O M M VX QX SV 8 SO sc NEG X A X 4 VO 3 VC OTHER II D N IS A D J N 5 N N PR N 5 V V V P A R T 11 INT X OTHER II 10 ING 17 P L 13 ED 53 X+S(NP) 3 X+V(VP) 4 X+C(NP) X-fO(NP) 3 X+A(AP) 4 S T A G E III C O M M V X Y L E T X Y DO X Y Q X Y VS? SVC 10 SVO 17 SVA 15 NEG X Y VCA VOA 1 VOI OTHER III 1 D ADJ N 2 ADJ A D J N PR D N 8 COP IS PRON-P 49 PRON-O 9 A U X - M 3 A U X - 0 14 OTHER III 3 E N 3S 26 G E N XY+S(NP) 4 XY-fV(VP) 19 XY+C(NP) 4 XY+0(NP) 13 XY+A(AP) 12 S T A G E IV C O M M +S VXY+ QVS(*) QXYZ VS+? TAG SVOA 5 SVCA 2 SVOI 1 SVOC A A X Y 3 OTHER IV 3 NP PR NP 4 PR D A D J N C X X C X 1 NEG V 6 NEG X 2 A U X 2 OTHER IV 3 N'T 5 'COP 1 ' A U X 6 STAGE V A N D 7 CONJ 1 SUB 11 O T H E R CONN COORD(l ) 7 SUBA(l ) 3 CL S CL C 1 COORD(l-c) SUBA (It) CL 0 14 C O M P A R A T I V E POSTM CL1 2 POSTM PHR1 + POSTM CL1+ 1 EST E R ' L Y 2 S T A G E VI PASSIVE C O M P L E M E N T 1 HOW! WHAT! NP INIT 3 NP COORD C M P L X V P 2 S T A G E VII A CONN 3 IT C O M M N T CL T H E R E 4 E M P H ORDER M L U (IN M O R P H E M E S ) = 8.98 52 A N A L Y S E D SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 I N C O M P L E T E 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR S T E R E O T Y P E S L A R S P Summary Sheet Control Case Two CONNCTVY S T A G E 1 C O M M A N D C O M M V S T A G E II C O M M V X QUESTION Q V QX SV 5 SO SC NEG X OTHER I A X VO 3 VC OTHER II D N 14 A D J N 2 N N P R N 7 V V 6 V P A R T 8 INT X 8 OTHER II 6 ING 5 P L 11 ED 43 X+S(NP) 3 X+V(VP) 5 X-vC(NP) X+0(NP) 1 X+A(AP) S T A G E III C O M M V X Y L E T X Y DO X Y qxY vs? SVC 6 SVO 28 SVA 19 NEG X Y V C A V O A . 
VOI OTHER III D ADJ N 2 A D J ADJ N PR D N 4 COP 15 PRON-P 61 PRON-O 4 A U X - M 3 A U X - 0 6 OTHER III 5 EN 3 3S 21 G E N XY+S(NP) 5 XY+V(VP) 18 XY+C(NP) 4 X Y t O ( N P ) 16 XY+A(AP) 12 S T A G E IV C O M M *S VXY+ QVS(+) QXYZ VS»7 TAG SVOA 3 SVCA 4 SVOI SVOC A A X Y OTHER IV NP PR NP 2 PR D ADJ N C X X C X 2 NEG V 2 NEG X 2 AUX 1 OTHER IV N 'T •COP A U X S T A G E V AND 5 CONJ 1 SUB 0 OTHER CONN COORD(I) 5 SUBA(l) 5 CL S CL C COORD(l + ) 1 SUBA (1 + ) CL 0 8 C O M P A R A T I V E POSTM CL1 3 POSTM PHR1+ POSTM C L 1 * EST ER 1 LY S T A G E VI PASSIVE C O M P L E M E N T 1 HOW! WHAT! NP 1NVT NP COORD C M P L X V P 2 S T A G E VII A CONN 1 IT C O M M N T CL T H E R E 2 E M P H ORDER M L U (IN M O R P H E M E S ) = 8.18 48 A N A L Y S E D SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE M 0 D E V I A N T N> 0 I N C O M P L E T E M 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR S T E R E O T Y P E S L A R S P Summary Sheet Control Case Three C O N N C T V Y COMMAND QUESTION S T A G E I C O M M V Q V N OTHER 1 S T A G E 11 COMM V X QX SV 4 SO SC NEC X A X 2 VO 1 VC 1 OTHER II D N 22 A D J N 5 N N PR N 3 V V 11 V P A R T 12 INT X 3 OTHER II 6 1NG 11 P L 19 ED 32 X+S(NP) 1 X t V ( V P ) 4 X+C(NP) X+0(NP) 1 X+A(AP) 1 S T A G E III C O M M V X Y L E T X Y DO X Y Q X Y VS? SVC 6 SVO 11 SVA 9 NEG X Y V C A V O A 4 VOI OTHER III 1 D ADJ N 5 ADJ ADJ N 1 PR D N 19 COP 6 PRON-P 47 P R O N - 0 i A U X - M 3 A U X - 0 8 OTHER III 5 EN 3S 21 G E N 1 XY+S(NP) 8 XY+V(VP) 14 XY+C(NP) S XY+0(NP) 12 XY+A(AP) 9 S T A G E IV C O M M +S VXY+ QVS(+) QXYZ VS+? TAG SVOA 8 SVCA 1 SVOI 1 SVOC A A X Y 11 OTHER IV 4 NP PR NP 1 PR D A D J N C X X C X 6 . NEG V 3 NEG X 2 A U X OTHER IV 4 N 'T 2 •COP 2 ' A U X 4 S T A G E V AND 10 CONJ 4 SUB 6 OTHER CONN COORD(l) 13 SUBA(l) 6 CL S 1 CL C COORD(l + ) 1 SUBA (1+) CL 0 1 C O M P A R A T I V E POSTM CL1 3 POSTM PHRl-f POSTM C L l t EST E R 1 L Y S T A G E VI PASSIVE 1 C O M P L E M E N T 4 HOW! WHAT! 
NP IN1T 6 NP COORD 1 C M P L X VP 4 S T A G E VII A CONN 2 IT C O M M N T CL T H E R E 2 E M P H ORDER M L U (IN MORPHEMES) = 10.1 50 A N A L Y S E D SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 D E V I A N T 0 I N C O M P L E T E 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR S T E R E O T Y P E S LARSP Summary Sheet Control Case Four CONNCTVY COMMAND QUESTION S T A G E I C O M M V Q V N OTHER I S T A G E II C O M M V X QX SV 6 A X 3 D N 11 VV 1 ING 8 SO V0 8 ADJ N 3 V P A R T 6 SC V C N N INT X P L 22 NEG X O T H E R II PR N 6 O T H E R II 8 ED 15 X+S(NP) I X+V(VP) 9 X+C(NP) X+0(NP) 5 X+A(AP) 3 S T A G E III C O M M V X Y Q X Y SVC 10 V C A D ADJ N 4 P R O N - P 40 E N L E T X Y VS? SVO 9 V O A 1 A D J A D J N P R O N - 0 9 DO X Y SVA 11 VOI PR D N 8 A U X - M 4 3S 25 NEG X Y OTHER III 2 COP 17 A U X - O 8 G E N O T H E R III 5 XY+S(NP) 3 XY+V(VP) 6 XY+C(NP) 8 XY+0(NP) 6 XY+A(AP) 11 S T A G E IV C O M M +S QVS(+) SVOA 5 A A X Y 3 NP PR NP 4 NEG V 4 N 'T 3 QXYZ SVCA 3 OTHER IV 4 PR D ADJ N 2 NEG X 'COP 14 VXY+ VS+? SVOI 1 C X 2 A U X 1 ' A U X 3 T A G SVOC X C X 3 OTHER IV S T A G E V AND 2 COORD(l ) 4 C O O R D ( l + ) POSTM CL1 7 POSTM CL1» 2 EST CONJ 2 SUBA( l ) 6 SUBA (1+) POSTM PHR1+ ER SUB 10 CL S C L 0 4 L Y OTHER CONN CL C 1 C O M P A R A T I V E S T A G E VI PASSIVE 1 HOW! NP INIT 4 C M P L X V P 1 C O M P L E M E N T 3 WHAT! NP COORD S T A G E VII A CONN 3 IT COMMNT CL T H E R E 3 E M P H ORDER M L U (IN M O R P H E M E S ) = 7.76 52 A N A L Y S E D SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 I N C O M P L E T E 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR S T E R E O T Y P E S L A R S P Summary Sheet Control Case Five CONNCTVY COMMAND QUESTION S T A G E I COMM V Q V N OTHER I S T A G E II C O M M V X QX SV 6 SO SC NEG X A X 1 VO s VC OTHER II D N 9 A D J N 6 N N PR N 4 V V 3 V P A R T 14 I N T X OTHER II 10 ING 14 P L 18 ED 48 X*S(NP) 3 X+V(VP) 3 X+C(NP) X+0(NP) 3 X+A(AP) 1 S T A G E III C O M M V X Y L E T X Y DO X Y Q X Y VS? 
SVC 11 SVO 6 SVA 5 NEG X Y V C A VOA 4 VOI OTHER III D A D J N 12 A D J A D J N PR D N 9 COP 19 PRON-P 63 P R O N - 0 8 A U X - M 8 A U X - 0 10 OTHER III 5 E N 3S 21 G E N XY-fS(NP) 6 XY+V(VP) 8 XY+C(NP) 8 XY+0(NP) 2 XY+A(AP) 7 S T A G E IV C O M M +S VXY+ QVS(+) 1 Q X Y Z VS+? T A G SVOA 10 SVCA 4 SVOI SVOC A A X Y 3 OTHER IV 5 NP PR NP 12 PR D ADJ N 3 C X X C X 3 NEG V 5 NEG X 2 A U X OTHER IV 6 N'T S 'COP 1 ' A U X 3 S T A G E V AND 8 CONJ SUB 11 OTHER CONN COORD(l ) 9 SUBA(l ) 9 CL S C L C COORD(l-f) SUBA (1+) CL 0 3 C O M P A R A T I V E POSTM CL1 12 POSTM PHR1+ POSTM CL1+ 3 EST 1 E R 1 L Y 4 S T A G E VI PASSIVE C O M P L E M E N T HOW1 WHAT! NP INIT 8 NP COORD C M P L X VP 9 S T A G E VII A CONN 2 IT C O M M N T CL T H E R E 2 E M P H ORDER M L U (IN M O R P H E M E S ) = 13.21 41 A N A L Y S E D SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 D E V I A N T 0 I N C O M P L E T E 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR S T E R E O T Y P E S LARSP Summary Sheet LD Case One CONNCTVY COMMAND QUESTION S T A C E I C O M M V Q V N OTHER I S T A G E II C O M M VX QX SV 3 AX 2 D N 14 V V 2 ING 2 SO V0 10 ADJ N 1 V P A R T 1 SC VC N N INT X P L 3 NEG X OTHER II PR N 2 - OTHER II 2 ED 7 X+S(NP) X+V(VP) 3 X+C(NP) X+0(NP) ? X+A(AP) 2 S T A G E III C O M M V X Y Q X Y 1 SVC 10 VCA D ADJ N 5 P R O N - P 47 EN LET X Y VS? SVO 13 VOA 3 ADJ ADJ N P R O N - 0 7 DO X Y SVA 2 VOl PR D N 6 A U X - M 8 3S 12 NEG X Y OTHER III COP 14 A U X - O 6 G E N OTHER III 2 XY+S(NP) XY+V(VP) 12 XY+C(NP) 4 X Y t O ( N P ) 10 X Y t A ( A P ) 4 S T A G E IV C O M M +S QVS(t) 1 SVOA 4 A A X Y NP PR NP 2 NEG V 5 N 'T QXYZ SVCA 2 OTHER IV PR D ADJ N 1 NEG X 'COP VXY+ VS»? 3 SVOI 3 C X 2 A U X ' A U X T A G SVOC X C X 2 OTHER IV 1 STAGE V AND 1 COORD(l ) 1 COORD(li-) POSTM CL1 P O S T M C L l t EST CONJ SUBA( l ) SUBA (1+) POSTM P H R H - ER 1 SUB CL S CL 0 3 LY OTHER CONN CL C C O M P A R A T I V E STAGE VI PASSIVE HOWl NP INIT 1 C M P L X V P 1 C O M P L E M E N T WHAT! 
NP COORD 1 S T A G E VII A CONN IT C O M M N T CL T H E R E 2 E M P H ORDER M L U (IN M O R P H E M E S ) = 5 07 54 A N A L Y S E D SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 I N C O M P L E T E 0 AMBIGUOUSO MINOR SOCIAL 0 MINOR STEREOTYPES LARSP Summary Sheet LD Case Two CONNCTVY C O M M A N D QUESTION S T A G E I COMM V Q V N OTHER I S T A G E II C O M M V X QX SV 3 SO SC NEG X A X S VO 12 VC OTHER II D N 18 ADJ N 3 N N PR N 3 V V 3 V P A R T 7 INT X 2 OTHER II 7 1NG 2 PL 9 ED 12 X+S(NP) 3 X+V(VP) 6 X+C(NP) X+0(NP) 7 X+A(AP) 3 S T A G E III C O M M V X Y L E T X Y DO X Y Q X Y VS? SVC 13 SVO 8 SVA 4 NEG X Y V C A 1 VOA VOI OTHER III 2 D ADJ N 6 A D J ADJ N P R D N S COP 13 PRON-P 21 PRON-O 7 A U X - M 4 A U X - 0 4 OTHER III 2 EN 3S 15 G E N XY+S(NP) 8 X Y * V ( V P ) 4 XY+C(NP) 9 XY»0(NP) 6 XY+A(AP) 5 S T A G E IV C O M M +S VXY+ QVS(+) Q X Y Z VSt? TAG SVOA J SVCA SVOI SVOC A A X Y 1 OTHER IV 1 NP PR NP 1 PR D ADJ N C X X C X 2 NEG V 3 NEG X 1 2 A U X OTHER IV 1 N 'T 3 'COP 9 ' A U X S T A G E V AND 3 CONJ 1 SUB OTHER CONN COORD(l ) 4 SUBA(l ) 2 CL S CL C COORD(1 + ) SUBA (1+) CL O 1 C O M P A R A T I V E POSTM CL1 POSTM PHR1 + POSTM CL1+ EST 1 ER L Y 1 S T A G E VI PASSIVE C O M P L E M E N T 2 HOWI WHAT! NP INIT 1 NP COORD C M P L X V P 3 S T A G E VII A CONN 1 IT C O M M N T CL T H E R E 3 E M P H ORDER M L U (IN M O R P H E M E S ) = 5.36 52 A N A L Y S E D SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 I N C O M P L E T E 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR S T E R E O T Y P E S LARSP Summary Sheet LD Case Three C O N N C T V Y COMMAND QUESTION S T A G E I C O M M V Q V N OTHER I S T A G E II C O M M V X QX SV 4 SO sc N E G X A X V0 2 V C OTHER II D N 24 A D J N 1 N N P R N 1 V V V P A R T 2 INT X 2 OTHER 11 7 ING 5 PL 10 ED 24 X+S(NP) J X+V(VP) 4 X+C(NP) X-t-O(NP) 1 X+A(AP) S T A G E III COMM V X Y 1 LET X Y DO X Y Q X Y VS? 
SVC 6 SVO 19 SVA 8 N E G X Y V C A VOA VOI OTHER III D A D J N 3 ADJ ADJ N PR D N 10 COP 9 PRON-P 46 P R O N - 0 4 A U X - M 4 A U X - O 9 OTHER III S EN 3S 15 G E N XY+S(NP) 8 XY»V(VP) 8 XY+C(NP) 3 XY+0(NP) 17 X Y * A ( A P ) 9 S T A G E IV C O M M +S VXY+ QVS(+) QXYZ VS+? TAG SVO A 11 S V C A 1 SVOI 2 SVOC A A X Y OTHER IV 3 NP PR NP PR D A D J N 2 C X X C X 2 NEG V 5 NEG X 1 A U X OTHER IV 1 N 'T 1 •COP ' A U X S T A G E V AND 4 CONJ SUB 1 O T H E R CONN COORD( l ) 4 SUBA( l ) CL S CL C COORD(l + ) SUBA (1+) CL 0 7 C O M P A R A T I V E POSTM CL1 POSTM PHR1+ POSTM CL1+ EST ER LY S T A G E VI PASSIVE HOW! NP INIT 4 C M P L X VP C O M P L E M E N T WHAT! NP COORD S T A G E VII A CONN 4 IT C O M M N T CL T H E R E I E M P H ORDER M L U (IN M O R P H E M E S ) = 7.36 47 A N A L Y S E D SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 I N C O M P L E T E 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR S T E R E O T Y P E S LARSP Summary Sheet LD Case Four C O N N C T V Y COMMAND QUESTION S T A G E 1 C O M M V Q V N OTHER I STAGE II C O M M V X QX SV 4 SO SC NEG X A X VO 2 V C OTHER II D N 13 A D J N 4 N N PR N 4 V V 5 V P A R T 4 1NTX 1 OTHER II 4 ING 5 PL 13 ED 9 XtS(NP) X+V(VP) 3 X+C(NP) X+0(NP) 1 X+A(AP) STAGE III COMM V X Y LET X Y DO X Y Q X Y VS7 SVC 16 SVO 17 SVA 6 NEG X Y V C A VOA 2 VOI OTHER III D A D J N 3 A D J A D J N PR D N 6 COP 8 PRON-P 45 P R O N - 0 8 A U X - M 2 A U X - O 6 OTHER III 2 EN 3S 23 G E N XY+S(NP) S X Y * V ( V P ) 13 XY+C(NP) 9 XY+0(NP) 10 X Y t A ( A P ) 6 S T A G E IV COMM »S V X Y t QVS(» QXYZ VS»? TAG SVOA 6 SVCA 1 SVOI SVOC 1 A A X Y 3 OTHER IV NP PR NP 1 PR D ADJ N C X X C X 1 NEG V 4 NEG X 2 AUX OTHER IV 2 N 'T 3 'COP 14 •AUX : S T A G E V AND CONJ 2 SUB OTHER CONN COORD( l ) 2 SUBA( l ) CL S CL C C O O R D ( l t ) SUBA (1+) CL 0 3 C O M P A R A T I V E POSTM CL1 2 POSTM PHR1 + POSTM CL1 + EST 2 ER 1 LY S T A G E VI PASSIVE HOW! NP IN1T 2 C M P L X V P 2 C O M P L E M E N T 1 WHAT! 
NP COORD 1
STAGE VII A CONN 1 IT COMMNT CL THERE 4 EMPH ORDER
MLU (IN MORPHEMES) = 6.23
52 ANALYSED SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 INCOMPLETE 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR STEREOTYPES

LARSP Summary Sheet LD Case Five
CONNCTVY COMMAND QUESTION
STAGE I COMM V Q V N OTHER I
STAGE II COMM V X QX SV 4 A X D N 14 V V 2 ING 7 SO VO 1 ADJ N 4 V PART 5 SC VC N N INT X 6 PL 9 NEG X OTHER II PR N 1 OTHER II 2 ED 26
X+S(NP) X+V(VP) 4 X+C(NP) X+O(NP) 1 X+A(AP)
STAGE III COMM V X Y Q X Y SVC 16 V C A D ADJ N 3 PRON-P 60 EN 1 LET X Y VS? SVO 23 VOA 1 ADJ ADJ N PRON-O 5 DO X Y SVA 9 VOI PR D N 7 AUX-M 5 3S 31 NEG X Y . OTHER III 1 COP 25 AUX-O 7 GEN OTHER III 3
XY+S(NP) 8 XY+V(VP) 9 XY+C(NP) 11 XY+O(NP) 14 XY+A(AP) 7
STAGE IV COMM +S QVS(+) SVOA 6 A A X Y 4 NP PR NP 2 NEG V 3 N'T 1 QXYZ SVCA 2 OTHER IV PR D ADJ N 1 NEG X 'COP 12 VXY+ VS+? SVOI C X 2 AUX 1 'AUX 2 TAG SVOC X C X 4 OTHER IV
STAGE V AND 3 COORD(1) 6 COORD(1+) POSTM CL1 3 POSTM CL1+ EST CONJ 3 SUBA(1) 2 SUBA(1+) POSTM PHR1+ ER SUB 2 CL S CL O 7 LY OTHER CONN CL C COMPARATIVE
STAGE VI PASSIVE HOW! NP INIT 1 CMPLX VP 2 COMPLEMENT 2 WHAT! NP COORD
STAGE VII A CONN 1 IT COMMNT CL THERE EMPH ORDER
MLU (IN MORPHEMES) = 7.3
53 ANALYSED SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 INCOMPLETE 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR STEREOTYPES

Appendix F
Computational Procedure for Wilcoxon Rank Sum Tests

1. Proceeding from smallest to largest, ranks were assigned to each case in both groups.
2. When ties occurred, each case was assigned the average of the ranks it would occupy if no ties had occurred.
3. The sum of ranks ($R_j$) was calculated for each group at phrase and clause level for Stages 2, 3, 4, and 5.
4. $\bar{R}$ was calculated for both groups: Mean $= \bar{R} = \frac{N_j (N_1 + N_2 + 1)}{2}$, where $N_j$ is the number of cases in group $j$ and $N_1$ and $N_2$ are the sizes of the two groups.
5. $R_j$ for each group was compared to $\bar{R}$. If less than $\bar{R}$, $R_j$ was compared to the critical values required for significance. The critical lower tail values of $R_j$ for 5 and 5 cases are 19 ($\alpha = .05$) and 16 ($\alpha = .01$).
6. If $R_j$ exceeded $\bar{R}$, the corresponding lower tail value was obtained as follows: $2\bar{R} - R_j$. This result was then compared to the critical lower tail values indicated above.
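The six steps above can be sketched in Python as follows. This is only an illustrative sketch, not the procedure as originally carried out. The function names `average_ranks` and `rank_sum_test` and the example scores are hypothetical. Treating a lower tail value as significant when it is less than or equal to the critical value is an assumed convention that the steps above do not spell out.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Rank values from smallest to largest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2  # mean of the 1-based ranks in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def rank_sum_test(group_1: Sequence[float], group_2: Sequence[float],
                  critical_05: float = 19, critical_01: float = 16) -> dict:
    """Follow steps 1 to 6; default critical values are those given for 5 and 5 cases."""
    n_1, n_2 = len(group_1), len(group_2)
    ranks = average_ranks(list(group_1) + list(group_2))        # steps 1 and 2
    rank_sums = {1: sum(ranks[:n_1]), 2: sum(ranks[n_1:])}      # step 3
    sizes = {1: n_1, 2: n_2}
    results = {}
    for j in (1, 2):
        mean_r = sizes[j] * (n_1 + n_2 + 1) / 2                 # step 4
        r_j = rank_sums[j]
        # Steps 5 and 6: reflect R_j about the mean when it exceeds the mean.
        lower_tail = r_j if r_j <= mean_r else 2 * mean_r - r_j
        results[j] = {
            "rank_sum": r_j,
            "mean": mean_r,
            "lower_tail": lower_tail,
            # Assumed convention: significant when at or below the critical value.
            "sig_05": lower_tail <= critical_05,
            "sig_01": lower_tail <= critical_01,
        }
    return results


# Hypothetical counts for two groups of five cases (not data from this study).
example = rank_sum_test([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
```

For these made-up scores the two rank sums are 15 and 40, the mean is 27.5 for each group, and both groups yield a lower tail value of 15, which falls below the critical value of 16.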
