UBC Theses and Dissertations

The test of language competence : a validity study with language disabled and normal children Ainsworth, Cheryl Anne 1989

THE TEST OF LANGUAGE COMPETENCE: A VALIDITY STUDY WITH LANGUAGE DISABLED AND NORMAL CHILDREN

By
CHERYL ANNE AINSWORTH
B.A., The University of British Columbia, 1974

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ARTS
in
THE FACULTY OF EDUCATION
(Department of Educational Psychology and Special Education)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
October 1989
© Cheryl Anne Ainsworth, 1989

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
1956 Main Mall
Vancouver, Canada V6T 1Y3

Abstract

This study investigated the validity and related psychometric characteristics of the Test of Language Competence (TLC), published in 1985 by Wiig and Secord. The TLC was developed as a measure of higher order language functioning in children and adolescents between the ages of nine and eighteen years. Evidence concerning the psychometric characteristics of the TLC is reported in the test manual; however, to date, no studies addressed primarily to the subject of TLC validity have been reported in the literature. Moreover, no information is available concerning the effectiveness of its use with local school children.
This study endeavored to examine the technical characteristics of the TLC using data obtained from 23 language disordered (LLD) and 23 control subjects sampled from the local school population. At the same time, the criterion-related validity of an informal language sample analysis was investigated.

Item analysis statistics, including indices of item difficulty, item discrimination, internal consistency, and interrater reliability were prepared for the TLC. Discriminant function analyses were used to assess criterion validity of the TLC, with and without corrections in TLC scores for Verbal IQ. Because of the multiethnic nature of the sample, English as a second language (ESL) and English as a first language (EFL) group means were tested for significant differences on six variables. LLD and control group performance on the language sample analysis were tested for significant differences, using Wilcoxon rank sum tests.

Results of the item analyses indicated support for the internal consistency of the TLC subtests and the test composite, with the exception of Subtest Two (Making Inferences), which obtained an internal consistency coefficient below the designated .8 criterion. Subtest Two and Subtest Three (Recreating Sentences) were found to contain items of questionable validity, and all four subtests contained items that were misordered in terms of difficulty. Subtests Two and Three exhibited satisfactory criterion validity; however, Subtest One (Understanding Ambiguous Sentences) and Four (Understanding Metaphoric Expressions) failed to discriminate between LLD and Control groups in a stepwise analysis. The language sample analysis discriminated between the two groups.
Possible explanations for the findings, along with implications for clinical practice and recommendations for further research, are discussed.

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
Acknowledgements
Dedication

Chapter I: Introduction
  Background of the Problem
  Purpose of the Study
  Significance of the Study
  Summary

Chapter II: Literature Review
  Introduction
  The Assessment Process
  Test Validity
  Theoretical Background
  Validation Research
  Assessing ESL Populations
  The Test of Language Competence
  Theoretical Background
  Standardization
  Content Validity
  Criterion-Related Validity
  Construct Validity
  Internal Consistency
  Interrater Reliability
  Informal Assessment
  Summary
  Research Questions

Chapter III: Methodology
  Sampling Procedures
  Measuring Instruments
  Test of Language Competence
  Subtest One
  Subtest Two
  Subtest Three
  Subtest Four
  WISC-R
  Informal Language Sample
  Method of Data Analysis
  Item and Test Analysis
  Correlational Analyses
  Interrater Reliability
  Regression Analysis
  Stepwise Discriminant Function Analysis
  Adjusted Discriminant Function Analysis
  Hotelling's T-Square
  LARSP
  Wilcoxon Rank Sum Tests
  Method

Chapter IV: Results
  Descriptive Statistics
  Item and Test Analysis
  Item Difficulty
  Item Discrimination
  Internal Consistency
  Interrater Reliability
  Correlational Analysis
  Discriminant Function Analysis
  Simple Regression Analysis
  Adjusted Discriminant Function Analysis
  Hotelling's T-Square
  Wilcoxon Rank Sum Tests

Chapter V: Discussion
  Purpose of the Study
  Summary and Discussion of Results
  Internal Characteristics of the TLC
  Item Difficulty
  Item Discrimination
  Internal Consistency (TLC Subtests)
  Internal Consistency (TLC Composite)
  Intercorrelations
  Interrater Reliability
  Relationship Between TLC and VIQ
  Language Disabled and Control Group Discrimination (TLC)
  Test Differences in ESL/EFL Group Performance
  Language Disabled and Control Group Discrimination (Language Sample Analysis)
  Conclusions
  Limitations of the Study
  Recommendations for Clinical Practice
  Recommendations for Further Research

References

Appendix A: Parents/Student Information
Appendix B: Item Analysis Data
Appendix C: Stimulus Items Used to Elicit Language Samples
Appendix D: Language Sample Transcripts
Appendix E: LARSP Summary Sheets
Appendix F: Computational Procedure for Wilcoxon Rank Sum Tests

LIST OF TABLES

Sample Characteristics: Age, Sex, English Second Language (ESL), English First Language (EFL)
Cultural Background of Sample
WISC-R Means, Standard Deviations (SD) and Standard Error of Measurement
TLC Means, Standard Deviations (SD) and Standard Errors of Measurement (SEM)
Scaled Scores and Standard Scores for the Mean Age Group (13 years) on the TLC Subtests and Composite
TLC Subtest Internal Consistency Reliabilities and Descriptive Statistics: Combined Sample
Correlations Between TLC and WISC-R for the Combined Sample
Intercorrelations Among TLC Subtests for the Combined Sample
Summary of Discriminant Function Analysis: Stepwise Selection of TLC Subtests (Total Scores)
Summary of Discriminant Function Analysis: Classification of Language Disabled (LLD) and Control Groups
Summary of Simple Regression Analyses: TLC subtests regressed on VIQ
Hotelling's T²: Significance of Multivariate and Univariate Differences Between ESL and EFL Group Means
Wilcoxon Rank Sum Tests: Sum of Ranks (Rj) by Stage of LARSP Analysis

Acknowledgements

I would like to thank my faculty advisor and committee chairperson, Dr. Julianne Conry, for her assistance in planning and completing this study. Thanks also to Dr. Robert Conry, second member of the committee, for contributing his time and expertise to the data analysis and interpretation. Further acknowledgements are extended to the staff and students of School District #39 (Vancouver) who participated in the data collection; to Joanna Neilson for re-scoring the TLC protocols; to Cathy Margetts who donated her secretarial skills to preparing the final draft; and to the many friends, relatives and colleagues whose interest encouraged me to persevere. Finally, but not least in importance, I would like to thank my husband, John, without whose patience and support this thesis could not have been finished; and my children, Christopher and Kimberly, who have waited many hours for mommy to be finished working.

For my father

Chapter One
Introduction

Wiig and Semel (1984) describe language assessment as a hierarchical process. The first level of language assessment involves screening for language disability. This is followed by more in-depth diagnostic assessment to identify specific areas of weakness, and goals for intervention. Finally, assessment is concerned with monitoring and evaluating progress.

The choice of methods used within the assessment process depends upon the level or purpose of the assessment, and the theoretical orientation of the examiner.
Although theoretical perspectives may vary, current practice generally includes a combination of formal and informal assessment techniques.

Formal assessment, by definition, involves the use of standardized evaluation procedures. Among the suggested advantages of standardized language tests are their objectivity, their replicability, and controlled administration procedures which help to eliminate unwanted sources of variance. These features, together with the fact that standardized tests yield quantitative data, make them useful for diagnostic and research purposes. Despite the appeal of standardized tests, a number of issues surround their use. These concern the technical and practical merits of many commonly-used instruments.

Background of the Problem

Sommers, Erdige and Peterson (1978) predicted that the impact of PL 94-142 (U.S. Office of Education, 1975) would be to stimulate the development and widespread use of formal language tests; a forecast which has come true. The years since PL 94-142 was enacted have given rise to a plethora of standardized language tests. Standardized test results are now required by law in the United States for placement and funding purposes. As Stephens and Montgomery (1985) have pointed out, speech-language clinicians do not have a choice as to whether they will use standardized tests, but only which tests they will use.

Despite their widespread use, standardized language tests have become the subject of increasing criticism. Particular dissatisfaction has gathered around "the proliferation of published tests and materials being marketed with insufficient information concerning their effectiveness and psychometric characteristics" (American Speech-Language-Hearing Association [ASHA] 1988, p. 75).
Theoretical and technical information reported in test manuals is frequently inadequate (Lieberman, Heffron, West, Hutchinson & Swem, 1987; McCauley & Swisher, 1984a; Stephens & Montgomery, 1985). Moreover, a review of the professional literature reveals a paucity of research concerning the psychometric characteristics of even the most commonly used instruments. Finally, critics maintain that the validity of standardized language tests used to identify language disorders among those who speak English as a second language (ESL) is highly questionable (Damico, in press; Evard & Sabers, 1979; Vaughn-Cooke, 1983).

The increased dissatisfaction with standardized language tests has had two effects. First, there has been a movement away from formal assessment procedures toward the use of informal, descriptive techniques, including language sampling and analysis. Proponents of this view argue that in addition to their technical inadequacies, standardized tests offer no practical utility. Apart from determining the existence of a language disorder, standardized tests do little to describe the nature of the problem or how to fix it. Descriptive procedures, on the other hand, take no more time to administer than a battery of standardized tests, and ultimately yield information which is useful at all levels of the assessment process (Damico, in press; 1988; Muma, Lubinski & Pierce, 1982; Simon, 1984).

A second movement resulting from the dissatisfaction with standardized tests has been toward improving standards for test use and test construction. A recent example of this effort appeared in the form of the ASHA Guidelines on Instrument Evaluation (ASHA, 1988).
The ASHA Guidelines were modeled after the American Psychological Association (APA) Standards For Educational and Psychological Tests (American Psychological Association, 1985) and were intended "as general criteria for judging the adequacy of measurement and intervention instruments or procedures" (p. 76). The Guidelines raise issues concerning the significance of theory in test development, standardization procedures, and test reliability; however, no direct reference is made to test validity, which according to APA standards "is the most important consideration in test evaluation" (APA, 1985).

Regardless of divergent professional opinion concerning the relative merits of formal assessment, standardized tests continue to be used extensively by speech-language pathologists for assessment purposes. Lieberman and Michael (1986) and McCauley and Swisher (1984a; 1984b) have drawn attention to the fact that serious errors in diagnosis and remediation can occur if the tests used by clinicians fail to meet adequate technical standards. They maintain that in order to ensure accuracy within the assessment process, and to influence the quality of future instruments, clinicians have a responsibility to scrutinize the technical characteristics of the instruments they use. This view is consistent with APA (1985) recommendations for test use, and implies that clinicians must have access to technical information concerning the various instruments available, and that the information provided should include evidence of test validity.

Purpose of the Study

The first purpose of the present study was to investigate the validity and related psychometric characteristics of the Test of Language Competence (TLC) (Wiig and Secord, 1985).
A second purpose of the study was to determine if significant differences in the performance of language disabled (LD) and control subjects would be observed on an informal language sample analysis. Each of these objectives is discussed separately below.

The Test of Language Competence (TLC) is an individually administered measure of language competence developed for use with older children and adolescents between the ages of nine and eighteen years. Language competence is defined as "the appropriate understanding and/or expression of language content and a responsiveness to the communicative demands of a specific situation" (Wiig & Secord, 1985, p. 1). The TLC is intended to complement other formal and informal methods of language assessment, and is recommended for use with measures of receptive vocabulary and language sample analyses.

The TLC Technical Manual presents information concerning the theoretical background and development of the test. Evidence supporting TLC validity and reliability is reported on the basis of data obtained during standardization. Separate investigations using LD and control subjects are also reported; however, sampling procedures employed in these latter investigations are not well-described, and intelligence test data are available for the language disabled subjects only. Santos (1987) included the TLC in a study of variance in reading comprehension; however, no studies addressed primarily to the subject of TLC validity or reliability have as yet been reported in the literature.

The primary objective of this research was to examine the technical characteristics of the TLC, and in particular, to evaluate the validity of the instrument.
This was accomplished first by examining the internal characteristics of the TLC, including item difficulty, item discrimination, internal consistency, and interrater reliability. Second, two discriminant function analyses were calculated to determine the capacity of the TLC to discriminate between language-learning disabled (LLD) and control subjects both before and after the effects of verbal intelligence had been removed. Verbal intelligence was operationally defined as the Verbal Intelligence Quotient (VIQ) on the Wechsler Intelligence Scale for Children-Revised (WISC-R) (Wechsler, 1974). Third, in order to determine if subjects who spoke English as a second language (ESL) performed differently than subjects who spoke English as their first native language (EFL), group means on the TLC and WISC-R (VIQ, PIQ, FSIQ) were tested for significant differences.

A second objective of this research was to compare the performance of LD and control subjects on an informal language sample analysis. Language samples were obtained from five LD and five control subjects. These were analyzed using the Language Assessment, Remediation and Screening Procedure (LARSP) (Crystal, Fletcher & Garman, 1976). Results of the LARSP analysis for both groups were then tested for significant differences.

Significance of the Study

Data for this research were collected in School District 39 (Vancouver). Vancouver is a broadly multicultural district with a high proportion (approximately 46%) of students who speak English as a second language (ESL). Although the TLC is used by speech-language pathologists within the district, little is known about the effectiveness of its use with Vancouver students.
The present study investigated the psychometric characteristics of the TLC using data obtained from within this culturally diverse population. In so doing, it contributes to existing evidence of TLC validity and reliability, and is of interest to speech-language pathologists in culturally diverse educational jurisdictions who are using or who may consider including the TLC in their assessment batteries. In the same way, this study contributes to evidence concerning the criterion-related validity of an informal language sampling procedure for the local population.

Summary

Language assessment is a hierarchical process that includes formal and informal methods of assessment. Although standardized tests are widely used, there is growing dissatisfaction concerning their technical and practical merits. The Test of Language Competence (TLC) is a recently developed measure of language suitable for use with older children and adolescents between the ages of nine and eighteen years. Information concerning the theoretical background and development of the TLC is presented in the test manual. Research evidence of TLC validity and reliability is limited to that reported in the test manual, and one other unpublished investigation. The present study investigated the validity and related technical characteristics of the TLC using data obtained from a local and culturally diverse population. At the same time, the criterion-related validity of an informal language sample was investigated.

Chapter Two
Review of the Literature

Introduction

This chapter will review the literature concerning the validity of standardized language tests, and in particular, those developed for use with older children and adolescents.
Three issues relevant to this discussion were identified in the previous chapter, and will be discussed here in the following order: theoretical considerations during test construction, the lack of test validation research, and the validity of standardized language tests applied in multicultural settings. The TLC will be reviewed within the context of this review, followed by a discussion of informal language sampling as an adjunct to formal assessment. The chapter opens with an overview of the assessment process, and a definition of test validity.

The Assessment Process

Language assessment is a hierarchical process that involves screening, diagnosis, program planning, and evaluation (Wiig & Semel, 1984).

The first level of assessment concerns screening for possible language disorder. Screening data may be obtained from a variety of sources, including observation, student records, informal, clinician-prepared tasks, and standardized tests (Larson & McKinley, 1987; Tibbits, 1982; Wiig & Semel, 1984). Few screening tests suitable for use with adolescents have been developed. Those available include the Screening Test of Adolescent Language (STAL; Prather, Beecher, Stafford & Wallace, 1980), and the Clinical Evaluation of Language Functions Screening Tests (CELF; Semel & Wiig, 1980), currently under revision. A short form of The Test of Language Competence (TLC), which is the subject of this study, is intended for use as a screening instrument.

The second level of language assessment concerns diagnosis. The purpose of assessment at this level is to confirm the existence of a language disorder. Typically, this objective is met by the administration of standardized tests (Larson and McKinley, 1987; Tibbits, 1982; Wiig and Semel, 1984).
A number of standardized language tests have been developed for use at this level with older children and adolescents. The Test of Adolescent Language, now the TOAL-2 (Hammill, Brown, Larsen & Wiederholt, 1987) may be used with subjects between the ages of 12 and 18. The Test of Language Development-Intermediate, revised as the TOLD-2-I (Hammill & Newcomer, 1988) was developed for older children between the ages of 8 and 12 years. Other instruments include The Fullerton Language Test for Adolescents (FLTA) (Thorum, 1980), which is intended for use with subjects between the ages of 11 and 18 years. The Clinical Evaluation of Language Fundamentals-Revised (CELF-R; Semel, Wiig, & Secord, 1987) includes norms for subjects between the ages of 5 and 16 years of age. Other measures of specific language skills and abilities that have not been developed exclusively for use with adolescents but may be suitable for use with this population are described by Larson and McKinley (1987), Tibbits (1982), and Wiig and Semel (1984).

Data obtained from standardized tests may be used at the following two levels of assessment for describing the nature of the language problem and planning intervention goals. To accomplish these objectives, clinicians may examine differences among subtest scores to determine areas of strength or weakness. Other methods include conducting error analyses on the basis of item responses, and altering task formats to observe where student performance breaks down.

McCauley and Swisher (1984b) have cautioned against use of the above methods for planning therapy objectives, claiming that they may lead to "a mistaken understanding of a client's problem, to inappropriate and fruitless therapy programs, or to inaccurate conclusions regarding the efficacy of therapy" (p. 338).
Errors associated with response analysis stem from the fact that no single test covers an exhaustive range of skills; therefore some skills that need to be addressed in therapy may be overlooked in the assessment. A second problem is that an incorrect response to a specific test item may not represent a true deficit in the skill represented. Third, subject responses obtained under standardized conditions may be unrepresentative of the individual's language in other contexts. Finally, altering test items or teaching to specific items invalidates a test for future purposes. Profile analysis is considered an acceptable method for determining strengths and weaknesses if appropriate statistical procedures governing the interpretation of significant differences are observed; however, McCauley and Swisher point out that the information required to do this is frequently omitted from test manuals.

Although standardized tests may contribute to the diagnostic profile, information is often obtained at this level through the use of informal assessment procedures. The term refers to interviews, observations, questionnaires, and other non-standardized procedures, including language sample analyses.

Informal procedures are, for some, the preferred method of language assessment. Proponents argue that standardized tests collapse what is essentially a complex process (language) into a few meaningless test scores, whereas informal procedures yield descriptive information that may be translated into instructional objectives (Damico, in press; Muma et al., 1982; Leonard, Perozzi, Prutting & Berkley, 1978).
Others maintain that informal procedures enable the clinician to sample behavior in a variety of contexts, leading to a more accurate portrayal of language functioning (Bloom & Lahey, 1978; Larson & McKinley, 1987; Lund & Duchan, 1983). Moreover, informal procedures permit the direct application of theory "thereby bridging the gap between some less timely standardized tests and what is currently understood about the nature of receptive and expressive competence" (Simon, 1984, p. 84). Finally, standardized tests are viewed as suffering from many technical inadequacies; this, it has been suggested, "forces the clinician and teacher to use clinical judgement in the diagnostic process" (Cupples & Lewis, 1984, p. 131).

Stark, Tallal and Mellits (1982) have enumerated the pitfalls of relying on clinical judgement in language assessment. Clinical judgement, they argue, is based on inexplicit criteria; therefore it cannot be replicated, it is subject to bias, and is not suitable for research purposes. Clinical judgement cannot be used independently to assess the language of children from other cultures, and it cannot be used to distinguish between a language disorder and a more global intellectual impairment. McCauley and Swisher (1984b) advise that the reliability and validity of informal procedures in general require further clinical and research attention.

The final level of language assessment concerns progress evaluation. Evaluation is an ongoing part of the assessment process, and may include the use of formal or informal procedures. Hammill et al. (1987) comment that "the use of criterion-referenced enroute objectives does not obviate the need to be sure that the enroute objectives do in fact lead to the desired, general integrated language goals" (p. 3), and recommend retesting students with the same, or similar instruments, that were used to identify them for special programmes in the first place. McCauley & Swisher (1984b) cite three reasons against the use of standardized tests for monitoring progress. First, standardized tests are designed to compare individuals, and may be insensitive to intraindividual changes over time. Second, changes in test scores may be related to the unreliability of the instrument. Finally, repeated administration of standardized tests may result in practice effects which invalidate test results.

While there are extremes of opinion concerning formal and informal assessment procedures, the more broadly-held view is that the two approaches are complementary (Blau, Lahey, Oleksiuk-Velez, 1984; Kelly & Rice, 1986; Launer & Lahey, 1981; McCauley & Swisher, 1984b; Stephens & Montgomery, 1985). Larson and McKinley (1987) and Tibbits (1982) consider a combined approach essential for assessing the language of older children and adolescents.

Test Validity

Test validity concerns the appropriateness of inferences made from test scores. Traditional definitions of test validity have distinguished between content, concurrent and construct validity. More recently, validity has been defined as a unitary concept that includes all three types of evidence (APA, 1985). For the sake of clarity, each will be defined separately at this point.

A preliminary step in the process of test development is the selection of a theoretical model which defines the trait or behavior to be measured, and provides a rationale for item selection (APA, 1985; Cronbach, 1971). Content validity concerns the
Content validity concerns the  adequacy of content sampling f r o m within this theoretical f r a m e w o r k , or the extent to w h i c h test items represent the behavior o f interest in its proper proportion. N u n n a l l y (1978) maintains that content validity "rests mainly on appeals to reason"; however, in some situations e m p i r i c a l methods may be employed to enhance content validity.  These  include, f o r example, using item analysis procedures d u r i n g test development, or obtaining correlations between the test of interest and measures of the same trait or behavior.  Detailed discussions of item development and content validity are located in  Anastasi (1982), C r o n b a c h (1971), Henrysson (1971) and N u n n a l l y (1978). C r i t e r i o n - r e l a t e d validity is of primary interest when the test under investigation is intended f o r classification or d e c i s i o n - m a k i n g purposes (Anastasi, 1982; C r o n b a c h , 1971).  T h e extent to w h i c h test scores predict performance on one or more outcome  criteria is a measure of criterion-related validity.  Anastasi (1982) suggests "a test may be  validated against as many criteria as there are specific uses f o r it" (p. 138); typical criteria are academic achievement, group membership, diagnostic classification, and so forth.  C r i t e r i o n - r e l a t e d evidence may be concurrent or predictive.  C o n c u r r e n t validity  is examined when data are obtained for the test o f interest and the outcome criteria at the same point in time. outcome criteria  Predictive validity is examined when data pertaining to the  are obtained at some point in the future; however, the term may be  used to refer to prediction at any time (Anastasi, 1982). A l t h o u g h test validation should employ all three types of evidence ( A P A , 1985), construct validity is o f critical significance when the test o f interest is a proposed measure of some unobservable trait, or construct. 
Construct validity concerns how well a test measures the construct it is intended to measure, and "any data throwing light on the nature of the trait under consideration and the conditions affecting its development and manifestations are grist for this validity mill" (Anastasi, 1982, p. 144). Thus, content and criterion-related evidence may contribute to evaluations of construct validity. It is on this point that Guion (1977) has argued "all validity is at its base some form of construct validity" (p. 410). Similarly, Messick (1980) defines construct validity as "the unifying concept of validity that integrates criterion and content considerations into a common framework for testing rational hypotheses about theoretically relevant relationships" (p. 1015).

Messick's point can be traced to earlier discussions concerning the significance of the "nomological net" (Cronbach & Meehl, 1955), or theoretical framework within which constructs are defined in relation to other constructs and observable behaviors. Construct validation is a process of testing hypotheses or predictions made on the basis of test performance. The extent to which a hypothesized relationship is supported is evidence of construct validity. Construct validation studies include, for example, correlating the test of interest with measures of the same trait, or measures of different traits, examining item and subtest intercorrelations, and factor analysis. In situations where the proposed relationship is not supported, one may assume that the test, the research methodology, or the theoretical framework is unsound (Cronbach & Meehl, 1955).

To summarize, test validity is a unitary concept comprising content, criterion-related and construct validity.
All three types of evidence are bound by a common theoretical framework which provides a rationale for test score interpretation. This definition assumes several points. First, in order to demonstrate validity, a test must be based on a theoretical framework or model that provides a rationale for item selection and test interpretation. Second, evidence of construct validity supports not only the validity of the test under investigation, but the theory on which the test is based. Finally, test validation is an ongoing process of accumulating evidence which may ultimately be used in an evaluation of construct validity.

Theoretical Background

It may be concluded from the foregoing discussion that validity is built into a test from the outset through the articulation of a sound theoretical model. As stated, the function of a test model is to provide a rationale for item selection and the interpretation of test scores, a factor which bears upon content and construct validity.

Despite the significance of theory in test construction, many language tests are not theory-driven (Muma, 1985; McCauley & Swisher, 1984a). For example, Stephens and Montgomery (1985) reviewed six tests of adolescent language (STAL, WORD Test, TOLD-I, FLTA, CELF, TOAL) and concluded only the TOAL and the TOLD-I were constructed with reference to any theoretical model. Lieberman & Michael (1986) evaluated the content relevance and content coverage of three standardized language tests (CELF, CELI, TOLD), two of which are suitable for use with older children. Content relevance was evaluated according to five criteria, including the existence of a theoretical model. Only one test (TOLD) was judged to be adequate in this area.
Content coverage in the same study was evaluated by analyzing the grammatical requirements of each test item using the Language Assessment, Remediation and Screening Procedure, or LARSP (Crystal et al., 1976), which is based on a developmental model of grammar. Results of the LARSP analysis led the researchers to conclude that for all three tests, content coverage was incomplete and unrepresentative of the grammatical domain. Each instrument, for example, was found to overrepresent the earlier stages of grammatical development, indicating that the tests might be too easy to identify language problems in older children.

In a later study, Lieberman et al. (1987) compared the performance of 30 randomly-selected sixth-graders (11.6 to 12.5 years) on four adolescent language tests (FLTA, TOAL, CELF, STAL). The findings indicated that 21 subjects obtained scores below the designated cutoff point on the FLTA, compared to 22 on the TOAL, 18 on the CELF, and 6 on the STAL (a screening test). The researchers attributed the observed differences in group performance in part to the atheoretical nature of the tests, adding: "it is possible that neither the content nor the procedures of these tests may represent the essential forms, features, and systems of adolescent language in their proper proportion and balance" (pp. 260-61).

Some language tests have been constructed according to models that are not supported in research. For example, Muma (1984) was critical of the CELF (Semel & Wiig, 1980) because rather than proposing a broadly-based theoretical model, the test authors cited "various domains of presumed deficits that have been reported in the special education literature" (p. 101).
Noting the methodological flaws inherent in much of the learning disabilities research, Muma concluded that the CELF authors had "managed to stack together several strawmen in the components of the CELF" (p. 102). Elsewhere, Lieberman et al. (1987) argued that the results of existing research into adolescent language development are "incomplete and fragmentary", adding that "until researchers broaden this language base and authors use it in test construction, the development of adolescent language tests seems premature and especially susceptible to problems of test inadequacy" (p. 263).

To summarize, content validity depends upon the existence of a theoretical framework. Without a well-defined theoretical domain, "assessment becomes a circular endeavor of merely claiming a domain, attaching a label, and constructing presumed tasks with their attendant responses, scores, norms, and results" (Muma, 1984, p. 102). Furthermore, by definition, construct validity assumes the existence of a theoretical, or conceptual, framework. The evidence presented would suggest that many standardized language tests may be inadequate with regard to content and construct validity because they are weak on the level of theory.

Validation Research

Test developers have a responsibility to provide evidence of test validity in test manuals (APA, 1985). Reviewers often criticize the adequacy of technical information reported in language test manuals. For example, reviewers of the WORD test, a measure suitable for use with older children up to the age of 11 years, concur that the test authors offer minimal evidence of validity and reliability (Stephens & Montgomery, 1985; Donahue, 1985; Raju, 1985). In their review of the CELF, Stephens & Montgomery (1985) referred to the reported evidence of validity and reliability as "singularly unimpressive" (p. 36). Sommers (1985) criticized the criterion-related evidence in the STAL as "inappropriate" because it used the DTLA as a criterion measure, and "there is no reason to believe that the four subtests from the DTLA measure language processing either" (p. 1332). Following a review of 30 language and articulation test manuals, McCauley & Swisher (1984a) concluded "those criteria that require the application of considerable psychometric expertise, time, and money--criteria related to empirical evidence of validity and reliability--were met least often" (p. 40).

Test validation refers to the process of gathering evidence to support specific inferences made from test scores (APA, 1985). This definition implies that test validation continues beyond the initial research reported in test manuals. Nevertheless, few test validation studies are reported in the literature. A number of studies have been reported which raise questions concerning the criterion-related validity of several adolescent language tests. These are touched upon briefly below.

Stephens and Montgomery (1985) reported that clinicians surveyed found the STAL, a screening test for adolescent language disorder, "too easy", adding that the STAL manual reported a false negative rate of 32% in students passing the STAL but falling below the designated cutoff score on the DTLA. Lieberman et al. (1987) observed that the STAL identified 6 students out of 30 as being at risk. Three other measures in the same study (TOAL, CELF, and FLTA) identified no fewer than 18 students as being language disordered.
Considering the latter three as criterion measures, the criterion-related validity of the STAL seems questionable.

This low failure rate on the STAL might be attributable to the fact that it has a lower cutoff score (10th percentile) than the other measures. Other researchers have concluded, however, that the TOAL and FLTA may, in fact, overidentify individuals as being language disordered. Caskey and Franklin (1986) and Aram, Ekelman and Nation (1984) concur that the TOAL appears to be too difficult for evaluating the lower range of language functioning in adolescents. As an example, Caskey & Franklin found that in a sample of 20 "gifted" students (WISC-R IQ of 128 or higher), 10 obtained adolescent language quotients on the TOAL in excess of 15 standard score points (more than 1 standard deviation) below their IQ scores, thus qualifying them for services as learning disabled (Caskey & Franklin, 1986). These results might be explained in part by item-ordering on the TOAL. Caskey & Franklin observed that when all TOAL items were administered, some individuals achieved several basals after reaching a ceiling. The researchers concluded that items on the TOAL are not well-ordered with respect to difficulty.

Lieberman et al. (1987) addressed the question of construct validity in their study of four adolescent language tests. Differences in group performance were observed among the four measures; however, these were not significant in three out of four instances, and intertest correlations were moderately high. These results were thought to support the theory of a general language construct underlying many language tests, including those claiming to measure distinct skills and abilities.
Studies reported by Damico and Damico (Damico, personal communication, August, 1989), Schery (1985) and Sommers et al. (1978) have resulted in similar conclusions. Although the data suggested some redundancy among the different measures, Lieberman et al. concluded that further research into the factor structure of adolescent language tests is necessary, and that the substitution of one adolescent language test for another is inadvisable at the present time.

The fact that few validation studies are reported in the speech-language literature is disconcerting for two reasons. First, there is insufficient evidence to support the use of any test as a criterion against which to measure the construct validity of new instruments. Second, there is limited information on which to base the revision of existing tests. This point is particularly relevant in view of the fact that a number of standardized language tests have recently been revised. A case in point is the TOAL, which served as a criterion instrument in TLC validation research, and has now been revised as the TOAL-2 (Hammill et al., 1987).

Tests should be revised "when new research data, significant changes in the domain represented, or new conditions of test use and interpretation make the test inappropriate for its intended use" (APA, 1985). An inspection of the TOAL-2 Manual revealed no explicit purpose or rationale for test revision. Validity and reliability data are reported on the basis of research using either the TOAL or the TOAL-2, because, the test authors explain, "The two versions of the test are essentially the same" (p. 47). The main difference between the two tests appears to be in the range of item difficulty.
On the basis of user "comments" that the TOAL was too difficult, a number of "easy" items have been added to seven of the eight subtests. No reference is made in the TOAL-2 manual to independent research investigations of TOAL item difficulty, and it appears that no attempt has been made to correct for suggested problems of item ordering. Thus, although the TOAL-2 may represent an improvement over the previous edition, it appears that the authors have been less than thorough in their efforts toward improving the test.

In summary, validation evidence presented in language test manuals is frequently inadequate, and few studies investigating the validity of standardized language tests are reported in the professional literature. Consequently, insufficient evidence exists to support the use of any one test as a criterion instrument in validation research. Moreover, there is little empirical justification for the revision of existing instruments.

Assessing ESL Populations

In its position paper on social dialects, the Committee on the Status of Racial Minorities (ASHA, 1983) maintained that "no dialectal variety of English is a disorder or a pathological form of speech or language" (p. 23). Bernstein (1989) allows that distinguishing between a communication difference and a communication disorder in language assessment "is not an easy task". Two kinds of errors are possible. The first is to misclassify children with language differences as language disordered. A second type of error is to overlook children with language disorders because of an assumption that they have had insufficient opportunity to learn the language.

Few standardized tests have been developed to identify language disorders among the ESL population.
As a result, clinicians rely on tests that have been developed for use with populations whose first language is Standard English. Among the threats to validity associated with using tests under these circumstances are the unrepresentativeness of test norms, the possibility of culturally-biased test items, lack of test-taking skills among children from minority backgrounds, examiner effects on the test behavior of culturally different children, and motivational factors, all of which may lead to errors in assessment (Evard and Sabers, 1979; Sattler, 1988; Vaughn-Cooke, 1983).

To address these issues, alternatives to existing instruments and procedures have been proposed. Examples include developing norms for distinct linguistic groups, including a percentage of minority groups in standardization samples, modifying test items, administering tests in the subjects' native language(s), and developing new tests (Evard and Sabers, 1979; Sattler, 1988; Vaughn-Cooke, 1983). Others have advocated the use of informal and criterion-referenced procedures to improve the quality of language assessment (Bernstein, 1989; Damico, in press; Holland & Forbes, 1986). Regarding the assessment of ESL adolescents, Larson & McKinley (1987) support the development of new instruments, but maintain that informal language samples should be included in the assessment process.

The Test of Language Competence

A relatively recent publication, the TLC has arrived on the heels of considerable criticism concerning the technical adequacy of standardized language tests. It would appear that TLC authors Wiig and Secord (1985) have been sensitive to this criticism. The test manual reports extensively on the theoretical background and technical characteristics of the test.
These are summarized below in relation to the foregoing discussion.

Theoretical Background. The TLC authors define language competence as a single construct requiring both the understanding of language content, and a responsiveness to the context in which communication occurs. Each of the four TLC subtests is intended to measure a unique aspect of language competence. The subtests are categorized into a model which features semantics (word meaning), syntax (grammar) and pragmatics (rules governing social/verbal communication). Within these three levels, the content of each subtest is further divided into (a) propositions in narrow contexts and (b) propositions in communicative contexts. In the former context, subtest content deals primarily with semantic or syntactic meaning, while in the latter context, subtest content includes pragmatic considerations.

This model of communicative competence represents what the test authors view to be a shift away from the assessment of specific language skills such as phonology, vocabulary or syntax, to the assessment of linguistic processes or strategies. The model was derived from an extensive review of the literature concerning linguistic strategy development. Four general areas were selected on the basis of that review and are represented by each of the TLC subtests. A second literature review led to the selection of specific models within each area from which the subtests were developed. The TLC is then an integrated model based on not one, but several theories of language processing. Each subtest is proposed as a measure of the broader construct, language competence, and is based on existing theory or research.

The same criticism raised by Muma (1984), regarding the patchwork of theory and research that went into the making of the CELF, may also apply to the design of the TLC.
Little contemporary research evidence is reported to support the theoretical design of each subtest. Moreover, relationships among these various bodies of theory and research have not been established. Further examination of the technical aspects of the TLC should contribute to judgements concerning the adequacy of the test model.

Standardization. The TLC was standardized on 1,796 students from three geographic regions in the United States. The cultural characteristics of the sample are described as 86.2% "white", 8.6% "black", and 3.6% "other", while the proportion of distinct linguistic groups, other than Spanish, is not described.

Given these considerations, the appropriateness of TLC norms for use with students in the Vancouver school district, where data for this research were collected, seems questionable. In 1982, for example, 24,524 Vancouver students were found to speak English as a second language (ESL). This figure represented approximately 46% of the total district population. Of these, Chinese, East Indian and Italian were the most common language groups (LaTorre, 1983). More recently, 27% of students surveyed in Vancouver reported that they had learned at least two different languages simultaneously as native languages (Watson-Russell, 1986).

In their discussion of TLC development and standardization, authors Wiig and Secord explain that separate norms for separate races or ethnic groups were not considered because test validity is unrelated to the representativeness of a norming sample; however, the authors advise caution in interpreting the test scores of minority subjects and have provided information to assist in the development of local norms.
Content Validity. Evidence of TLC content validity is claimed on the basis of four criteria outlined by Kretschmer and Kretschmer (1978). These criteria focus on the role of theory in test construction, specifying that tests must be based on a theoretical definition, that there should be contemporary research support for this theoretical framework, and that sufficient information concerning the development of test items should be provided in test manuals to permit the generation of new test items. In response to these criteria it might be argued that there is insufficient contemporary research evidence to support the theoretical scaffolding on which the TLC is based.

In addition to the theoretical evidence offered in support of content validity, item analysis procedures were employed during TLC development. These are not discussed in detail in the Technical Manual, but are reported to have included internal consistency coefficients using Cronbach's Alpha to increase the homogeneity of the subtests, and studies of item difficulty (Secord, personal communication, August, 1989).

Criterion-Related Validity. Evidence of criterion-related validity is reported in the Technical Manual on the basis of correlations obtained between the TLC and three criterion measures: the TOAL, WISC-R, and the Educational Abilities Series (EAS; Thurstone, 1978) for a sample of 28 LLD and 28 controls. LLD subjects were so identified on the basis of school referral procedures which are not described. Controls were described as normally achieving; again, how this determination was made is unclear. Correlations between the TLC, the TOAL and the EAS were calculated separately for both groups. TLC and WISC-R correlations were calculated for the LLD group only. WISC-R data were not reported for the control group.

Results of a discriminant function analysis using TLC and TOAL scores for the same 56 subjects indicated that 96% were correctly classified as language-disabled, while 93% of controls were correctly classified. A subsequent stepwise discriminant function procedure indicated that Subtest Four (Understanding Metaphoric Expressions) contributed most to group discrimination, followed by Subtest Three (Recreating Sentences), and Subtest Two (Making Inferences). Subtest One (Understanding Ambiguous Sentences) did not account for a significant proportion of variance and was not entered into the discriminant function.

In a later study, Santos (1987) investigated the variance in reading comprehension among a combined sample of 20 reading disabled and 20 control subjects (ages 15-17 years). A significant relationship was hypothesized between each of the TLC subtests and the Durrell Analysis of Reading Difficulty. Results indicated that 16 of the 20 reading disabled subjects obtained TLC composite scores more than 1 SD below the mean. In contrast, 19 of 20 control subjects scored at or above the 50th percentile. Accepting group membership as a criterion, these results might be claimed to support the concurrent validity of the instrument.

Construct Validity. Evidence of TLC construct validity is offered on the basis of correlations between the TLC and WISC-R scores for the same 28 LLD subjects described above. Higher (convergent) correlations were observed between scores on the TLC subtests and VIQ (.48 to .78), while lower (divergent) correlations were observed between scores on the TLC subtests and PIQ (.18 to .53). While this is a reasonable interpretation, it may once again be pointed out that these correlations are available for the language disabled group only.
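The classification rates reported for the discriminant function analysis can be related back to subject counts. The following is a minimal sketch and is not taken from the Technical Manual. The function name `counts_matching_percent` is hypothetical, and the sketch assumes (the source does not say so) that each percentage was computed within its own group of 28 subjects and rounded to the nearest whole number.

```python
def counts_matching_percent(reported_percent, group_size):
    """Return every count k (0..group_size) whose rate rounds to reported_percent.

    Assumption: the reported figure is k / group_size * 100 rounded to the
    nearest whole number. The source does not state its rounding rule.
    """
    matches = []
    for k in range(group_size + 1):
        # int(x + 0.5) rounds half up for non-negative x
        if int(k / group_size * 100 + 0.5) == reported_percent:
            matches.append(k)
    return matches


# Group sizes from the text: 28 LLD subjects and 28 controls.
lld_counts = counts_matching_percent(96, 28)
control_counts = counts_matching_percent(93, 28)
print(lld_counts, control_counts)
```

Under these assumptions, the only counts consistent with the reported rates are 27 of 28 LLD subjects and 26 of 28 controls, which would amount to three misclassified subjects among the 56.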
Further evidence of construct validity is reported on the basis of intersubtest correlations obtained from the standardization sample at various age intervals. These range from .17 to .50. Moderate correlations are explained by the fact that each subtest represents a different content domain. This is reasonable; however, more support for this interpretation could have been demonstrated if subtest-to-total-test correlations had been reported. Assuming that the total test score is an overall measure of language competence, and assuming that each subtest measures some aspect of language competence, subtest-to-total-test correlations should be higher (Anastasi, 1982).

In a separate investigation, subtest intercorrelations were calculated using data obtained from the same group of 28 LLD subjects and 28 controls described above. Correlations ranged from .24 to .57 for the LLD subjects and from -.1 to .39 for the controls. In contrast, Santos (1987) reported a considerably higher range of subtest intercorrelations (.47 to .75) for a mixed sample of reading disabled/control subjects. The higher range of correlations observed in Santos' investigation is explained by the heterogeneity of the subject sample involved.

Factor analysis results are reported on the basis of subtest intercorrelations obtained from the standardization sample. The percentage of variance explained by the first unrotated factor at all but two age levels was greater than 90%. A subsequent oblique rotation factor analysis using TLC item intercorrelations yielded four factors at the 9-11 and 12-17 year age groups. These results are claimed to support both the existence of an underlying language factor and the specificity of the TLC subtests. Thus, "the TLC emerges as an attractive blend of both worlds--a strong general factor supported by four specific subgroups of items" (Wiig and Secord, 1985, p. 47).
Internal Consistency. The validity of a test is a function of its reliability. One method of estimating test reliability that is useful in the evaluation of test validity is to examine the degree of association among test items, or the internal consistency of the test. Internal consistency coefficients (Cronbach's Alpha) are reported for the four TLC subtests and the total test at each age level in the standardization sample. Because the TLC was intended to measure language competence across different content areas, greater homogeneity was expected within subtests than across items (Wiig and Secord, 1985, pp. 2, 48). In fact, the range of coefficients reported for the TLC subtests is from .52 to .79, while the range of coefficients reported for the TLC composite is from .75 to .82. The lower range of coefficients observed for the TLC subtests is explained by the test authors as the effect of test length on estimates of reliability. That is, the four TLC subtests "were designed to be as short as possible" in order to reduce administration time (Wiig and Secord, 1985, p. 48). In limiting the length of each subtest, internal consistency estimates have likewise been reduced. The combined length of the total test is considered to have resulted in a higher range of internal consistency coefficients than observed for the TLC subtests.

Interrater Reliability. Another consideration in the evaluation of test reliability is the extent of interrater agreement on subjectively scored items. Interrater reliability figures are reported in the Technical Manual for Subtests Three and Four, which require some judgement in scoring. Interrater agreement was defined as the percentage of agreements observed between sixteen raters and the test authors on one protocol. The final estimates were 97% for Subtest Three and 98% for Subtest Four.
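The reliability ideas in this section can be made concrete with a short sketch. It is illustrative only: the function names and the toy scores are invented for the example, and none of the data come from the TLC manual. The Spearman-Brown formula is a standard textbook relation, offered here as one way to see the test-length effect the test authors describe, not as the method they used.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-subject item-score lists."""
    k = len(item_scores[0])  # number of items
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


def percent_agreement(rater, key):
    """Percentage of items on which a rater's scores match the key."""
    agreements = sum(1 for r, k in zip(rater, key) if r == k)
    return 100 * agreements / len(key)


# Invented toy data: 4 subjects x 3 items (not TLC data).
toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
print(cronbach_alpha(toy))        # perfectly consistent items: about 1.0
print(spearman_brown(0.5, 4))     # test four times as long, starting at .5
print(percent_agreement([3, 2, 1, 0], [3, 2, 1, 1]))  # 3 of 4 scores match
```

The Spearman-Brown call shows why a composite built from several short subtests can have higher internal consistency than any single subtest, under the assumption that the added material is parallel to the original.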
These interrater figures were obtained following a three-step procedure during which all sixteen raters were trained in the application of scoring criteria by author Secord. Intermediate measures of agreement were obtained, followed by additional instruction. Not surprisingly, an increase in the percentage of agreements was observed between the intermediate and final calculations. This method of estimating interrater agreement illustrates the effect of direct training on scorer reliability. Sample protocols and scoring criteria identical to those used in the training procedure are included in the Administration Manual. The magnitude of the interrater reliability coefficients obtained in this study will give some indication as to the adequacy of the scoring guidelines.

To summarize, the TLC represents a significant improvement over many other standardized language tests currently available for use with older children and adolescents in several respects. First, the test is based on a clearly stated theoretical model. Second, the test manual includes extensive information concerning the technical characteristics of the test. The information as reported, however, raises a number of questions. First, there has as yet been no attempt to examine the relationship between TLC performance and VIQ for control subjects. Second, there is some evidence to suggest that at least one of the subtests does not contribute to LLD/control group discrimination; however, the test authors have not addressed this issue. Finally, the validity of using the TLC with ESL subjects has not been explored.

Informal Assessment

As the general dissatisfaction with standardized language tests continues to increase, more extensive use is being made of informal assessment procedures.
Informal language assessment involves the use of interviews, questionnaires, observational techniques, and other non-standardized procedures, including informal language sampling and analysis. Language sampling is the process of eliciting, recording and transcribing a sample of spontaneous language, and then analyzing it to determine areas of strength and weakness. The focus of discussion here will be on the informal language sample.

Elicitation procedures used in the collection of language samples vary. These include picture stimuli, where the subject is shown a picture and is asked to describe it. Other methods include prompting statements, such as "tell me about..", or direct questions. Unstructured conversation between the clinician and subject is the preferred method of obtaining a language sample (Larson & McKinley, 1987; Atkins & Cartwright, 1982); however, this is not always practical due to situational and time constraints (Simon, 1984).

Opinion varies as to the length of the sample required to ensure adequate coverage of the subject's language. The minimum sample required to calculate Mean Length of Utterance (MLU) (Nice, 1925), for example, is 50 utterances. Crystal et al. (1976) recommend continuous sampling of 15 - 30 minutes. Muma et al. (1982) suggest sampling over time in a variety of situations. Little research evidence exists to help resolve the issue. The consensus appears to be that 50 - 100 utterances is an acceptable minimum (Darley & Spriestersbach, 1978; Wiig & Semel, 1984).

The advantages and disadvantages of language sampling have been widely discussed (Bloom & Lahey, 1978; Larson & McKinley, 1987; Muma et al., 1982; Wiig and Semel, 1984).
Often cited among the advantages are the flexible administration procedures, and the opportunity for observing language behaviors in various contexts. Moreover, because language sampling is a descriptive procedure, proponents argue that it provides valuable information for planning intervention strategies.

Among the disadvantages of language samples is that they do not yield standardized scores and therefore are not useful for making classification decisions. They lack objectivity, and because administration procedures are unstandardized, they are not replicable, and results may vary from sample to sample. More practical disadvantages include the time required to obtain and transcribe language samples, and the expertise required to interpret them.

Larson & McKinley (1987) have suggested that the greatest difficulty with language samples is knowing how to analyze them. This is particularly the case in analyzing the spontaneous language of older children and adolescents. Numerous language sampling and analysis procedures have been developed (Crystal et al., 1976; Lund & Duchan, 1983; Miller & Chapman, 1983; Muma, 1981; Tyack & Gottsleben, 1974). Many of these are referenced to a developmental framework of early language development, and are considered inappropriate for use with older children and adolescents. There is at present little empirical evidence concerning the critical stages of language development within this age group, or the sequence in which these occur (Hammill et al., 1987; Larson & McKinley, 1987; Lieberman et al., 1987); however, this does not preclude the use of language samples with these groups.
Despite its limitations, language sampling yields valuable descriptive information concerning social-verbal skills and expressive language characteristics at any age level (Larson & McKinley, 1987; Tibbits, 1982).

The expressive language characteristics of language-disabled youth have been described in some detail by Larson and McKinley (1987), Tibbits (1982) and Wiig and Semel (1984). These include word-finding difficulties such as overuse of fillers (um, uh), pronouns, and circumlocutions. In addition, language disabled youth demonstrate patterns of deficient syntactic development. As a result, they have difficulty mastering the rules which govern the construction of complex sentences, and as a group show lower MLU than their non-disabled peers. It might be hypothesized that language disabled adolescents as a group would demonstrate fewer complex sentence structures in natural conversation than nonhandicapped adolescents, and that this difference would be demonstrated by an informal language sample analysis.

Summary

Language assessment is a hierarchical process that involves the use of formal and informal assessment procedures. Issues surround the use of both; however, they are generally viewed as complementary procedures. The Test of Language Competence (TLC) is a relatively new test designed for use with children and adolescents between the ages of 9 and 18 years. It represents a significant improvement over many tests currently available for use with this age group; however, evidence of TLC validity is limited to that reported in the Technical Manual, and one other unpublished investigation. The purpose of the present study was to further investigate the technical characteristics of the TLC.
A second purpose was to compare the performance of language disabled and control subjects using an informal language sample.

Research Questions

This study compared data obtained by a mixed sample of LLD and control group subjects on the TLC and an informal language sample. The following research questions were addressed:

1. What are the internal characteristics of the TLC with respect to:
   a. item difficulty indices
   b. item discrimination indices
   c. internal consistency of subtests and composite
2. What is the interrater reliability of the subjectively-scored sections of the TLC?
3. What is the relationship between the TLC and VIQ?
4. How effectively does the TLC discriminate between language-disabled and control groups? To what extent does this discrimination reflect individual differences in IQ?
5. Will scores obtained by ESL subjects differ significantly from scores obtained by EFL subjects on six variables (VIQ, PIQ, TLC Subtests One through Four)?
6. Do language disabled and control subjects differ in characteristics of their informal language?

Chapter III

Methodology

This chapter presents a description of research methodology. Sampling procedures and sample characteristics are described, followed by a discussion of data collection procedures and data analysis.

Sampling Procedures

Data were collected in School District 39 (Vancouver). Members of the language disabled group (LLD) were selected from classrooms for the communicatively disordered located in two schools, one elementary and one secondary, situated on the city's east side. Children are declared eligible for placement in these classes on the basis of psychoeducational and speech-language assessment data.
Eligibility criteria allow a significant discrepancy between VIQ and PIQ on the WISC-R in favour of Performance, although this may not always occur. Children must demonstrate language delays of two or more years in the primary grades and three years in the intermediate/secondary grades, and their language deficits must not be the result of physical, intellectual, or emotional impairments. Although children in these classes may speak English as a second language, this is not considered to be the primary cause of their language difficulties (Education Services Group, 1985). For the purpose of this study, children were required to have no greater than average verbal ability as measured on the WISC-R Verbal Scale (VIQ < 115), and have at least average nonverbal ability as measured on the WISC-R Performance Scale (PIQ > 85).

Each student in both communications classes received an information package to be taken home and read by parents. The package contained a letter explaining the purpose of the study, parent/student consent forms, and a brief questionnaire (Appendix A) concerning family linguistic background, area of residence and educational history. A total of 25 students agreed, with parent permission, to participate in the study. Existing WISC-R scores were then obtained from student records. Students whose WISC-R data were more than two years old were retested. Those who obtained PIQs below 85 were eliminated. No subjects were eliminated on the basis of VIQ. The final number of students in the LLD group was 23.

Control group members (controls) were selected from two regular education classes, one in the same elementary school as LLD subjects, and one in a neighboring high school.
Teachers and counsellors were provided with lists of matching criteria; these included the age, sex and linguistic background of each subject in the LLD group. Staff were asked to nominate students who met these criteria, and who were judged to be of average classroom achievement on the basis of school grades. Each nominated student received an information package addressed to parents. Students who, with parent permission, agreed to participate were administered the WISC-R. Students of average verbal ability (VIQ 85-115) and at least average nonverbal ability (PIQ > 85) were retained. Students receiving learning assistance, or instruction in English as a second language, were excluded, as were those whom school counsellors considered to be of above average achievement according to school records. A total of 23 subjects were retained.

The final sample consisted of 46 subjects ranging in age from 11 to 15 years. Subjects were matched for age, sex and linguistic background. A summary of sample characteristics is provided in Tables 1 - 2.

Table 1
Sample Characteristics: Age, Sex, English Second Language (ESL), English First Language (EFL)

                                      Number in Each Group
Characteristic                Language Disabled    Controls        Combined
Sex
  Male                               13               12              25
  Female                             10               11              21
  Total                              23               23              46
Language
  ESL                                 9               11              20
  EFL                                13               12              25
  Unknown                             1                0               1
  Total                              23               23              46
Age Statistics (in years/months)
  Range                         11-3 to 15-5     11-3 to 15-1
  Mean                              13.4             13.5

Table 2
Cultural Background of Sample

Language          No. Language Disordered    No. Controls    No. Combined
Croatian                     1                    -                1
Japanese                     1                    -                1
Portuguese                   -                    1                1
Greek                        1                    1                2
Native Indian                1                    1                2
Filipino                     1                    1                2
Punjabi                      1                    1                2
Vietnamese                   -                    2                2
Hindi/Fijian                 2                    2                4
Chinese                      7                    6               13
English                      8                    8               16
Total                       23                   23               46

Measuring Instruments

The Test of Language Competence (Wiig and Secord, 1985)

The Test of Language Competence (TLC) was developed for use with older children and adolescents between the ages of 9.0 and 18.11 years. It is intended to assess delays in the development of language competence. "Language competence" is defined by the test authors as a unitary construct consisting of (a) the understanding and expression of language and (b) sensitivity to the social context within which the communication occurs. The TLC consists of four subtests, each intended to measure a unique aspect of language competence. A brief description of each subtest follows:

Subtest One (Understanding Ambiguous Sentences)

The subject is presented with one sentence that may be interpreted in two ways (e.g. "I saw the girl take his picture"). The task is to verbalize both interpretations. The subtest includes thirteen scoreable items, and two training items. The entire test is given; no basal or ceiling levels are established. The subtest may be discontinued if the subject fails to respond three times in a row; however, this is not a requirement. The subject is allowed a maximum response time of "10-20" seconds. Scoring is objective, and weighted as follows: two correct responses = 3 points; one correct response = 1 point; no correct responses = 0 points. The range of raw scores possible is from 0 to 39.

Subtest Two (Making Inferences)

The subject is presented with two sentences that describe the beginning and ending of a situation (e.g. "Jack went to a Mexican restaurant"; "He left without giving a tip").
The task is to verbalize two inferences that describe what might have transpired between these two instances. The subtest includes twelve scoreable items, and one training item. No basal or ceiling rules apply; the subtest may be discontinued if the subject fails to respond three consecutive times. The response time allowed per item is 60 seconds. Scoring is objective and weighted as in Subtest One. Possible raw scores range from 0 to 36.

Subtest Three (Recreating Sentences)

The subject is presented with three stimulus words and a picture depicting a social situation (e.g. a picture of people hiking is accompanied by the words 'fall', 'leg', 'and'). The words are read aloud to the subject, whose task is to combine the three stimulus words in a sentence that might have been spoken by someone in the picture. The subtest contains 13 scoreable items and 2 trial items. The maximum allowable response time is 60 seconds. Items are scored twice. They are scored once for correctness on the basis of scoring criteria provided in the test manual; this system entails some subjectivity on the part of the examiner, and rates responses on a scale of 0, 1, or 3. Items are scored a second time for the number of stimulus words included in each sentence (three stimulus words = 3 points, two stimulus words = 1 point, and 1 or 0 stimulus words = 0 points). These two sets of scores are then combined to obtain the subtest raw score. Combined item scores thus range from 0 to 4, or 6 when both ratings are 3. The range of scores possible for the subtest total is from 0 to 78.

Subtest Four (Understanding Metaphoric Expressions)

The subject is presented with a common metaphor (e.g. "She sure casts a spell over me"). Each item is presented verbally, then in print. The task is first to explain what the metaphor means. Second, the subject is to choose the correct interpretation from among four options. The subtest includes 12 items plus two trials. Fifteen seconds is allowed for the first part of the task, and 45 seconds for the second. No basal or ceiling rules apply; the subtest may be discontinued after three consecutive failures to respond. Points per item are awarded as follows: 3 points for two correct responses, 1 point for one correct, or 0. Possible total raw scores range from 0 to 36.

The TLC package includes a Technical Manual, an Administration Manual, and a Stimulus Manual. Subtest One and Two items are presented verbally by the examiner, and then in print using the Stimulus Manual. In this way, the subject first hears and then reads the items. Subtest Three items are read aloud by the examiner while the subject looks on. For Subtest Four, the first part of each item is presented verbally, and then in print using the Stimulus Manual. The second part of each item is presented visually while the examiner reads the four options. All responses are verbal, and recorded on the test protocol by the examiner.

The TLC yields scaled scores for each of the four subtests (X = 10; SD = 3), and a standard score for the TLC Composite (X = 100; SD = 15). A Partial test score, consisting of Subtests Three and Four, may be calculated. The Partial composite is recommended for screening purposes. Scaled scores and standard scores may be converted to age equivalent or percentile scores. Confidence intervals are provided. A detailed discussion of TLC validity and reliability is located in Chapter Two.
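The item weighting described for the four subtests can be illustrated with a short sketch. The function names below are hypothetical and are not taken from the TLC manuals; only the point values and item counts come from the subtest descriptions in this chapter.

```python
# Illustrative sketch only: helper names are hypothetical, point values are
# those given in the subtest descriptions (3/1/0 weighting).

def weighted_points(n_correct: int) -> int:
    """Subtests One, Two and Four: 2 correct = 3, 1 correct = 1, 0 correct = 0."""
    return {0: 0, 1: 1, 2: 3}[n_correct]


def word_count_points(n_stimulus_words: int) -> int:
    """Subtest Three word count: 3 words = 3, 2 words = 1, 1 or 0 words = 0."""
    return {3: 3, 2: 1}.get(n_stimulus_words, 0)


def subtest_three_item(holistic: int, n_stimulus_words: int) -> int:
    """Combine the holistic rating (0, 1 or 3) with the word-count points."""
    if holistic not in (0, 1, 3):
        raise ValueError("holistic rating must be 0, 1 or 3")
    return holistic + word_count_points(n_stimulus_words)


# Maximum raw scores implied by the number of scoreable items per subtest.
max_subtest_one = 13 * weighted_points(2)
max_subtest_two = 12 * weighted_points(2)
max_subtest_three = 13 * subtest_three_item(3, 3)
max_subtest_four = 12 * weighted_points(2)

# Every combined item score that Subtest Three can produce.
possible_item_scores = sorted(
    {subtest_three_item(h, w) for h in (0, 1, 3) for w in (0, 1, 2, 3)}
)
```

Under these rules a combined Subtest Three item score of 5 cannot occur, which is why the possible item values are not a continuous range.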
Wechsler Intelligence Scale for Children - Revised (WISC-R) (Wechsler, 1974)

The WISC-R is an individually-administered test of general intelligence suitable for use with children between the ages of six and sixteen years. The test is comprised of twelve subtests organized into two scales. The Verbal Scale contains six subtests and provides a measure of verbal reasoning ability. The Performance Scale is made up of the remaining six subtests and provides a measure of nonverbal reasoning ability. Subtests yield scaled scores with a mean of ten and standard deviation of three. Ten of the subtest scaled scores are combined to form the Verbal, Performance and Full Scale IQs, each with a mean of 100 and a standard deviation of 15. The validity and reliability of the WISC-R are well-supported in research. Sattler (1988) and Kaufman (1979) provide detailed discussions of the WISC-R and its psychometric properties.

Informal Language Sample

Language samples were obtained by tape-recording a fifteen-minute conversation between the examiner and each subject. An imperative format (e.g. "Tell me about...") was used to elicit the language samples. The imperative format was chosen for three reasons. First, this format is replicable (Atkins and Cartwright, 1982). Second, there is evidence to suggest that imperatives elicit more fluent and more complex language than other procedures (Wiig and Semel, 1984). Third, this method was appealing for use with language disabled adolescents whose spontaneous conversation might be limited. Stimulus items used to elicit language samples are presented in Appendix C.

Five language samples were randomly selected from each group of subjects.
Fifty utterances from the middle of each sample were transcribed for analysis using the LARSP procedure (Crystal et al., 1976), described below. Utterances were excluded from the analysis if they were partially or completely unintelligible, if they were repetitions of earlier responses, or if they were unfinished (e.g. "I went to the..."). Single word utterances, and starters and fillers (like, um, you know) were not included. Repetitions due to dysfluency were treated as one utterance (e.g. "I went to the...to the store").

Methods of Data Analysis

Item and Test Analysis. The reliability and validity of a test are largely a function of its item characteristics. Item analysis is a term applied to the examination of item characteristics. Typically it includes measures of item difficulty, item discrimination, and item homogeneity.

Item difficulty is defined as the proportion of persons passing or failing a test item. It is related to the total distribution of test scores, and to test reliability. For dichotomously scored items, item difficulty is expressed in terms of p values, or the percentage of persons passing each item. An index of item difficulty for items scored on a continuous scale, such as is the case with the TLC, is the mean score for each item (Nunnally, 1978).

Measures of item discrimination indicate the extent to which a test item differentiates among individuals on the behavior being measured (Anastasi, 1982). Item discrimination may be evaluated on the basis of correlations between each item and an external criterion, or between each item and the total test score. Point biserial correlations are appropriate for use when test items are scored dichotomously. When items are scored on a multipoint scale, as in the case of the TLC, product moment correlations are appropriate.
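As a minimal sketch of the item-total approach to discrimination, the following computes a product moment correlation between each item and the total score. The scores are invented for illustration and are not data from this study; the total here includes the item itself, which is an assumption about how the index is formed.

```python
from math import sqrt


def pearson_r(x, y):
    """Product moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented data: rows are subjects, columns are items scored 0, 1 or 3.
scores = [
    [3, 3, 1],
    [3, 1, 1],
    [1, 1, 0],
    [1, 0, 3],
    [0, 0, 1],
]

# Total score per subject (uncorrected: each item contributes to its own total).
totals = [sum(row) for row in scores]

# One item-total correlation per item (column).
item_total_r = [
    pearson_r([row[j] for row in scores], totals) for j in range(len(scores[0]))
]
```

The same function could be applied with an external criterion, such as a verbal IQ score, in place of the total.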
An item demonstrates adequate discrimination when its correlation with the total test score reaches .3 or better (Nunnally, 1978).

Estimates of internal consistency describe the degree of association among test items. The Kuder-Richardson formulas (K-R 20 and K-R 21) are among the most commonly used methods of calculating internal consistency. Cronbach's Alpha is a generalization of the K-R 20 formula suitable for use with items scored on a multipoint scale. Hoyt's Analysis of Variance is a less frequently used procedure that produces the same results as K-R 20. Internal consistency estimates of .8 or better are considered to indicate acceptable test reliability (Nunnally, 1978).

In this study, TLC item characteristics were analyzed using LERTAP (Nelson, 1974), an extensive test analysis program. In order to study the effectiveness of the subjective scoring criteria during this analysis, Subtest Three (Recreating Sentences) was coded as three subtests: Subtest Three (H), Subtest Three (W), and Subtest Three (T). These correspond to the holistic, word count, and total subtest scores respectively. TLC items are scored on a multipoint scale; therefore, the mean and standard deviation of each item were inspected for relative degrees of difficulty. Likewise, product moment correlations were obtained between items and subtest/total test scores to evaluate item discrimination. In addition, correlations were calculated between each TLC item and an external criterion (VIQ). Hoyt coefficients were obtained for each of the TLC subtests and the composite. Cronbach's Stratified Alpha was calculated for the test composite only.

Correlational Analyses. Pearson product-moment correlation coefficients (Pearson r) were obtained between TLC subtests and the test composite.
TLC and WISC-R scores were correlated to study the relationship between the TLC and VIQ. Correlations were obtained using the computer program SPSSX (Lai, 1986).

Interrater Reliability. In order to measure the extent of interrater agreement on the subjectively scored sections of the TLC, reliability coefficients (Pearson r) were calculated between scores obtained from two independent raters. Rater One was the graduate student researcher, and Rater Two was a speech-language pathologist who uses the TLC in her employment with the Vancouver school district.

Regression Analysis. Regression analysis is a method by which scores on a dependent or criterion variable are predicted from scores on an independent variable (Pedhazur, 1982). The extent to which an observed criterion score deviates from its predicted score is the residual, or error of estimate for that individual. Residual scores represent that proportion of variance which is unique to the criterion and unaccounted for by variance in the independent variable. In this study, a series of simple regression analyses were calculated using Verbal IQ as the independent variable, and each of the four TLC subtests as dependent variables. The purpose was to obtain, for each individual, standardized residual scores representing that proportion of variance unaccounted for by Verbal IQ, and unique to the TLC. Residual scores were standardized to have a mean of 0 and standard deviation of 1. These analyses were conducted using the computer program SPSSX.

Stepwise Discriminant Function Analysis. Discriminant function analysis is an extension of regression analysis suitable for use with multiple variables when the criterion is group membership (Pedhazur, 1982).
In his discussion, Klecka (1980) divides discriminant analysis into two levels of activity: interpretation and classification. At the level of interpretation, one is concerned with obtaining the canonical discriminant functions. A discriminant function is a composite of variables that has maximum potential for discriminating between groups. The number of functions possible is one less than the number of groups, or the number of discriminating variables, whichever is smaller. Discriminant functions are applied at the second level of activity to predict group membership.

In stepwise discriminant function analysis, the variable contributing most to group discrimination is entered first into the equation. That variable is then paired with each of the remaining variables, and the most discriminating of the remaining variables is entered. This stepwise selection of variables continues until all variables in the function have been entered, or until the remaining variables provide no significant contribution to group discrimination. Wilks' lambda (U) is a measure of residual discrimination employed in stepwise procedures. Values of U range from 0 to 1, with 0 indicating maximum group discrimination, and 1 indicating no discrimination. The U statistic may be converted to an F statistic and tested for significance. By inspecting the U and F statistics, it is possible to distinguish the discriminating variables from those which do not contribute substantially to group discrimination (Klecka, 1980).

In this study, the computer program BMDP-P7M (Dixon, 1988) was used to calculate a forward stepwise discriminant function analysis. Each of the four TLC subtests was employed as a discriminating variable, and the criterion was group membership.
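Two of the quantities described above can be sketched briefly. The example is restricted to a single discriminating variable, for which Wilks' lambda reduces to the ratio of the within-groups sum of squares to the total sum of squares; the multivariate case used by stepwise programs replaces these sums with determinants of matrices and is not shown. The function names and scores are illustrative assumptions, not part of BMDP-P7M.

```python
def n_discriminant_functions(n_groups: int, n_variables: int) -> int:
    """Number of possible functions: groups minus one, or the number of
    discriminating variables, whichever is smaller."""
    return min(n_groups - 1, n_variables)


def wilks_lambda_univariate(groups):
    """Wilks' lambda for ONE variable: within-groups SS / total SS.

    `groups` is a list of score lists, one list per group.
    """
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_within += sum((x - group_mean) ** 2 for x in g)
    return ss_within / ss_total


# Two groups and four discriminating variables, as in this study's design.
functions_possible = n_discriminant_functions(2, 4)

# Invented scores: well-separated groups give a value near 0,
# groups with identical means give a value of 1.
u_separated = wilks_lambda_univariate([[1, 2, 3], [11, 12, 13]])
u_identical = wilks_lambda_univariate([[1, 2, 3], [1, 2, 3]])
```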
The purpose of the analysis was to observe first the relative contribution of each subtest to group discrimination, and second, the capacity of the test to predict group membership. This was followed by a second discriminant function analysis using Subtest Three (Holistic Scoring) in place of the Subtest Three total, together with the remaining three subtests as discriminating variables. The purpose was to determine if removal of the Word Count scoring would significantly alter the results.

Adjusted Discriminant Function Analysis. The standardized residual scores obtained by the regression analyses represented that proportion of variance unique to the TLC and unaccounted for by VIQ. In order to observe the relative contribution of each TLC subtest to group discrimination after the effects of Verbal IQ had been removed, these standardized residual scores were entered into a further discriminant function analysis using the computer program BMDP-P7M.

Hotelling's T-Square. Hotelling's T-square (T²) is a multivariate technique for measuring the distance between group means on two or more variables simultaneously. It can be converted to an F statistic and evaluated for significance. In this study, a total of 22 subjects were known to speak English as a second language (ESL). In order to rule out the effect of ESL on test performance, individuals were assigned to one of two groups (ESL or non-ESL). Group means were then tested for equality on six variables: WISC-R Verbal IQ, WISC-R Performance IQ, and each of the four TLC subtests. The computer program used for this analysis was BMDP-3D.

LARSP. LARSP, presented in The Grammatical Analysis of Language Disability (Crystal et al., 1976), is a syntactic analysis procedure.
A computerized version of LARSP (Bishop, 1985) was used in this study. The program analyses sentences at three levels. First, sentences are broken into words, and each word is classified as a part of speech. At the next level, sentences are divided into noun, verb or adverbial phrases. Relationships between clauses are analyzed at the final stage of the program. Structures at each level are assigned to one of seven developmental stages as frequency counts. Mean length of utterance (MLU) is also calculated. Results are printed out on the LARSP summary sheet. The seven developmental stages represented in the LARSP analysis are as follows:

Stage I:    One Word Sentences
Stage II:   Two Word Sentences
Stage III:  Three Word Sentences
Stage IV:   Sentences of Four Words or More
Stage V:    Recursion
Stage VI:   System Completion
Stage VII:  Discourse structure, syntactic comprehension and style

Stage One (single word) utterances were not included in this analysis. Stages Six and Seven are not handled by the Bishop version of LARSP and were not dealt with in this study. The analysis concentrated on Stages Two, Three, Four, and Five. Stage Five was of principal interest in this study because it represents that point at which children begin to use complex patterns of sentence structure. Stage Five is defined as follows:

Essentially what the child has to learn here is a set of connecting devices which can be used to interrelate clauses, and the transformational processes whereby one can be used within ('embedded within') another. Once these devices have been learned, of course, the process can continue indefinitely, longer and more complex sentences being built up as a result.
It is this feature of language, to take a basic structure and use it repeatedly to produce extensive sequences, which is the primary characteristic of the creativity of language...It is accordingly a stage of great significance in normal development, as at this point the range of expression available to the child is enormously increased. (Crystal et al., 1976, p. 76).

If LLD adolescents do, in fact, use fewer complex sentence structures than non-LD youngsters, some divergence in the performance of LLD and control group subjects might be expected at the Stage Five level.

The LARSP system was selected over other available language analysis procedures for several reasons. First, it is suitable for use with both children and adults, and with speakers of nonstandard English (Crystal et al., 1976; Holland & Forbes, 1986; Tibbits, 1982). Second, it is based on a descriptive framework of English grammar (Quirk, Greenbaum, Leech, & Svartvik, 1972), and thus has adequate content validity (Lieberman et al., 1987). Finally, the LARSP procedure has been found to discriminate among groups of language disabled and control subjects at different age levels (Hawkins & Spencer, 1985; Kearns & Simmons, 1983; Penn, 1983; Penn & Behrmann, 1986).

Wilcoxon Rank Sum Test. The Wilcoxon rank sum test is a nonparametric test of significance of group differences, appropriate for use with two independent samples when data are in the form of frequency counts. Data from two groups are combined and assigned ranks. The ranks are then summed, and a value (Rj) is obtained for each group. The probability of this value is then tested in relation to the theoretical distribution of Rj. In this study, rank sum tests were calculated using data obtained from the LARSP procedure, described above.
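The ranking and summing steps of the rank sum test can be sketched as follows. The percentages are invented for illustration, with five values per group, and tied values are given the average of the ranks they occupy, which is a common convention assumed here. The significance step (comparison with tabled critical values or a normal approximation) is not shown.

```python
def rank_sums(group_a, group_b):
    """Pool both groups, rank the pooled values, and return (R_a, R_b).

    Tied values receive the mean of the ranks they span (midranks).
    """
    pooled = sorted(group_a + group_b)
    midrank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Values at 0-based positions i..j-1 hold 1-based ranks i+1..j.
        midrank[pooled[i]] = (i + 1 + j) / 2
        i = j
    r_a = sum(midrank[v] for v in group_a)
    r_b = sum(midrank[v] for v in group_b)
    return r_a, r_b


# Invented percentages of structures at one developmental stage.
group_one = [12.0, 8.5, 15.0, 9.0, 11.0]
group_two = [18.0, 14.0, 20.5, 16.0, 15.0]

r_one, r_two = rank_sums(group_one, group_two)

# The two rank sums must always add up to N(N + 1) / 2 for N pooled values.
n_total = len(group_one) + len(group_two)
check = n_total * (n_total + 1) / 2
```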
The purpose was to determine whether LARSP would discriminate between LLD and Control group members, particularly at the Stage Five level. Frequency data were converted to percentages in order to adjust for fluency. Rank sum tests were calculated using procedures outlined by Ferguson (1976).

Method

Subjects were administered the Test of Language Competence (TLC), the Wechsler Intelligence Scale for Children - Revised (WISC-R), and an informal language sample. Testing was conducted over a six month period by two graduate students trained in the administration, scoring and interpretation of standardized tests. The WISC-R was not re-administered to LLD group students who had been tested within the previous two years. In these instances, WISC-R testing had been conducted by school psychologists employed in School District 39.

Chapter IV

Results

This chapter presents the results of the data analysis. Descriptive and item analysis statistics are presented first, followed by results of the correlational analyses. Results of the discriminant function analyses are described next. Finally, results of Hotelling's T-square and Wilcoxon Rank Sum Tests for equality of group means will be summarized.

Descriptive Statistics

Group descriptive statistics are summarized in Tables 3 - 4. These include the mean, standard deviation and standard error of measurement for each group (LLD and Control) as well as the combined sample on each of the TLC and the WISC-R Verbal and Performance Scales. TLC results are reported here in raw scores. The mean subtest and composite scores obtained by each group and the combined sample were converted to scaled scores and standard scores respectively using TLC norms. The mean age for each group (13 years) was used to make this conversion. Results are located in Table 5.
These indicate that LLD and Control subjects obtained generally lower than average scores on all but two subtests.

Item and Test Analysis

Table 6 presents results of the LERTAP analysis. These include the means, standard deviations, and standard errors of measurement for each of the TLC subtests, and the total test using the combined sample. In this instance, Subtest Three (Recreating Sentences) was analyzed as three subtests (Holistic, Word Count, Total); however, only the subtest composite (Subtest Three-Total) was included in the total test statistics. Individual item statistics (mean, standard deviation, item correlations) for each subtest are summarized in Appendix B. Appendix B includes the percentage distribution of subjects on each item.

Table 3
WISC-R Means, Standard Deviations (SD) and Standard Error of Measurement

                                      WISC-R Intelligence
                         Verbal              Performance            Full Scale
Group                 Mean    SD   SEM     Mean    SD   SEM      Mean    SD   SEM
Language Disabled     74.8   8.9   1.9    101.0  10.7   2.2      86     8.3   1.7
Control              103.0   9.1   1.9    111.0  11.0   2.3     107.3  10.4   2.1

Table 4
TLC Means, Standard Deviations (SD) and Standard Errors of Measurement (SEM)

                                    LLD                   Controls
Subtest      No. of Items    Mean    SD    SEM      Mean     SD    SEM
Ambiguous         13          7.4    6.8   1.4      19       8.0   1.7
Inferences        12         20.9    5.0   1.0      29.7     3.3    .7
Recreate          13         46.8   10.1   2.1      63.3     6.9   1.4
Metaphors         12          7.3    5.5   1.1      21.8     8.5   1.8
TLC Total         50         82.4   20.8   4.3     133.7    19.8   4.1

Table 5
Scaled Scores and Standard Scores for the Mean Age Group (13 years) on the TLC Subtests and Composite

                                 TLC Scaled Scores
TLC Subtests               LLD    Controls    Combined
Ambiguous Sentences         03       06          04
Making Inferences           04       08          06
Recreating Sentences        03       08          06
Metaphoric Expressions      03       07          05

                                 TLC Standard Scores
Composite                   65       83          69

Note. Scaled score mean = 10; SD = 3
      Standard Score mean = 100; SD = 15

Table 6
TLC Subtest Internal Consistency Reliabilities and Descriptive Statistics: Combined Sample

Subtest          Mean      SD     SEM    Hoyt     r
Ambiguous        13.17    9.37    2.97    .89
Inferences       25.30    6.11    3.08    .72
Recreate(H)      23.76    7.29    3.27    .78    .92
Recreate(W)      31.35    7.85    3.10    .83
Recreate(T)      55.04   11.98    4.94    .82
Metaphors        14.54   10.16    3.15    .89    .99
Composite       108.07   32.76    9.27    .95

Note. n = 46 for Tables 6 through 11
      r = Interrater reliability
      H = Holistic Scoring; W = Word Count; T = Subtest Total
      Coefficient Alpha for the test composite = .87

Item Difficulty. Item means for the combined sample were compared for evidence of item difficulty. The range of scores possible for each item was 0, 1 or 3, thus item means falling near 1.5 (or 3 for Subtest Three-Total) represented the mid-range of difficulty.

Results indicated that, for this sample, Subtest One (Understanding Ambiguous Sentences) was the most difficult of the TLC subtests. Item means ranged from .56 to 1.6, and although items were arranged toward increasing levels of difficulty, some variation within this pattern was observed. For example, 19 subjects (41%) obtained scores of 0 on item 13 (X̄ = .8), as compared to 28 subjects (60%) who scored 0 on item 10 (X̄ = .56), indicating item 13 to be an easier item.

All item means for Subtest Two (Making Inferences) fell above the mid-range of difficulty (1.7 to 2.5), indicating this to be the least difficult subtest, with the exception of Subtest Three (Holistic), described below. A general trend toward increasing item difficulty was noted; however, as in Subtest One, there was some variation within this pattern.
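The item difficulty indices used above (the item mean, the percentage of subjects scoring 0, and the position of the mean relative to the mid-range of 1.5) can be illustrated with a brief sketch. The function name and the response data below are hypothetical and are not drawn from the study sample; the sketch assumes only the 0, 1 or 3 item scoring described above.

```python
def item_difficulty(scores, max_score=3):
    """Return (item mean, percent scoring 0, offset of mean from mid-range).

    A negative offset means the item mean falls below the mid-range of
    difficulty, i.e. the item is relatively difficult for this sample.
    """
    n = len(scores)
    item_mean = sum(scores) / n
    pct_zero = 100 * sum(1 for s in scores if s == 0) / n
    midpoint = max_score / 2  # 1.5 when items are scored 0, 1 or 3
    return item_mean, pct_zero, item_mean - midpoint


# Hypothetical responses of ten subjects to a single item (not study data)
hypothetical_item = [0, 0, 0, 0, 1, 1, 1, 3, 3, 3]
item_mean, pct_zero, offset = item_difficulty(hypothetical_item)
print(f"mean = {item_mean:.2f}, scored 0 = {pct_zero:.0f}%, offset = {offset:+.2f}")
```

An item whose mean lies well below 1.5 and on which a large proportion of subjects score 0 would be read, in the terms used here, as a relatively difficult item.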
Item means for Subtest Three (Holistic) ranged from 1.3 to 2.5. The range of item means for Subtest Three (Word Count) was 1.8 to 2.7. This was the least difficult of the TLC subtests. The higher range of item means observed for the Word Count subtest stems from the fact that most subjects received credit for attempting to include all three stimulus words in their responses. Item means for Subtest Three (Total) ranged from 3.5 to 5.2. Items did not appear to be arranged in order of difficulty on this subtest, nor on either of the two separate scoring systems.

Subtest Four (Understanding Metaphoric Expressions) was the second most difficult subtest. The range of item means observed for this subtest was .72 to 1.8. Items 1 to 9 were not ordered in terms of difficulty. Items 10, 11 and 12 obtained lower mean scores relative to the other items (X̄ < 1.0); thus more difficult items were located at the end of the subtest.

Because some discrepancies in item ordering were noted among the different subtests, item order difficulty correlations were calculated using Spearman's rank order correlation coefficient (Rho). The coefficients were .53 (Subtest Three-Holistic), .65 (Subtest Three-Total), .66 (Subtest Three-Word Count), .76 (Subtest One-Understanding Ambiguous Sentences), .77 (Subtest Two-Making Inferences), and .80 (Subtest Four-Understanding Metaphoric Expressions). These results indicate that some TLC items are not well-ordered in terms of difficulty.

Item Discrimination. Each item in the TLC was correlated with its respective subtest, the TLC composite, and an external criterion (VIQ).
Item to subtest correlations of .3 or greater indicated that the item was discriminating adequately among subjects.

All items in Subtest One (Understanding Ambiguous Sentences) and Subtest Four (Understanding Metaphoric Expressions) met the .3 criterion for adequate discrimination when correlated with their respective subtests.

Four items in Subtest Two (Making Inferences) failed to meet the .3 criterion when correlated with the subtest total. These included the following:

Item two: Tim stopped on his way to school to play a video game. At the locker, he realized he had to hurry back in order to be in class on time. Tim had to go back because...

Item three: The sun was shining, when the Robertson's started out for the picnic. Unfortunately they had the picnic in the living room. They had the picnic in the living room because...

Item five: Bob and Ray rode on a crowded bus to the shopping mall. They told the story of Bob's bad luck to a policeman. They talked to a policeman because...

Item nine: Lori took the bus downtown because it was her mother's birthday. She left the fashionable stores with tears in her eyes. Lori cried because...

The results suggest that these four items discriminate poorly among individuals on the specific behavior of interest (inferential thinking). Moreover, item two failed to discriminate adequately among individuals on a broader verbal construct represented by the TLC composite.

Three items (two, eight, twelve) in Subtest Three (Recreating Sentences-Holistic), and six items (one, two, three, five, eight) in Subtest Three (Recreating Sentences-Word Count) did not meet the .3 criterion when correlated with their respective subtests. The more discriminating set of items was produced by Subtest Three (Total). Only two items were shown to be unacceptable. These included item eight (without, difficult, again) and twelve (fresh, nor, here). Item eight failed to reach the level of .3 when correlated with either the TLC composite or VIQ, suggesting that this item discriminates poorly among individuals on both the general and specific factors it was intended to measure.

Internal Consistency. Hoyt estimates of internal consistency for the TLC subtests are located in Table 6. These range from .72 to .89, indicating a strong degree of association among items within each subtest. Hoyt and Alpha coefficients obtained for the TLC composite were .95 and .87 respectively. Alpha coefficients reported in the Technical Manual across eight age intervals for the TLC composite ranged from .77 to .82.

Interrater Reliability. Interrater reliability coefficients for the subjectively scored sections of the TLC are presented in Table 6. Coefficients ranged from .92 to .99, indicating close agreement between raters.

Correlational Analyses

Correlations between the TLC and the WISC-R based on the combined sample are presented in Table 7. Coefficients above .34 are significant at α = .01. A correlation of .82 was observed between the TLC composite and FSIQ. This compares to a correlation of .75 reported in the TLC manual for a group of 28 language disabled youngsters. Correlations of .90 and .50 were observed between the TLC composite and the WISC-R Verbal and Performance IQ's respectively. This is compared to correlations of .78 and .53 reported in the Technical Manual. Correlations between the TLC composite and the WISC-R verbal subtests ranged from .76 to .86.
The range of correlations between the TLC composite and the WISC-R Performance subtests was .13 to .40. Corresponding values were not reported in the TLC manual.

Correlations between each of the four TLC subtests and VIQ ranged from .74 to .83. The TLC manual reports a corresponding range of correlations from .40 to .79 for a group of 28 language disabled subjects. The range of correlations between the TLC subtests and PIQ was .34 to .59. This compares to a range of .18 to .45 reported in the test manual.

Correlations observed between the four TLC subtests and five WISC-R verbal subtests ranged from .53 to .78. Correlations between the TLC subtests and five WISC-R performance subtests were predictably lower, from .00 to .47. Corresponding values were not reported in the Technical Manual.

TLC subtest intercorrelations are reported in Table 8. These ranged from .56 to .77. Subtest intercorrelations reported in the TLC manual ranged from .24 to .57 for a group of 28 language disabled subjects, and from -.11 to .39 for 28 nonhandicapped individuals. Subtest intercorrelations obtained by the standardization sample are reported in one year intervals between the ages of nine to fourteen; and two year intervals above age fourteen. These ranged from .17 to .50.

Table 7
Correlations Between TLC and WISC-R for the Combined Sample

                                    TLC
WISC-R             AMBS    INFS    RECR    METS    TOTAL
Information         68      68      61      78      79
Similarities        69      70      71      76      82
Arithmetic          53      73      65      76      76
Vocabulary          76      76      76      71      86
Comprehension       67      64      74      66      79
P. Completion       17      20      15      37      26
P. Arrangement      25      10     -00      12      13
Block Design        34      38      34      46      44
Obj. Assembly       27      24      12      47      31
Coding              32      27      40      33      40
Verbal IQ           74      79      78      84      90
Performance IQ      43      39      34      59      50
Full Scale IQ       69      69      66      81      82

Note. Coefficients have been rounded to two significant figures and decimals omitted; r > .34 is significant at α = .01.

Table 8
Intercorrelations Among TLC Subtests for the Combined Sample

                 AMBS    INFS    REC(H)   REC(W)   REC(T)   METS
Ambig Sents.     100     (-05)     --       --      (24)    (08)
Making Infs.      57     100       --       --      (08)    (32)
Recreate (H)      64      54      100       --       --      --
Recreate (W)      45      47       25      100       --      --
Recreate (T)      68      64       77       81      100     (16)
Metaphors         65      77       66       46       70     100
Total             84      82       77       65       89      89
                 (53)    (57)      --       --       --     (65)

Note. Correlations rounded to two figures and decimals omitted; "Total" variables not corrected for overlap.
      H = Holistic Scoring, W = Word Count, T = Subtest Total.
      Values in parentheses represent intercorrelations among VIQ-corrected residuals.

Discriminant Function Analysis

Results of a forward stepwise discriminant function analysis are presented in Table 9. Subtest 2 (Making Inferences) and Subtest 3 (Recreating Sentences) were selected in the first two steps of the analysis. No other subtest made further significant contribution to the discriminant function for differentiating between LLD and Control groups.

Table 10 presents the group classifications produced by the discriminant analysis. Of the 46 subjects, 19/23 (83%) were correctly classified as language disabled, while 21/23 subjects (91%) were correctly classified as controls. A total of four language disordered students (9%) were misclassified as controls (false negatives), and 2 controls (4%) were misclassified as language disordered (false positives).
These results did not change when the analysis used Subtest Three (Holistic Scoring) in place of the total subtest score.

Results of the discriminant function analysis reported in the TLC manual indicated that Subtest Four (Understanding Metaphoric Expressions) followed by Subtest Three (Recreating Sentences) were the most discriminating subtests. Subtest Two accounted for sufficient variance to be entered into the analysis, but its contribution to group discrimination was considered minimal, as indicated by a relatively small increase in the squared canonical correlation which resulted when Subtest Two was entered into the equation (Wiig & Secord, 1985). Subtest One did not contribute significantly to group discrimination. Results of the classification function indicated that the TLC correctly identified 27/28 language disabled students (96%) and 26/28 controls (92%).

Table 9
Summary of Discriminant Function Analysis: Stepwise Selection of TLC Subtests (Total Scores)

Step    Variable Entered                  U        F        p
1.      Making Inference                 0.47    49.36    <.01
2.      Recreating Sentences (Total)     0.38    34.83    <.01

Note. U = Wilk's lambda

Table 10
Summary of Discriminant Function Analysis: Classification of Language Disabled (LLD) and Control Groups

              Number of Cases Classified
Group         LLD      Controls      % Correct      % Incorrect
LLD            19         04             83            17 (a)
Controls       02         21             91            09 (b)
Total          21         25             87            13

Note. (a) percentage of false negatives
      (b) percentage of false positives

Simple Regression Analysis

The purpose of this analysis was to obtain a set of standardized residual scores representing that proportion of variance unique to the TLC and unaccounted for by VIQ.
These adjusted scores were entered into a second discriminant function analysis (discussed below) to determine the capacity of the TLC to discriminate between LLD and Control groups after the effects of VIQ had been removed. Table 11 presents the results of four simple regression analyses using Verbal IQ as the independent, or predictor variable, and each of the TLC subtests as dependent, or criterion variables. It may be seen that Verbal IQ accounted for 55% of the variance in Subtest One (Understanding Ambiguous Sentences), 62% of the variance in Subtest Two (Making Inferences), 60% of the variance in Subtest Three (Recreating Sentences-Total) and 70% of the variance in Subtest Four (Understanding Metaphoric Expressions). These results confirm a substantial relationship between VIQ and each of the TLC subtests.

Table 11
Summary of Simple Regression Analyses: TLC subtests regressed on VIQ

Subtest         r      r-Square       F         p
Ambiguous      .74       .55        54.32     <.01
Inferences     .79       .62        72.47     <.01
Recreate       .78       .60        66.89     <.01
Metaphors      .84       .70       101.55     <.01

Adjusted Discriminant Function Analysis

A second discriminant function analysis was undertaken using the standardized residual scores obtained in the regression analyses, discussed above. Not one of the TLC subtests was entered into the analysis, indicating group differences are almost wholly explained by VIQ. Intercorrelations among the VIQ-corrected residuals for each subtest are located in Table 8. These ranged from -.05 to .32, and none reached significance.
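The residualizing step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually run for the study: the function name and the eight pairs of scores are hypothetical, and an ordinary least squares regression of a subtest score on VIQ is assumed. The residuals are the part of each subtest score not predicted by VIQ, and r-square is the proportion of subtest variance that VIQ accounts for.

```python
from statistics import mean, pstdev


def residualize(x, y):
    """Regress y on x (simple OLS); return r-square and standardized residuals."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    r_squared = sxy ** 2 / (sxx * syy)
    sd = pstdev(residuals)  # OLS residuals have mean zero
    return r_squared, [e / sd for e in residuals]


# Hypothetical VIQ and subtest raw scores for eight subjects (not study data)
hypothetical_viq = [70, 75, 82, 90, 95, 101, 108, 115]
hypothetical_tlc = [8, 12, 11, 20, 18, 27, 25, 33]

r_squared, z_resid = residualize(hypothetical_viq, hypothetical_tlc)
print(f"r-square = {r_squared:.2f}")
```

By construction the residuals are uncorrelated with the predictor, which is why any group discrimination that remains after this step cannot be attributed to VIQ.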
Hotelling's T-Square

The purpose of this analysis was to determine if significant differences between ESL and EFL group performance existed on six variables: VIQ, PIQ, TLC Subtest One (Understanding Ambiguous Sentences), Two (Making Inferences), Three (Recreating Sentences), and Subtest Four (Understanding Metaphoric Expressions). Results are located in Table 12. No significant differences were indicated on the multivariate test (T² = 12.83), or on any one variable.

Wilcoxon Rank Sum Test

Language samples were analyzed using the LARSP procedure. Language Sample transcripts are located in Appendix D. Summary sheets of the LARSP analysis are located in Appendix E. Rank sum tests were calculated using results of the LARSP analyses at phrase and clause level, stages two, three, four and five. The purpose of this analysis was to determine if significant differences would be observed between the expressive language characteristics of LLD and Control group members at one or more levels. Results are presented in Table 13. Significant differences were observed between groups at both phrase and clause level, stages three and five. LLD group members demonstrated a significantly higher percentage of phrase and clause usage at stage three than control group members. Conversely, control group members used significantly fewer stage three utterances than LLD's at clause level, and a higher percentage of stage five level phrases and clauses than members of the LLD group.
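The computation of the rank sums (Rj) can be sketched briefly. The function name and the percentages below are hypothetical and serve only to show the mechanics: the two samples are pooled, tied values receive the average of the ranks they occupy, and ranks are summed within each group. With five subjects per group, the two rank sums must total 10 x 11 / 2 = 55. The significance of a rank sum would then be read from tables of the distribution of Rj, a step not shown here.

```python
def rank_sums(group_a, group_b):
    """Pool two samples, give tied values their average rank, sum ranks per group."""
    pooled = sorted(group_a + group_b)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Positions i..j-1 hold the same value; their ranks are i+1..j
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    return (sum(rank_of[v] for v in group_a),
            sum(rank_of[v] for v in group_b))


# Hypothetical percentages of stage five clause structures (not study data)
hypothetical_lld = [2.0, 3.5, 4.0, 5.5, 6.0]
hypothetical_control = [6.0, 7.5, 8.0, 9.5, 11.0]

r_lld, r_control = rank_sums(hypothetical_lld, hypothetical_control)
print(r_lld, r_control)
```

A small rank sum for one group indicates that its members tend to occupy the low end of the pooled distribution.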
Table 12
Hotelling's T²: Significance of Multivariate and Univariate Differences Between ESL and EFL Group Means

                     ESL                EFL
Variable         Mean     SD        Mean      SD         T²        p
Multivariate                                            12.83     >.01
VIQ              88.3    16.4       88.67    17.6         .26     >.01
PIQ             107.4    13.5      105.1     10.5        -.66     >.01
TLC1             11.8     8.8       14.3      9.9         .90     >.01
TLC2             23.9     6.1       26.5      5.9        1.46     >.01
TLC3             54.1    10.7       56.0     13.7         .52     >.01
TLC4             11.9     7.2       16.8     11.8        1.62     >.01

Note. ESL (n = 20); EFL (n = 25).

Table 13
Wilcoxon Rank Sum Tests: Sum of Ranks (Rj) by Stage of LARSP Analysis

                     Phrase                 Clause
LARSP Stage      LLD     Control        LLD      Control
II               27        28           30         25
III              36*       21*          37*        18*
IV               22        33           24.5       30.5
V                17*       38*          15**       40**

Note. * p < .05   ** p < .01

Chapter V

Discussion

This chapter reviews the purpose of the study, research methodology, and results of data analysis. Implications of the research findings are discussed, and limitations of the study considered. Finally, clinical applications of the TLC and suggestions for future research will be proposed.

Purpose of the Study

The purpose of this study was first to investigate the validity and related psychometric characteristics of the Test of Language Competence (TLC) for use with language learning disabled and control subjects. Five research questions were addressed concerning the internal structure of the TLC, the concurrent validity and reliability of the instrument, and the viability of its use within a multilingual/multicultural population. A second purpose of the study was to investigate the criterion-related validity of an informal language sample.

The principal methods of analysis used to investigate the internal structure of the TLC were measures of item difficulty, item discrimination, and internal consistency.
Correlational analyses yielded additional information concerning internal consistency and interrater reliability. TLC criterion-related validity was studied on the basis of a discriminant function analysis to determine the capacity of the TLC to predict LLD and control group membership. A second discriminant function analysis examined the criterion-related validity of the TLC with the effects of Verbal IQ removed. An important consideration during this investigation was the possible influence of ESL on TLC or WISC-R performance. Hotelling's T² was used to test the equality of group means on the TLC subtests, VIQ and PIQ.

In order to determine if an informal language sample would differentiate among LLD and control subjects, language samples were obtained from five members of each group. These were analyzed using the LARSP procedure, and the results tested for significant differences using Wilcoxon Rank Sum tests.

Summary and Discussion of Results

Internal Characteristics of the TLC

Item Difficulty. Present results suggest some variation in item difficulty among the four TLC subtests for the combined sample. Subtest One (Understanding Ambiguous Sentences) appears to be the most difficult subtest, followed by Subtest Four (Understanding Metaphoric Expressions), Subtest Three (Recreating Sentences-Total), and Subtest Two (Making Inferences). Further variation in item difficulty was observed between the two scoring systems which are included in Subtest Three (Total), with Subtest Three (Word Count) obtaining the highest mean raw score relative to any other subtest.
Conversion of the mean TLC subtest and composite scores to scaled scores using norms provided in the test manual for the mean age group (thirteen years) indicated that LLD subjects obtained scaled scores below average (>1 SD) on each of the four subtests. Controls obtained scaled scores below average on Subtests One (Understanding Ambiguous Sentences) and Four (Understanding Metaphoric Expressions). Scaled scores for the combined group were below average for all four subtests. Although it might be argued that control subjects in this study obtained lower than average scaled scores because, unlike the TLC standardization sample, the current sample included a high proportion of ESL subjects, this argument is not upheld by results of Hotelling's T², which indicated no significant differences between ESL/EFL subjects on the TLC subtests. A second argument to explain the lower scaled scores obtained by the current sample might be that the sample included a disproportionate number of language disabled subjects. Given the high correlation observed between the TLC and VIQ, however, combined with the fact that control group subjects were known to be of average verbal intelligence, this explanation is rejected. It is therefore reasonable to conclude that TLC norms may not accurately represent the performance of local children, and consequently may overidentify individuals as language disordered. Moreover, these results would suggest caution in the use of profile analysis based on subtest scaled score comparisons.

Results of Spearman's rank order correlation coefficient (Rho) indicated that TLC items are generally not well-ordered in terms of difficulty.
This observation would contraindicate the practice of discontinuing a subtest after three consecutive failures to respond, as subjects may respond correctly to subsequent items that are less difficult.

Item Discrimination. The data indicate that all items in Subtest One (Understanding Ambiguous Sentences) and Four (Understanding Metaphoric Expressions) discriminated effectively on the specific behavior they were intended to measure. A total of four items (33%) in Subtest Two (Making Inferences) and two items (15%) in Subtest Three (Recreating Sentences-Total) failed to meet the required .3 criterion when correlated with their respective subtests. It was noted that fewer items in Subtest Three met the .3 criterion when correlated with either the Holistic or Word Count totals; therefore the most discriminating set of items was produced by the subtest composite.

The fact that some TLC items fail to discriminate adequately among individuals may be related to several factors, including item difficulty. Extreme values of item difficulty tend to reduce item discrimination (Nunnally, 1978). All items in Subtest Two (Making Inferences) were observed to fall above the mid-range of difficulty. It might be argued that items which fail to discriminate among individuals in this subtest do so because they are too easy. Items in Subtest Three (Recreating Sentences-Total) likewise fell above the mid-range of difficulty; therefore the same argument may apply.

Inadequate item discrimination may also be the result of poor content sampling. For example, of the six TLC items which did not discriminate among individuals on the specific behaviors they were intended to measure, five showed inadequate discrimination on a broad verbal factor represented by the TLC composite.
The results suggest a minimal relationship between these items and either a general or specific verbal construct.

Internal Consistency (TLC Subtests). Internal consistency coefficients were calculated using Hoyt's Analysis of Variance procedure and Cronbach's Alpha. Coefficients of .8 or better were considered acceptable. Hoyt's estimate of reliability was .89 for Subtest One (Understanding Ambiguous Sentences), .83 for Subtest Three (Recreating Sentences-Word Count), .82 for Subtest Three (Total) and .89 for Subtest Four (Understanding Metaphoric Expressions). Subtest Two (Making Inferences) and Subtest Three (Recreating Sentences-Holistic) fell below .8 (.72 and .78 respectively). Several explanations may account for these results.

First, test reliability is a function of item difficulty and item discrimination. Items which are far-removed from the mid-range of difficulty may reduce the size of the reliability coefficient. Low item to subtest correlations have the same effect. The lower estimate of internal consistency observed for Subtest Two (Making Inferences) is consonant with the observation that this subtest contains the least acceptable combination of items in terms of item difficulty and item discrimination.

A second explanation for the low internal consistency estimates obtained by Subtest Two, and by Subtest Three (Holistic) may be that at least some items in each subtest measure dissimilar constructs, a point raised earlier. The effect of this would be to reduce the inter-item correlations, and estimates of internal consistency as a result.

One final explanation for these results may be related to the length of the TLC subtests. Subtest Two (Making Inferences) is one of the shorter subtests, consisting of only twelve items.
Subtest Three includes the maximum thirteen items. It might be pointed out, however, that Subtests One and Four obtained reliability coefficients above .8, despite the fact that they contain thirteen and twelve items respectively. Thus, although subtest length is a possible explanation for lower estimates of reliability, it is an unlikely one.

Internal Consistency (TLC Composite). Given that each of the TLC subtests is intended to measure a specific skill or ability, it might be expected that higher estimates of internal consistency would be observed for individual subtests than for the total test. Hoyt's estimate of reliability for the TLC composite contradicts this interpretation. A higher estimate was observed for the TLC composite than for any of the subtests (.95). Cronbach's Alpha for the test composite was likewise substantial (.87).

These results are similar to internal consistency data reported in the TLC manual for the standardization sample. TLC authors Wiig and Secord observed a higher range of internal consistency coefficients (Cronbach's Alpha) for the TLC composite across ages than for individual subtests. The test authors attributed this difference to the effect of increased test length on estimates of reliability when all the items were combined. This interpretation is reasonable, and receives some support here in view of the fact that Cronbach's Alpha, which is lowered when subtests are not highly correlated, is somewhat lower than Hoyt's estimate for the composite. This difference is small, however, and both results suggest a high degree of association among items in the total test.

Intercorrelations. A pattern of moderate, positive intercorrelations was observed among TLC subtests (.56 to .77).
Correlations between Subtest Three (Holistic) and Subtest Three (Word Count) did not reach the level required for significance (.34), suggesting that these two scoring systems yield different results.

A higher range of subtest to total test correlations was observed (.82 to .89), lending support to the internal consistency of the subtests; however, higher correlations might also be explained by the effects of increased test length, and the fact that correlations were not corrected for overlap.

The range of correlations between each subtest and the TLC composite was relatively higher than that observed between each subtest and the external criterion (VIQ). The latter ranged from .74 to .83. This difference might be interpreted as supporting the uniqueness of the measured construct, although higher subtest to total test correlations could again be explained by the effects of overlap. It might be argued that correcting for overlap would result in comparable correlations between each subtest and the test composite or VIQ.

Interrater Reliability. Interrater reliability coefficients for Subtest Three (Recreating Sentences-Holistic) and Subtest Four (Understanding Metaphoric Expressions) were .92 and .99 respectively. These results indicate very close agreement between raters, and support the adequacy of the subjective scoring criteria.

Relationship Between TLC and VIQ

The range of correlations observed between the TLC subtests and subtests on the WISC-R Verbal Scale for the combined sample was .53 to .78 (correlations above .34 are significant at α = .01). A more pronounced relationship, which may be explained by the effect of increased test length on correlations, was observed between VIQ and the TLC composite (.90). This value is considerably higher than that reported in the Technical Manual for a sample of 28 language-disabled subjects (.78).
Likewise, the range of values observed in the present study between the four TLC subtests and VIQ (.74 to .83) is considerably higher than that reported in the Technical Manual (.40 to .79).

The lower range of values reported in the TLC manual might be explained by the effect of restricted range on correlation. That is, homogeneous samples produce lower correlation coefficients than heterogeneous samples (Anastasi, 1982). Correlations reported in the Technical Manual were obtained using a homogeneous sample of language disabled youngsters, whereas the subject sample in the present study included handicapped and nonhandicapped individuals. The greater variability in the present sample has likely resulted in higher correlations.

Results of the simple regression analyses indicated that VIQ accounted for 55% of the variance in Subtest One (Understanding Ambiguous Sentences), 62% of the variance in Subtest Two (Making Inferences), 60% of the variance in Subtest Three (Recreating Sentences-Total) and 70% of the variance in Subtest Four (Understanding Metaphoric Expressions). Intercorrelations among the subtest residuals failed to reach significance; thus, the proportion of variance remaining that could be considered unique to the TLC was minimal.

In total, these results raise the issue of what Sommers (1985) referred to as "the troublesome distinction between a language disorder and cognitive abilities" (p. 1087). The same issue has been addressed elsewhere by Oller (1978) and Gunnarsson (1978), who challenge the view that language and intelligence tests measure different constructs. Likewise, present results do not support a distinction between the proposed construct (language competence) and VIQ.
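The two statistical points in this section, variance accounted for in simple regression and the effect of restricted range on correlation, can be illustrated with a short sketch. It is not the analysis performed in this study: the scores are simulated, and the function names, sample size, cut-off, and population correlation are all assumptions chosen for illustration. In simple regression the proportion of variance accounted for is the squared correlation, so the endpoints of the observed subtest-VIQ range (.74 and .83) correspond to roughly 55% and 69% of variance, which is in line with the figures reported above.

```python
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def variance_explained(r):
    """Proportion of variance accounted for in simple regression (r squared)."""
    return r ** 2


def simulate_scores(n=2000, rho=0.9, seed=0):
    """Simulated standardized scores (hypothetical data, not the thesis sample)."""
    rng = random.Random(seed)
    viq, tlc = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        # Build y so that its population correlation with x equals rho.
        y = rho * x + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        viq.append(x)
        tlc.append(y)
    return viq, tlc


viq, tlc = simulate_scores()
full_r = pearson_r(viq, tlc)

# Keep only low scorers to mimic a homogeneous (range-restricted) sample.
low = [(x, y) for x, y in zip(viq, tlc) if x < -0.5]
restricted_r = pearson_r([x for x, _ in low], [y for _, y in low])

print(f"full sample r = {full_r:.2f}, restricted sample r = {restricted_r:.2f}")
print(f"variance explained at r = .74: {variance_explained(0.74):.2f}")
print(f"variance explained at r = .83: {variance_explained(0.83):.2f}")
```

In the simulated data the correlation computed on the restricted subsample is lower than the full-sample value, even though both come from the same underlying relationship. This is the pattern the restricted-range explanation assumes.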
Language Disabled and Control Group Discrimination (TLC)

Results of the discriminant function analysis indicated that Subtest Two (Making Inferences) was the most discriminating subtest, followed by Subtest Three (Recreating Sentences-Total). The remaining two subtests, Subtest One (Understanding Ambiguous Sentences) and Subtest Four (Understanding Metaphoric Expressions), did not contribute significantly to group discrimination and were not entered into the equation. Present results are inconsistent with those originally reported by the test authors, who observed that Subtest Four (Understanding Metaphoric Expressions) accounted for the major proportion of variance in group membership, followed by Subtests Three and then Two.

In terms of classification, present results are somewhat inconsistent with those reported by the test authors. In this study, the TLC correctly classified 83% of the LLD subjects, as opposed to 93% originally reported. Ninety-one percent of controls were correctly classified, compared to 93% reported in the test manual. The data would suggest that the TLC does not discriminate as effectively among local students at the lower end of language functioning, thus resulting in increased numbers of false negatives.

Although current results do not agree with the original findings reported by the test authors, the two studies are not directly comparable. Criteria for group membership in the present study were defined as average or better nonverbal ability as measured on the WISC-R, and delays of two or more years as determined by speech-language pathologists on the basis of objective test data, which were not the result of other handicapping conditions or ESL.
Moreover, this research employed a multiethnic subject sample matched for age, sex, and linguistic background. In contrast, group selection criteria used in the original study were not reported in the TLC Technical Manual, intelligence test data were not available for the control group, and no information was provided regarding the cultural makeup of the sample involved. It is possible that disparate findings observed between the two studies are the result of different grouping criteria.

Several explanations may account for the observation that Subtest One (Understanding Ambiguous Sentences) and Subtest Four (Understanding Metaphoric Expressions) were not included in the discriminant function reported here, despite the fact that each demonstrated adequate item characteristics and internal consistency. First, the results may have been distorted by sampling error. For example, LLD group members in this study were selected from within a population of language disabled students already identified as such by qualified speech-language pathologists. It is possible that of those individuals retained in the final sample, some were originally misclassified.

A second explanation may be that neither subtest measures a unique aspect of language competence which is not already accounted for by the other two subtests. This conclusion would challenge the specificity of the TLC subtests, each of which is intended to measure a unique aspect of a broad verbal factor. The test authors claimed support for subtest specificity on the basis of an oblique rotation factor analysis (Wiig & Secord, 1985, pp. 42-47).
This method has been found to demonstrate the existence of a general language factor common to different language tests which may be divided into subcomponents, each possessing its own share of reliable variance (Oller & Damico, in press). Current results, however, offer limited support for TLC subtest specificity.

One further point regarding the discriminant function analysis concerns the number of false positives and false negatives observed in the classification function. It was noted in earlier discussion that Subtest Two (Making Inferences), which explained the largest proportion of variance between groups, was found to be one of the least difficult subtests, to contain several items which fail to discriminate adequately among individuals on either a general or specific language factor, and to demonstrate inadequate reliability. The number of false negatives observed in the present analysis may be a function of the poor internal characteristics associated with this subtest.

The results described above for the present study did not change when the discriminant function analysis was re-calculated using Subtest Three (Recreating Sentences-Holistic) in place of the subtest total, which uses the combined scoring system. This means that Subtest Three (Word Count) does not contribute substantially to group discrimination; however, its exclusion from the battery for classification purposes is not advised. Several items in Subtest Three (Holistic) were found to discriminate poorly among individuals. Moreover, the Holistic scoring obtained an internal consistency coefficient below the acceptable .8 minimum. Both item discrimination and internal consistency were greatest for the combined scoring system.
Exclusive use of the Holistic scoring system for classification purposes might result in increased numbers of false positives or false negatives due to the poor internal qualities of that subtest.

A third discriminant function analysis was intended to determine how well the TLC would discriminate between groups after the effects of VIQ had been removed. In fact, insufficient variance remained to calculate the additional analysis. Allowing for sampling error, the remaining variance which could be considered to account for a unique construct (i.e. language competence) is insignificant; thus, VIQ explains the major proportion of variance between LLD and control groups.

Test for Differences in ESL/EFL Group Performance

Significant differences were not observed between groups on the six variables of interest (VIQ, PIQ, TLC Subtests One, Two, Three, Four). These results indicate that ESL was not a factor in LLD or control group performance in this study.

Language Disabled and Control Group Discrimination (Language Sample Analysis)

Results of the Wilcoxon Rank Sum Tests indicated that control subjects demonstrated a lower frequency of stage three clause level utterances, and a higher frequency of stage five phrase and clause level utterances, than LLD subjects on the LARSP analysis. Conversely, language-disabled students used a higher proportion of stage three level utterances and a lower percentage of stage five level utterances (phrase and clause). These results support the capacity of the LARSP analysis to discriminate between LLD and control groups on the basis of language complexity, and further, demonstrate that language disabled adolescents as a group tend to produce less complex sentence structures in spontaneous speech than their nonhandicapped peers.
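For readers unfamiliar with the rank sum procedure referred to above, the following is a minimal sketch of how a Wilcoxon rank sum statistic and its large-sample normal approximation can be computed. The percentages in `control` and `lld` are invented for illustration and are not data from this study; the function names are likewise hypothetical, and the approximation omits the tie correction and continuity correction that a statistical package would normally apply.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Return (W, z, two-sided p) for group_a using the normal approximation."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    w = sum(ranks[:n1])                       # rank sum for group_a
    mean_w = n1 * (n1 + n2 + 1) / 2           # expected rank sum under H0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal tail area
    return w, z, p


# Hypothetical percentages of stage five utterances per subject.
control = [22, 25, 19, 30, 27, 24]
lld = [10, 14, 12, 18, 9, 15]

w, z, p = rank_sum_test(control, lld)
print(f"W = {w}, z = {z:.2f}, p = {p:.4f}")
```

Because the test works on ranks rather than raw scores, it makes no assumption that the utterance frequencies are normally distributed, which suits the small groups and skewed proportions typical of language sample data.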
Conclusions

The results obtained in this study suggest a number of conclusions concerning the validity and internal characteristics of the TLC, as well as the criterion-related validity of the LARSP analysis. These have implications for the diagnostic and practical utility of both measures.

Conclusions regarding the technical characteristics of the TLC are based on the results of item analyses, estimates of internal consistency and interrater reliability coefficients. The results indicate that Subtests One (Understanding Ambiguous Sentences), Three (Recreating Sentences) and Four (Understanding Metaphoric Expressions) demonstrate adequate item discrimination and internal consistency. Interrater reliability for Subtests Three and Four is very high, supporting the adequacy of the subjective scoring criteria. Subtest Two (Making Inferences) demonstrates a relatively high percentage of items (33%) which fail to discriminate adequately among individuals, and the internal consistency estimate for this subtest is below the desired .8. These results are consistent with the observation that this is the least difficult of the subtests. All four subtests contain items which are not well-ordered in terms of difficulty.

TLC criterion-related validity was judged on the basis of the discriminant function analysis, which indicated that Subtest Two (Making Inferences) and Subtest Three (Recreating Sentences) discriminate between LLD and control groups. These results do not imply that Subtests One and Four lack the capacity to discriminate between groups; an inspection of LLD and control group means for each subtest indicated sizeable differences in the performance of both groups on all four subtests.
From this it may be concluded that the variance in Subtests One and Four was accounted for by Subtests Two and Three in the discriminant function analysis. The criterion-related validity of Subtests One and Four, independent of the other two subtests, might be the subject of further investigation.

TLC content and construct validity were evaluated on the basis of the above, together with results of item and subtest intercorrelations, correlations between the TLC and VIQ, the discriminant function analyses, and the extent to which these results were consistent with the stated theoretical design of the test. The TLC is based on a model of language competence which assumes a broad verbal factor subdivided into four specific content areas represented by each of the TLC subtests. A pattern of moderate, positive intercorrelations was observed among the TLC subtests, indicating some support for subtest specificity. This interpretation was not upheld by the high internal consistency coefficients, which indicated that TLC items measure a common construct; nor by the results of the discriminant function analysis, which suggested that most of the variance in Subtests One and Four was accounted for by Subtests Two and Three. In total, limited support has been demonstrated for the view that each subtest measures a unique aspect of language competence.

Correlations between the TLC and VIQ support the TLC as a measure of a verbal construct; however, the magnitude of these correlations suggests that the TLC and VIQ are measuring a common factor. This interpretation is supported by the results of the adjusted discriminant function analysis, which indicated that LLD and control group discrimination was explained by VIQ.
These findings are consistent with research conducted by Damico (Damico, personal communication, August, 1989), Schery (1985), and Sommers et al. (1978), which supports the view that many language tests claimed to measure a unique language factor are, in fact, measuring a common construct. These results would further suggest that this common factor may be VIQ.

Results of the Wilcoxon Rank Sum tests support the validity of the LARSP procedure for distinguishing between LLD and control groups; however, the analysis does not allow for comparisons between individuals. Moreover, the descriptive power of the LARSP analysis is lost when individual performance is reduced to a set of numbers. Clearly the TLC and LARSP might be viewed as complementary procedures. The TLC has been demonstrated to possess adequate criterion-related validity; however, the proposed interpretive rationale for TLC results is not strongly supported. Conversely, the LARSP system may not discriminate among individuals, but it does provide extensive descriptive information. These results would lead to the conclusion that the two instruments may be most effective if used together for the identification of language disorders, and the identification of intervention goals.

Limitations of the Study

The present study was intended to investigate the technical characteristics of the TLC using a locally selected sample of language disabled and control subjects. The generalizability of current findings is limited by several factors. First, current findings were obtained from within a distinctly multiethnic population, and are not considered representative of populations in other regions. Second, the age range of the present sample was limited to 11 through 15; thus, the representativeness of these results is confined to that age range.
Third, the selection of average achieving students for the control group was based solely on the opinion of school personnel and school records, either of which may not have been objective. It might be argued, however, that as achievement is highly dependent upon verbal ability, and as all control group subjects were found to be of average verbal ability, it is unlikely that control group subjects were of above average achievement.

Another limitation of the present study is related to the collection, transcription and coding of the language sample analyses. First, the language sample was obtained under structured conditions (imperatives), and not obtained across a number of settings. It might be suggested that these factors limited the representativeness of the samples collected. Secondly, language sample transcription and analysis is a complex procedure, and the limited experience of the researcher in conducting these analyses may have increased the possibility of error. Although Bishop's computerized version of LARSP helped to direct the analyses somewhat, it is an interactive program requiring user judgement as the analysis proceeds. Moreover, the program was found to have limited capacity for analyzing the longer and more complex sentences generated by control subjects. For example, the program will not analyze sentences beyond 25 words. Clearly the program is effective for use with young children, or subjects in the lower range of language functioning.

Recommendations for Clinical Practice

1. The observation that subtest items are not well-ordered in terms of difficulty would contraindicate the procedure of discontinuing a subtest after three consecutive failures to respond. If a subject fails to respond because three consecutive items are too difficult, he/she may respond to subsequent items which are less difficult.

2. The TLC short form, which consists of Subtest Three (Recreating Sentences) and Subtest Four (Understanding Metaphoric Expressions), is not recommended for screening purposes at this time because, of these two, only Subtest Three has been shown to discriminate adequately between LLD and control subjects. Moreover, the content of Subtest Four may be too specific to North American culture to discriminate among LLD and non-LLD subjects from multicultural backgrounds.

3. Although Subtest One (Understanding Ambiguous Sentences) and Subtest Four (Understanding Metaphoric Expressions) did not contribute to group discrimination in this study, their exclusion from the battery for the purpose of classifying LLD subjects is not recommended without further research.

4. Because VIQ explains a significant proportion of variance in TLC performance, the administration of both instruments for the purpose of classifying language disabled subjects seems time-consuming and redundant; however, substitution of one instrument for the other is not recommended without further research.

5. Cautious interpretation of TLC results based on the test norms is recommended, as these may tend to overidentify individuals as language disabled.

6. The TLC authors recommend the use of profile analysis for determining individual strengths and weaknesses; however, this practice is not recommended, first because TLC norms may not be representative of the performance of local children, and second, because limited support has been demonstrated for TLC subtest specificity.

7. Detailed suggestions for developing individual education plans (IEP's) on the basis of TLC data are provided in the test manual. Current results, however, lend little support to this practice. First, there is no evidence offered by the test authors to suggest that remediation based on TLC results has an effect on actual performance over time. Second, IEP's are based on the assumption that each subtest measures a unique aspect of language competence; however, current results fail to support the specificity of the TLC subtests.

8. It is suggested that the TLC be used only in conjunction with other language measures which provide reliable diagnostic information. These might include an informal language sample, such as the LARSP analysis.

Recommendations for Further Research

1. Suggestions for further research would include additional criterion-related validity studies at different age levels to observe test characteristics and possible developmental patterns in TLC performance.

2. Given that current results were obtained from a multilingual/multicultural sample, future studies might focus on the performance of distinct ethnic groups, including EFL.

3. Further investigation of the criterion-related validity of the TLC short form (Subtests Three and Four) for use with local subjects is suggested. The validity of using Subtests Two and Three as a short form for local children might be investigated.

4. Further investigation of the relationship between VIQ and the TLC is suggested to determine the relative efficacy of each instrument for classifying LLD subjects, and whether one instrument might be substituted for another to avoid redundancy.

5. Further investigations of TLC subtest specificity using factor analysis are suggested.

6. Experimental research to determine the effectiveness over time of instructional objectives based on TLC performance might be considered.

7. Further examination of the appropriateness of TLC norms for local children is suggested. The development of local norms might be considered.

References

American Psychological Association (1985). Standards for educational and psychological testing. Washington, DC: American Psychological Association.

American Speech-Language Hearing Association (1988). Report of the ad hoc committee on instrument evaluation. ASHA, 23, 75-76.

Anastasi, A. (1982). Psychological testing (5th ed.). New York, NY: MacMillan.

Aram, D., Ekelman, B., & Nation, J. (1984). Preschoolers with language disorders: 10 years later. Journal of Speech and Hearing Research, 27, 232-244.

Atkins, C.P., & Cartwright, L.R. (1982). Preferred language elicitation procedures used in five age categories. ASHA, 24, 321-323.

Bernstein, D.K. (1989). Assessing children with limited English proficiency: Current perspectives. Topics in Language Disorders, 9, 15-20.

Bishop, D. (1985). Language assessment, remediation and screening procedure (LARSP) by David Crystal, Paul Fletcher & Mike Garman: Computerized version for Apple II and IIe. (Available from Department of Speech, University of Newcastle upon Tyne, Great Britain)

Blau, A., Lahey, M., & Oleksiuk-Velez, A. (1984). Planning goals for intervention: Can a language test serve as an alternative to a language sample? Journal of Childhood Communication Disorders, 7, 27-37.

Bloom, L., & Lahey, M. (1978). Language development and language disorders. New York, NY: John Wiley & Sons.

Caskey, W., & Franklin, L. (1986). The Test of Adolescent Language (TOAL) and WISC-R scores: A caveat. Language, Speech, and Hearing Services in Schools, 17, 307-311.
Committee on the Status of Racial Minorities (1983). Social dialects (position paper). ASHA, 25, 23-25.

Cronbach, L. (1971). Test validation. In R. Thorndike (Ed.), Educational measurement. Washington, DC: American Council on Education.

Cronbach, L., & Meehl, P. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281-302.

Crystal, D., Fletcher, P., & Garman, M. (1976). The grammatical analysis of language disability. London, Great Britain: Edward Arnold.

Cupples, W.P., & Lewis, M.E.B. (1984). Language-based learning disabilities: Differential diagnosis and implementation. Learning Disabilities, 3, 129-140.

Damico, J.S. (1988). The lack of efficacy in language therapy: A case study. Language, Speech, and Hearing Services in Schools, 19, 51-66.

Damico, J.S. (in press). Descriptive assessment of communicative ability in limited English proficient students. In E. Hamayan, & J.S. Damico (Eds.), Nonbiased assessment of limited English proficient special education children. San Diego, CA: College Hill Press.

Darley, F.L., & Spriestersbach, D.C. (1978). Diagnostic methods in speech pathology (2nd ed.). Reading, MA: Addison-Wesley.

Dixon, W.J. (Ed.) (1988). BMDP statistical software manual (Vol. 1). Berkeley, CA: University of California Press.

Donahue, M. (1985). Review of the WORD Test. In J.V. Mitchell (Ed.), Ninth mental measurements yearbook. Lincoln, NE: Buros Institute of Mental Measurement, University of Nebraska.

Education Services Group (1985). Student services handbook. Vancouver, B.C.: Vancouver School Board.

Evard, B., & Sabers, D. (1979). Speech and language testing with distinct ethnic-racial groups: A survey of procedures for improving validity. Journal of Speech and Hearing Disorders, 44, 271-281.

Ferguson, G.A. (1976). Statistical analysis in psychology and education. Toronto, Canada: McGraw-Hill.

Guion, R.M. (1977). Content validity: Three years of talk - What's the action? Public Personnel Management, 6, 407-414.

Gunnarsson, B. (1978). A look at the content similarities between intelligence, achievement, personality, and language tests. In J.W. Oller Jr. (Ed.), Language in education: Testing the tests. Rowley, MA: Newbury.

Hammill, D., & Newcomer, P. (1988). Test of Language Development - 2 Intermediate (TOLD-2-I). Austin, TX: Pro-Ed.

Hammill, D., Brown, V., Larsen, S., & Wiederholt, J. (1987). Test of Adolescent Language - 2 (TOAL-2). Austin, TX: Pro-Ed.

Hawkins, P., & Spencer, H. (1985). Imitative versus spontaneous language assessment: A comparison of CELI and LARSP. British Journal of Disorders of Communication, 20, 191-200.

Henrysson, S. (1971). Gathering, analyzing, and using data on test items. In R. Thorndike (Ed.), Educational measurement. Washington, DC: American Council on Education.

Holland, A., & Forbes, M. (1986). Nonstandardized approaches to speech and language assessment. In O.L. Taylor (Ed.), Communication disorders in linguistically diverse populations (pp. 49-66). San Diego, CA: College-Hill Press.

Kaufman, A.S. (1979). Intelligent testing with the WISC-R. New York, NY: Wiley.

Kearns, K., & Simmons, N. (1983).
A practical procedure for the grammatical analysis of aphasic language impairments: The LARSP. In R. Brookshire (Ed.), Clinical aphasiology conference proceedings. Minneapolis, MN: BRK.

Kelly, D.J., & Rice, M.L. (1986). A strategy for language assessment of young children: A combination of two approaches. Language, Speech, and Hearing Services in Schools, 17, 83-92.

Klecka, W.R. (1980). Discriminant analysis. Beverly Hills, CA: Sage.

Kretschmer, R.R., & Kretschmer, L.W. (1978). Language development and intervention for the hearing impaired. Baltimore, MD: University Park Press.

Lai, C. (1986). U.B.C. SPSSX Statistical Package for Social Sciences extended version. Vancouver, Canada: University of British Columbia.

Larson, V., & McKinley, N. (1987). Communication assessment and intervention strategies for adolescents. Eau Claire, WI: Thinking Publications.

Launer, P.B., & Lahey, M. (1981). Passages: From the fifties to the eighties in language assessment. Topics in Language Disorders, 1, 11-29.

LaTorre, R.A. (1983). The 1982 survey of pupils in Vancouver schools for whom English is a second language (Report No. 83-02). Vancouver: Board of School Trustees, Evaluation and Research Services, Program Resources.

Leonard, L.B., Perozzi, J.A., Prutting, C.A., & Berkley, R.K. (1978). Nonstandardized approaches to the assessment of language behaviors. ASHA, 20, 371-379.

Lieberman, R., Heffron, A., West, S., Hutchinson, E., & Swem, T. (1987). A comparison of four adolescent language tests. Language, Speech, and Hearing Services in Schools, 18, 250-266.

Lieberman, R.J., & Michael, A. (1986).
Content relevance and content coverage in tests of grammatical ability. Journal of Speech and Hearing Disorders, 51, 71-81.

Lund, N., & Duchan, J. (1983). Assessing children's language in naturalistic contexts. Englewood Cliffs, NJ: Prentice Hall.

McCauley, R.J., & Swisher, L. (1984a). Psychometric review of language and articulation tests for preschool children. Journal of Speech and Hearing Disorders, 49, 34-42.

McCauley, R.J., & Swisher, L. (1984b). Use and misuse of norm-referenced tests in clinical assessment: A hypothetical case. Journal of Speech and Hearing Disorders, 49, 338-348.

Messick, S. (1980). Test validity and the ethics of assessment. American Psychologist, 35, 1012-1027.

Miller, J.F., & Chapman, R. (1983). SALT: Systematic analysis of language transcripts. Madison, WI: University of Wisconsin, Language Analysis Laboratory, Waisman Center.

Muma, J. (1981). Language primer for the clinical fields. Lubbock, TX: Natural Child Publishing.

Muma, J.R. (1984). Semel and Wiig's CELF: Construct validity? (Letter to the editor). Journal of Speech and Hearing Disorders, 49, 101-104.

Muma, J.R. (1985). "No news is bad news": A response to McCauley and Swisher (Letter to the editor). Journal of Speech and Hearing Disorders, 50, 290-292.

Muma, J.R., Lubinski, R., & Pierce, S. (1982). A new era in language assessment: Data or evidence. In N.J. Lass (Ed.), Speech and language: Advances in basic research and practice. Toronto, Canada: Academic Press.

Nelson, L.R. (1974). Guide to LERTAP use and interpretation. Dunedin, New Zealand: University of Otago.

Nice, M.M. (1925). Length of sentences as a criterion of a child's progress in speech.
Journal of Educational Psychology, 16, 370-379.

Nunnally, J.C. (1978). Psychometric theory. Toronto, Canada: McGraw-Hill.

Oller, J.W., Jr. (1978). How important is language proficiency to IQ and other educational tests? In J.W. Oller Jr. (Ed.), Language in education: Testing the tests. Rowley, MA: Newbury.

Oller, J.W., Jr., & Damico, J. (in press). Theoretical considerations in the assessment of LEP students. In E. Hamayan, & J.S. Damico (Eds.), Nonbiased assessment of limited English proficient special education children. San Diego, CA: College Hill Press.

Pedhazur, E.J. (1982). Multiple regression in behavioral research. New York, NY: Holt, Rinehart and Winston.

Penn, M.A.C. (1983). Syntactic and pragmatic aspects of aphasic language. Unpublished doctoral dissertation, University of Witwatersrand, South Africa.

Penn, C., & Behrmann, M. (1986). Towards a classification scheme for aphasic syntax. British Journal of Disorders of Communication, 21, 21-38.

Prather, E., Beecher, S., Stafford, M., & Wallace, E. (1980). Screening Test of Adolescent Language. Seattle, WA: University of Washington Press.

Public Law 94-142. Education for all Handicapped Children Act. Federal Register, 1975, p. 42478.

Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1972). A grammar of contemporary English. London, Great Britain: Longman.

Raju, N.S. (1985). Review of the WORD Test. In J.V. Mitchell (Ed.), Ninth mental measurements yearbook. Lincoln, NE: Buros Institute of Mental Measurement, University of Nebraska.

Santos, O.B. (1987).
T h e variance i n reading comprehension i n terms of language skills and cognitive processes (Doctoral dissertation, Boston U n i v e r s i t y , 1987). Dissertation Abstracts International. 48, 8716098. Sattler, J . (1988). Sattler.  Assessment o f children (3rd ed.). San D i e g o , C A : Jerome M .  Schery, T . K . (1985). Correlates o f language development i n language-disordered c h i l d r e n . Journal o f Speech and Hearing Disorders. 50, 73-83. Semel, E . , & W i i g , E . (1980). C l i n i c a l E v a l u a t i o n o f Language Functions. Columbus, O H : Merrill. Semel, E . , W i i g , E . H . , & Secord, W. (1987). Fundamentals - R e v i s e d ( C E L F - R ) . Corporation.  C l i n i c a l E v a l u a t i o n of Language San A n t o n i o , T X : Psychological  S i m o n , C . S . (1984). F u n c t i o n a l - p r a g m a t i c evaluation o f c o m m u n i c a t i o n skills i n s c h o o l - a g e d c h i l d r e n . Language. Speech, and Hearing Services in Schools. 15. 83-98. Sommers, R . K . (1985). A review o f the Screening Test o f Adolescent Language. In J . V . M i t c h e l l ( E d . ) , N i n t h mental measurements yearbook. L i n c o l n , N E : Buros Institute o f M e n t a l Measurement, U n i v e r s i t y o f Nebraska. Sommers, R . K . , E r d i g e , S., & Peterson, M . K . (1978). H o w valid are children's language tests? T h e Journal of Special E d u c a t i o n . 12, 393-407. Stark, R . E . , T a l l a l , P., Mellits, E . D . (1982). Quantification o f language abilities i n c h i l d r e n . In N . J . Lass (Ed.), Speech and language: A d v a n c e s i n basic research and practice. T o r o n t o , Canada: A c a d e m i c Press. Stephens, M . , & M o n t g o m e r y , A . (1985). A critique o f recent relevant standardized tests. T o p i c s i n Language Disorders. 5_, 2 1 - 4 5 . T h o r n d i k e , R . ( E d . ) (1971). Educational measurement. Council on Education.  Washington, D C : A m e r i c a n  T h o r u m , A . 
(1980). Fullerton Language Test f o r Adolescents. C o n s u l t i n g Psychologists Press. T h u r s t o n e , T . G . (1978). E d u c a t i o n a l Abilities Series. Research Associates. T i b b i t s , D . F . (1982).  Palo A l t o , C A :  Palo A l t o , C A : Science  Language disorders i n adolescents.  L i n c o l n , N E : C l i f f s Notes.  T y a c k , D . , & Gottsleben, R. (1974). Language sampling, analysis and training. A l t o , C A : Consulting Psychologists Press.  Palo  V a u g h n - C o o k e , F . (1983). A S H A . 25, 29-34.  Improving language assessment in minority  children.  W a t s o n - R u s s e l , A . (1986). Student ethnic survey (Research Report N o . 86-07). Vancouver: B o a r d o f School Trustees, Student Assessment and Research, E d u c a t i o n Services G r o u p . Wechsler, D . (1974). Wechsler Intelligence Scale f o r C h i l d r e n - R e v i s e d . N Y : Psychological Corporation. W i i g , E . H . , & Secord, W. (1985). Merrill.  Test o f Language Competence.  New York,  T o r o n t o , Canada:  W i i g , E . , & Semel, E . (1984). Language assessment and intervention f o r the learning disabled. T o r o n t o , Canada: M e r r i l l .  80  Appendix A Parent/Student Information  81 Letter to Parents of C o n t r o l Subjects  Dear Parents:  's school has agreed to participate i n a research project: "Validity o f the Test of Language Competence".  T h e Test of Languge  Competence ( T L C ) is a testing instrument used by speech/language specialists in the V a n c o u v e r School District to appraise children's use of language.  T h e purpose of this  study is to investigate how well the T L C does the job for w h i c h it is intended with secondary school students. F o r this project to be successful, 60 students in the V a n c o u v e r School District w i l l take 3 tests:  the T L C , an ability test, and an i n f o r m a l language sample.  
The language sample is gathered by audiotaping an informal conversation between each student and a graduate student examiner. These conversations centre around non-threatening real-life topics such as "What would you do if you won a million dollars?". The sole purpose of these conversations is to obtain a representative sample of each student's usual language. In addition, parents of participating students will be asked to complete a brief questionnaire.

The researcher seeks to determine if the TLC is an effective measure of language functioning, and also how it compares to another commonly used procedure, the language sample analysis.

The research project is being undertaken as a master's thesis in the Department of Educational Psychology at the University of British Columbia. It has been approved by the Vancouver School Board's Student Assessment and Research office, by the Principal of your son's/daughter's school, and by Faculty research specialists at the University.

__________'s name has been chosen as a possible participant in this research. If you and your son/daughter agree to participate, he/she will be asked to take part in two individual testing sessions of approximately 60 minutes and 2 hours respectively. A trained graduate student will do the testing in the school. Such tests are common in schools. Students usually find them interesting and enjoyable. Parents will also be asked to complete a brief questionnaire on family background.

The results of the tests will be strictly confidential; your child's name will not appear on the test forms. No individual test results will be released. The purpose is not to test any one child's performance, but rather to evaluate the usefulness of the Test of Language Competence.
Parents interested in receiving a copy of the group results should request this on the consent form.

I wish to emphasize that participation is voluntary. Participation in or withdrawal from the project at any time will not in any way influence your son's/daughter's class standing. I would, however, greatly appreciate your cooperation in this research. Please complete the Parent Consent Form and questionnaire and return it in the envelope provided as soon as possible.

Feel free to contact me for any further information at __________, or __________ for messages.

Thank you.

Sincerely,
C. Ainsworth

Letter to Parents of LLD Subjects

Dear Parents:

__________'s school has agreed to participate in a research project: "Validity of the Test of Language Competence". The Test of Language Competence (TLC) is a testing instrument used by speech/language specialists in the Vancouver School District to appraise children's use of language. The purpose of this study is to investigate how well the TLC does the job for which it is intended with secondary school students.

For this project to be successful, 60 students in the Vancouver School District will take 3 tests: the TLC, an ability test, and an informal language sample. The language sample is gathered by audiotaping an informal conversation between each student and a graduate student examiner. These conversations centre around non-threatening real-life topics such as "What would you do if you won a million dollars?". The sole purpose of these conversations is to obtain a representative sample of each student's usual language. In addition, parents of participating students will be asked to complete a brief questionnaire.
The researcher seeks to determine if the TLC is an effective measure of language functioning, and also how it compares to another commonly used procedure, the language sample analysis.

The research project is being undertaken as a master's thesis in the Department of Educational Psychology at the University of British Columbia. It has been approved by the Vancouver School Board's Student Assessment and Research office, by the Principal of your son's/daughter's school, and by Faculty research specialists at the University.

__________'s name has been chosen as a possible participant in this research. If you and your son/daughter agree to participate, he/she will be asked to take part in one individual testing session of approximately 60 minutes. A trained graduate student will do the testing in the school. Such tests are common in schools. Students usually find them interesting and enjoyable.

In order to avoid re-testing children for whom test data is already available, your permission for the investigator to obtain such existing information is requested. Parents will also be asked to complete a brief questionnaire on family background.

The results of the tests will be strictly confidential; your child's name will not appear on the test forms. No individual test results will be released. The purpose is not to test any one child's performance, but rather to evaluate the usefulness of the Test of Language Competence. Parents interested in receiving a copy of the group results should request this on the consent form.

I wish to emphasize that participation is voluntary. Participation in or withdrawal from the project at any time will not in any way influence your son's/daughter's class standing.
I would, however, greatly appreciate your cooperation in this research. Please complete the Parent Consent Form and the questionnaire and return it to the school as soon as possible.

Feel free to contact me for any further information at __________, or __________ for messages.

Thank you.

Sincerely,
C. Ainsworth

Parent Consent Form (Controls)

VALIDITY STUDY OF THE TEST OF LANGUAGE COMPETENCE
PARENT CONSENT FORM

I am willing / not willing to give my consent for __________'s participation in the research study at __________ school. I am aware that this will involve testing sessions totalling approximately three hours duration. I understand that confidentiality of test results will be maintained and that no individual scores will be released. I also understand that participation in this project is voluntary and may be terminated at any time.

Name __________     Signature __________     Date __________

I would / would not like a copy of the group results to be mailed to:

Parent Consent Form (LLD Group)

VALIDITY STUDY OF THE TEST OF LANGUAGE COMPETENCE
PARENT CONSENT FORM

I am willing / not willing to give my consent for __________'s participation in the research study at __________ school. I am aware that this will involve testing sessions totalling approximately 60 minutes duration. I further consent to the release of test data which has been obtained for my son/daughter to the principal investigator in this research. I understand that confidentiality of test results will be maintained and that no individual scores will be released. I also understand that participation in this project is voluntary and may be terminated at any time.
Name __________     Signature __________     Date __________

I would / would not like a copy of the group results to be mailed to:

Student Consent Form (Controls)

VALIDITY STUDY OF THE TEST OF LANGUAGE COMPETENCE
STUDENT CONSENT FORM

I am willing / not willing to participate in the research study at __________ school. I am aware that this will involve testing sessions totalling about three hours in length. I understand that my test results will be kept confidential. My name will not appear on any of the test papers or in the final report. I also understand that my participation in this project is voluntary, and that I can quit at any time without affecting my school grades.

Name __________     Signature __________     Date __________

Student Consent Form (LLD Group)

VALIDITY STUDY OF THE TEST OF LANGUAGE COMPETENCE
STUDENT CONSENT FORM

I am willing / not willing to participate in the research study at __________ school. I am aware that this will involve a testing session of about an hour's length. I also consent to the release of my test scores that the school has on file to the researcher. I understand that these test scores will be kept confidential, and that my name will not appear on any of the new test papers or on the final report. I also understand that my participation in this project is voluntary, and that I can quit at any time without affecting my school grades.

Name __________     Signature __________     Date __________

VALIDITY STUDY OF THE TEST OF LANGUAGE COMPETENCE
QUESTIONNAIRE

Your assistance in providing the following information would be very helpful in making this a meaningful study:

1. What language do adults speak in the home?
2. What language do children speak in the home?
3. How often do adults speak English in the home?
      always    3/4 of the time    1/2 of the time    1/4 of the time    never
4. How often do children speak English in the home?
      always    3/4 of the time    1/2 of the time    1/4 of the time    never
5. In which area of the city do you live?
      Downtown (west-end)
      Vancouver west of Main Street
      Vancouver east of Main Street and north of 41st Avenue
      Vancouver east of Main Street and south of 41st Avenue
6. What is your son's/daughter's birthdate and age?
      year ____  month ____  day ____     Age: ____
7. How would you describe your son's/daughter's school achievement?
                    Below Average    Average    Above Average
      Reading
      Writing
      Spelling
      Arithmetic
8. Has your son/daughter ever received special assistance for learning difficulties? (yes) (no)
9. QUESTIONS ADDRESSED TO THE MOTHER
   9a. What is your occupation?
   9b. Which category below best describes your completed level of education?
      Less than High School Completion
      High School Completion
      Post-Secondary, no degree
      University or college degree
10. QUESTIONS ADDRESSED TO THE FATHER
   10a. What is your occupation?
   10b. Which category below best describes your completed level of education?
      Less than High School Completion
      High School Completion
      Post-Secondary, no degree
      University or college degree

Thank you for your cooperation.

Appendix B
Item Analysis Data

Summary Item Statistics: Subtest One: Understanding Ambiguous Sentences

                          Correlations          % Distribution
Item   Mean     SD  Subtest  Total Test  VIQ      0      1      3
  1.   1.63   1.21       63          53   44   19.6   39.1   41.3
  2.   1.23   1.30       78          70   63   41.3   26.1   32.6
  3.   1.02   1.20       47          35   25   45.7   30.4   23.9
  4.   1.06   1.23       58          64   60   21.7   37.0   41.3
  5.   1.15   1.13       76          63   58   32.6   43.5   23.9
  6.   1.00   1.09       66          62   55   39.1   41.3   19.6
  7.   1.02    .95       66          52   50   28.3   56.5   15.2
  8.    .78   1.05       61          51   48   52.2   32.6   15.2
  9.    .89   1.14       60          57   47   50.0   30.4   19.6
 10.    .56    .88       51          42   42   60.9   30.4    8.7
 11.    .73   1.06       52          56   40   56.5   28.3   15.2
 12.    .71    .93       47          63   60   50.0   39.1   10.9
 13.    .80    .91       35          54   46   41.3   47.8   10.9

Note. n = 46. Item order difficulty correlation: Rho = .76

Summary Item Statistics: Subtest Two: Making Inferences

                          Correlations          % Distribution
Item   Mean     SD  Subtest  Total Test  VIQ      0      1      3
  1.   2.61    .80       32          22   22    0.0   19.6   80.4
  2.   2.19   1.09       17          25   32    6.5   30.4   63.0
  3.   2.37   1.04       15          35   30    6.5   21.7   71.7
  4.   2.58    .86       54          49   46    2.2   17.4   80.4
  5.   2.13   1.00       28          35   39    0.0   43.5   56.5
  6.   2.04   1.13       58          72   72    8.7   34.8   56.5
  7.   1.84   1.03       55          47   54    2.2   54.3   43.5
  8.   1.69   1.03       43          40   40    4.3   58.7   37.0
  9.   2.17    .99       16          32   22    0.0   41.3   58.7
 10.   1.78   1.05       47          46   32    4.3   54.3   41.3
 11.   1.93   1.10       36          49   37    6.5   43.5   50.0
 12.   1.93   1.10       34          34   34

Note. n = 46. Item order difficulty correlation: Rho = .77

Summary Item Statistics: Subtest Three: Recreating Sentences (Holistic)

                          Correlations          % Distribution
Item   Mean     SD  Subtest  Total Test  VIQ      0      1      3
  1.   2.50    .91       38          42   40    2.2   21.7   76.1
  2.   1.63    .97       21          25   23    2.2   65.2   32.6
  3.   2.13   1.07       40          37   35    4.3   37.0   58.7
  4.   2.09   1.13       43          55   54    8.7   32.6   58.7
  5.   2.10   1.04       33          34   42    2.2   41.3   56.5
  6.   1.83   1.12       64          69   63    8.7   45.7   45.7
  7.   1.52   1.15       59          54   55   17.4   47.8   34.8
  8.   1.56   1.05       18          06  -01    8.7   58.7   32.6
  9.   2.17   1.06       33          44   38    4.3   34.8   60.9
 10.   1.30    .96       56          50   45   13.0   65.2   21.7
 11.   1.61   1.06       39          33   29    8.7   56.5   34.8
 12.   1.41   1.13       28          24   19   19.6   50.0   30.4
 13.   1.89   1.16       60          51   40   10.9   39.1   50.0

Note. n = 46. Item order difficulty correlation: Rho = .53

Summary Item Statistics: Subtest Three: Recreating Sentences (Word Count)

                          Correlations          % Distribution
Item   Mean     SD  Subtest  Total Test  VIQ      0      1      3
  1.   2.74    .68       31          28   29    0.0   13.0   87.0
  2.   2.54    .89       20          19   15    2.2   19.6   78.3
  3.   2.69    .81       34          12   03    4.3    8.7   87.0
  4.   2.24   1.14       37          36   41   10.9   21.7   67.4
  5.   2.67    .87       47          28   27    6.5    6.5   87.0
  6.   2.52   1.00       51          37   30    8.7   10.9   80.4
  7.   2.52   1.07       71          49   35   13.0    4.3   82.6
  8.   2.28   1.13       34          17   08   10.9   19.6   69.6
  9.   1.80   1.20       58          63   62   15.2   37.0   47.8
 10.   2.30   1.20       55          41   32   17.4    8.7   73.9
 11.   2.48   1.09       66          54   42   13.0    6.5   80.4
 12.   2.13   1.29       38          44   31   21.7   10.9   67.4
 13.   2.41   1.08       76          50   32   10.9   13.0   76.1

Note. n = 46. Item order difficulty correlation: Rho = .664

Summary Item Statistics: Subtest Three: Recreating Sentences (Total Test)

                          Correlations                        % Distribution
Item   Mean     SD  Subtest  Total Test  VIQ      0      1      2      3      4      5      6
  1.   5.24   1.21       34          47   47    0.0    2.2    2.2    0.0   28.3    0.0   67.4
  2.   4.17   1.39       33          30   26    2.2    0.0   13.0    0.0   58.7    0.0   26.1
  3.   4.83   1.44       35          34   27    4.3    0.0    0.0    0.0   45.7    0.0   50.0
  4.   4.33   1.94       41          53   56    6.5    0.0   17.4    6.5   19.6    0.0   50.0
  5.   4.78   1.41       33          43   48    2.2    0.0    4.3    4.3   39.1    0.0   50.0
  6.   4.35   1.69       57          68   59    6.5    0.0    6.5    4.3   43.5    0.0   39.1
  7.   4.08   1.81       66          63   54    8.7    4.3    2.2    4.3   47.8    0.0   32.6
  8.   3.85   1.55       13          16   05    8.7    0.0    6.5    2.2   65.2    0.0   17.4
  9.   3.87   1.69       58          70   63    4.3    2.2   17.4   10.9   37.0    0.0   28.3
 10.   3.61   1.68       63          58   49   13.0    2.2    2.2    2.2   67.4    0.0   13.0
 11.   4.04   1.67       55          60   50    8.7    2.2    2.2    2.2   58.7     --   26.1
 12.   3.53   1.96       28          42   32   13.0   10.9    0.0    4.3   50.0     --   21.7
 13.   4.35   1.83       72          62   46    6.5    4.3    4.3    4.3   37.0     --   43.5

Note. n = 46. Item order difficulty correlation: Rho = .65

Summary Item Statistics: Subtest Four: Understanding Metaphoric Expressions

                          Correlations          % Distribution
Item   Mean     SD  Subtest  Total Test  VIQ      0      1      3
  1.   1.80   1.20       37          53   50   15.2   37.0   47.8
  2.   1.22   1.26       69          73   64   39.1   30.4   30.4
  3.   1.26   1.24       67          61   62   34.8   34.8   30.4
  4.   1.67   1.33       50          49   45   28.3   23.9   47.8
  5.   1.37   1.27       49          43   35   32.6   32.6   34.8
  6.   1.15   1.25       74          69   63   41.3   30.4   28.3
  7.   1.37   1.27       63          68   69   32.6   32.6   34.8
  8.   1.22   1.26       48          52   55   39.1   30.4   30.4
  9.   1.15   1.44       76          68   64   58.7    4.3   37.0
 10.    .76   1.23       69          65   58   67.4   10.9   21.7
 11.    .72   1.07       56          57   50   58.7   26.1   15.2
 12.    .85   1.03       76          80   72   45.7   39.1   15.2

Note. n = 46. Item order difficulty correlation: Rho = .80

Appendix C
Stimulus Items Used to Elicit Language Samples

1. What was the last movie you saw? Tell me about it.
2. Describe what you usually do on Saturdays from the time you get up until the time you go to bed.
3. Tell me about the funniest thing that has ever happened at your house.
4. Tell me about your favourite singer/actor.
5. What is your favourite TV show? What happened on that show the last time you saw it?
6. What's the best book/story you've read? Tell me about it.
7. Tell me about your favourite teacher.
8. Tell me how you would spend a million dollars.
9. Tell me about the best time you can remember having with your family or friends.
10. What's the funniest/most embarrassing/exciting thing that's happened to you? Tell me about it.
11. Where do you live? Tell me how to get there from here.
12. Describe your home.
13. Tell me about your hobbies.
14. Describe how your family celebrates Christmas.
15. How will you spend your Easter vacation?

Appendix D
Language Sample Transcripts

Language Sample Transcript: Control Case One

There was this guy and he went around looking for all these jobs. He wouldn't fit into any of them. Finally he went to a department store. Some accident happened. A sign came falling. There was a lady under it. He grabbed on to it and started swinging up in the air.
He liked working with manikans. The lady asked him what kind of job he wanted. He explained to her what he did. She gave him a job. There was these people that did not agree and disliked the person with the job. He finally made this manikan. One day he turned around and she was alive. All these happened. Everyone starts noticing this change in him. They're wondering why he's talking to a manikan. At the end she turns into real life. She stays like that. They showed these scenes right at the start. It was in Asia. She was in a tomb. She wanted a different life but she didn't know how she was gonna get it. That's how it ended out. It was good. I liked it a lot.

There was this family. Mack is old. Someone said that she'd died. All of a sudden he found out that he had a daughter. She came there. A whole bunch of scenes were happening. Life was changing in the house. Everybody was getting into arguments. Then Ann came along. He thought that she was dead. Now she moved into the house next door and his wife doesn't agree with it. They always argue. At the end he went over there and they started dancing. They were having good times. Karen's starting to get thoughts about what's going on between them again. She's married to a man named Ben now. She had twins. She found out they were Gary's. They weren't Ben's. Ben doesn't approve of it. Val and Gary are good friends. While she was being a teacher she was being a friend too. Girls could go and talk to her.

Language Sample Transcript: Control Case Two

It would be my teacher because she had a farm and we got to go to her farm and feed her goats. It is about teenagers. They went to find a dead body. They went along train tracks. When they found it there was this older group that wanted to find it.
They stook up for it. They never did say who got it. They just brought it. His brother had died. I guess when he saw this dead one he might have thought of him. Actually it was really good. It was about friendship. They always stood by each other. They were trying to figure out what girls were like. He went through puberty or whatever. He saw his babysitter and I guess he liked her. He was trying to get to know her. Her sister came in. He said that he was a college man. He lied. She found out. They got mad. They asked their dad what it was about girls. That is the plot. My dad says I don't need braces. So he won't get them. I guess I would get them. It's got grey carpet. My room has a balcony. We have different rooms. In the middle there is a bathroom. We just walk through doors and we go there. I guess it was Sunday. I went down to Metrotown with Joy. We just had fun. We went to those picture booths and we took a couple of pictures. It was great. Usually we just sit around. My dad makes popcorn. We just watch TV. Yesterday Lesley came over. She said my stairs remind her of Psycho. Me and my sister are in the upstairs. We are the only ones there. My sister was babysitting. Then she left. I was remembering Psycho when I was laying there. It was really scary.

Language Sample Transcript: Control Case Three

He's working at this store. He makes this manikan. Then he gets fired. He sees the manikan in a store. One day he's working on something and she comes alive. Near the end somehow she gets into this machine where she's going to be chopped up. He saves her. At the end everyone else can see her. The son Theo wanted to take flying lessons. The little girl brought home a boyfriend and she was ordering him around and telling him what to do.
He used to stay after school and help you with problems. Spend a lot of time with you. First put it in the bank. When I get older buy a car. Probably move to Hawaii. We went there for my dad's convention. He was mostly at the convention. My mom used to go shopping and me and my sister just go to the beach. Walk around. I went to the Polynesian Cultural Centre, Pearl Harbour, the zoo. We walked around quite a bit. Met some new people there. It was pretty sad when they showed the movie and all the people dying. It's kind of just like a normal boat. Then you go on a memorial. There's a big sign of all the people who died. Some people throw flowers into the water. Some of them are sticking out from the water. In the evenings we's all go out for dinner and then go shopping or see a movie. This mother takes her two kids to live with her parents but her parents don't know that she has the kids. She leaves them locked away in an attic. She leaves them there for a really long time. She used to come and visit. She got married to someone else. The kids would get really bored. One of the kids died because it got sick. They took her to the hospital but there wasn't enough time. At the end all the kids sneak out of the window and run away. It was just lying there. One day its mom finally came in. The brother and sister told the mom. Once a long time ago I put my pants on backwards. I had the holes at the back. I came out and we had some guests over. I like their songs and their drummer is really good. I like Sylvester Stallone and Harrison Ford.

Language Sample Transcript: Control Case Four

He was being killed. He asked a friend to help. He must have a revenge on that friend. Now he started to have his revenge but it's not the end yet. It's only one chapter a day. It's a Chinese way. I don't know which one to pick.
It's about a hero that helps people. There's lots of girls liked him. Girls are following him around. He knows Kung Fu. He knows some other friends that knows Kung Fu too. They're all heroes. It was about people finding out what's happening. There's a water flooding. People would go over there and check what's happening. They would go and help them. She's a nice teacher. She gives out candies every week on Friday. He's real funny. He jokes a lot. He doesn't give us hard work. I would buy a new house. Go on field trips with my parents. Buy lots of things. We went to a restaurant to eat dinner. We just go on special days like mother's day or when something comes to Canada. Then we go to a restaurant to eat. Some live in Hong Kong. I don't know. I read all different kinds of books. It's about the mixed-up twins. It's about a twin that gets mixed-up. They're the same. Policeman and friends get all mixed-up with those two. One time when they were lost policemen were trying to find them. When the policeman find one of them the other one ran away. When the policeman find that one again the policeman got all mixed-up. Sometime they do something wrong and the it's real funny. Sometimes they make mistakes. There are Chinese. They're singers and actors. They sing. Their songs are excellent. I go shopping with my mother sometimes. Go to my grandma's house. Do my homework. Play with my friend.

Language Sample Transcript: Control Case Five

The last movie I saw was Star Trek Four. I don't know if you're into science fiction. It was really weird. It was different from all the other ones because all the other ones were really science fiction.
This one is more comical because instead of the group being in their spaceship they were coming back in the twenty-fifth century. There was a great big probe or a spaceship or something. It was terrorizing the earth and was planning on destroying it. It was sending off these messages that only whales could hear. They had to go back to our century to find these whales and bring them back. It was really comical. It was good. It left you in suspense for a fifth part coming out. The last time I saw it was probably last Thursday. He was telling everybody how nobody could fool him with all these practical jokes they were trying to play on him. The whole family made up this really elaborate joke to play on him. He overheard them talking about it. It backfired on them and he got them instead. He was really too smart for them to actually play practical jokes on. My favourite teacher was probably my grade seven teacher. He wasn't old. He was in his forties. He knew where his students stood. He wasn't all old-fashioned. He knew all the terms we used and everything they meant. We could talk to him as if he was just another student. We didn't have to talk to him as if he was a teacher. He did a lot of things with us. When we went places he let us make suggestions of where to go and then he would pick the best places. We went camping two or three times with him. We went one time for a whole week. We missed a whole week of school. He took the whole class on a camping trip. If I won a million dollars I'd probably put it in the bank for a year and let the interest grow. My parents were talking about this just the other day. They were saying that they'd put it in the bank for a year. Then they'd take half of it out and use it for a down payment on a house. They wouldn't take the whole thing out and use it at one time.
Every day there was all sorts of things to do because I met a whole bunch of new friends. We did everything together.

Language Sample Transcript: LLD Case One

I would put it in my bank. Going vacation. I forget. Buy car. Visit my aunt and cousin. I did not watch any movies. We have a part at my house. We celebrate. Everybody came to our house. Put all the dishes in the other side. Wash the other side. Put my clothes together and opened the suitcase. It is big. There is a living room there. They have a kitchen living room and one bedroom. I would say you got the wrong number. I will not give it to them. I will not open the door. I might call the police. I will mail it back. Give it to the teacher. Put a bandaid. Keep looking for the library book. I would give it to the police. I would phone the fire department. Go to the neighborhood. Ask them to phone the police. Who did it. Give it back. Tell them to stop. They are strong. It is cold. They rob something. They kill someone. It is too hard. It is smaller. It is rough. Do not tell anyone. She teach me new things. She help us math. Do you like school? It is fun. It is small. There is a blue creature. Do you want to be a teacher? Do you want to go to college? I ran out of questions. How old is your sister? It is not a doll. It is a stuffed animal. I call them my cute cub. I got dog.

Language Sample Transcript: LLD Case Two

She's kind and helpful. She's mean. Get angry easily. Go around places. Buy a new house. I go to Chinese school. Learn Chinese. Came back at 3:30 and help my dad. We get memorize the words and have dictation. It's a Chinese movie. There's a twin prince. Got mixed-up. They fight the bad guys. Long time I know but forget. Some people going out and found a moon. The moon was dead.
No live there. There's a animal. It was an elephant. It was dead. Then went into another moon. Somebody was deep sleep. The master try to scaped last. The god kill him. A bird flew in the house. My dad go open the door. Scare him out. Clean the tanks. He's a famous singer. Have a Christmas tree and dinner. I always help my father the most. My youngers have all sorts of spare time. Doesn't help. My father's a carpenter. I help him in the roof. We have somewhere chop down trees. My sister fights. They fight with the other small ones. They keep trying to hit the little ones. Get a nice job. I like to be myself. Wouldn't talk to them. Just hang up. Call the police. There's a skytrain to Main Street. Then walk down. Pay for it. Tell somebody. I don't know. It's too far to walk. My feet get tired.

Language Sample Transcript: LLD Case Three

A ninja was fighting this boy. There is a man in the show. The man is a police officer and the kid is a karate guy. The man always catch robbers and the little kid always helps him. The other school doesn't even let us go to the washroom. She does not teach us not to mark it. When you ask her some question she would say go back to your desk. She is nice. She helps people a lot. She does not scream at us. She let us go to the washroom. I will keep it for college. Help the family pay their insurance. My friend invite me to his house and then he ask me to stay over for a night. We had a time. He showed me all his videos. We were playing computer games. We have a tree. We decorate our house. We have lots of friends. We have turkey dinners. We go out for dinner. We invite lots of friends over. They are nice. Bird came inside my house. We had the screen door open. We had a barbecue out. This bird came in. We did not know. Then we went back to the house. We closed the door.
The bird and the cat got trapped. The cat was under my mom's room. The bird was in the flowers. Me and my father were finding something. I was finding something under my mom's room. We saw this cat. Dad was finding something around the plants. Then he found a bird. We opened the door and then we chased it around the house. We did the same thing. They are good. They have good songs. I can't name them all. I forgot. It's pretty interesting.

Language Sample Transcript: LLD Case Four

I'd buy a Lamborghini. I didn't really get it. It had Michael J. Fox in it. Guy has to go back in the future. He has to stop this guy from shooting him. Then he has to go in the future to change his kids. Bring em back. He explains stuff good. There was a snake in their house. Bill Cosby was scared of it. They had a string. They were trying to catch it but I forget what happens. It's football. You have all the equipment. It's tackle football. It's small people playing. There's coaches. They're a rock group. The best song I like is Walk this Way. They want kids to go to school. They don't want gangs around. ACDC is like Highway to Hell. I watch cartoons till 12:00. Watch wrestling till 4:00. I go out and play football. All my weeks are different. I have different things. I call it an arcade cause that's the closes real arcade we have. They get different ones each week. I play 1942. It's a wargame. You shoot down airplanes. I like football more. I don't read books. I read my own books. I make books up. There's this boy. It's the night before Halloween. He's on this island. There's a skeleton. The skull is all set to kill him. An axe is coming towards him. You wake up. You're floating down the river. You go back on this island. This happened now. It's two pages long. I wrote a 37 page book. It's not a book. It's paper.
Language Sample Transcript: LLD Case Five

It's a situation comedy. It's about that family. It's a family. It has a housekeeper. The housekeeper has a daughter. They all live together. The housekeeper is a man. He seems to be always solving the problems. I like comedy shows. His name's Tony. Her daughter is Samantha. She's good in basketball. He wanted her daughter to join basketball. She didn't join it. He was pretty upset. Later she joined the team. She had a boyfriend named Todd. I think they broke up. Last part was she had a new boyfriend. He happened to be around a eighteen-year old boy and she was only about thirteen or fourteen. That was pretty funny. I would give half to my parents. Buy some things I would like. Keep it for my future or maybe for college. She's really nice. I guess we were kinda close together. She was easy to talk to. She was really nice and helpful. I liked her. I go and visit her. I added everything wrong. I thought I was going to get four hundred dollars. My dad goes you added all wrong. He put it on the wall so everybody could see. It was really funny and I was so embarrassed. I was adding one of my cheques. Everybody was laughing at me. They've been my favourite group since grade five. I still like them. They play rock and roll. It's not hard and it's not soft. It's just in the middle. They broke up but now they're back together again. They lost one person. I'm glad they're back together because I like them. She seems to be getting songs that are kind of normal. She seems to be dressing up normal. Most of the times my mom works on the weekends. She's a nurse. I end up cleaning the house.
Appendix E

LARSP Summary Sheets

LARSP Summary Sheet: Control Case One

CONNCTVY  COMMAND  QUESTION  STAGE I  COMM V  Q  V  N  OTHER I  STAGE II  COMM VX  QX  SV 8 SO  AX 4 VO 3 VC OTHER II  D N 15 ADJ N 5 N N PR N 5  SC NEG X  STAGE III  STAGE IV  52 ANALYSED SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 INCOMPLETE 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR STEREOTYPES  PRON-P 49 PRON-O 9 AUX-M 3 AUX-O 14 OTHER III 3  EN  X+O(NP) 3  X+A(AP) 4  COMM VXY LET XY DO XY  Q XY VS?  SVC SVO SVA NEG  VCA VOA 1 VOI OTHER III 1  D ADJ N 2 ADJ ADJ N PR D N 8 COP 15  XY+S(NP) 4  XY+V(VP) 19  XY+C(NP) 4  XY+O(NP) 13  XY+A(AP) 12  COMM +S  QVS(+) QXYZ VS+? TAG  SVOA 5 SVCA 2 SVOI 1 SVOC  AAXY 3 OTHER IV 3  NP PR NP 4 PR D ADJ N C X X CX 1  NEG V 6 NEG X 2 AUX 2 OTHER IV 3  N'T 5 'COP 1 'AUX 6  COORD(1) 7 SUBA(1) 3 CL S CL C 1  COORD(1+) SUBA(1+) CL O 14 COMPARATIVE  POSTM CL1 2 POSTM PHR1+  POSTM CL1+ 1  EST ER LY 2  PASSIVE COMPLEMENT 1  HOW!  NP INIT 3 NP COORD  CMPLX VP 2  STAGE VI  MLU (IN MORPHEMES) = 8.98  ED 53  X+C(NP)  AND 7 CONJ 1 SUB 11  A CONN 3 COMMNT CL EMPH ORDER  PL 13  OTHER II 10  X+V(VP) 4  OTHER CONN  STAGE VII  ING 17  V PART 11 INT X  X+S(NP) 3  VXY+  STAGE V  V V  IT THERE 4  10 17 15 XY  WHAT!  3S 26 GEN

LARSP Summary Sheet: Control Case Two

CONNCTVY  COMMAND  QUESTION  STAGE I  COMM V  Q  V  STAGE II  COMM VX  QX  SV 5 SO SC NEG X  AX VO 3 VC OTHER II  D N 14 ADJ N 2 N N PR N 7  STAGE III  STAGE IV  X+S(NP) 3  X+V(VP) 5  X+C(NP)  X+O(NP) 1  X+A(AP)  COMM VXY LET XY DO XY  Q XY  SVC SVO SVA NEG  VCA VOA.
VOI OTHER III  D ADJ N 2 ADJ ADJ N PR D N 4 COP 15  XY+V(VP) 18  XY+C(NP) 4  XY+O(NP) 16  XY+A(AP) 12  COMM +S  QVS(+) QXYZ VS+? TAG  SVOA 3 SVCA 4 SVOI SVOC  AAXY OTHER IV  NP PR NP 2 PR D ADJ N C X  COORD(1) 5 SUBA(1) 5 CL S CL C  COORD(1+) 1 SUBA(1+) CL O 8 COMPARATIVE  PASSIVE COMPLEMENT 1  HOW!  AND 5 CONJ 1 SUB 0 OTHER CONN  STAGE VI  STAGE VII  VS?  6 28 19 XY  XY+S(NP) 5  VXY+  STAGE V  OTHER I  A CONN 1  IT  COMMNT CL EMPH ORDER  THERE 2  PRON-P 61 PRON-O 4 AUX-M 3 AUX-O 6 OTHER III 5  ING 5 PL 11 ED 43  EN 3 3S 21 GEN  NEG V 2 NEG X 2 AUX 1 OTHER IV  N'T 'COP 'AUX  POSTM CL1 3 POSTM PHR1+  POSTM CL1+  EST ER 1 LY  NP INIT NP COORD  CMPLX VP 2  X CX 2  WHAT!  V V 6 V PART 8 INT X 8 OTHER II 6  MLU (IN MORPHEMES) = 8.18 48 ANALYSED SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 INCOMPLETE 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR STEREOTYPES

LARSP Summary Sheet: Control Case Three

CONNCTVY  COMMAND  QUESTION  STAGE I  COMM V  Q  V  N  OTHER I  STAGE II  COMM VX  QX  SV 4  NEG X  AX 2 VO 1 VC 1 OTHER II  D N 22 ADJ N 5 N N PR N 3  SO SC  STAGE III  STAGE IV  X+S(NP) 1  X+V(VP) 4  X+C(NP)  X+O(NP) 1  X+A(AP) 1  COMM VXY LET XY DO XY  Q XY VS?  SVC SVO SVA NEG  VCA VOA 4 VOI OTHER III 1  D ADJ N 5 ADJ ADJ N 1 PR D N 19 COP 6  XY+S(NP) 8  XY+V(VP) 14  XY+C(NP) 5  XY+O(NP) 12  XY+A(AP) 9  COMM +S  QVS(+) QXYZ VS+? TAG  SVOA 8 SVCA 1 SVOI 1 SVOC  AAXY 11 OTHER IV 4  NP PR NP 1 PR D ADJ N C X  COORD(1) 13 SUBA(1) 6 CL S 1 CL C  COORD(1+) 1 SUBA(1+) CL O 1  PASSIVE 1  HOW!  COMPLEMENT 4  WHAT!
VXY+  STAGE V  AND 10 CONJ 4 SUB 6 OTHER CONN  STAGE VI  STAGE VII  A CONN 2  IT  COMMNT CL  THERE 2  EMPH ORDER  MLU (IN MORPHEMES) = 10.1 50 ANALYSED SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 INCOMPLETE 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR STEREOTYPES  6 11 9 XY  X CX 6 POSTM CL1 3  V V 11  ING 11  V PART 12 INT X 3 OTHER II 6  PL 19 ED 32  PRON-P 47 PRON-O 1 AUX-M 3 AUX-O 8 OTHER III 5  NEG V 3 NEG X 2 AUX OTHER IV 4 POSTM CL1+  POSTM PHR1+  COMPARATIVE NP INIT 6 NP COORD 1  CMPLX VP 4  EN 3S 21 GEN 1  N'T 2 'COP 2 'AUX 4  EST ER 1 LY

LARSP Summary Sheet: Control Case Four

CONNCTVY  COMMAND  QUESTION  STAGE I  COMM V  Q  V  N  OTHER I  STAGE II  COMM VX  QX  SV 6 SO SC NEG X  AX 3 VO 8 VC OTHER II  D N 11 ADJ N 3 N N PR N 6  STAGE III  STAGE IV  X+S(NP) 1  X+V(VP) 9  X+C(NP)  X+O(NP) 5  X+A(AP) 3  COMM VXY LET XY DO XY  Q XY VS?  SVC SVO SVA NEG  10 9 11 XY  VCA VOA 1 VOI OTHER III 2  D ADJ N 4 ADJ ADJ N PR D N 8 COP 17  XY+S(NP) 3  XY+V(VP) 6  XY+C(NP) 8  XY+O(NP) 6  XY+A(AP) 11  COMM +S  QVS(+) QXYZ VS+? TAG  SVOA 5 SVCA 3 SVOI 1 SVOC  AAXY 3 OTHER IV 4  NP PR NP 4  COORD(1) 4 SUBA(1) 6 CL S CL C 1  COORD(1+) SUBA(1+) CL O 4 COMPARATIVE  PASSIVE 1 COMPLEMENT 3  HOW! WHAT!
VXY+  STAGE V  AND 2 CONJ 2 SUB 10 OTHER CONN  STAGE VI  STAGE VII  A CONN 3 COMMNT CL EMPH ORDER  MLU (IN MORPHEMES) = 7.76 52 ANALYSED SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 INCOMPLETE 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR STEREOTYPES  IT THERE 3  V V 1  ING 8  V PART 6 INT X OTHER II 8  PL 22 ED 15  PRON-P 40 PRON-O 9 AUX-M 4 AUX-O 8 OTHER III 5  EN 3S 25 GEN  NEG V 4 NEG X 2 AUX 1 OTHER IV  N'T 3 'COP 14 'AUX 3  POSTM CL1 7 POSTM PHR1+  POSTM CL1+ 2  EST ER LY  NP INIT 4 NP COORD  CMPLX VP 1  PR D ADJ N 2 C X X CX 3

LARSP Summary Sheet: Control Case Five

CONNCTVY  COMMAND  QUESTION  STAGE I  COMM V  Q  V  N  OTHER I  STAGE II  COMM VX  QX  SV 6 SO  AX 1 VO 5 VC OTHER II  D N 9 ADJ N 6 N N PR N 4  SC NEG X  STAGE III  STAGE IV  X+S(NP) 3  X+V(VP) 3  X+C(NP)  X+O(NP) 3  X+A(AP) 1  COMM VXY LET XY DO XY  Q XY VS?  SVC SVO SVA NEG  VCA VOA 4 VOI OTHER III  D ADJ N 12 ADJ ADJ N PR D N 9 COP 19  XY+S(NP) 6  XY+V(VP) 8  XY+C(NP) 8  XY+O(NP) 2  XY+A(AP) 7  COMM +S  QVS(+) 1 QXYZ VS+? TAG  SVOA 10 SVCA 4 SVOI SVOC  AAXY 3 OTHER IV 5  NP PR NP 12 PR D ADJ N 3 C X  COORD(1) 9 SUBA(1) 9 CL S CL C  COORD(1+) SUBA(1+) CL O 3 COMPARATIVE  PASSIVE COMPLEMENT  HOW!  VXY+  STAGE V  AND 8 CONJ SUB 11 OTHER CONN  STAGE VI  STAGE VII  A CONN 2 COMMNT CL EMPH ORDER  MLU (IN MORPHEMES) = 13.21 41 ANALYSED SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 INCOMPLETE 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR STEREOTYPES  IT THERE 2  11 6 5 XY  OTHER II 10  PRON-P 63 PRON-O 8 AUX-M 8 AUX-O 10 OTHER III 5  ING 14 PL 18 ED 48  EN 3S 21 GEN  NEG V 5 NEG X 2 AUX OTHER IV 6  N'T 5 'COP 1 'AUX 3  POSTM CL1 12 POSTM PHR1+  POSTM CL1+ 3  EST 1 ER 1 LY 4  NP INIT 8 NP COORD  CMPLX VP 9  X CX 3  WHAT!
V V 3 V PART 14 INT X

LARSP Summary Sheet: LD Case One

CONNCTVY  COMMAND  QUESTION  STAGE I  COMM V  Q  V  N  OTHER I  STAGE II  COMM VX  QX  SV 3  AX 2 VO 10 VC OTHER II  D N 14 ADJ N 1 N N PR N 2  X+O(NP) ?  X+A(AP) 2  VCA VOA 3 VOI  D ADJ N 5 ADJ ADJ N PR D N 6  OTHER III  COP 14  SO SC NEG X  STAGE III  STAGE IV  X+S(NP)  X+V(VP) 3  X+C(NP)  COMM VXY LET XY DO XY  Q XY 1 VS?  SVC 10 SVO 13  XY+S(NP)  XY+V(VP) 12  XY+C(NP) 4  XY+O(NP) 10  XY+A(AP) 4  COMM +S  QVS(+) 1 QXYZ  SVOA 4 SVCA 2  AAXY OTHER IV  NP PR NP 2  VS+? 3 TAG  SVOI 3 SVOC COORD(1) 1 SUBA(1) CL S CL C  COORD(1+) SUBA(1+) CL O 3 COMPARATIVE  PASSIVE  HOW!  COMPLEMENT  WHAT!  VXY+  STAGE V  AND 1 CONJ SUB OTHER CONN  STAGE VI  STAGE VII  A CONN  IT  COMMNT CL EMPH ORDER  THERE 2  MLU (IN MORPHEMES) = 5.07 54 ANALYSED SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 INCOMPLETE 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR STEREOTYPES  SVA 2 NEG XY  V V 2 V PART 1 INT X OTHER II 2  ING 2 PL 3 ED 7  PRON-P 47  EN  PRON-O 7 AUX-M 8 AUX-O 6 OTHER III 2  3S 12 GEN  NEG V 5 NEG X 2 AUX OTHER IV 1  N'T  POSTM CL1 POSTM PHR1+  POSTM CL1+  EST  NP INIT 1 NP COORD 1  CMPLX VP 1  PR D ADJ N 1 C X X CX 2  'COP 'AUX  ER 1 LY

LARSP Summary Sheet: LD Case Two

CONNCTVY  COMMAND  QUESTION  STAGE I  COMM V  Q  V  N  OTHER I  STAGE II  COMM VX  QX  SV 3  AX 5 VO 12 VC OTHER II  D N 18 ADJ N 3 N N PR N 3  SO SC NEG X  STAGE III  STAGE IV  0 0 0 0  POSTM CL1 POSTM PHR1+  POSTM CL1+  EST 1 ER LY 1  NP INIT 1 NP COORD  CMPLX VP 3  COMM VXY LET XY DO XY  Q XY VS?  SVC SVO SVA NEG  VCA 1 VOA VOI OTHER III 2  D ADJ N 6 ADJ ADJ N PR D N 5 COP 13  XY+S(NP) 8  XY+V(VP) 4  XY+C(NP) 9  XY+O(NP) 6  XY+A(AP) 5  COMM +S  QVS(+) QXYZ VS+?
TAG  SVOA J SVCA SVOI SVOC  AAXY 1 OTHER IV 1  NP PR NP 1 PR D ADJ N C X  COORD(1) 4 SUBA(1) 2 CL S CL C  COORD(1+) SUBA(1+) CL O 1  PASSIVE COMPLEMENT 2  HOW!  IT THERE 3  INCOMPLETE AMBIGUOUS MINOR SOCIAL MINOR STEREOTYPES  N'T 3 'COP 9 'AUX  X+A(AP) 3  COMMNT CL EMPH ORDER  52 ANALYSED SENTENCES  NEG V 3 NEG X 1 2 AUX OTHER IV 1  X+O(NP) 7  A CONN 1  0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT  EN  X+C(NP)  AND 3 CONJ 1 SUB OTHER CONN  MLU (IN MORPHEMES) = 5.36  PRON-P 21 PRON-O 7 AUX-M 4 AUX-O 4 OTHER III 2  X+V(VP) 6  STAGE VI  STAGE VII  ING 2  X+S(NP) 3  VXY+  STAGE V  V V 3 V PART 7 INT X 2 OTHER II 7  13 8 4 XY  X CX 2  COMPARATIVE  WHAT!  PL 9 ED 12  3S 15 GEN

LARSP Summary Sheet: LD Case Three

CONNCTVY  COMMAND  QUESTION  STAGE I  COMM V  Q  V  N  OTHER I  STAGE II  COMM VX  QX  SV 4  NEG X  AX VO 2 VC OTHER II  D N 24 ADJ N 1 N N PR N 1  SO  SC  STAGE III  STAGE IV  X+C(NP)  X+O(NP) 1  X+A(AP)  COMM VXY 1 LET XY DO XY  Q XY VS?  SVC SVO SVA NEG  VCA VOA VOI OTHER III  D ADJ N 3 ADJ ADJ N PR D N 10 COP 9  XY+S(NP) 8  XY+V(VP) 8  XY+C(NP) 3  XY+O(NP) 17  XY+A(AP) 9  COMM +S  QVS(+) QXYZ VS+? TAG  SVOA 11 SVCA 1 SVOI 2 SVOC  AAXY  NP PR NP PR D ADJ N 2 C X X CX 2  NEG V 5 NEG X 1 AUX OTHER IV 1  N'T 1 'COP 'AUX  COORD(1) 4 SUBA(1) CL S CL C  COORD(1+) SUBA(1+) CL O 7 COMPARATIVE  POSTM CL1 POSTM PHR1+  POSTM CL1+  EST ER LY  PASSIVE COMPLEMENT  HOW!  NP INIT 4 NP COORD  CMPLX VP  AND 4 CONJ SUB 1  STAGE VI  A CONN 4 COMMNT CL EMPH ORDER  MLU (IN MORPHEMES) = 7.36 47 ANALYSED SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 INCOMPLETE 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR STEREOTYPES  PL 10 ED 24  X+V(VP) 4  OTHER CONN  STAGE VII  ING 5  V PART 2 INT X 2 OTHER II 7  X+S(NP) J  VXY+  STAGE V  V V  IT THERE 1  6 19 8 XY  OTHER IV 3  WHAT!
PRON-P 46 PRON-O 4 AUX-M 4 AUX-O 9 OTHER III 5  EN 3S 15 GEN

LARSP Summary Sheet: LD Case Four

CONNCTVY  COMMAND  QUESTION  STAGE I  COMM V  Q  V  N  OTHER I  STAGE II  COMM VX  QX  SV 4 SO SC  AX VO 2 VC  NEG X  OTHER II  D N 13 ADJ N 4 N N PR N 4  STAGE III  STAGE IV  X+S(NP)  X+V(VP) 3  X+C(NP)  X+O(NP) 1  X+A(AP)  COMM VXY LET XY DO XY  Q XY VS?  SVC SVO SVA NEG  VCA VOA 2 VOI OTHER III  D ADJ N 3 ADJ ADJ N PR D N 6 COP 8  XY+O(NP) 10  XY+A(AP) 6  COMM +S  QVS(+) QXYZ VS+? TAG  SVOA 6 SVCA 1 SVOI SVOC 1  AAXY 3  NP PR NP 1 PR D ADJ N C X X CX 1  NEG V 4 NEG X 2 AUX  COORD(1) 2 SUBA(1) CL S CL C  COORD(1+) SUBA(1+) CL O 3  POSTM CL1 2 POSTM PHR1+  POSTM CL1+  PASSIVE COMPLEMENT 1  HOW! WHAT!  NP INIT 2  CMPLX VP 2  A CONN 1  IT  COMMNT CL EMPH ORDER  THERE 4  52 ANALYSED SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 INCOMPLETE 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR STEREOTYPES  EN  XY+C(NP) 9  AND CONJ 2 SUB OTHER CONN  MLU (IN MORPHEMES) = 6.23  PRON-P 45 PRON-O 8 AUX-M 2 AUX-O 6 OTHER III 2  XY+V(VP) 13  STAGE VI  STAGE VII  ING 5  XY+S(NP) 5  VXY+  STAGE V  16 17 6 XY  V V 5 V PART 4 INT X 1 OTHER II 4  OTHER IV  3S 23 GEN  N'T 3 'COP 14 'AUX :  OTHER IV 2  COMPARATIVE  NP COORD 1  PL 13 ED 9  EST 2 ER 1 LY

LARSP Summary Sheet: LD Case Five

CONNCTVY  COMMAND  QUESTION  STAGE I  COMM V  Q  V  N  OTHER I  STAGE II  COMM VX  QX  SV 4 SO SC NEG X  AX VO 1 OTHER II  D N 14 ADJ N 4 N N PR N 1  STAGE III  X+S(NP)  X+V(VP) 4  X+C(NP)  X+O(NP) 1  X+A(AP)  COMM VXY  Q XY VS?  SVC SVO SVA NEG  VCA VOA 1 VOI OTHER III 1  D ADJ N 3 ADJ ADJ N PR D N 7 COP 25  LET XY DO XY  STAGE IV  STAGE V  MLU (IN MORPHEMES) = 7.3 53 ANALYSED SENTENCES 0 UNINTELLIGIBLE 0 SYMBOLIC NOISE 0 DEVIANT 0 INCOMPLETE 0 AMBIGUOUS 0 MINOR SOCIAL 0 MINOR STEREOTYPES
PL 9  OTHER II 2  ED 26  PRON-P 60  EN 1  PRON-O 5 AUX-M 5 AUX-O 7 OTHER III 3  3S 31 GEN  XY+V(VP) 9  XY+C(NP) 11  XY+O(NP) 14  XY+A(AP) 7  COMM +S  QVS(+)  SVOA 6 SVCA 2 SVOI SVOC  AAXY 4 OTHER IV  VXY+  QXYZ VS+? TAG  NP PR NP 2 PR D ADJ N 1 C X X CX 4  NEG V 3 NEG X 2 AUX 1 OTHER IV  N'T 1 'COP 12 'AUX 2  COORD(1) 6 SUBA(1) 2 CL S CL C  COORD(1+) SUBA(1+) CL O 7  POSTM CL1 3 POSTM PHR1+  POSTM CL1+  EST ER LY  PASSIVE COMPLEMENT 2  HOW!  NP INIT 1 NP COORD  CMPLX VP 2  AND 3 CONJ 3 SUB 2 OTHER CONN  A CONN 1 COMMNT CL EMPH ORDER  16 23 9 XY  ING 7  V PART 5 INT X 6  XY+S(NP) 8  STAGE VI  STAGE VII  VC  V V 2  IT THERE  COMPARATIVE  WHAT!

Appendix F

Computational Procedure for Wilcoxon Rank Sum Tests

1. Proceeding from smallest to largest, ranks were assigned to each case in both groups.

2. When ties occurred, each case was assigned the average of the ranks it would occupy if no ties had occurred.

3. The sum of ranks ($R_j$) was calculated for each group at phrase and clause level for Stages 2, 3, 4, and 5.

4. $\bar{R}$ was calculated for both groups:

   Mean $= \bar{R} = \dfrac{N_j (N_1 + N_2 + 1)}{2}$

5. $R_j$ for each group was compared to $\bar{R}$. If less than $\bar{R}$, $R_j$ was compared to the critical values required for significance. The critical lower tail values of $R_j$ for 5 and 5 cases are 19 ($\alpha = .05$) and 16 ($\alpha = .01$).

6. If $R_j$ exceeded $\bar{R}$, the corresponding lower tail value was obtained as follows: $2\bar{R} - R_j$. This result was then compared to the critical lower tail values indicated above.
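The six steps above can be sketched in code. The following Python is a minimal illustration under stated assumptions, not the procedure as actually run in the study: the function names (`average_ranks`, `lower_tail_rank_sum`, `strictest_alpha`) and the sample scores are hypothetical, and the rule that a lower tail value at or below the critical value counts as significant is assumed from the usual way such tables are read. Only the critical values 19 and 16 for two groups of five cases come from the text.

```python
# Critical lower-tail rank sums for two groups of 5 cases, as quoted in step 5.
CRITICAL_LOWER_TAIL = {0.05: 19, 0.01: 16}


def average_ranks(values):
    """Steps 1-2: rank from smallest to largest; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = ((start + 1) + (end + 1)) / 2  # mean of the 1-based ranks in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def lower_tail_rank_sum(group, other):
    """Steps 3-6: rank sum for `group`, reflected about its mean if it exceeds it."""
    ranks = average_ranks(list(group) + list(other))
    n_group, n_other = len(group), len(other)
    rank_sum = sum(ranks[:n_group])                    # step 3: R_j
    mean = n_group * (n_group + n_other + 1) / 2       # step 4: expected rank sum
    # Step 6: if R_j exceeds the mean, use 2 * mean - R_j instead.
    return rank_sum if rank_sum <= mean else 2 * mean - rank_sum


def strictest_alpha(lower_tail_value):
    """Step 5 (assumed reading): smallest alpha whose critical value is not exceeded."""
    met = [alpha for alpha, critical in CRITICAL_LOWER_TAIL.items()
           if lower_tail_value <= critical]
    return min(met) if met else None


if __name__ == "__main__":
    # Hypothetical scores for illustration only; these are not data from the study.
    ld_scores = [3, 5, 5, 7, 9]
    control_scores = [5, 8, 10, 11, 12]
    value = lower_tail_rank_sum(ld_scores, control_scores)
    print(value, strictest_alpha(value))
```

Because step 6 reflects a large rank sum about its mean, calling `lower_tail_rank_sum` with the groups in either order gives the same lower tail value when the groups are the same size, which is why one set of critical values serves for both groups.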
