Open Collections

UBC Theses and Dissertations

Computer generated corpus and lexical analysis of English language instructional materials prescribed… Edwards, Peter 1974

A COMPUTER GENERATED CORPUS AND LEXICAL ANALYSIS OF ENGLISH LANGUAGE INSTRUCTIONAL MATERIALS PRESCRIBED FOR USE IN BRITISH COLUMBIA JUNIOR SECONDARY GRADES

by

PETER EDWARDS
B.A., University of Western Australia, 1963
B.Ed., University of Western Australia, 1967
M.A., University of British Columbia, 1972

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF EDUCATION in the Department of READING EDUCATION, FACULTY OF EDUCATION

We accept this thesis as conforming to the required standard

Adviser
External Examiner

THE UNIVERSITY OF BRITISH COLUMBIA
October, 1974

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.
Department of Faculty of Education, The University of British Columbia, Vancouver 8, Canada

ABSTRACT

The major purpose of the study was to capture a representative sample of natural language from the textbooks prescribed for use in the junior secondary curriculum for British Columbia schools, organize the sample for computer processing through the development of needed programs, develop a lexical analysis and describe the word and sentence characteristics of the samples organized by grades, subjects across grades, subjects within grades and textbook corpora. A number of hypotheses related to the distribution of frequently occurring words and a sub-set of representative sentence lengths across the corpora were then tested and a model was developed to aid in selecting lexically significant vocabulary from word lists based on samples from subject area textbooks.

A stratified sampling model, applied to thirty-seven textbooks from seven subject areas, produced a Corpus of approximately a quarter million running words of natural language text based on 469 samples of approximately 500 words each. The results of the lexical analysis indicated that Grade 9 makes significantly greater reading demands in terms of volume of material (tokens) and vocabulary (word-types) than either Grades 8 or 10. Considerable diversity was exhibited in type and token distribution by grades, subjects, and textbooks but no apparent pattern emerged.
However, use of Yule's K characteristic to determine the repeat rate frequency of word-types across the various corpora revealed great variation in redundancy of word-types, with the most striking differences exhibited in the samples from English textbooks and to some extent those from Home Economics and Commerce. Similar results were obtained in applying Yule's K as a measure of the repeat rate frequency for sentence lengths. Samples from English textbooks, again, exhibited exceptional variability in sentence length variety. These results were further substantiated by the analysis of other measures of variability based on computation of standard deviations, coefficients of variation, Pearson's skew factor and, to a lesser degree, the average number of sentences per 500 word sample. In all instances, organization of the samples by gross grade groupings tended to mask the real inherent variability of the samples organized by subjects and textbooks.

Chi-square analyses of word and sentence distribution further substantiated the inherent variability revealed by the lexical analysis. Little uniformity was exhibited in the distribution of the most frequently occurring words in English and a representative sub-set of sentence lengths with the samples organized by grade levels, subjects across grades and subjects within grades. Grouping by gross grade level again masked subject variations.
The style and content characteristics of the print materials prescribed for use in the separate subject areas are therefore significantly instrumental in affecting the frequency of occurrence of even the most common words in English and a representative sub-set of sentence lengths.

Further analysis of the word lists produced in the study substantiated the utility of developing an elimination technique, based on omission of the most frequently occurring words and the relatively rare words, to identify the significant vocabulary from word lists based on samples from texts in subject areas.

The major conclusion of the study suggests that the print materials prescribed for use in junior secondary grades exhibit marked variability when examined on even the most straightforward of linguistic characteristics such as word and sentence frequency. It is suggested that this variability would be even more pronounced if analyses were developed based on other syntactic and semantic variables. The expertise of the subject area specialist and the reading specialist should be combined in developing instruction to maximize learning from print materials. Such instruction would best be based on materials organized by subjects across grades and by separate subjects within grades rather than on materials organized by gross grade groupings.

ACKNOWLEDGEMENTS

I wish to express my sincere gratitude to the following people for their role in the completion of this dissertation.
Dr. E. G. Summers, my adviser and thesis supervisor, for his help in initially defining the problem, and for his guidance, counsel, patience, and encouragement during the preparation of the dissertation.

The members of my thesis committee, Dr. E. Bentley, Dr. J. Catterson, Dr. Br. L. Courtney, and Dr. D. Pratt, for their advice, support, and many fruitful suggestions made throughout the course of the study.

Dr. J. Bormuth of the University of Chicago, for the time he spent examining the dissertation and the helpful suggestions he made.

Mr. J. Coulthard, Mr. A. Miller, Inger Nissen and Irene Amiraslany of the Computing Centre, for their help and advice during the computer programming and key-punching of the material used in the study. I would especially like to thank Allan Miller for his enthusiasm and expertise in developing the numerous computer techniques and programs required during the study.

Finally, I would like to thank Dr. E. N. Ellis of the Vancouver School Board, for his help and kindness in obtaining the textbooks used in the dissertation.

TABLE OF CONTENTS

CHAPTER

I THE PROBLEM
    BACKGROUND OF THE PROBLEM
    STATEMENT OF THE PROBLEM AND PURPOSE OF THE STUDY
    TASKS, QUESTIONS AND HYPOTHESES
    SIGNIFICANCE OF THE STUDY
    DEFINITION OF TERMS
    LIMITATIONS
    OVERVIEW OF THE STUDY

II REVIEW OF THE LITERATURE AND CONCEPTUAL FRAMEWORK
    INTRODUCTION
    WORD LISTS AND THEIR ROLE IN READING RESEARCH
        The Development of Word Lists
        Word Lists and Content Materials
        Word Lists and Readability
    COMPUTER TECHNOLOGY IN LANGUAGE RESEARCH
    RESEARCH INTO THE READABILITY OF INSTRUCTIONAL MATERIALS
        Readability Formulas
            Important language variables
            Recent readability formulas
            New trends in research
        The Cloze Procedure
            A description of Cloze
            Important linguistic variables
            Cloze and readability
    DETERMINING SIGNIFICANT CONTENT MATERIAL
    SUMMARY

III THE RESEARCH DESIGN
    THE PILOT STUDY
    TASK 1. SELECTION OF MATERIALS
        Sampling Procedures
    TASK 2. INPUT PROCESSING, KEY PUNCHING AND EDITING
        Text Corrections
    TASK 3. PRODUCTION OF THE CORPUS
    TASK 4. PRODUCTION OF WORD LISTS
    TASK 5. DESCRIPTION AND ANALYSIS OF LEXICAL CHARACTERISTICS
    TASK 6. DESCRIPTION AND ANALYSIS OF SENTENCE CHARACTERISTICS
    TASK 7. ANALYSIS OF DISTRIBUTION OF 100 MOST FREQUENT WORD-TYPES
    TASK 8. ANALYSIS OF DISTRIBUTION OF SELECTED SENTENCE LENGTHS
    TASK 9. IDENTIFICATION OF SIGNIFICANT CONTENT MATERIAL

IV ANALYSIS OF THE DATA AND FINDINGS
    TASKS, QUESTIONS AND HYPOTHESES
        Task 1
        Task 2
        Task 3
        Task 4
            4.1
            4.2
        Task 5
            5.1
            5.2
        Task 6
            6.1
            6.2
            6.3
        Task 7
            7.1
            7.2
        Task 8
        Task 9
            9.1
            9.2
            9.3
    SUMMARY
        Computers and Language Analysis

V DISCUSSION, CONCLUSIONS, AND RECOMMENDATIONS
    DISCUSSION OF MAIN FINDINGS
        Tasks 1 and 2: Sampling and Processing Procedures
        Task 3: Production of the Corpus
        Task 4: Production of Word Lists
        Task 5: Lexical Characteristics
        Task 6: Sentence Characteristics
        Task 7: Common Words
        Task 8: Selected Sentence Lengths
        Task 9: Elimination Technique
    CONCLUSIONS
    RECOMMENDATIONS

BIBLIOGRAPHY

APPENDIXES

LIST OF TABLES

TABLE

I A SUMMARY OF WORD LISTS: 1921-1972
II THE TWENTY-ONE, 500 WORD SAMPLES USED IN THE PILOT STUDY
III NUMBER OF TEXTS AND SAMPLES FOR EACH GRADE LEVEL AND SUBJECT AREA
IV NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR GRADE LEVELS AND THE CORPUS
V NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS ACROSS GRADE LEVELS
VI NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF GRADE 8
VII NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF GRADE 9
VIII NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF GRADE 10
IX NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF EACH GRADE LEVEL OF THE CORPUS
X NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE THIRTY-SEVEN TEXTS
XXI MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE SUBJECT AREAS OF GRADE 8
XXII MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE SUBJECT AREAS OF GRADE 9
XXIII MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE SUBJECT AREAS OF GRADE 10
XXIV MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE THIRTY-SEVEN TEXTS
XXV PEARSON'S SKEW FACTOR FOR EACH GRADE LEVEL, THE CORPUS, AND SUBJECTS ACROSS THE CORPUS
XXVI PEARSON'S SKEW FACTOR FOR SUBJECTS IN EACH GRADE LEVEL
XXVII PEARSON'S SKEW FACTOR FOR EACH TEXT
XXVIII K FACTORS (SENTENCES) FOR EACH GRADE LEVEL, THE CORPUS, AND SUBJECTS ACROSS THE CORPUS
XXIX K FACTORS (SENTENCES) FOR SUBJECTS WITHIN GRADE LEVELS
XXX K FACTORS (SENTENCES) FOR EACH TEXT
XXXI CHI-SQUARE ANALYSIS OF THE 100 MOST FREQUENT WORD-TYPES IN THE CORPUS ACROSS GRADES, SUBJECTS, AND SUBJECTS WITHIN GRADES
XXXII SUMMARY OF CHI-SQUARE ANALYSIS OF 100 MOST FREQUENT WORD-TYPES IN THE CORPUS ACROSS GRADES, SUBJECTS, AND SUBJECTS WITHIN GRADES
XXXIII CHI-SQUARE ANALYSIS OF SELECTED SENTENCE LENGTHS FOR THE GRADES, SUBJECTS ACROSS GRADES, AND SUBJECTS WITHIN GRADES
XXXIV NUMBER AND PERCENTAGE OF WORD-TYPES ELIMINATED BY POINT A (50% CUTOFF OF TOKENS) AND POINT B (10% CUTOFF OF TOKENS)
XXXV NUMBER AND PERCENTAGE OF WORD-TYPES BETWEEN POINT A AND POINT B (40% OF TOKENS) FOR THE CORPUS, GRADES, AND SUBJECTS ACROSS GRADES
XXXVI DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD-TYPES ACROSS THE GRADE LEVELS OF THE CORPUS
XXXVII DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD-TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS
XXXVIII DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD-TYPES ACROSS THE SUBJECT AREAS OF GRADE 8
XXXIX DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD-TYPES ACROSS THE SUBJECT AREAS OF GRADE 9
XXXX DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD-TYPES ACROSS THE SUBJECT AREAS OF GRADE 10
XXXXI DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE GRADE LEVELS OF THE CORPUS
XXXXII DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE SUBJECT AREAS OF THE CORPUS
XXXXIII DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE SUBJECT AREAS OF GRADE 8
XXXXIV DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE SUBJECT AREAS OF GRADE 9
XXXXV DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE SUBJECT AREAS OF GRADE 10

LIST OF FIGURES

FIGURE

1. PRODUCTION OF VOLUMES C.G. AND C.S. OF THE CORPUS
2. PRODUCTION OF WORD LISTS: VOLUMES C.V., G.V., S.V., S.G.V., AND T.V.
3. MODEL OF A WORD FREQUENCY DIAGRAM
4. APPLICATION OF "ELIMINATION TECHNIQUE" TO THE MODEL OF A WORD FREQUENCY DIAGRAM
5. WORD FREQUENCY DIAGRAM OF THE CORPUS
6. APPLICATION OF THE "ELIMINATION TECHNIQUE" TO THE WORD FREQUENCY DIAGRAM OF THE CORPUS
7. GRAPHS OF SENTENCE LENGTH DISTRIBUTION (7.1 TO 7.66)
8. WORD FREQUENCY DIAGRAMS (8.1 TO 8.11)

LIST OF APPENDIXES

APPENDIX

A INDEX OF TEXTS AND SAMPLES BY GRADE LEVEL
B SAMPLE SIZES IN ALPHABETICAL ORDER AND ASCENDING RANK
C COMPUTER FILES AND PROGRAMS USED IN THE STUDY
D ALPHABETICAL LISTING OF CORPUS VOCABULARY (SAMPLE)
E RANK LISTING OF CORPUS VOCABULARY (SAMPLE)
F ASCENDING AND DESCENDING ORDER OF CORPUS VOCABULARY (SAMPLES)
G SENTENCE LENGTH DISTRIBUTION OF THE CORPUS (SAMPLE)
H GRAPHS OF SENTENCE LENGTH DISTRIBUTION
I CHI-SQUARE RESULTS OF DISTRIBUTION OF 100 MOST COMMON WORD-TYPES
J CHI-SQUARE RESULTS OF DISTRIBUTION OF SELECTED SENTENCE LENGTHS
K WORD FREQUENCY DIAGRAMS (GRAPHS)

CHAPTER I

THE PROBLEM

BACKGROUND OF THE PROBLEM

Research into the various constituents of the reading act suggests that development of reading proficiency is a cumulative, life-long process requiring continued learning and refinement. Few would argue with the statement that, "The ability to read well constitutes one of the most valuable skills a person can acquire. Our world is a reading world" (Bond and Tinker, 1967). The use of written language gives man a permanent, external memory in striking contrast to the more ephemeral spoken word. In fact, reading instruction could be defined as the socially planned, guided or aided establishment of competency in dealing with the external print memory system of man.

Unlike certain characteristics that are genetic in origin, reading proficiency is generally an acquired skill which must be relearned by each generation. Thus, developing skill in understanding printed language has been a basic objective in man's educative processes since early recorded history (Dodds, 1967). Einstein suggested that the ability to read was mankind's most amazing feat. In an address presented to the 1972 convention of the International Reading Association, Jerome Bruner (1972) noted that the capacity to read and extract meaning from the printed page may represent the ultimate stage in the evolution of homo sapiens.
Comprehending the printed page is the result of complex interaction between physiological and psychological processes still but vaguely understood. With the increased demand for literacy as technology and information have increased, reading instruction, once an exclusive concern of the elementary school, has become an important area of study in secondary schools. Evidence of its importance is readily available from a variety of sources.

An information base of research on reading has existed for seventy-five years. Such research has explored topics related to: sequencing and developing reading instruction at all levels, the process of reading, the products or skills of reading, language development as it relates to reading, the pedagogy of reading, and the special problems of the disabled reader (Robinson et al, 1967; Summers et al, 1968, 1967, 1968). Roughly 40 percent of the reported research relates to reading beyond the elementary level.

A significant trend in secondary reading instruction in recent years has been the attention given to more systematic development of reading abilities including the organization of special reading programs, increased emphasis on reading as it relates to subject classes, and provision of special services for students with serious reading problems (Davis, 1952; Robinson et al, 1960; Summers, 1963, 1964, 1966, 1967; Artley, 1968, 1970; Farr et al, 1970; Hill and Bartin, 1971).
In addition, the volume of interest is evidenced by the over fifty sources cited in a recent bibliographical guide, indexing information sources for secondary reading (Summers et al, 1973). In recent years, there has also been a massive outpouring of offerings from over 200 North American publishers developing instructional materials specifically designed for student use in reading instruction programs ranging from preschool through college and adult levels.

The trend towards more interest in secondary reading has some old and some new facets. A facet that reflects a reasonably old interest is that of focussing on the linguistic analysis of print material. The most notable and extensive linguistic analyses which have generated word lists and results applicable to instructional settings and school materials have been those reported by Thorndike and Lorge (1944), Rinsland (1945), Carroll et al (1971), and the study by Harris and Jacobson (1972). Although based on a sampling of more general adult materials, the project reported by Kucera and Francis (1967) represented the first study of a massive million word, computer-based corpus which generated results based on word and sentence lengths. The techniques employed are relevant to any analysis of school based instructional materials.

In recent research, computer technology has provided an important tool for analyzing transformed natural language text, organizing corpora, developing word frequency counts, and more significantly, enabling the analysis and comparison of masses of data across numerous sub-components within a corpus of materials. Computerized text data bases facilitate the use of varied statistical analyses and make it possible to study the linguistic characteristics of sizable bodies of written language. The advantages of computer technology in linguistic research are aptly illustrated by Kucera (1969).

Since any useful analysis of language usage has to be based on a large body of textual material, even elementary information could be obtained, before the advent of computers, only with enormous labor. Let us imagine that one wished to determine some very basic lexical properties of a textual corpus containing a million running words. If this were to be done by hand (or, more accurately, by the human brain), the task would require an inordinate amount of time; each of the one million words would have to be inspected individually, and each new word recorded after first checking to make sure it had not already been noted. If the analysis were also to preserve information about the frequency of occurrence of individual words, or perhaps references to the pages or lines of the text where their occurrences were to be found, the assignment would become more formidable still. . . .
Linguists and lexicographers alike have found in the computer a new and useful tool that has not only made the analysis of language less time-consuming but has also opened new insights into important problems in language usage.

New avenues to research have also been opened by an upsurge in interest in the linguistics of written language and the readability of instructional materials. There is great interest in frequency counts and item co-occurrence, positional criteria based on the placement of items within a text, syntactic criteria based on structural relationships between items, and semantic criteria depending on the particular area of discourse and on the larger context within which a given text is placed. As Robinson (1971) pointed out: "We must study our language before we generate approaches to reading instruction. . . . we need to learn more about the patterns of specific letters in words, sentence patterns, and the overall organizational patterns of our language."

In a recent article, Jenkinson (1970) analyzed current information gaps related to research on reading comprehension, and outlined vital areas needing further study including "problems inherent within the materials." Jenkinson concluded her discussion by stressing the need for "further linguistic analysis of the language of textbooks..."
The work of Smith (1964) was a major attempt to subjectively identify patterns of writing in different subject areas and to relate necessary reading skills to these patterns. The analysis included reading and study skills which were common to texts in Literature, Social Studies, Science and Mathematics in Grades 7 through 12.

If the instructional materials selected for use in a school program reflect society's communication with students through the language of print, then it becomes increasingly important to ask, "What are the linguistic characteristics of the print sources prescribed for use in Canadian secondary schools?" Answers to this query may well form the basis for instruction that is better adjusted to the real reading ability and needs of secondary students.

This question provides the basis for the present study which was undertaken as a contribution to research reflecting old and new trends in two main areas: 1) the analysis of certain linguistic characteristics of a sample of natural language text which forms the basis of an existing secondary school curriculum, and 2) the application of computer techniques to the analysis of natural language text.

STATEMENT OF THE PROBLEM AND PURPOSE OF THE STUDY

Numerous studies reported in the literature of linguistics and education illustrate attempts to describe and compare the language content of a body of print materials.
The basic problems in this study were to describe and compare certain linguistic features across and within the grades and subject areas comprising a sample of printed instructional materials prescribed for use in the various subject areas of Grades 8, 9 and 10 in British Columbia through the development of a model utilizing computer technology. The specific purposes were to generate a natural language corpus and to make various linguistic analyses (involving word frequency and sentence lengths) and comparisons of the total corpus and its sub-components by applying the power of computer storage and programming techniques.

The ideal study in describing the linguistic characteristics of print encountered by Grade 8, 9 and 10 students would draw samples from all possible print sources the student comes in contact with, including regular textbooks, supplementary sources, reference materials, and perhaps even samples of student written and spoken language. However, this study concentrates on a single print component and analyzes a limited number of explicitly defined, carefully selected, readily available language samples from the text materials that students are most likely to encounter during their junior secondary school years in subject classes.
The print materials used for analysis in this study consisted of approximately a quarter million running words of natural language systematically sampled from texts prescribed for use in subjects in Grades 8, 9 and 10 in British Columbia schools.

TASKS, QUESTIONS AND HYPOTHESES

A complete description of instructional materials would be built around clearly isolated linguistic variables identified in a fully developed theory of language comprehension. However, such a theory has yet to be developed. Innumerable variables relate to comprehensibility of printed materials. In a recent seminal study, Bormuth (1969) analyzed factors that relate to the comprehension of print materials and identified 169 linguistic variables that correlate with comprehensibility and readability. The state-of-the-art is such that it is not yet possible to explicate a complete theoretical account of the comprehension process, determine the linguistic variables in print materials which correlate most closely with comprehension difficulty, and develop pedagogical procedures and instructional materials that are consistently predictive in generating high level student comprehension of print materials. The development of a fully explicated, scientific theory of language comprehension will no doubt emerge gradually.
Until a fully developed theory can be used to generate studies, research based on straightforward manipulable variables that evidence indicates are related to the comprehension process can provide important insights into the factors in instructional materials which may contribute to their diverse comprehension demands. Such descriptive and comparative research can produce results which may influence pedagogy and increase effectiveness in teaching students to acquire knowledge from written instructional materials.

In this study, the focus is on describing and comparing word and sentence characteristics of instructional materials prescribed for use in seven secondary subject areas in British Columbia schools. The descriptive and comparative analyses are framed within a stratification model allowing data to be organized to answer questions and test hypotheses across and within the total sample based on: grade levels (Grades 8, 9, 10), subject areas (seven subjects), subjects within grades (eighteen subjects), and textbooks (thirty-seven prescribed texts).

Word-types are identified and features such as graphic characters and relative frequency of occurrence of individual words indicated. The repeat rate frequency of words is indicated with frequency counts and comparisons based on occurrence of word characteristics of the printed materials.
Sentence length characteristics are described and compared across the grade levels and subject areas. Finally, a decision theory, or model for an elimination technique, is proposed as an aid in identifying the most significant content vocabulary in word lists derived from samples based on subject area texts.

The study is organized into nine tasks. Tasks 1 to 4 were designed to organize the input data into a total Corpus and sixty-five other corpora and develop necessary word lists and summary tables. Tasks 5 and 6 were designed to produce the descriptive and comparative statistics for the Corpus and the various corpora. Tasks 7, 8, and 9 were developed to produce analyses of selected linguistic features of the Corpus and the corpora. The nine tasks with their related questions and hypotheses follow.

Task 1. Develop a Corpus to represent natural language text based on instructional materials prescribed for use in the subject areas of British Columbia junior secondary grades.

Task 2. Organize the Corpus of materials for computer input and manipulation.

Task 3. Generate two volumes of the Corpus: one organized by grade levels and one organized by subject areas, each with a descriptive index.

Task 4. Organize the samples into word lists for the Corpus, the grade corpora (3), the subject corpora (7), the subject within grade corpora (18), and the textbook corpora (37).

4.1 For each of the above, provide an alphabetical and a rank order (descending frequency) listing of word-types to give the following information.
4.11 The frequency of occurrence of each word-type.

4.12 The cumulative percentage frequency of each word-type.

4.13 The relative frequency of occurrence of each word-type per 1,000 tokens.

4.14 The descriptive statistics for the rank order lists of the Corpus and corpora including: X, FX, SUM FX, FX * X, SUM FX * X, CUM % FX * X. (A full explanation of these terms is given in Chapter III.)

4.2 Construct two summary tables for each of the sixty-six word lists, indicating the word frequency figures in descending order (highest frequency first) and in ascending order (hapax legomena first).

Task 5. Generate comparative and statistical analyses based on the lexical characteristics of the Corpus and the corpora and data produced in Tasks 1 through 4.

5.1 What are the lexical characteristics of the Corpus; the Grade 8, 9 and 10 corpora; each of the seven subject area corpora across Grades 8, 9 and 10; each of the corpora for subjects within Grades 8, 9 and 10; and each of the thirty-seven textbook corpora in terms of: total number of graphic characters, average number of graphic characters, tokens, and discrete word-types?

5.2 What are the characteristics, in terms of repeat-rate frequency of words (Yule's K), for the Corpus and corpora defined in 5.1 above?

Task 6. Generate comparative and statistical analyses based on sentences and sentence lengths for the Corpus, the corpora, and the data produced in Tasks 1 through 4.
6.1 What are the sentence-length characteristics of the Corpus; the Grade 8, 9 and 10 corpora; each of the seven subject area corpora across Grades 8, 9 and 10; each of the corpora for subjects within Grades 8, 9 and 10; and each of the thirty-seven textbook corpora in terms of: average number of sentences; mean, median and modal sentence length in words; standard deviation, coefficient of variation, and Pearson's skew factor for sentence lengths?

6.2 Produce a set of graphs to illustrate each of the sixty-six sentence length distributions for the Corpus and corpora defined in 6.1 above.

6.3 What are the characteristics, in terms of repeat-rate frequency of sentence-lengths (Yule's K), for the Corpus and corpora defined in 6.1 above?

Task 7. Generate comparative and statistical analyses of the distribution of the 100 most frequently occurring word-types of the Corpus across the three grade levels, the seven subject areas, and the eighteen subject areas within the three grade levels.

7.1 Test the following null hypotheses.

Hypothesis 1. There are no significant differences in the actual distribution of the 100 most frequent word-types of the Corpus when compared to the expected distribution of each word-type for:

Hypothesis 1.1 the three grade levels of the Corpus,
Hypothesis 1.2 the seven subject areas of the Corpus,
Hypothesis 1.3 the subject areas within Grade 8,
Hypothesis 1.4 the subject areas within Grade 9,
Hypothesis 1.5 the subject areas within Grade 10.
7.2 Investigate and describe the number of word-types which differ significantly in their distribution across each of the areas tested in 7.1.

Task 8. Do the sentence length distributions of the seven subject areas differ from the sentence length distribution of the Corpus? This task involves testing the following null hypotheses.

Hypothesis 2. There are no significant differences in the actual distribution of short, average, and long sentences when compared to the expected distribution of each of the sentence lengths for:

Hypothesis 2.1 the three grade levels of the Corpus,
Hypothesis 2.2 the seven subject areas of the Corpus,
Hypothesis 2.3 the subject area corpora within Grade 8,
Hypothesis 2.4 the subject area corpora within Grade 9,
Hypothesis 2.5 the subject area corpora within Grade 10.

Task 9. Develop an "elimination technique" for selecting the most significant content words in a word list using the ranked frequency lists developed for the Corpus, the three grade level corpora, and the seven subject area corpora.

9.1 Produce a set of graphs to illustrate the word frequency by rank of the Corpus, the three grade level corpora, and the seven subject area corpora.

9.2 What is the effect of eliminating the highest frequency words and the lowest frequency words from the total spectrum of words for each of the corpora stated in 9.1?
9.3 Can the residual of words remaining after eliminating the high and low frequency words described in 9.2 serve as a pool for selecting the most useful content words for the Corpus, the three grade level corpora, and the seven subject area corpora through analyses based on relative frequency of occurrence and subjective criteria?

SIGNIFICANCE OF THE STUDY

The linguistic components of written discourse influence the comprehension difficulty of printed material. For example, longer sentences generally complicate the arrangement of words and make greater demands upon memory in reading than do shorter sentences. Redundancy in terms of word and sentence repetition is also considered to influence the reader in deriving information. Furthermore, comparisons of the relative frequency of occurrence of discrete word-types in different types of discourse have significance for learning and teaching, since the more a word is used, the greater the probability the reader has had an opportunity to come in contact with it. The vocabulary loading of material, in terms of lexical and structural word-types, is another factor that influences comprehension. Even word length, measured both phonologically and graphologically in syllables or letters, can be predictive of comprehension difficulty. Increasingly, recent language research related to such factors is sharpening, both in design and quantity, and providing results with implications for teaching and further research.
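To make the word-frequency measures named in Tasks 4 and 5 concrete, the following is a minimal sketch in Python, not the programs developed for the study. It assumes the text has already been split into tokens, and it uses the commonly cited form of Yule's K, K = 10,000 x (sum of m^2 x V_m - N) / N^2, where N is the number of tokens and V_m is the number of word-types occurring exactly m times. The function names rank_order_list and yules_k are hypothetical, and the study's own computation (explained in Chapter III) may differ in detail.

```python
from collections import Counter


def rank_order_list(tokens):
    """Return (word_type, frequency, per_1000_tokens, cumulative_percent) rows,
    ordered by descending frequency (ties broken alphabetically)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    rows = []
    running = 0
    for word, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        running += freq
        rows.append((word, freq, 1000.0 * freq / total, 100.0 * running / total))
    return rows


def yules_k(tokens):
    """Yule's K in its commonly cited form (an assumption about the exact variant)."""
    counts = Counter(tokens)
    n = sum(counts.values())
    # spectrum[m] = number of word-types that occur exactly m times
    spectrum = Counter(counts.values())
    s2 = sum(m * m * v for m, v in spectrum.items())
    return 10_000.0 * (s2 - n) / (n * n)


if __name__ == "__main__":
    sample = "the cat and the dog and the bird".split()
    for row in rank_order_list(sample):
        print(row)
    print(yules_k(sample))
```

A higher K indicates more repetition; a passage in which every word-type occurs exactly once gives K = 0 under this form.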
The advent of computer technology has greatly reduced labor and has stimulated such language enquiry by facilitating research based on large bodies of material, enabling multiple comparisons across diverse sub-components.

This study will provide, first of all, a useful pool of written language samples and extensive information about the specific Corpus of materials serving as the input data base. In addition, programs and models will be produced which are generalizable to other idiosyncratic populations of materials and in turn can become useful tools in raising and answering further questions.

The study derives its major significance from several unique features. The study represents the first extensive analysis of the lexical characteristics of English language instructional materials prescribed for use in the secondary schools of a Canadian province. The vocabulary lists emanating from the study reflect the demands of real reading materials being used by Canadian students in the 1970's as distinct from most of the dated word lists currently in use.

The data generated by the study could have significance in developing guidelines for authors of school instructional materials and teachers using those materials. The word lists would supply writers with information they need to meet the needs and the capabilities of students, particularly those of limited reading ability.
The data have value in planning instruction for both native speakers of English and students coping with English as a second language.

Data from the study would permit a number of correlational analyses to be made with existing word lists and word-graded reading tests now in extensive use throughout Canada.

Researchers in related disciplines could make use of the word lists without having to rely on data from outmoded or foreign sources. The results provide a readily accessible, fundamental compilation that can be consulted when needed.

The samples obtained from the study and the related word and sentence statistics could be used to further research into the readability of secondary instructional materials and as input in the development of both standardized and informal reading tests for placement, evaluation of student progress and estimation of program effectiveness.

Improved teaching methods to facilitate instruction in reading comprehension could be a vitally important outcome of the study. Significant differences between the basic language characteristics of the subject areas would emphasize the need to develop alternative teaching procedures and the utilization of instructional materials geared to the unique word and sentence demands of each subject area.
The results could provide data which would aid greatly in determining to what extent instructional materials reflect vocabulary that is within reach of students and in the identification of words of special importance needing special attention in teaching.

The study has potential impact beyond its specific findings, however. The model designed for the study makes extensive use of natural-language computer technology and could readily be adapted to facilitate much larger studies, or conversely the model could be used to examine very small units of material. The computer programs generated by the study could be applied in the analysis of other idiosyncratic populations of printed materials.

Finally, computer technology was used extensively to produce the dissertation itself, format and print word lists, tables and graphs from raw data, and to develop, edit and produce the final printed copy. Thus the study could serve as a representative model in developing other research projects involved with the processing of natural language text.

DEFINITION OF TERMS

For the purposes of the study a number of definitions were developed.

Character
    A letter, digit, or other symbol that is used to organize, control, or represent data.

Coefficient of Variation
    A method of measuring the rate at which sentence types move away from the mean.

Computing Centre Dollar (CC$)
    Used in accounting by the Computer Centre. A CC$ represents an amount of computing resources which costs the University of British Columbia $1.00 to provide.
Conversational Terminal
    A typewriter-like device which enables a user to communicate with MTS.

Corpus
    The total body of 235,107 tokens of natural language text based on the 469 five-hundred-word samples across thirty-seven textbooks prescribed for use in the subject areas of Grades 8, 9 and 10 in British Columbia.

Disk
    A computer storage device used in MTS for line and sequential file storage, batch queue storage, and paging.

File
    Used with MTS to refer to collections of related information residing on direct access devices.

Magnetic Tape
    A storage medium which permits the recording of data as a series of magnetized spots.

MTS
    The Michigan Terminal System designed to run on an IBM model computer.

Pearson's Skew Factor
    A method of measuring the skewness of a distribution curve.

Sentence
    A number of tokens, the first beginning with a capital letter and the last ending with a period, question mark, or an exclamation mark, followed by a blank space or a pair of quotation marks.

Token
    An individual occurrence of a word-type.

Word
    A continuous string of characters bounded left by a blank space and delimited by a blank space or one of the following characters •-( )";:, ?a>/$#+%=!_0.

Word Type
    A "distinct word" representing a set of identical individual words (tokens).

Yule's Characteristic K
    A method of determining the rate of repetition (word-types or sentences) in a passage of print.

LIMITATIONS

There are three main limitations to the findings of this study.

1. The study is restricted to the use of thirty-seven "A" issue English language textbooks prescribed for use in Grades 8, 9, and 10 in British Columbia secondary schools during 1972-73. Because of the size of the undertaking, not all of the content available was used. Instead, a sampling of between 30 and 40 percent of the prose selections was made.

2. No attempt is made to analyze all of the linguistic features of the material used in the study. The main focus is on the analysis of lexical characteristics and sentence forms.

3. The study is limited by the accuracy of the various computer programs which were developed specifically for the project, as well as by the accuracy of keypunching and editing procedures employed in data preparation. A Pilot Study was utilized to validate procedures and programs and minimize errors as much as possible.

OVERVIEW OF THE STUDY

The study has three major aspects: 1) the selection and treatment of the text materials used for computer input; 2) the development of computer programs needed to generate the word lists and other related statistics; 3) the analysis and comparison of the computer generated data in relation to the questions raised and hypotheses stated.

Chapter II presents the review of literature and the conceptual framework for the study. The design and methodology of the investigation involving the nine tasks is outlined in Chapter III. Chapter IV presents the analysis of the data and the findings of the study. Finally, Chapter V presents a summary of the results and the conclusions for the investigation, and suggests a number of implications for reading instruction and future research.

CHAPTER II

REVIEW OF THE LITERATURE AND CONCEPTUAL FRAMEWORK

INTRODUCTION

This study is concerned with the development, description, comparison, and analysis of a representative sample of printed instructional material used in secondary grades. There have been few reported studies in this area and therefore it has been necessary to make use of research at other educational levels in order to construct the conceptual framework for the present investigation. Some of the studies mentioned are based on empirical research and others report the results of descriptive research.

Although the major aspect of comprehension in reading is concerned with the full relationships of phonology, syntax, and semantics, the use of printed discourse requires the reader to deal first with words. In recent years attention has been given once again to the development of word lists for the analysis of printed materials and several new word lists have been developed (Kucera-Francis, 1967; Carroll et al, 1971; Harris and Jacobson, 1972).

A recent innovation in the analysis of natural language text has been the use of computer technology. Alford (1971) emphasized that only a computer could handle the vast complexities of modern day techniques in language research. Harris and Jacobson (1972) outlined many of the advantages which a computerized system of word analysis offered, especially in comparative studies of printed materials.
Information gained from research which makes use of a massive corpus of language would enable educators to select and modify instructional materials to meet the reading needs of individual students.

Research into the readability of printed materials has continued to stress the importance of word and sentence characteristics as two of the main factors in determining reading difficulty (Fry, 1968; Mclaughlan, 1969; Guthrie, 1970). Investigators using the Cloze technique maintain that having to replace randomly deleted words in passages selected as representative of print materials constitutes the best means currently available for measuring the comprehensibility of printed prose (Bormuth, 1969; Ramanauskas, 1972). The lexical and functional aspects of the deleted words have implications related to the comprehension of print materials.

Finally, an area of research in the linguistic analysis of print materials which appears to have potential concerns the development of techniques to identify the "significant" body of words and sentences which can be used to summarize the content of a passage of material.

The purpose of this chapter is to organize a conceptual framework and make a selective review of studies that relate to these areas, including: word lists and their role in reading research; computer technology and language research; the readability of printed materials; and techniques useful in identifying the "significant" content in a body of print material.
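As a rough illustration of the last of these areas, and in the spirit of the elimination technique proposed in Task 9, the sketch below drops the highest-frequency word-types and the lowest-frequency word-types from a ranked list and keeps the residual as a candidate pool of content words. It is only a sketch under stated assumptions: the function name residual_pool and the cut-off parameters top_n and min_freq are hypothetical, the default values are arbitrary, and the technique developed in the study also relies on relative frequency and subjective criteria that are not modelled here.

```python
from collections import Counter


def residual_pool(tokens, top_n=100, min_freq=2):
    """Return (word_type, frequency) pairs left after removing the top_n most
    frequent word-types and every word-type occurring fewer than min_freq times.
    Both cut-offs are illustrative assumptions, not values from the study."""
    counts = Counter(tokens)
    # Rank by descending frequency; break ties alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    residual = ranked[top_n:]
    return [(word, freq) for word, freq in residual if freq >= min_freq]


if __name__ == "__main__":
    sample = "the the the the of of of cell cell atom atom zinc".split()
    print(residual_pool(sample, top_n=2, min_freq=2))
```

With a real subject-area corpus the residual would still need to be inspected by hand, since frequency cut-offs alone cannot distinguish important content vocabulary from incidental words.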
WORD LISTS AND THEIR ROLE IN READING RESEARCH

This section deals with three major topics: the development of word lists; word lists and content materials; and word lists and readability. The first topic will include computer generated word lists to present an up-to-date outline but the use of computer technology in language research will be presented in a later section. Similarly, the role of word lists in readability research is included under the third topic, but a fuller treatment of readability will be presented in part four.

Extensive studies have been made of the vocabulary used in printed materials in the U.S.A. since the 10,000 words listed in Thorndike's The Teacher's Word Book were published in 1921. Thorndike's study, which had a great impact on educational research, made use of over four and a half million words of running prose taken from a variety of sources including children's literature, elementary school texts, commercial materials, and the Bible. Ten years later, Thorndike (1931) added another 10,000 words to the frequency lists and then collaborated with Lorge to produce a much more diverse sampling of book and magazine content (Thorndike and Lorge, 1944). The pioneer work of Thorndike and Lorge had great educational significance because there was a real need to have school instructional materials based on language which had a high functional frequency.
However, the point has been made that the additional 10,000 words used in the Thorndike-Lorge list were compiled mainly from adult materials (Harris and Jacobson, 1972).

During this period a number of other researchers were constructing word lists based mainly on the language considered most common to children. Pressey (1924) compiled special vocabulary lists in fifteen school subjects in an attempt to isolate specific areas of emphasis in language usage. Gates (1926) developed a 1,500 word reading vocabulary for the primary grades by selecting from 2,500 of the highest frequency words in Thorndike's initial work, 1,000 of the most frequent words in a series of children's readers and 1,000 of the words most frequently spoken by young children. The Gates word list had considerable influence on the vocabulary used in primary grade reading textbooks. Horn (1926) published an adult vocabulary list of 10,000 words considered to be basic for written expression and made it possible to compare this mode with reading and speaking vocabularies. Another major development about this time was the International Kindergarten List of 2,596 words considered most widely known by kindergarten children (West, 1928). In 1931, a list of 769 easy words which were common to both the International Kindergarten List and the first 1,000 words of the Thorndike list was produced by Dale.
The following year, the results of a complex study were presented, designed to assess which words in the English language were used most often and how other variables in the language influenced their use (Faucett and Maki, 1932). A few years later, Buckingham and Dolch (1936) developed a word list based on the word knowledge of children in Grades 2 to 6. About the same time Dolch (1936) compiled a list of 220 words by selecting 193 words which were common to the most frequent words taken from three sources: a list of 2,596 words common to preschoolers' vocabularies; the Gates Primary Word List of 1,811 words judged important in children's reading; and a list of 153 words taken from a number of primers and first grade readers.

The first major undertaking to develop a knowledge of the basic English vocabulary used in Canadian elementary schools was started in 1945 (Stothers, Jackson and Minkler, 1947). The authors pointed to the complete reliance by Canadian educators on word lists constructed in the U.S.A. They stressed that very little attention had been paid to the nature of the vocabulary in Canadian textbooks or to the role of vocabulary development as a distinctive reading skill. The method used in the study was to review a number of surveys carried out between 1921 and 1945 and also to assess the uncommon vocabulary of readers used in Ontario. Three word lists were then prepared: for Grades 1 and 2, Grades 3 and 4, and Grades 5 and 6 respectively. The lists were next examined by students and teachers in an attempt to check content validity.
Finally, a total of 5,764 words distributed across Grades 1 to 6 was presented.

In 1945, a basic vocabulary for elementary school children in the U.S.A. was developed by Rinsland, and in 1949 the first of a series of core vocabulary lists was constructed by the Educational Developmental Laboratories (EDL). The EDL word lists were designed to facilitate the preparation of basic and supplementary reading materials and to serve as a guide for teachers and students regarding the vocabulary load of books. In addition, the basic core vocabulary was suggested for use in the development of readability levels of reading materials (Taylor, 1949). In the initial EDL study, 150 sources were investigated. Revisions followed in 1951 and 1955, and in 1968 an additional nine basal readers were added to the survey. The primary grades' lists were based on basal readers, and at the intermediate level the inclusion of a word was determined by a combination of pupils' knowledge of the word, checked against the Rinsland (1945) list, and the word's frequency, measured against the frequency (G listing) of the Thorndike-Lorge (1944) list. For Grades 7 and 8, the words from Grades 4 to 6 were rechecked against the Thorndike-Lorge (1944) and the Rinsland (1945) lists and added if their frequency warranted it. The remainder of the words were taken from the Thorndike-Lorge and the Rinsland lists.
Finally, the core vocabulary for Grades 9 to 13 was compiled by using the highest frequency words from the Thorndike-Lorge (1944) list as well as a number of other words from a bibliography of vocabulary improvement materials.

The Kucera-Francis (1967) analysis of American English made use of computer techniques to compile a 1,014,232 word corpus which was unique at the time in that it was the only randomly selected sample of printed material published in the U.S.A. in the one calendar year. Fifteen genres were used in the Kucera-Francis study and 500 samples of approximately 2,000 words each were randomly selected across the genres. The study provided an invaluable data base for other researchers to use in investigating phonological and lexicographical aspects of written English language. However, the Kucera-Francis study was derived from adult materials and was not designed to provide grade level or subject area analyses of the material being treated.

Carroll et al (1971) emphasized the need to learn more about the lexical characteristics of language in a massive study involving published materials frequently used by students in Grades 3 through 9. The American Heritage Intermediate Corpus or AHI Corpus, as the study was called, made use of computer techniques to generate frequency lists from over 5,000,000 words taken from some 1,000 different publications. The AHI Corpus was designed to provide a 'cultural frame of reference for judgment and comparison' which would serve as 'a reflection of the culture talking to its children' (Carroll et al, 1971).
The word frequencies were listed by grade levels thus providing valuable information for teachers and writers of instructional materials. The authors also noted that word frequency data had been useful in helping to determine readability levels and the selection of texts for classroom instruction, the teaching of English as a second language, and the compilation of vocabulary lists. The AHI Corpus incorporated a number of statistical analyses of the Corpus by grade and subject area using the word frequency data but no attempt was made to examine sentence length characteristics of the material used in the study.

In 1972, Harris and Jacobson published a series of elementary reading vocabularies consisting of words which were widely used in elementary school textbooks during 1970. The study made use of fourteen series of elementary school textbooks for Grades 1 to 6: six basal reader series, plus two series of texts for each of the core subjects (English, Social Studies, Science, Mathematics). The lists included General Vocabulary lists containing common vocabulary found in basal readers and content textbooks, a Core List of words found in three of the six basal readers, and an Additional List made up of words which appear in four or more of the fourteen series of books used. A Core List and an Additional List were also included for each of the basal reader levels (Preprimer through Grade 6).
The authors stated that their lists provided the basis for a number of comparative analyses to be made between word lists, including elements such as content, obsolescence, levels of difficulty, number and length of words, word frequency, and aspects of word construction such as singular-plural (Harris and Jacobson, 1972).

A summary of the most widely known word lists developed between 1921 and 1972 is presented in TABLE I.

TABLE I
A SUMMARY OF WORD LISTS : 1921-1972

Author | Year | Description
Thorndike | 1921 | The Teacher's Word Book contained 10,000 words taken from printed materials in the U.S.A.
Gates | 1926 | A Reading Vocabulary for the Primary Grades contained 1,500 words for Grades 1, 2, and 3.
Horn | 1926 | A vocabulary stated to be basic for written expression.
Thorndike | 1931 | Another 10,000 words added to the 1921 list.
Dale | 1931 | A list of 769 easy words which were common to the International Kindergarten List and the first 1,000 words of the Thorndike List.
Buckingham & Dolch | 1936 | A word list based on vocabularies of children in Grades 2 to 6.
Dolch | 1936 | A basic sight vocabulary of 220 words.
Thorndike & Lorge | 1944 | A much more diverse sampling of book and magazine content in the U.S.A. 30,000 words in the list.
Rinsland | 1945 | A Basic Vocabulary of Elementary School Children. Illustrated the frequencies of 14,571 words taken from an analysis of 200,000 written papers.
Stothers, Jackson, & Minkler | 1947 | The first major undertaking to produce a Canadian word-list for Grades 1-6. A total of 5,764 words used.
Taylor, Frackenpohl & White | 1949, revised in 1951 & 1955 | A series of core vocabularies developed by the E.D.L.
Kucera & Francis | 1967 | An analysis of American-English adult materials using computer techniques to generate a corpus of 1,014,232 words.
Taylor, Frackenpohl & White | 1968 | An additional nine basal readers were added to the 1955 revision. Lists at the primary, intermediate, and secondary levels were provided.
Carroll et al | 1971 | The American Heritage Word Frequency Book. A computer-generated analysis of over 5,000,000 words taken from 1,000 different publications used in Grades 3 to 9.
Harris & Jacobson | 1972 | Basic Elementary Reading Vocabularies. A set of word lists at the elementary level developed by computer techniques.

Studies concerned with the vocabulary content of printed materials have often followed the development of frequency word lists. Between 1925 and 1945 a number of researchers investigated the relationship between the vocabulary used in instructional materials in the content areas and the most common words reported in frequency word lists (Powers, 1925; Patty and Painter, 1931; Fries and Traven, 1940). In 1952, Malsbary measured the understanding that high school students had of business and economic terms selected from a variety of newspapers, journals, and newscasts. He found that there was some relationship between student understanding and the frequency of the item. Malsbary also reported that seventy-nine of the items were understood by only 50 percent of the students. Kyte (1953) conducted a study to determine the core vocabulary required for various instructional programs.
He used the 500 most common words from each of Horn's 1926 list, the Thorndike-Lorge 1944 list and the Rinsland 1945 list. A final list of 663 words was presented.

The continued reliance of educational researchers on word lists compiled several decades earlier was reflected in a series of studies made during the early 1960's. Traxler (1963) developed two forms of a fifty-item vocabulary test for high school students and college freshmen by randomly selecting items from the 10,000th to the 20,000th word of the Thorndike-Lorge word list. Another research project compared the frequency of selected structure words found in children's and adults' written expression. The structure words were taken from Rinsland's 1945 list, Dewey's "Relative Frequency of English Speech Sounds," (1923) and from Horn's 1926 work (Card and McDavid, 1965). The following year another study analysed the frequency bias of the 122 most commonly used English words as determined in a number of lists including Dewey's and Rinsland's. The results of this study showed that structure words in English formed a typical corpus (Card and McDavid, 1966).

In 1967, Jacobs compared the 1926 Buckingham-Dolch word list results with a study carried out in Oregon. He found that free-association vocabulary elicited from children in Grades 2 through 6 differed significantly from that reported in the original study. An interesting point to note is that although more children knew the same word in 1966, they also knew fewer words than their predecessors.
In 1971, Johnson made an examination of the Dolch (1936) basic sight word list and its relationship to the Kucera-Francis study. He stated that 82 of the 220 words listed by Dolch were not among the 220 most frequent words in the Kucera-Francis Corpus. Other discrepancies were reported and Johnson concluded that the original Dolch list was no longer suitable as a measuring instrument for the vocabulary content of instructional materials in the 1970's.

The need for extensive, analytical studies into the nature of instructional materials currently used in Canadian schools is an implication from the preceding discussion. Such studies would present a description of the language composition of reading materials used in Canadian education in the 1970's.

Word Lists and Readability

Word lists have been used extensively in the development of readability formulas. Lively and Pressey (1923) used the Thorndike list to give a 'weighting' to materials they had selected from elementary basal readers and college science textbooks. A number of researchers used words that were not included in the Thorndike list as a variable in their work into readability (Vogel and Washbourne, 1928; Washbourne and Morphatt, 1938; Jacobson, 1961). Gray and Leary (1935) used the 1931 word list developed by Dale in their readability formula as did Lorge (1944) and Spache (1953). Spache later made use of the Stone (1956) revision of Dale's list in his formula.
In 1948, Dale and Chall used the Dale List of 3,000 words as a variable in their readability formula. A later word list by Botel (1962) was also used in readability research. In recent years, work into the readability of print materials has made more use of language variables other than word frequency. This aspect of readability is discussed later in the chapter.

COMPUTER TECHNOLOGY IN LANGUAGE RESEARCH

In recent years there has been a growing interest in the use of computer techniques to help compile and analyze natural language samples. The studies have concentrated in two main areas: the analysis of materials in specialized areas such as library science, information science, and foreign languages; and the analysis of educational, instructional materials.

A study which generated a computer-based general frequency word-list in German, designed for use at the college level, indicated that over 30 percent of the original sample text had not been covered by previously developed word-lists (Siliakus, 1967). The author pointed out that although most of the untreated words were very low frequency items, there were also numerous high frequency proper nouns and cognates found in the portions of the text not covered. This study thus emphasized the very thorough analysis of language possible with the aid of a computer. A later study by Johnson (1972) with Russian language materials made use of a set of frequency groups and an algorithm for the implementation of a frequency identification and marking procedure on an IBM 360 computer.
The work of Fuellhart and Weeks (1968) examined some twenty-three lexical resources in information science. This analysis, which made use of the IBM 360 Model 40 computer, was successful in quantifying the terminology and establishing the frequency of occurrence of main concepts. However, the study also showed that there was no formal structure for the discipline of information science and that the materials examined tended to reflect the opinions of the authors about the nature of the structure.

Austin (1969) conducted an investigation into the authenticity of a piece of English literature by using a computer assisted technique for stylistic discrimination. The Austin study was important in that it illustrated how frequency lists of words and other pertinent linguistic variables could be used to help determine authorship.

Later research by Berkeley (1972) showed that the computer could be used to help isolate difficult terminology in a specific discipline. The computer scanned a chapter of a Navy training manual consisting of 9,800 words and classified words of two syllables or longer as either "assumed audience vocabulary" (words the audience would be expected to know), or difficult words needing further clarification. The computer scanning technique described by Berkeley made use of a previously defined lexicon and is a procedure which has important implications for future language research.
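The kind of scan attributed to Berkeley can be illustrated with a short Python sketch. This is not Berkeley's program: the function names, the vowel-group syllable estimate, and the small lexicon and sentence are all assumptions invented for illustration. Words estimated at two or more syllables are sorted into those found in a previously defined lexicon and those that are not.

```python
import re

def estimate_syllables(word):
    # Assumption: approximate syllables by counting runs of vowels.
    # The heuristic is crude (it over-counts a silent final "e", for example).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def scan_vocabulary(text, lexicon, min_syllables=2):
    """Sort longer words into (assumed known, needing clarification)."""
    known, difficult = set(), set()
    for word in re.findall(r"[A-Za-z]+", text):
        word = word.lower()
        if estimate_syllables(word) < min_syllables:
            continue  # shorter words are not classified
        if word in lexicon:
            known.add(word)
        else:
            difficult.add(word)
    return sorted(known), sorted(difficult)

# Hypothetical lexicon and sentence, made up for this example.
lexicon = {"engine", "compartment", "contains"}
known, difficult = scan_vocabulary(
    "The engine compartment contains hydraulic actuators.", lexicon
)
```

In this sketch the quality of the result depends almost entirely on the lexicon supplied, which is consistent with the emphasis the text places on a previously defined lexicon.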
The study by Kucera and Francis (1967) represented the first attempt to computer-generate a general word-list for use in educational research through the manipulation of a massive (over 1,000,000 words) corpus. Since the Kucera-Francis work, which dealt with adult materials, researchers have been developing computer techniques to aid them in their investigation of instructional materials at various grade levels.

A study by Cronnell (1971) developed a lexicon of 9,000 words which were taken from materials used in kindergarten to Grade 3. With the use of a computer, the 9,000 words were systematically arranged both by order of word length and by the introduction of vowels in unstressed syllables. The study was designed to aid the investigation of the spelling-to-sound correspondences needed in beginning reading.

Harris and Jacobson (1972) compiled a number of elementary reading vocabulary lists taken from 127 books in twenty-eight series. This computer-assisted project produced three sets of lists which included a General Vocabulary set, a Technical Vocabulary set, and a Total Alphabetical List. The study generated approximately 17,000 word-types (after certain adjustments) from an original 80,000 unique words found in the 4,500,000 running words treated. The authors gave an excellent description of the procedures they followed throughout the investigation, including valuable information on the types of programs which were developed for use with the Burroughs B5500 computer.
However, the study did not attempt to make statistical analyses of the material, but was designed to give a description of the language comprising elementary grade level textbooks in the 1970's and therefore provided an important basic reference in studies of word frequency.

A slightly different approach was presented by Durr (1973) who insisted that there was a need for an up-to-date vocabulary list at the primary level which concentrated on books which the students had selected for themselves. The author made use of eighty library books which were frequently chosen by children for recreational reading. Over 100,000 running words were then computer analyzed into word-types and a frequency list of word tokens. Durr concluded that there were several very important implications for the teaching of reading from his study including the need to introduce children to high frequency words early in their reading experience.

The first major study involving junior secondary materials was presented in the American Heritage Word Frequency Book and the American Heritage School Dictionary (Carroll et al, 1971). The authors stressed that the study, which computer processed over 5,000,000 words taken from books frequently used in Grades 3 through 9, was necessitated by both the types of material used in schools today, and by the rapid increase of new words in the English language.
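The corpus studies reviewed in this section share one basic computation: reducing running words (tokens) to distinct word-types, each with a frequency count. The following Python sketch shows that step on a toy sentence. It does not reproduce the programs used in any of the studies cited; the function name, the case-folding, and the simple definition of a word are assumptions.

```python
import re
from collections import Counter

def word_frequencies(text):
    # Assumption: a word is a run of letters or apostrophes, and
    # upper and lower case forms are counted as the same word-type.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

sample = "The cat saw the dog. The dog ran."
freq = word_frequencies(sample)

tokens = sum(freq.values())   # running words
types = len(freq)             # distinct word-types
ranked = freq.most_common()   # frequency list, most frequent first
```

Decisions such as whether to fold case or to merge inflected forms change the number of word-types obtained, which may be one reason counts are difficult to compare across studies.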
The Carroll study recognized the need to include materials at the junior secondary school level, but did not carry the investigation past the stage of analyzing word characteristics and dealt only with materials used in the U.S.A.

RESEARCH INTO THE READABILITY OF INSTRUCTIONAL MATERIALS

The various descriptive and statistical analyses made during this study concentrated on word and sentence characteristics as the two main language variables. The role of these variables in readability research is presented in two sections: the development of readability formulas; and work on the Cloze procedure. The initial discussion dealing with readability formulas traced attempts by educational researchers to identify and then simplify various combinations of language variables thought to cause difficulty in reading comprehension. The section on the development of the Cloze technique as an instrument of readability analysis concentrated on more recent attempts by researchers to gain an understanding of the syntactic and semantic relationship of language in the comprehension process.

Readability Formulas

Works by Chall (1958) and Klare (1963) summarized early research into readability and pointed to the need for greater expertise in the design of research studies, understanding of linguistic variables involved, and analysis of results in future research.

In a study of sixty-six secondary school literature texts in use in the U.S.A., Aukerman (1965) listed five language variables which he maintained helped account for a book's readability level. Aukerman's variables were sentence length and complexity, which he classed as mechanical complexity, and the incidence of verbals, word difficulty, and abstraction, which he termed verbal complexity. Aukerman then constructed a table which he claimed listed the relative reading difficulty of each book based on five 500-word samples. He also stressed that his findings were only tentative and that empirical evidence would have to wait until he had had an opportunity to engage in further research.

Klare (1966) explained that earlier work by Coleman and Bormuth had shown the value of counting letters per word as a measure of passage difficulty. Other aspects of words such as morphological complexity, the number of syllables originating in Latin, abstractness, and frequency have been investigated (Bormuth, 1967). Until recently the latter variable was not considered very important in readability analysis. However, Klare's (1968) research led him to believe that word frequency may in fact encompass most of the other variables.

Coleman (1968) outlined a number of experiments which had studied grammatical relations and readability. He stated that in most cases when a sentence is rewritten to make it more readable, it usually undergoes a grammatical transformation in some form.
Coleman illustrated this by pointing out that much prose is abstract simply because the writer chose to use certain derivatives of verbs (e.g. abstract noun nominalizations from active verbs - "operation" from "operate"). Coleman concluded his article by suggesting that further research into the improved readability of instructional materials could lead to a greater awareness of the value of transformational grammar.

Rosenshine (1968) presented a description of three experiments in horizontal readability where similar passages were compared according to the cognitive similarity of their words and phrases. The findings of this study suggested several language variables which affected readability, namely indeterminate qualifiers and probability words which caused vagueness, and the omission of irrelevant sentences from the passage.

Bormuth (1968) pointed out that recent research into the readability of written instructional materials had attempted to explain correlations between language and reading difficulty through a more detailed examination of the psychological processes involved.

Several recent studies into the inherent difficulties of various grammatical measures and their contribution to readability have resulted in either sentence length or word difficulty being cited as the most important variables.
MacGinitie and Tretiak (1969) used Yngve's phrase structure measurement and Allen's "sector analysis" on eighty selected passages and compared the results to the Lorge Readability formula applied to tests based on the same passages. Sentence length emerged as the variable most closely correlated with test scores.

In an experiment to investigate the learnability (new learning from a passage) as well as the readability of text books, Guthrie (1970) used eighteen linguistic variables, including sentence length, word difficulty, parts of speech, grammatical transformations, and certain other stimulus dimensions such as word familiarity. Guthrie reported that sentence length and word difficulty were the best predictors of learnability as well as readability. He supported his findings by stating that sentence length was found to correlate .842 with Cloze gain scores, while word difficulty correlated .815 with multiple choice gain scores.

Recent readability formulas

Early attempts to measure the readability levels of materials resulted in instruments which required considerable time and effort to apply. Fry (1968) developed a readability formula which used sentence length and syllables as the two language variables. Fry's formula was relatively easy to apply and correlated highly with a number of existing readability formulae.
The following year an even more simplistic and purportedly more valid readability formula entitled "SMOG Grading" appeared. McLaughlin (1969), the creator of "SMOG," explained that after considerable research into the problem, he had concluded that polysyllabic words and sentence length were the most predictive linguistic variables to use in determining the reading difficulty of materials. McLaughlin explained his decision by pointing out that in his doctoral dissertation some three years earlier he had shown that words and sentences were, respectively, the best measures of semantic and syntactic characteristics of reading difficulty. By noting that semantic and syntactic variables interact, McLaughlin claimed he was able to reduce his formula to a mere counting of the number of polysyllabic words in three sets of ten consecutive sentences, finding the square root of the number obtained, and adding three. He then gave a detailed account of the validity of his instrument and emphasized that it gave a measure of complete understanding of the material in contrast to other formulas which stated a 'general understanding' only. For this reason, McLaughlin concluded, the "SMOG Grading" would usually be several grades higher than other readability formulas in common use.
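The calculation described above (count the polysyllabic words in the sampled sentences, take the square root, add three) can be sketched in Python as follows. Only the final arithmetic comes from the description in the text. The function names and the vowel-group syllable estimate are assumptions, and the threshold of three or more syllables for a polysyllabic word is left as a parameter since the passage above does not state it.

```python
import math
import re

def estimate_syllables(word):
    # Assumption: approximate syllables by counting runs of vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def count_polysyllabic(sentences, min_syllables=3):
    """Count words with at least min_syllables estimated syllables."""
    total = 0
    for sentence in sentences:
        for word in re.findall(r"[A-Za-z]+", sentence):
            if estimate_syllables(word) >= min_syllables:
                total += 1
    return total

def smog_grade(polysyllable_count):
    """Square root of the polysyllable count, plus three."""
    return math.sqrt(polysyllable_count) + 3

# A full application would pass thirty sentences (three sets of ten);
# a single invented sentence is used here only to exercise the counter.
count = count_polysyllabic(["The readability formula is simple."])
grade_for_25 = smog_grade(25)
```

Because the grade grows with the square root of the count, quadrupling the number of polysyllabic words only doubles the part of the grade above three.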
M^^gnfls in research

In a well designed series of studies, Bormuth (1969) illustrated just how far research into readability was from achieving its objective and stated, "It had been anticipated that these analyses would simplify the problem of constructing the theory of the comprehension of language. The failure to realize this expectation was spectacular". A main objective of Bormuth's studies was to isolate linguistic features of printed passages and determine which features stood in causal relation to comprehension difficulties. The materials consisted of 330 passages, drawn from ten subject areas, representative of Grades 1 through 12. The 169 linguistic variables included eight vocabulary variables (factors such as letters per syllable, syllables per word, frequency of lexical and structural words and the like); fifty variables based on syntactic structures (including factors which might indicate how the types or numbers of structures a sentence contains might influence comprehension difficulty); thirty-eight syntactic complexity variables (including structural density, transformational complexity, structural complexity, Yngve depth and syntactic length); sixty-two parts of speech variables; and eleven anaphora variables (including frequency of anaphoric structures, density of anaphora, and the time interval between occurrence of an anaphora and its antecedent).
A total of 94 of the 169 variables correlated significantly with measures of passage difficulty including 8 out of 8 vocabulary variables, 19 out of 50 syntactic structure variables, 34 out of 38 syntactic complexity variables, 25 out of 62 parts of speech variables, and 8 out of 11 anaphoric variables. It is interesting to note that all 8 vocabulary variables and 11 out of 12 of the syntactic length variables, included in syntactic complexity, correlated significantly with passage difficulty. The high number of significantly correlated variables suggested that an overwhelming number of answers might presently be given to the question, "What accounts for comprehension difficulty of printed materials?" In addition, variables not specified in the total examined in the study could relate significantly to comprehension difficulty. Some estimates place the total number of possible variables at well over 200.

The 94 significantly related variables were also factor analyzed. Bormuth stated, "To summarize the results from factor analysis, then, a simple structure does not seem to underly the variables correlating with passage difficulty".

In order to facilitate valid research into the readability of printed materials, Bormuth advocated the use of very large samples of words to enable rarely occurring linguistic variables to be adequately examined.
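The category counts reported for Bormuth's study can be checked with a few lines of Python. The figures are copied from the text above; the variable names and the rounded percentages are additions made here for illustration and do not appear in the original study.

```python
# (variables examined, variables significantly correlated) per category,
# copied from the counts given in the text.
categories = {
    "vocabulary": (8, 8),
    "syntactic structure": (50, 19),
    "syntactic complexity": (38, 34),
    "parts of speech": (62, 25),
    "anaphora": (11, 8),
}

total_examined = sum(n for n, _ in categories.values())
total_significant = sum(s for _, s in categories.values())

# Share of each category that correlated significantly, in whole percent.
share = {name: round(100 * s / n) for name, (n, s) in categories.items()}
```

The totals come to 169 and 94, matching the text, and the shares make plain that the vocabulary and syntactic complexity categories account for a disproportionate part of the significant correlations.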
Results obtained from such studies would offer valuable guidance to educators concerned with the construction of instructional materials suitable for students at varying levels of reading ability. The use of computerized technology offers exciting possibilities in this regard in the near future.

The deletion of words from a passage of print materials at regular intervals ensures that both lexical and structural words will be omitted. When Cloze tests have been constructed and administered correctly, the results are said to measure the facility a student has in understanding the syntactic and semantic interrelationships of the material being read.

A description of Cloze

The Cloze technique, which was used by Ebbinghaus as early as 1897, was first developed as an instrument for measuring reading comprehension by Taylor (1953). Basically the Cloze readability procedure involved five steps:

1. The selection of passages from the material to be evaluated,
2. The deletion of every "nth" word (usually every fifth word) and the insertion of underlined blanks of a standard length,
3. The administering of the mutilated text to students who had not read the original work,
4. The instruction to students to write in the blank spaces the words they thought had been deleted,
5. The marking of correct responses when identical items have been inserted (Bormuth, 1968).
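Steps 2 and 5 of the procedure above lend themselves to a short Python sketch. It is illustrative only: the function names, the whitespace-based definition of a word, and the five-underscore blank are assumptions, and punctuation attached to a deleted word is removed along with it.

```python
def make_cloze(text, n=5, blank="_____"):
    """Replace every nth word with a standard blank; return text and key."""
    words = text.split()
    answers = []
    for i in range(n - 1, len(words), n):
        answers.append(words[i])
        words[i] = blank
    return " ".join(words), answers

def score_cloze(answers, responses):
    """Count responses identical to the deleted words (exact-match scoring)."""
    return sum(1 for a, r in zip(answers, responses) if a == r.strip())

passage = "one two three four five six seven eight nine ten"
mutilated, key = make_cloze(passage)
score = score_cloze(key, ["five", "tin"])
```

Exact-match scoring follows step 5 as stated; accepting synonyms would be a different scoring rule and would need its own justification.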
Since the work of Taylor, there have been numerous studies into the application of Cloze as a means of measuring (a) comprehension, (b) readability, and (c) language variables. A survey of some of the studies pertaining to the latter category as it relates to the secondary school level will be presented in this section of the chapter. Much more comprehensive treatments of the Cloze procedure have been organized by Rankin (1965), Weintraub (1968), Potter (1968), Bickley, Ellington and Bickley (1970), Jongsma (1971), Bormuth (1972), and Bailey (1973).

Important linguistic variables

Louthan (1965) noted that when specific words were deliberately deleted, increased emphasis was placed on the meaning of the remaining words in context. The Cloze technique, therefore, made the reader use all of the lexical and grammatical clues inherent in the language structure. In a series of experiments, Louthan deleted a variety of linguistic variables including parts of speech and function words, and tested to see if the experimental subjects improved in reading comprehension compared to a control group who read material which had not been mutilated. The experimental group which showed the most significant gain in comprehension was the one which was reading material with certain function words (a, the, that, whose, what, his) deleted.
Another researcher, attempting to discover more exact predictors of comprehension difficulties by using the Cloze technique, stated that important information could be learned about the difficulties of words, independent clauses, and sentences (Bormuth, 1966).

The vital importance of content words in language was illustrated by a study conducted by Weaver and Bickley (1967). It was found that originators of written material could recall about 85 percent of both structural and lexical deletions two days after their writing, whereas students who had only read the materials could recall structural words as well as the producers, but could not recall lexical words. The importance of accurate information pertaining to specialized core vocabularies for each of the content areas was an obvious implication to be drawn.

Bickley et al. (1970) pointed out that research in Cloze had been conducted into the effect of sentence length on the comprehension of the reader and that short sentences were found to be more readable than long sentences.

Cloze and readability

A number of studies designed to explore the suitability of instructional materials using the Cloze technique were reported in 1968. Weintraub (1968) reviewed several studies which showed that the Cloze technique had high reliability and validity in measuring readability.
It was further suggested that Cloze could offer valuable insights into aspects of the reading process. Bormuth (1968) investigated the relationship between the readability level and the amount of information gained by the reader. He stated that scores on Cloze tests did not depend entirely on the reader's prior knowledge of the material. This would suggest that the role of certain function words in the language structure was of vital importance.

Bormuth's other contributions to research in the Cloze technique have aided the work into readability tremendously. His early work concentrated on the need to develop the Cloze procedure into an effective instrument to use in studies of readability. By this means, Bormuth planned to identify 'the linguistic features that serve as stimuli for the various comprehension processes' and then to move 'towards efforts to operationalize those processes in a manner that is suitable for instruction.' Thus the application of the Cloze procedure would enable a greater understanding to be gained of the causal relationship between specific linguistic variables and levels of comprehension among secondary school students.

An excellent summary of experiments using the Cloze technique to determine readability levels of materials for children and adults was presented by Potter (1968).
In addition to his discussion on the technical aspects of the studies, Potter mentioned that the separate scoring of function and content words may provide valuable information for specialized purposes.

Geyer (1968) tested the use of Cloze as a predictor of a student's ability to comprehend social studies content and also to determine if materials rewritten at an easier readability level would result in improved comprehension. The results of the latter aspect of the study showed that comprehension may not be significantly improved by reducing vocabulary difficulty and sentence complexity.

Hater (1969) and later Kulm (1971) measured the readability of mathematical English. Kulm reported that there were at least ten language variables that had a significant effect on the readability of the material. Kulm maintained that existing readability formulas that rely on word difficulty and sentence length to measure readability are not appropriate to use with mathematical English.

The work of Houska (1971) showed that the Cloze procedure was a viable instrument to determine the readability level of instruction materials in Industrial Education at the secondary school level.

An interesting approach was offered by Ramanauskas (1972) who conducted an experiment using two examples of material with identical syntactic and semantic components but with some of the sentences rearranged in the second sample.
Ramanauskas argued that the readability of the second sample, as measured by readability formulas, was unchanged. That is, there were exactly the same number of sentences, words, syllables, etc. as before. It was only by using the Cloze technique that a valid measure of readability could be obtained.

DETERMINING SIGNIFICANT CONTENT MATERIAL

Words are generally considered to belong to one of two classifications: structure or function words, and lexical or referential words (Betts, 1965; Dauzat, 1968). The former words act as clues to grammatical structure (e.g. a, an, at, by, what, very) whereas the latter type have lexical meanings which are readily distinguishable from structural meanings.

Fries (1952) identified some 154 structure words and categorized them into fifteen groups including auxiliary verbs, conjunctions, prepositions, relative pronouns, and determiners. However, Fries did not claim that his list was exhaustive and other writers have defined considerably more words as structure words (Lefevre, 1964; Goodman et al., 1966).

The role of function words in providing structural information was illustrated by Young (1973) who quoted a portion of Lewis Carroll's poem "Jabberwocky":

'Twas brillig, _and the_ slithy toves
Did gyre and gimble _in the_ wabe;
All mimsy _were the_ borogoves,
_And the_ mome raths outgrabe...'

Young pointed out that the underscored structure words helped generate the ideas which were inherent in the nonsense sentences comprising the poem.
Rogers (1965) stated that although structure words were relatively few in number they were extremely important in the language because of their dense distribution. Fries (1952) found that structure words accounted for over 30 percent of a sample of 1,000 words while Kucera and Francis (1967) estimated that just under 50 percent of their megaword corpus consisted of structure words. The Kucera-Francis study showed that the frequency of structure words differed greatly across the various genre samples. The vast majority of the 100 most frequently occurring words in each of the fifteen genres examined were function words. However, the rank order of the structure words (except for "the", which was first in all cases) was noticeably affected by the type of genre in which they occurred; "of" was the second most frequent word in ten of the genres, while "and" was second in rank in five genres.

An area of research which is pertinent to this study is the automatic creation of a literature abstract derived from an analysis of words in a literary passage. Luhn (1958) outlined the methodology of "auto-abstracting" which involved determining a word-list compiled in descending order of frequency to give a "significance" factor for words, and an analysis of the relative position of the words in each sentence to determine the significance of sentences. A combination of these two measurements was then used to give a "significance" factor for sentences. The "auto-abstract" was finally compiled from the highest ranking or most significant sentences.
Luhn defined the most "significant" words as being neither in the region of highest frequency (these words constituted the 'noise' in the system), nor in the area of low frequency where their rarity would negate their relevance to the subject matter. The "significant" section of words in the material would therefore occur somewhere between the two extreme points in the distribution. Luhn hypothesized that it would then be possible to determine the degree of discrimination or "resolving power" of the words making up this middle section of the distribution.

The "significance" factor for sentences was arrived at by identifying the proximity of "significant" words to one another. Sentences which had the greatest number of frequently occurring different words in close proximity to each other were ranked higher and were classed as more "significant" to the subject. These sentences were then selected on the basis of their rank to form the "auto-abstract" of the excerpt or article.

An obvious drawback to the system described by Luhn was the absence of intellectual decisions made by specialists in various disciplines in making a final selection of "significant" content.

A similar technique for automatically analyzing printed materials was suggested by Maron (1961) who was concerned with the automatic indexing of documents according to their subject content.
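The two measurements attributed to Luhn above, a middle band of word frequencies and the proximity of "significant" words within a sentence, may be sketched as follows. The sketch is an illustration only and is not drawn from the present study. The frequency cut-offs (min_count, max_count), the permitted gap of four non-significant words, and the cluster score (the square of the number of significant words in a cluster divided by the span of the cluster) are assumed parameter choices, and the sentence splitter is deliberately crude.

```python
import re
from collections import Counter


def split_sentences(text):
    """Split at ., ! or ? followed by whitespace (a crude assumption)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def words_of(sentence):
    return re.findall(r"[a-z']+", sentence.lower())


def sentence_score(words, significant, max_gap=4):
    """Score the densest cluster of significant words in one sentence.
    A cluster ends when more than max_gap other words intervene."""
    positions = [i for i, w in enumerate(words) if w in significant]
    if not positions:
        return 0.0
    best = 0.0
    start = prev = positions[0]
    count = 1
    for pos in positions[1:]:
        if pos - prev - 1 <= max_gap:
            count += 1
        else:
            best = max(best, count * count / (prev - start + 1))
            start, count = pos, 1
        prev = pos
    return max(best, count * count / (prev - start + 1))


def auto_abstract(text, min_count=2, max_count=3, top=1):
    """Return the `top` highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    tokenised = [words_of(s) for s in sentences]
    counts = Counter(w for ws in tokenised for w in ws)
    # Middle band: neither the high-frequency 'noise' nor the rare words.
    significant = {w for w, c in counts.items()
                   if min_count <= c <= max_count}
    ranked = sorted(range(len(sentences)),
                    key=lambda i: sentence_score(tokenised[i], significant),
                    reverse=True)
    return [sentences[i] for i in sorted(ranked[:top])]


sample = ("The cat sat. The corpus holds many words. "
          "The corpus words matter. The dog ran.")
print(auto_abstract(sample))
```

In the small sample, "the" falls in the high-frequency region and is excluded, "corpus" and "words" fall in the middle band, and the sentence in which they stand side by side is ranked highest.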
Maron's thesis stated that reasonably valid predictions of the subject matter of documents could be made on the basis of statistics involving word frequency, word order, location, etc. The main difficulty concerned the selection of clue words which would be neither too rare to be valid predictors, nor belong to the 'logical' class of structure words which did not supply referential meaning for the material. Maron decided that the high frequency structure words which accounted for over 40 percent of the total occurrences in his study should be excluded because of their lack of information about the subject matter. Similarly, the high frequency lexical words were next excluded because of their lack of specificity for the subject. Next to be rejected were the words which occurred only once or twice in the corpus. The resulting 1,000 words were then listed and analyzed to determine which of these words were valid predictors of the subject content.

Although the present study did not attempt to make an intensive analysis of the data along the lines suggested by Luhn (1958) or Maron (1961), Task 9 in CHAPTER I suggests a possible strategy for further research into the analysis of print materials and the selection of significant content vocabulary. The various grade level and subject area corpora generated by this study also offer a readily available sample of materials for the purpose of developing techniques for the selection of significant content vocabulary in print sources.
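The three successive exclusions attributed to Maron above (structure words, the most frequent lexical words, and words occurring only once or twice) may be sketched as a simple filter. The sketch is illustrative only. The short structure word list, the function name clue_words, and the cut-off parameters are assumptions and are not taken from Maron's study or from the present one.

```python
from collections import Counter

# Tiny illustrative list; a real list would be far longer.
STRUCTURE_WORDS = {"a", "an", "the", "of", "and", "in", "to",
                   "is", "by", "at", "what", "very"}


def clue_words(tokens, structure_words=STRUCTURE_WORDS,
               top_lexical=1, min_count=3):
    """Return candidate clue words after three successive exclusions."""
    counts = Counter(t.lower() for t in tokens)
    # 1. Exclude structure words, which carry no subject information.
    lexical = Counter({w: c for w, c in counts.items()
                       if w not in structure_words})
    # 2. Exclude the most frequent lexical words (too unspecific).
    too_common = {w for w, _ in lexical.most_common(top_lexical)}
    # 3. Exclude words occurring fewer than min_count times.
    return sorted(w for w, c in lexical.items()
                  if c >= min_count and w not in too_common)


tokens = (["the"] * 10 + ["data"] * 6 + ["corpus"] * 4 +
          ["sample"] * 3 + ["rare"] * 2 + ["once"])
print(clue_words(tokens))
```

In this example "the" is removed as a structure word, "data" as the most frequent lexical word, and "rare" and "once" as too infrequent, leaving "corpus" and "sample" as candidate clue words.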
SUMMARY

Numerous studies have been made into the vocabulary used in printed English language materials since the 1920s. Most of the studies have originated in the U.S.A. and have concentrated on word-lists designed for use in primary and elementary schools.

The development of frequency counts of words occurring in written discourse has aided other researchers in their examination of both linguistic and psychological aspects of the language. Vocabulary lists provide one of the most important factors in readability work and much of this research relies on the availability of word lists. Sources of this nature at the secondary school level have been lacking in the past.

The latest trends have seen the use of digital computers which have allowed researchers to deal with much greater and more diverse amounts of printed materials based on careful procedures of random sampling. The need remains for similar, well-designed studies to be made into instructional materials used in Canadian schools at the secondary level.

Few word lists have been developed based on representative sampling from secondary subject area materials which allow for analysis across grades and by subject area. Word lists traditionally provide data in frequency of occurrence of word-types but do not indicate repeat rate frequency or averages or variability for samples organized by grades or subject areas.
In addition, only a small number of studies have reported sentence length characteristics, indicating averages and variability, or repeat rate frequency for sentence length types for samples organized by grades and subject areas.

Although subjective analyses have been reported, few studies based on data from carefully selected print sources have been announced which empirically validate subjective opinions with respect to word and sentence characteristics of samples of natural language text from subject areas.

Early attempts to construct readability measures resulted in formulas that required lengthy and fairly complex computations. Later research into the most significant language variables involved in readability isolated word difficulty (often measured as word length and word frequency) and sentence length as two important variables.

Recent research into readability has emphasized the need to develop much larger samples of instructional materials and more usable word lists on which to base investigations. Also, the need to look much more closely at linguistic variables which appear to have a causal relationship to comprehension difficulty is of prime importance. Again, data bases consisting of carefully selected representative samples from natural language text are necessary in furthering this type of research.
Because of the numbers of samples involved and the complexity of the linguistic variables which need to be examined, data bases should be organized for computer input and processing.

The use of the Cloze technique to measure readability has gained considerable attention since the late 1950s. This procedure involves many aspects of language including lexical and structural words, grammar, and connotative features of language. Many researchers feel that in the future the Cloze technique will be able to contribute a great deal to an understanding of the syntactic and semantic functions of the basic language variables in instructional materials at the secondary level. Effective Cloze analyses are also facilitated by the availability of well organized data bases that have known word and sentence characteristics.

The development of methodology to identify significant content material in a printed passage was recommended by several researchers. The techniques have potential in the examination of word lists derived from samples of print materials from subject areas. Many studies generate word lists but little attention is given to the provision of adequate techniques for the identification of the significant content in such lists.

In summary, the focus of this study is on the identification and analysis of the lexical characteristics of a sample of print materials prescribed for use in junior secondary subject areas.
The conceptual base, design and methodology for the study emanate from the review and analysis of selected, related literature in the four areas previously discussed. A well defined, representative, adequately stratified body of print material, consisting of 500-word samples, forms the basis for the development of word lists and comparative and statistical analyses. The samples are organized to represent characteristics of the prescribed print materials across the total junior secondary curriculum, by the three grades, by the seven subject areas across grades, by the eighteen subjects within grades, and by the thirty-seven textbooks. The word and sentence data are analyzed in terms of the relative frequency of occurrence of various word-types and sentence lengths, and empirical tests are made to illustrate the pattern in frequency of occurrence of the most common words and a series of representative sentence lengths. A technique is proposed which serves as a model for the identification of the most significant content in word lists derived from subject area materials. Finally, computer technology is used throughout in the development, organization, comparison and analysis of the data base and in the production of the final printed copy of the dissertation itself.

CHAPTER III

THE RESEARCH DESIGN

This chapter describes the research design and methodology for the study.
The study was concerned with "present-oriented" research and a descriptive, survey approach was used to describe a specific set of phenomena in and of themselves utilizing unobtrusive measures derived from samples of natural language text. The information provides the answers to the research questions and hypotheses posed. The research method was developed to make an accurate assessment of the incidence, distribution, and relationships of the phenomena under investigation.

The research design was organized to generate the samples of natural language text, produce the Corpus of materials, develop the various word lists and generate the data necessary to accomplish the nine major tasks, answer the questions raised, and test the specific hypotheses of the study as outlined in Chapter I. A Pilot Study was first conducted to generate needed computer programs, test procedures and sharpen the methodology for the study. Following a description of the Pilot Study, the design and methodology for each of the nine major tasks are presented.
THE PILOT STUDY

Before commencing with the study it was necessary to make a trial run with a small sample of instructional materials. This procedure was utilized to determine:

(a) the time needed for keypunching a set amount of running prose (10,000 words) so that an estimate could be made of the eventual size of the data base to be used in the study,
(b) the incidence of errors in keypunching to determine whether it was necessary to have the work verified by machines,
(c) the efficiency of existing programs and the need for additional programs necessary to organize word lists, make statistical analyses, etc.,
(d) the size of the samples taken from each text needed to give a valid representation of the content material,
(e) the reliability of using a random, stratified sampling technique within a textbook,
(f) the use of delimiters to determine words and sentences in the content material, and
(g) the amount of data that could be feasibly analyzed within the time and resources available.

Twenty-one samples of approximately 500 words were taken from the prescribed "B" issue textbook for Agriculture, Farmer's Shop Book (see Table II). This particular text was chosen for the Pilot Study because in the judgment of the researcher the material contained a good selection of both verbal and symbolic language likely to be found in the other content areas.
TABLE II

THE TWENTY-ONE, 500 WORD SAMPLES USED IN THE PILOT STUDY

Sample  Pages       Sample  Pages
01      14-17       12      231-233
02      34-36       13      261
03      62-64       14      274
04      78-81       15      313-316
05      109-111     16      328-329
06      129         17      353-355
07      146-147     18      375
08      161-164     19      390-391
09      189-190     20      416-417
10      216-220     21      433-437
11      223-275

The rate of error was found to be less than one word per 500 words keypunched which suggested that machine checking was not warranted. The rate of keypunching was estimated at approximately 1,000 words per hour under ideal conditions.

As a result of the Pilot Study, the following decisions were made:

(a) A Corpus of approximately 235,000 words of running prose taken from 469 samples of 500 words each was feasible for the study.
(b) The Command Operand ")P", which was interspersed throughout the text input to signal a new paragraph, was deleted from the final frequency count of words because of its high rate of occurrence.
(c) An additional nine delimiters of a word were included to bring the total to twenty-one. These consisted of:
(d) The dictionary size established to deal with word-types was set at 20,000.
(e) A "Repeat Rate Frequency" table designed to illustrate the incidence of similarly occurring frequencies for both word-types and sentence lengths was included for each frequency word list and for the sentence analysis.
(f) The chi-square and Yule's Characteristic "K" statistics were tested and included in the study for both word frequency and sentence length analyses.
(g) A number of additional programs were developed to enable data to be generated in the form desired.
(h) The graphs depicting word frequency and frequency of sentence length were plotted by computer programs.

TASK 1. SELECTION OF MATERIALS

The sampling procedures were developed to provide representative lexical data for every prescribed text with sufficient quantities of natural language prose. The selection procedure consisted of two phases: an initial subjective decision to determine the number of texts and samples to be used, followed by a stratified, random sampling procedure to determine the number of samples to be selected within each text. Works of verse or drama were not included on the grounds that they seemed to involve special linguistic problems and did not constitute the usual syntax associated with normal prose. Passages containing special coding techniques such as shorthand and mathematics were excluded for the same reasons.

Sampling Procedures

(a) Textbooks and Samples. Thirty-seven "A" issue textbooks containing samples of English language prose of 500 words or over were included in the study. The total number of textbooks and samples for each content area is presented in TABLE III. Information pertaining to titles, authors and publishers of the books is listed in APPENDIX A.

TABLE III

NUMBER OF TEXTS AND SAMPLES FOR EACH GRADE LEVEL AND SUBJECT AREA

SUBJECT           GRADE 8        GRADE 9        GRADE 10       SUBJECT TOTAL
                  Text  Sample   Text  Sample   Text  Sample   Text  Sample
Commerce          X     X        2     25       2     16       4     41
English           2     17       4     47       2     16       8     80
Home Economics    1     22       5     76       X     X        6     98
Ind. Education    1     9        3     54       X     X        4     63
Mathematics       1     14       1     7        1     14       3     35
Science           2     20       2     24       2     31       6     75
Social Studies    2     22       2     13       2     42       6     77
Grade Totals      9     104      19    246      9     119      37    469

The thirty-seven "A" issue textbooks were selected for use because every student in each class or course of study receives a copy of the text. Other texts include those provided in sets to be shared by the students ("B" issue); prescribed for teacher use only ("C" issue); or allotted for special purposes ("D" and "E" issue) and are described in the booklet, Prescribed Textbooks, 1972-73, Grades_I^ published by the Department of Education, Province of British Columbia.

Eleven "A" issue textbooks were not included in the study. The grade level and subject area of the omitted textbooks (with the number of texts in brackets) were as follows: Grade 10 Commerce (2), Grade 8 English (1), Grade 9 English (1), Grade 10 English (2), Grade 9 Mathematics (1), Grade 10 Mathematics (1), Grade 8 Social Studies (1), Grade 9 Social Studies (1), Grade 10 Social Studies (1). The reasons for excluding the textbooks are as follows: the Commerce textbooks contained shorthand exercises; the English textbooks consisted of poetry and blank verse; the Mathematics textbooks contained mainly algebraic and geometric problems; and the Social Studies textbooks were atlases.

The textbooks used in Grade 10 Home Economics and Industrial Education were the same as those prescribed for use in Grade 9 and were included only once because the repetition of identical material would have distorted the results obtained.
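Two of the measures adopted after the Pilot Study, the "Repeat Rate Frequency" table and Yule's Characteristic "K", may be illustrated by the following minimal sketch in Python. It is not the program used in the study. The repeat rate table is interpreted here as a count of how many word-types share each frequency of occurrence, which is an assumption about the layout of the table, and K is computed in its usual form, 10,000 multiplied by (S2 - N) and divided by N squared, where N is the number of running words and S2 is the sum of m squared multiplied by the number of word-types occurring m times. The function names are hypothetical.

```python
from collections import Counter


def repeat_rate_table(tokens):
    """Map each frequency m to the number of word-types occurring m times."""
    type_counts = Counter(tokens)
    return dict(sorted(Counter(type_counts.values()).items()))


def yules_k(tokens):
    """Yule's Characteristic K for a list of running words."""
    table = repeat_rate_table(tokens)
    n = sum(m * v for m, v in table.items())       # running words
    if n == 0:
        raise ValueError("at least one token is required")
    s2 = sum(m * m * v for m, v in table.items())  # sum of m^2 * V(m)
    return 10_000 * (s2 - n) / (n * n)


tokens = "the cat saw the dog and the bird".split()
print(repeat_rate_table(tokens))
print(yules_k(tokens))
```

In the eight-word example, five word-types occur once and one occurs three times, so S2 is 14 and K is 10,000 x 6 / 64. The same two functions apply to sentence lengths if a list of lengths is supplied in place of a list of words.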
A Grade 9 English textbook used in Grade 10 English was not included in the latter total for the same reason.

(b) Sampling. A total of 469 samples, each of approximately 500 words, were selected from the thirty-seven "A" textbooks (see APPENDIX A). Samples of 500 words were used because research evidence suggested that there was both greater diversity of word-types from samples of this size than from larger samples, and sufficient flexibility in the representation of content materials (Carroll et al., 1971). The samples consisted of English language running prose and were randomly selected from every twenty pages throughout each text using a table of random numbers. Each sample began with the first complete sentence on the page selected and continued for approximately 500 words. Everything other than running prose was omitted, including titles, running heads, footnotes, tables, and picture captions. Two lists of the sample sizes, one in alphabetical order and one in ascending rank order, are presented in APPENDIX B.

These procedures produced a total random sample of approximately 40 percent of the "A" issue instructional materials prescribed for use in the seven subject areas of Grades 8, 9, and 10 in British Columbia junior secondary schools.

TASK 2. INPUT PROCESSING, KEY PUNCHING AND EDITING

Once the sampling procedures were established, the selections were keypunched onto computer cards using the UBC FORMAT (FMT) program. FMT is a program which enables the rapid printing of materials in upper and lower case and with special characters directly on the system printer.
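The selection of one starting page from every twenty pages of a text, described under Sampling above, may be sketched as follows. The study used a table of random numbers; the sketch substitutes Python's pseudo-random generator, and the function name select_sample_pages, the block and seed parameters, and the treatment of a short final block are assumptions made for illustration.

```python
import random


def select_sample_pages(total_pages, block=20, seed=None):
    """Pick one random starting page from each consecutive block of pages.
    A short final block is sampled like any other (an assumption)."""
    rng = random.Random(seed)
    pages = []
    for start in range(1, total_pages + 1, block):
        end = min(start + block - 1, total_pages)
        pages.append(rng.randint(start, end))  # inclusive of both ends
    return pages


# A hypothetical 437-page text yields 22 starting pages.
print(select_sample_pages(437, seed=1))
```

Each sample would then begin with the first complete sentence on the selected page and continue for approximately 500 words of running prose.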
Input to the program was in free-form lines. The material was formatted and controlled according to control cards and command words interspersed throughout the input. The basic command operands and special operand values of the FMT program were used to organize the format of the document and allow for most instances of special arrangement (indenting, centering, underlining, etc.) usually required in a formal paper. In addition, a special symbol was placed after any period that did not signify the end of a sentence (e.g. the period in "Dr.").

Each sample was given a code number consisting of the grade level (designated by 1, 2, or 3 for Grades 8, 9, and 10); a letter signifying the subject area: Commerce (B), English (C), Home Economics (D), Industrial Education (E), Mathematics (F), Science (G), Social Studies (H); a two-digit number for the order of the text; the letter "C" to represent the Corpus; and another two-digit number to distinguish the sequence of samples in each text. Thus 2B 01 C 01 designated the first sample in the first textbook listed for Grade 9 Commerce, and 3H 03 C 07 represented the seventh sample in the third textbook listed for Grade 10 Social Studies.

The information on the cards was then transferred in 80-character "card-image" form to a magnetic tape and stored permanently in the computer library. The equipment used was an IBM/370 Model 168 computer, with 2 megabytes of storage, and five 9-channel 1600 bpi IBM 2401 tape drives. The computer has 14 ITEL 7330 disk units and four 1100-line-per-minute printers, plus a number of card readers and card punches.
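The sample code numbers described above (for example 2B 01 C 01) lend themselves to mechanical decoding. The following is a minimal sketch in Python, not part of the original programs; the names `SampleCode` and `parse_sample_code` are hypothetical, and the tolerance for optional blanks between fields is an assumption based on the two examples given in the text.

```python
import re
from typing import NamedTuple

# Grade digit -> school grade, as stated in the text.
GRADES = {"1": 8, "2": 9, "3": 10}

# Subject letter -> subject area, as stated in the text.
SUBJECTS = {
    "B": "Commerce",
    "C": "English",
    "D": "Home Economics",
    "E": "Industrial Education",
    "F": "Mathematics",
    "G": "Science",
    "H": "Social Studies",
}


class SampleCode(NamedTuple):
    """Decoded fields of a sample code (hypothetical helper type)."""
    grade: int
    subject: str
    text_number: int
    sample_number: int


# grade digit, subject letter, two-digit text number, "C", two-digit sample number
_CODE = re.compile(r"^([123])([B-H])\s*(\d{2})\s*C\s*(\d{2})$")


def parse_sample_code(code: str) -> SampleCode:
    """Decode a code such as '2B 01 C 01' into its four fields."""
    match = _CODE.match(code.strip())
    if match is None:
        raise ValueError(f"not a valid sample code: {code!r}")
    grade, subject, text_number, sample_number = match.groups()
    return SampleCode(
        GRADES[grade], SUBJECTS[subject], int(text_number), int(sample_number)
    )
```

Under these assumptions, `parse_sample_code("3H 03 C 07")` yields Grade 10, Social Studies, text 3, sample 7, which agrees with the second example in the text.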
A more detailed explanation of the 209 computer files and programs developed for processing the input samples and conducting the analyses is provided in APPENDIX C. These programs are available at the Computing Centre, University of British Columbia.

Text Corrections

Three methods of text correction were used to ensure accuracy.

1. A preliminary stage of proof-reading took place when the content of each sample had been keypunched onto IBM cards. The cards were scanned by the writer and checked against the original text. The cards were then printed as FMT output and the print-out was again scanned for obvious errors. Verification by machine means was not used because of the small incidence of errors noted in the Pilot Study and also because of the high cost of this method.

2. The second stage of editing made use of the Conversational Terminal, which is an IBM 3270 Display Station consisting of a cathode-ray-tube screen (CRT) and a typewriter-like keyboard. The original input data were displayed on the CRT and scanned again for errors, corrections were made, and a revised print-out was obtained for examination. The use of the Conversational Terminal facilitated very fast proof-reading and correction of the material being processed.

3. The final stage of proofing was possible after the Corpus vocabulary had been arranged in descending order of frequency of word-types.
This method of editing was by far the most efficient, for it merely entailed checking the hapax legomena (words that occurred once) and words that had occurred twice to quickly identify obvious errors. The chance of a word being incorrectly keypunched in the same way more than twice in different parts of the corpus was considered unlikely, and a quick check confirmed this belief.

TASK 3. PRODUCTION OF THE CORPUS

Task 3 involved the use of existing programs and the development of new programs to generate two copies of the Corpus: one organized by subjects within each of the three grade levels, and the other organized by subjects across the Corpus. An index was also developed for each Corpus which listed the full description of the textbooks and the samples used in the study. The two corpora have been produced as separate volumes and are identified as CG (Corpus by Grades) and CS (Corpus by Subjects). The MTS FORMAT computer program was used to produce the print-out of these corpora. The production of the two copies of the Corpus is illustrated in FIGURE 1. A detailed description of the computer programs and procedures followed to produce the two copies of the Corpus is presented in the document Programmer's Guide to the Edwards' Corpus by Allan Miller (1974).

[Flowchart: extracts are keypunched onto cards with *FMT controls and cycled through the *FMT formatting program and the MTS Editor until corrected. Files: AGRICULTURE, COMMERCE, ENGLISH, HOMEC, INDED, MATH, SCIENCE, SOCIALS. All files except AGRICULTURE, excluding *FMT control commands from the 2nd and subsequent files, form the CORPUS, which is printed as Vol. II (Subjects of the Corpus). A SPLIT program separates the grades (Grade8, Grade9, Grade10), which are formatted by *FMT and printed as Vol. I (Grades of the Corpus).]

FIGURE 1 PRODUCTION OF VOLUMES C.G. AND C.S. OF THE CORPUS

TASK 4. PRODUCTION OF WORD LISTS

This task involved the generation of word lists based on various combinations of samples into distinctive corpora. Two sub-tasks were involved.

Task 4.1 Existing computer programs were utilized and new programs developed where needed to generate two lists based on each of the following sixty-six corpora: a) the Corpus; b) three grade corpora; c) seven subject corpora; d) eighteen subject within grade corpora; e) thirty-seven textbook corpora.

The first of the two lists is an alphabetical arrangement of word-types (See APPENDIX D). The list consists of three columns, and a number placed at the top of the first column provides a running total of the word-types to that point. The first column (FREQ) indicates the relative frequency per 1000 tokens for each word-type; the second column (COUNT) states the frequency of occurrence; and the third column (WORD) lists the word-type. The second list presents the rank-order of each word-type (See APPENDIX E). The rank list also consists of three columns similar to the alphabetical list, except that the first column (FREQ) indicates the cumulative percent of the total corpus accounted for by each word-type.

Task 4.2 Two additional tables were included for the rank-order list which summarized the rank in (a) descending order (i.e. the word of highest frequency first and the hapax legomena last); and (b) ascending order (i.e.
the hapax legomena first and the highest frequency word last). (See APPENDIX F.) The organization of the tables and the column headings are identical, except that the table which gives the descending order has an extra column (RANK) which provides the rank number of each word-type. This arrangement makes it possible to quickly locate the rank of any word in the Corpus or the various corpora by matching the frequency of a word in either the alphabetical list or the rank-order list, under the column COUNT, with the same frequency in column X in the descending order table. In cases where the frequency of word-types is the same, the rank range of these words is supplied. The column headings in the descending and ascending order tables provide the following information.

Column X: The frequency of occurrence of tokens.
Column FX: The number of word-types of the frequency X.
Column SUM FX: The sum of word-types counting from the top of the table.
Column CUM% FX: The sum of word-types as a cumulative percentage of the total number of word-types.
Column FX * X: The number of tokens accounted for by each of the word-types.
Column SUM FX * X: The number of tokens due to the cumulative total of word-types.
Column CUM% FX * X: The previous column as a percentage of the total number of tokens.

The descending and ascending order tables facilitate the rapid analysis of the information contained in the various word lists. For example, the descending order table shows that the first 100 most frequent word-types in the Corpus account for a mere 0.610 percent of the total number of word-types.
However, the same 100 word-types account for 48.973 percent of the total number of tokens in the Corpus. On the other hand, the ascending order table shows that words occurring ten times or less account for 84.505 percent of the word-types but only 14.705 percent of the total number of tokens.

The word lists and accompanying tables described in Task 4 have been organized into the following five volumes:
1) Corpus, designated as C.V. (Corpus Vocabulary);
2) Grades, designated as G.V. (Grade Vocabulary);
3) Subjects, designated as S.V. (Subject Vocabulary);
4) Subjects within Grades, designated as S.G.V. (Subjects by Grade Vocabulary); and
5) Textbooks, designated as T.V. (Textbook Vocabulary).

The production of the volumes discussed in Task 4 was accomplished by making use of a number of computer programs as illustrated in FIGURE 2.

[Flowchart: the raw data files and the Grade8, Grade9, and Grade10 files pass, with control information, through a word count program, the MTS *SORT utility, and the table programs (TABL.B1.S, descending with rank and ascending without rank). SPLIT2.S separates grades and subjects, and SPLIT3.S separates texts. Files: all 67 raw data files listed in APPENDIX C with the exception of AGRICULTURE (the Pilot Study). Volumes: C.V., G.V., S.V., S.G.V., and T.V.]

FIGURE 2 PRODUCTION OF WORD LISTS: VOLUMES C.V., G.V., S.V., S.G.V., AND T.V.

A complete description of the word lists and the computer programs used is available on Tape #RE0616 at the University of British Columbia Computing Centre.

TASK 5.
DESCRIPTION AND ANALYSIS OF LEXICAL CHARACTERISTICS

For Task 5 existing computer programs were utilized and new programs developed where necessary to generate a number of comparative and statistical analyses for the Corpus, the three grade level corpora, the seven subject-area corpora, the eighteen subject within grade corpora, and the thirty-seven textbook corpora.

Task 5.1 was designed to determine the number of word-types, tokens, characters, and average number of characters per token for each of the following: a) the Corpus; b) three grade corpora; c) seven subject corpora; d) eighteen subject within grade corpora; and e) thirty-seven textbook corpora. Comparative summary tables were developed for these data and are presented in Chapter IV.

Task 5.2 was designed to determine the repeat-rate frequency for word-types for each of the following: a) the Corpus; b) three grade corpora; c) seven subject corpora; d) eighteen subject within grade corpora; and e) thirty-seven textbook corpora. The repeat-rate frequency tables of word-types for each of the sixty-six corpora are included in the five volumes C.V., G.V., S.V., S.G.V., and T.V. (See Task 4.2). The first column (REPETITIONS) of each table gives the frequency of the word-type and the second column (RATE) indicates the number of word-types that have this frequency. The tables thus combine like frequencies of word-types and present different information from the basic tables of word frequencies discussed earlier.
Task 5.3 makes use of Yule's characteristic K, which is a statistical parameter of a frequency distribution based on the Poisson probability law. The assumptions underlying the use of the K characteristic have been stated theoretically (Yule, 1944) and tested empirically (Kucera and Francis, 1967). In brief, the K factor is said to be independent of sample size when the samples have been collected from a large body of materials.

Formula for K:

$$K = 10{,}000 \, \frac{S_2 - S_1}{S_1^{2}}$$

where $S_1 = \sum_x f_x X$ is the first moment of the distribution about zero as origin, $S_2 = \sum_x f_x X^{2}$ is the second moment, and $f_x$ is the number of words occurring X times. The quantity 10,000 is introduced to avoid dealing with small decimals.

Yule's characteristic K was used to provide an indication of the concentration of vocabulary in the samples from a particular area. A large K value implies a greater use of commonly occurring vocabulary or words of high frequency of occurrence. A low K value implies that the material contains a greater proportion of rare words or words of low frequency. Summary tables of Yule's K for each of the sixty-six corpora are presented in Chapter IV.

TASK 6. DESCRIPTION AND ANALYSIS OF SENTENCE CHARACTERISTICS

A number of existing computer programs were used and others modified for use in the development of Task 6. Comparative and statistical analyses were generated, based on the sentence and sentence length characteristics of the sixty-six corpora of the study. Four major sub-tasks were involved.
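Yule's characteristic K, as defined under Task 5.3, can be computed directly from a list of tokens. The following Python sketch is an illustration only and is not the program used in the study; the function name `yules_k` is hypothetical, and tokenization is assumed to have been done beforehand.

```python
from collections import Counter
from typing import Iterable


def yules_k(tokens: Iterable[str]) -> float:
    """Return K = 10,000 * (S2 - S1) / S1**2 for a sequence of tokens."""
    # X for each word-type: how many times that word-type occurs.
    type_counts = Counter(tokens)
    # f_x: how many word-types occur exactly X times.
    freq_of_freq = Counter(type_counts.values())

    s1 = sum(fx * x for x, fx in freq_of_freq.items())       # first moment
    s2 = sum(fx * x * x for x, fx in freq_of_freq.items())   # second moment
    if s1 == 0:
        raise ValueError("cannot compute K for an empty sample")
    return 10_000 * (s2 - s1) / (s1 * s1)
```

A sample in which every token is a different word-type gives S2 = S1 and therefore K = 0, the extreme case of a vocabulary with no repetition; heavier repetition of a few word-types drives K upward, in line with the interpretation given above.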
Task 6.1 was designed to provide sentence length characteristics including mean sentence length, standard deviation, coefficient of variation, median, mode, average number of sentences, and Pearson's skew factor for each of the following: a) the Corpus; b) three grade corpora; c) seven subject corpora; d) eighteen subject within grade corpora; and e) thirty-seven textbook corpora. Comparative summary tables were developed for these data and are presented in Chapter IV.

Full details of the sentence length distribution of each of the sixty-six corpora are provided in a volume titled SENT. (See APPENDIX G). The volume arranges the data for each table under five headings. The first column (LENGTH) states the length of the sentence in words and the second column (REPETITIONS) gives the number of occurrences of this particular sentence length. Column three (CUM. SENT) lists the sum of sentences counting from the top of the table and the fourth column (ACCUM WORDS) serves the same function for words. Column five (% WORDS) gives the running total of the percentage of tokens accounted for throughout the sentence length distribution.

Task 6.2 A matching set of graphs illustrating each of the sixty-six sentence length distributions was printed through the UBC plotting package using a CALCOMP Drum Plotter at the UBC Computing Centre. The graphs are presented as APPENDIX H.
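The sentence length characteristics listed under Task 6.1 can be sketched with Python's standard `statistics` module. This is a hedged illustration, not the original program: the function name `sentence_length_summary` is hypothetical, the population standard deviation is assumed, and because the text does not say which of Pearson's skewness coefficients was used, the sketch assumes the median-based form 3(mean - median)/SD.

```python
import statistics
from typing import Dict, Sequence


def sentence_length_summary(lengths: Sequence[int]) -> Dict[str, float]:
    """Summarize a list of sentence lengths (in words)."""
    mean = statistics.mean(lengths)
    sd = statistics.pstdev(lengths)          # population SD (assumption)
    median = statistics.median(lengths)
    mode = statistics.mode(lengths)          # first mode if several tie
    return {
        "sentences": len(lengths),
        "mean": mean,
        "sd": sd,
        # coefficient of variation, expressed as a percentage of the mean
        "cv_percent": 100 * sd / mean,
        "median": median,
        "mode": mode,
        # Pearson's median-based skewness (assumed variant); 0 if no spread
        "pearson_skew": 3 * (mean - median) / sd if sd else 0.0,
    }
```

A symmetric set of lengths such as 10, 20, 20, 30 gives a skew of zero; a long right tail of very long sentences pulls the mean above the median and makes the coefficient positive.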
Task 6.3 This task was designed to provide data on the repeat-rate frequency of sentence lengths for each of the following: a) the Corpus; b) the three grade corpora; c) seven subject corpora; d) eighteen subject within grade corpora; and e) thirty-seven textbook corpora. The repeat-rate frequency tables for sentence lengths for each of the sixty-six corpora are also presented in SENT. A complete description of the sentence characteristics is available on Tape #RE0616 at the UBC Computing Centre.

Task 6.4 made use of Yule's characteristic K along the lines suggested in Task 5.3 (Kucera and Francis, 1967). In this procedure, X in the statement $S_1 = \sum_x f_x X$ equals the number of occurrences of a specific sentence length and $f_x$ equals the number of cases of X. The characteristic K is useful in indicating whether material contains a great diversity of sentence lengths (low K value), or whether there is a high repetition of commonly occurring sentence lengths (high K value). The implications of the K factor for differences in writing style are discussed in later chapters. Summary tables of K for each of the sixty-six corpora outlined in Task 6.1 are presented in Chapter IV.

TASK 7.
ANALYSIS OF DISTRIBUTION OF 100 MOST FREQUENT WORD-TYPES

Task 7 utilized existing computer programs and developed new programs where needed to analyze the distribution of the 100 most frequently occurring word-types across the following corpora: a) three grade levels; b) seven subject areas; c) six subjects in Grade 8; d) seven subjects in Grade 9; and e) five subjects in Grade 10. Two major sub-tasks were involved.

Task 7.1 The chi-square test was used to test whether there were significant differences in the distribution of the 100 most frequent word-types in the five areas described above, using the usual formula:

$$\chi^{2} = \sum \frac{(o - e)^{2}}{e}$$

where o = the observed frequency of the word-type, and e = the expected frequency of the word-type. (The expected value equals the ratio of the total number of word-types in a corpus to the total number of word-types in the Corpus, multiplied by the total number of Corpus occurrences of the respective word-type being tested.) The .01 level of significance was chosen to test the hypotheses in order to guard against a Type I error. That is, it was decided to risk rejecting the null hypotheses when they were true only one time in 100.

Complete details of the chi-square tests for the distribution of word-types have been arranged in a series of tables and are presented in APPENDIX I. For each word-type there are three lines of data.
The first line gives the observed frequency, the second line lists the expected frequency, and the third line indicates the ratio of the number of occurrences of the specific word-type in the corpus to the total number of all word-types in the corpus, expressed as a percentage. The 100 most frequent word-types are placed in ranked order on the left hand side of the tables.

Task 7.2 was designed to analyze and illustrate the number of word-types which differed significantly in their distribution across each of the five areas tested in Task 7.1. A summary table of these results is presented in Chapter IV. The table is organized into seven columns, with the first column (RANK) giving the rank listing of each of the 100 word-types and the second column (WORD) listing the word-type. Columns three to seven indicate whether or not each of the word-types is evenly distributed across the three grade levels (GRADES); the seven subject areas (SUBJECTS C); the subjects in Grade 8 (SUBJECTS 8); the subjects in Grade 9 (SUBJECTS 9); and the subjects in Grade 10 (SUBJECTS 10).

TASK 8. ANALYSIS OF DISTRIBUTION OF SELECTED SENTENCE LENGTHS

This task involved the testing of a number of null hypotheses which stated that there were no significant differences in the sentence length distributions of the subject areas in the various corpora when compared to the normal population expressed by the sentence length distribution of the Corpus.
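The chi-square computation described for Task 7.1, which Task 8 applies in the same way to sentence lengths, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the program developed for the study: the function names are hypothetical, the sub-corpora are assumed to partition the Corpus, and each sub-corpus "size" is whatever total the expected-value ratio is taken over.

```python
from typing import Sequence


def expected_frequency(
    subcorpus_size: float, corpus_size: float, corpus_count: float
) -> float:
    """Expected count of an item in one sub-corpus.

    Ratio of sub-corpus size to Corpus size, multiplied by the total
    number of Corpus occurrences of the item being tested.
    """
    return subcorpus_size / corpus_size * corpus_count


def chi_square(observed: Sequence[float], subcorpus_sizes: Sequence[float]) -> float:
    """Sum of (o - e)^2 / e for one item across all sub-corpora."""
    if len(observed) != len(subcorpus_sizes):
        raise ValueError("one observed count is needed per sub-corpus")
    corpus_size = sum(subcorpus_sizes)   # assumes sub-corpora partition the Corpus
    corpus_count = sum(observed)         # total Corpus occurrences of the item
    statistic = 0.0
    for o, size in zip(observed, subcorpus_sizes):
        e = expected_frequency(size, corpus_size, corpus_count)
        statistic += (o - e) ** 2 / e
    return statistic
```

For a test across the three grade levels there are two degrees of freedom, for which the critical value at the .01 level is 9.210; a word-type whose statistic exceeds that value would be judged unevenly distributed across the grades.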
The chi-square test was used to test these hypotheses using the usual formula:

$$\chi^{2} = \sum \frac{(o - e)^{2}}{e}$$

where o = the observed frequency of the sentence length, and e = the expected frequency of the sentence length. (The expected value equals the ratio of the total number of sentence lengths in a corpus to the total number of sentence lengths in the Corpus, multiplied by the total number of Corpus occurrences of the respective sentence length being tested.) The level of significance used to test these hypotheses was .01.

The chi-square tests were run using five ranges of sentence lengths: 10, 20, 30, 40, and 50+ words in length. These sentence lengths were chosen to represent short sentences, a group of sentences on either side of the Corpus mean sentence length, and two groups of longer sentences. The last range included all sentences with 50 words or above because of the variety and small number of sentences expected in this category.

A computer program was developed to test the distribution of occurrence of the five selected sentence lengths across the three grade levels, the seven subject areas, the six subject areas in Grade 8, the seven subject areas in Grade 9, and the five subject areas of Grade 10. A summary of these results appears in Chapter IV. Complete details of the chi-square tests for the sentence length distribution have been arranged into five tables and are presented in APPENDIX J. The format of the tables is the same as that discussed in Task 7, except that the selected sentence lengths are placed on the left hand side of the tables.

TASK 9.
IDENTIFICATION OF SIGNIFICANT CONTENT MATERIAL

The final task in the study involved the development of an "elimination technique" for the selection of the most significant words in specific content areas using the ranked frequency word lists developed for the Corpus, the three grade level corpora, the seven subject-area corpora, and the thirty-seven textbook corpora. Three sub-tasks were involved.

Task 9.1 Word frequency graphs were constructed for the eleven corpora described above using the UBC CALCOMP Drum Plotter. The graphs plotted the rank of each word-type along the abscissa and the frequency of each word-type on the ordinate. Because of the magnitude of the quantities being plotted, a one-tenth scale was used. (See APPENDIX K). The word frequency graphs take the general shape of the diagram in FIGURE 3.

[Diagram: word frequency curve; abscissa: words ranked in descending order.]

FIGURE 3 MODEL OF A WORD FREQUENCY DIAGRAM

Task 9.2 The "elimination technique" suggested in this study consists of two stages. The first stage is designed to identify the high frequency words in a word list that are considered to be too common to have special significance for the content area under investigation. A cutoff point is determined and these high frequency words are eliminated. The decision was made to use the position on the abscissa where 50 percent of the tokens occurred as the cutoff point. This involves elimination of most of the structure words. The line A in FIGURE 4 represents this cutoff.
The second stage is designed to eliminate words which are too rare or do not have sufficient frequency of occurrence to warrant their being considered as significant for the specific content area. A cutoff point is determined and these low frequency words are eliminated. The decision was made to use the position on the abscissa where approximately 10 percent of the low frequency tokens occurred as the cutoff point. This results in the elimination of words that occur only one to three times in most lists and which are regarded as low in significance. The line B in FIGURE 4 represents the cutoff. Words which fall in the gray area immediately to the left and right of both points A and B could of course also be included as having significance, depending on the judgment of the individual selecting significant content and the degree of accuracy desired in designating the words to be eliminated.

Task 9.3 The balance of the words remaining between points A and B (approximately 40 percent of the total tokens) represents, for most purposes, the most significant content in a word list. That is, these are the items of vocabulary that occur neither too frequently to be classed as common words, nor too infrequently to be classed as rare words. It must again be emphasized that subjective judgment by specialists in the content area concerned is vital in making final decisions in the elimination and retention of 'gray' area words and in establishing the general cutoff points for A and B.
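The two-stage "elimination technique" of Tasks 9.2 and 9.3 can be sketched in a few lines. The Python below is a minimal illustration under stated assumptions and is not the procedure as programmed in the study: the function name `select_significant_words` is hypothetical, cutoffs A and B are taken at 50 percent and 90 percent of cumulative tokens in the ranked list, and a word whose tokens straddle a cutoff (the 'gray' area) is retained rather than eliminated, which is one possible policy that a content specialist might well override.

```python
from typing import Dict, List


def select_significant_words(
    counts: Dict[str, int], high_cut: float = 0.50, low_cut: float = 0.10
) -> List[str]:
    """Return the word-types lying between cutoff A and cutoff B."""
    # Rank word-types by descending frequency (ties broken alphabetically).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    total = sum(counts.values())

    kept = []
    running = 0
    for word, count in ranked:
        start = running / total      # share of tokens ranked before this word
        running += count
        end = running / total        # share of tokens up to and including it
        if end <= high_cut:
            continue                 # stage 1: too common (left of line A)
        if start >= 1 - low_cut:
            continue                 # stage 2: too rare (right of line B)
        kept.append(word)
    return kept
```

Because the cutoffs are parameters, the same sketch can be rerun with slightly different values of `high_cut` and `low_cut` to inspect which words enter or leave the gray areas around A and B.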
FIGURE 4 APPLICATION OF "ELIMINATION TECHNIQUE" TO THE MODEL OF A WORD FREQUENCY DIAGRAM

A complete discussion of the results of applying the "elimination technique" to the Corpus, the three grade level corpora, and the seven subject-area corpora is presented in Chapter IV using the frequency distribution graph of the Corpus as an example.

One final point should be made with respect to the design and methodology of the study. The entire production of the dissertation was accomplished through the use of computer technology. Initially, each chapter was keypunched onto IBM computer cards, read into the computer memory bank and stored on disk. The dissertation was then edited using a 3270 CRT unit, revised numerous times, and finally printed in its present form using the FMT computer program. The graphs and chi-square tables in the APPENDIX were produced by special programs and reduced for convenience of presentation.

The use of the computer in producing the dissertation had the great advantage of allowing constant revisions to be made and multiple copies of the revised manuscript to be obtained very quickly. The formatting of tables and other descriptive statistics, plus the construction of graphs, were also relatively easy with computer facilities. The major drawback in using computer techniques was the need for the researcher to edit very carefully the 'logical' but set procedures used by the computer in the organization and interpretation of print materials.
This involved an understanding of basic computer processes plus some of the programming language used in generating the computer output.

CHAPTER IV

ANALYSIS OF THE DATA AND FINDINGS

The purpose of the chapter is to present and analyze the results obtained from the completion of Tasks 1-9. The tasks resulted in the production of over 5,500 pages of printed material, including print facsimiles of all the instructional materials sampled, the sixty-six word lists, and accompanying tables, graphs, and statistical summaries. These data were then organized into eight volumes and are discussed fully in Tasks 3 and 4.

All of the material generated in the study, including over 200 computer files used to organize the material and twenty specially developed computer programs, has been placed on magnetic tape. Copies of the tape are available from the Computing Centre (Tape #RE0616) and the Special Collections Division of the Library (Tape #RE0617) at the University of British Columbia. A technical description of the procedures followed in developing and using the various computer programs is given in the booklet Programmer's Guide to the Edwards' Corpus, by Allan Miller, available from the Computing Centre at the University of British Columbia.

TASKS, QUESTIONS AND HYPOTHESES

The tasks outlined in Chapter I are restated in this section, the questions answered, the hypotheses tested, and the general findings presented.

Task 1.
Develop a representative corpus of natural language text based on instructional material prescribed for use in British Columbia junior secondary grades.

The thirty-seven textbooks and 469 samples used in developing the 235,107 word Corpus of instructional materials are described in APPENDIX A. The sample sizes ranged from 657 words to 338 words with a mean of 501.294 and a standard deviation of 44.187. Two copies of the 469 sample sizes, one organized in alphabetical order and one ranked by size in ascending order, are presented in APPENDIX B.

Task 2. Organize the Corpus for computer input and manipulation.

The keypunching of computer cards containing the Corpus was accomplished using the UBC FMT (FORMAT) program available from the Computing Centre at the University of British Columbia. The computer cards were read into the computer via a card-reader and placed on disk to await reorganization into the various tasks involved in the study.

Task 3. Generate two volumes of the Corpus: one organized by grade levels, and one organized by subject areas, each with a descriptive index.

Computer programs were utilized to generate the two volumes of the Corpus: 1) C.G., which presents print facsimiles of the instructional material organized by grade levels, and 2) C.S., which organizes the instructional material by subject. A description of the development of the corpora, which includes an index and full particulars for each text and the samples used, is included in the front of each of the two volumes.
A detailed listing of the 209 computer files and programs used in the study is presented in APPENDIX C.

Task 4.

Organize the samples into word lists for the Corpus, the grade corpora (3), the subject corpora (7), the subjects within grade corpora (18), and the textbook corpora (37).

4.1 For each of the above, provide an alphabetical and a rank order (descending frequency) listing of word-types to give the following information.

4.11 The frequency of occurrence of each word-type.

4.12 The cumulative percentage frequency of each word-type.

4.13 The relative frequency of occurrence of each word-type per 1000 tokens.

4.14 The descriptive statistics for the rank order lists of the Corpus and corpora including: X, FX, SUM FX, FX * X, CUM % FX * X.

4.2 Construct two summary tables for each of the sixty-six word lists, indicating the word frequency figures in descending order (highest frequency words first), and in ascending order (highest frequency words last).

Task 4.1.

The alphabetical and rank order word lists and relevant statistical details for the Corpus and all corpora are organized into five volumes as follows:

1) C.V. Represents the word list for the Corpus (345 pages),
2) G.V. Represents the word lists for the three grade level corpora (550 pages),
3) S.V. Represents the word lists for the seven subject area corpora (730 pages),
4) S.G.V. Represents the word lists for the eighteen subject within grade level corpora (986 pages),
5) T.V. Represents the word lists for the thirty-seven textbook corpora (1,292 pages).

All word lists are set up in two columns per page with fifty words per column for added convenience.
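The quantities named in 4.11 through 4.13 can be illustrated with a short Python sketch. This is not the program used in the study (those programs are documented separately); the function name `word_lists`, the tie-breaking rule, and the toy token list are assumptions made only for illustration.

```python
from collections import Counter


def word_lists(tokens):
    """Build an alphabetical and a rank order listing from a list of tokens.

    Hypothetical helper for illustration only; not the study's own program.
    """
    counts = Counter(tokens)
    total = sum(counts.values())

    # Alphabetical list: (type, frequency, relative frequency per 1000 tokens)
    alphabetical = [
        (word, freq, 1000.0 * freq / total)
        for word, freq in sorted(counts.items())
    ]

    # Rank order list: descending frequency, ties broken alphabetically (an assumption).
    # Each entry: (type, frequency, cumulative tokens, cumulative percentage)
    rank_order = []
    running = 0
    for word, freq in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
        running += freq
        rank_order.append((word, freq, running, 100.0 * running / total))

    return alphabetical, rank_order


# Toy input standing in for one sample of running prose
tokens = "the cat saw the dog and the dog saw the cat".split()
alphabetical, rank_order = word_lists(tokens)
```

In this sketch the cumulative column of the rank order list always ends at the total number of tokens, which is a convenient check on any listing of this kind.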
The organization of the alphabetical lists and the rank order lists is basically the same for all corpora, with one exception. Each word entry in both lists is preceded by two figures. For the alphabetical list, the first quantity in column FREQ indicates the relative frequency per 1000 tokens of the word entry. For the rank order list, the first quantity in the FREQ column indicates the cumulative total of the tokens in the Corpus contributed by the word entry. The second figure in each list gives the frequency of the word entry (see APPENDIXES D and E).

The alphabetical list begins with several command symbols plus a complete listing of the alphanumerical indexes of the 469 samples used in the study. All other types that do not begin with letters are placed at the end of the list.

The rank order list begins with the highest frequency word in the Corpus and places all other types in descending rank. t l i s _ o r d e r _ w i t h i n _ t IlliSki£§_Ii§ted_lasts_ A complete listing of the alphanumerical indexes of the 469 samples used in the study appears at the beginning of the hapax legomena entries.

Task 4.2.

The descending word frequency lists constructed for each of the sixty-six corpora give the rank of each word-type in descending order. A sample page from the descending lists is included in APPENDIX F. This list enables the rank of any word to be quickly located by first finding the frequency of occurrence of the word in the alphabetical list and then matching this number with the same frequency in column X of the descending list.
For example, if the rank of the word "about" in the Corpus is required, the reader could note that the frequency of the word in the alphabetical list (APPENDIX D) is 463 and determine from "The Corpus with Rank in Descending Order" list (APPENDIX F) that X = 463 corresponds to a rank of 55, which is the rank of the word "about" in the Corpus. This means that there are 54 words which have a greater frequency of occurrence than the word "about" and 16,350 words which occur less frequently in the Corpus. A similar procedure could be followed with entries in any of the other sixty-five corpora word lists.

Another service offered by the descending order word frequency list involves determining the relationship between word-types and tokens. The descending list for the Corpus indicates that the first 100 most frequent words constitute only 0.610 percent of the word-types in the Corpus, yet the same words account for 115,141 tokens or 48.973 percent of the total number of word occurrences in the Corpus.

The ascending order word frequency list developed for each corpus gives the rank of each word-type in ascending order. A sample page is included in APPENDIX F. This list is useful in determining the number of tokens accounted for by the rarely occurring word-types. For example, "The Corpus in Ascending Order" list indicates that low frequency word-types occurring ten times or less constitute 84.505 percent of the word-types in the Corpus yet account for only 34,572 tokens or 14.705 percent of the total number of word occurrences in the Corpus.
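The two look-ups described above, finding a rank from a frequency and relating word-types to tokens, can be sketched in Python as follows. The function names and the toy frequency list are hypothetical; the sketch assumes that a word's rank is one more than the number of types occurring more often, which matches the "about" example (54 types above rank 55).

```python
def rank_of_frequency(freqs, x):
    """Rank of a type with frequency x: one more than the number of types
    that occur more often (hypothetical helper, for illustration)."""
    return 1 + sum(1 for f in freqs if f > x)


def coverage(freqs, n_top):
    """Percentage of types and percentage of tokens accounted for by the
    n_top most frequent types."""
    ordered = sorted(freqs, reverse=True)
    pct_types = 100.0 * n_top / len(ordered)
    pct_tokens = 100.0 * sum(ordered[:n_top]) / sum(ordered)
    return pct_types, pct_tokens


def types_below(n_types, rank):
    """Number of types ranked below a given rank."""
    return n_types - rank


# Toy frequencies (one entry per word-type), not data from the study
freqs = [50, 30, 10, 5, 3, 1, 1]
rank = rank_of_frequency(freqs, 10)
pct_types, pct_tokens = coverage(freqs, 2)

# Arithmetic from the text: 16,405 types in the Corpus, "about" at rank 55
below_about = types_below(16405, 55)
```

With the toy list, two of seven types carry 80 percent of the tokens, the same kind of imbalance the descending list shows for the Corpus on a much larger scale.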
Task 5.

Generate comparative and statistical analyses based on the lexical characteristics of the Corpus, the corpora, and data produced in Tasks 1 through 4.

5.1 What are the lexical characteristics of the Corpus; the Grade 8, 9, and 10 corpora; each of the seven subject area corpora across Grades 8, 9, and 10; each of the subject corpora within Grades 8, 9, and 10; and each of the thirty-seven textbook corpora, in terms of the total number of graphic characters, average number of graphic characters per token, tokens and discrete word-types?

5.2 What are the characteristics in terms of repeat-rate frequency (Yule's K) of words for the Corpus and corpora defined in Task 5.1?

Task 5.1.

The lexical characteristics of the Corpus and the various corpora are presented in TABLES IV through X. The total Corpus includes 16,405 word-types across the 469 samples developed for the study. TABLE IV illustrates the relatively large size of the Grade 9 corpus in contrast to those of Grades 8 and 10. Over 50 percent (122,953) of the tokens in the total Corpus are represented by 69 percent (11,401) of the Corpus word-types in the nineteen textbooks used in Grade 9. The Grade 8 (52,867 tokens) and Grade 10 (59,343 tokens) corpora are approximately the same size in terms of both word-types and tokens.
TABLE IV

NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR GRADE LEVELS AND THE CORPUS

Grade      Types     Tokens    Characters   Average
8          7,027     52,867       234,527      4.44
9         11,401    122,953       554,488      4.51
10         7,736     59,343       273,654      4.61
Corpus    16,405    235,107     1,062,411      4.52

The lexical characteristics of the subject areas of the Corpus across the three grade levels, outlined in TABLE V, indicate that Home Economics (49,257 tokens) is the largest subject corpus and Mathematics (17,808 tokens) the smallest. English, which is the second largest corpus (40,300 tokens), has by far the most word-types (7,079), indicating a much greater variety of vocabulary used throughout the eight textbooks in this subject as compared to the other content areas in the junior secondary grades.

TABLE V

NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS ACROSS GRADE LEVELS

Subject            Types     Tokens    Characters   Average
Commerce           3,020     20,155        90,171      4.47
English            7,079     40,300       178,192      4.42
Home Economics     5,529     49,257       221,576      4.50
Industrial Ed.     4,060     31,300       141,176      4.51
Mathematics        1,952     17,808        73,852      4.15
Science            4,833     37,787       173,023      4.58
Social Studies     6,211     38,608       184,727      4.78
Corpus            16,405    235,107     1,062,411      4.52

TABLE VI gives the lexical characteristics of the six subject areas (Commerce is not offered) within Grade 8. Home Economics and Social Studies are the two largest corpora with over 11,000 tokens each, although English has a greater number of word-types (2,388) than Home Economics (2,169). Social Studies, although ranking second in total tokens, has the greatest number of word-types for Grade 8 (2,890 types).
TABLE VI

NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF GRADE 8

Subject            Types     Tokens    Characters   Average
Commerce               -          -             -         -
English            2,388      8,605        37,901      4.40
Home Economics     2,169     11,425        50,472      4.42
Industrial Ed.     1,305      4,624        20,981      4.54
Mathematics        1,164      7,073        30,201      4.27
Science            1,975      9,907        43,363      4.38
Social Studies     2,890     11,205        51,480      4.59
Grade 8            7,027     52,867       234,527      4.44

In Grade 9 (TABLE VII), Home Economics is the largest corpus (37,812 tokens), followed by Industrial Education (26,656 tokens) and English (23,123 tokens). English again has the largest number of word-types. Only one Mathematics text was used (the algebra text was excluded), resulting in a relatively small number of samples of running prose (3,616 tokens). Grade 9 is the largest of the three grade level corpora, with a total of nineteen textbooks included.

TABLE VII

NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF GRADE 9

Subject            Types     Tokens    Characters   Average
Commerce           2,208     12,485        55,653      4.46
English            4,920     23,123       103,490      4.48
Home Economics     4,894     37,812       171,040      4.52
Industrial Ed.     3,688     26,656       120,125      4.51
Mathematics          910      3,616        15,460      4.28
Science            2,365     12,278        55,612      4.53
Social Studies     2,065      6,955        32,973      4.74
Grade 9           11,401    122,953       554,488      4.51

TABLE VIII lists the lexical characteristics of the five subjects in Grade 10. The largest subject corpus in Grade 10 is Social Studies (20,428 tokens) and the smallest is Mathematics (7,100 tokens). Social Studies also has the largest number of word-types (3,930).
The Grade 9 textbooks for Home Economics and Industrial Education are repeated in Grade 10 but were not used again in the study. Nine textbooks were used to obtain samples and six textbooks were excluded because they did not contain sufficiently large quantities of running prose. The six excluded texts included two Commerce books, two English books, a Mathematics (Geometry) text, and an atlas used in Social Studies.

TABLE VIII

NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF GRADE 10

Subject            Types     Tokens    Characters   Average
Commerce           1,746      7,651        34,459      4.50
English            2,489      8,553        36,743      4.30
Home Economics         -          -             -         -
Industrial Ed.         -          -             -         -
Mathematics          912      7,100        28,123      3.96
Science            3,015     15,583        73,990      4.75
Social Studies     3,930     20,428       100,210      4.91
Grade 10           7,736     59,343       273,654      4.61

A summary of all the lexical characteristics for each of the subject areas across grade levels is presented in TABLE IX. The largest selection of material at one grade level dealt with in this study was in Grade 9 Home Economics (37,812 tokens), where five textbooks were sampled. Other subject areas containing large amounts of running prose were Grade 9 Industrial Education (26,656 tokens), Grade 9 English (23,123 tokens), and Grade 10 Social Studies (20,428 tokens). The smallest quantities of running prose were located in Grade 9 Mathematics (3,616 tokens), Grade 8 Industrial Education (4,624 tokens), and Grade 9 Social Studies (6,955 tokens). Grade 10 Mathematics contained the smallest recorded 'average length of tokens in characters' (3.96) throughout the study.
The largest number of word-types occurs in Grade 9 English (4,920), where four textbooks were sampled. Grade 9 Home Economics is a close second with 4,894 types.

TABLE IX

NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF EACH GRADE LEVEL OF THE CORPUS

Subject          Grade    Types    Tokens   Characters   Average
Commerce             9    2,208    12,485       55,653      4.46
Commerce            10    1,746     7,651       34,459      4.50
English              8    2,388     8,605       37,901      4.40
English              9    4,920    23,123      103,490      4.48
English             10    2,489     8,553       36,743      4.30
Home Economics       8    2,169    11,425       50,472      4.42
Home Economics       9    4,894    37,812      171,040      4.52
Industrial Ed.       8    1,305     4,624       20,981      4.54
Industrial Ed.       9    3,688    26,656      120,125      4.51
Mathematics          8    1,164     7,073       30,201      4.27
Mathematics          9      910     3,616       15,460      4.28
Mathematics         10      912     7,100       28,123      3.96
Science              8    1,975     9,907       43,363      4.38
Science              9    2,365    12,278       55,612      4.53
Science             10    3,015    15,583       73,990      4.75
Soc. Studies         8    2,890    11,205       51,480      4.59
Soc. Studies         9    2,065     6,955       32,973      4.74
Soc. Studies        10    3,930    20,428      100,210      4.91

TABLE X lists the lexical characteristics of each of the thirty-seven textbooks used in the study. A Grade 10 Social Studies text (*3H01), A Regional Geography, contains the largest selection of running prose (14,736 tokens), while Drama IV, a Grade 10 English text (*3C02), has the smallest number of tokens (1,867). A Grade 10 Social Studies textbook has the largest number of word-types (2,913) and a Grade 10 English text has the least (822).

TABLE X

NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE THIRTY-SEVEN TEXTS

Text             Types    Tokens   Characters   Average
*1C01 (Eng)      1,187     3,500       15,601      4.46
*1C02 (Eng)      1,672     5,105       22,300      4.37
*1D01 (H.Ec)     2,169    11,425       50,472      4.42
*1E01 (I.Ed)     1,305     4,624       20,981      4.54
*1F01 (Math)     1,164     7,073       30,201      4.27
*1G01 (Sci)      1,033     4,402       18,926      4.30
*1G02 (Sci)      1,399     5,505       24,437      4.44
*1H01 (S.St)     2,177     7,728       35,235      4.56
*1H02 (S.St)     1,215     3,477       16,245      4.67
*2B01 (Comm)     1,234     5,494       24,022      4.37
*2B02 (Comm)     1,511     6,991       31,631      4.52
*2C01 (Eng)      2,436     9,646       44,736      4.64
*2C02 (Eng)      1,232     3,400       15,122      4.45
*2C03 (Eng)      1,705     5,035       21,839      4.34
*2C04 (Eng)      1,638     5,042       21,793      4.32
*2D01 (H.Ec)     1,872    10,198       46,425      4.55
*2D02 (H.Ec)     1,871    10,755       48,323      4.49
*2D03 (H.Ec)     1,685     6,928       31,352      4.53
*2D04 (H.Ec)     1,467     5,332       24,051      4.51
*2D05 (H.Ec)     1,269     4,599       20,889      4.54
*2E01 (I.Ed)     1,615     6,075       27,547      4.53
*2E02 (I.Ed)     1,638     7,792       34,579      4.44
*2E03 (I.Ed)     2,062    12,789       57,999      4.54
*2F01 (Math)       910     3,616       15,466      4.28
*2G01 (Sci)      1,516     6,748       30,618      4.54
*2G02 (Sci)      1,474     5,530       24,994      4.52
*2H01 (S.St)     1,420     4,408       20,365      4.62
*2H02 (S.St)       984     2,547       12,608      4.95
*3B01 (Comm)     1,017     3,546       15,477      4.36
*3B02 (Comm)     1,170     4,105       18,982      4.62
*3C01 (Eng)      1,946     6,686       27,972      4.18
*3C02 (Eng)        822     1,867        8,771      4.70
*3F01 (Math)       912     7,100       28,123      3.96
*3G01 (Sci)      1,955     8,592       40,616      4.73
*3G02 (Sci)      1,844     6,991       33,374      4.77
*3H01 (S.St)     2,913    14,736       70,766      4.80
*3H02 (S.St)     1,837     5,692       29,444      5.17

Task 5.2.

Repeat-rate frequency tables for the Corpus and corpora are listed in the five volumes C.V., G.V., S.V., S.G.V., and T.V. (see Task 4.2). The results for Yule's characteristic K (for words) are presented in TABLES XI through XVII. As stated earlier in Chapter III, the K value is useful as a measure of the repeat rate of words and provides an indication of the concentration of vocabulary in a passage of printed material.
A large K factor suggests a proportionately greater use of high frequency (common) words than a small value of K, which implies more reliance on low frequency (rare) words. The K factor is theoretically independent of sample size when the samples have been randomly selected from the population being used. For this reason it is possible to compare the results of K for the various corpora.

The K factors for each grade level and the Corpus are presented in TABLE XI. Grade 9 has the smallest value of K (106.547) and Grade 10 has the largest K value (112.587), although all grades were close to the K value for the Corpus (108.104).

TABLE XI

K FACTORS (WORDS) FOR EACH GRADE LEVEL AND THE CORPUS

Grade     K Factor
8          109.510
9          106.547
10         112.587
Corpus     108.104

TABLE XII presents the K factors ranked for the subject areas across grade levels. Home Economics (92.572) and English (100.517) have markedly lower values of K, implying that these subjects use a relatively greater number of low frequency (rare) words than the other subjects.

TABLE XII

SUBJECT AREAS ACROSS GRADES RANKED BY K FACTOR (WORDS)

Rank   Subject                 K Factor
1      Home Economics            92.572
2      English                  100.517
3      Corpus                   108.104
4      Commerce                 108.922
5      Mathematics              121.662
6      Science                  129.894
7      Industrial Education     129.922
8      Social Studies           130.372

The K factors for the subject areas within each of the three grade levels are presented in TABLE XIII and their ranked order is listed in TABLE XIV.
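For readers who wish to reproduce a K value from a word list, a minimal Python sketch follows. It assumes the commonly cited form of Yule's characteristic, K = 10^4 (S2 - N) / N^2, where N is the number of tokens and S2 is the sum of squared type frequencies; the study's own definition is the one given in Chapter III, and the function name `yules_k` is hypothetical.

```python
from collections import Counter


def yules_k(tokens):
    """Yule's characteristic K for a list of tokens.

    Assumes the standard form K = 10^4 * (S2 - N) / N^2, where N is the
    number of tokens and S2 is the sum of squared type frequencies.
    """
    counts = Counter(tokens)
    n = sum(counts.values())
    s2 = sum(freq * freq for freq in counts.values())
    return 1e4 * (s2 - n) / (n * n)


# Heavy repetition gives a large K; no repetition gives K = 0
k_repetitive = yules_k(["the", "the", "the", "the"])
k_varied = yules_k(["one", "two", "three", "four"])
k_mixed = yules_k(["a", "a", "b"])
```

The two extreme cases show the direction of the measure: a passage that repeats a few common words yields a large K, while a passage in which every word occurs once yields zero.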
In English, Mathematics, and Social Studies, the lowest value of K occurs in Grade 9, indicating a greater use of low frequency (rare) words than in either of the other two grades, while in Industrial Education and Science, the lowest K values are in Grade 8 and Grade 10 respectively. In Home Economics and Commerce, the lowest K values are in Grades 9 and 10 respectively. Four out of the seven subjects have their lowest K values in Grade 9, with two in Grade 10 and one in Grade 8.

TABLE XIII

K FACTORS (WORDS) FOR SUBJECTS WITHIN GRADE LEVELS

Subject             Gde 8      Gde 9     Gde 10     Corpus
Commerce                -    117.619     99.329    108.922
English           107.175     98.491    104.271    100.517
Home Economics     98.166     91.788          -     92.572
Industrial Ed.    116.973    133.630          -    129.922
Mathematics       123.568    118.672    131.571    121.662
Science           135.992    145.159    118.004    129.894
Social Studies    130.613    127.738    133.350    130.372
Corpus            109.510    106.547    112.587    108.104

TABLE XIV presents the rank of the subject areas within grades and indicates that Commerce, Home Economics, and English occupy seven of the first eight places among the eighteen positions.

TABLE XIV

SUBJECT AREAS WITHIN GRADE LEVELS RANKED BY K FACTOR (WORDS)

Rank   Subject                 Gde    K Factor
1.     Home Economics            9      91.788
2.     Home Economics            8      98.166
3.     English                   9      98.491
4.     Commerce                 10      99.329
5.     English                  10     104.271
6.     English                   8     107.175
7.     Industrial Education      8     116.973
8.     Commerce                  9     117.619
9.     Science                  10     118.004
10.    Mathematics               9     118.672
11.    Mathematics               8     123.568
12.    Social Studies            9     127.738
13.    Social Studies            8     130.613
14.    Mathematics              10     131.571
15.    Social Studies           10     133.350
16.    Industrial Education      9     133.630
17.    Science                   8     135.992
18.    Science                   9     145.159

The K factors for each individual textbook follow.
They are: ranked by subject areas (TABLE XV), listed by subjects within a grade level (TABLE XVI), and ranked independently across all subjects and grade levels (TABLE XVII).

The low K values for the textbooks in Home Economics and English are evident in TABLE XV. Only one of the Home Economics texts, Guide to Modern Meals (*2D01), has a K factor over 100, while most of the English texts have K factors approaching 100.

TABLE XV

TEXTS IN SUBJECT AREAS RANKED BY K FACTOR (WORDS)

Text     Subject                 K Factor
*3B01    Commerce                  95.532
*3B02    Commerce                 111.939
*2B01    Commerce                 114.560
*2B02    Commerce                 125.988
*2C03    English                   99.231
*2C04    English                   99.253
*1C02    English                  101.873
*3C01    English                  102.632
*2C02    English                  103.655
*2C01    English                  105.651
*1C01    English                  118.117
*3C02    English                  125.065
*2D04    Home Economics            81.857
*2D03    Home Economics            87.723
*2D05    Home Economics            92.203
*2D02    Home Economics            97.723
*1D01    Home Economics            98.166
*2D01    Home Economics           111.747
*2E01    Industrial Education     113.084
*2E02    Industrial Education     114.428
*1E01    Industrial Education     116.973
*2E03    Industrial Education     169.462
*2F01    Mathematics              118.672
*1F01    Mathematics              123.568
*3F01    Mathematics              131.571
*1G02    Science                  117.664
*3G02    Science                  117.712
*3G01    Science                  128.905
*2G02    Science                  142.048
*2G01    Science                  150.283
*1G01    Science                  167.198
*2H01    Social Studies           126.723
*1H02    Social Studies           127.347
*3H02    Social Studies           128.258
*2H02    Social Studies           134.655
*1H01    Social Studies           137.026
*3H01    Social Studies           137.962

TABLE XVI

K FACTOR (WORDS) FOR EACH TEXT BY GRADES

Text     Subject                 K Factor
*1C01    English                  118.117
*1C02    English                  101.873
*1D01    Home Economics            98.166
*1E01    Industrial Education     116.973
*1F01    Mathematics              123.568
*1G01    Science                  167.198
*1G02    Science                  117.664
*1H01    Social Studies           137.026
*1H02    Social Studies           127.347
*2B01    Commerce                 114.560
*2B02    Commerce                 125.988
*2C01    English                  105.651
*2C02    English                  103.655
*2C03    English                   99.231
*2C04    English                   99.253
*2D01    Home Economics           111.747
*2D02    Home Economics            97.723
*2D03    Home Economics            87.723
*2D04    Home Economics            81.857
*2D05    Home Economics            92.203
*2E01    Industrial Education     113.084
*2E02    Industrial Education     114.428
*2E03    Industrial Education     169.462
*2F01    Mathematics              118.672
*2G01    Science                  150.283
*2G02    Science                  142.048
*2H01    Social Studies           126.723
*2H02    Social Studies           134.655
*3B01    Commerce                  95.532
*3B02    Commerce                 111.939
*3C01    English                  102.632
*3C02    English                  125.065
*3F01    Mathematics              131.571
*3G01    Science                  128.905
*3G02    Science                  117.712
*3H01    Social Studies           137.962
*3H02    Social Studies           128.258

Within Grades 8 and 9, Home Economics has the lowest K value, while Commerce has the lowest value within Grade 10.

TABLE XVII presents the ranked order of the textbooks by K factor (words). Five of the first twelve textbooks are Home Economics texts, six are English texts, and one is a Commerce text.

TABLE XVII

K FACTOR (WORDS) FOR EACH TEXT RANKED ACROSS SUBJECTS AND GRADES

Rank   Text     Subject                 K Factor
1.     *2D04    Home Economics            81.857
2.     *2D03    Home Economics            87.723
3.     *2D05    Home Economics            92.203
4.     *3B01    Commerce                  95.532
5.     *2D02    Home Economics            97.723
6.     *1D01    Home Economics            98.166
7.     *2C03    English                   99.231
8.     *2C04    English                   99.253
9.     *1C02    English                  101.873
10.    *3C01    English                  102.632
11.    *2C02    English                  103.655
12.    *2C01    English                  105.651
13.    *2D01    Home Economics           111.747
14.    *3B02    Commerce                 111.939
15.    *2E01    Industrial Education     113.084
16.    *2E02    Industrial Education     114.428
17.    *2B01    Commerce                 114.560
18.    *1E01    Industrial Education     116.973
19.    *1G02    Science                  117.664
20.    *3G02    Science                  117.712
21.    *1C01    English                  118.117
22.    *2F01    Mathematics              118.672
23.    *1F01    Mathematics              123.568
24.    *3C02    English                  125.065
25.    *2B02    Commerce                 125.988
26.    *2H01    Social Studies           126.723
27.    *1H02    Social Studies           127.347
28.    *3H02    Social Studies           128.258
29.    *3G01    Science                  128.905
30.    *3F01    Mathematics              131.571
31.    *2H02    Social Studies           134.655
32.    *1H01    Social Studies           137.026
33.    *3H01    Social Studies           137.962
34.    *2G02    Science                  142.048
35.    *2G01    Science                  150.283
36.    *1G01    Science                  167.198
37.    *2E03    Industrial Education     169.462

Task 6.

Generate comparative and statistical analyses based on the sentence length distribution of the Corpus, the corpora, and data produced in Task 1 through Task 4.

6.1 What are the sentence-length characteristics of the Corpus; the Grade 8, 9, and 10 corpora; each of the seven subject area corpora across Grades 8, 9, and 10; each of the corpora for subjects within Grades 8, 9, and 10; and each of the thirty-seven textbook corpora in terms of the mean, median, modal sentence length in words, standard deviation, coefficient of variation, average number of sentences, and Pearson's skew factor?

6.2 Produce a set of graphs to illustrate each of the sixty-six sentence length distributions developed during the study.

6.3 What are the characteristics in terms of repeat-rate frequency of sentence lengths (Yule's K) for the Corpus and the corpora defined in 6.1 above?

Task 6.1.

The sentence length characteristics of the Corpus and the various corpora (all measured in number of words) are given in TABLES XVIII through XXIV.
Complete details of the sentence-length distribution of the Corpus and each of the sixty-six corpora are presented in the volume SENT. A sample of the contents of SENT is included in APPENDIX G.

TABLE XVIII illustrates the fairly uniform average sentence length when the samples are organized by grade levels across the Corpus. This pattern is also repeated when the samples are organized by subjects across the three grades (TABLE XIX), although the range in averages increases.

TABLE XVIII

MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR EACH GRADE LEVEL AND THE CORPUS

                                                           Average
Grade       Mean      S.D.   Variation   Median   Mode   Sentences
8         18.595    9.7745      0.5256   16.764     18       27.33
9         17.824   10.2550      0.5753   15.428     15       28.04
10        17.593    9.8504      0.5599   15.733     10       28.34
Corpus    17.927   10.0480      0.5605   15.743     15       27.96

TABLE XIX

MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR EACH SUBJECT AREA ACROSS GRADES

                                                                   Average
Subject             Mean     S.D.   Variation   Median   Mode    Sentences
Commerce          17.772    9.080       0.510   15.770     13        27.66
English           17.568   13.685       0.779   13.750      7        28.68
Home Economics    18.476    8.633       0.467   16.813     16        27.20
Industrial Ed.    16.683    8.449       0.506   14.550     11        29.78
Mathematics       15.247    8.150       0.534   13.532     14        33.37
Science           18.495    9.785       0.529   16.444     15        27.24
Social Studies    19.973    9.582       0.479   18.207     21        25.10
Corpus            17.927   10.048       0.560   15.743     15        27.96

TABLE XX presents the sentence length characteristics for each subject area within the grade levels. The smallest average sentence length is in Grade 10 Mathematics (13.894) and the largest in Grade 8 Social Studies (21.715).
TABLE XX

MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR EACH SUBJECT AREA WITHIN GRADE LEVELS OF THE CORPUS

                                                              Average
Subject   Gde     Mean     S.D.    Var.   Median   Mode     Sentences
Comm.       9   17.558    9.642   0.549   15.475     17        28.440
Comm.      10   18.087    8.016   0.443   16.300     14        26.437
Eng.        8   17.597   13.049   0.741   14.280     12        28.764
Eng.        9   18.753   14.585   0.777   14.509      8        26.234
Eng.       10   14.953   11.670   0.780   11.230      7        35.750
H.Ec.       8   19.430    8.105   0.417   18.100     16        26.727
H.Ec.       9   18.196    8.738   0.480   16.442     16        27.342
I.Ed.       8   17.511    7.677   0.438   15.530     14        29.333
I.Ed.       9   16.535    8.552   0.517   14.392     15        29.851
Math.       8   17.421    8.872   0.509   16.170     18        29.000
Math.       9   14.406    6.802   0.472   13.190      9        35.857
Math.      10   13.894    7.781   0.560   12.330     10        36.500
Sci.        8   17.081    8.631   0.505   15.170     15        29.000
Sci.        9   17.924    9.173   0.511   15.757     15        28.541
Sci.       10   20.028   10.861   0.542   17.900     18        25.096
S.St.       8   21.715    9.876   0.454   19.700     23        23.454
S.St.       9   21.204   11.137   0.525   18.444     15        25.230
S.St.      10   18.758    8.666   0.462   17.260     21        25.928

The sentence length characteristics for subjects within Grade 8, Grade 9, and Grade 10 are presented in TABLES XXI, XXII, and XXIII respectively.

TABLE XXI

MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE SUBJECT AREAS OF GRADE 8

                                                                   Average
Subject             Mean     S.D.   Variation   Median   Mode    Sentences
Commerce               -        -           -        -      -            -
English           17.597   13.049       0.741   14.280     12       28.764
Home Economics    19.430    8.105       0.417   18.100     16       26.727
Industrial Ed.    17.511    7.677       0.438   15.530     14       29.333
Mathematics       17.421    8.872       0.509   16.170     18       29.000
Science           17.081    8.631       0.505   15.170     15       29.000
Social Studies    21.715    9.876       0.454   19.700     23       23.454
Grade 8           18.595    9.774       0.525   16.764     18       27.330

TABLE XXII

MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE SUBJECT AREAS OF GRADE 9

                                                                   Average
Subject             Mean     S.D.   Variation   Median   Mode    Sentences
Commerce          17.558    9.642       0.549   15.475     17       28.440
English           18.753   14.585       0.777   14.509      8       26.234
Home Economics    18.196    8.738       0.480   16.442     16       27.342
Industrial Ed.    16.535    8.552       0.517   14.392     15       29.851
Mathematics       14.406    6.802       0.472   13.190      9       35.857
Science           17.924    9.173       0.511   15.757     15       28.541
Social Studies    21.204   11.137       0.525   18.444     15       25.230
Grade 9           17.824   10.255       0.575   15.428     15       28.040

TABLE XXIII

MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE SUBJECT AREAS OF GRADE 10

                                                                   Average
Subject             Mean     S.D.   Variation   Median   Mode    Sentences
Commerce          18.087    8.016       0.443   16.300     14       26.437
English           14.953   11.670       0.780   11.230      7       35.750
Home Economics         -        -           -        -      -            -
Industrial Ed.         -        -           -        -      -            -
Mathematics       13.894    7.781       0.560   12.330     10       36.500
Science           20.028   10.861       0.542   17.900     18       25.096
Social Studies    18.758    8.666       0.462   17.260     21       25.928
Grade 10          17.593    9.850       0.559   15.733     10       28.340

The average sentence lengths for the subject areas within grades differ considerably, with Grade 9 exhibiting the greatest range in sentence lengths. TABLE XXIV lists the sentence length characteristics for the thirty-seven textbooks.

TABLE XXIV

MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE THIRTY-SEVEN TEXTS

                                                                 Average
Text              Mean     S.D.   Variation   Median   Mode    Sentences
*1C01 (Eng)     18.717   14.639       0.782    14.60     15       26.714
*1C02 (Eng)     16.904   11.933       0.706    14.00     12       30.200
*1D01 (H.Ec)    19.430    8.105       0.417    18.10     16       26.727
*1E01 (I.Ed)    17.511    7.677       0.438    15.53     14       29.333
*1F01 (Math)    17.421    8.872       0.509    16.25     18       29.000
*1G01 (Sci)     14.673    8.125       0.553    12.87     15       33.333
*1G02 (Sci)     19.661    8.421       0.428    17.90     18       25.454
*1H01 (S.St)    21.348    9.811       0.459    19.13     17       24.133
*1H02 (S.St)    22.578   10.006       0.443    21.14     23       22.000
*2B01 (Comm)    15.652    8.223       0.525    14.13     17       31.909
*2B02 (Comm)    19.417   10.532       0.542    17.07     14       25.714
*2C01 (Eng)     17.762   12.041       0.678    14.60      8       27.145
*2C02 (Eng)     19.101   14.070       0.736    15.40     12       25.428
*2C03 (Eng)     25.429   18.775       0.738    21.20     15       19.800
*2C04 (Eng)     16.057   14.666       0.913    11.20      4       31.400
*2D01 (H.Ec)    20.114    9.761       0.485    17.90     15       24.142
*2D02 (H.Ec)    19.002    8.828       0.464    17.20     17       25.727
*2D03 (H.Ec)    17.451    8.095       0.463    16.40     17       28.357
*2D04 (H.Ec)    18.071    8.379       0.463    16.80     18       29.500
*2D05 (H.Ec)    14.693    6.565       0.446    13.40     13       34.777
*2E01 (I.Ed)    14.194    6.972       0.491    12.40     12       32.920
*2E02 (I.Ed)    16.402    9.069       0.553    18.50      9       28.562
*2E03 (I.Ed)    18.037    8.744       0.484    16.20     15       28.360
*2F01 (Math)    14.406    6.802       0.472    13.20      9       35.857
*2G01 (Sci)     16.828    8.905       0.529    14.90     15       30.846
*2G02 (Sci)     19.472    9.338       0.479    17.70     11       25.818
*2H01 (S.St)    21.822   12.143       0.556    19.40     21       25.250
*2H02 (S.St)    20.214    9.262       0.458    17.40     16       25.400
*3B01 (Comm)    17.214    7.936       0.461    15.50     10       29.428
*3B02 (Comm)    18.917    8.021       0.424    17.00     18       24.111
*3C01 (Eng)     13.701   10.242       0.747    10.10      7       40.666
*3C02 (Eng)     22.226   16.081       0.723    17.00     14       21.000
*3F01 (Math)    13.894    7.781       0.560    12.30     10       36.500
*3G01 (Sci)     18.636    9.441       0.506    17.25     14       27.117
*3G02 (Sci)     22.054   12.384       0.561    19.85     22       22.642
*3H01 (S.St)    17.522    7.517       0.429    16.20     21       28.033
*3H02 (S.St)    22.952   10.761       0.468    21.00     20       20.666

A Grade 10 English text (*3C01) has the lowest average sentence length (13.701) and a Grade 9 English text (*2C03) the highest (25.429).

One final observation should be made about the sentence length characteristics presented in TABLES XVIII through XXIV. Sentence length and variability are relatively consistent within the three grade level corpora. However, considerable range is evident in average sentence lengths across the samples when organized by grades, subjects across grades, subjects within grades, and individual textbooks. In addition to this diversity, a striking feature is the considerable variability of the sentence length patterns, as indicated by the standard deviations and coefficients of variation reported for the samples when organized by subjects across grades, subjects within grades, and textbooks. For example, in TABLE XIX, for the samples organized by subjects across grades, the standard deviations of the sentence lengths range from 8.150 for Mathematics to 13.685 for English. For the Mathematics samples, approximately 68 percent of the sentences would range from 7.097 to 23.397 words in length, with a mean of 15.247. For the English samples, approximately 68 percent of the sentences would range from 3.883 to 31.253 words in length, with a mean of 17.568. This variability exists throughout the range of samples, with the exception of grades, and is also evident in the ranges reported for the coefficient of variation and, to some degree, in the ranges reported for the average number of sentences per 500-word sample.
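
The one-standard-deviation ranges quoted above follow from simple arithmetic on the reported means and standard deviations. The following is a minimal Python sketch using the English figures from the text. The function name is illustrative only, and the 68 percent reading assumes a roughly normal distribution, which holds only approximately for skewed sentence-length data.

```python
def one_sd_interval(mean, sd):
    """Return the (lower, upper) bounds of mean +/- one standard deviation."""
    return (round(mean - sd, 3), round(mean + sd, 3))

# English samples across grades: mean 17.568 words, S.D. 13.685 words.
english_range = one_sd_interval(17.568, 13.685)
print(english_range)
```

The same function can be applied to any row of the sentence-length tables to compare how widely the subject areas spread around their means.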

The coefficient of variation expresses the standard deviation as a proportion of the mean; the lower the coefficient, the greater the degree of sentence length homogeneity in the sample. For example, for the samples organized by subjects across grades (TABLE XIX), the coefficients of variation are quite varied. The coefficient of 0.779 for English indicates that the samples in this subject area are less alike than those in Social Studies, which has a coefficient of 0.479. English has, overwhelmingly, the largest coefficient of variation among all the subject and text corpora.

The results of Pearson's skew factor for the sentence length characteristics for the Corpus and various corpora are presented in TABLES XXV through XXVII. A result of zero indicates sentence lengths approximating a normal distribution, where the mean and mode coincide. A positively skewed distribution indicates a tailing off to longer sentences, while a negatively skewed distribution indicates a tailing off to shorter sentences in relation to the mean. A normal distribution indicates a generally equivalent distribution of long and short sentences about the mean.

The areas most closely approximating a normal distribution of sentence lengths were the Corpus (0.029), Grade 8 (0.060), Grade 8 Mathematics (-0.065), Grade 9 Commerce (0.057), three Grade 8 textbook corpora (Mathematics, -0.065; Science, -0.040; Social Studies, -0.042), and a Grade 10 Science textbook with the closest figure of all (0.004).
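
The two measures discussed above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the study's own program: the coefficient of variation is taken as S.D. divided by mean, and the skew factor as Pearson's mode-based coefficient, (mean - mode) / S.D., which agrees with the statement that a result of zero means the mean and mode coincide. The function names are hypothetical. The Grade 8 figures reported in this chapter (mean 18.595, S.D. 9.774, mode 18) are used as input.

```python
def coefficient_of_variation(mean, sd):
    """Standard deviation expressed as a proportion of the mean."""
    return sd / mean

def pearson_mode_skew(mean, mode, sd):
    """Pearson's mode-based skew: zero when mean and mode coincide."""
    return (mean - mode) / sd

# Grade 8 corpus, as reported in the sentence-length tables.
cv = coefficient_of_variation(18.595, 9.774)
skew = pearson_mode_skew(18.595, 18, 9.774)
print(f"coefficient of variation = {cv:.4f}")
print(f"skew factor              = {skew:.4f}")
```

Both results fall within rounding of the 0.525 and 0.060 reported for Grade 8, which suggests these definitions match the ones used for that row.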

The corpora which had the most skewed distributions included Grade 10 (0.770), English (0.772), Grade 9 English (0.737), Grade 9 Mathematics (0.794), six Grade 9 textbooks (English, 0.811 and 0.822; Home Economics, 0.847; Industrial Education, 0.816; Mathematics, 0.794; and Science, 0.907), and a Grade 10 Commerce textbook (0.909).

TABLE XXV

PEARSON'S SKEW FACTOR FOR EACH GRADE LEVEL, THE CORPUS, AND SUBJECTS ACROSS THE CORPUS

Grade or Subject       Skew
8                      0.060
9                      0.275
10                     0.770
Corpus                 0.029
Commerce               0.525
English                0.772
Home Economics         0.286
Industrial Educ.       0.672
Mathematics            0.153
Science                0.357
Social Studies        -0.107

TABLE XXVI

PEARSON'S SKEW FACTOR FOR SUBJECTS IN EACH GRADE LEVEL

Subject              Grade 8   Grade 9   Grade 10
Commerce               -        0.057     0.509
English               0.429     0.737     0.681
Home Economics        0.423     0.251      -
Industrial Educ.      0.457     0.179      -
Mathematics          -0.065     0.794     0.500
Science               0.241     0.318     0.186
Social Studies       -0.130     0.557    -0.258
Corpus                0.060     0.275     0.770

TABLE XXVII

PEARSON'S SKEW FACTOR FOR EACH TEXT

Text     Subject                  Skew
*1C01    English                  0.254
*1C02    English                  0.411
*1D01    Home Economics           0.423
*1E01    Industrial Education     0.457
*1F01    Mathematics             -0.065
*1G01    Science                 -0.040
*1G02    Science                  0.197
*1H01    Social Studies           0.443
*1H02    Social Studies          -0.042
*2B01    Commerce                -0.164
*2B02    Commerce                 0.514
*2C01    English                  0.811
*2C02    English                  0.504
*2C03    English                  0.555
*2C04    English                  0.822
*2D01    Home Economics           0.523
*2D02    Home Economics           0.226
*2D03    Home Economics           0.557
*2D04    Home Economics           0.847
*2D05    Home Economics           0.257
*2E01    Industrial Education     0.314
*2E02    Industrial Education     0.816
*2E03    Industrial Education     0.347
*2F01    Mathematics              0.794
*2G01    Science                  0.205
*2G02    Science                  0.907
*2H01    Social Studies           0.677
*2H02    Social Studies           0.455
*3B01    Commerce                 0.909
*3B02    Commerce                 0.114
*3C01    English                  0.654
*3C02    English                  0.511
*3F01    Mathematics              0.500
*3G01    Science                  0.490
*3G02    Science                  0.004
*3H01    Social Studies          -0.462
*3H02    Social Studies           0.274

Task 6.2

To illustrate the data for sentence lengths, sixty-six graphs were produced by the U.B.C. plotting package using a CALCOMP Drum Plotter. These are presented in APPENDIX H. The narrow range of standard deviations for the sentence lengths of the Corpus and most of the corpora is indicated by the leptokurtic nature of these graphs (sentences tend to cluster around the mean length). The greater degree of variation of sentence lengths in some corpora (English, for example) is indicated by the mesokurtic plateau of their graphs. The graphs provide a good visual illustration of the relative distribution of short and long sentences in the sixty-six corpora.

Task 6.3

The repeat-rate frequency tables for the Corpus and corpora are presented in volumes C.V., G.V., S.V., S.G.V., and T.V. (see Task 5.2). The results of Yule's K (for sentences) are listed in TABLES XXVIII through XXX. High K values indicate a greater concentration of commonly occurring sentence lengths, while low values indicate a concentration of less frequently occurring sentence lengths.

The K values for each of the three grade levels, the Corpus, and the subject areas across grade levels are presented in TABLE XXVIII. Grade 8 has the smallest value (326.67), indicating a greater variety of sentence lengths used than in Grades 9 and 10.
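
The idea behind Yule's K can be sketched as follows. This is a minimal illustration, assuming the standard form K = 10^4 (sum of m^2 V_m - N) / N^2, where N is the number of items (here, sentences) and V_m is the number of distinct types (here, distinct sentence lengths) occurring exactly m times. The program actually used in the study is not reproduced in this chapter, so the function name and the toy data are assumptions.

```python
from collections import Counter

def yules_k(items):
    """Yule's characteristic K for a sequence of items (e.g. sentence lengths)."""
    type_freqs = Counter(items)              # type -> number of occurrences
    n = sum(type_freqs.values())             # N: total number of items
    spectrum = Counter(type_freqs.values())  # m -> V_m (types occurring m times)
    second_moment = sum(m * m * v_m for m, v_m in spectrum.items())
    return 1e4 * (second_moment - n) / (n * n)

# Hypothetical sentence-length samples (words per sentence).
varied = [5, 6, 7, 8]            # every length occurs once
repetitive = [10, 10, 10, 10]    # one length repeated
print(yules_k(varied), yules_k(repetitive))
```

In this toy case the varied sample gives the lower K and the repetitive sample the higher K, which is the same direction of interpretation applied to the tables that follow.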

TABLE XXVIII

K FACTORS (SENTENCES) FOR EACH GRADE LEVEL, THE CORPUS, AND SUBJECTS ACROSS THE CORPUS

Grade or Subject     K Factor
8                    326.67
9                    344.88
10                   334.55
Corpus               336.35
Commerce             364.57
English              296.64
Home Ec.             361.32
Ind. Ed.             399.64
Math.                402.07
Science              334.32
Soc. St.             333.49

The great diversity of the K factor in the various subject areas within the grade levels is shown in TABLE XXIX.

TABLE XXIX

K FACTORS (SENTENCES) FOR SUBJECTS WITHIN GRADE LEVELS

Subject       Grade 8   Grade 9   Grade 10
Commerce        -       357.26    397.36
English       279.02    287.23    360.10
Home Ec.      377.56    359.24      -
Ind. Ed.      434.17    398.78      -
Math.         385.35    465.07    427.39
Science       356.30    343.89    319.42
Soc. St.      312.18    291.12    361.56
Corpus        326.67    344.88    334.55

The K factors range from 279.02 in Grade 8 English to 465.07 in Grade 9 Mathematics. TABLE XXX presents the K factors for each textbook used in the study. These results range from 204.57 for a Grade 9 English textbook to 484.43 for a Grade 9 Industrial Education textbook.

TABLE XXX

K FACTORS (SENTENCES) FOR EACH TEXT

Text     Subject                  K Factor
*1C01    English                  244.22
*1C02    English                  297.57
*1D01    Home Economics           377.56
*1E01    Industrial Education     434.17
*1F01    Mathematics              385.35
*1G01    Science                  403.33
*1G02    Science                  348.21
*1H01    Social Studies           311.35
*1H02    Social Studies           308.65
*2B01    Commerce                 406.81
*2B02    Commerce                 324.85
*2C01    English                  310.80
*2C02    English                  275.85
*2C03    English                  204.57
*2C04    English                  337.54
*2D01    Home Economics           335.97
*2D02    Home Economics           344.99
*2D03    Home Economics           368.25
*2D04    Home Economics           351.62
*2D05    Home Economics           449.12
*2E01    Industrial Education     484.43
*2E02    Industrial Education     429.83
*2E03    Industrial Education     365.36
*2F01    Mathematics              465.07
*2G01    Science                  362.81
*2G02    Science                  342.44
*2H01    Social Studies           266.15
*2H02    Social Studies           311.16
*3B01    Commerce                 382.22
*3B02    Commerce                 409.44
*3C01    English                  400.35
*3C02    English                  226.76
*3F01    Mathematics              427.39
*3G01    Science                  348.48
*3G02    Science                  285.01
*3H01    Social Studies           385.76
*3H02    Social Studies           311.85

English, with five of the first ten textbooks ranked, has the greatest number of textbooks with a low value of K. Social Studies has four textbooks out of the first ten, and there is one Science textbook, ranked number six. Four of the first ten textbooks with low K values are in Grade 8, four are in Grade 9, and two are in Grade 10. Industrial Education, with three of the last six texts, and Mathematics, with two of the last six texts, are the two subjects with the greatest number of high K values. One Home Economics text also had a high K value.

Task 7

Generate comparative and statistical analyses of the distribution of the 100 most frequently occurring word-types of the Corpus across the three grade levels, the seven subject areas, and the subject areas within the three grade levels.

7.1 Test the following null hypotheses.

Hypothesis 1: There are no significant differences in the actual distribution of the 100 most frequent word-types of the Corpus when compared to the expected distribution of each word-type for:

1.1 the three grade levels of the Corpus,
1.2 the seven subject areas of the Corpus,
1.3 the subject areas within Grade 8,
1.4 the subject areas within Grade 9,
1.5 the subject areas within Grade 10.

7.2 Investigate and describe the number of word-types which differed significantly in their distribution across each of the areas tested in Task 7.1.

Task 7.1

In this analysis, the 100 most frequently occurring words in the total Corpus were used as the basis of comparison. The basic task was to answer the question, "Do the 100 most frequently occurring words derived from the total Corpus have similar frequencies of occurrence when the samples are organized by grade level, subjects across grades, and subjects within grades?" Acceptance of the null hypotheses would indicate that there is substantial similarity (in terms of frequency of occurrence) between the list of the 100 most frequently occurring words in the Corpus and the most frequently occurring words for the various corpora. Chi-square tests were not computed for the thirty-seven textbooks, although it would have been possible to do so. A total of 500 chi-squares were computed. Complete data for the chi-square analyses are available in APPENDIX I.

TABLE XXXI provides a summary of the chi-square results. Only two words have similar frequencies of occurrence across all corpora: "as" and "very". The other ninety-eight words exhibit considerable variation in their frequency of occurrence across the various corpora. In all, the null hypothesis was rejected in 369 out of 500 tests, with the level of significance set at .01.
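
A single one of these 500 tests can be sketched as a chi-square goodness-of-fit calculation. The sketch below is illustrative only: it assumes that the expected count of a word in each grade is proportional to that grade's share of the total tokens, which this chapter does not spell out, and the observed counts and corpus sizes are hypothetical. The critical values are the standard .01-level chi-square values for the degrees of freedom that arise in these comparisons.

```python
# Standard chi-square critical values at the .01 level, keyed by d.f.
CRITICAL_01 = {2: 9.210, 4: 13.277, 5: 15.086, 6: 16.812}

def chi_square_word(observed, corpus_sizes):
    """Goodness-of-fit test of one word's counts against corpus-size shares."""
    total_observed = sum(observed)
    total_size = sum(corpus_sizes)
    expected = [total_observed * size / total_size for size in corpus_sizes]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1

# Hypothetical counts of one word in Grades 8, 9, and 10, with
# hypothetical (equal) token totals for the three grade corpora.
stat, df = chi_square_word([120, 150, 130], [100_000, 100_000, 100_000])
significant = stat > CRITICAL_01[df]
print(f"chi-square = {stat:.2f}, d.f. = {df}, significant at .01: {significant}")
```

With these made-up counts the statistic is 3.5 on 2 degrees of freedom, below the critical value, so the null hypothesis would be retained for that word.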

TABLE XXXI

CHI-SQUARE ANALYSIS OF THE 100 MOST FREQUENT WORD-TYPES IN THE CORPUS ACROSS GRADES, SUBJECTS, AND SUBJECTS WITHIN GRADES

Rank  Word      Grades  Subjects  Subjects  Subjects  Subjects
                          (C)       (8)       (9)       (10)
  1   THE         -       **        **        **        **
  2   OF          **      **        **        **        **
  3   AND         **      **        **        **        **
  4   A           **      **        **        **        **
  5   TO          -       **        **        **        **
  6   IN          **      **        -         -         **
  7   IS          **      **        **        **        **
  8   THAT        **      **        **        **        **
  9   IT          -       **        **        -         **
 10   ARE         **      **        **        **        **
 11   FOR         **      **        **        **        **
 12   YOU         **      **        **        **        **
 13   BE          **      **        **        **        **
 14   AS          -       -         -         -         -
 15   OR          **      **        **        **        **
 16   WITH        **      **        -         -         **
 17   ON          -       **        **        **        -
 18   THIS        **      **        **        **        **
 19   BY          -       **        **        **        -
 20   WAS         **      **        **        **        **
 21   HE          **      **        **        **        **
 22   FROM        -       **        **        **        **
 23   HAVE        **      **        **        **        **
 24   AT          -       **        **        **        **
 25   WHICH       -       **        **        **        **
 26   ONE         -       **        **        -         **
 27   NOT         -       **        -         **        **
 28   CAN         **      **        **        **        **
 29   YOUR        **      **        **        **        **
 30   THEY        **      **        **        **        **
 31   WE          **      **        **        **        **
 32   HIS         -       **        **        **        **
 33   WILL        **      **        **        **        **
 34   IF          **      **        **        **        **
 35   AN          -       -         **        -         -
 36   WHEN        **      **        -         -         **
 37   ALL         -       **        -         **        **
 38   BUT         -       **        -         **        **
 39   THESE       -       **        **        **        **
 40   MAY         **      **        **        **        **
 41   THERE       -       **        **        **        **
 42   HAS         -       **        **        -         **
 43   I           **      **        **        **        **
 44   OTHER       -       **        -         -         -
 45   SOME        -       **        -         **        -
 46   MORE        -       **        -         **        -
 47   WHERE       **      **        **        -         **
 48   HAD         **      **        **        **        **
 49   THEIR       **      **        **        **        **
 50   USED        **      **        **        **        **
 51   MANY        -       **        **        **        **
 52   SO          **      **        **        **        **
 53   EACH        -       **        **        -         **
 54   TWO         -       **        **        **        **
 55   ABOUT       -       **        **        **        **
 56   SHOULD      **      **        **        **        **
 57   WHAT        -       **        **        **        **
 58   THAN        -       -         -         **        **
 59   BEEN        **      **        -         -         -
 60   INTO        -       **        **        -         -
 61   THEM        -       **        **        **        **
 62   USE         **      **        **        **        **
 63   MAKE        **      **        **        **        -
 64   DO          -       **        **        **        **
 65   UP          -       **        -         **        **
 66   SUCH        -       -         -         **        -
 67   THEN        **      **        **        -         **
 68   TIME        -       **        -         -         **
 69   ITS         **      **        **        **        **
 70   WOULD       -       **        -         **        **
 71   HOW         -       **        **        **        **
 72   NUMBER      **      **        **        **        **
 73   MADE        **      **        **        **        -
 74   OUT         -       **        -         **        **
 75   MOST        -       **        -         -         **
 76   ONLY        **      **        **        -         **
 77   NO          -       **        -         -         -
 78   MUST        **      **        -         **        **
 79   WATER       **      **        **        **        **
 80   ALSO        -       **        **        **        -
 81   FIRST       -       -         -         **        -
 82   VERY        -       -         -         -         -
 83   GOOD        **      **        **        -         **
 84   HIM         -       **        **        **        **
 85   SAME        -       **        **        **        **
 86   COULD       **      **        **        **        **
 87   WHO         -       **        **        **        **
 88   ANY         -       **        **        **        **
 89   BECAUSE     -       **        **        -         **
 90   SEE         -       **        **        -         **
 91   LIKE        -       **        -         **        **
 92   MUCH        -       **        -         **        **
 93   PEOPLE      -       **        **        -         **
 94   CALLED      -       **        **        **        **
 95   PLACE       **      **        **        **        -
 96   THROUGH     -       **        -         **        **
 97   WORK        **      **        **        **        -
 98   NEW         **      **        -         **        **
 99   SMALL       -       **        -         -         **
100   OVER        -       **        -         **        -

** SIGNIFICANT AT THE .01 LEVEL.
-  NOT SIGNIFICANT.

Task 7.2

A breakdown of the chi-square results for grades, subjects across grades, and subjects within grades is presented in TABLE XXXII. The greatest similarity in the distribution of commonly occurring words appears when the samples are organized by grades: the null hypothesis was rejected in only 46 of the 100 chi-square tests. In the case of subjects across grades, it was rejected in 94 tests. Similar results are evident for subjects within Grades 8, 9, and 10, with 72, 76, and 81 rejections respectively. These results are also shown in the pattern of rejection in TABLE XXXI.

TABLE XXXII

SUMMARY OF CHI-SQUARE ANALYSIS OF 100 MOST FREQUENT WORD-TYPES IN THE CORPUS ACROSS GRADES, SUBJECTS, AND SUBJECTS WITHIN GRADES

            Gdes in   Subj. in  Subj. in  Subj. in  Subj. in
            Corpus    Corpus    Gde 8     Gde 9     Gde 10
Sig.          46        94        72        76        81
Non Sig.      54         6        28        24        19
Total        100       100       100       100       100
d.f.           2         6         5         6         4

Task 8

Do the sentence length distributions of the three grade level corpora, the seven subject area corpora, and the eighteen subject within grade level corpora differ from the sentence length distribution of the Corpus? This task involved testing the following null hypotheses.

Hypothesis 2: There are no significant differences in the actual distribution of short, average, and long sentences when compared to the expected distribution of each of the sentence lengths for:

2.1 the three grade levels of the Corpus,
2.2 the seven subject areas of the Corpus,
2.3 the subject area corpora within Grade 8,
2.4 the subject area corpora within Grade 9,
2.5 the subject area corpora within Grade 10.

Task 8.1

The purpose of this analysis was to determine if the sentence length distributions were similar across the various corpora. It would have been unwieldy to determine the similarity in the distributions of all sentence length types across all the corpora involved. For example, 94 different sentence lengths were identified for the Corpus alone, ranging in length from one word to 117 words. The decision was made to select a number of sentence lengths representative of the Corpus as a whole and determine if these exhibited similar distributions across the various corpora.

The sentence length distribution for the total Corpus (see FIGURE 7.1, APPENDIX H) was carefully scrutinized, and five sentence lengths were selected as being representative of the Corpus on the basis of their relative frequency of occurrence. The five sentence lengths represent a group on either side of the Corpus mean (10 and 20 words in length respectively), plus three groups of longer sentences (30, 40, and 50+ words in length respectively). The sentences of 50+ words represent all the longer sentences in the positively skewed distribution for sentence lengths. This end of the curve represents the small quantities and great varieties of sentences 50 words and over in length.

The basic task was to answer the question, "Do the five sentence lengths derived from the Corpus have similar distributions across the corpora when the samples are organized by grade levels, subjects across grades, and subjects within grades?" Acceptance of the null hypothesis would indicate that there is substantial similarity between the distribution of the representative sentence lengths of the Corpus and the distribution of sentences across the various corpora. Chi-square tests were not computed for the thirty-seven textbooks, although it would have been possible to do so. Complete data for the chi-square analyses for the five sentence lengths across the various corpora are available in APPENDIX J. A total of twenty-five chi-square tests were computed. TABLE XXXIII provides a summary of the chi-square results.
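
A comparison of this kind can be sketched as a chi-square test on a table of counts, with one row per representative sentence length and one column per corpus being compared. The sketch below is an assumption about the general procedure, not the study's own program, and the counts are hypothetical. With five length classes and three grade levels the degrees of freedom are (5 - 1)(3 - 1) = 8, which matches the d.f. reported for grades.

```python
def chi_square_table(table):
    """Chi-square statistic and d.f. for a rows-by-columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts: rows are the 10, 20, 30, 40, and 50+ word classes,
# columns are Grades 8, 9, and 10.
counts = [
    [210, 260, 190],
    [180, 240, 150],
    [60, 95, 50],
    [20, 40, 18],
    [8, 22, 9],
]
stat, df = chi_square_table(counts)
print(f"chi-square = {stat:.2f}, d.f. = {df}")
```

The statistic would then be compared with the tabled critical value for the same degrees of freedom at the chosen level of significance.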

In all tests the null hypothesis was rejected at the .01 level of significance, illustrating the diversity in the sentence length distributions for the representative sentences by grades, subjects across grades, and subjects within grades. It should be pointed out that there is greater apparent similarity in the sentence length distributions for the samples organized by grades than when they are organized by subjects across grades or subjects within grades.

TABLE XXXIII

CHI-SQUARE ANALYSIS OF SELECTED SENTENCE LENGTHS FOR THE GRADES, SUBJECTS ACROSS GRADES, AND SUBJECTS WITHIN GRADES

              Grades   Subjects   Subj. in  Subj. in  Subj. in
                       in Corpus  Gde 8     Gde 9     Gde 10
X2 value      21.98    152.23     53.33     109.63    41.68
.01 level     20.09     42.98     37.57      42.98    32.00
d.f.              8        24        20         24       16

Task 9

Develop an "elimination technique" for selecting the most significant content words in a word list using the ranked frequency lists developed for the Corpus, the three grade level corpora, and the subject area corpora.

9.1 Produce a set of graphs to illustrate the word frequency by rank of the Corpus, the three grade level corpora, and the seven subject area corpora.

9.2 What is the effect of eliminating the highest frequency words and the lowest frequency words from the total spectrum of words for each of the areas stated in 9.1?

9.3 Can the residual of words remaining after eliminating the high and low frequency words described in 9.2 serve as a pool for selecting the most useful content words for the Corpus, the three grade level corpora, and the seven subject area corpora, through analyses based on relative frequency of occurrence and subjective criteria?

Task 9.1

This task was designed to determine the feasibility of developing an "elimination technique" for selecting the most significant vocabulary from a list of words derived from samples of natural language text representative of prescribed subject materials. The graphs illustrating the word frequency distributions for the Corpus, grade level, and subjects across grades corpora used in this task are presented in APPENDIX K. The graphs for subjects within grades and the thirty-seven textbooks were not plotted, although it would be possible to do so.

The graphs approximate the shape usually found in the analysis of word frequency data. The graphs for the grade level and subject corpora have the same general shape as the word frequency distribution for the Corpus. The Corpus graph (see FIGURE 5) illustrates the high frequency of the 100 most common words, the clustering of word frequencies about the mid point of the graph, the gradual tailing off to words occurring three times or less, and the final tabulation of the hapax legomena.

[Plot not reproduced. Horizontal axis: Words Ranked in Descending Order.]

FIGURE 5

WORD FREQUENCY DIAGRAM OF THE CORPUS

Task 9.2

The word frequency graph of the Corpus (FIGURE 5) is used to illustrate the effect of applying the "elimination technique" to a word list. Point A marks the cutoff for the high frequency words that account for 50 percent of the tokens in the Corpus, and Point B marks the cutoff for the low frequency words that account for 10 percent of the tokens in the Corpus. The remaining 40 percent of tokens between points A and B are considered to represent the "significant" body of content for the Corpus. The distribution for these words would most likely approximate a normal curve with a mean frequency of occurrence and proportionate tailing off from both sides of the mean.

[Plot not reproduced. Horizontal axis: Words Ranked in Descending Order.]

FIGURE 6

APPLICATION OF THE "ELIMINATION TECHNIQUE" TO THE WORD FREQUENCY DIAGRAM OF THE CORPUS

TABLE XXXIV presents the data for determining the number of word-types and the percentage of the total word-types accounted for by both the cutoff points A and B in the Corpus, the three grade level corpora, and the seven subject area corpora.

TABLE XXXIV

NUMBER AND PERCENTAGE OF WORD-TYPES ELIMINATED BY POINT A (50% CUTOFF OF TOKENS) AND POINT B (10% CUTOFF OF TOKENS)

                            Point A             Point B         Total
                        No. of    % of      No. of    % of      No. of
                        Word      Word      Word      Word      Word
                        Types     Types     Types     Types     Types
Corpus                   111      0.68     12,695     77.40    16,405
Grade 8                   94      1.33      4,593     65.36     7,027
Grade 9                  109      0.96      7,730     67.80    11,401
Grade 10                 118      1.52      4,906     63.40     7,736
Commerce                  82      2.72      1,904     63.00     3,020
English                   92      1.30      4,010     72.60     7,079
Home Economics            81      1.47      3,832     69.30     5,529
Industrial Education      90      2.22      2,433     60.00     4,060
Mathematics               53      2.72      1,298     66.50     1,952
Science                   96      1.99      2,968     61.41     4,833
Social Studies           120      1.93      3,144     50.62     6,211

The words up to Point A account for a very small number of word-types in each of the eleven distributions. The Corpus, which had 111 word-types 'eliminated', represented the smallest percentage (0.68), while Commerce, with 82 word-types 'eliminated', and Mathematics, with 53, both had 2.72 percent of their total word-types deleted. The great majority of the word-types which would be deleted from the distribution using this technique would be high frequency structure words which are similarly common to the Corpus and the ten corpora being investigated. These words, which constitute 'noise in the system', are not considered distinct enough to have special significance for the content material they represent.

The words up to Point B account for the majority of word-types in each of the eleven distributions. The proportion of word-types 'eliminated' ranged from a low of 50.62 percent (3,144 word-types) in Social Studies to a high of 77.4 percent (12,695 word-types) in the Corpus. Most of the low frequency words deleted would be items which occur only several times in a distribution. These words are considered to be too rare to have special significance for their respective content materials.

The complete listing of the word-types and tokens for each of the eleven areas is provided in the following volumes, available from the Computing Centre at the University of British Columbia.
The organization of these volumes was described previously under Task 4.

1) Corpus (Volume C.V.)
2) Grade Levels (Volume G.V.)
3) Subject Areas (Volume S.V.)

Task 9.3

The balance of the words remaining between points A and B (approximately 40 percent of tokens) in each of the corpora are considered to be neither too common nor too rare and to have the greatest significance for the content material they represent. TABLE XXXV presents the number and percentage of word-types between the A and B cutoff points for the Corpus, grades, and subjects across grades corpora. The vast majority of word-types in this section of the distributions are lexical items which occur seven times or more in the Corpus, and three times or more in most of the corpora.

TABLE XXXV

NUMBER AND PERCENTAGE OF WORD-TYPES BETWEEN POINT A AND POINT B (40% OF TOKENS) FOR THE CORPUS, GRADES, AND SUBJECTS ACROSS GRADES

                        No. of        % of          Total No. of
                        Word-Types    Word-Types    Word-Types
Corpus                   3,599         21.92         16,405
Grade 8                  2,340         33.31          7,027
Grade 9                  3,562         31.24         11,401
Grade 10                 2,712         35.08          7,736
Commerce                 1,034         34.28          3,020
English                  2,977         26.10          7,079
Home Economics           1,616         29.23          5,529
Industrial Education     1,537         37.78          4,060
Mathematics                601         30.78          1,952
Science                  1,769         36.60          4,833
Social Studies           2,947         47.45          6,211

It is interesting to note that the really significant lexical content, once the common words and the rarely occurring words are 'eliminated', consists of a relatively small number of word-types when compared to the total for the complete sample organized by Corpus, grades, and subjects across grades.
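
The "elimination technique" can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the program used in the study: words are ranked by descending frequency, the head is cut where the running token total would pass 50 percent, and the tail is cut from the bottom where the running total would pass 10 percent. How a word that straddles a cutoff is treated is an assumption made here (it stays in the middle band), and the function name and toy word list are hypothetical.

```python
def elimination_bands(freqs, head_pct=50, tail_pct=10):
    """Split a word-frequency dict into (head, middle, tail) word lists."""
    ranked = sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))
    total = sum(freqs.values())

    # Point A: take high frequency words while they cover <= head_pct of tokens.
    a, running = 0, 0
    while a < len(ranked) and (running + ranked[a][1]) * 100 <= head_pct * total:
        running += ranked[a][1]
        a += 1

    # Point B: take low frequency words from the bottom, up to tail_pct of tokens.
    b, running = len(ranked), 0
    while b > a and (running + ranked[b - 1][1]) * 100 <= tail_pct * total:
        running += ranked[b - 1][1]
        b -= 1

    head = [word for word, _ in ranked[:a]]
    middle = [word for word, _ in ranked[a:b]]
    tail = [word for word, _ in ranked[b:]]
    return head, middle, tail

# Hypothetical toy frequency list (100 tokens in total).
toy = {"the": 30, "of": 20, "water": 10, "number": 10, "energy": 8,
       "cell": 7, "valve": 5, "acid": 4, "zinc": 3, "ohm": 2, "flux": 1}
head, middle, tail = elimination_bands(toy)
print("eliminated at A:", head)
print("retained:", middle)
print("eliminated at B:", tail)
```

In the toy list the retained band holds the content-bearing words and exactly 40 of the 100 tokens, mirroring the 50/40/10 split described above.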
Further subjective evaluation of the words remaining between points A and B by subject specialists would no doubt reduce the total even further.

SUMMARY

This chapter has presented the analysis of the data and the findings for the study. Nine tasks were involved and the completion of the tasks resulted in the production of some 5,500 pages of printed material which included facsimiles of the instructional materials sampled and the sixty-six word lists plus accompanying tables, graphs, and statistical summaries.

Task 1

A representative sample of instructional materials was selected and organized. The Corpus consisted of 469 five-hundred-word samples of natural language taken from thirty-seven prescribed English language textbooks representing seven subject areas.

Task 2

The Corpus was keypunched onto IBM computer cards using the FMT computer program and stored on disk to await processing.

Task 3

Two editions of the Corpus were produced. One edition was organized by grade levels and one organized by subject areas. This enabled the production of an additional sixty-five corpora with the samples organized by the three grade levels, seven subject areas, eighteen subjects within grade levels, and thirty-seven textbooks.

Task 4

Two word frequency lists, one organized alphabetically and one organized by descending rank order, were produced for the Corpus and the sixty-five corpora.
Tables presenting the rank of word frequency figures in descending and ascending order were developed for each of the sixty-six corpora.

Task 5

Comparative and statistical analyses were generated based on the lexical characteristics of the Corpus and the sixty-five corpora. Yule's characteristic K was computed to illustrate the concentration of commonly occurring vocabulary in each of the sixty-six corpora.

The heaviest load of new reading material as measured by total tokens was introduced in Grade 9, which also has the heaviest loading of specific word-types in the three grades. Home Economics and English were the two largest subject corpora when the samples were organized across the three grade levels by subjects, with English having a much higher proportion of word-types, suggesting a greater vocabulary load. Social Studies also had a high proportion of word-types.

When the samples were organized by subjects within grades, Home Economics and Social Studies have the largest subject corpora and Social Studies and English have the largest vocabulary load for Grade 8; Home Economics and Industrial Education have the largest subject corpora and Home Economics and English have the largest vocabulary load for Grade 9; and Science and Social Studies have the largest subject corpora and also the largest vocabulary load for Grade 10.
With the samples organized by textbooks, a Grade 10 Social Studies text and a Grade 9 Industrial Education text have the largest subject corpora while the same Grade 10 Social Studies text and a Grade 9 English text have the largest vocabulary load. Thus, it is evident that there is considerable diversity in word-type and token distribution when the samples are organized by the various grade, subject and text corpora.

The application of the K characteristic to measure density of commonly occurring vocabulary again indicated that Grade 9 had the lowest K value and that Home Economics and English had the greatest variety of vocabulary used across the various corpora except for subjects within Grade 10 where Commerce and English have the lowest density of commonly occurring words. With the samples organized by textbooks, English texts consistently have the lowest K values.

Task 6

Comparative and statistical analyses were generated based on the sentence length distributions of the Corpus and the sixty-five corpora. Graphs for each of the sixty-six corpora were developed for this task. Yule's K characteristic was used to describe the concentration of commonly occurring sentence length types in the Corpus and sixty-five corpora.

Relatively uniform average sentence lengths were found when the samples were organized by grades. This pattern was also repeated when the samples were organized by the subjects across the three grades although the range in averages increased.
When the samples were organized by subjects within Grade 9, fairly uniform average sentence lengths are evident with the exception of Social Studies. However, considerable variability in the sentence length distributions is evident as indicated by the range in standard deviation, coefficient of variation and to some extent by the average sentence lengths reported per 500 word samples. With the samples organized by subjects within Grade 10, the same pattern is exhibited with the exception that Science has the largest average sentence length. With the samples organized by textbooks, a Grade 10 English text has the largest average sentence length.

It should be pointed out that with the samples organized by subjects across grades, subjects within Grades 8, 9, and 10, and by textbooks, English exhibits the greatest variation in sentence length distribution. No other subject area approaches the magnitude of the standard deviation and coefficient of variation reported for English samples.

The application of Yule's K characteristic to measure the density of commonly occurring sentence lengths indicated that Grade 8 had the lowest K value; English had the lowest K value for samples organized by subjects across grades; English had the lowest K value for samples organized by subjects within Grades 8 and 9; Science had the lowest K value for Grade 10; and two English texts had the lowest K values for the samples organized by textbooks.
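The sentence length statistics named above (average length, standard deviation, and coefficient of variation) can be illustrated with a short sketch. It is not the program used in the study: the function names are hypothetical, the sentence splitter is a deliberately crude assumption (sentences end at ".", "!" or "?"), and the population standard deviation is assumed because the study does not state which form was used.

```python
import re
from statistics import mean, pstdev


def sentence_lengths(text):
    """Return the length in words of each sentence in `text`.

    Assumption: a sentence ends at '.', '!' or '?'. Real textbook
    prose (abbreviations, decimals) would need a better splitter.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def length_summary(lengths):
    """Summarize a list of sentence lengths.

    The coefficient of variation is the standard deviation expressed
    as a percentage of the mean, which allows corpora with different
    average sentence lengths to be compared.
    """
    avg = mean(lengths)
    sd = pstdev(lengths)  # population form; an assumption
    return {"mean": avg, "sd": sd, "cv": 100.0 * sd / avg}
```

A corpus whose sentences vary widely in length, as reported for the English samples, would show a large "sd" and "cv" relative to corpora with a similar mean.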
Task 7

Chi-square tests were computed to illustrate the distribution of the 100 most commonly occurring word-types of the Corpus across the three grade level corpora, the seven subject area corpora, and the eighteen subjects within grade level corpora.

A total of 500 chi-square tests were computed for the samples organized by grade level, subjects across grades, and subjects within grades corpora to determine the nature of the distribution of the 100 most common words. The results indicated that there was significantly more variability in the use of the most commonly occurring vocabulary when the samples were organized by subjects across grades and subjects within grades than when they were organized by Grades 8, 9, and 10.

Task 8

Chi-square tests were computed to illustrate the distribution of five selected sentence lengths of the Corpus across the three grade level corpora, the seven subject area corpora, and the eighteen subjects within grade level corpora.

The results of the chi-square tests on the selected sentence lengths for the various corpora indicated that significant diversity exists in sentence length distribution for even the most common sentence lengths across the corpora when the samples are organized by grades, subjects across grades, and subjects within grades.
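The kind of chi-square comparison described for Tasks 7 and 8 can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, and the expected counts are assumed to be proportional to the size of each corpus in tokens, which this summary does not confirm as the procedure actually followed.

```python
def chi_square(observed, corpus_sizes):
    """Chi-square statistic for one word (or one sentence length).

    observed:      occurrences of the item in each corpus
    corpus_sizes:  total tokens (or sentences) in each corpus

    Assumption: under the null hypothesis the item is spread across
    the corpora in proportion to their sizes.
    Returns (statistic, degrees_of_freedom).
    """
    total_observed = sum(observed)
    total_size = sum(corpus_sizes)
    expected = [total_observed * size / total_size for size in corpus_sizes]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Hypothetical counts for one word in three grade level corpora.
stat, df = chi_square([40, 50, 30], [1000, 2000, 1000])
```

The statistic would then be compared with the critical chi-square value for the given degrees of freedom; a value above the critical value suggests the item is not uniformly distributed across the corpora.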
An analysis of the chi-square results within a particular test suggested that there was greater apparent similarity in the sentence length distributions organized by grade levels than by subject areas or subjects within grades.

Task 9

An "elimination technique" was described for use in selecting the most "significant" content words in the word lists of the Corpus, the three grade level corpora, and the seven subject area corpora. A set of graphs was constructed to represent the word frequency by rank of each of the eleven areas investigated.

The use of the "elimination technique" to determine the "significant" words in a body of print material illustrated the great influence a relatively small number of highly frequent structure words have on the word frequency distribution. Once these common words had been 'eliminated', along with a number of very rare words, a core of 'significant' vocabulary is available for examination and analysis.

Computer techniques were used extensively in most aspects of the study. Over 200 specially prepared computer files and twenty computer programs were developed to generate a Corpus of 235,107 words; sixty-five smaller corpora representing grade levels, subject areas, subject areas within grades, and textbooks; tables, figures, and graphs; numerous statistical procedures used to analyze the material; and to print the final copy of the dissertation itself.
The data generated by the study were formatted using the FMT program and occupied over 5,500 pages which were organized into eight volumes. All information pertaining to the study was placed on magnetic tapes and stored in the Computing Centre and the Special Collections Division of the Library at the University of British Columbia. The 3270 Conversational Terminal (CRT) was used throughout the study to monitor the input of data, edit the material and organize the production of the Corpus, corpora, word lists and accompanying statistics.

Apart from the Corpus, which contained nearly a quarter of a million words, considerable reorganization was required to develop the other sixty-five corpora used in the study. The magnitude of the file management task was further complicated by the need to compile two word lists (one in alphabetical order and one in descending rank order) for the Corpus and for each of the sixty-five corpora, as well as to develop two tables illustrating descending and ascending order for each of the sixty-six corpora used in the study. In addition, it was necessary to program a thorough examination of the word and sentence length characteristics of the Corpus and the corpora, provide relevant graphs and tables, and test the statistical hypotheses.

The use of existing computer techniques, plus the development of new computer programs, enabled the tasks described above to be rapidly facilitated and also provided the necessary statistical information required for the study.
CHAPTER V

DISCUSSION, CONCLUSIONS, AND RECOMMENDATIONS

This chapter presents a discussion of the major findings of the study. A number of conclusions drawn from the findings are given and the relationship of these conclusions to the role of reading in the secondary school is discussed. Finally, recommendations for future research resulting from the study are presented.

The central focus of the study involved the use of computer technology to 1) develop a representative Corpus and a series of related corpora of samples of natural language text selected from English language instructional materials prescribed for use in British Columbia junior secondary grades, and 2) make a number of descriptive and comparative analyses of selected word and sentence features of the Corpus, and the grade level, subject area, and textbook corpora.

The study was organized into nine tasks which involved selecting and sampling procedures, methods of data collecting and recording, data processing and analysis, and the posing of relevant research questions to be answered and null hypotheses to be tested. A Pilot Study was conducted prior to the commencement of the investigation to validate sampling techniques and methodological procedures, and generate needed computer programs.

DISCUSSION OF MAIN FINDINGS

The detailed findings of the study are presented in Chapter IV.
They are discussed here under the headings: Sampling and Processing Procedures, Production of the Corpus, Production of Word Lists, Lexical Characteristics, Sentence Characteristics, Common Words, Selected Sentence Lengths, and Elimination Technique.

Tasks 1 and 2

The 235,107 word Corpus derived from the 469 five-hundred-word samples taken from the thirty-seven textbooks was developed and prepared for computer processing. The total sample used in the study provided an adequate data base for the various descriptive, comparative, and statistical analyses performed on the Corpus and sixty-five corpora. The FMT computer program was an ideal instrument for the processing of the natural language samples used in the investigation.

Task 3: Production of the Corpus

Two copies of the Corpus, one organized by grade levels (C S.) and the other organized by subject areas (C.S.), were produced. The organization of the two editions of the Corpus (one arranged by grade levels and the other arranged by subject areas) provided useful access to the 469 samples used in the study.

The various samples were organized into alphabetical and descending rank order word lists for the Corpus (C.V.), the three grade levels (G.V.), the seven subject areas (S.V.), the eighteen subjects within grades (S.G.V.), and the thirty-seven textbooks (T.V.). Statistical tables for each of the sixty-six corpora described above were also developed.
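The two kinds of word list described above (alphabetical and descending rank order) can be illustrated with a minimal sketch. It is not the study's own program: the function name is hypothetical, and the tokenizing rule (lower-case runs of letters and apostrophes) is an assumption made only for the example.

```python
import re
from collections import Counter


def word_lists(text):
    """Build alphabetical and rank-ordered frequency lists for `text`.

    Assumption: a token is a run of letters or apostrophes, compared
    without regard to case.
    Returns (token_count, type_count, alphabetical, by_rank).
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    alphabetical = sorted(counts.items())
    # Descending frequency; ties are broken alphabetically.
    by_rank = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return len(tokens), len(counts), alphabetical, by_rank


n_tokens, n_types, alpha, ranked = word_lists("The cat saw the dog. The dog ran.")
```

Here `n_tokens` counts running words and `n_types` counts distinct word-types, the two totals reported for each corpus in the study.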
The development of 132 word frequency lists in both alphabetical and descending rank order along with statistical summary tables facilitated the rapid location of specific word-types and tokens throughout the Corpus and the sixty-five corpora.

The processing of the total Corpus of 235,107 tokens resulted in the identification of 16,405 specific word-types. These results are proportionately similar to the type and token distributions found in other recent research based on computer generated corpora of various sizes including: Kucera and Francis (1967) 50,406 types, 1,014,232 tokens; Carroll, Davies, and Richman (1971) 86,741 types, 5,088,721 tokens; and Harris and Jacobson (1972) 80,000 types, 4,500,000 tokens.

A pattern was evident in type and token characteristics with the samples organized by grade levels. The corpus of material for Grade 9 was twice the size of that for both Grade 8 and Grade 10 in terms of tokens and 50 percent greater in size in terms of word-types. Nineteen textbooks were used in Grade 9 and nine in Grade 10. This suggests that the middle year in the junior secondary grades exhibits potentially heavier reading demands than either Grade 8 or Grade 10. However, it should be noted that textbooks used in Grade 9 English, Home Economics and Industrial Education can be repeated in Grade 10 and the reading load for this grade depends on the specific use made of these textbooks. With this in mind, one might assume that a marked increase in the quantity of reading content and sheer vocabulary exposure occurs in Grade 9 and most likely continues into Grade 10.
Further research would have to be conducted to determine the features of the reading demands in Grades 11 and 12.

With the samples organized by subjects across grades, subjects within grades, and textbooks, no apparent pattern existed in the data except for considerable diversity in word-type and token distribution. However, application of Yule's K characteristic, which provides a statistical index of the concentration of vocabulary within print materials, resulted in clear trends based on repeat rate frequency for the various grade, subject and text corpora. Grade 9 had the least redundancy in vocabulary of the three grades; Home Economics and English had the least redundancy, to a large degree, in vocabulary in comparison to other subjects within each grade (with Commerce in Grade 9 also exhibiting a low K value); and Home Economics and English had the least redundancy in vocabulary in comparison to other textbooks (with the exception of the low K value for Commerce texts). Considering that K is a measure of the degree to which the token distributions tend to have different words, English and Home Economics clearly make proportionately greater vocabulary demands. The token distributions for all subject word lists also display considerable variation.

These results provide striking evidence for the value of using measures such as the K characteristic to supplement the usual type and token statistics computed in word frequency studies.
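For readers unfamiliar with the measure, the following sketch computes Yule's K from a list of word-type frequencies. The formula is not reproduced in this chapter, so the commonly cited form is assumed here: K = 10,000 × (Σ i²·V(i) − N) / N², where V(i) is the number of word-types occurring exactly i times and N is the total number of tokens. The function name is hypothetical.

```python
from collections import Counter


def yules_k(frequencies):
    """Yule's characteristic K for a frequency list.

    frequencies: one occurrence count per word-type (or per sentence
    length type). Assumes the commonly cited form
        K = 10^4 * (sum(i^2 * V_i) - N) / N^2
    which may differ in scaling from the form used in the study.
    """
    freqs = list(frequencies)
    n_tokens = sum(freqs)
    spectrum = Counter(freqs)  # maps i -> V_i
    second_moment = sum(i * i * v for i, v in spectrum.items())
    return 10_000 * (second_moment - n_tokens) / (n_tokens * n_tokens)


# Hypothetical list: five word-types occurring 3, 2, 1, 1 and 1 times.
k_value = yules_k([3, 2, 1, 1, 1])
```

Under this form a lower K corresponds to less repetition of the same items, which is consistent with the reading above that the corpora with the lowest K values show the least redundancy.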
Determining the specific number of word-types and tokens can provide useful data, but a statistical measure based on the relative redundancy in occurrence of those words allows for sharper differentiation of the real vocabulary demands of various subject areas.

Task 6: Sentence Characteristics

No apparent pattern in sentence length distribution was evident for the samples organized by grades. With the samples organized by subjects across grades, subjects within grades, and textbooks, considerable range of variation in sentence length was apparent. In addition, English overwhelmingly exhibited the largest standard deviations and coefficients of variation in sentence length statistics.

Application of the K characteristic indicated that Grade 8 had the least redundancy in repeat rate of sentence lengths; English had the least redundancy with samples organized by subjects across grades and with the samples organized by subjects within Grades 8 and 9; Science had the least redundancy for Grade 10; and English had the least redundancy for samples organized by textbooks. English again, as in the case of vocabulary, makes exceptional demands in terms of sentence length variety.
Although English is focused on here because of its rather significant demands in terms of lack of vocabulary redundancy and minimal sentence length repetition, it should be pointed out that with the data available from this study, it is possible to easily develop useful descriptive statements on vocabulary redundancy and sentence length repeat rate for a great variety of configurations for the samples organized by grades, subjects and textbooks.

Task 7: Common Words

The type and percentage of "common words" found to be most frequently represented in the samples of this study are relatively similar to the results obtained in other word count studies. (See, for example, the three corpora referred to previously.)

The results for the chi-square analyses for samples organized by grades, subjects across grades, and subjects within grades, provide statistical evidence for the assumption that little uniformity exists in the distribution of even the most commonly occurring word-types used in writing. There tended to be a greater uniformity with the samples organized by gross grade groupings than when they were organized by subjects. The style and content characteristics of the separate subject areas are thus significantly instrumental in affecting the frequency of occurrence of even the most common words found in English.
Task 8: Selected Sentence Lengths

The results of the chi-square analysis for samples organized by grades, subjects across grades, and subjects within grades provide statistical evidence for the assumption that little uniformity exists in the distribution of representative sentence lengths. In no subject or grade groupings did the samples follow a homogeneous pattern. There tends to be more uniformity with the samples organized by gross grade groupings than by subjects across and within grades. These results parallel those found for the common word analysis and again suggest that the style and content characteristics of the separate subject areas are also significantly instrumental in affecting the frequency of occurrence of representative sentence lengths.

An elimination technique was developed, with cutoff points suggested at 50 percent of the high frequency tokens and 10 percent of the low frequency tokens, using the Corpus word list as a model. This analysis also revealed that the total Corpus and the separate grade and subject corpora contain a large number of rare word-types even though the large majority of running words are fairly common words. Full comprehension of print sources would involve knowledge of all word-types. However, this is seldom possible and the elimination technique is useful in ascertaining the most significant vocabulary for instructional purposes.
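The cutoff procedure just described can be sketched in a few lines. The sketch is an interpretation, not the study's program: the function name and the toy word list are hypothetical, and the handling of a word that straddles a cutoff (it is retained rather than eliminated) is an assumption, since the treatment of boundary cases is not stated here.

```python
def elimination_cutoffs(ranked, high_pct=50, low_pct=10):
    """Split a rank-ordered word list at Point A and Point B.

    ranked: list of (word, count) pairs in descending count order.
    Words are eliminated from the top while they account for at most
    `high_pct` percent of tokens (Point A), and from the bottom while
    they account for at most `low_pct` percent of tokens (Point B).
    Returns (high_frequency, retained, low_frequency).
    """
    total = sum(count for _, count in ranked)

    running, a = 0, 0
    while a < len(ranked) and (running + ranked[a][1]) * 100 <= high_pct * total:
        running += ranked[a][1]
        a += 1

    running, b = 0, len(ranked)
    while b > a and (running + ranked[b - 1][1]) * 100 <= low_pct * total:
        running += ranked[b - 1][1]
        b -= 1

    return ranked[:a], ranked[a:b], ranked[b:]


# Hypothetical ten-word list totalling 100 tokens.
toy = [("the", 30), ("of", 20), ("cell", 15), ("atom", 10), ("gene", 10),
       ("ion", 6), ("zinc", 4), ("flux", 2), ("quark", 2), ("muon", 1)]
common, significant, rare = elimination_cutoffs(toy)
```

In the toy list the two structure words fall above Point A, the four least frequent words fall below Point B, and the four words retained between the cutoffs carry roughly 40 percent of the tokens, mirroring the proportions reported for the corpora.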
The elimination technique (based on the elimination of highly frequent and relatively rare words) could be useful in determining the most significant content in a word list when coupled with the application of judgment by subject specialists. Word frequency can be justified as a measure of word significance. Certain words are normally repeated as an author develops a topic. When the most significant of the words are separated from words that serve to tie writing together, vocabulary lists with high content significance result. This is particularly true in expository writing where there is little probability that a given word is used to reflect more than one idea.

CONCLUSIONS

In Chapter I the question was asked, "What are the linguistic characteristics of the print sources prescribed for use in Canadian secondary schools?" This study provides partial answers to that question for materials prescribed for the junior secondary grades.

The major answer to the question can be stated, "Print sources exhibit extremely diverse characteristics when examined in relation to: quantity of material prescribed, vocabulary redundancy, sentence characteristics, distribution of common words, and the distribution of representative sentence types." In fact, little congruity of pattern exists across the samples of the study when the results are organized to reflect the print sources prescribed for grades, subjects across grades, subjects within grades, and samples by textbooks within the subjects themselves. The variability is marked even in looking at data based on straightforward lexical variables such as word frequency and sentence length.

In all cases, organization of the samples into gross grade patterns masked the subject differences so obvious when the print sources were organized into various combinations representing across and within subject groupings. It would thus be more precise to speak of reading demands in the junior secondary years in terms of subjects across the three grades, subjects within the three grades, or by separate text, rather than by gross grade level alone.

The separate materials in each subject area make unique reading demands as print sources when compared within subjects or to other subject areas within or across grades. Uniformity is lacking in the distribution of even the most common words comprising 50 percent of running prose. The same holds true for the distribution of a representative set of sentence lengths.

While there is considerable variation in the vocabulary and sentence style demands in all subjects, the very unique demands of the English genre (and to some extent Home Economics and Commerce) must be pointed out.
No other subject area consistently exhibits such variability in vocabulary redundancy, sentence length characteristics, and sentence length homogeny. English materials tend to have a greater concentration of relatively uncommon words over a great variety of sentence lengths. It is assumed here that variability is related to reading difficulty and that widely fluctuating patterns of repeat rate frequency of words and diverse sentence length characteristics are more difficult for the reader to cope with than materials exhibiting a more even distribution of these characteristics.

The results of this study are based on samplings from one print source, prescribed nk" issue texts, and the characteristics of only two relatively straightforward lexical features, word frequency and sentence length, are examined. The variability in the results would possibly be even more pronounced if total samples had been analyzed and if samples from all types of print sources (including supplementary, reference and recreational reading) had been included. In addition, if probes were made and statistics developed on a broader array of syntactic and semantic variables related to grammatical functioning, syntax, and logical relationships, a greater variability would be expected.
In conclusion, in describing the reading demands of print materials prescribed for use in junior secondary grades, the variability of the word and sentence characteristics within each subject area is the most obvious factor to be considered. This suggests that realistic reading instruction for secondary schools must focus on the subject areas and the specific print materials used as tools in presenting the ideas and concepts in those subject areas. Such instruction may best be viewed as a shared responsibility between the subject teacher and the reading specialist rather than the sole province of the reading specialist. The subject specialist brings unique knowledge and insight of the discipline and its print sources to the team, while the reading specialist contributes knowledge of the underlying processes and skills of the reading act and familiarity with the characteristics of print in general which contribute to problems in comprehending instructional materials.

RECOMMENDATIONS

This study suggests a number of practical recommendations for the immediate implementation of the main findings and also avenues for future research.

1. The word frequency lists produced for the Corpus and the sixty-five corpora provide subject teachers and coordinators, reading specialists, and school administrators at the junior secondary level with a valuable source of language data representing each grade level, subject area, subject area within a grade, and individual textbook. The word lists should be examined and their relevance to instruction in regular classroom settings, adult basic education, and classes for New Canadians determined.

2. A number of correlational analyses could be made with the word lists from the present study and word lists previously developed by Lorge-Thorndike, Kucera-Francis, and Carroll et al. This comparison could identify basic differences between data bases compiled from print sources in two different countries and aid in determining the basic differences between Canadian and American English.

3. The Corpus of representative samples generated in the study could provide a useful data base for research in a number of areas. The samples could be used in readability research. For example, it would be relatively easy to generate mutilated samples for Cloze research by developing computer programs to modify the samples and delete every "nth" word. Research could be undertaken to determine the effect of differing sample length and number of samples in the application of existing readability measures.
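The Cloze mutilation described in recommendation 3, deleting every "nth" word from a sample, can be sketched in a few lines. The following is a minimal illustration in Python, which is not the language used in the study. The function name `make_cloze`, the blank marker, and the decision to keep punctuation attached to a deleted word are assumptions made for this example only.

```python
import re

# Split a whitespace-delimited token into leading punctuation,
# the word itself, and trailing punctuation.
_TOKEN = re.compile(r"^(\W*)(.*?)(\W*)$")


def make_cloze(text, n=5, blank="_____"):
    """Replace every n-th word of `text` with `blank`.

    Returns the mutilated text and the list of deleted words (the answer key).
    Tokens made up only of punctuation are not counted as words.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    out, deleted, word_count = [], [], 0
    for token in text.split():
        lead, core, trail = _TOKEN.match(token).groups()
        if core:
            word_count += 1
            if word_count % n == 0:
                deleted.append(core)
                token = lead + blank + trail  # keep surrounding punctuation
        out.append(token)
    return " ".join(out), deleted


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog."
    mutilated, key = make_cloze(sample, n=3)
    print(mutilated)
    print(key)
```

Varying `n` and the length of the input sample would allow the kind of comparison suggested above, in which the effect of sample length and deletion rate on a readability measure is examined.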
A useful project would be the development of a computer program for syllable counts for application in readability research. The samples themselves could also be further analyzed using techniques and measures from studies in transformational grammar and other linguistic algorithms. Such studies could provide further insight into the role the structure of print materials plays in the processing of written language.

4. A thorough analysis of the readability of the various textbooks used in the study could be readily undertaken. The 469 samples of approximately 500 words each in length have been carefully selected and described. The data could be added to and updated as new adoptions are made or as the curriculum is revised in subsequent years.

5. An area of research requiring continued attention concerns the different patterns of language in the subject areas. There is a need to further identify what Bormuth (1969) referred to as 'the manipulable linguistic variables which bear a causal relationship to the difficulty of the instructional materials being used'. With this information it would be possible to develop teaching strategies to help students cope with the reading demands presented in their instructional materials.

6. Further analysis should be made into the linguistic characteristics of textbooks within a subject area to determine the specific reading difficulties inherent in certain types of written expression.
For example, a textbook dealing with instruction in English expression may offer suggestions on improving sentence construction in one part of the book and a few pages later present a literary excerpt as an example of good writing style.

7. The use of the "elimination technique" could be refined and developed to produce a core vocabulary for each of the subject areas. These vocabulary lists would provide valuable information in the development of summative, formative, and placement evaluation in reading.

8. Computer techniques should be further developed and modified to allow for further analysis of natural language samples. In addition, a vital need exists for researchers in education to become aware of the advantages of using the computer in their work, to gain an understanding of basic computer procedures, and to communicate their needs and objectives to the computer programmers and other technicians who are available for consultation and advice.

9. The model developed in this study could be modified in a number of ways. First, the model could be enlarged to deal with a sample of other textbooks and instructional materials used in the junior secondary grades. This would provide for a wider representation of printed samples and may supply further insights into linguistic variables encountered by students in their reading.
Secondly, the model could be extended to encompass Grade 4 to Grade 12, thus enabling a thorough description and analysis to be made of the subject areas within and across the elementary, junior secondary, and senior secondary grade levels. The model could also be applied in other provinces or in studies based on samples across provinces. Finally, the model could be adapted to allow a more detailed analysis of selected linguistic features within a language sample which would provide important information for researchers, subject area teachers, and reading specialists.

BIBLIOGRAPHY

A. BOOKS AND MONOGRAPHS

Alford, M.H.T. "Computer Assistance in Language Learning and in Authorship Identification". The Computer in Literary and Linguistic Research, ed. R.A. Wisbey, London: Cambridge University Press, 1971.
Artley, A. Sterl. "The Development of Reading Maturity in High School - Implications of the Gray-Rogers Study". Improving Reading in Secondary Schools: Selected Readings, ed. Lawrence E. Hafner, New York: Macmillan Co., 1967.
_____. Trends and Practices in Secondary Reading. Newark, Delaware: International Reading Association, 1968.
_____. "Implementing a Developmental Reading Program on the Secondary Level". Teaching Reading in High School: Selected Articles, ed. Robert Karlin, Indianapolis, New York: Bobbs-Merrill Co. Inc., 1969.
Aukerman, R.C. Reading in the Secondary School Classroom. New York: McGraw-Hill Book Company, 1972.
Ballou, Stephen V. A Model for Theses and Research Papers. Boston: Houghton Mifflin Company, 1970.
Bond, Guy L., and Eva Bond.
Developing Reading in High School. New York: Macmillan Co., 1941.
Bond, Guy L., and Miles A. Tinker. Reading Difficulties: Their Diagnosis and Correction. New York: Appleton-Century-Crofts, 1967.
Bormuth, John R. Readability in 1968. A Research Bulletin, National Council of Teachers of English, 1968.
Botel, Morton. Botel Predicting Readability Levels. Chicago: Follett Publishing Co., 1962.
Buckingham, B.R., and E.W. Dolch. A Combined Word List. Boston: Ginn and Co., 1936.
Burton, Dwight L. "Heads Out of the Sand: Secondary Schools Face the Challenge of Reading". Teaching Reading in High School: Selected Articles, ed. Robert Karlin, Indianapolis, New York: Bobbs-Merrill Co. Inc., 1969.
Campbell, William R. Form and Style in Thesis Writing. Third edition. Boston: Houghton Mifflin Company, 1969.
Carroll, John B. "The Nature of the Reading Process". Theoretical Models and Processes of Reading, ed. Harry Singer, Newark, Delaware: International Reading Association, 1970.
Carroll, John B. "Development of Native Language Skills Beyond the Early Years". The Learning of Language, ed. Carroll E. Reed, New York: Appleton-Century-Crofts, 1971.
Carroll, John B., Peter Davies, and Barry Richman. The American Heritage Word Frequency Book. New York: Houghton Mifflin Company, 1971.
Chall, Jean. Readability: An Appraisal of Research and Application. Bureau of Educational Research Monographs, No. 6, Columbus, Ohio: Ohio State University, Bureau of Educational Research, 1958.
Chomsky, N. Syntactic Structures. The Hague: Mouton, 1957.
_____. Aspects of the Theory of Syntax. Cambridge, Mass.: M.I.T. Press, 1965.
Cole, Luella. The Teacher's Handbook of Technical Vocabulary. Bloomington, Illinois: Public School Publishing Company, 1940.
Coleman, E.B. "Experimental Studies of Readability", Readability in 1968, ed. John R. Bormuth, National Council of Teachers of English, 1968.
Coombs, Clyde H. A Theory of Data. New York: John Wiley & Sons, Inc., 1964.
Davis, Frederick B. "Research in Reading in High School and College", Review of Educational Research, 22 (April, 1952), 65-75.
Dechant, E.V. Improving the Teaching of Reading. Second edition. New Jersey: Prentice-Hall Inc., 1970.
Deese, J. The Structure of Associations in Language and Thought. Baltimore: Johns Hopkins Press, 1965.
Ebel, Robert L. (ed.). Encyclopedia of Educational Research. Fourth edition. London: The Macmillan Company, 1969.
Farr, R., et al. "An Examination of Reading Programs in Indiana Schools," Bulletin of the School of Education, Indiana University, 45 (1972), 5-92.
Faucett, Lawrence, and Itsu Maki. A Study of English Word-Values Statistically Determined from the Latest Extensive Word Counts. Tokyo: Matsumura Sanshoda, 1932.
Ferguson, Charles A. "Introduction", The Learning of Language, ed. Carroll E. Reed, New York: Appleton-Century-Crofts, 1971.
Fox, David J. The Research Process in Education. New York: Holt, Rinehart and Winston, Inc., 1969.
Francis, W. Nelson. Manual of Information to accompany Kucera, Henry, and W. Nelson Francis, Computational Analysis of Present-Day American English. Providence, Rhode Island: Brown University Press, 1967.
Fries, C. The Structure of English. New York: Harcourt, Brace and World, Inc., 1952.
Fries, Charles C., and A. Aileen Traver.
English Word Lists: A Study of Their Adaptability for Instruction. Washington: American Council on Education, 1940.
Gates, Arthur I. A Reading Vocabulary for the Primary Grades. New York: Teachers College, Columbia University, 1926.
_____. "Reflection and Return". Reading: A Human Right and A Human Problem, ed. Ralph C. Staiger and Oliver Andresen, Newark, Delaware: International Reading Association, 1968.
Glass, G.V., and J.C. Stanley. Statistical Methods in Education and Psychology. Englewood Cliffs, New Jersey: Prentice-Hall, Inc., 1970.
Goodman, Kenneth S., et al. Choosing Materials to Teach Reading. Detroit: Wayne State University Press, 1966.
Gray, William S. The Teaching of Reading and Writing. UNESCO, 1969.
Gray, William S., and Bernice E. Leary. What Makes a Book Readable. Chicago: University of Chicago Press.
Harris, Albert J. How to Increase Reading Ability. Fifth edition. New York: David McKay Co. Inc., 1970.
Harris, Albert J., and Milton D. Jacobson. Basic Elementary Reading Vocabularies. New York: The Macmillan Company, 1972.
Harris, Chester W. (ed.). Encyclopedia of Educational Research. Third edition. New York: Macmillan Co., 1960.
Horn, Ernest. Basic Writing Vocabulary. (Monographs in Education, First Series, No. 4), Iowa City: University of Iowa, 1926.
Huus, Helen. "Innovations in Reading Instruction: At Later Levels". The 67th Yearbook of the National Society for the Study of Education, Part II, ed. Helen M. Robinson, Chicago: The University of Chicago Press, 1968.
Jenkinson, M.D. "Information Gaps in Research in Reading Comprehension". Reading: Process and Pedagogy, ed. George B.
Schick and Merrill M. May, Milwaukee: National Reading Conference, 1970, pp. 179-192.
Jewett, Arno (ed.). Improving Reading in the Junior High School. Washington, D.C.: United States Government Printing Office, 1957.
Johnson, Marjorie Seddon. "Word Perception in the Reading Thinking Process". Paper in Reading and Thinking, Proceedings of the 22nd Annual Reading Institute at Temple University, 1965, Temple University, Philadelphia, Pa. (ED 015 094)
Jones, Lyle V., and Joseph M. Wepman. A Spoken Word Count. Chicago, Illinois: Language Research Associates, 1966.
Jongsma, Eugene. The Cloze Procedure as a Teaching Technique. Newark, Delaware: The International Reading Association, 1971.
Klare, George R. The Measurement of Readability. Ames, Iowa: Iowa State University Press, 1963.
Kucera, Henry. "Computers in Language Analysis and Lexicography". The American Heritage Dictionary of the English Language. Boston: American Heritage Publishing Company, Inc., and Houghton Mifflin Company, 1969, p. xxxviii.
Kucera, Henry, and W. Nelson Francis. Computational Analysis of Present-Day American English. Providence, Rhode Island: Brown University Press, 1967.
Luhn, H.P. "The Automatic Creation of Literature Abstracts". IBM Journal of Research and Development (April, 1958), 159-165. Reprinted in Key Papers in Information Science, ed. Arthur W. Elias, Washington, D.C.: American Society for Information Science, (1972), 87-93.
Malmquist, Eve. "Reading: A Human Right and A Human Problem". Reading: A Human Right and A Human Problem, ed. Ralph C. Staiger and Oliver Andresen, Newark, Delaware: International Reading Association, 1968.
Malsbary, Dean R. "A Study of the Terms that People Need to Understand in Order to Comprehend and Interpret the Business and Economic News Available Through the Mass Media", Studies in Education, Thesis Abstract Series, No. 4, Bloomington, Indiana: School of Education, Indiana University, 1952.
Maron, M.E. "Automatic Indexing: An Experimental Inquiry", Journal of the Association for Computing Machinery (1961), 404-417. Reprinted in Key Papers in Information Science, ed. Arthur W. Elias, Washington, D.C.: American Society for Information Science, (1972), 94-107.
McLuhan, Marshall. Understanding Media: The Extensions of Man. New York: Signet Books, 1964.
Nason, H.M. "Efficient Reading - A Way to Permanent Education". Reading: The Right to Participate. Milwaukee, Wisconsin: Twentieth Yearbook of the National Reading Conference, 1971.
Paivio, A. Imagery and Verbal Processes. New York: Holt, Rinehart and Winston, Inc., 1971.
Pei, Mario A. The World's Chief Languages. Third edition. London: George Allen & Unwin Ltd., 1949.
Pratt, Edward. "Reading as a Thinking Process". Vistas in Reading, Vol. II, Part I, ed. J. Allen Figurel, Seventh Annual Convention, International Reading Association, 1966.
Pressey, Luella C. Vocabulary Lists [illegible]. Bloomington, Illinois: Public School Publishing Co., 1924.
Rankin, E.F. "Cloze Procedure - A Survey of Research", [title illegible], eds. E.L. Thurston and L.E. Hafner, Milwaukee: National Reading Conference, 1965.
Rinsland, Henry D. A Basic Vocabulary of Elementary School Children. New York: The Macmillan Company, 1945.
Robinson, H.A. "Communications and Curriculum Change". Language, Reading, and the Communication Process, ed.
Carl Braun, Newark, Delaware: International Reading Association Conference, 1971, p. 2.
Rogers, John R. (ed.). Linguistics in Reading Instruction. Mississippi: University of Mississippi, The Reading Clinic, 1965.
Russell, D.H., and H.R. Fea. "Research on Teaching Reading". Handbook of Research on Teaching, ed. N.L. Gage, Chicago: Rand McNally, 1963.
Singer, H., and R.B. Ruddell (eds.). Theoretical Models and Processes of Reading. Newark, Delaware: International Reading Association, 1970.
Smith, Nila Banton. American Reading Instruction. Newark, Delaware: International Reading Association, 1965.
Steinberg, S.H. Five Hundred Years of Printing. Bristol: Penguin Books, 1966.
Stothers, G.E., R.W.B. Jackson, and F.W. Minkler. A Canadian Word List. Toronto: The Ryerson Press, 1947.
Strang, Ruth, C.M. McCullough, and A.E. Traxler. The Improvement of Reading. New York: McGraw-Hill Book Company, 1967.
Tatsuoka, Maurice M., and David V. Tiedeman. "Statistics as an Aspect of Scientific Method in Research on Teaching", Handbook of Research on Teaching, ed. N.L. Gage, Chicago: Rand McNally, 1963.
Taylor, Stanford E., Helen Frackenpohl, and Catherine E. White. A Revised Core Vocabulary: A Basic Vocabulary for Grades 1-8, An Advanced Vocabulary for Grades 9-13. Research and Information Bulletin No. 5 (revised), Huntington, N.Y.: Educational Developmental Laboratories, 1949, 1969 (revised).
Thorndike, Edward L. The Teacher's Word Book. New York: Teachers College, Columbia University, 1921.
Thorndike, Edward L. A Teacher's Word Book of the Twenty Thousand Words Found Most Frequently and Widely in General Reading for Children and Young People.
New York: Teachers College, Columbia University, 1931.
Thorndike, Edward L., and Irving Lorge. The Teacher's Word Book of 30,000 Words. New York: Teachers College, Columbia University, 1944.
Toffler, A. Future Shock. New York: Random House, 1970.
Traxler, Arthur E. "Development of a Vocabulary Test for High School Pupils and College Freshmen", 1962 Fall Testing Program in Independent Schools and Supplementary Studies, Educational Records Bulletin, No. 83 (February, 1963), 67-73.
Webb, Eugene J., et al. Unobtrusive Measures: Nonreactive Research in the Social Sciences. Chicago: Rand McNally & Co., 1966.
West, Michael. A Study of the Vocabulary of Children Before Entering the First Grade. Washington: The International Kindergarten Union, 1928.
Yule, G. Udny. The Statistical Study of Literary Vocabulary. Cambridge, England, 1944.

B. PERIODICALS

Ames, Wilbur S. "The Development of a Classification Scheme of Contextual Aids", Reading Research Quarterly, 2 (Fall, 1966).
Aukerman, Robert C. "Readability of Secondary School Literature Textbooks: A First Report", English Journal, 54 (September, 1965).
Barton, Johnson D. "Computer Frequency Control of Vocabulary in Language Learning Reading Materials", Instructional Science, 1 (March, 1972), 121-131.
Beier, Ernst G., John A. Starkweather, and Dan E. Miller. "Analysis of Word Frequencies in Spoken Language of Children", Language and Speech, 10 (1967), 217-227.
Betts, Emmett A. "Structure in the Reading Program", Elementary English, XL (March, 1965), 238-242.
Bickley, A.C., et al. "The Cloze Procedure: A Conspectus", Journal of Reading Behavior, 2 (Summer, 1970), 232-243.
Bloomer, R.H. "Connotative Meaning and the Reading and Spelling Difficulty of Words", The Journal of Educational Research, 55 (November, 1961), 107-112.
Bormuth, John R. "Readability: A New Approach", Reading Research Quarterly, 1 (1966), 79-131.
Bormuth, John R. "New Developments in Readability Research", Elementary English, 44 (December, 1967), 840-845.
Bortnick, R., and G.S. Lopardo. "An Instructional Application of the Cloze Procedure", Journal of Reading, 16 (January, 1973), 296-299.
Broadbent, D.E. "Word Frequency Effect and Response Bias", Psychological Review, 74 (January, 1967), 1-10.
Bruner, J.S. "Neural Mechanisms in Perception", Psychological Review, 64 (1957), 340-358.
Card, William, and Virginia McDavid. "Frequencies of Structure Words in the Writing of Children and Adults", Elementary English, 42 (December, 1965), 878-882, 894.
Chronister, G.M., and K.M. Ahrendt. "Reading Instruction in British Columbia's Secondary Schools", Journal of Reading, 11 (March, 1968), 425-427.
Coleman, E.B. "Experimental Studies of Readability". Elementary English, XLV (March, 1968), 316-323, 333.
Conway, James A., and Troy V. McKelvey. "The Role of the Relevant Literature: A Continuous Process", The Journal of Educational Research, 63 (May-June, 1970), 407-415.
Culhane, Joseph W. "Cloze Procedures and Comprehension", The Reading Teacher, 23 (February, 1970), 410-413.
Dale, Edgar, and Jeanne S. Chall. "A Formula for Predicting Readability", Educational Research Bulletin, 27 (1948), 11-20.
Dale, Edgar. "The Problem of Vocabulary in Reading", Educational Research Bulletin, 35 (1956), 113-123.
_____. "Vocabulary Measurement: Techniques and Major Findings", Elementary English, 42 (December, 1965), 895-901.
Dodds, William J. "Highlights from the History of Reading Instruction", The Reading Teacher, 21 (December, 1967), 274-280.
Dolch, Edward W. "A Basic Sight Vocabulary", Elementary School Journal, 36 (1936), 456-460.
Durr, William K. "Computer Study of High Frequency Words in Popular Trade Juveniles", The Reading Teacher, 27 (October, 1973), 37-42.
Fry, Edward. "A Readability Formula That Saves Time", Journal of Reading, 11 (April, 1968), 513-516, 575-578.
Glazer, Susan Mandel. "Is Sentence Length a Valid Measure of Difficulty in Readability Formulas?", The Reading Teacher, 27 (February, 1974), 464-468.
Howes, Davis. "A Word Count of Spoken English", Journal of Verbal Learning and Verbal Behavior, 5 (1966), 572-604.
Johnson, Dale D. "The Dolch List Re-examined", The Reading Teacher, 24 (February, 1971), 449-457.
Johnson, D. Barton. "Computer Frequency Control of Vocabulary in Language Learning Reading Materials", Instructional Science, 1 (March, 1972), 121-131.
Klare, George R. "Comments on Bormuth's Readability: A New Approach", Reading Research Quarterly, 4 (1966), 119-125.
Klare, George R. "The Role of Word Frequency in Readability Research", Elementary English, 45 (January, 1968), 12-22.
Kyte, George C. "A Core Vocabulary in the Language Arts", Phi Delta Kappan, 34 (March, 1953), 231-34.
Lively, Bertha, and S.L. Pressey. "A Method of Measuring the Vocabulary Burden of Textbooks", Educational Administration and Supervision, 9 (October, 1923), 389-398.
Lorge, Irving. "Word Lists as Background for Communication", Teachers College Record, 45 (May, 1944), 543-52.
Louthan, Vincent. "Some Systematic Grammatical Deletions and Their Effects on Reading Comprehension", English Journal, 54 (April, 1965), 295-299.
MacGinitie, Walter H., and Richard Tretiak. "Sentence Depth Measures as Predictors of Reading Difficulty", Reading Research Quarterly, VI (Spring, 1971), 364-377.
McLaughlin, Harry G. "SMOG Grading - a New Readability Formula", Journal of Reading, 12 (May, 1969), 639-646.
Nyman, Patricia, et al. "An Attempt to Shorten the Word List with the Dale-Chall Readability Formula", Educational Research Bulletin, 40 (September, 1961), 150-152.
Palmer, William S. "Readability, Rhetoric, and the Reduction of Uncertainty", Journal of Reading, 19 (April, 1974), 552-558.
Patty, W.W., and W.I. Painter. "A Technique for Measuring the Vocabulary Burden of Textbooks", Journal of Educational Research, 24 (September, 1931), 127-134.
Powers, S.R. "The Vocabularies of High School Science Textbooks", Teachers College Record, 26 (January, 1925), 368-382.
Ramanauskas, Sigita. "The Responsiveness of Cloze Readability Measures to Linguistic Variables Operating Over Segments of Text Longer Than a Sentence", Reading Research Quarterly, 8 (Fall, 1972), 72-91.
Robinson, H. Alan, and Dan S. Dramer. "High School Reading - 1958", Journal of Developmental Reading, 3 (Winter, 1960), 94-105. (See successive issues for summaries related to 1961 through 1966).
Robinson, Helen M., Samuel Weintraub, and Helen K. Smith. "Summary of Investigations Relating to Reading, July 1, 1966 to June 30, 1967", Reading Research Quarterly, 3 (Winter, 1967), 151-301. (See successive Winter issues for summaries related to 1968 through 1973).
Ross, Ramon Royal. "Frannie and Frank and the Flannelboard", The Reading Teacher, 27 (October, 1973), 43-47.
Siliakus, H.J. "Computer-Aided Word Research", Babel, 3 (July, 1967), 19-21.
Smith, John M., and Maxwell E. McCombs. "Research in Brief: The Graphics of Prose", Visible Language, V (Autumn, 1971), 365-369.
Smith, Nila Banton. "What Have We Accomplished in Reading? A Review of the Past Fifty Years". Elementary English, 38 (March, 1961), 141-150.
_____. "The Many Faces of Reading Comprehension". The Reading Teacher, 23 (December, 1969).
Spache, George D. "A New Readability Formula for Primary-Grade Reading Materials", Elementary School Journal, 53 (March, 1953), 410-413.
Stauffer, R.G. "A Study of the Prefixes in the Thorndike List to Establish a List of Prefixes That Should be Taught in the Elementary School", Journal of Educational Research, 35 (1944), 453-458.
Stone, Clarence R. "Measuring Difficulty of Primary Reading Material: A Constructive Criticism of Spache's Measure", Elementary School Journal, 57 (October, 1956), 36-41.
Summers, Edward G. "Important Resource for Secondary Reading", Journal of Reading, 10 (November, 1966), 88-102.
Summers, E.G., Brother Leonard Courtney, and Peter Edwards. "Guide to Professional Textbooks and Research in Secondary Reading Instruction," The English Quarterly, 7 (Summer, 1974), 124-146.
Taylor, W. "Cloze Procedure: A New Tool for Measuring Readability", Journalism Quarterly, 30 (1953), 414-433.
Townsend, A. "Reading in the Junior Grades", The Reading Teacher, 15 (March, 1962), 369-371.
Vogel, Mabel, and W. Carleton Washbourne. "An Objective Method of Determining Grade Placement of Children's Reading Materials", Elementary School Journal, 28 (January, 1928), 373-381.
Warfel, Harry R. "A Bag With Holes", Journal of Developmental Reading, III (Autumn, 1959), 320-333.
Washbourne, Carleton W., and Mabel V. Morphett. "Grade Placement of Children's Books", Elementary School Journal, 38 (January, 1938), 355-364.
Weintraub, Samuel. "The Cloze Procedure", The Reading Teacher, 21 (March, 1968), 567, 569, 571, 607.
Zipf, George Kingsley. "The Meaning-Frequency Relationship of Words", The Journal of General Psychology, 33 (October, 1945), 251-256.

C. UNPUBLISHED MATERIALS

Aaronson, Shirley. "Vocabulary Instruction: Challenge of the 70's". Paper read at the National Reading Conference, December, 1971, Tampa, Florida. (ED 058 016)
Artley, A. Sterl. Trends and Practices in Secondary School Reading: A Companion [illegible]. Monograph. Bloomington, Indiana: Indiana University, March, 1970. (AA 000 507)
Aulls, Mark W. "Toward a Systematic Approach to How the Reader Uses Context to Determine Meaning". Paper read at the National Reading Conference, December, 1970, St. Petersburg, Florida. (ED 049 003)
Austin, Warren B. A Computer-Aided Technique for Stylistic Discrimination: The Authorship of Greene's Groatsworth of Wit. Final Report. 1969. (ED number illegible)
Bailey, Stephen D. "Recent Trends and Developments in Research Involving the Cloze Procedure".
Unpublished research paper. Faculty of Education, University of British Columbia, 1973.
Berg, P.C. "The Psychology of Reading Behavior". Paper read at the National Reading Conference, December, 1968, Los Angeles. (ED 028 050)
Berkeley, Edmund C. Research in Computer-Assisted [illegible]. Springfield, Virginia: National Technical Information Service, 1972. (ED 074 729)
Bormuth, John R. "New Data on Readability". Paper read at a meeting cosponsored by the International Reading Association and the American Educational Research Association, May, 1967, Seattle. (ED 016 586)
_____. Cloze Readability Procedure. Report Number CSEIP-OR-1, Los Angeles: University of California, 1967. (ED 010 983)
_____. "Empirical Determination of the Instructional Reading Level". Paper read at the International Reading Association Conference, April, 1968, Boston. (ED 020 084)
_____. "The Effectiveness of Current Procedures for Teaching Reading Comprehension". Paper read at the Fifty-Eighth Annual Meeting of the National Council of Teachers of English, November, 1968, Milwaukee.
_____. Development of Readability Analysis. Project No. 7-0052, U.S. Office of Education, March, 1969.
_____. "EDUPLAN Bibliography". Unpublished bibliography. University of Chicago, 1972.
Botel, M. "Ascertaining Instructional Levels". Paper read at the International Reading Association Conference, May, 1967, Seattle. (ED 014 373)
British Columbia Department of Education. Prescribed Textbooks, 1972-73, Grades I-XII. Victoria: Curriculum Development Branch, 1972.
Bruner, Jerome. "Well Begun is Half Done: Thoughts About Early Childhood". Address at the International Reading Association Annual Convention, May, 1972, Detroit, Michigan.
Carver, Ronald P. "What is Reading Comprehension and How Should it be Measured?". Paper read at the National Reading Conference, December, 1969, Atlanta, Georgia. (ED 038 243)
Carroll, John B. "Behind the Scenes in the Making of a Corpus-Based Dictionary and a Word Frequency Book". Paper read at the meeting of the National Council of Teachers of English, November, 1971, Las Vegas, Nevada. (ED 056 842)
Chall, J. "Research in Linguistics and Reading Instruction: Implications for Further Research and Practice". Paper read at the International Reading Association Conference, April, 1968, Boston, Mass. (ED 028 904)
Cooper, J.L. "The Reading Program Spans the Total Curriculum". Paper read at the International Reading Association Conference, May, 1967, Seattle. (ED 015 824)
Cronnell, Bruce. Designing a Reading Program Based on Research Findings in Orthography. 1971. (ED 057 990)
Dauzat, S.V. "Structure Word Usage in the Verbal Discourse of Two Groups of Children". Unpublished Doctor's dissertation, The University of Mississippi, 1968.
Dunn-Rankin, Peter. "Analyzing the Development of Reading Skill Using an Error-Word Preference Inventory". Paper read at the meeting of the American Educational Research Association, February, 1971, New York, N.Y. (ED 051 960)
Eagan, Sister Ruth Louise. "An Investigation into the Relationship of the Pausing Phenomena in Oral Reading and Reading Comprehension".
Unpublished Doctor's dissertation, The University of Alberta, 1973.
Fagan, William T. "An Investigation into the Relationship Between Reading Difficulty and the Number of Types of Sentence Transformations". Unpublished Doctor's dissertation, The University of Alberta, 1969.
Frost, Joe L. "Application of Structure Process Theory to the Teaching of Reading". Paper read at the National Conference of Teachers of English, March, 1970, St. Louis, Missouri. (ED 045 288)
Fuellhart, Patricia O., and David C. Weeks. Compilation and Analysis of Lexical Resou[illegible] Report. Springfield, Virginia: Clearinghouse for Federal Scientific and Technical Information, 1968. (ED 021 602)
Geyer, James R. Cloze Procedure as a [illegible] Materials. Olympia: Washington State Board for Community College Education, 1968. (ED 039 157)
Guthrie, J. T. Learnability Ve[illegible]. Baltimore, Maryland: Center for Social Organization of Schools, Johns Hopkins University, 1970. (ED 042 594)
Harris, Jessica L. A Study of the Comput[illegible] of Complex Terms Occurring in a [illegible].
Hater, M. A. "The Cloze Procedure as a Measure of the Reading Comprehensibility and Difficulty of Mathematical English." Unpublished Doctor's dissertation, Purdue University, 1969.
Hill, Walter R., and Norma G. Bartin. Secondary Reading Programs: Description and Research. ERIC/CRIER Reading Review Series, July, 1971. (ED 055 759)
Hill, Walter R., and Norma G. Bartin. Reading Programs in Secondary Schools. Newark, Delaware: International Reading Association, 1971. (ED 071 057)
Houska, J. T. "The Efficiency of the Cloze Procedure as a Readability Tool on Technical Content Material Used in Industrial Education at the High School Level." Unpublished Doctor's dissertation, University of Illinois at Urbana and Champaign, 1971.
Jacobs, H. Donald. Association Word List for the Willamette Valley. Eugene: Oregon University, September, 1967. (ED 015 845)
Jacobson, Milton D. "Reading Difficulty of Physics and Chemistry Textbooks in Use in Minnesota." Unpublished Doctor's dissertation, University of Minnesota, 1961.
Jacobson, Milton D. "Developing and Comparing Elementary School Word Lists by Computer". Paper read at the meeting of the American Educational Research Association, April, 1972, Chicago. (ED 062 102)
Jacobson, Milton D., and Mary Ann MacDougall. "Computerized Model of Program Structure and Learning Difficulty". Paper in Proceedings of the 1969 Convention of the American Psychological Association, Washington, D.C. (ED 040 006)
Jongsma, Eugene R. [illegible] Research. Indiana University: School of Education, 1970. (ED 050 893)
Kulm, G. "Measuring the Readability of Elementary Algebra Using the Cloze Technique." Paper presented at the Annual Meeting of the American Educational Research Association, February, 1971.
Lefevre, Carl A. "Language and Critical Reading: The Consummate Reader". Paper read at the National Reading Conference, December, 1969, Atlanta, Georgia. (ED 038 249)
Lerner, J. W. A Global Theory of Reading and Linguistics. Newark, Delaware: International Reading Association, February, 1968.
(ED 023 538)
Lott, Deborah, et al. [illegible] Combinations in the Visual Identification of Words. Inglewood, California: Southwest Regional Educational Laboratory, 1968. (ED 035 516)
Lynch, Mervin D., et al. "The Building Block Construct as a Possible Model for Decoding Processes". Paper read at the National Reading Conference, December, 1970, St. Petersburg, Florida. (ED 049 002)
MacGinitie, W. H., and R. Tretiak. "Measures of Sentence Complexity as Predictors of the Difficulty of Reading Materials". In Proceedings of the 77th Annual Convention of the American Psychological Association, 1969. (ED 038 254)
Miller, Allan. Programmer's Guide to the Edwards' Corpus. Vancouver, British Columbia: Computing Centre, University of British Columbia, 1974.
Olsen, H. C. "Linguistic Principles and the Selection of Materials". Paper read at the International Reading Association Conference, April, 1968, Boston, Mass. (ED 022 649)
Potter, Thomas C. A Taxonomy of Cloze Research, Part I: Readability and Reading Comprehension. Inglewood, California: Southwest Regional Educational Laboratory, 1968. (ED 035 514)
Rawson, Hildred I. "A Study of the Relationships and Development of Reading and Cognition". Unpublished Doctor's dissertation, The University of Alberta, 1969.
Rosenshine, Barak. "New Correlates of Readability and Listenability." Paper read at the International Reading Association Conference, April, 1968, Boston, Mass. (ED 024 528)
Seels, Barbara, and Dale Edgar.
Readability and Re[illegible] Annotated Bibliography. Newark, Delaware: International Reading Association, 1971. (ED 049 896)
Shima, Fred. Research on Word Association [illegible] Discourse. Inglewood, California: Southwest Regional Educational Laboratory, 1970. (ED 043 470)
Summers, Edward G. An Annotated Bibliography of Selected Research Related to Teach[illegible] School, [illegible]. University of Pittsburgh: School of Education, 1963. (ED 010 757)
________. An Annotated Bibliography [illegible] Second[illegible] 1963. University of Pittsburgh: School of Education, 1964. (ED 010 758)
________. International Reading [illegible] Reports on Secondary Reading. Bloomington, Indiana: ERIC/CRIER, 1967. (ED 013 185)
Summers, Edward G., Charles H. Davis, and Catherine F. Siffin. [illegible] Research Litera[illegible]. ERIC Document Reproduction Service, Bethesda, Md., 1968. (ED 013 970)
________. Published Research Lit[illegible]. Document Reproduction Service, Bethesda, Md., 1967. (ED 012 834)
________. Published Research L[illegible]. ERIC Document Reproduction Service, Bethesda, Md., 1968. (ED 013 969)
Vernon, Evelyn I. "Words Make For Success". Paper read at the International Reading Association, April-May, 1969, Kansas City, Missouri. (ED 034 662)
Weaver, Wendell W., and A. C. Bickley. "Structural-Lexical Predictability of Materials Which Predictor Has Previously Produced or Read". Paper in the 1967 Proceedings of the American Psychological Association, Division 15. (ED 011 812)
Whipple, Gertrude. "Practical Problems of Schoolbook Selection for Disadvantaged Pupils".
Paper read at the International Reading Association Conference, April, 1968, Boston, Mass. (ED 029 750)
Wolfe, Josephine B. "Applying Research Findings in Comprehension to Classroom Practice". Paper read at the International Reading Association Conference, May, 1967, Seattle. (ED 014 371)
Young, Carol E. Development of Language Analysis Procedures [illegible] Application to Automatic Indexing. Columbus, Ohio: Computer and Information Science Research Center, 1973, p. 69. (ED 078 843)

APPENDIX A
INDEX OF TEXTS AND SAMPLES BY GRADE LEVEL

C. ENGLISH 8 (Total of 17 Samples)

*1C01C  Text: The Craft of Writing. Don Mills, Ontario: Longmans Canada Ltd., 1965.
Author: R.J. McMaster.
Sample  Pages
01  2-3
02  25-26
03  48-49
04  71-75
05  96-97
06  119-121
07  132-136

*1C02C  Text: Short Stories of Distinction. Agincourt: The Book Society of Canada Ltd., 1960.
Author: L.H. Newell and J.W. MacDonald (eds).
Sample  Pages
01  9-10
02  32-33
03  55-56
04  78-79
05  101-102
06  124-125
07  147-148
08  170-171
09  192-194
10  215-216

D. HOME ECONOMICS 8 (Total of 22 Samples)

*1D01C  Text: Teen Guide to Hom[illegible]. Toronto: McGraw-Hill Co. of Canada Ltd., 1968.
Authors: M.S. Barclay and F. Champion.
Sample  Pages
01  6-8
02  34-36
03  52-53
04  71-72
05  85-86
06  108-111
07  124-125
08  153-154
09  168-170
10  180-181
11  218-221
12  230-235
13  247-248
14  262-265
15  278-280
16  306-308
17  334-336
18  342-345
19  366-368
20  392-395
21  406-408
22  428-430

E. INDUSTRIAL EDUCATION 8 (Total of 9 Samples)

*1E01C  Text: Exploring Industrial[illegible]. New York: McGraw-Hill Co. of Canada Ltd., 1968.
Authors: Jes Laustrup, et al.
Sample  Pages
01  3-5
02  32-38
03  51-55
04  55-106
05  106-115
06  141-144
07  161-162
08  181-183
09  196-197

F. MATHEMATICS 8 (Total of 14 Samples)

*1F01C  Text: Introduction [illegible]. Reading, Mass: Addison-Wesley Publishing Co. Inc., 1962.
Author: C.F. Brumfiel, et al.
Sample  Pages
01  14-16
02  31-35
03  52-56
04  70-74
05  101-102
06  131-133
07  136-141
08  175-179
09  188-197
10  201-207
11  226-228
12  243-244
13  259-260
14  264-268

G. SCIENCE 8 (Total of 20 Samples)

*1G01C  Text: Labtext in Science: Book 1. Toronto: The Copp Clark Publishing Co., 1968.
Authors: G.H. Cannon, et al.
Sample  Pages
01  13-16
02  25-27
03  61-62
04  78-80
05  100-101
06  121-124
07  138-140
08  163-164
09  179-180

*1G02C  Text: Reading About Science [illegible]. Holt, Rinehart & Winston of Canada, Ltd., 1968.
Authors: Clifford J. Anastasiou, et al.
Sample  Pages
01  12-13
02  43
03  57-58
04  72-73
05  90
06  122-123
07  143-144
08  157-158
09  176-177
10  200-201
11  226-227

H. SOCIAL STUDIES 8 (Total of 22 Samples)

*1H01C  Text: Man in the Tropics. Scarborough, Ontario: Bellhaven House Ltd., 1968.
Authors: Gordon E. Carswell, et al.
Sample  Pages
01  1-3
02  25-29
03  51-54
04  76-78
05  100-104
06  126-127
07  153-156
08  177-181
09  203-205
10  227-233
11  245-247
12  269-272
13  294-297
14  319-322
15  345-346

*1H02C  Text: The Shaping of Modern Europe. Toronto: The MacMillan Company of Canada Ltd., 1968.
Author: Geoffrey Williams.
Sample  Pages
01  5-6
02  28-29
03  50-51
04  76-77
05  99-100
06  117-118
07  140-141

B. COMMERCE 9 (Total of 25 Samples)

*2B01C  Text: Personal Typewriting. Toronto: W.J. Gage Ltd., 1967.
Authors: S.J. Wanous, et al.
Sample  Pages
01  preface
02  vi-vii
03  54-61
04  65-69
05  95-99
06  132-133
07  151-156
08  168
09  189-195
10  211-212
11  239-242

*2B02C  Text: The Junior Clerk. Toronto: Sir Isaac Pitman (Canada) Ltd., 1970.
Authors: C.A. Trotter and P.C. Glover.
Sample  Pages
01  1-3
02  32-35
03  60-61
04  75-77
05  95-97
06  125-126
07  133-148
08  171-174
09  187-189
10  204-213
11  238-239
12  261-263
13  279-280
14  295-298

C. ENGLISH 9 (Total of 47 Samples)

*2C01C  Text: Learning English. Toronto: The MacMillan Company of Canada Ltd., 1963.
Author: Philip G. Penner and Ruth E. McConnell.
Sample  Pages
01  1-3
02  24-26
03  49-50
04  55-57
05  70-73
06  95-101
07  123-124
08  146-158
09  181-183
10  202-208
11  230-234
12  256-263
13  284-288
14  310-314
15  337-340
16  360-363
17  384-386
18  411-412
19  435-441
20  453-455

*2C02C  Text: The Accomplished R[illegible]. Don Mills, Ontario: Bellhaven House, 1964.
Author: Maurice Gibbons and Alan Dawe.
Sample  Pages
01  1-2
02  26-27
03  50-51
04  73-74
05  96-97
06  118
07  142-143

*2C03C  Text: Prose Readings. Ontario: Longmans Canada Ltd., 1964.
Author: Jan de Bruyn (ed).
Sample  Pages
01  3-4
02  26-28
03  50-51
04  73-74
05  96-97
06  119-120
07  142-144
08  166-167
09  189-190
10  212-213

*2C04C  Text: [illegible] Book of [illegible]. Toronto: Clarke, Irwin & Co. Ltd., 1964.
Author: J.G. Bullocke (ed).
Sample  Pages
01  9-10
02  21-22
03  44-45
04  67-68
05  89-91
06  113-115
07  137-138
08  159-161
09  183-184
10  200-202

D. HOME ECONOMICS 9 (Total of 76 Samples)

*2D01C  Text: Guide to Modern Meals. Toronto: McGraw-Hill Co. of Canada Ltd., 1970.
Authors: D.E. Shank, et al.
Sample  Pages
01  2-3
02  31-32
03  61-61
04  72-73
05  97-98
06  120-121
07  137-141
08  159-163
09  186-187
10  206-207
11  223
12  246-249
13  267
14  289-291
15  306-307
16  324
17  342-344
18  365-366
19  383-384
20  417
21  426-427

*2D02C  Text: Clothes for Teens. Toronto: D.C. Heath Canada Ltd., 1970.
Authors: E. Todd and F. Roberts.
Sample  Pages
01  2-3
02  35
03  62
04  81
05  108
06  125
07  147-148
08  167-168
09  194-196
10  214-215
11  240
12  258
13  283-284
14  299
15  328-329
16  338-339
17  359-361
18  376-378
19  400-401
20  439-440
21  460-461
22  489-490

*2D03C  Text: Learning About Children. Philadelphia: J.B. Lippincott Co., 1964.
Authors: R.M. Shuey, et al.
Sample  Pages
01  18-20
02  36-39
03  63-64
04  82-83
05  95-97
06  126-128
07  146-148
08  170-171
09  193-194
10  216-219
11  237-240
12  258-259
13  279-281
14  289-290

*2D04C  Text: [illegible] From [illegible] to 6. Dept. of National Health & Welfare, Ottawa, Canada, 1967.
Author: Not given.
Sample  Pages
01  8-10
02  31-33
03  55-56
04  68-69
05  95-96
06  115-116
07  136-139
08  151-152
09  183-184
10  201-202

*2D05C  Text: So You Are Ready To Cook. Minneapolis: Burgess Publishing Co., 1964.
Author: M.A. Duffie.
Sample  Pages
01  4-6
02  37
03  59-61
04  86-87
05  103
06  127-130
07  141-143
08  162
09  182-183

E. INDUSTRIAL EDUCATION 9 (Total of 54 Samples)

*2E01C  Text: General Woodworking. Toronto: McGraw-Hill Co. of Canada Ltd., 1965.
Author: Chris. H. Groneman.
Sample  Pages
01  1-2
02  42-44
03  54-55
04  74-76
05  95-113
06  114-135
07  158-160
08  175
09  184-186
10  210-211
11  235-238
12  253-254
13  272-273

*2E02C  Text: General Metals. Toronto: McGraw-Hill Co. of Canada Ltd., 1965.
Author: John L. Feirer.
Sample  Pages
01  12-14
02  35-36
03  59-61
04  67-69
05  104-105
06  126-129
07  149-151
08  167-170
09  185-187
10  210-212
11  226-227
12  238-240
13  260-262
14  273-274
15  317-319
16  340-349

*2E03C  Text: General Power Mechanics. Toronto: McGraw-Hill Co. of Canada Ltd., 1970.
Authors: Robert M. Worthington, et al.
Sample  Pages
01  18-21
02  35-37
03  57-59
04  79-81
05  100-101
06  127-129
07  148-149
08  169-170
09  192-193
10  210-212
11  237-240
12  260-261
13  281-283
14  297-298
15  328-331
16  342-348
17  367-369
18  390-391
19  413-414
20  435
21  458-460
22  473-474
23  500-501
24  521-522
25  546-547

F. MATHEMATICS 9 (Total of 7 Samples)

*2F01C  Text: Modern General Mathematics. Don Mills, Ontario: Addison Wesley (Canada) Ltd., 1966.
Authors: R.E. Eicholz, et al.
Sample  Pages
01  1-70
02  75-106
03  108-127
04  130-143
05  146-161
06  183-214
07  227-331

G. SCIENCE 9 (Total of 24 Samples)

*2G01C  Text: [title illegible]. Scarborough, Ontario: Prentice-Hall of Canada Ltd., 1968.
Authors: W.H. Rasmussen and M.C. Schmid.
Sample  Pages
01  1-2
02  22-23
03  41-44
04  76-77
05  92-94
06  111-116
07  136-138
08  156-159
09  171-175
10  199-202
11  222-224
12  248-251
13  278

*2G02C  Text: Reading About Science [illegible]. Holt, Rinehart & Winston of Canada Ltd., 1969.
Authors: M. Forster, et al.
Sample  Pages
01  13-14
02  30
03  63-64
04  78-79
05  100-101
06  124-125
07  145-146
08  172-173
09  191-192
10  218
11  241-242

H. SOCIAL STUDIES 9 (Total of 13 Samples)

*2H01C  Text: [title illegible]. Scarborough, Ontario: Bellhaven House Ltd., 1969.
Authors: G.E. Carswell, et al.
Sample  Pages
01  1-3
02  22-27
03  61-62
04  82-85
05  95-100
06  111-114
07  132-134
08  155-156

*2H02C  Text: Our World of Change. Toronto: McGraw-Hill Company of Canada, Ltd., 1969.
Author: Hugh B. Innis.
Sample  Pages
01  13-14
02  20-21
03  49-51
04  71-72
05  104-106

A. AGRICULTURE 10 (Pilot Study: Not used in Corpus) (Total of 21 Samples)

*3A01C  Text: Farmer's Shop Book. Milwaukee: The Bruce Publishing Co., 1953.
Authors: L.M. Roehl and A.D. Longhouse.
Sample  Pages
01  14-17
02  34-36
03  62-64
04  78-81
05  109-111
06  129
07  146-147
08  161-164
09  189-190
10  216-220
11  223-275
12  231-233
13  261
14  274
15  313-316
16  328-329
17  353-355
18  375
19  390-391
20  416-417
21  433-437

B. COMMERCE 10 (Total of 16 Samples)

*3B01C  Text: New Basic Course [illegible]. Toronto: Sir Isaac Pitman (Canada) Ltd., 1964.
Author: Not given.
Sample  Pages
01  viii-ix
02  27-37
03  55-66
04  77-87
05  88-89
06  113-121
07  137-145

*3B02C  Text: Exploring Business. Toronto: McGraw-Hill Co. of Canada Ltd., 1968.
Authors: J. Frank Dame, et al.
Sample  Pages
01  5-7
02  27-29
03  54-56
04  85
05  100-102
06  114-117
07  143
08  178-179
09  186

C. ENGLISH 10 (Total of 16 Samples)

*3C01C  Text: [title illegible]. Don Mills, Ontario: J.M. Dent & Sons (Canada), 1965.
Authors: Malcolm Ross and John Stevens (eds).
Sample  Pages
01  1-3
02  25-26
03  48-49
04  71-72
05  94-95
06  117-118
07  140-141
08  163-165
09  187-188
10  210-211
11  233-234
12  251-252

*3C02C  Text: Drama IV. Toronto: The MacMillan Co. of Canada Ltd., 1965.
Author: Herman Voaden (ed).
Sample  Pages
01  2-3
02  142-143
03  226-227
04  383-384

F. MATHEMATICS 10 (Total of 14 Samples)

*3F01C  Text: Mathematics: A Modern Approach. Don Mills, Ontario: Addison Wesley (Canada) Ltd., 1966.
Authors: M.S. Wilcox and J.E. Yarnelle.
Sample  Pages
01  1-5
02  14-20
03  54-57
04  65-66
05  90-91
06  101-102
07  150-153
08  190-191
09  207-209
10  237-263
11  295-297
12  304-305
13  322-324
14  346-347

G. SCIENCE 10 (Total of 31 Samples)

*3G01C  Text: Extending Science Co[illegible]. Scarborough, Ontario: Prentice-Hall of Canada Ltd., 1970.
Author: M.C. Schmid (ed).
Sample  Pages
01  1-5
02  31-37
03  55-56
04  79-85
05  106-108
06  126-128
07  149-154
08  164-169
09  193-200
10  204-210
11  253-256
12  259-264
13  291-298
14  311-319
15  323-326
16  329-336
17  372-375

*3G02C  Text: Reading About Science 3. Toronto: Holt, Rinehart & Winston of Canada, Ltd., 1970.
Author: J. Woodrow.
Sample  Pages
01  38-42
02  59
03  75
04  100-101
05  128-129
06  139-143
07  155
08  186-188
09  203-204
10  227-228
11  234-235
12  254-255
13  277-278
14  301-302

H. SOCIAL STUDIES 10 (Total of 42 Samples)

*3H01C  Text: [title illegible]. Toronto: Gage Educational Pub. Ltd., 1970.
Authors: G.S. Tomkins, et al.
Sample  Pages
01  11
02  26-27
03  53
04  74
05  97
06  112
07  134-135
08  159
09  175
10  196
11  210-211
12  236-237
13  254
14  279
15  299
16  320
17  327-328
18  347
19  372
20  387
21  406
22  434
23  449
24  469
25  490
26  514-515
27  536
28  557-559
29  581-582
30  596

*3H02C  Text: A Nation Developing. Toronto: McGraw-Hill Company of Canada Ltd., 1970.
Author: J.A. Lower.
Sample  Pages
01  14-15
02  35-36
03  54-55
04  77-78
05  92-93
06  115-116
07  131-132
08  158-159
09  178-179
10  197-198
11  213-214
12  230-231

APPENDIX B
SAMPLE SIZES IN ALPHABETICAL ORDER AND ASCENDING RANK

SAMPLES IN ALPHABETICAL ORDER

SAMPLE    SIZE
*1C01C01  523
*1C01C02  507
*1C01C03  496
*1C01C04  455
*1C01C05  485
*1C01C06  508
*1C01C07  526
*1C02C01  470
*1C02C02  475
*1C02C03  484
*1C02C04  575
*1C02C05  515
*1C02C06  588
*1C02C07  450
*1C02C08  529
*1C02C09  526
*1C02C10  493
*1D01C01  505
*1D01C02  387
*1D01C03  491
*1D01C04  576
*1D01C05  384
*1D01C06  557
*1D01C07  618
*1D01C08  584
*1D01C09  577
*1D01C10  554
*1D01C11  573
*1D01C12  480
*1D01C13  509
*1D01C14  560
*1D01C15  427
*1D01C16  391
*1D01C17  535
*1D01C18  635
*1D01C19  436
*1D01C20  466
*1D01C21  611
*1D01C22  571
*1E01C01  464
*1E01C02  521
*1E01C03  612
*1E01C04  579
*1E01C05  556
*1E01C06  505
*1E01C07  473
*1E01C08  473
*1E01C09  441
*1F01C01  453
*1F01C02  459
*1F01C03  498
*1F01C04  427
*1F01C05  481
*1F01C06  566
*1F01C07  549
*1F01C08  552
*1F01C09  545
*1F01C10  491
*1F01C11  541
*1F01C12  522
*1F01C13  481
*1F01C14  509
*1G01C01  505
*1G01C02  515
*1G01C03  514
*1G01C04  482
*1G01C05  512
*1G01C06  453
*1G01C07  458
*1G01C08  468
*1G01C09  495
*1G02C01  513
*1G02C02  490
*1G02C03  524
*1G02C04  485
*1G02C05  445
*1G02C06  470
*1G02C07  499
*1G02C08  533
*1G02C09  495
*1G02C10  526
*1G02C11  525
*1H01C01  532
*1H01C02  522
*1H01C03  543
*1H01C04  512
*1H01C05  495
*1H01C06  562
*1H01C07  500
*1H01C08  537
*1H01C09  512
*1H01C10  494
*1H01C11  512
*1H01C12  493
*1H01C13  519
*1H01C14  499
*1H01C15  496
*1H02C01  495
*1H02C02  508
*1H02C03  479
*1H02C04  501
*1H02C05  490
*1H02C06  527
*1H02C07  477
*2B01C01  451
*2B01C02  469
*2B01C03  470
*2B01C04  361
*2B01C05  618
*2B01C06  498
*2B01C07  530
*2B01C08  458
*2B01C09  551
*2B01C10  608
*2B01C11  480
*2B02C01  571
*2B02C02  528
*2B02C03  545
*2B02C04  489
*2B02C05  448
*2B02C06  498
*2B02C07  485
*2B02C08  471
*2B02C09  463
*2B02C10  495
*2B02C11  523
*2B02C12  470
*2B02C13  490
*2B02C14  515
*2C01C01  500
*2C01C02  487
*2C01C03  499
*2C01C04  528
*2C01C05  500
*2C01C06  465
*2C01C07  455
*2C01C08  445
*2C01C09  453
*2C01C10  480
*2C01C11  445
*2C01C12  500
*2C01C13  476
*2C01C14  504
*2C01C15  547
*2C01C16  444
*2C01C17  491
*2C01C18  509
*2C01C19  381
*2C01C20  537
*2C02C01  498
*2C02C02  520
*2C02C03  458
*2C02C04  494
*2C02C05  489
*2C02C06  420
*2C02C07  521
*2C03C01  491
*2C03C02  439
*2C03C03  562
*2C03C04  499
*2C03C05  530
*2C03C06  470
*2C03C07  474
*2C03C08  572
*2C03C09  483
*2C03C10  515
*2C04C01  532
*2C04C02  451
*2C04C03  514
*2C04C04  514
*2C04C05  503
*2C04C06  513
*2C04C07  500
*2C04C08  508
*2C04C09  505
*2C04C10  502
*2D01C01  534
*2D01C02  479
*2D01C03  525
*2D01C04  452
*2D01C05  487
*2D01C06  446
*2D01C07  455
*2D01C08  444
*2D01C09  508
*2D01C10  457
*2D01C11  507
*2D01C12  479
*2D01C13  473
*2D01C14  520
*2D01C15  469
*2D01C16  508
*2D01C17  515
*2D01C18  517
*2D01C19  504
*2D01C20  465
*2D01C21  454
*2D02C01  513
*2D02C02  525
*2D02C03  504
*2D02C04  549
*2D02C05  458
*2D02C06  471
*2D02C07  470
*2D02C08  496
*2D02C09  478
*2D02C10  526
*2D02C11  485
*2D02C12  480
*2D02C13  489
*2D02C14  511
*2D02C15  438
*2D02C16  485
*2D02C17  477
*2D02C18  514
*2D02C19  407
*2D02C20  524
*2D02C21  454
*2D02C22  501
*2D03C01  544
*2D03C02  496
*2D03C03  508
*2D03C04  507
*2D03C05  516
*2D03C06  506
*2D03C07  516
*2D03C08  502
*2D03C09  493
*2D03C10  473
*2D03C11  435
*2D03C12  496
*2D03C13  448
*2D03C14  488
*2D04C01  564
*2D04C02  615
*2D04C03  501
*2D04C04  572
*2D04C05  588
*2D04C06  469
*2D04C07  489
*2D04C08  524
*2D04C09  488
*2D04C10  522
*2D05C01  518
*2D05C02  539
*2D05C03  524
*2D05C04  475
*2D05C05  486
*2D05C06  506
*2D05C07  504
*2D05C08  491
*2D05C09  556
*2E01C01  535
*2E01C02  511
*2E01C03  448
*2E01C04  455
*2E01C05  657
*2E01C06  536
*2E01C07  400
*2E01C08  446
*2E01C09  486
*2E01C10  445
*2E01C11  338
*2E01C12  404
*2E01C13  414
*2E02C01  503
*2E02C02  508
*2E02C03  457
*2E02C04  504
*2E02C05  476
*2E02C06  502
*2E02C07  356
*2E02C08  508
*2E02C09  573
*2E02C10  484
*2E02C11  472
*2E02C12  528
*2E02C13  490
*2E02C14  467
*2E02C15  471
*2E02C16  494
*2E03C01  511
*2E03C02  519
*2E03C03  467
*2E03C04  558
*2E03C05  525
*2E03C06  525
*2E03C07  551
*2E03C08  561
*2E03C09  514
*2E03C10  458
*2E03C11  479
*2E03C12  538
*2E03C13  479
*2E03C14  488
*2E03C15  488
*2E03C16  548
*2E03C17  506
*2E03C18  523
*2E03C19  521
*2E03C20  519
*2E03C21  567
*2E03C22  474
*2E03C23  506
*2E03C24  447
*2E03C25  517
*2F01C01  505
*2F01C02  503
*2F01C03  480
*2F01C04  485
*2F01C05  561
*2F01C06  501
*2F01C07  581
*2G01C01  507
*2G01C02  500
*2G01C03  517
*2G01C04  543
*2G01C05  502
*2G01C06  494
*2G01C07  508
*2G01C08  540
*2G01C09  512
*2G01C10  523
*2G01C11  572
*2G01C12  511
*2G01C13  519
*2G02C01  496
*2G02C02  516
*2G02C03  452
*2G02C04  513
*2G02C05  484
*2G02C06  522
*2G02C07  514
*2G02C08  511
*2G02C09  524
*2G02C10  460
*2G02C11  538
*2H01C01  532
*2H01C02  544
*2H01C03  638
*2H01C04  559
*2H01C05  557
*2H01C06  508
*2H01C07  513
*2H01C08  557
*2H02C01  507
*2H02C02  472
*2H02C03  525
*2H02C04  497
*2H02C05  546
*3B01C01  543
*3B01C02  552
*3B01C03  483
*3B01C04  494
*3B01C05  573
*3B01C06  419
*3B01C07  482
*3B02C01  482
*3B02C02  403
*3B02C03  428
*3B02C04  477
*3B02C05  435
*3B02C06  499
*3B02C07  429
*3B02C08  495
*3B02C09  458
*3C01C01  509
*3C01C02  570
*3C01C03  523
*3C01C04  551
*3C01C05  564
*3C01C06  519
*3C01C07  594
*3C01C08  612
*3C01C09  532
*3C01C10  620
*3C01C11  538
*3C01C12  556
*3C02C01  454
*3C02C02  522
*3C02C03  461
*3C02C04  430
*3F01C01  499
*3F01C02  450
*3F01C03  508
*3F01C04  484
*3F01C05  450
*3F01C06  505
*3F01C07  549
*3F01C08  531
*3F01C09  478
*3F01C10  492
*3F01C11  537
*3F01C12  541
*3F01C13  546
*3F01C14  530
*3G01C01  548
*3G01C02  477
*3G01C03  517
*3G01C04  497
*3G01C05  486
*3G01C06  520
*3G01C07  470
*3G01C08  517
*3G01C09  458
*3G01C10  532
*3G01C11  517
*3G01C12  514
*3G01C13  543
*3G01C14  492
*3G01C15  498
*3G01C16  510
*3G01C17  498
*3G02C01  523
*3G02C02  451
*3G02C03  538
*3G02C04  497
*3G02C05  495
*3G02C06  435
*3G02C07  522
*3G02C08  493
*3G02C09  511
*3G02C10  469
*3G02C11  467
*3G02C12  513
*3G02C13  522
*3G02C14  555
*3H01C01  482
*3H01C02  568
*3H01C03  514
*3H01C04  480
*3H01C05  525
*3H01C06  565
*3H01C07  501
*3H01C08  535
*3H01C09  552
*3H01C10  497
*3H01C11  567
*3H01C12  423
*3H01C13  546
*3H01C14  437
*3H01C15  489
*3H01C16  503
*3H01C17  478
*3H01C18  482
*3H01C19  428
*3H01C20  431
*3H01C21  455
*3H01C22  520
*3H01C23  439
*3H01C24  494
*3H01C25  469
*3H01C26  466
*3H01C27  503
*3H01C28  530
*3H01C29  480
*3H01C30  377
*3H02C01  460
*3H02C02  449
*3H02C03  492
*3H02C04  442
*3H02C05  490
*3H02C06  423
*3H02C07  470
*3H02C08  447
*3H02C09  537
*3H02C10  485
*3H02C11  541
*3H02C12  456

SAMPLES RANKED IN ASCENDING ORDER

SAMPLE    SIZE
*2E01C11  338
*2E02C07  356
*2B01C04  361
*3H01C30  377
*2C01C19  381
*1D01C05  384
*1D01C02  387
*1D01C16  391
*2E01C07  400
*3B02C02  403
*2E01C12  404
*2D02C19  407
*2E01C13  414
*3B01C06  419
*2C02C06  420
*3H02C06  423
*3H01C12  423
*1D01C15  427
*1F01C04  427
*3B02C03  428
*3H01C19  428
*3B02C07  429
*3C02C04  430
*3H01C20  431
*2D03C11  435
*3G02C06  435
*3B02C05  435
*1D01C19  436
*3H01C14  437
*2D02C15  438
*3H01C23  439
*2C03C02  439
*1E01C09  441
*3H02C04  442
*2C01C16  444
*2D01C08  444
*1G02C05  445
*2E01C10  445
*2C01C08  445
*2C01C11  445
*2E01C08  446
*2D01C06  446
*3H02C08  447
*2E03C24  447
*2B02C05  448
*2D03C13  448
*2E01C03  448
*3H02C02  449
*1C02C07  450
*3F01C05  450
*3F01C02  450
*2C04C02  451
*2B01C01  451
*3G02C02  451
*2D01C04  452
*2G02C03  452
*2C01C09  453
*1F01C01  453
*1G01C06  453
*2D01C21  454
*3C02C01  454
*2D02C21  454
*3H01C21  455
*1C01C04  455
*2E01C04  455
*2D01C07  455
*2C01C07  455
*3H02C12  456
*2D01C10  457
*2E02C03  457
*2D02C05  458
*2B01C08  458
*3B02C09  458
*3G01C09  458
*2E03C10  458
*1G01C07  458
*2C02C03  458
*1F01C02  459
*3H02C01  460
*2G02C10  460
*3C02C03  461
*2B02C09  463
*1E01C01  464
*2C01C06  465
*2D01C20  465
*3H01C26  466
*1D01C20  466
*2E02C14  467
*2E03C03  467
*3G02C11  467
*1G01C08  468
*2D04C06  469
*2B01C02  469
*2D01C15  469
*3G02C10  469
*3H01C25  469
*2C03C06  470
*1G02C06  470
*2D02C07  470
*2B01C03  470
*2B02C12  470
*3G01C07  470
*1C02C01  470
*3H02C07  470
*2D02C06  471
*2E02C15  471
*2B02C08  471
*2H02C02  472
*2E02C11  472
*1E01C07  473
*2D03C10  473
*1E01C08  473
*2D01C13  473
*2C03C07  474
*2E03C22  474
*2D05C04  475
*1C02C02  475
*2E02C05  476
*2C01C13  476
*1H02C07  477
*2D02C17  477
*3G01C02  477
*3B02C04  477
*3H01C17  478
*2D02C09  478
*3F01C09  478
*2D01C02  479
*1H02C03  479
*2E03C11  479
*2D01C12  479
*2E03C13  479
*2F01C03  480
*2C01C10  480
*2B01C11  480
*2D02C12  480
*3H01C04  480
*1D01C12  480
*3H01C29  480
*1F01C13  481
*1F01C05  481
*3B01C07  482
*3H01C01  482
*3B02C01  482
*1G01C04  482
*3H01C18  482
*2C03C09  483
*3B01C03  483
*2E02C10  484
*1C02C03  484
*2G02C05  484
*3F01C04  484
*2D02C11  485
*3H02C10  485
*1C01C05  485
*1G02C04  485
*2D02C16  485
*2F01C04  485
*2B02C07  485
*3G01C05  486
*2E01C09  486
*2D05C05  486
*2C01C02  487
*2D01C05  487
*2D04C09  488
*2E03C14  488
*2D03C14  488
*2E03C15  488
*3H01C15  489
*2B02C04  489
*2D04C07  489
*2C02C05  489
*2D02C13  489
*2E02C13  490
*2B02C13  490
*3H02C05  490
*1G02C02  490
*1H02C05  490
*1D01C03  491
*1F01C10  491
*2C03C01  491
*2C01C17  491
*2D05C08  491
*3H02C03  492
*3G01C14  492
*3F01C10  492
*2D03C09  493
*1H01C12  493
*1C02C10  493
*3G02C08  493
*3B01C04  494
*3H01C24  494
*2C02C04  494
*2E02C16  494
*1H01C10  494
*2G01C06  494
*2B02C10  495
*1H02C01  495
*3B02C08  495
*1G01C09  495
*1H01C05  495
*3G02C05  495
*1G02C09  495
*2D02C08  496
*1H01C15  496
*1C01C03  496
*2G02C01  496
*2D03C02  496
*2D03C12  496
*3H01C10  497
*3G01C04  497
*2H02C04  497
*3G02C04  497
*2C02C01  498
*2B01C06  498
*2B02C06  498
*1F01C03  498
*3G01C17  498
*3G01C15  498
*2C03C04  499
*1H01C14  499
*2C01C03  499
*3F01C01  499
*3B02C06  499
*1G02C07  499
*2C04C07  500
*1H01C07  500
*2C01C12  500
*2G01C02  500
*2C01C01  500
*2C01C05  500
*3H01C07  501
*2D04C03  501
*2F01C06  501
*2D02C22  501
*1H02C04  501
*2C04C10  502
*2G01C05  502
*2D03C08  502
*2E02C06  502
*2C04C05  503
*3H01C16  503
*2E02C01  503
*2F01C02  503
*3H01C27  503
*2C01C14  504
*2E02C04  504
*2D02C03  504
*2D05C07  504
*2D01C19  504
*1G01C01  505
*2C04C09  505
*3F01C06  505
*1E01C06  505
*2F01C01  505
*1D01C01  505
*2D05C06  506
*2E03C17  506
*2D03C06  506
*2E03C23  506
*2H02C01  507
*1C01C02  507
*2G01C01  507
*2D01C11  507
*2D03C04  507
*2H01C06  508
*2C04C08  508
*2E02C08  508
*2E02C02  508
*1H02C02  508
*1C01C06  508
*2G01C07  508
*2D01C16  508
*3F01C03  508
*2D01C09  508
*2D03C03  508
*1F01C14  509
*2C01C18  509
*1D01C13  509
*3C01C01  509
*3G01C16  510
*2D02C14  511
*2G02C08  511
*2E03C01  511
*3G02C09  511
*2E01C02  511
*2G01C12  511
*1H01C11  512
*1G01C05  512
*1H01C09  512
*1H01C04  512
*2G01C09  512
*2C04C06  513
*2H01C07  513
*2D02C01  513
*2G02C04  513
*3G02C12  513
*1G02C01  513
*2C04C03  514
*3H01C03  514
*2E03C09  514
*1G01C03  514
*2C04C04  514
*3G01C12  514
*2D02C18  514
*2G02C07  514
*2B02C14  515
*1G01C02  515
*2D01C17  515
*2C03C10  515
*1C02C05  515
*2G02C02  516
*2D03C07  516
*2D03C05  516
*3G01C03  517
*2E03C25  517
*2G01C03  517
*2D01C18  517
*3G01C08  517
*3G01C11  517
*2D05C01  518
*1H01C13  519
*2E03C02  519
*2G01C13  519
*3C01C06  519
*2E03C20  519
*3H01C22  520
*2C02C02  520
*3G01C06  520
*2D01C14  520
*2C02C07  521
*1E01C02  521
*2E03C19  521
*1F01C12  522
*3C02C02  522
*3G02C07  522
*2D04C10  522
*2G02C06  522
*1H01C02  522
*3G02C13  522
*1C01C01  523
*2G01C10  523
*2E03C18  523
*3G02C01  523
*3C01C03  523
*2B02C11  523
*1G02C03  524
*2D02C20  524
*2G02C09  524
*2D05C03  524
*2D04C08  524
*2H02C03  525
*2D01C03  525
*1G02C11  525
*2E03C05  525
*3H01C05  525
*2D02C02  525
*2E03C06  525
*1G02C10  526
*1C01C07  526
*2D02C10  526
*1C02C09  526
*1H02C06  527
*2B02C02  528
*2E02C12  528
*2C01C04  528
*1C02C08  529
*2B01C07  530
*2C03C05  530
*3F01C14  530
*3H01C28  530
*3F01C08  531
*3C01C09  532
•2H01C01 532 *2C04C01 532 *3G0 1C10 532 • 1H01C01 532 *1G02C08 533 *2D01C01 534 •3H01C08 535 *1D01C17 535 •2E01C01 535 •2E01C06 536 *3F01C11 537 •2C01C20 537 •1H01C08 537 •3H02C09 537 •2E03C12 538 *3 GO 2 CO 3 538 •3C01C11 538 •2G02C11 538 •2D05C02 539 •2G01C08 540 •3H02C11 541 • 1F01C11 541 •3F01C12 541 •3B01C01 543 *3G0 1C13 543 •2G01C04 543 *1H01C03 543 •2D03C01 544 SAMPLES RANKED IN ASCENDING ORDER SAMPLE SIZE SAMPLE SIZE •2H01C02 544 *3C0 1C02 570 *2B02C03 54 5 •2B02C0 1 571 *1F0 1C09 54 5 •1D01C22 571 •2H02C0 5 546 • 2G01C11 572 •3F01C13 54 6 *2C03C08 572 •3H01C13 546 •2D04C04 572 *2C01C1 5 547 •2E02C09 573 • 3G01C0 1 548 *3B01C05 573 •2E03C16 54 8 *1D01C11 573 * 1FO1 CO 7 549 *1C02C04 575 *2D02C04 54 9 • 1D01C04 576 *3F0 1C0 7 549 *1D0 1C09 577 •3C01C04 551 • 1E01C04 579 •2B01C0 9 551 *2F01C07 581 *2E03C07 551 •1D01C08 584 •1F01C08 552 •1C02C06 588 *3B01C02 552 •2D04C05 588 •3H01C09 552 •3C01C07 594 *1D01C10 554 •2B01C10 608 •3G02C14 555 *1D01C21 611 *2D05C09 556 • 3C01C0 8 612 •1E01C05 556 • 1E01C03 612 •3C01C12 556 •2D04C02 615 *2H01C08 557 •2B01C05 618 * 1 DO 1 CO 6 557 •1D01C07 618 •2H01C05 557 •3C01C10 620 •2E03C04 558 •1D01C18 635 •2H01C04 559 •2H01C03 638 • 1D01C14 560 *2E0 1C05 657 •2F01C05 561 •2E03C08 561 •1H01C06 562 *2C03C03 562 *2D04C01 564 •3C01C0 5 564 •3H01C06 565 *1FO 1C0 6 566 •2E03C21 567 •3H01C11 567 •3H01C02 568 200 APPENDIX C COMPUTER FILES AND PROGRAMS USED IN THE STUDY 20 1 FILE# FILE NAME SIZE DESCRIPTION OF FILE 1 CORPUS (394) COMPLETE ENGLISH LANGUAGE SAMPLE. 2 GRADE8 (87) SAMPLE TAKEN JUST FROM GRADE EIGHT. 3 GRADE9 (206) II II •i t i GRADE NINE. 4 GRADE10 (102) n II II i i GRADE TEN. 5 AGRICULTURE (18) i t II it n AGRICULTURE 6 COMMERCE (35) •i II i i II COMMERCE 7 ENGLISH (70) n II II it ENGLISH 8 HOMEC (85) II •• it it HOME EC. 9 INDED (54) II II n II INDUSTRIAL ED. 10 MATH (30) II II i i •t MATHEMATICS 11 SCIENCE (6 7) II II it •i SCIENCE 12 SOCIALS (72) n n i i it SOCIALS 13 GRADE8.ENGLISH (16) GRADE-SUBJECT SAMPLE 14 GRADE8. 
HOMEC (20) "
15    GRADE8.INDED          (9)     "
16    GRADE8.MATH           (13)    "
17    GRADE8.SCIENCE        (17)    "
18    GRADE8.SOCIALS        (21)    "
19    GRADE9.COMMERCE       (22)    "
20    GRADE9.ENGLISH        (41)    "
21    GRADE9.HOMEC          (66)    "
22    GRADE9.INDED          (47)    "
23    GRADE9.MATH           (7)     "
24    GRADE9.SCIENCE        (23)    "
25    GRADE9.SOCIALS        (14)    "
26    GRADE10.COMMERCE      (14)    "
27    GRADE10.ENGLISH       (15)    "
28    GRADE10.MATH          (12)    "
29    GRADE10.SCIENCE       (28)    "
30    GRADE10.SOCIALS       (41)    "
31    8E01                  (7)     GRADE-SUBJECT-TEXT* SAMPLE
32    8E02                  (9)     "
33    8H01                  (20)    "
34    8I01                  (9)     "
35    8M01                  (13)    "
36    8SC01                 (8)     "
37    8SC02                 (11)    "
38    8SO01                 (14)    "
39    8SO02                 (7)     "
40    9C01                  (11)    "

THE SIZE COLUMN REFERS TO MACHINE PAGES INSIDE THE COMPUTER, WHICH ARE APPROXIMATELY THE SAME AS TWO 8 X 11 PRINTED PAGES EACH.

41    9C02                  (13)    GRADE-SUBJECT-TEXT* SAMPLE
42    9E01                  (19)    "
43    9E02                  (7)     "
44    9E03                  (9)     "
45    9E04                  (9)     "
46    9H01                  (18)    "
47    9H02                  (19)    "
48    9H03                  (13)    "
49    9H04                  (10)    "
50    9H05                  (9)     "
51    9I01                  (12)    "
52    9I02                  (14)    "
53    9I03                  (23)    "
54    9M01                  (7)     "
55    9SC01                 (13)    "
56    9SC02                 (11)    "
57    9SO01                 (8)     "
58    9SO02                 (6)     "
59    10C01                 (6)     "
60    10C02                 (8)     "
61    10E01                 (12)    "
62    10E02                 (5)     "
63    10M01                 (12)    "
64    10SC01                (16)    "
65    10SC02                (14)    "
66    10SO01                (29)    "
67    10SO02                (12)    "
68    WRDSTAT.S             (4)     SENTENCE STATISTICS
69    WRDSTAT.O             (4)     "
70    SPLIT1.S              (3)     PROGRAM TO BREAK 'CORPUS' INTO GDES
71    SPLIT1.O              (2)     "
72    ST.DEV.S              (1)     STANDARD DEVIATION PROGRAM
73    ST.DEV.O              (1)     "
74    TABL.B1.S             (1)     RANK TABLE (DESC. ORDER) PROGRAM
75    TABL.B1.O             (1)     "
76    TABL.B4.S             (1)     UNRANKED ASCENDING TABLE PROGRAM
77    TABL.B4.O             (1)     "
78    SPLIT2.S              (3)     BREAKS GRADES INTO GRADE-SUBJECTS
79    SPLIT2.O              (3)     "
80    SPLIT3.S              (3)     BREAKS GRADE-SUBJECTS INTO TEXTS
81    SPLIT3.O              (3)     BREAKS GRADE-SUBJECTS INTO TEXTS
82    COUNTW.S              (7)     WORD COUNT PROGRAM
83    COUNTW.O              (7)     "
84    UNSORT.CORP.FREQS     (50)    CORPUS (TABL.B1, & TABL.B4 DATA)
85    UNSORT.GRD8.FREQS     (21)    GRADE8 "
86    UNSORT.GRD9.FREQS     (35)    GRADE9 "
87    UNSORT.GD10.FREQS     (24)    GRADE10 "
88    UNSORT.COMM.FREQS     (9)     COMMERCE "
89    UNSORT.ENGL.FREQS     (21)    ENGLISH "
90    UNSORT.HOME.FREQS     (17)    HOME EC. "
91    UNSORT.INDE.FREQS     (12)    INDUSTRIAL ED. "
92    UNSORT.MATH.FREQS     (6)     MATHEMATICS "
93    UNSORT.SCIE.FREQS     (15)    SCIENCE "
94    UNSORT.SOCI.FREQS     (20)    SOCIALS "
95    P1                    (1)     P1 TO P37 FOLLOW 8E01 THRU' 10SO02
96    P2                    (1)     AND ARE THE INPUT DATA FOR PLOTT.S
97    P3                    (1)
98    P4                    (1)
99    P5                    (1)
100   P6                    (1)
101   P7                    (1)
102   P8                    (1)
103   P9                    (1)
104   P10                   (1)
105   P11                   (1)
106   P12                   (1)
107   P13                   (1)
108   P14                   (1)
109   P15                   (1)
110   P16                   (1)
111   P17                   (1)
112   P18                   (1)
113   P19                   (1)
114   P20                   (1)
115   P21                   (1)
116   P22                   (1)
117   P23                   (1)
118   P24                   (1)
119   P25                   (1)
120   P26                   (1)
121   P27                   (1)
122   P28                   (1)
123   P29                   (1)
124   P30                   (1)
125   P31                   (1)
126   P32                   (1)
127   P33                   (1)
128   P34                   (1)
129   P35                   (1)
130   P36                   (1)
131   P37                   (1)
132   PG1                   (1)     PG1 TO PG18 FOLLOW GRADE8.ENGLISH TO
133   PG2                   (1)     GRADE10.SOCIALS. AND ARE THE INPUT
134   PG3                   (1)     DATA FOR PLOTGS.S
135   PG4                   (1)
136   PG5                   (1)
137   PG6                   (1)
138   PG7                   (1)
139   PG8                   (1)
140   PG9                   (1)
141   PG10                  (1)
142   PG11                  (1)
143   PG12                  (1)
144   PG13                  (1)
145   PG14                  (1)
146   PG15                  (1)
147   PG16                  (1)
148   PG17                  (1)
149   PG18                  (1)
150   PS1                   (1)     PS1 TO PS11 FOLLOW CORPUS TO SOCIALS
151   PS2                   (1)     (EXCL. AGRICULTURE) AND ARE INPUT
152   PS3                   (1)     DATA FOR PLOTG.S, & PLOTS.S
153   PS4                   (1)
154   PS5                   (1)
155   PS6                   (1)
156   PS7                   (1)
157   PS8                   (1)
158   PS9                   (1)
159   PS10                  (1)
160   PS11                  (1)
161   CORP.X2               (2)     CORPUS 'WORD' CHI-SQUARE TABLE
162   GRADES.X2             (2)     GRADES "
163   EIGHT.X2              (2)     GRADE8 "
164   NINE.X2               (2)     GRADE9 "
165   TEN.X2                (2)     GRADE10 "
166   CORSENT.X2            (1)     CORPUS 'SENTENCE' CHI-SQUARE TABLE
167   GDSSENT.X2            (1)     GRADES "
168   G8SENT.X2             (1)     GRADE8 "
169   G9SENT.X2             (1)     GRADE9 "
170   G10SENT.X2            (1)     GRADE10 "
171   LENGS                 (1)     LENGTHS DATA FOR SENTENCE CHI-SQUARE
172   WORDS                 (1)     WORDS DATA FOR WORDS CHI-SQUARE
173   CHIS.3                (1)     3 COLUMN CHI-SQUARE PROGRAM
174   CHIO.3                (1)     "
175   CHIS.4                (1)     3 COLUMN CHI-SQUARE PROGRAM (SENTS.)
176   CHIO.4                (1)     "
177   CHIS.7                (1)     7 COLUMN CHI-SQUARE PROGRAM
178   CHIO.7                (3)     "
179   CHIS.8                (1)     7 COLUMN CHI-SQUARE PROGRAM (SENTS.)
180   CHIO.8                (2)     "
181   COUNTW.O              (8)     WORD COUNT (OUTPUTS PLOT DATA TOO)
182   WRDSTAT.S             (4)     SENT. STATS. (OUTPUTS PLOT DATA TOO)
183   WRDSTAT.O             (4)     "
184   PLOTS.S               (1)     PLOT SENT. LENGTH DISTR. FOR SUBJS.
185   PLOTS.O               (1)     "
186   PLOTG.S               (1)     PLOT SENT. LENGTH DISTR. FOR CORPUS & GRADES
187   PLOTG.O               (1)     "
188   CORPUS.INDEX          (7)     TEXT INFORMATION FOR BOOKS
189   GRADES.INDEX          (7)     "
190   CORPUS.INTRODUCTIO    (3)     "
191   GRADES.INTRODUCTIO    (3)     "
192   GRADES.INTRO.
INSER (1) II 193 PLOTT.S (D PLOT SENT. LENGTH DISTR. FOR TEXTS 194 PLOTT.0 d ) •i 195 PLOTGS.S (D PLOT SENT. LENGTH DISTR. FOR GRADE-SUBJ ECTS 196 PLOTGS.O (D II 197 PW1 (3) PW1 THRU' PW11 FOLLOW PS 1 THRU' PS11 198 PW2 (2) AND ARE INPUT DATA FOR PLOTGW.S 199 PW3 (3) • 200 PW4 (D • THE SIZE COLUMN REFERS TO MACHINE COMPUTER, WHICH ARE APPROXIMATELY PRINTED PAGES EACH. PAGES INSIDE THE THE SAME AS TWO 8 X 1 1 206 FILE # FILE NAME SIZE DESCRIPTION OF FILE 201 PW5 (D INPOT DATA FOR PLOTGW.S CONT'D 202 PW6 (D 203 PW7 (2) 204 PW8 (D 205 PW9 (D 206 PW10 (1) 207 PW11 (D 208 PLOTGW.S (1) PLOT WORD-FREQ-DISTR. (CORPUS, GRADES, 6 SUBJECTS) 209 TABL.B1W.S (D VERSION OF TABL. B1.S TO GIVE DATA FOR PLOTGW.S TOTAL SIZE (2,522) THE SIZE COLUMN REFERS TO MACHINE PAGES INSIDE THE COMPUTER, WHICH ARE APPROXIMATELY THE SAME AS TWO 8 X 1 1 PRINTED PAGES EACH. 207 APPENDIX D ALPHABETICAL LISTING OF CORPUS VOCABULARY (SAMPLE) CORPUS VOCABULARY 208 ALPHABETICAL LIST FREQ COUNT WORD FREQ COUNT WORD 501 551 0.0043 1 ABBEY 0.0085 2 ACCELERATING 0.0043 1 ABBOTS 0.0085 2 ACCELERATOR 0.0043 1 ABBREVIATED 0.0043 1 ACCELERATORS 0.0043 1 ABBREVIATING 0.0128 3 ACCENT 0.0043 I ABDICATE 0.0043 1 ACCENTED 0.0043 1 ABDICATED 0.0638 15 ACCEPT 0.0255 6 ABDOMEN 0.0213 5 ACCEPTABLE 0.0085 2 ABE* S 0.0085 2 ACCEPTANCE 0.0043 1 ABERDARES 0.0383 9 ACCEPTED 0.0043 1 ABIDES 0.0128 3 ACCEPTING 0.0213 5 ABILITIES 0.0043 1 ACCEPTS 0.0936 22 ABILITY 0.0043 1 ACCESS 0.3445 81 ABLE 0.0043 1 ACCESSIBLE 0.0043 1 ABNER 0.0170 4 ACCESSORIES 0.0085 2 ABNORMAL 0.0043 1 ACCESSORY 0.0043 1 ABNORMALITIES 0.0298 7 ACCIDENT 0.0043 1 ABOARD 0.0085 2 ACCIDENTALLY 0.0043 1 ABODE 0.0043 1 ACCIDENTALS 0.0043 1 ABDLISHED 0.0383 9 ACCIDENTS 0.0043 1 ABOUND 0.0043 1 ACCLIMATED 1.9693 463 ABOUT 0.0043 1 ACCLIMATIZED 0.4424 104 ABOVE 0.0043 1 ACCOMMODATE 0.0043 1 ABRAHAM 0.0043 1 ACCOMMODATES 0.0766 18 ABRASIVE 0.0043 1 ACCOMMODATIONS 0.0468 11 ABRASIVES 0.0128 3 ACCOMPANIED 0.0128 3 ABROAD 0.0170 4 ACCOMPANIES 0.0043 1 ABRUPT 0.0043 1 
ACCOMPANIMENT 0.0043 1 ABRUPTLY 0.0043 1 ACCOMPANY 0.0043 1 ARSCURED 0.0213 5 ACCOMPANYING 0.0213 5 ABSENCE 0.0468 11 ACCOMPLISHED 0.0128 3 ABSENT 0.0043 1 ACCOMPLISHES 0.0510 12 ABSOLUTE 0.0128 3 ACCOMPLISHMENT 0.0213 5 ABSOLUTELY 0.0043 1 ACCOMPLISHMENTS 0.0468 11 ABSORB 0.0085 2 ACCORD 0.0510 12 ABSORBED 0.0170 4 ACCORDANCE 0.0085 2 ABSORBENCY 0.1957 46 ACCORDING 0.0085 2 ABSORBENT 0.0170 4 ACCORDINGLY 0.0128 3 ABSORBING 0.1999 47 ACCOUNT 0.0255 6 ABSORBS 0.0128 3 ACCOUNTANT 0.0085 2 ABSORPTION 0.0128 3 ACCOUNTED 0.0170 4 ABSTRACT 0.0043 1 ACCOUNTING 0.0043 1 ABSURDITY 0.0681 16 ACCOUNTS 0.0213 5 ABUNDANCE 0.0128 3 ACCUMULATE 0.0255 6 ABUNDANT 0.0043 1 ACCUMULATES 0.0085 2 ABUSES 0.0043 1 ACCUMULATION 0.0043 1 ABUTTED 0.0808 19 ACCURACY 0.0043 1 ACADEMY 0.1531 36 ACCURATE 0.0085 2 ACAOIAN 0.0766 18 ACCURATELY 0.0043 1 ACCELERATE 0.0043 1 ACCUSED 0.0043 1 ACCELERATED 0.0043 1 ACCUSING 209 APPENDIX E RANK LISTING OF CORPUS VOCABULARY (SAMPLE) CORPUS VOCABULARY RANK LIST FREQ COUNT WORD FREQ COUNT WORD 1 51 7.4515 17519 THE 41.8690 488 MANY 10.9295 8177 OF 42.0736 481 SO 13.6768 6459 AND 42.2761 476 EACH 16.1952 5921 A 42.4738 465 TWO 18.7136 5921 TO 42.6708 463 ABOUT 20.8497 5022 IN 42.8575 439 SHOULD 22.5140 3913 IS 43.0396 428 WHAT 23.4778 2266 THAT 43.2203 425 THAN 24.4212 2218 IT 43.4007 424 BEEN 25.3421 2165 ARE 43.5806 423 INTO 26.2527 2141 FOR 43.7596 421 THEM 27.1417 2090 YOU 43.9370 417 USE 27.9375 1871 BE 44.1118 411 MAKE 28.6984 1789 AS 44.2845 406 DO 29.4002 1650 OR 44.4559 403 UP 30.0318 1485 WITH 44.6261 400 SUCH 30.6524 1459 ON 44.7962 400 THEN 31.2015 1291 THIS 44.9633 393 TIME 31.7315 1246 BY 45.1275 386 ITS 32.2036 1110 WAS 45.2845 369 WOULD 32.6660 1087 HE 45.4410 368 HOW 33.1185 1064 FROM 45.5967 366 NUMBER 33.5681 1057 HAVE 45.7511 363 MADE 34.0164 1054 AT 45.9034 358 OUT 34.4384 992 WHICH 46.0535 353 MOST 34.8352 933 ONE 46.2028 351 ONLY 35.2133 889 NOT 46.3512 349 NO 35.5778 857 CAN 46.4967 342 MUST 35.9419 856 YOUR 46.6413 340 WATER 36.3047 
853 THEY 46.7740 312 ALSO 36.6590 833 WE 46.9050 308 FIRST 37.0104 826 HIS 47.0352 306 VERY 37.3575 816 WILL 47.1640 303 GOOD 37.6888 779 IF 47.2895 295 HIM 38.0172 772 AN 47.4112 286 SAME 38.3149 700 WHEN 47.5298 279 1 38.6045 681 ALL 47.6404 260 COULD 38.8865 663 BUT 47.7510 260 WHO 39.1583 639 THESE 47.8595 255 ANY 39.4055 581 MAY 47.9675 254 BECAUSE 39.6475 569 THERE 48.0751 253 SEE 39.8848 558 HAS 48.1793 245 LIKE 40.1196 552 I 48.2810 239 MUCH 40.3501 542 OTHER 48.3814 236 PEOPLE 40.5785 537 SOME 48.4809 234 CALLED 40.8035 529 MORE 48.5804 234 2 41.0239 518 WERE 48.6795 233 PLACE 41.2408 510 HAD 48.7782 232 THROUGH 41.4526 498 THEIR 48.8769 232 WORK 41.6615 491 USED 48.9739 228 NEW 2 1 1 APPENDIX F DESCENDING AND ASCENDING ORDER OF CORPUS VOCABULARY (SAMPLES) 212 THE CORPUS WITH RANK IN DESCENDING ORDER X FX SUM FX CUM? FX FX*X SUM FX*X CUM? FX*X 51 488 1 51 0.311 488 98437 41.869 52 481 1 52 0.317 481 98918 42-073 53 476 1 53 0.323 476 99394 42.276 54 465 1 54 0.329 465 99859 42.473 55 463 1 55 0.335 463 100322 42.670 56 439 1 56 0.341 439 100761 42.857 57 428 1 57 0.347 428 101189 43.039 58 425 1 58 0.354 425 101614 43.220 59 424 1 59 0.360 424 102038 43.400 60 423 1 60 0.366 423 102461 43.580 61 421 1 61 0.372 421 102882 43.759 62 417 1 62 0.378 417 103299 43.937 63 411 1 63 0.384 411 103710 44.111 64 406 1 64 0.390 406 104116 44.284 65 403 1 65 0.396 403 104519 44.455 67 400 67 0.408 800 105319 44.796 68 393 1 68 0.415 393 105712 44.963 69 386 1 69 0.421 386 106098 45.127 70 369 1 70 0.427 369 106467 45.284 71 368 1 71 0.433 368 106835 45.440 72 366 1 72 0.439 366 107201 45.596 73 363 1 73 0.445 363 107564 45.750 74 358 1 74 0.451 358 107922 45.903 75 353 1 75 0.457 353 108275 46.053 76 351 1 76 0.463 351 108626 46.202 77 349 1 77 0.469 349 108975 46.351 78 342 1 78 0.475 342 109317 46.496 79 340 1 79 0.482 340 109657 46.641 80 312 1 80 0.488 312 109969 46.773 81 308 1 81 0.494 308 110277 46.904 82 306 1 82 0.500 306 110583 47.035 83 303 1 83 0.506 303 
110886 47.163 84 295 1 84 0.512 295 111181 47.289 85 286 1 85 0.518 286 111467 47.411 86 279 1 86 0.524 279 111746 47.529 88 260 88 0.536 520 112266 47.750 89 255 1 89 0.543 255 112521 47.859 90 254 1 90 0.549 254 112775 47.967 91 253 1 91 0.555 253 113028 48.074 92 245 1 92 0.561 245 113273 48.179 93 239 1 93 0.567 239 113512 48.280 94 236 1 94 0.573 236 113748 48.381 96 234 96 0.585 468 114216 48.580 97 233 1 97 0.591 233 114449 48.679 99 232 99 0.603 464 114913 48.876 100 228 1 100 0.610 228 115141 48.973 101 223 1 101 0.616 223 115364 49.068 102 220 1 102 0.622 220 115584 49.162 103 217 1 103 0.628 217 115801 49.254 105 216 2 105 0.640 432 116233 49.438 THE CORPUS IN ASCENDING ORDER 2 1 3 X FX SUM FX CUM? FX FX*X SUM FX*X CUM? FX*X L 7098 7098 43.267 7098 7098 3.019 2 2418 9516 58.007 4836 11934 5.076 3 1240 10756 65.565 3720 15654 6.658 4 854 11610 70.771 3416 19C70 8. I l l 5 632 12242 74.624 3160 22230 9.455 6 453 12695 77.385 2718 24948 10.611 7 387 13082 79.744 2709 27657 11.764 8 322 13404 81.707 2576 30233 12.859 9 251 13655 83.237 2259 32492 13.820 10 208 13863 84.505 2080 34572 14.705 11 189 14052 85.657 2079 36651 15.589 12 156 14208 86.608 1872 38523 16.385 13 123 14331 87.357 1599 40122 17.065 14 138 14469 88.199 1932 42054 17.887 15 113 14582 88.887 1695 43749 18.608 16 100 14682 89.497 1600 45349 19.289 17 98 14780 90.094 1666 47015 19.997 18 98 14878 90.692 1764 48779 20.748 19 58 14936 91.045 1102 49881 21.216 20 68 15004 91.460 1360 51241 21.795 21 51 15055 91.771 1071 52312 22.250 22 63 15118 92.155 1386 53698 22.840 23 36 15154 92.374 828 54526 23.192 24 44 15198 92.642 1056 55582 23.641 25 37 15235 92.868 925 56507 24.034 26 44 15279 93.136 1144 57651 24.521 27 39 15318 93.374 1053 58704 24.969 28 34 15352 93.581 952 59656 25.374 29 33 15385 93.782 957 60613 25.781 30 38 15423 94.014 1140 61753 26.266 31 27 15450 94.178 837 62590 26.622 32 30 15480 94.361 960 63550 27.030 33 21 15501 94.489 693 64243 27.325 34 39 15540 94.727 1326 65569 
27.889
  35     40   15580   94.971    1400    66969   28.484
  36     31   15611   95.160    1116    68085   28.959
  37     26   15637   95.318     962    69047   29.368
  38     25   15662   95.471     950    69997   29.772
  39     21   15683   95.599     819    70816   30.121
  40     26   15709   95.757    1040    71856   30.563
  41     18   15727   95.867     738    72594   30.877
  42     21   15748   95.995     882    73476   31.252
  43     24   15772   96.141    1032    74508   31.691
  44     17   15789   96.245     748    75256   32.009
  45     13   15802   96.324     585    75841   32.258
  46     18   15820   96.434     828    76669   32.610
  47     10   15830   96.495     470    77139   32.810
  48     12   15842   96.568     576    77715   33.055
  49      8   15850   96.617     392    78107   33.222

APPENDIX G
SENTENCE LENGTH DISTRIBUTION OF THE CORPUS (SAMPLE)

SENTENCE-LENGTH DISTRIBUTION OF THE CORPUS

LENGTH  REPETITIONS    CUM.   ACCUM WORDS   % WORDS
   1         36          36          36       0.02
   2         86         122         208       0.09
   3        100         222         508       0.22
   4        176         398        1212       0.52
   5        283         681        2627       1.12
   6        313         994        4505       1.92
   7        415        1409        7410       3.15
   8        471        1880       11178       4.75
   9        522        2402       15876       6.75
  10        581        2983       21686       9.22
  11        577        3560       28033      11.92
  12        616        4176       35425      15.07
  13        623        4799       43524      18.51
  14        632        5431       52372      22.28
  15        655        6086       62197      26.46
  16        634        6720       72341      30.77
  17        584        7304       82269      34.99
  18        580        7884       92709      39.43
  19        505        8389      102304      43.51
  20        471        8860      111724      47.52
  21        432        9292      120796      51.38
  22        424        9716      130124      55.35
  23        384       10100      138956      59.10
  24        343       10443      147188      62.61
  25        294       10737      154538      65.73
  26        266       11003      161454      68.67
  27        252       11255      168258      71.57
  28        239       11494      174950      74.41
  29        192       11686      180518      76.78
  30        193       11879      186308      79.25
  31        171       12050      191609      81.50
  32        124       12174      195577      83.19
  33        109       12283      199174      84.72
  34        103       12386      202676      86.21
  35         81       12467      205511      87.41
  36         75       12542      208211      88.56
  37         59       12601      210394      89.49
  38         43       12644      212028      90.19
  39         50       12694      213978      91.02
  40         48       12742      215898      91.83
  41         50       12792      217948      92.70
  42         36       12828      219460      93.35
  43         24       12852      220492      93.79
  44         22       12874      221460      94.20
  45         19       12893      222315      94.56
  46         23       12916      223373      95.01

APPENDIX H
GRAPHS OF SENTENCE LENGTH DISTRIBUTION

FIGURE 7.2 SENTENCE-LENGTH DISTRIBUTION OF GRADE EIGHT [graph; horizontal axis: SENTENCE LENGTH]
FIGURE 7.3 SENTENCE-LENGTH DISTRIBUTION OF GRADE NINE [graph; horizontal axis: SENTENCE LENGTH]
FIGURE 7.4 SENTENCE-LENGTH DISTRIBUTION OF GRADE TEN [graph; horizontal axis: SENTENCE LENGTH]
FIGURE 7.7 SENTENCE-LENGTH DISTRIBUTION OF HOME ECONOMICS [graph; horizontal axis: SENTENCE LENGTH]
FIGURE 7.8 SENTENCE-LENGTH DISTRIBUTION OF INDUSTRIAL EDUCATION [graph; horizontal axis: SENTENCE LENGTH]
FIGURE 7.10 SENTENCE-LENGTH DISTRIBUTION OF SCIENCE [graph; horizontal axis: SENTENCE LENGTH]
FIGURE 7.11 SENTENCE-LENGTH DISTRIBUTION OF SOCIAL STUDIES [graph; horizontal axis: SENTENCE LENGTH]
FIGURE 7.12 SENTENCE-LENGTH DISTRIBUTION OF GRADE EIGHT ENGLISH [graph; horizontal axis: SENTENCE LENGTH]
FIGURE 7.19 SENTENCE-LENGTH DISTRIBUTION OF GRADE NINE ENGLISH [graph; horizontal axis: SENTENCE LENGTH]
FIGURE 7.20 SENTENCE-LENGTH DISTRIBUTION OF GRADE NINE HOME ECONOMICS [graph; horizontal axis: SENTENCE LENGTH]
FIGURE 7.27 SENTENCE-LENGTH DISTRIBUTION OF GRADE TEN MATHEMATICS [graph; horizontal axis: SENTENCE LENGTH]
FIGURE 7.28 SENTENCE-LENGTH DISTRIBUTION OF GRADE TEN SCIENCE [graph; horizontal axis: SENTENCE LENGTH]
o so o SENTENCE LENGTH cu.o ~ia.a — r — 00.0 90.0 100.0 231 232 o <3h FIGURE 7.3/ SENTENCE-LENGTH DISTRIBUTION OF TEXT X1C02 8' 233 2 3 4 235 236 237 2 3 8 2 3 9 240 FIGURE 7.W SENTENCE-LENGTH DISTRIBUTION OF TEXT #2D05 242 243 FIGURE 7-53 SENTENCE-LENGTH D I S T R I B U T I O N OF TEXT -H2F01 -1 100.0 50.0 LENGTH 60. 70.0 eo.o 90.0 FIGURE 7-5V- SENTENCE-LENGTH D I S T R I B U T I O N . OF TEXT *2G01 1 40.0 50.0 if.HTENCE LENGTH 60.0 70 .0 —I ao.o ! 90.0 "I 100.0 244 9n 8- F1GURE7« " SENTENCE-LENGTH DISTRIBUTION OF TEXT *2G02 R-(_> z UJQ o-- | — 10.0 1 40.0 SENTENCE - 1 — 80.0 1 100.0 0.0 30.0 30.0 50.0 LENGTH 60.0 70.0 90.0 FIGURE 7 - » SENTENCE-LENGTH DISTRIBUTION OF TEXT *2H01 z UJO g2-UJ CK — i — 80.0 ~1 90.0 0.0 10.0 20.0 1 1 40.0 50.0 SENTENCE LENGTH (30.0 70.0 100.0 245 SB- FIGURE 7S7 SENTENCE-LENGTH DISTRIBUTION OF TEXT -K2H02 u UJQ 2d. On UJ 0£ T 30.0 40.0 50.0 60.0 SENTENCE LENGTH FIGURE 1-58 SENTENCE-LENGTH DISTRIBUTION OF TEXT #3801 R-o z UJQ o<-« UJ a; 0.0 10.0 1 1 1 0 . 0 b f l . O SENTENCE LENGTH 60.0 "70.0 eo.o i — 90.0 100.0 246 FIGURE 7-59 SENTENCE-LENGTH DISTRIBUTION OF TEXT K30O2 0.0 10.0 10.0 SENTENCE so.o LENGTH 100.0 9h FIGURE 740 SENTENCE-LENGTH DISTRIBUTION OF TEXT X3C01 o z UJO § 2 -I eo.o 100.0 0.0 ~ 1 — 10.0 20.0 30.0 ni.n M.o SENTENCE LENGTH 60.0 70.0 90.0 247 FIGURE 7-bl SENTENCE-LENGTH DISTRIBUTION OF TEXT *3C02 O 248 FIGURE 7U SENTENCE-LENGTH DISTRIBUTION OF TEXT K3H02 o o.o 10.0 20.0 30.0 m.n w.n co.o IO.O no.o 90.0 IOO.O SENTENCE LENGTH 250 APPENDIX I CHI SQUARE RESULTS OF DISTRIBUTION OF 100 MOST COMMON WORD TYPES TABLE XXXVI DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE GRADE LEVELS OF THE CORPUS G R A D fc S 8 9 10 TOTAL 3859.0 9071.0 4589.0 17519. 3938.4 9159.7 4420.9 7.299 7.378 7.733 C H I - S Q U A r t E 8.85 2. 1949.0 3857.0 2373.0 3177. OF 1838.3 4275.3 2063.5 3.6B7 3.137 3.999 CHI-SJUA^E 94.03 3. 1462.0 3539.0 1*58.0 6459. AND 1452.0 3377.0 1629.9 2.765 2.878 2.457 CHI-SQUA!'.E 25.97 4. 
1436.0 3036.0 1399.0 5921. A 1331.1 3095.7 1494.2 2.716 . 2.510 2.357 CH1-SQUAKE 14.36 5. 1326.0 3168.0 1427.0 5921. TO 1331.1 3095.7 1494.2 2.503 2.577 2.405 CHI-SOUARE 4.72 T H E T H P . E t LINES OF FIGURES FOR EACH ENTRY REPRESENT FREQUENCY EXPECTED FREQUENCY RATIO AS Z, OF FRFCQ. TO TOTAL NO. OF WORDS IN GRADE RANK WORD 1. THE TABLE X X X ^ I DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE GRADE LEVELS OF THE CORPUS RANK G R A D E S WORD 8 9 10 TOTAL 6. 1108.0 2515.0 1399.0 5022. I N 1129.0 2625.7 1267.3 2.096 2.045 2.357 C H I - r S Q U A R E 18.75 7. 891.0 2208.0 836.0 3913. I S 879.7 2045.9 987.4 1.685 1.796 1.4C9 C H I - S Q U A R E 36.22 8. 567.0 1064.0 635.0 2266. T H A T 509.4 1184.8 571.8 1.073 0.865 1.070 CHI-S3UAKE 25.80 9. 532.0 1181.0 505.0 2218. I T 498.6 11^9.7 559.7 1.006 0.961 0.851 C H I - S Q U A R E 7.97 10. 4U9.0 1278.0 398.0 2165. A R E 486.7 1132.0 546.3 0.925 1.039 0.671 C H I - S Q U A R E 59.13 T H E T H R E E L I N E S Vr F I G U H E S F O R E A C H E N T R Y R E P R E S E N T F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S Z, O F F R E Q . T O T U T A L N O . f l F W O R D S I N G R A D E T A B L E X X X V I D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 M O S T F R E Q U E N T W O R D T Y P E S A C R O S S T H E G R A D E L E V E L S O F T H E C O R P U S R A N K G R A D E S WORD 8 9 1 0 T O T A L 1 1 . 4 9 9 . 0 1 1 8 2 . 0 4 6 0 . 0 2 1 4 1 . F O R 4 3 1 . 3 1 1 1 9 . 4 5 4 0 . 3 0 . 9 4 4 0 . 9 6 1 0 . 7 7 5 C H I - S Q U A R E 16.oa 1 2 . 5 < - 5 . 0 1 1 9 1 . 0 3 5 4 . 0 2 0 9 0 . Y O U 4 6 9 . 9 1 0 9 2 . 7 5 2 7 . 4 1 . 0 3 1 0 . 9 6 9 0 . 5 9 7 C H I - S 3 U A A E 7 7 . 8 7 1 3 . 4 1 8 . 0 1 1 0 2 . 0 3 5 1 . 0 1 8 7 1 . 8 E 4 2 0 . 6 9 7 8 . 2 4 7 2 . 1 0 . 7 9 1 0 . 8 9 6 0 . 5 9 1 C H I - S a U A r t E 4 6 . 7 6 1 4 . 4 0 5 . 0 8 9 9 . 0 4 8 5 . 0 1 7 8 4 . A S 4 0 1 . 1 9 3 2 . 7 4 5 U . 2 0 . 7 6 6 0 . 7 3 1 0 . 8 1 7 C H I - S Q U A K E 3 . 9 5 1 5 . 3 6 9 . 
0 1 0 7 5 . 0 2 0 6 . 0 1 6 5 0 . OR 3 7 0 . 9 3 6 2 . 7 4 1 6 . 4 C . 6 9 8 0 . 8 7 4 0 . 3 4 7 CHI-SaUARE 1 5 8 . 5 5 T H E T H R E E L I N E S U F F I G U K E S F O R EACh E N T R Y R E P R E S E N T F R E Q U E N C Y E X P E C T E D F R E - J U E N C Y R A T I O A S IT U F F R T O . T O T O T A L N U . U F WORDS I N G R A D E T A B L E X X X V / D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 M O S T F R E Q U E N T W O R D T Y P E S A C R O S S T H t G R A D E L E V E L S O F T H E C O R P U S R A N K G R A D E S WORD 8 9 1 0 T O T A L 1 6 . 2 9 8 . 0 8 7 0 . 0 3 1 7 . 0 1 4 8 5 . W I T H 3 3 3 . 8 7 7 6 . 4 3 7 4 . 7 0 . 5 6 4 0 . 7 0 S 0 . 5 3 4 C H I - S Q U A R E 2 4 . 0 2 1 7 . 3 4 0 . 0 7 5 9 . 0 3 6 0 . 0 1 4 5 9 . O N 3 2 8 . 0 7 6 2 . 8 3 6 8 . 2 C . 6 4 3 0 . 6 1 7 0 . 6 0 7 C H I - S Q U A S E 0 . 6 4 1 8 . 2 6 3 . 0 6 5 0 . 0 3 7 8 . 0 1 2 9 1 . T H I S 2 9 0 . 2 6 7 5 . 0 3 2 5 . 8 0 . 4 9 T 0 . 5 2 9 0 . 6 3 7 C H I - S O U A f ( E 1 1 . 8 5 1 9 . 2 7 3 . 0 6 2 7 . 0 3 4 6 . 0 1 2 4 6 . B Y 2 8 0 . 1 6 5 1 . 5 3 1 4 . 4 0 . 5 1 6 0 . 5 1 0 C . 5 8 3 C H I - S Q U A R E 4 . 2 7 20. 2 7 8 . 0 4 1 9 . 0 4 1 3 . 0 1 1 1 0 . WAS 2 4 9 . 5 5 8 0 . 4 2 8 0 . 1 0 . 5 2 6 0 . 3 4 1 0 . 6 9 6 C H I - S Q U A K E 1 1 1 . 1 6 rO Ul T H E T H R E i L I N E S U F F I G U R E S F O R E A C H E N T R Y R E P R E S E N T : CO F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S St O F F R E Q . T O T O T A L N U . ;)F W O R D S I N G R A D E TABLE XXXVI DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREUUENT WORD TYPES ACROSS THE GRADE LEVELS OF THE CORPUS RANK G R A D E S WORD 8 9 10 TOTAL 21. 253.0 521.0 313.0 1087. HE 244.4 568.3 274.3 0.479 0.424 0.527 CHI-SQUARE 9.71 22. 248.0 534.0 282.0 1064. FROM 239.2 556.3 268.5 0.469 0.434 0.475 CHl-SQUA-iE 1.90 23. 263.0 503.0 291.0 1057. HAVE 237.6 552.6 266.7 0.497 0.409 0.490 CHI-SQUAKE 9.38 24. 241.0 526.0 287.0 1054. AT 236.9 551.1 266.0 0.456 0.428 0.484 CHI-SQUARE 2.87 25. 
257.0 484.0 251.0 992. WHICH 223.0 518.7 250.3 0.486 0.394 0.423 CHI-SQUARE 7.50 THE THREE LINES Or FIGURES FOR EACH ENTRY REPRESENT FREUUENCY EXPECTED FREQUENCY KATIU AS *, UF FREQ. TO TUTAL NO. OF WORDS IM GRADE TABLE X X X V I DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORO TYPES ACROSS THE GRADE LEVELS OF THE CORPUS RANK G R A D E S WORO 8 9 10 TOTAL 26. 225.0 481.0 227.0 933. ONE 209.7 487.8 235.4 0.426 0.391 0.383 CHI-SQUAKE 1.51 2.7. 192.0 492.0 205.0 889. NOT 199.9 464.8 224.3 0.363 0.400 0.345 CHI-SQUARE 3.57 28. 196.0 513.0 148.0 857. CAN 192.7 448.1 216.3 0.371 0.417 0.249 CHI-SQUARE 31.01 29. 255.0 481.0 120.0 853. YOUR 191.8 446.0 215.3 0.482 0.391 0.202 CHI-SQUARt 65.75 30. 230.0 452.0 171.0 833. THEY 187.3 435.5 210.2 0.435 0.368 0.288 CHI-SQUARE 17.69 THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT FREQUENCY EXPECTED FREQUENCY RATIO AS S, OF FREQ. TO TOTAL NO. OF WORDS IN GRADE TABLE XXXVI DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREOUENT WORD TYPES ACROSS THE GRADE LEVELS OF THE CORPUS RAN< G R A D E S WURD 8 9 10 TOTAL 31. 253.0 219.0 361.0 826. WE 185.7 431.9 208.4 0.479 0.178 0.608 CHI-SQUARE 240.98 32. 2C4.0 407.0 215.0 816. HIS 133.4 426.6 205.9 0.386 0.331 0.362 CHI-S3UA<E 3.61 33 . 164.0 495.0 157.0 816. WILL 133.4 426.6 205.9 0.310 0.403 0.265 CHI-SQUARE 24.64 34. 175.0 477.0 127.0 779. IF 175.1 407.3 196.6 0.331 0.388 0.214 CHI-SQUARE 36.56 3 5 . 176.0 409.0 187.0 772. AN 173.6 403.6 194.8 0.333 0.333 0.315 CHI-SQUARE 0.42 THE THREc LINES OF FIGURES FOR EACH ENTRY REPRESENT FREJUEUCY EXPECTED FREQUENCY RATIO AS %, UF FRcQ. TO TOTAL NU. OF WORDS IN GRADE TABLE XXXVI DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE GRADE LEVELS OF THE CORPUS RANK G R A D E S WORD 8 9 10 TOTAL 36. 142.0 424.0 134.0 700. WHEN 157.4 366.0 176.6 0.269 0.345 0.226 CHI-SQUARE 20.99 37. 144.0 366.0 171.0 681. ALL 153.1 356.1 171.8 0.272 0.298 0.288 CHI-SQUAKC 0.82 3 8 . 144.0 356.0 163.0 663. 
BUT        149.0   346.6   167.3
           0.272   0.290   0.275    CHI-SQUARE    0.53

39.        148.0   319.0   172.0    639.
THESE      143.7   334.1   161.3
           0.280   0.259   0.290    CHI-SQUARE    1.53

40.        126.0   358.0    97.0    581.
MAY        130.6   303.8   146.6
           0.238   0.291   0.163    CHI-SQUARE   26.63

THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT:
    FREQUENCY
    EXPECTED FREQUENCY
    RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN GRADE

TABLE XXXVI
DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES
ACROSS THE GRADE LEVELS OF THE CORPUS

RANK                GRADES
WORD           8       9      10   TOTAL

41.        142.0   301.0   126.0    569.
THERE      127.9   297.5   143.6
           0.269   0.245   0.212    CHI-SQUARE    3.75

42.        102.0   296.0   160.0    558.
HAS        125.4   291.7   140.8
           0.193   0.241   0.270    CHI-SQUARE    7.06

43.        104.0   268.0   180.0    552.
I          124.1   288.6   139.3
           0.197   0.218   0.303    CHI-SQUARE   16.62

44.        139.0   269.0   134.0    542.
OTHER      121.8   283.4   136.8
           0.263   0.219   0.226    CHI-SQUARE    3.20

45.        122.0   299.0   116.0    537.
SOME       120.7   280.8   135.5
           0.231   0.243   0.195    CHI-SQUARE    4.01

46.        108.0   304.0   117.0    529.
MORE       118.9   276.6   133.5
           0.204   0.247   0.197    CHI-SQUARE    5.76

47.        143.0   178.0   197.0    518.
WERE       116.5   270.8   130.7
           0.270   0.145   0.332    CHI-SQUARE   71.48

48.        113.0   219.0   178.0    510.
HAD        114.7   266.6   128.7
           0.214   0.178   0.300    CHI-SQUARE   27.43

49.        120.0   226.0   152.0    498.
THEIR      112.0   260.4   125.7
           0.227   0.184   0.256    CHI-SQUARE   10.63

50.        117.0   310.0    64.0    491.
USED       110.4   256.7   123.9
           0.221   0.252   0.108    CHI-SQUARE   40.42

51.        118.0   245.0   125.0    488.
MANY       109.7   255.1   123.1
           0.223   0.199   0.211    CHI-SQUARE    1.06

52.        137.0   249.0    95.0    481.
SO         108.1   251.5   121.4
           0.259   0.203   0.160    CHI-SQUARE   13.46

53.        122.0   253.0   101.0    476.
EACH       107.0   248.9   120.1
           0.231   0.206   0.170    CHI-SQUARE    5.21

54.        117.0   225.0   123.0    465.
TWO        104.5   243.1   117.3
           0.221   0.183   0.207    CHI-SQUARE    3.11

55.         91.0   265.0   107.0    463.
ABOUT      104.1   242.1   116.8
           0.172   0.216   0.180    CHI-SQUARE    4.64

56.        109.0   266.0    64.0    439.
SHOULD      98.7   229.5   110.8
           0.206   0.216   0.108    CHI-SQUARE   26.63

57.        114.0   194.0   120.0    428.
WHAT        96.2   223.8   108.0
           0.216   0.158   0.202    CHI-SQUARE    8.53

58.        102.0   217.0   106.0    425.
THAN        95.5   222.2   107.2
           0.193   0.176   0.179    CHI-SQUARE    0.57

59.         79.0   197.0   148.0    424.
BEEN        95.3   221.7   107.0
           0.149   0.160   0.249    CHI-SQUARE   21.26

60.         96.0   237.0    90.0    423.
INTO        95.1   221.2   106.7
           0.182   0.193   0.152    CHI-SQUARE    3.77

61.        109.0   220.0    92.0    421.
THEM        94.6   220.1   106.2
           0.206   0.179   0.155    CHI-SQUARE    4.09

62.         97.0   253.0    67.0    417.
USE         93.7   218.0   105.2
           0.183   0.206   0.113    CHI-SQUARE   19.61

63.        107.0   255.0    49.0    411.
MAKE        92.4   214.9   103.7
           0.202   0.207   0.083    CHI-SQUARE   38.66

64.         93.0   234.0    79.0    406.
DO          91.3   212.3   102.5
           0.176   0.190   0.133    CHI-SQUARE    7.63

65.         81.0   240.0    82.0    403.
UP          90.6   210.7   101.7
           0.153   0.195   0.138    CHI-SQUARE    8.90
66.         77.0   214.0   109.0    400.
SUCH        89.9   209.1   100.9
           0.146   0.174   0.184    CHI-SQUARE    2.61

67.         64.0   238.0    98.0    400.
THEN        89.9   209.1   100.9
           0.121   0.194   0.165    CHI-SQUARE   11.54

68.         70.0   218.0   105.0    393.
TIME        88.4   205.5    99.2
           0.132   0.177   0.177    CHI-SQUARE    4.92

69.         76.0   185.0   125.0    386.
ITS         86.8   201.8    97.4
           0.144   0.150   0.211    CHI-SQUARE   10.56

70.         94.0   162.0    90.0    369.
WOULD       83.0   192.9    93.1
           0.176   0.132   0.152    CHI-SQUARE    6.53

71.         85.0   204.0    79.0    368.
HOW         82.7   192.4    92.9
           0.161   0.166   0.133    CHI-SQUARE    2.83

72.        118.0    98.0   152.0    366.
NUMBER      82.3   191.4    92.4
           0.223   0.080   0.256    CHI-SQUARE   99.57

73.         96.0   210.0    57.0    363.
MADE        81.6   189.8    91.6
           0.182   0.171   0.096    CHI-SQUARE   17.76

74.         85.0   201.0    72.0    358.
OUT         80.5   187.2    90.3
           0.161   0.163   0.121    CHI-SQUARE    5.00

75.         76.0   173.0   104.0    353.
MOST        79.4   184.6    89.1
           0.144   0.141   0.175    CHI-SQUARE    3.37

76.        102.0   161.0    88.0    351.
ONLY        78.9   183.5    88.6
           0.193   0.131   0.148    CHI-SQUARE    9.52

77.         70.0   192.0    89.0    349.
NO          78.5   182.5    88.1
           0.132   0.156   0.150    CHI-SQUARE    1.42

78.         73.0   216.0    53.0    342.
MUST        76.9   178.8    86.3
           0.138   0.176   0.089    CHI-SQUARE   20.78

79.         52.0   198.0    90.0    340.
WATER       76.4   177.8    85.8
           0.098   0.161   0.152    CHI-SQUARE   10.32

80.         60.0   174.0    78.0    312.
ALSO        70.1   163.1    78.7
           0.113   0.142   0.131    CHI-SQUARE    2.20

81.         73.0   142.0    93.0    308.
FIRST       69.2   161.0    77.7
           0.138   0.115   0.157    CHI-SQUARE    5.46

82.         73.0   157.0    76.0    306.
VERY        68.8   160.0    77.2
           0.138   0.128   0.128    CHI-SQUARE    0.33

83.         93.0   168.0    42.0    303.
GOOD        68.1   158.4    76.5
           0.176   0.137   0.071    CHI-SQUARE   25.20

84.         64.0   160.0    71.0    295.
HIM         66.3   154.2    74.4
           0.121   0.130   0.120    CHI-SQUARE    0.46

85.         70.0   137.0    79.0    286.
SAME        64.3   149.5    72.2
           0.132   0.111   0.133    CHI-SQUARE    2.20

86.         69.0    95.0    96.0    260.
COULD       58.5   135.9    65.6
           0.131   0.077   0.162    CHI-SQUARE   28.31

87.         64.0   132.0    64.0    260.
WHO         58.5   135.9    65.6
           0.121   0.107   0.108    CHI-SQUARE    0.68

88.         58.0   126.0    71.0    255.
ANY         57.3   133.3    64.3
           0.110   0.102   0.120    CHI-SQUARE    1.10

89.         60.0   144.0    50.0    254.
BECAUSE     57.1   132.8    64.1
           0.113   0.117   0.084    CHI-SQUARE    4.19

90.         67.0   133.0    53.0    253.
SEE         56.9   132.3    63.8
           0.127   0.108   0.089    CHI-SQUARE    3.65

91.         64.0   129.0    52.0    245.
LIKE        55.1   128.1    61.8
           0.121   0.105   0.088    CHI-SQUARE    3.01

92.         55.0   116.0    68.0    239.
MUCH        53.7   125.0    60.3
           0.104   0.094   0.115    CHI-SQUARE    1.65

93.         60.0   104.0    72.0    236.
PEOPLE      53.1   123.4    59.6
           0.113   0.085   0.121    CHI-SQUARE    6.56

94.         53.0   111.0    70.0    234.
CALLED      52.6   122.3    59.0
           0.100   0.090   0.118    CHI-SQUARE    3.09

95.         74.0   123.0    36.0    233.
PLACE       52.4   121.8    58.8
           0.140   0.100   0.061    CHI-SQUARE   17.77

96.         44.0   136.0    52.0    232.
THROUGH     52.2   121.3    58.5
           0.083   0.111   0.088    CHI-SQUARE    3.79

97.         43.0   163.0    26.0    232.
WORK        52.2   121.3    58.5
           0.081   0.133   0.044    CHI-SQUARE   34.03

98.         38.0    85.0   105.0    228.
NEW         51.3   119.2    57.5
           0.072   0.069   0.177    CHI-SQUARE   52.40

99.         48.0   117.0    58.0    223.
SMALL       50.1   116.6    56.3
           0.091   0.095   0.098    CHI-SQUARE    0.15

100.        52.0   123.0    45.0    220.
OVER        49.5   115.0    55.5
           0.098   0.100   0.076    CHI-SQUARE    2.68
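The tables give each word's observed frequency, expected frequency, and chi-square value, but this section does not state the formula used. The figures appear consistent with the usual Pearson statistic, the sum over groups of (observed - expected) squared divided by expected, with each expected frequency equal to the word's total multiplied by that grade's share of the corpus. The sketch below is offered only under that assumption. The function names and the `GRADE_SHARES` values are illustrative: the shares are inferred from the printed expected-frequency rows, not quoted from the thesis.

```python
# Minimal sketch, assuming the tabled chi-square is the ordinary Pearson
# statistic. Names and GRADE_SHARES are illustrative, not from the thesis.

# Approximate share of the corpus in Grades 8, 9 and 10, inferred from the
# ratio of expected frequency to total in the tables (an assumption).
GRADE_SHARES = [0.2248, 0.5228, 0.2524]


def expected_counts(total, shares):
    """Spread a word's total frequency across groups by corpus share."""
    return [total * share for share in shares]


def chi_square(observed, expected):
    """Pearson chi-square: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Entry 98 (NEW) from the grade-level table.
observed = [38.0, 85.0, 105.0]   # frequency row
expected = [51.3, 119.2, 57.5]   # expected-frequency row as printed

# Comes out near 52.5; the table prints 52.40, presumably because the
# original program used unrounded expected values.
print(round(chi_square(observed, expected), 2))

# Expected counts rebuilt from the inferred shares, for comparison with
# the printed expected-frequency row.
print([round(e, 1) for e in expected_counts(228, GRADE_SHARES)])
```

A large value, as for NEW, indicates that the word is spread across the grades very differently from what the relative sizes of the grade sub-corpora alone would predict; a value near zero, as for SMALL (0.15), indicates a nearly proportional spread.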
TABLE XXXVII
DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES
ACROSS THE SUBJECT AREAS OF THE CORPUS

RANK                             SUBJECTS
WORD           B       C       D       E       F       G       H   TOTAL

1.        1463.0  2799.0  2652.0  2787.0  1263.0  3266.0  3289.0  17519.
THE       1501.8  3003.0  3670.4  2332.3  1327.0  2815.7  2876.9
           7.259   6.945   5.384   8.904   7.092   8.643   8.519    CHI-SQUARE  520.19

2.         607.0  1189.0  1367.0   845.0   757.0  1636.0  1782.0   8177.
OF         701.0  1401.6  1713.2  1088.6   619.4  1314.2  1342.8
           3.012   2.950   2.775   2.700   4.251   4.330   4.616    CHI-SQUARE  422.35

3.         471.0  1257.0  1653.0   855.0   287.0   746.0  1190.0   6459.
AND        553.7  1107.1  1353.2   859.9   489.2  1038.1  1060.7
           2.337   3.119   3.356   2.732   1.612   1.974   3.082    CHI-SQUARE  280.64

4.         502.0   992.0  1369.0   797.0   525.0  1033.0   703.0   5921.
A          507.6  1014.9  1240.5   788.3   448.5   951.6   972.3
           2.491   2.462   2.779   2.546   2.948   2.734   1.821    CHI-SQUARE  108.59

5.         599.0   975.0  1494.0   695.0   421.0   859.0   878.0   5921.
TO         507.6  1014.9  1240.5   788.3   448.5   951.6   972.3
           2.972   2.419   3.033   2.220   2.364   2.273   2.274    CHI-SQUARE  100.72

THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT:
    FREQUENCY
    EXPECTED FREQUENCY
    RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT

6.         411.0   777.0  1046.0   620.0   358.0   799.0  1011.0   5022.
IN         430.5   860.8  1052.2   668.6   380.4   807.1   824.7
           2.039   1.928   2.124   1.981   2.010   2.114   2.619    CHI-SQUARE   56.11

7.         350.0   406.0   903.0   797.0   427.0   595.0   441.0   3913.
IS         335.4   670.7   819.8   520.9   296.4   628.9   642.6
           1.737   1.007   1.833   2.546   2.398   1.575   1.142    CHI-SQUARE  382.47

8.         200.0   482.0   414.0   204.0   307.0   425.0   234.0   2266.
THAT       194.3   388.4   474.7   301.7   171.6   364.2   372.1
           0.992   1.196   0.840   0.652   1.724   1.125   0.606    CHI-SQUARE  230.28

9.         176.0   478.0   503.0   337.0   140.0   352.0   232.0   2218.
IT         190.1   380.2   464.7   295.3   168.0   356.5   364.2
           0.873   1.186   1.021   1.077   0.786   0.932   0.601    CHI-SQUARE   87.99

10.        213.0   196.0   700.0   364.0   181.0   288.0   241.0   2165.
ARE        185.6   371.1   453.6   288.2   164.0   348.0   355.5
           1.057   0.486   1.421   1.163   1.016   0.762   0.624    CHI-SQUARE  289.44
11.        258.0   317.0   625.0   265.0   145.0   205.0   326.0   2141.
FOR        183.5   367.0   448.6   285.0   162.2   344.1   351.6
           1.280   0.787   1.269   0.847   0.814   0.543   0.844    CHI-SQUARE  167.74

12.        366.0   365.0   633.0   140.0   187.0   346.0    53.0   2090.
YOU        179.2   358.2   437.9   278.2   158.3   335.9   343.2
           1.816   0.906   1.285   0.447   1.050   0.916   0.137    CHI-SQUARE  601.48

13.        191.0   185.0   663.0   325.0   121.0   234.0   152.0   1871.
BE         160.4   320.7   392.0   249.1   141.7   300.7   307.2
           0.948   0.459   1.346   1.038   0.679   0.619   0.394    CHI-SQUARE  370.04

14.        135.0   309.0   406.0   214.0   148.0   274.0   303.0   1789.
AS         153.4   306.7   374.8   238.2   135.5   287.5   293.8
           0.670   0.767   0.824   0.684   0.831   0.725   0.785    CHI-SQUARE    9.34

15.        138.0   166.0   636.0   319.0    71.0   198.0   122.0   1650.
OR         141.4   282.8   345.7   219.7   125.0   265.2   271.0
           0.685   0.412   1.291   1.019   0.399   0.524   0.316    CHI-SQUARE  459.29
16.         98.0   284.0   364.0   231.0    84.0   224.0   200.0   1485.
WITH       127.3   254.5   311.1   197.7   112.5   238.7   243.9
           0.486   0.705   0.739   0.738   0.472   0.593   0.518    CHI-SQUARE   40.75

17.        165.0   210.0   276.0   215.0    97.0   244.0   252.0   1459.
ON         125.1   250.1   305.7   194.2   110.5   234.5   239.6
           0.819   0.521   0.560   0.687   0.545   0.646   0.653    CHI-SQUARE   26.95

18.        163.0   136.0   148.0   211.0   165.0   230.0   238.0   1291.
THIS       110.7   221.3   270.5   171.9    97.8   207.5   212.0
           0.809   0.337   0.300   0.674   0.927   0.609   0.616    CHI-SQUARE  173.81

19.        126.0   166.0   199.0   167.0   112.0   198.0   278.0   1246.
BY         106.8   213.6   261.0   165.9    94.4   200.3   204.6
           0.625   0.412   0.404   0.534   0.629   0.524   0.720    CHI-SQUARE   58.44

20.         24.0   450.0    39.0    41.0    59.0   136.0   361.0   1110.
WAS         95.2   190.3   232.6   147.8    84.1   178.4   182.3
           0.119   1.117   0.079   0.131   0.331   0.360   0.935    CHI-SQUARE  838.81

21.         79.0   520.0   187.0    14.0    42.0   137.0   108.0   1087.
HE          93.2   186.3   227.7   144.7    82.3   174.7   178.5
           0.392   1.290   0.380   0.045   0.236   0.363   0.280    CHI-SQUARE  780.81

22.         90.0   148.0   189.0   137.0    55.0   242.0   203.0   1064.
FROM        91.2   182.4   222.9   141.7    80.6   171.0   174.7
           0.447   0.367   0.384   0.438   0.309   0.640   0.526    CHI-SQUARE   53.98

23.        145.0   180.0   290.0    76.0   114.0   138.0   114.0   1057.
HAVE        90.6   181.2   221.5   140.7    80.1   169.9   173.6
           0.719   0.447   0.589   0.243   0.640   0.365   0.295    CHI-SQUARE  124.45

24.        112.0   224.0   207.0   100.0    49.0   222.0   140.0   1054.
AT          90.4   180.7   220.8   140.3    79.8   169.4   173.1
           0.556   0.556   0.420   0.319   0.275   0.588   0.363    CHI-SQUARE   62.59

25.         70.0   106.0   198.0   129.0    84.0   200.0   205.0    992.
WHICH       85.0   170.0   207.8   132.1    75.1   159.4   162.9
           0.347   0.263   0.402   0.412   0.472   0.529   0.531    CHI-SQUARE   49.56

26.         68.0   141.0   182.0   108.0   121.0   180.0   133.0    933.
ONE         80.0   159.9   195.5   124.2    70.7   150.0   153.2
           0.337   0.350   0.369   0.345   0.679   0.476   0.344    CHI-SQUARE   51.61

27.         63.0   201.0   236.0    77.0    88.0   122.0   102.0    889.
NOT         76.2   152.4   186.3   118.4    67.3   142.9   146.0
           0.313   0.499   0.479   0.246   0.494   0.323   0.264    CHI-SQUARE   63.18

28.         84.0    86.0   240.0   163.0    85.0   162.0    37.0    857.
CAN         73.5   146.9   179.5   114.1    64.9   137.7   140.7
           0.417   0.213   0.487   0.521   0.477   0.429   0.096    CHI-SQUARE  155.02

29.        179.0   136.0   324.0    23.0    37.0   138.0    19.0    853.
YOUR        73.1   146.2   178.7   113.6    64.6   137.1   140.1
           0.888   0.337   0.658   0.073   0.208   0.365   0.049    CHI-SQUARE  460.80

30.         60.0   171.0   266.0    75.0    23.0   148.0   110.0    833.
THEY        71.4   142.8   174.5   110.9    63.1   133.9   136.8
           0.298   0.424   0.540   0.240   0.129   0.392   0.285    CHI-SQUARE   99.18

31.        115.0   134.0    42.0    14.0   414.0    72.0    42.0    826.
WE          70.8   141.6   173.1   110.0    62.6   132.8   135.6
           0.571   0.333   0.085   0.045   2.325   0.191   0.109    CHI-SQUARE 2277.49

32.         75.0   346.0   176.0    11.0    28.0    77.0   113.0    816.
HIS         70.0   139.9   171.0   108.6    61.8   131.1   134.0
           0.372   0.859   0.357   0.035   0.157   0.204   0.293    CHI-SQUARE  436.17

33.        136.0    72.0   251.0   120.0    63.0   128.0    46.0    816.
WILL        70.0   139.9   171.0   108.6    61.8   131.1   134.0
           0.675   0.179   0.510   0.383   0.354   0.339   0.119    CHI-SQUARE  191.64

34.         68.0    91.0   278.0   102.0    90.0   118.0    32.0    779.
IF          66.8   133.5   163.2   103.7    59.0   125.2   127.9
           0.337   0.226   0.564   0.326   0.505   0.312   0.083    CHI-SQUARE  182.96

35.         65.0   135.0   142.0   121.0    50.0   135.0   124.0    772.
AN          66.2   132.3   161.7   102.8    58.5   124.1   126.8
           0.323   0.335   0.288   0.387   0.281   0.357   0.321    CHI-SQUARE    7.97
36.         63.0   108.0   193.0   118.0    45.0   124.0    49.0    700.
WHEN        60.0   120.0   146.7    93.2    53.0   112.5   115.0
           0.313   0.268   0.392   0.377   0.253   0.328   0.127    CHI-SQUARE   62.82

37.         86.0   136.0   159.0    61.0    49.0    86.0   104.0    681.
ALL         58.4   116.7   142.7    90.7    51.6   109.5   111.8
           0.427   0.337   0.323   0.195   0.275   0.228   0.269    CHI-SQUARE   33.52

38.         36.0   195.0   135.0    47.0    34.0    96.0   120.0    663.
BUT         56.8   113.6   138.9    88.3    50.2   106.6   108.9
           0.179   0.484   0.274   0.150   0.191   0.254   0.311    CHI-SQUARE   92.70

39.         69.0    64.0   103.0    76.0    87.0   126.0   114.0    639.
THESE       54.8   109.5   133.9    85.1    48.4   102.7   104.9
           0.342   0.159   0.209   0.243   0.489   0.333   0.295    CHI-SQUARE   67.56

40.         68.0    26.0   277.0    65.0    39.0    60.0    46.0    581.
MAY         49.8    99.6   121.7    77.3    44.0    93.4    95.4
           0.337   0.065   0.562   0.208   0.219   0.159   0.119    CHI-SQUARE  299.16
41.         30.0   130.0   117.0    77.0    49.0    63.0   103.0    569.
THERE       48.8    97.5   119.2    75.8    43.1    91.5    93.4
           0.149   0.323   0.238   0.246   0.275   0.167   0.267    CHI-SQUARE   28.74

42.         51.0    71.0   102.0   101.0    26.0    80.0   127.0    558.
HAS         47.8    95.6   116.9    74.3    42.3    89.7    91.6
           0.253   0.176   0.207   0.323   0.146   0.212   0.329    CHI-SQUARE   39.02

43.         26.0   422.0    10.0    10.0     4.0    57.0    23.0    552.
I           47.3    94.6   115.6    73.5    41.8    88.7    90.6
           0.129   1.047   0.020   0.032   0.022   0.151   0.060    CHI-SQUARE 1389.72

44.         39.0    62.0   142.0    69.0    46.0    89.0    95.0    542.
OTHER       46.5    92.9   113.6    72.2    41.1    87.1    89.0
           0.194   0.154   0.288   0.220   0.258   0.236   0.246    CHI-SQUARE   19.78

45.         32.0    69.0   165.0    50.0    50.0    77.0    94.0    537.
SOME        46.0    92.0   112.5    71.5    40.7    86.3    88.2
           0.159   0.171   0.335   0.160   0.281   0.204   0.243    CHI-SQUARE   44.53

46.         38.0    75.0   134.0    76.0    26.0    75.0   105.0    529.
MORE        45.3    90.7   110.8    70.4    40.1    85.0    86.9
           0.189   0.186   0.272   0.243   0.146   0.198   0.272    CHI-SQUARE   19.09

47.         16.0   136.0    27.0    28.0    32.0    67.0   212.0    518.
WERE        44.4    88.8   108.5    69.0    39.2    83.3    85.1
           0.079   0.337   0.055   0.089   0.180   0.177   0.549    CHI-SQUARE  322.78

48.          7.0   258.0    34.0    19.0    17.0    59.0   116.0    510.
HAD         43.7    87.4   106.8    67.9    38.6    82.0    83.7
           0.035   0.640   0.069   0.061   0.095   0.156   0.300    CHI-SQUARE  479.54

49.         27.0   102.0   126.0    22.0    13.0    80.0   128.0    498.
THEIR       42.7    85.4   104.3    66.3    37.7    80.0    81.8
           0.134   0.253   0.256   0.070   0.073   0.212   0.332    CHI-SQUARE   85.43

50.         29.0    34.0   141.0   150.0    39.0    68.0    30.0    491.
USED        42.1    84.2   102.9    65.4    37.2    78.9    80.6
           0.144   0.084   0.286   0.479   0.219   0.180   0.078    CHI-SQUARE  191.07

51.         33.0    38.0   115.0    75.0    53.0    84.0    90.0    488.
MANY        41.8    83.6   102.2    65.0    37.0    78.4    80.1
           0.164   0.094   0.233   0.240   0.298   0.222   0.233    CHI-SQUARE   38.49

52.         24.0   110.0   132.0    51.0    41.0    82.0    41.0    481.
SO          41.2    82.4   100.8    64.0    36.4    77.3    79.0
           0.119   0.273   0.268   0.163   0.230   0.217   0.106    CHI-SQUARE   47.87

53.         67.0    36.0    96.0    37.0    87.0   109.0    44.0    476.
EACH        40.8    81.6    99.7    63.4    36.1    76.5    78.2
           0.332   0.089   0.195   0.118   0.489   0.288   0.114    CHI-SQUARE  154.13

54.         33.0    77.0    52.0    75.0    67.0    79.0    82.0    465.
TWO         39.9    79.7    97.4    61.9    35.2    74.7    76.4
           0.164   0.191   0.106   0.240   0.376   0.209   0.212    CHI-SQUARE   54.55

55.         33.0    96.0    70.0    35.0    79.0    88.0    62.0    463.
ABOUT       39.7    79.4    97.0    61.6    35.1    74.4    76.0
           0.164   0.238   0.142   0.112   0.444   0.233   0.161    CHI-SQUARE   83.75

56.         51.0    41.0   238.0    38.0    29.0    31.0    11.0    439.
SHOULD      37.6    75.2    92.0    58.4    33.3    70.6    72.1
           0.253   0.102   0.483   0.121   0.163   0.082   0.028    CHI-SQUARE  333.82

57.         39.0   121.0    55.0    12.0    55.0   107.0    39.0    428.
WHAT        36.7    73.4    89.7    57.0    32.4    68.8    70.3
           0.194   0.300   0.112   0.038   0.309   0.283   0.101    CHI-SQUARE  130.87

58.         24.0    59.0   101.0    64.0    31.0    65.0    81.0    425.
THAN        36.4    72.8    89.0    56.6    32.2    68.3    69.8
           0.119   0.146   0.205   0.204   0.174   0.172   0.210    CHI-SQUARE   11.46

59.         41.0    94.0    58.0    42.0    26.0    63.0   100.0    424.
BEEN        36.3    72.7    88.8    56.4    32.1    68.1    69.6
           0.203   0.233   0.118   0.134   0.146   0.167   0.259    CHI-SQUARE   36.05

60.         19.0   105.0    54.0    82.0    11.0    74.0    78.0    423.
INTO        36.3    72.5    88.6    56.3    32.0    68.0    69.5
           0.094   0.261   0.110   0.262   0.062   0.196   0.202    CHI-SQUARE   63.42

61.         38.0    97.0   128.0    20.0    35.0    59.0    44.0    421.
THEM        36.1    72.2    88.2    56.0    31.9    67.7    69.1
           0.189   0.241   0.260   0.064   0.197   0.156   0.114    CHI-SQUARE   60.34

62.         46.0    58.0   121.0    55.0    56.0    54.0    27.0    417.
USE         35.7    71.5    87.4    55.5    31.6    67.0    68.5
           0.228   0.144   0.246   0.176   0.314   0.143   0.070    CHI-SQUARE   64.96

63.         40.0    40.0   173.0    72.0    21.0    36.0    29.0    411.
MAKE        35.2    70.5    86.1    54.7    31.1    66.1    67.5
           0.198   0.099   0.351   0.230   0.118   0.095   0.075    CHI-SQUARE  145.87

64.         45.0    82.0   115.0    33.0    45.0    66.0    20.0    406.
DO          34.8    69.6    85.1    54.1    30.8    65.3    66.7
           0.223   0.203   0.233   0.105   0.253   0.175   0.052    CHI-SQUARE   63.22

65.         43.0   110.0    64.0    82.0     7.0    52.0    45.0    403.
UP          34.5    69.1    84.4    53.7    30.5    64.8    66.2
           0.213   0.273   0.130   0.262   0.039   0.138   0.117    CHI-SQUARE   73.66

66.         35.0    48.0   103.0    49.0    30.0    65.0    70.0    400.
SUCH        34.3    68.6    83.8    53.3    30.3    64.3    65.7
           0.174   0.119   0.209   0.157   0.168   0.172   0.181    CHI-SQUARE   11.21

67.         45.0    94.0    62.0    61.0    47.0    69.0    22.0    400.
THEN        34.3    68.6    83.8    53.3    30.3    64.3    65.7
           0.223   0.233   0.126   0.195   0.264   0.183   0.057    CHI-SQUARE   58.19

68.         58.0    67.0   109.0    30.0    16.0    52.0    61.0    393.
TIME        33.7    67.4    82.3    52.3    29.8    63.2    64.5
           0.288   0.166   0.221   0.096   0.090   0.138   0.158    CHI-SQUARE   44.23

69.         21.0    77.0    40.0    55.0     4.0    98.0    91.0    386.
ITS         33.1    66.2    80.9    51.4    29.2    62.0    63.4
           0.104   0.191   0.081   0.176   0.022   0.259   0.236    CHI-SQUARE   81.76

70.         36.0    85.0    40.0    29.0    50.0    92.0    37.0    369.
WOULD       31.6    63.3    77.3    49.1    27.9    59.3    60.6
           0.179   0.211   0.081   0.093   0.281   0.243   0.096    CHI-SQUARE   78.94
U F W O R D S I N S U B J E C T T A B L E XXXVII D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 M O S T F R E Q U E N T W O R D T Y P E S A C R O S S T H E S U B J E C T A R E A S O F T H E C O R P U S RiN.< S U B J E C T S W O R D B C D E F G H T O T A L 7 1 . 3 9 . 0 6 4 . 0 5 7 . 0 3 3 . 0 7 9 . 0 7 5 . 0 2 1 . 0 3 6 8 . HOW 3 1 . 5 6 3 . 1 7 7 . 1 4 9 . 0 2 7 . 9 5 9 . 1 6 0 . 4 0 . 1 9 4 0 . 1 5 9 0 . 1 1 6 0 . 1 0 5 0 . 4 4 4 0 . 1 9 8 0 . 0 5 4 C H I - S O U A R E 1 3 5 . 9 9 7 2 . 3 8 . 0 7 . 0 2 2 . 0 1 4 . 0 2 2 8 . 0 4 7 . 0 1 6 . 0 3 6 6 . N U M f c E R 3 1 . 4 6 2 . 7 7 6 . 7 4 8 . 7 2 7 . 7 5 8 . 8 6 0 . 1 0 . 1 8 9 0 . 0 1 7 0 . 0 4 5 0 . 0 4 5 1 . 2 8 0 0 . 1 2 4 0 . 0 4 1 • C H I - S Q U A K E 1 5 9 6 . 2 8 7 3 . - 2 4 . 0 4 2 . 0 1 0 9 . 0 9 d . O 5 . 0 3 8 . 0 4 7 . 0 3 6 3 . M A D E 3 1 . 1 6 2 . 2 7 6 . 1 4 8 . 3 2 7 . 5 5 8 . 3 5 9 . 6 0 . 1 1 9 0 . 1 0 4 0 . 2 2 1 0 . 3 1 3 0 . C 2 8 0 . 1 0 1 0 . 1 2 2 C H I - S Q U A R E 1 0 1 . 7 0 7 4 . 3 4 . 0 1 0 8 . 0 6 7 . 0 5 1 . 0 1 5 . 0 3 9 . 0 4 4 . 0 3 5 8 . O J T 3 0 . 7 6 1 . 4 7 5 . 0 4 7 . 7 2 7 . 1 5 7 . 5 5 8 . 8 0 . 1 6 9 0 . 2 6 8 0 . 1 3 6 0 . 1 6 3 0 . 0 8 4 0 . 1 0 3 0 . 1 1 4 C H I - S Q U A R E 5 1 . 9 9 7 5 . 2 3 . 0 4 7 . 0 7 4 . 0 6 1 . 0 8 . 0 5 0 . 0 9 0 . 0 3 5 3 . M O S T 3 0 . 3 6 0 . 5 7 4 . 0 4 7 . 0 2 6 . 7 5 6 . 7 5 8 . 0 0 . 1 1 4 0 . 1 1 7 0 . 1 5 0 0 . 1 9 5 0 . 0 4 5 0 . 1 3 2 0 . 2 3 3 C H I - S Q U A R E 4 0 . 5 6 T H E T H R E E L I N E S O F F I G U R E S F O R E A C H E N T R Y R E P R E S E N T : F R E w U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S %, O F F R E Q . T O T U T A L N O . O F W O R D S I N S U B J E C T T A B L E XXXV/I D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 M O S T F R E Q U E N T W O R D T Y P E S A C R O S S T H E S U B J E C T A R E A S O F T H E C O R P U S R A N K S U B J E C T S W O R D B C D E F G H T O T A L 7 6 . 1 7 . 0 6 4 . 0 6 1 . 0 3 2 . 0 3 7 . 0 8 6 . 0 5 4 . 0 3 5 1 . 
O N L Y 3 0 . 1 6 0 . 2 7 3 . 5 4 6 . 7 2 6 . 6 5 6 . 4 5 7 . 6 0 . 0 8 4 0 . 1 5 9 0 . 1 2 4 0 . 1 0 2 0 . 2 0 8 0 . 2 2 8 0 . 1 4 0 C H I - S Q U A R E 3 2 . 5 4 7 7 . 3 2 . 0 1 1 2 . 0 6 8 . 0 1 8 . 0 2 8 . 0 5 2 . 0 4 5 . 0 3 4 9 . N O 2 9 . 9 5 9 . 8 7 3 . 1 4 6 . 5 2 6 . 4 5 6 . 1 5 7 . 3 0 . 1 5 9 0 . 2 7 8 0 . 1 3 8 0 . 0 5 8 0 . 1 5 7 0 . 1 3 8 0 . 1 1 7 C H I - S Q U A R E 6 6 . 4 8 7 8 . 5 0 . 0 5 7 . 0 7 3 . 0 8 3 . 0 1 8 . 0 4 4 . 0 1 7 . 0 3 4 2 . M U S T 2 9 . 3 5 8 . 6 7 1 . 7 4 5 . 5 2 5 . 9 5 5 . 0 5 6 . 2 0 . 2 4 8 0 . 1 4 1 0 . 1 4 8 0 . 2 6 5 0 . 1 U 1 0 . 1 1 6 0 . 0 4 4 C H I - S O U A R E 7 7 . 4 0 7 9 . 6 . 0 3 0 . 0 5 1 . 0 3 4 . 0 5 . 0 1 4 6 . 0 6 8 . 0 3 4 0 . W A T E R 2 9 . 1 5 8 . 3 7 1 . 2 4 5 . 3 ? 5 . 6 5 4 . 6 5 5 . 8 0 . 0 3 0 0 . 0 7 4 0 . 1 0 4 0 . 1 0 9 0 . 0 2 8 0 . 3 8 6 0 . 1 7 6 C H I - S Q U A R E 2 1 2 . 7 5 8 0 . 2 3 . 0 2 1 . 0 8 4 . 0 6 8 . 0 1 8 0 . 0 4 2 . 0 5 6 . 0 3 1 2 . A L S O 2 6 . 7 5 3 . 5 6 5 . 4 4 1 . 5 2 3 . 6 5 0 . 1 5 1 . 2 0 . 1 1 4 0 . 0 5 2 0 . 1 7 1 0 . 2 1 7 1 . 0 1 1 0 . 1 1 1 0 . 1 4 5 C H I - S Q U A R E 1 0 7 8 . 8 3 T H E T H R E E L I N E S O F F I G U R E S F C R E A C H E N T R Y R E P R E S E N T : F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S %, O F F R E Q . T O T O T A L N O . O F W O R D S I N S U B J E C T oo T A B L E XXXVII D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 M O S T F R E O U E N T W O R D T Y P E S A C R O S S T H E S U B J E C T A R E A S O F T H E C O R P U S R A N K S U B J E C T S W ORD 3 C 0 E F G H T O T A L 8 1 . 2 7 . 0 4 7 . 0 5 4 . 0 4 1 . 0 3 5 . 0 4 2 . 0 6 2 . 0 3 0 8 . F I R S T 2 6 . 4 5 2 . 3 6 4 . 5 4 1 . 0 2 3 . 3 4 9 . 5 5 0 . 6 0 . 1 3 4 0 . 1 1 7 0 . 1 1 0 0 . 1 3 1 0 . 1 9 7 0 . 1 1 1 0 . 1 6 1 C H I - S 3 U & R E 1 1 . 9 2 8 2 . 2 7 . 0 4 8 . 0 6 5 . 0 4 3 . 0 2 0 . C 6 6 . 0 3 7 . 0 3 0 6 . V E R Y 2 6 . 2 5 2 . 5 6 4 . 1 4 0 . 7 2 3 . 2 4 9 . 2 5 0 . 2 0 . 1 3 4 0 . 1 1 9 0 . 1 3 2 0 . 1 3 7 0 . 1 1 2 0 . 
1 7 5 0 . 0 9 6 C H I - S Q U A R E 1 0 . 2 2 8 3 . 4 1 . 0 5 1 . 0 1 4 0 . 0 3 3 . 0 7 . 0 1 6 . 0 1 5 . 0 3 0 3 . G C O D 2 6 . 0 5 1 . 9 6 3 . 5 4 0 . 3 2 3 . 0 4 8 . 7 4 9 . 8 0 . 2 0 3 0 . 1 2 7 0 . 2 8 4 0 . 1 0 5 0 . 0 3 9 0 . 0 4 2 0 . 0 3 9 C H I - S Q U A K E 1 5 9 . 6 0 8 4 . 1 2 . 0 1 6 0 . 0 7 6 . 0 1 . 0 1 2 . 0 2 5 . 0 9 . 0 2 9 5 . H I M 2 5 . 3 5 0 . 6 6 1 . 8 3 9 . 3 2 2 . 3 4 7 . 4 4 8 . 4 0 . 0 6 0 0 . 3 9 7 0 . 1 5 4 0 . 0 0 3 0 . 0 6 7 0 . 0 6 6 C . 0 2 3 C H I - S Q U A K E 3 3 1 . 8 7 8 5 . 2 5 . 0 1 6 . 0 4 6 . 0 4 0 . 0 5 3 . 0 8 8 . 0 1 8 . 0 2 8 5 . S A * E 2 4 . 5 4 9 . 0 5 9 . 9 3 8 . 1 2 1 . 7 4 6 . 0 4 7 . 0 0 . 1 2 4 0 . 0 4 0 0 . 0 9 3 0 . 1 2 8 0 . 2 9 8 0 . 2 3 3 C . 0 4 7 C H I - S Q U A R E 1 2 7 . 2 2 T H E THREE L I N E S O F F I G U R E S F O R E A C H E N T R Y R E P R E S E N T : F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S i, O F F R E Q . T O T O T A L N O . O F W O R D S I N S U B J E C T T A B L E XXXV// D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 M O S T F R E O U E N T WORO T Y P E S A C R O S S T h E S U B J E C T A R E A S OF T H E C O R P U S R A N K S U B J E C T S W ORD B C 0 E F G H T O T A L 8 6 . 7 . 0 8 3 . 0 1 2 . 0 2 3 . 0 4 5 . C 5 0 . 0 4 0 . 0 2 6 0 . C O U L D 2 2 . 3 4 4 . 6 5 4 . 5 3 4 . 6 1 9 . 7 4 1 . 8 4 2 . 7 0 . 0 3 5 0 . 2 0 6 0 . 0 2 4 0 . 0 7 3 0 . 2 5 3 0 . 1 3 2 0 . 1 C 4 C H I - S Q U A R E 1 1 4 . 9 5 8 7 . 5 2 . 0 7 0 . 0 6 3 . 0 1 1 . 0 4 . 0 1 9 . 0 4 1 . 0 2 6 0 . WHO 2 2 . 3 4 4 . 6 5 4 . 5 3 4 . 6 1 9 . 7 4 1 . 8 4 2 . 7 0 . 2 5 8 0 . 1 7 4 0 . 1 2 8 0 . 0 3 5 0 . 0 2 2 0 . 0 5 0 0 . 1 C 6 C H I - S Q U A R E 9 6 . 5 6 88. 3 1 . 0 3 7 . 0 5 9 . 0 2 5 . 0 3 3 . 0 4 7 . 0 2 3 . 0 2 5 5 . A N Y 2 1 . 9 4 3 . 7 5 3 . 4 3 3 . 9 1 9 . 3 4 1 . 0 4 1 . 9 0 . 1 5 4 0 . 0 9 2 0 . 1 2 0 0 . 0 8 0 0 . 1 8 5 0 . 1 2 4 0 . 0 6 0 C h l - S O U A K E 2 6 . 8 3 8 9 . 2 6 . 0 3 0 . 0 8 1 . 0 3 2 . 0 1 0 . 0 4 5 . 0 3 C . 0 2 5 4 . B E C A U S E 2 1 . 8 4 3 . 
5 5 3 . 2 3 3 . 3 1 9 . 2 4 0 . 8 4 1 . 7 0 . 1 2 9 0 . 0 7 4 0 . 1 6 4 0 . 1 0 2 0 . 0 5 6 0 . 1 1 9 0 . 0 7 8 C H I - S Q U A R E 2 7 . 7 9 9 0 . 1 4 . 0 6 5 . 0 3 7 . 0 2 2 . 0 3 3 . 0 6 2 . 0 2 0 . 0 2 5 3 . S E E 2 1 . 7 4 3 . 4 5 3 . 0 3 3 . 7 1 9 . 2 4 0 . 7 4 1 . 5 0 . 0 6 9 0 . 1 6 1 0 . 0 7 5 0 . 0 7 0 0 . 1 8 5 0 . 1 6 4 0 . 0 5 2 C H I - S Q U A R E 5 4 . 7 6 THE THREE L I N E S OF FIGURES FUR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY RATIO AS S, OF FREQ. TO TOTAL NO. OF WORDS I N SUBJECT ts) cr> TABLE XXXVH D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 H O S T F R E O U E N T WORD T Y P E S A C R O S S T H E S U B J E C T A R E A S O F T H E C O R P U S R AN< S U B J E C T S W C R D B C D fc F G H T O T A L 5 1 . 2 7 . 0 7 2 . 0 5 2 . 0 1 3 . 0 1 3 . 0 3 3 . 0 3 0 . 0 2 4 5 . L I K E 2 1 . 0 4 2 . 0 5 1 . 3 3 2 . 6 I B . 6 3 9 . 4 4 0 . 2 0 . 1 3 4 0 . 1 7 9 0 . 1 0 6 0 . 0 4 2 0 . C 7 3 0 . 1 0 1 0 . 0 7 8 C H I - S Q U A R E 3 9 . 2 7 9 2 . 2 5 . 0 3 1 . 0 4 2 . 0 2 8 . 0 1 4 . 0 4 0 . 0 5 9 . 0 2 3 9 . M J C H 2 0 . 5 4 1 . 0 5 0 . 1 3 1 . 8 1 8 . 1 3 8 . 4 3 9 . 2 0 . 1 2 4 0 . 0 7 7 0 . 0 8 5 0.0e9 0 . 0 7 9 0 . 1 0 6 0 . 1 5 3 C H I - S Q U A R E 1 6 . 1 1 9 3 . 2 8 . 0 4 0 . 0 5 3 . 0 7 . 0 8 . 0 1 6 . 0 8 4 . 0 2 3 6 . P E O P L E 2 0 . 2 4 0 . 5 4 9 . 4 3 1 . 4 1 7 . 9 3 7 . 9 3 8 . 8 0 . 1 3 9 0 . C 9 9 0 . 1 0 8 0 . 0 2 2 0 . C 4 5 0 . 0 4 2 0 . 2 1 8 C H I - S Q U A R E 9 3 . 1 8 5 4 . 1 7 . 0 2 9 . 0 1 8 . 0 4 4 . 0 4 1 . 0 5 7 . 0 2 8 . 0 2 3 4 . C A L L E D 2 0 . 1 4 0 . 1 4 9 . 0 3 1 . 2 1 7 . 7 3 7 . 6 3 8 . 4 0 . 0 8 4 0 . 0 7 2 0 . 0 3 7 0 . 1 4 1 0 . 2 3 0 0 . 1 5 1 C . 0 7 3 C H I - S Q U A R E 7 1 . 8 7 5 5 . 1 6 . 0 2 2 . 0 6 1 . 0 3 7 . 0 2 0 . 0 5 9 . 0 1 8 . 0 2 3 3 . P L A C E 2 0 . 0 3 9 . 9 4 8 . 8 3 1 . 0 1 7 . 6 3 7 . 4 3 6 . 3 0 . 0 7 9 0 . 0 5 5 0 . 1 2 4 0 . 1 1 8 0 . 1 1 2 0 . 1 5 6 0 . 0 4 7 C H I - S Q U A R E 3 6 . 
4 9 T H E T h R E E L I N E S U F F I G U R E S F O R E A C H E N T R Y R E P R E S E N T : F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S I, U F F R E Q . T O T U T A L N O . U F W O R D S I N S U B J E C T T A B L E XXXV// D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 M O S T F R E Q U E N T W O R D T Y P E S A C R O S S T H E S U B J E C T A R E A S O F T H E C O R P U S R A N K S U B J E C T S W C R D B C D E F G H T O T A L 9 6 . 1 0 . 0 3 4 . 0 3 4 . 0 5 4 . 0 8 . C 6 2 . 0 3 0 . 0 2 3 2 . T H R O U G H 1 9 . 9 3 9 . 8 4 8 . 6 3 0 . 9 1 7 . 6 3 7 . 3 3 8 . 1 0 . 0 5 0 0 . 0 8 4 0 . 0 6 9 0 . 1 7 3 0 . 0 4 5 0 . 1 6 4 C . 0 7 8 C H I - S Q U A R E 5 0 . 7 5 97. 3 9 . 0 2 1 . 0 5 8 . 0 6 0 . 0 13.C 2 0 . 0 2 1 . 0 2 3 2 . WORK 1 9 . 9 3 9 . 8 4 8 . 6 3 0 . 9 1 7 . 6 3 7 . 3 3 8 . 1 0 . 1 9 4 0 . 0 5 2 0 . 1 1 8 0 . 1 9 2 0 . 0 7 3 0 . 0 5 3 0 . 0 5 4 C H I - S Q U A R E 7 3 . 3 6 99. 2 2 . 0 4 2 . 0 2 7 . 0 1 2 . 0 9 . 0 2 9 . 0 8 6 . 0 2 2 8 . NEW 1 9 . 5 3 9 . 1 4 7 . 8 3 0 . 4 1 7 . 3 3 6 . 6 3 7 . 4 0 . 1 0 9 0 . 1 0 4 0 . 0 5 5 0 . 0 3 8 0 . 0 5 1 0 . 0 7 7 0 . 2 2 3 C H I - S Q U A R E 3 9 . 1 9 99. 1 7 . 0 1 4 . 0 4 4 . 0 3 7 . 0 1 0 . 0 5 2 . 0 4 9 . 0 2 2 3 . S M A L L 1 9 . 1 3 8 . 2 4 6 . 7 2 9 . 7 1 6 . 9 3 5 . 8 3 6 . 6 0 . 0 8 4 0 . 0 3 5 0 . 0 8 9 0 . 1 1 8 0 . 0 5 6 0 . 1 3 8 0 . 1 2 7 C H I - S O U A R E 3 1 . 8 3 1 0 0 . 1 0 . 0 4 4 . 0 4 5 . 0 3 8 . 0 6 . 0 3 3 . 0 4 4 . 0 2 2 0 . O V E R 1 8 . 9 3 7 . 7 4 6 . 1 2 9 . 3 1 6 . 7 3 5 . 4 3 6 . 1 0 . 0 5 0 O . 1 0 9 0 . 0 9 1 0 . 1 2 1 0 . 0 3 4 0 . 0 8 7 0 . 1 1 4 C H I - S Q U A R E 1 6 . 5 3 T H E 1 H R E E L I N E S O F F I G U R E S F O R E A C H E N T R Y R E P R E S E N T : F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S %, O F F R E Q . T O T U T A L N O . 
O F W O R D S I N S U U J E C T t>J o TABLE X X X V m D I S T R I B U T I O N OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE EIGHT RANK S U B J E C T S WORD 8 C 0 E F & H TOTAL 1 . 0 . 0 6 1 8 . 0 6 0 C . 0 3 3 6 . 0 4 o 8 . 0 8 8 3 . 0 9 5 4 . 0 3 8 5 9 . THE 0 . 0 5 2 8 . 5 8 3 4 . 4 3 3 7 . 7 5 1 6 . 6 7 2 3 . 5 8 1 8 . 3 0 . 0 7 . 1 8 2 5 . 2 5 2 7 . 2 6 6 6 . 6 1 7 8 . 9 1 3 6 . 5 1 4 CHI-SQUARE 1 2 8 . 2 3 2 . 0 . 0 2 9 4 . 0 2 b 9 . 0 1 4 0 . 0 3 4 0 . 0 4 2 0 . 0 4 6 5 . 0 1 9 4 9 . OF 0 . 0 3 1 7 . 4 4 2 1 . 4 1 7 0 . 6 2 6 0 . 9 3 6 5 . 4 4 1 3 . 3 0 . 0 3 . 4 1 7 2 . 5 3 0 3 . 0 2 8 4 . 0 0 7 4 . 2 3 9 4 . 1 5 0 CHI-SQUARE 8 7 . 4 1 3 . 0 . 0 2 4 9 . 0 3 6 5 . 0 1 5 2 . 0 1 3 0 . 0 1 8 0 . 0 3 8 6 . 0 1 6 4 2 . A N D 0 . 0 2 6 7 . 4 3 5 5 . 0 1 4 3 . 7 2 1 9 . 8 3 0 7 . 9 3 4 8 . 2 0 . 0 2 . 8 9 4 3 . 1 9 5 3 . 2 8 7 1 . G 3 3 1 . 8 1 7 3 . 4 4 5 CHI-SQUARE 9 5 . 9 2 4 . 0 . 0 2 3 0 . 0 3 4 2 . 0 1 3 8 . 0 1 8 1 . 0 3 1 1 . 0 2 3 4 . 0 1 6 3 6 . A 0 . 0 2 6 6 . 4 3 5 3 . 7 1 4 3 . 2 2 1 9 . 0 3 0 6 . 7 3 4 6 . 9 0 . 0 2 . 6 7 3 2 . 9 9 3 2 . 9 6 4 2 . 5 5 9 3 . 1 3 9 2 . 0 8 8 CHI-SQUARE 4 8 . 9 7 5 . 0 . 0 2 1 4 . 0 3 5 4 . 0 1 2 7 . 0 1 7 1 . 0 1 9 9 . 0 2 6 1 . 0 1 3 2 6 . TO 0 . 0 2 1 5 . 9 2 8 6 . 7 U o . O 1 7 7 . 5 2 4 8 . 6 2 8 1 . 2 0 . 0 2 . 4 8 7 3 . 0 9 8 2 . 7 4 7 2 . - . 1 0 2 . 0 0 9 2 . 3 2 9 CHI-SQUARE 2 8 . 4 3 THE THREE L I N E S OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTEC FREQUENCY RATIO AS %, OF FREQ. TO TOTAL NO. UF WORDS I N SUBJECT TABLE XXKVtll DISTRIBUTION OF OCCURRENCE OF THE 1 0 0 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE EIGHT RANK S U B J E C T S WORD B C D E F G H TOTAL 6 . 0 . 0 1 6 1 . 0 2 6 1 . 0 8 9 . 0 1 5 0 . 0 1 8 8 . C 2 5 9 . C 1 1 0 8 IN 0 . 0 1 8 0 . 4 2 3 9 . 6 9 7 . 0 1 4 8 . 3 2 0 7 . 7 2 3 5 . 0 0 . 0 1 . 8 7 1 2 . 2 8 4 1 . 9 2 5 2 . 1 2 1 1 . 8 9 8 2 . 3 1 1 . CHI-SQUA,'<E 9 . 0 2 7 . 0 . 0 7 5 . 0 1 8 3 . 0 1 4 5 . 0 1 6 7 . 0 1 5 4 . 
0 1 4 6 . 0 8 7 1 I S 0 . 0 1 4 1 . 6 1 8 8 . 3 7 6 . 2 1 1 6 . 6 1 6 3 . 3 1 3 4 . 7 0 . 0 0 . 8 7 2 1 . 6 0 2 3 . 1 3 6 2 . 3 6 1 1 . 5 5 4 1 . 3 0 3 CHI-SQUARE 1 2 4 . 1 5 8 . 0 . 0 1 3 7 . 0 1 1 8 . 0 1 2 . 0 1 3 3 . 0 9 7 . 0 7 0 . 0 5 6 7 THAT 0 . 0 9 2 . 3 1 2 2 . 6 4 9 . 6 7 5 . 9 1 0 6 . 3 1 2 0 . 2 0 . 0 1 . 5 9 2 1 . 0 3 3 0 . 2 6 0 1 . B 8 0 0 . 9 7 9 0 . 6 2 5 CHI-SQUARE 1 1 5 . 0 6 9 . 0 . 0 1 2 3 . 0 1 3 6 . 0 6 1 . 0 5 1 . 0 1 0 1 . 0 6 0 . 0 5 3 2 IT 0 . 0 8 6 . 6 1 1 5 . 0 4 6 . 6 7 1 . 2 9 9 . 7 1 1 2 . 8 0 . 0 1 . 4 2 9 1 . 1 9 0 1 . 3 1 9 0 . 7 2 1 1 . 0 1 9 0 . 5 3 5 CHI-SQUARE 5 4 . 0 4 1 0 . 0 . 0 2 7 . 0 1 8 3 . 0 4 2 . 0 7 6 . 0 6 6 . 0 9 5 . 0 4 8 9 ARE 0 . 0 7 9 . 6 1 0 5 . 7 4 2 . 8 6 5 . 5 9 1 . 7 1 0 3 . 7 0 . 0 0 . 3 1 4 1 . 6 0 2 0 . 9 0 8 1 . 0 7 5 0 . 6 6 6 0 . 8 4 8 CHI-SQUARE 1 0 0 . 8 9 THE THREE L I N E S OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY RATIO AS %, OF FREQ. TO TOTAL NO. OF WORDS I N SUBJECT TABLE xxxvm D I S T R I B U T I O N OF OCCURRENCE OF THE 100 MOST FREUUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE EIGHT R AN< S U B J E C T S WCRD B C D E F G H TOTAL 11. 0.0 54.0 175.0 4 4 . 0 0 4 . 0 4 3 . 0 99.0 4 9 9 . FOR 0.0 81.3 107.9 4 3 . 7 66.S 93.6 105.8 0.0 0.628 1.532 0.952 1.188 0.434 0.884 CHI-SQUARE 8 3 . 0 8 12. 0.0 83.0 24 3 . 0 3 1 . 0 6 3 . 0 116.0 9.0 5 4 5 . YOU 0.0 83.8 117.8 47.7 73.0 102.2 115.6 0.0 0.965 2.127 0.670 0.891 1.171 0.080 CHI-SQUARE 2 4 0 . 6 5 13. 0.0 40.0 189.0 3 7 . 0 4 9 . 0 5 4 . 0 49.0 4 1 8 . BE 0.0 68.1 90.4 36.6 56.0 78.4 88.6 0.0 0.465 1.654 0.800 0.693 0.545 6.437 CHI-SQUARE 145.36 14. 0.0 69.0 110.0 3 9 . 0 4 7 . 0 6 7 . 0 73.0 4 0 5 . AS 0.0 66.0 67.6 3 5 . 4 54.2 75.9 85.9 0.0 0.802 0.963 0.843 0.664 0.676 0.651 CHI-SQUAKE 1 0 . 1 9 15. 0.0 35.0 153.0 3 4 . 0 18."o 7 1 . 0 58.0 3 6 9 . OR 0.0 60.1 79.8 3 2 . 3 49.4 6 9 . 
2 78.2 0.0 0.407 1.339 0.735 0.254 0.717 0.518 CHI-SQUARE 102.99 THE THREE L I N E S OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY RATIU AS %, UF FREQ. TO TOTAL NO. OF WORDS I N SUBJECT T A 8 L E XXXVIII DISTRIBUTION OF OCCURRENCE CF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE EIGHT RANK S U B J E C T S WORD B C D E F G H TOTAL 16. 0.0 51.0 66.0 27.0 42.0 4 9 . 0 63.0 2 9 8 . WITH 0.0 48.5 64.4 2 6 . 1 39.9 5 5 . 9 63.2 0.0 0.593 0.578 0.584 0.594 0.495 C.562 CHI-SQUARE 1.15 17. 0.0 44.0 67.0 31.0 25.0 7 8 . 0 95.0 3 4 0 . ON 0.0 55.4 73.5 29.8 45.5 63.7 72.1 0.0 0.511 0.586 0.670 0.353 0.787 0.848 CHI-SQUARE 22.67 18. 0.0 21.0 21.0 2 1 . 0 79.0 6 0 . 0 61.0 2 6 3 . THIS 0.0 42.8 56.9 23.0 35.2 4 9 . 3 55.8 0.0 0.244 0.184 0.454 1.117 0.606 0.544 CHI-SOUARE 9 1 . 2 1 19. 0.0 26.0 40.0 19.0 62.0 3 5 . 0 91.0 2 7 3 . BY 0.0 44.5 59.0 23.9 36.5 51.2 57.9 0.0 0.302 0.350 0.411 0.877 0.353 0.812 CHI-SQUARE 56.58 20. 0.0 101.0 14.0 6.0 14.0 3 8 . 0 105.0 2 7 8 . WAS 0.0 4 5 . 3 60.1 24.3 37.7 52.1 59.0 0.0 1.174 0.123 0.130 0.198 0.384 0.937 CHI-SQUAKE 172.05 THE THREE L I N E S UF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY RATIU AS X, OF FREQ. TO TOTAL NO. UF WORDS IN SUBJECT T A B L E XXXVIII D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 M O S T F R E O U E N T W O R D T Y P E S A C R O S S T H E S U B J E C T A R E A S O F G R A O E E I G H T R A N K S U B J E C T S W O R D B C D E F G H T O T A L 2 1 . 0 . 0 9 8 . 0 1 9 . 0 1 . 0 1 3 . 0 4 6 . 0 7 6 . 0 2 5 3 . H E 0 . 0 4 1 . 2 5 4 . 7 2 2 . 1 3 3 . 9 4 7 . 4 5 3 . 7 0 . 0 1 . 1 3 9 0 . 1 6 6 0 . 0 2 2 0 . 1 8 4 0 . 4 6 4 0 . 6 7 8 C H I - S O U A R E 1 4 4 . 0 0 2 2 . 0 . 0 3 0 . 0 3 4 . 0 2 3 . 0 1 7 . 0 6 9 . 0 7 5 . 0 2 4 8 . F R O M 0 . 0 4 0 . 4 5 3 . 6 2 1 . 7 3 3 . 2 4 6 . 5 5 2 . 6 0 . 0 0 . 3 4 9 0 . 2 9 8 0 . 4 9 7 0 . 2 4 0 0 . 6 9 6 0 . 6 6 9 C H I - S Q U A R E 3 8 . 2 7 2 3 . 0 . 0 4 4 . 0 9 8 . 0 1 2 . 0 4 2 . 0 3 0 . 
0 3 7 . 0 2 6 3 . H A V E 0 . 0 4 2 . 8 5 6 . 9 2 3 . 0 3 5 . 2 4 9 . 3 5 5 . 8 0 . 0 0 . 5 1 1 0 . 8 5 8 0 . 2 6 0 0 . 5 9 4 0 . 3 0 3 C . 3 3 0 C H I - S O U A R E 5 0 . 2 5 2 4 . 0 . 0 4 7 . 0 5 7 . 0 1 0 . 0 1 3 . 0 7 4 . 0 4 0 . 0 2 4 1 . A T 0 . 0 3 9 . 2 5 2 . 1 2 1 . 1 3 2 . 3 4 5 . 2 5 1 . 1 0 . 0 0 . 5 4 6 0 . 4 9 9 0 . 2 1 6 0 . 1 8 4 0 . 7 4 7 0 . 3 5 7 C H I - S Q U A R E 4 0 . 1 1 2 5 . C O 2 2 . 0 4 2 . 0 3 1 . 0 3 5 . 0 5 9 . 0 6 8 . 0 2 5 7 . W H I C H 0 . 0 4 1 . 9 5 5 . 6 2 2 . 5 3 4 . 4 4 8 . 2 5 4 . 5 0 . 0 0 . 2 5 6 0 . 3 6 8 0 . 6 7 0 0 . 4 9 5 0 . 5 9 6 0 . 6 0 7 C H I - S O U A R E 2 1 . 7 3 T H E T H R E E L I N E S O F F I G U R E S F O R E A C H E N T R Y R E P R E S E N T : F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I U A S it O F F R E Q . T O T O T A L N O . O F W U R D S I N S U B J E C T T A B L E XXXVIII D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 M O S T F R E Q U E N T WORO T Y P E S A C R O S S T H E S U B J E C T A R E A S O F G R A D E E I G H T R A N K S U B J E C T S W ORD B C D E F G H T O T A L 2 6 . 0 . 0 3 3 . 0 3 2 . 0 1 7 . 0 5 4 . 0 5 3 . 0 3 6 . 0 2 2 5 . O N E 0 . 0 3 6 . 6 4 8 . 7 1 9 . 7 3 0 . 1 4 2 . 2 4 7 . 7 0 . 0 0 . 3 6 3 0 . 2 8 0 0 . 3 6 8 0 . 7 6 3 0 . 5 3 5 0 . 3 2 1 C H I - S Q U A R E 3 1 . 0 1 2 7 . 0 . 0 3 5 . 0 5 0 . 0 1 0 . 0 3 2 . 0 2 7 . 0 3 8 . 0 1 9 2 . N O T 0 . 0 3 1 . 3 4 1 . 5 1 6 . 8 2 5 . 7 3 6 . 0 4 0 . 7 0 . 0 0 . 4 0 7 0 . 4 3 3 0 . 2 1 6 0 . 4 5 2 0 . 2 7 3 0 . 3 3 9 C H I - S Q U A R E 8 . 9 1 2 8 . 0 . 0 2 0 . 0 6 0 . 0 1 4 . 0 3 9 . 0 5 5 . 0 8 . 0 1 9 6 . C A N 0 . 0 3 1 . 9 4 2 . 4 1 7 . 2 2 5 . 2 3 6 . 7 4 1 . 6 0 . 0 0 . 2 3 2 0 . 5 2 5 0 . 3 0 3 0 . 5 5 1 0 . 5 5 5 C . 0 7 1 C H I - S Q U A R E 5 4 . 7 3 2 9 . 0 . 0 3 6 . 0 1 2 8 . 0 1 0 . 0 1 4 . 0 6 2 . 0 5 . 0 2 5 5 . Y O U R 0 . 0 4 1 . 5 5 5 . 1 2 2 . 3 3 4 . 1 4 7 . 8 5 4 . 1 0 . 0 0 . 4 1 8 1 . 1 2 0 0 . 2 1 6 0 . 1 9 8 0 . 6 2 6 0 . 0 4 5 C H I - S Q U A R E 1 6 4 . 4 5 30. 0 . 0 4 9 . 
0 6 6 . 0 8 . 0 5 . 0 3 6 . 0 6 6 . 0 2 3 0 . T H E Y 0 . 0 3 7 . 5 4 9 . 7 2 0 . 1 3 0 . 8 4 3 . 1 4 8 . 8 0 . 0 0 . 5 6 9 0 . 5 7 8 0 . 1 7 3 0 . 0 7 1 0 . 3 6 3 0 . 5 8 9 C H I - S Q U A R E 4 5 . 0 5 NJ T H E T H R E F L I N E S O F F I G U R E S F O R E A C h E N T R Y R E P R E S E N T : w F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S Z, O F F R E Q . T O T O T A L N O . O F W O R O S I N S U B J E C T T A E L E X X X V / l f D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 M O S T F R E Q U E N T W C R D T Y P E S A C R O S S T H E S U B J E C T A R E A S O F G R A D E E I G H T RANK S U B J E C T S W O R D B C D E F G H T O T A L 31. 0 . 0 2 8 . 0 1 4 . 0 6 . 0 1 7 4 . 0 2 3 . 0 8 . 0 2 5 3 . W E 0 . 0 4 1 . 2 5 4 . 7 2 2 . 1 3 3 . 9 4 7 . 4 5 3 . 7 0 . 0 0 . 3 2 5 0 - 1 2 3 0 . 1 3 0 2 . 4 6 0 0 . 2 3 2 0 . 0 7 1 C H I - S I J A R E 6 7 7 . 5 6 3 2 . 0 . 0 6 9 . 0 2 6 . 0 5 . 0 1 0 . C 3 6 . 0 5 8 . 0 2 0 4 . H I S 0 . 0 3 3 . 2 4 4 . 1 1 7 . 9 2 7 . 3 3 8 . 2 4 3 . 3 0 . 0 0 . 8 0 2 0 . 2 2 8 0 . 1 0 8 0 . 1 4 1 0 . 3 6 3 0 . 5 1 8 C H I - S Q U A R E 7 1 . 3 4 33. 0 . 0 1 9 . 0 6 7 . 0 1 5 . 0 2 4 . C 2 5 . 0 1 4 . 0 1 6 4 . W I L L 0 . 0 2 6 . 7 3 5 . 5 1 4 . 4 2 2 . 0 3 0 . 7 3 4 . 8 0 . 0 0 . 2 2 1 0 . 5 8 6 0 . 3 2 4 0 . 3 3 9 0 . 2 5 2 6 . 1 2 5 C H I - S Q U A R E 43.98 34. 0 . 0 2 1 . 0 6 2 . 0 8 . 0 3 1 . C 4 3 . 0 1 0 . 0 1 7 5 . I F 0 . 0 2 8 . 5 3 7 . 8 1 5 . 3 2 3 . 4 3 2 . 8 3 7 . 1 0 . 0 0 . 2 4 4 0 . 5 4 3 0 . 1 7 3 0 . 4 3 8 0 . 4 3 4 0 . 0 8 9 C H 1 - S 3 U A 3 E 4 6 . 3 1 35. 0 . 0 1 8 . 0 4 5 . 0 2 7 . 0 1 6 . * 0 3 3 . 0 3 7 . 0 1 7 6 . A N 0 . 0 2 8 . 7 3 8 . 1 1 5 . 4 2 3 . 6 3 3 . 0 3 7 . 3 0 . 0 0 . 2 0 9 0 . 3 9 4 0 . 5 8 4 0 . 2 2 6 0 . 3 3 3 0 . 3 3 0 C H I - S Q U A R E 1 6 . 4 0 T H E T H R E E L I N E S O F F I G U R E S F O R E A C H E N T R Y R E P R E S E N T : F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S % , O F F R E Q . T O T O T A L N O . 
UF W O R D S I N S U B J E C T T A B L E X X X V M / D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 M O S T F R E Q U E N T W O R D T Y P E S A C R O S S T H E S U B J E C T A R E A S O F G R A D E E I G H T R A N K S U B J E C T S WORD B C D t F G H T O T A L 36. 0 . 0 2 0 . 0 3 8 . 0 1 7 . 0 1 8 . 0 3 0 . 0 1 9 . 0 1 4 2 . W h E N 0 . 0 2 3 . 1 3 0 . 7 1 2 . 4 1 9 . 0 2 6 . 6 3 0 . 1 0 . 0 0 . 2 3 2 0 . 3 3 3 0 . 3 6 8 0 . 2 5 4 0 - 3 0 3 0 . 1 7 0 C H I - S Q U A R E 8 . 4 2 3 7 . 0 . 0 2 6 . 0 3 0 . 0 1 2 . 0 2 3 . 0 1 7 . 0 3 6 . 0 1 4 4 . A L L 0 . 0 2 3 . 5 3 1 . 1 1 2 . 6 1 9 . 3 2 7 . 0 3 0 . 5 0 . 0 0 . 3 0 2 0 . 2 6 3 0 . 2 6 0 0 . 3 2 5 0 . 1 7 2 0 . 3 2 1 C H I - S Q U A R E 5 . 7 5 3 8 . 0 . 0 3 2 . 0 2 8 . 0 5 . 0 1 7 . 0 2 1 . 0 4 1 . 0 1 4 4 . B U T 0 . 0 2 3 . 5 3 1 . 1 1 2 . 6 1 9 . 3 2 7 . 0 3 0 . 5 0 . 0 0 . 3 7 2 0 . 2 4 5 0 . 1 0 8 0 . 2 4 0 0 . 2 1 2 0 . 3 6 6 C H I - S Q U A R E 1 3 . 2 1 39. 0 . 0 1 6 . 0 1 9 . 0 8 . 0 4 8 . 0 2 7 . 0 3 0 . 0 1 4 3 . T H E S E 0 . 0 2 4 . 1 3 2 . 0 1 3 . 0 1 9 . 8 2 7 . 7 3 1 . 4 0 . 0 0 . 1 8 6 0 . 1 6 6 0 . 1 7 3 0 . 6 7 9 0 . 2 7 3 0 . 2 6 8 C H I - S Q U A R E 5 0 . 0 9 4 0 . 0 . 0 5 . 0 6 4 . 0 1 5 . 0 1 7 . 0 8 . C 1 7 . 0 1 2 6 . M A Y 0 . 0 2 0 . 5 2 7 . 2 1 1 . 0 1 6 . 9 2 3 . 6 2 6 . 7 0 . 0 0 . 0 5 8 0 . 5 6 0 0 . 3 2 4 0 . 2 4 0 0 . 0 8 1 0 . 1 5 2 C H I - S Q U A K E 7 6 . 6 3 N J T H E T H R E E L I N E S O F F I G U R E S F O R E A C H E N T R Y R E P R E S E N T : J> F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S Z, O F F R E Q . T O T U T A L N O . OF W O R D S I N S U B J E C T T A B L E X X X V I H D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 M O S T F R E Q U E N T W O R D T Y P E S A C R O S S T H E S U B J E C T A R E A S O F G R A D E E I G H T R A N K S U B J E C T S W O R D B C D E F G H T O T A L 4 1 . 0 . 0 2 8 . 0 2 9 . 0 3 . 0 2 4 . 0 1 3 . 0 4 5 . 0 1 4 2 . T H E R E 0 . 0 2 3 . 1 3 0 . 7 1 2 . 4 1 9 . 0 2 6 . 6 3 0 . 
1 0 . 0 0 . 3 2 5 0 . 2 5 4 0 . 0 6 5 0 . 3 3 9 0 . 1 3 1 0 . 4 0 2 C H I - S Q U A R E 2 3 . 9 2 4 2 . 0 . 0 1 3 . 0 1 7 . 0 2 1 . 0 8 . 0 1 6 . 0 2 7 . 0 1 0 2 . H A S 0 . 0 1 6 . 6 2 2 . 1 8 . 9 1 3 . 7 1 9 . 1 2 1 . 6 0 . 0 0 . 1 5 1 0 . 1 4 9 0 . 4 5 4 0 . 1 1 3 0 . 1 6 2 0 . 2 4 1 C H I - S O U A R E 2 2 . 4 6 4 3 . 0 . 0 9 0 . 0 1 . 0 0 . 0 3 . 0 0 . 0 1 0 . 0 1 0 4 . I 0 . 0 1 6 . 9 2 2 . 5 9 . 1 1 3 . 9 1 9 . 5 2 2 . 1 0 . 0 1 . 0 4 6 0 . 0 0 9 0 . 0 0 . 0 4 2 0 . 0 0 . 0 8 9 C H I - S Q U A R E 3 7 9 . 4 8 4 4 . 0 . 0 2 0 . 0 4 2 . 0 1 0 . 0 1 2 . 0 2 4 . 0 3 1 . 0 1 3 9 . O T H E R 0 . 0 2 2 . 6 3 0 . 1 1 2 . 2 1 8 . a 2 6 . 1 2 9 . 5 0 . 0 0 . 2 3 2 0 . 3 6 8 0 . 2 1 6 0 . 1 7 0 0 . 2 4 2 0 . 2 7 7 C H I - S Q U A R E 8 . 0 3 4 5 . 0 . 0 1 3 . 0 3 4 . 0 4 . 0 1 5 . 0 2 3 . 0 3 3 . 0 1 2 2 . S O M E 0 . 0 1 9 . 9 2 6 . 4 l u . 7 1 6 . 3 2 2 . 9 2 5 . 9 0 . 0 0 . 1 5 1 0 . 2 9 8 0 . 0 8 7 0 . 2 1 2 0 . 2 3 2 0 . 2 9 5 C H I - S Q U A R E 1 0 . 8 2 T H E T H R E E L I N E S O r F I G U R E S F U R E A C H E N T R Y R E P R E S E N T : F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S Z, O F F R E Q . T O T U T A L N U . CF W O R D S I N S U B J c C T T A B L E X X X V l / l D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 M J S T F R E Q U E N T W C R D T Y P E S A C R O S S T H E S U B J E C T A R E A S O F G R A D E E I G H T R A N K S U B J E C T S W O R D . B C D E F G H T O T A L 4 6 . 0 . 0 1 5 . 0 2 8 . 0 7 . 0 1 4 . 0 1 7 . 0 2 7 . 0 1 0 8 . M O R E 0 . 0 1 7 . 6 2 3 . 4 9 . 5 1 4 . 5 2 0 . 2 2 2 . 9 0 . 0 0 . 1 7 4 0 . 2 4 5 0 . 1 5 1 0 . 1 9 8 0 . 1 7 2 0 . 2 4 1 C H I - S Q U A R E 3 . 2 1 4 7 . 0 . 0 3 2 . 0 8 . 0 0 . 0 9 . 0 2 8 . 0 6 5 . 0 1 4 3 . W E R E 0 . 0 2 3 . 3 3 0 . 9 1 2 . 5 1 9 . 1 2 6 . 8 3 0 . 3 0 . 0 0 . 3 7 2 0 . 0 7 0 0 . 0 0 . 1 2 7 0 . 2 8 3 0 . 5 8 0 C H I - S Q U A R E 7 7 . 8 4 4 8 . 0 . 0 3 9 . 0 7 . 0 0 . 0 9 . 0 1 7 . 0 4 1 . 0 1 1 3 . H A D 0 . 0 1 8 . 4 2 4 . 4 9 . 9 1 5 . 1 2 1 . 
2 2 4 . 0 0 . 0 0 . 4 5 3 0 . 0 6 1 0 . 0 0 . 1 2 7 0 . 1 7 2 0 . 3 6 6 C H I - S Q U A R E 6 0 . 8 0 4 9 . 0 . 0 2 1 . 0 3 1 . 0 4 . 0 8 . 0 1 2 . 0 4 4 . 0 1 2 0 . T H E I R 0 . 0 1 9 . 5 2 5 . 9 1 0 . 5 1 6 . 1 2 2 . 5 2 5 . 4 0 . 0 0 . 2 4 4 0 . 2 7 1 0 . 0 8 7 C . 1 1 3 0 . 1 2 1 C . 3 9 3 C H I - S Q U A R E 2 7 . 5 9 5 0 . 0 . 0 6 . 0 2 6 . 0 3 4 . 0 2 1 . 0 1 9 . 0 1 1 . 0 1 1 7 . U S E D 0 . 0 1 9 . 1 2 5 . 3 1 0 . 2 1 5 . 7 2 1 . 9 2 4 . 8 0 . 0 0 . 0 7 0 0 . 2 2 8 0 . 7 3 5 0 . 2 9 7 0 . 1 9 2 0 . 0 9 8 C H I - S Q U A R E 7 4 . 0 1 N) T H E T H R E E L I N E S U F F I G U k E S F O R E A C H E N T R Y R E P R E S E N T : (Jl F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S Z, O F F R t Q . T O T U T A L N O . U F W O R D S I N S U B J E C T T A B L E XX XV IU D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 M O S T F R E Q U E N T W O R O T Y P E S A C R O S S T H E S U B J E C T A R E A S O F G R A D E E I G H T R A N < S U B J E C T S W O R D 8 C D E F G H T O T A L 5 1 . 0 . 0 5 . 0 4 4 . 0 8 . 0 1 2 . 0 2 7 . 0 2 2 . 0 1 1 8 . M A N Y 0 . 0 1 9 . 2 2 5 . 5 1 0 . 3 1 5 . 8 2 2 . 1 2 5 . 0 0 . 0 0 . 0 5 8 0 . 3 8 5 0 . 1 7 3 0 . 1 7 0 0 . 2 7 3 0 . 1 9 6 C H I - S Q U A R E 2 6 . 7 9 5 2 . 0 . 0 3 1 . 0 4 3 . 0 3 . 0 1 9 . 0 2 5 . 0 1 6 . 0 1 3 7 . S O 0 . 0 2 2 . 3 2 9 . 6 1 2 . 0 1 8 . 3 2 5 . 7 2 9 . 1 0 . 0 0 . 3 5 0 0 . 3 7 6 0 . 0 6 5 0 . 2 6 9 0 . 2 5 2 0 . 1 4 3 C H I - S Q U A R E 2 2 . 0 7 5 3 . 0 . 0 1 0 . 0 1 2 . 0 1 1 . 0 3 4 . 0 3 4 . 0 2 1 . 0 1 2 2 . E A C H 0 . 0 1 9 . 9 2 6 . 4 1 0 . 7 1 6 . 3 2 2 . 9 2 5 . 9 0 . 0 0 . 1 1 6 0 . 1 0 5 0 . 2 3 8 0 . 4 8 1 0 . 3 4 3 0 . 1 8 7 C H I - S Q U A R E 3 8 . 1 9 5 4 . 0 . 0 1 8 . 0 7 . 0 9 . 0 2 3 . U 2 9 . 0 3 1 . 0 1 1 7 . TW O 0 . 0 1 9 . 1 2 5 . 3 1 0 . 2 1 5 . 7 2 1 . 9 2 4 . 8 0 . 0 0 . 2 0 9 0 . 0 6 1 0 . 1 9 5 0 . 3 2 5 0 . 2 9 3 0 . 2 7 7 C H I - S Q U A R E 2 0 . 7 0 5 5 . C O 1 0 . 0 1 0 . 0 3 . 0 1 4 . 0 2 6 . 0 2 8 . 0 9 1 . A B O U T 0 . 0 1 4 . 
8 1 9 . 7 6 . 0 1 2 . 2 1 7 . 1 1 9 . 3 0 . 0 0 . 1 1 6 0 . 0 8 8 0 . 0 6 5 0 . 1 9 8 0 . 2 6 2 0 . 2 5 0 C H I - S Q U A R E 1 8 . 3 0 T H E T h R E E L I N E S O F F I G U R E S F O R E A C H E N T R Y R E P R E S E N T : F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S 1, O F F R E Q . T O T O T A L N O . U F W O R D S I N S U B J E C T T A B L E XXXVIII D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E S U B J E C T A R E A S O F G R A D E E I G H T R A N K S U B J E C T S W ORD B C D E F G H T O T A L 5 6 . 0 . 0 1 1 . 0 7 4 . 0 6 . 0 1 0 . 0 6 . 0 2 . 0 1 0 9 . S H O U L D 0 . 0 1 7 . 8 2 3 . 6 9 . 5 1 4 . 6 2 0 . 4 2 3 . 1 C O 0 . 1 2 8 0 . 6 4 8 0 . 1 3 0 0 . 1 4 1 0 . 0 6 1 0 . 0 1 8 C H I - S Q U A R E 1 4 2 . 7 2 5 7 . 0 . 0 2 4 . 0 2 0 . 0 2 . 0 2 2 . 0 4 2 . 0 4 . 0 1 1 4 . W H A T 0 . 0 1 8 . 6 2 4 . 6 1 0 . 0 1 5 . 3 2 1 . 4 2 4 . 2 C O 0 . 2 7 9 0 . 1 7 5 0 . 0 4 3 0 . 3 1 1 0 . 4 2 4 0 . 0 3 6 C H I - S Q U A R E 4 8 . 5 6 5 8 . C O 1 3 . 0 2 5 . 0 1 2 . 0 1 5 . 0 2 3 . 0 1 4 . 0 1 0 2 . T H A N 0 . 0 1 6 . 6 2 2 . 1 8 . 9 1 3 . 7 1 9 . 1 2 1 . 6 0 . 0 0 . 1 5 1 0 . 2 1 9 0 . 2 6 0 0 . 2 1 2 0 . 2 3 2 C . 1 2 5 C H I - S Q U A R E 5 . 8 5 5 9 . 0 . 0 1 4 . 0 1 0 . 0 9 . 0 1 1 . 0 8 . 0 2 7 . 0 7 9 . B E E N 0 . 0 1 2 . 9 1 7 . 1 6 . 9 1 0 . 6 1 4 . 8 1 6 . 8 C O 0 . 1 6 3 0 . 0 8 8 0 . 1 9 5 0 . 1 5 6 0 . 0 3 1 C . 2 4 1 C H I - S Q U A R E 1 3 . 0 8 6 0 . 0 . 0 3 0 . 0 2 . 0 8 . 0 6 . 0 2 0 . 0 3 0 . 0 9 6 . I N T O C O 1 5 . 6 2 0 . 8 8 . 4 1 2 . 9 1 8 . 0 2 0 . 4 0 . 0 0 . 3 4 9 0 . 0 1 8 0 . 1 7 3 0 . 0 8 5 0 . 2 0 2 0 . 2 6 8 C H I - S Q U A K E 3 8 . 6 1 N) T H E T H R E E L I N E S O F F I G U R E S F O R E A C H E N T R Y R E P R E S E N T : O F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S t, O F F R E Q . T O T O T A L N O . 
O F W O R O S I N S U B J E C T T A B L E XXXVIII D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 H O S T F R E Q U E N T W O R O T Y P E S A C R O S S T H E S U B J E C T A R E A S O F G R A D E E I G H T R A N K S U B J E C T S W O R D B C D E F G H T O T A L 6 1 . 0 . 0 2 1 . 0 3 7 . 0 0 . 0 1 6 . 0 1 4 . 0 2 1 . 0 1 0 9 . T H E M 0 . 0 1 7 . 3 2 3 . 6 9 . 5 1 4 . 6 2 0 . 4 2 3 . 1 0 . 0 0 . 2 4 4 0 . 3 2 4 0 . 0 0 . 2 2 6 0 . 1 4 1 0 . 1 8 7 C H I - S Q U A R E 2 0 . 1 5 6 2 . 0 . 0 • 1 0 . 0 3 6 . 0 1 2 . 0 2 3 . 0 1 2 . 0 4 . 0 9 7 . U S E 0 . 0 1 5 . 8 2 1 . 0 8 . 5 1 3 . 0 1 8 . 2 2 0 . 6 0 . 0 0 . 1 1 6 0 . 3 1 5 0 . 2 6 0 0 . 3 2 5 0 . 1 2 1 0 . 0 3 6 C H I - S Q U A R E 3 7 . 5 2 6 3 . 0 . 0 8 . 0 4 9 . 0 1 1 . 0 1 3 . 0 1 1 . 0 1 5 . 0 1 0 7 . M A K E 0 . 0 1 7 . 4 2 3 . 1 9 . 4 1 4 . 3 2 0 . 1 2 2 . 7 0 . 0 C . 0 9 3 0 . 4 2 9 0 . 2 3 8 0 . 1 8 4 0 . 1 1 1 0 . 1 3 4 C H I - S Q U A R E 4 1 . 1 2 6 4 . 0 . 0 1 2 . 0 3 4 . 0 8 . 0 1 5 . 0 1 8 . 0 6 . 0 9 3 . D O 0 . 0 1 5 . 1 2 0 . 1 8 . 1 1 2 . 4 1 7 . 4 1 9 . 7 0 . 0 0 . 1 3 9 0 . 2 9 8 0 . 1 7 3 0 . 2 1 2 0 . 1 8 2 0 . 0 5 4 C H I - S Q U A R E 2 0 . 3 4 6 5 . 0 . 0 1 7 . 0 1 2 . 0 7 . 0 2 . 0 1 9 . 0 2 4 . 0 8 1 . U P 0 . 0 1 3 . 2 1 7 . 5 7 . 1 1 0 . 8 1 5 . 2 1 7 . 2 0 . 0 0 . 1 9 8 0 . 1 0 5 0 . 1 5 1 0 . 0 2 8 0 . 1 9 2 0 . 2 1 4 C H I - S Q U A R E 1 3 . 7 2 T H E T H R E E L I N E S O F F I G U R E S F O R E A C H E N T R Y R E P R E S E N T : F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S S t O F F R E Q . T O T O T A L N O . O F W O R D S I N S U B J E C T T A B L E XXXVIII D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E S U 8 J E C T A R E A S O F G R A D E E I G H T R A N K S U B J E C T S WORD B C D E F G H T O T A L 6 6 . 0 . 0 1 0 . 0 1 8 . 0 1 2 . 0 6 . 0 9 . 0 2 2 . 0 7 7 . S U C H 0 . 0 1 2 . 5 1 6 . 6 6 . 7 1 0 . 3 1 4 . 4 1 6 . 3 0 . 0 0 . 1 1 6 0 . 1 5 8 0 . 2 6 0 0 . 
085   0.091   0.196
CHI-SQUARE   10.55

67.            0.0    18.0     7.0     4.0    10.0    17.0     8.0    109.
THEN           0.0    17.7    23.6     9.5    14.6    20.4    23.1
               0.0   0.209   0.061   0.087   0.141   0.172   0.071
CHI-SQUARE   26.77

68.            0.0    16.0    15.0     0.0     3.0    12.0    20.0     70.
TIME           0.0    11.4    15.1     6.1     9.4    13.1    14.8
               0.0   0.186   0.131     0.0   0.042   0.121   0.178
CHI-SQUARE   14.20

69.            0.0    13.0     4.0     7.0     1.0    26.0    25.0     76.
ITS            0.0    12.4    16.4     6.7    10.2    14.2    16.1
               0.0   0.151   0.035   0.151   0.014   0.262   0.223
CHI-SQUARE   32.31

70.            0.0    20.0    17.0     2.0    13.0    27.0    15.0     94.
WOULD          0.0    15.3    20.3     8.2    12.6    17.6    19.9
               0.0   0.232   0.149   0.043   0.184   0.273   0.134
CHI-SQUARE   12.92

71.            0.0    18.0     9.0     3.0    23.0    29.0     3.0     85.
HOW            0.0    13.8    18.4     7.4    11.4    15.9    18.0
               0.0   0.209   0.079   0.065   0.325   0.293   0.027
CHI-SQUARE   43.79

72.            0.0     2.0     6.0     3.0    99.0     5.0     2.0    118.
NUMBER         0.0    19.2    25.5    10.3    15.8    22.1    25.0
               0.0   0.023   0.053   0.065   1.400   0.050   0.018
CHI-SQUARE  503.28

73.            0.0     9.0    32.0    15.0     1.0    18.0    21.0     96.
MADE           0.0    15.6    20.6     8.4    12.9    18.0    20.4
               0.0   0.105   0.280   0.324   0.014   0.182   0.187
CHI-SQUARE   25.04

74.            0.0    24.0    15.0     7.0     7.0    11.0    21.0     85.
OUT            0.0    13.8    18.4     7.4    11.4    15.9    13.0
               0.0   0.279   0.131   0.151   0.099   0.111   0.187
CHI-SQUARE   11.81

75.            0.0     7.0    18.0     7.0     5.0    18.0    21.0     76.
MOST           0.0    12.4    16.4     6.7    10.2    14.2    16.1
               0.0   0.081   0.158   0.151   0.071   0.162   0.187
CHI-SQUARE    7.60

76.            0.0    17.0     9.0     9.0    27.0    27.0    13.0    102.
ONLY           0.0    16.6    22.1     8.9    13.7    19.1    21.6
               0.0   0.198   0.079   0.195   0.332   0.273   0.116
CHI-SQUARE   27.47

77.            0.0    17.0     8.0     2.0    11.0    11.0    20.0     70.
NO             0.0    11.4    15.1     6.1     9.4    13.1    14.8
               0.0   0.198   0.070   0.043   0.156   0.111   0.178
CHI-SQUARE   11.31

78.            0.0    20.0    19.0     6.0     7.0    12.0     9.0     73.
MUST           0.0    11.9    15.8     6.4     9.8    13.7    15.5
               0.0   0.232   0.166   0.130   0.099   0.121   0.060
CHI-SQUARE    9.92

79.            0.0     6.0     3.0     5.0     0.0    15.0    23.0     52.
WATER          0.0     3.5    11.2     4.6     7.0     9.7    11.0
               0.0   0.070   0.026   0.108     0.0   0.151   0.205
CHI-SQUARE   29.60

80.            0.0     2.0    27.0     7.0     5.0     7.0    12.0     60.
ALSO           0.0     9.8    13.0     5.3     3.0    11.2    12.7
               0.0   0.023   0.236   0.151   0.071   0.071   0.107
CHI-SQUARE   24.72

81.            0.0    12.0    14.0     5.0    13.0    16.0    13.0     73.
FIRST          0.0    11.9    15.8     6.4     9.8    13.7    15.5
               0.0   0.139   0.123   0.108   0.134   0.162   0.116
CHI-SQUARE    2.36

82.            0.0     7.0    18.0     5.0    10.0    19.0    14.0     73.
VERY           0.0    11.9    15.8     6.4     9.8    13.7    15.5
               0.0   0.081   0.158   0.108   0.141   0.192   0.125
CHI-SQUARE    4.83

83.            0.0    11.0    63.0     6.0     5.0     4.0     4.0     93.
GOOD           0.0    15.1    20.1     8.1    12.4    17.4    19.7
               0.0   0.128   0.551   0.130   0.071   0.040   0.036
CHI-SQUARE  120.53

84.            0.0    31.0    13.0     0.0     3.0    12.0     5.0     64.
HIM            0.0    10.4    13.8     5.6     8.6    12.0    13.6
               0.0   0.360   0.114     0.0   0.042   0.121   0.045
CHI-SQUARE   55.31

85.            0.0     2.0     7.0     5.0    15.0    34.0     7.0     70.
SAME           0.0    11.4    15.1     6.1     9.4    13.1    14.8
               0.0   0.023   0.061   0.108   0.212   0.343   0.062
CHI-SQUARE   53.06
86.            0.0    18.0     7.0     1.0    16.0    12.0    15.0     69.
COULD          0.0    11.2    14.9     6.0     9.2    12.9    14.6
               0.0   0.209   0.061   0.022   0.226   0.121   0.134
CHI-SQUARE   17.51

87.            0.0    13.0    18.0     2.0     2.0     7.0    22.0     64.
WHO            0.0    10.4    13.8     5.6     8.6    12.0    13.6
               0.0   0.151   0.158   0.043   0.028   0.071   0.196
CHI-SQUARE   16.55

88.            0.0     9.0    17.0     0.0    12.0    16.0     4.0     58.
ANY            0.0     9.4    12.5     5.1     7.8    10.9    12.3
               0.0   0.105   0.149     0.0   0.170   0.162   0.036
CHI-SQUARE   17.01

89.            0.0     4.0    22.0    10.0     4.0     8.0    12.0     60.
BECAUSE        0.0     9.8    13.0     5.3     8.0    11.2    12.7
               0.0   0.046   0.193   0.216   0.057   0.081   0.107
CHI-SQUARE   16.99

90.            0.0     8.0     9.0     2.0    12.0    32.0     4.0     67.
SEE            0.0    10.9    14.5     5.9     9.0    12.6    14.2
               0.0   0.093   0.079   0.043   0.170   0.323   0.036
CHI-SQUARE   43.84

91.            0.0    18.0    12.0     3.0     3.0    14.0    14.0     64.
LIKE           0.0    10.4    13.8     5.6     8.6    12.0    13.6
               0.0   0.209   0.105   0.065   0.042   0.141   0.125
CHI-SQUARE   10.93

92.            0.0     6.0    17.0     3.0     5.0     9.0    15.0     55.
MUCH           0.0     9.0    11.9     4.8     7.4    10.3    11.7
               0.0   0.070   0.149   0.065   0.071   0.091   0.134
CHI-SQUARE    5.73

93.            0.0     6.0    20.0     1.0     7.0     6.0    20.0     60.
PEOPLE         0.0     9.8    13.0     5.3     8.0    11.2    12.7
               0.0   0.070   0.175   0.022   0.099   0.061   0.178
CHI-SQUARE   15.45

94.            0.0     5.0     3.0    10.0    14.0    13.0     8.0     53.
CALLED         0.0     8.6    11.5     4.6     7.1     9.9    11.2
               0.0   0.058   0.026   0.216   0.198   0.131   0.071
CHI-SQUARE   22.57

95.            0.0     3.0    25.0     6.0    11.0    20.0     9.0     74.
PLACE          0.0    12.1    16.0     6.5     9.9    13.9    15.7
               0.0   0.035   0.219   0.130   0.156   0.202   0.080
CHI-SQUARE   17.57

96.            0.0    11.0     5.0     5.0     1.0    13.0     9.0     44.
THROUGH        0.0     7.2     9.5     3.9     5.9     8.2     9.3
               0.0   0.128   0.044   0.108   0.014   0.131   0.080
CHI-SQUARE   11.34

97.            0.0     4.0     8.0    15.0     5.0     3.0     8.0     43.
WORK           0.0     7.0     9.3     3.8     5.8     8.1     9.1
               0.0   0.046   0.070   0.324   0.071   0.030   0.071
CHI-SQUARE   38.44

98.            0.0     6.0     7.0     0.0     5.0     9.0    11.0     38.
NEW            0.0     6.2     8.2     3.3     5.1     7.1     8.1
               0.0   0.070   0.061     0.0   0.071   0.091   0.098
CHI-SQUARE    5.08

99.            0.0     3.0    10.0     3.0     7.0     9.0    16.0     48.
SMALL          0.0     7.8    10.4     4.2     6.4     9.0    10.2
               0.0   0.035   0.088   0.065   0.099   0.091   0.143
CHI-SQUARE    6.71

100.           0.0     8.0     7.0     3.0     4.0    13.0    17.0     52.
OVER           0.0     8.5    11.2     4.6     7.0     9.7    11.0
               0.0   0.093   0.061   0.065   0.057   0.131   0.152
CHI-SQUARE    7.73

TABLE XXXIX

DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES
ACROSS THE SUBJECT AREAS OF GRADE NINE

THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT:
FREQUENCY
EXPECTED FREQUENCY
RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT

RANK                            S U B J E C T S
WORD             B       C       D       E       F       G       H   TOTAL

1.           985.0  1575.0  2052.0  2451.0   275.0  1153.0   580.0   9071.
THE          921.3  1706.3  2790.3  1967.0   266.8   906.0   513.2
             7.889   6.811   5.427   9.195   7.605   9.391   6.339
CHI-SQUARE  405.18

2.           352.0   704.0  1077.0   704.0   152.0   556.0   311.0   3857.
OF           391.7   725.5  1186.4   836.4   113.5   385.2   218.2
             2.819   3.045   2.848   2.641   4.204   4.528   4.472
CHI-SQUARE  163.93

3.           238.0   732.0  1238.0   703.0    58.0   246.0   224.0   3537.
AND          359.2   665.3  1068.0   767.0   104.0   353.3   200.1
             2.307   3.166   3.406   2.637   1.604   2.004   3.221
CHI-SQUARE  118.72

4.           313.0   548.0  1027.0   659.0   111.0   306.0   122.0   3086.
A            313.4   580.5   949.3   669.2    90.8   308.2   174.6
             2.507   2.370   2.716   2.472   3.070   2.492   1.754
CHI-SQUARE   28.71

5.           360.0   568.0  1140.0   568.0    72.0   304.0   156.0   3168.
TO           321.8   595.9   974.5   687.0    93.2   316.4   179.2
             2.883   2.456   3.015   2.131   1.991   2.476   2.243
CHI-SQUARE   62.89
6.           268.0   450.0   785.0   531.0    62.0   244.0   175.0   2515.
IN           255.4   473.1   773.6   545.4    74.0   251.2   142.3
             2.147   1.946   2.076   1.992   1.715   1.987   2.516
CHI-SQUARE   11.95

7.           203.0   282.0   719.0   651.0    68.0   182.0   102.0   2208.
IS           224.3   415.3   679.2   478.8    65.0   220.5   124.9
             1.626   1.220   1.902   2.442   1.831   1.482   1.467
CHI-SQUARE  120.17

8.           102.0   251.0   296.0   192.0    44.0   148.0    31.0   1064.
THAT         108.1   200.1   327.3   230.7    31.3   106.3    60.2
             0.817   1.085   0.783   0.720   1.217   1.205   0.446
CHI-SQUARE   58.45

9.           109.0   236.0   367.0   276.0    32.0   107.0    54.0   1181.
IT           119.9   222.2   363.3   256.1    34.7   118.0    66.8
             0.873   1.021   0.971   1.035   0.885   0.871   0.776
CHI-SQUARE    7.14

10.          134.0   153.0   517.0   304.0    29.0    79.0    62.0   1278.
ARE          129.8   240.4   393.1   277.1    37.6   127.6    72.3
             1.073   0.662   1.367   1.140   0.002   0.643   0.891
CHI-SQUARE   95.53

11.          159.0   188.0   450.0   221.0    28.0    68.0    68.0   1182.
FOR          120.1   222.3   363.6   256.3    34.0   118.1    66.9
             1.274   0.813   1.190   0.829   0.774   0.554   0.978
CHI-SQUARE   65.91

12.          240.0   232.0   390.0   109.0    56.0   139.0    25.0   1191.
YOU          121.0   224.0   366.4   258.3    35.0   119.0    67.4
             1.922   1.003   1.031   0.409   1.549   1.132   0.359
CHI-SQUARE  247.80

13.          118.0   114.0   474.0   288.0    14.0    68.0    26.0   1102.
BE           111.9   207.3   339.0   239.0    32.4   110.1    62.4
             0.945   0.493   1.254   1.080   0.387   0.054   0.374
CHI-SQUARE  153.90

14.           87.0   179.0   296.0   175.0    37.0    78.0    47.0    899.
AS            91.3   169.1   276.5   194.9    26.4    89.8    50.9
             0.697   0.774   0.783   0.657   1.023   0.635   0.676
CHI-SQUARE   10.25

15.          101.0   113.0   483.0   285.0    15.0    56.0    22.0   1075.
OR           109.2   202.2   330.7   233.1    31.6   107.4    60.8
             0.809   0.489   1.277   1.069   0.415   0.456   0.316
CHI-SQUARE  179.79

16.           70.0   163.0   298.0   204.0    17.0    74.0    44.0    870.
WITH          88.4   163.7   267.6   188.7    25.6    86.9    49.2
             0.561   0.705   0.788   0.765   0.470   0.603   0.633
CHI-SQUARE   13.87

17.          110.0   107.0   209.0   164.0    29.0    93.0    27.0    759.
ON            77.1   142.8   233.5   164.6    22.3    75.8    42.9
             0.881   0.463   0.553   0.690   0.602   0.757   0.388
CHI-SQUARE   39.68

18.          105.0    88.0   127.0   190.0    25.0    78.0    37.0    650.
THIS          66.0   122.3   199.9   141.0    19.1    64.9    36.8
             0.841   0.381   0.336   0.713   0.691   0.635   0.532
CHI-SQUARE   80.74

19.           87.0   106.0   159.0   148.0    14.0    70.0    43.0    627.
BY            63.7   117.9   192.9   136.0    18.4    62.6    35.5
             0.697   0.458   0.421   0.555   0.387   0.570   0.618
CHI-SQUARE   20.30

20.           18.0   222.0    25.0    35.0    24.0    30.0    65.0    419.
WAS           42.6    78.8   128.9    90.9    12.3    41.9    23.7
             0.144   0.960   0.066   0.131   0.664   0.244   0.935
CHI-SQUARE  478.70
21.           28.0   244.0   168.0    13.0    10.0    45.0    13.0    521.
HE            52.9    98.0   160.3   113.0    15.3    52.0    29.5
             0.224   1.055   0.444   0.049   0.277   0.367   0.187
CHI-SQUARE  330.08

22.           65.0    83.0   155.0   114.0    14.0    78.0    25.0    534.
FROM          54.2   100.4   164.3   115.8    15.7    53.3    30.2
             0.521   0.359   0.410   0.428   0.387   0.635   0.359
CHI-SQUARE   18.21

23.           57.0   110.0   192.0    64.0    16.0    39.0    25.0    503.
HAVE          51.1    94.6   154.7   109.1    14.8    50.2    28.5
             0.457   0.476   0.508   0.240   0.442   0.318   0.359
CHI-SQUARE   33.83

24.           71.0   115.0   150.0    90.0    14.0    66.0    20.0    526.
AT            53.4    98.9   161.8   114.1    15.5    52.5    29.8
             0.569   0.497   0.397   0.338   0.387   0.538   0.288
CHI-SQUARE   21.12

25.           49.0    65.0   156.0    98.0     3.0    65.0    48.0    484.
WHICH         49.2    91.0   148.9   105.0    14.2    48.3    27.4
             0.392   0.281   0.413   0.368   0.083   0.529   0.690
CHI-SQUARE   38.38

26.           45.0    87.0   150.0    91.0    18.0    60.0    30.0    481.
ONE           48.9    90.5   148.0   104.3    14.1    48.0    27.2
             0.360   0.376   0.397   0.341   0.498   0.489   0.431
CHI-SQUARE    6.47

27.           32.0   123.0   186.0    67.0    10.0    39.0    35.0    492.
NOT           50.0    92.5   151.3   106.7    14.5    49.1    27.8
             0.256   0.532   0.492   0.251   0.277   0.318   0.503
CHI-SQUARE   44.50

28.           44.0    62.0   180.0   149.0    19.0    52.0     7.0    513.
CAN           52.1    96.5   157.8   111.2    15.1    51.2    29.0
             0.352   0.268   0.476   0.559   0.525   0.424   0.101
CHI-SQUARE   47.27

29.          120.0    87.0   196.0    13.0    16.0    35.0    14.0    481.
YOUR          48.9    90.5   148.0   104.3    14.1    48.0    27.2
             0.961   0.376   0.518   0.049   0.442   0.285   0.201
CHI-SQUARE  209.47

30.           27.0    88.0   200.0    67.0    11.0    34.0    25.0    452.
THEY          45.9    85.0   139.0    98.0    13.3    45.1    25.6
             0.216   0.381   0.529   0.251   0.304   0.277   0.359
CHI-SQUARE   47.60

31.           40.0    83.0    28.0     8.0    27.0    27.0     6.0    219.
WE            22.2    41.2    67.4    47.5     6.4    21.9    12.4
             0.320   0.359   0.074   0.030   0.747   0.220   0.086
CHI-SQUARE  182.54

32.           34.0   167.0   150.0     6.0     8.0    20.0    22.0    407.
HIS           41.3    76.6   125.2    88.3    12.0    40.7    23.0
             0.272   0.722   0.397   0.023   0.221   0.163   0.316
CHI-SQUARE  201.53

33.           98.0    39.0   184.0   105.0    12.0    44.0    13.0    495.
WILL          50.3    93.1   152.3   107.3    14.6    49.4    28.0
             0.785   0.169   0.487   0.394   0.332   0.358   6.187
CHI-SQUARE   92.51

34.           50.0    56.0   216.0    94.0    12.0    38.0    11.0    477.
IF            48.4    89.7   146.7   103.4    14.0    47.6    27.0
             0.400   0.242   0.571   0.353   0.332   0.309   0.158
CHI-SQUARE   53.01

35.           44.0    91.0    97.0    94.0    11.0    45.0    27.0    409.
AN            41.5    76.9   125.8    88.7    12.0    40.9    23.1
             0.352   0.394   0.257   0.353   0.364   0.367   0.388
CHI-SQUARE   10.78

36.           34.0    67.0   155.0   101.0    15.0    36.0    16.0    424.
WHEN          43.1    79.8   130.4    91.9    12.5    42.3    24.0
             0.272   0.290   0.410   0.379   0.415   0.293   0.230
CHI-SQUARE   13.60

37.           50.0    78.0   129.0    49.0     2.0    39.0    19.0    366.
ALL           37.2    68.8   112.6    79.4    10.8    36.6    20.7
             0.400   0.337   0.341   0.184   0.055   0.318   0.273
CHI-SQUARE   27.10

38.           17.0   116.0   107.0    42.0     3.0    35.0    36.0    356.
BUT           36.2    67.0   109.5    77.2    10.5    35.6    20.1
             0.136   0.502   0.283   0.158   0.083   0.285   0.518
CHI-SQUARE   79.98

39.           45.0    44.0    84.0    68.0     7.0    47.0    24.0    319.
THESE         32.4    60.0    98.1    69.2     9.4    31.9    18.0
             0.360   0.190   0.222   0.255   0.194   0.383   0.345
CHI-SQUARE   20.98

40.           45.0    18.0   213.0    50.0     6.0    18.0     8.0    353.
MAY           36.4    67.3   110.1    77.6    10.5    35.8    20.3
             0.360   0.078   0.563   0.188   0.166   0.147   0.115
CHI-SQUARE  162.34

41.           22.0    76.0    88.0    74.0     5.0    16.0    20.0    301.
THERE         30.6    56.6    92.6    65.3     8.9    30.1    17.0
             0.176   0.329   0.233   0.278   0.138   0.130   0.288
CHI-SQUARE   19.21

42.           29.0    49.0    85.0    80.0     8.0    30.0    15.0    296.
HAS           30.1    55.7    91.1    64.2     8.7    29.6    16.7
             0.232   0.212   0.225   0.300   0.221   0.244   0.216
CHI-SQUARE    5.38

43.            1.0   215.0     9.0    10.0     0.0    25.0     8.0    268.
I             27.2    50.4    82.4    58.1     7.9    26.8    15.2
             0.008   0.930   0.024   0.038     0.0   0.204   0.115
CHI-SQUARE  679.24

44.           22.0    34.0   100.0    59.0    13.0    27.0    14.0    269.
OTHER         27.3    50.6    82.7    58.3     7.9    26.9    15.2
             0.176   0.147   0.264   0.221   0.360   0.220   0.201
CHI-SQUARE   13.46

45.           21.0    49.0   131.0    46.0     0.0    17.0    19.0    299.
SOME          30.4    56.2    92.0    64.8     8.8    29.9    16.9
             0.168   0.212   0.346   0.173     0.0   0.138   0.273
CHI-SQUARE   40.45
46.           21.0    49.0   131.0    46.0     0.0    17.0    19.0    299.
MORE          30.4    56.2    92.0    64.8     8.8    29.9    16.9
             0.168   0.212   0.346   0.173     0.0   0.138   0.273
CHI-SQUARE   40.45

47.           25.0    48.0   106.0    69.0     7.0    26.0    23.0    304.
WERE          30.9    57.2    93.5    65.9     8.9    30.4    17.2
             0.200   0.208   0.280   0.259   0.194   0.212   0.331
CHI-SQUARE    7.41

48.            9.0    74.0    18.0    28.0     5.0     5.0    39.0    178.
HAD           16.1    33.5    54.8    38.6     5.2    17.8    10.1
             0.072   0.320   0.048   0.105   0.138   0.041   0.561
CHI-SQUARE  173.46

49.            3.0   130.0    27.0    19.0     1.0    25.0    14.0    219.
THEIR         22.2    41.2    67.4    47.5     6.4    21.9    12.4
             0.024   0.562   0.071   0.071   0.028   0.204   0.201
CHI-SQUARE  254.61

50.            6.0    60.0    95.0    18.0     4.0    14.0    29.0    225.
USED          23.0    42.5    69.5    49.0     6.6    22.6    12.8
             0.048   0.259   0.251   0.068   0.111   0.114   0.417
CHI-SQUARE   73.54

51.           22.0    26.0   115.0   116.0     9.0    17.0     5.0    310.
MANY          31.5    58.3    95.4    67.2     9.1    31.0    17.5
             0.176   0.112   0.304   0.435   0.249   0.138   0.072
CHI-SQUARE   75.47

52.           15.0    31.0    71.0    67.0    25.0    17.0    19.0    245.
SO            24.9    46.1    75.4    53.1     7.2    24.5    13.9
             0.120   0.134   0.188   0.251   0.691   0.138   0.273
CHI-SQUARE   60.85

53.           18.0    59.0    89.0    48.0     8.0    22.0     5.0    249.
EACH          25.3    46.8    76.6    54.0     7.3    24.9    14.1
             0.144   0.255   0.235   0.180   0.221   0.179   0.072
CHI-SQUARE   14.19

54.           50.0    24.0    84.0    26.0    26.0    34.0     9.0    253.
TWO           25.7    47.6    77.8    54.9     7.4    25.3    14.3
             0.400   0.104   0.222   0.098   0.719   0.277   0.129
CHI-SQUARE  101.62

55.           24.0    38.0    45.0    66.0    11.0    25.0    16.0    225.
ABOUT         22.9    42.3    69.2    48.8     6.6    22.5    12.7
             0.192   0.164   0.119   0.248   0.304   0.204   0.230
CHI-SQUARE   19.06

56.           21.0    59.0    60.0    32.0    43.0    33.0    17.0    265.
SHOULD        26.9    49.8    81.5    57.5     7.8    26.5    15.0
             0.168   0.255   0.159   0.120   1.189   0.269   0.244
CHI-SQUARE  180.81

57.           33.0    20.0   164.0    32.0     5.0    10.0     2.0    266.
WHAT          27.0    50.0    81.8    57.7     7.8    26.6    15.1
             0.264   0.086   0.434   0.120   0.138   0.081   0.029
CHI-SQUARE  135.99

58.           18.0    83.0    35.0    10.0     8.0    35.0     5.0    194.
THAN          19.7    36.5    59.7    42.1     5.7    19.4    11.0
             0.144   0.359   0.093   0.038   0.221   0.285   0.072
CHI-SQUARE  110.84

59.           14.0    37.0    76.0    52.0     9.0    21.0     8.0    217.
BEEN          22.0    40.8    66.7    47.1     6.4    21.7    12.3
             0.112   0.160   0.201   0.195   0.249   0.171   0.115
CHI-SQUARE    7.68

60.           19.0    56.0    48.0    33.0     7.0    24.0    10.0    197.
INTO          20.0    37.1    60.6    42.7     5.8    19.7    11.1
             0.152   0.242   0.127   0.124   0.194   0.195   0.144
CHI-SQUARE   15.88

61.           14.0    57.0    52.0    74.0     1.0    24.0    15.0    237.
THEM          24.1    44.6    72.9    51.4     7.0    23.7    13.4
             0.112   0.247   0.138   0.278   0.028   0.195   0.216
CHI-SQUARE   28.92

62.           23.0    57.0    91.0    20.0     6.0    15.0     8.0    220.
USE           22.3    41.4    67.7    47.7     6.5    22.0    12.4
             0.184   0.247   0.241   0.075   0.166   0.122   0.115
CHI-SQUARE   33.88

63.           31.0    46.0    85.0    43.0    20.0    21.0     7.0    253.
MAKE          25.7    47.6    77.8    54.9     7.4    25.3    14.3
             0.248   0.199   0.225   0.161   0.553   0.171   0.101
CHI-SQUARE   30.02

64.           27.0    23.0   124.0    61.0     5.0    11.0     4.0    255.
DO            25.9    48.0    78.4    55.3     7.5    25.5    14.4
             0.216   0.099   0.328   0.229   0.138   0.090   0.058
CHI-SQUARE   56.69

65.           27.0    61.0    81.0    25.0    12.0    22.0     6.0    234.
UP            23.8    44.0    72.0    50.7     6.9    23.4    13.2
             0.216   0.264   0.214   0.094   0.332   0.179   0.086
CHI-SQUARE   29.03

66.           26.0    59.0    52.0    75.0     4.0    13.0    11.0    240.
SUCH          24.4    45.1    73.8    52.0     7.1    24.0    13.6
             0.208   0.255   0.138   0.281   0.111   0.106   0.158
CHI-SQUARE   27.78

67.           17.0    34.0    65.0    37.0     7.0    20.0    14.0    214.
THEN          21.7    40.3    65.8    46.4     6.3    21.4    12.1
             0.136   0.147   0.225   0.139   0.194   0.163   0.201
CHI-SQUARE    9.96

68.           37.0    51.0    55.0    57.0     5.0    26.0     7.0    238.
TIME          24.2    44.8    73.2    51.6     7.0    23.8    13.5
             0.296   0.221   0.145   0.214   0.138   0.212   0.101
CHI-SQUARE   16.65

69.           27.0    33.0    94.0    26.0     4.0    18.0    16.0    213.
ITS           22.1    41.0    67.1    47.3     6.4    21.8    12.3
             0.216   0.143   0.249   0.098   0.111   0.147   0.230
CHI-SQUARE   25.68

70.            9.0    44.0    36.0    48.0     0.0    35.0    13.0    185.
WOULD         18.8    34.8    56.9    40.1     5.4    18.5    10.5
             0.072   0.190   0.095   0.180     0.0   0.285   0.187
CHI-SQUARE   37.59

71.           18.0    48.0    23.0    27.0    10.0    30.0     6.0    162.
HOW           16.5    30.5    49.8    35.1     4.8    16.2     9.2
             0.144   0.208   0.061   0.101   0.277   0.244   0.086
CHI-SQUARE   45.20

72.           19.0    40.0    48.0    30.0    34.0    31.0     2.0    204.
NUMBER        20.7    38.4    62.8    44.2     6.0    20.4    11.5
             0.152   0.173   0.127   0.113   0.940   0.252   0.029
CHI-SQUARE  152.33

73.           33.0     3.0    15.0    10.0    18.0    15.0     3.0     98.
MADE          10.0    18.4    30.1    21.3     2.9     9.8     5.5
             0.264   0.013   0.040   0.038   0.498   0.122   0.043
CHI-SQUARE  163.07

74.           16.0    24.0    77.0    83.0     1.0     5.0     4.0    210.
OUT           21.3    39.5    64.6    45.5     6.2    21.0    11.9
             0.128   0.104   0.204   0.311   0.028   0.041   0.058
CHI-SQUARE   62.35

75.           16.0    57.0    52.0    44.0     3.0    19.0    10.0    201.
MOST          20.4    37.8    61.8    43.6     5.9    20.1    11.4
             0.128   0.247   0.138   0.165   0.083   0.155   0.144
CHI-SQUARE   13.92
76.           10.0    31.0    56.0    54.0     2.0     9.0    11.0    173.
ONLY          17.6    32.5    53.2    37.5     5.1    17.3     9.8
             0.080   0.134   0.148   0.203   0.055   0.073   0.158
CHI-SQUARE   16.72

77.           12.0    34.0    52.0    23.0     4.0    21.0    15.0    151.
NO            16.4    30.3    49.5    34.9     4.7    16.1     9.1
             0.096   0.147   0.138   0.086   0.111   0.171   0.216
CHI-SQUARE   11.23

78.           20.0    76.0    59.0    15.0     2.0    15.0     4.0    192.
MUST          19.5    36.1    59.1    41.6     5.6    19.2    10.9
             0.160   0.329   0.156   0.056   0.055   0.122   0.058
CHI-SQUARE   68.70

79.           36.0    29.0    54.0    77.0     0.0    18.0     2.0    216.
WATER         21.9    40.6    66.4    46.8     6.4    21.6    12.2
             0.288   0.125   0.143   0.289     0.0   0.147   0.029
CHI-SQUARE   49.59

80.            1.0    15.0    48.0    29.0     5.0    94.0     6.0    198.
ALSO          20.1    37.2    60.9    42.9     5.8    19.8    11.2
             0.008   0.065   0.127   0.109   0.138   0.766   0.066
CHI-SQUARE  319.80

81.           13.0    17.0    57.0    61.0     0.0    15.0    11.0    174.
FIRST         17.7    32.7    53.5    37.7     5.1    17.4     9.8
             0.104   0.074   0.151   0.229     0.0   0.122   0.158
CHI-SQUARE   28.95

82.           16.0    22.0    40.0    36.0     4.0    12.0    12.0    142.
VERY          14.4    26.7    43.7    30.8     4.2    14.2     8.0
             0.128   0.095   0.106   0.135   0.111   0.098   0.173
CHI-SQUARE    4.50

83.           10.0    34.0    47.0    38.0     0.0    22.0     6.0    157.
GOOD          15.9    29.5    48.3    34.0     4.6    15.7     8.9
             0.080   0.147   0.124   0.143     0.0   0.179   0.086
CHI-SQUARE   11.49

84.           23.0    31.0    77.0    27.0     0.0     4.0     6.0    168.
HIM           17.1    31.6    51.7    36.4     4.9    16.8     9.5
             0.184   0.134   0.204   0.101     0.0   0.033   0.086
CHI-SQUARE   32.90

85.            7.0    76.0    63.0     1.0     1.0    10.0     2.0    160.
SAME          16.3    30.1    49.2    34.7     4.7    16.0     9.1
             0.056   0.329   0.167   0.004   0.028   0.081   0.029
CHI-SQUARE  122.51

86.           14.0    11.0    39.0    35.0     8.0    25.0     5.0    137.
COULD         13.9    25.8    42.1    29.7     4.0    13.7     7.8
             0.112   0.048   0.103   0.131   0.221   0.204   0.072
CHI-SQUARE   23.89

87.            0.0    41.0     5.0    22.0     8.0    11.0     8.0     95.
WHO            9.6    17.9    29.2    20.6     2.8     9.5     5.4
               0.0   0.177   0.013   0.083   0.221   0.090   0.115
CHI-SQUARE   70.98

88.           22.0    41.0    45.0     9.0     2.0     4.0     9.0    132.
ANY           13.4    24.8    40.6    28.6     3.9    13.2     7.5
             0.176   0.177   0.119   0.034   0.055   0.033   0.129
CHI-SQUARE   37.59

89.           14.0    24.0    42.0    25.0     4.0    13.0     4.0    125.
BECAUSE       12.8    23.7    38.8    27.3     3.7    12.6     7.1
             0.112   0.104   0.111   0.094   0.111   0.106   0.058
CHI-SQUARE    2.00

90.           16.0    23.0    59.0    22.0     3.0    13.0     8.0    144.
SEE           14.6    27.1    44.3    31.2     4.2    14.4     8.1
             0.128   0.099   0.156   0.083   0.083   0.106   0.115
CHI-SQUARE    8.85

91.           11.0    39.0    28.0    20.0     5.0    20.0    10.0    133.
LIKE          13.5    25.0    40.9    28.8     3.9    13.3     7.5
             0.088   0.169   0.074   0.075   0.138   0.163   0.144
CHI-SQUARE   19.58

92.            9.0    40.0    40.0    10.0     4.0    17.0     9.0    129.
MUCH          13.1    24.3    39.7    28.0     3.8    12.9     7.3
             0.072   0.173   0.106   0.038   0.111   0.138   0.129
CHI-SQUARE   24.76

93.           12.0    21.0    25.0    25.0     0.0    16.0    10.0    116.
PEOPLE        11.8    21.8    35.7    25.2     3.4    11.6     6.6
             0.096   0.091   0.066   0.094     0.0   0.130   0.144
CHI-SQUARE   10.13

94.            6.0    27.0    33.0     6.0     1.0     2.0    29.0    104.
CALLED        10.6    19.6    32.0    22.6     3.1    10.4     5.9
             0.043   0.117   0.087   0.023   0.028   0.016   0.417
CHI-SQUARE  115.95

95.            7.0    18.0    15.0    34.0    13.0    17.0     7.0    111.
PLACE         11.3    20.9    34.1    24.1     3.3    11.1     6.3
             0.056   0.078   0.040   0.128   0.360   0.138   0.101
CHI-SQUARE   49.11

96.           13.0    14.0    36.0    31.0     2.0    25.0     2.0    123.
THROUGH       12.5    23.1    37.8    26.7     3.6    12.3     7.0
             0.104   0.061   0.095   0.116   0.055   0.204   0.029
CHI-SQUARE   21.84

97.            5.0    17.0    29.0    49.0     1.0    24.0    11.0    136.
WORK          13.8    25.6    41.8    29.5     4.0    13.6     7.7
             0.040   0.074   0.077   0.184   0.028   0.195   0.158
CHI-SQUARE   37.00

98.           30.0    16.0    50.0    45.0     6.0     9.0     7.0    163.
NEW           16.6    30.7    50.1    35.3     4.8    16.3     9.2
             0.240   0.069   0.132   0.169   0.166   0.073   0.101
CHI-SQUARE   24.66

99.           11.0    21.0    20.0    12.0     2.0     8.0    11.0     85.
SMALL          8.6    16.0    26.1    18.4     2.5     8.5     4.8
             0.088   0.091   0.053   0.045   0.055   0.065   0.158
CHI-SQUARE   14.01

100.           6.0     9.0    34.0    34.0     1.0    20.0    13.0    117.
OVER          11.9    22.0    36.0    25.4     3.4    11.7     6.6
             0.048   0.039   0.090   0.128   0.028   0.163   0.187
CHI-SQUARE   27.44

TABLE XXXX

DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES
ACROSS THE SUBJECT AREAS OF GRADE TEN

THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT:
FREQUENCY
EXPECTED FREQUENCY
RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT

RANK                            S U B J E C T S
WORD             B       C       D       E       F       G       H   TOTAL

1.           478.0   606.0     0.0     0.0   520.0  1230.0  1755.0   4589.
THE          591.9   661.7     0.0     0.0   549.3  1205.6  1580.4
             6.248   7.085     0.0     0.0   7.324   7.893   8.591
CHI-SQUARE   47.90

2.           254.0   190.0     0.0     0.0   264.0   659.0  1005.0   2373.
OF           306.1   342.2     0.0     0.0   284.0   623.4   817.3
             3.320   2.221     0.0     0.0   3.718   4.229   4.920
CHI-SQUARE  123.12

3.           183.0   276.0     0.0     0.0    99.0   320.0   580.0   1458.
AND          188.1   210.2     0.0     0.0   174.5   383.0   502.1
             2.392   3.227     0.0     0.0   1.394   2.054   2.839
CHI-SQUARE   75.84

4.           189.0   214.0     0.0     0.0   233.0   416.0   347.0   1399.
A            180.5   201.7     0.0     0.0   167.5   367.5   481.8
             2.470   2.502     0.0     0.0   3.282   2.670   1.699
CHI-SQUARE   70.91

5.           239.0   193.0     0.0     0.0   178.0   356.0   461.0   1427.
TO           184.1   205.8     0.0     0.0   170.8   374.9   491.5
             3.124   2.257     0.0     0.0   2.507   2.285   2.257
CHI-SQUARE   20.33

6.           143.0   166.0     0.0     0.0   146.0   367.0   577.0   1399.
IN           180.5   201.7     0.0     0.0   167.5   367.5   481.8
             1.869   1.941     0.0     0.0   2.056   2.355   2.825
CHI-SQUARE   35.66

7.           146.0    48.0     0.0     0.0   191.0   258.0   192.0    836.
IS           107.8   120.5     0.0     0.0   100.1   219.6   287.9
             1.903   0.561     0.0     0.0   2.690   1.656   0.940
CHI-SQUARE  178.45

8.            98.0    94.0     0.0     0.0   130.0   180.0   133.0    635.
THAT          81.9    91.6     0.0     0.0    76.0   166.8   218.7
             1.281   1.099     0.0     0.0   1.631   1.155      0.
6 5 1 C h l - S Q U A K E 7 6 . 2 0 9 . 6 7 . 0 1 1 9 . 0 0 . 0 0 . 0 5 7 . 0 1 4 4 . 0 1 1 8 . 0 5 0 5 . I T 6 5 . 1 7 2 . 8 0 . 0 0 . 0 6 0 . 4 1 3 2 . 7 1 7 3 . 9 0 . 8 7 6 1 . 3 9 1 0 . 0 0 . 0 0 . 8 0 3 0 . 9 2 4 0 . 5 7 8 C H I - S Q U A K E 4 8 . 4 8 1 0 . 7 9 . 0 1 6 . 0 0 . 0 0 . 0 7 6 . 0 1 4 3 . C 8 4 . 0 3 9 3 . A R E 5 1 . 3 5 7 . 4 0 . 0 0 . 0 4 7 . 6 1 0 4 . 6 1 3 7 . 1 1 . 0 3 3 0 . 1 8 7 0 . 0 0 . 0 1 . 0 7 0 0 . 9 1 8 0 . 4 1 1 C H I - S Q U A K E 9 6 . 3 2 T H E T H R E E L I N E S O F F I G U R E S F O R E A C H E N T R Y R E P R E S E N T : F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S S , O F F R E Q . T O T O T A L N O . O F W O R D S I N S U 8 J E C T to T A B L E X X X X D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E S U B J E C T A R E A S O F G R A D E T E N R A N K S U B J E C T S WORD B C D E F G H T O T A L 11. 99.0 75.0 0.0 0.0 33.0 94.0 159.0 460. F O R 59.3 66.3 0.0 0.0 55.1 120.8 158.4 1.294 0.877 0.0 0.0 0.465 0.603 0.778 C H I - S Q U A R E 42 .46 12. 126.0 . 50.0 0.0 0 .0 68.0 91.0 19.0 354. Y D U 45.7 51.0 0.0 0.0 42.4 93.0 121.9 1.647 0.585 0.0 0.0 0.958 0.584 0.093 C H I - S Q U A R E 243.79 i 13. 73.0 31.0 0.0 0.0 58.0 112.0 77.0 351. B E 45.3 50.6 0.0 0.0 42.0 92.2 120.9 0.954 0.352 0.0 0.0 0.817 0.719 ol 377 C H I - S Q U A R E 50.84 14. 48.0 61.0 0.0 0.0 64 .0 129.0 183.0 485. A S 62.6 69.9 0.0 0.0 53.1 127.4 167.0 0.627 0.713 0.0 0.0 0.901 0.828 0.896 C H I - S Q U A R E 6.60 15. 37.0 18.0 0.0 0.0 38.0 71.0 42.0 206. OR 26.6 29.7 0.0 0.0 24.7 54.1 70.9 0.484 0.210 0.0 C O 0.535 0.456 0.206 C H I - S Q U A R E 33.00 T H E T H R E E L I N E S OF F I G U R E S F O R E A C H E N T R Y R E P R E S E N T : F R E C U E N C Y E X P E C I E O F R E O U E N C Y R A T I O A S % , O F F R E Q . T O T O T A L N O . 
O F WORDS I N S U B J E C T T A B L E X X X X D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E S U B J E C T A R E A S O F G R A D E T E N R A N K S U B J E C T S WORD B C D E F G H T O T A L 16. 28.0 70.0 0.0 0.0 25.0 101.0 93.0 317. W I T H 40.9 45.7 0.0 0.0 37.9 83.3 109.2 0.366 0.813 0.0 0.0 0.352 0.648 0.455 C H I - S Q U A R E 27.55 17. 55.0 59.0 C O 0.0 43.C 73.0 130.0 360 ON 46.4 51.9 0.0 0.0 43.1 94.6 124.0 0.719 0.690 0.0 0.0 0.606 0.468 0.636 C H I - S Q U A K E 7.76 18. 58.0 27.0 0.0 0.0 61.0 92.0 140.0 378 T H I S 48.8 54.5 0.0 0.0 45.2 99.3 13C.2 0.758 0.316 0.0 0.0 0.859 0.590 0.685 C H I - S O U A R E 22.40 19. 39.0 34.0 0.0 0.0 36.0 93.0 144.0 346 B Y 44.6 49.9 0.0 0.0 41.4 90.9 119.2 0.510 0.398 C O C O 0.507 0.597 0.705 C H I - S Q U A K E 11.71 20. 6.0 127.0 0.0 0.0 21.0 68.0 191.0 413 WAS 53.3 59.6 0.0 0.0 49.4 108.5 142.2 0.078 1.485 C O 0.0 0.296 0.436 0.935 C H I - S Q U A R E 166.53 T H E T H R E E L I N E S O F F I G U R E S F O R E A C H E N T R Y R E P R E S E N T : F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S % , O F F R E Q . T O T O T A L N U . O F W O R O S I N S U B J E C T TA8LE X X X X D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 H O S T F R E O U E N T W O R D T Y P E S A C R O S S T H E S U B J E C T A R E A S O F G R A D E T E N R A N K S U B J E C T S W O R D B C D E F G H T O T A L 2 1 . 5 1 . 0 1 7 8 . 0 0 . 0 0 . 0 1 9 . 0 4 6 . 0 1 9 . 0 3 1 3 . H E 4 0 . 4 4 5 . 1 0 . 0 0 . 0 3 7 . 5 8 2 . 2 1 0 7 . 8 0 . 6 6 7 2 . 0 8 1 0 . 0 0 . 0 0 . 2 6 8 0 . 2 9 5 0 . 0 9 3 C H I - S Q U A k E 4 9 2 . 1 5 2 2 . 2 5 . 0 3 5 . 0 0 . 0 0 . 0 2 4 . 0 9 5 . 0 1 0 3 . 0 2 8 2 . F R O M 3 6 . 4 4 0 . 7 0 . 0 0 . 0 3 3 . 8 7 4 . 1 9 7 . 1 0 . 3 2 7 0 . 4 0 9 0 . 0 0 . 0 0 . 3 3 8 0 . 6 1 0 0 . 5 0 4 C H I - S Q U A R E 1 3 . 4 3 2 3 . 8 8 . 0 2 6 . 0 0 . 0 0 . 0 5 6 . 0 6 9 . 0 5 2 . 0 2 9 1 . H A V E 3 7 . 5 4 2 . 0 0 . 0 0 . 
0 3 4 . 8 7 6 . 5 1 0 0 . 2 1 . 1 5 0 0 . 3 0 4 0 . 0 0 . 0 0 . 7 8 9 0 . 4 4 3 0 . 2 5 5 C H I - S Q U A R E 1 1 0 . 7 1 2 4 . 4 1 . 0 6 2 . 0 0 . 0 0 . 0 2 2 . 0 8 2 . 0 8 0 . 0 2 8 7 . A T 3 7 . 0 4 1 . 4 0 . 0 0 . 0 3 4 . 4 7 5 . 4 9 8 . 8 0 . 5 3 6 0 . 7 2 5 0 . 0 0 . 0 0 . 3 1 0 0 . 5 2 6 0 . 3 9 2 C H I - S Q U A R E 1 9 . 3 1 2 5 . 2 1 . 0 1 9 . 0 0 . 0 0 . 0 4 6 . 0 7 6 . 0 8 9 . 0 2 5 1 . W H I C H 3 2 . 4 3 6 . 2 0 . 0 0 . 0 3 0 . 0 6 5 . 9 8 6 . 4 0 . 2 7 4 0 . 2 2 2 0 . 0 0 . 0 0 . 6 4 8 0 . 4 8 8 0 . 4 3 6 C H I - S Q U A K E 2 2 . 2 5 T H E T H R E E L I N E S O F F I G U R E S F U R E A C H E N T R Y R E P R E S E N T : F R E Q U E N C Y E X P E C T E D F R E Q U E N C Y R A T I O A S S , O F F R E Q . T O T O T A L N O . U F W O R D S I N S U B J E C T T A B L E X X X X D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 1 0 0 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E S U B J E C T A R E A S O F G R A D E T E N R A N K S U B J E C T S W O R D B C D E F G H T O T A L 2 6 . 2 3 . 0 2 1 . 0 0 . 0 0 . 0 4 9 . 0 6 7 . 0 6 7 . 0 2 2 7 . O N E 2 9 . 3 3 2 . 7 0 . 0 0 . 0 2 7 . 2 5 9 . 6 7 8 . 2 0 . 3 0 1 0 . 2 4 6 0 . 0 0 . 0 0 . 6 9 0 0 . 4 3 C 0 . 3 2 8 CHI7SOUARE 2 5 . 6 0 2 7 . 3 1 . 0 4 3 . 0 0 . 0 0 . 0 4 6 . 0 5 6 . 0 2 9 . 0 2 0 5 . N O T 2 6 . 4 2 9 . 6 0 . 0 0 . 0 2 4 . 5 5 3 . 9 7 0 . 6 0 . 4 0 5 0 . 5 0 3 0 . 0 0 . 0 0 . 6 4 8 0 . 3 5 9 0 . 1 4 2 C H I - S Q U A K E 5 0 . 2 7 2 8 . 4 0 . 0 4 . 0 0 . 0 0 . 0 2 7 . C 5 5 . 0 2 2 . 0 1 4 8 . C A N 1 9 . 1 2 1 . 3 0 . 0 0 . 0 1 7 . 7 3 8 . 9 5 1 . 0 0 . 5 2 3 0