UBC Theses and Dissertations

Computer generated corpus and lexical analysis of English language instructional materials prescribed… Edwards, Peter 1974

A COMPUTER GENERATED CORPUS AND LEXICAL ANALYSIS OF ENGLISH LANGUAGE INSTRUCTIONAL MATERIALS PRESCRIBED FOR USE IN BRITISH COLUMBIA JUNIOR SECONDARY GRADES

by

PETER EDWARDS
B.A., University of Western Australia, 1963
B.Ed., University of Western Australia, 1967
M.A., University of British Columbia, 1972

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF EDUCATION
in the Department of READING EDUCATION, FACULTY OF EDUCATION

We accept this thesis as conforming to the required standard

Adviser
External Examiner

THE UNIVERSITY OF BRITISH COLUMBIA
October, 1974

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Faculty of Education
The University of British Columbia
Vancouver 8, Canada
ABSTRACT

The major purpose of the study was to develop a representative sample of the natural language text from textbooks prescribed for use in British Columbia junior secondary grades, organize the sample for computer processing and analysis, describe the word and sentence characteristics of the samples, and determine the distribution of words and sentence lengths across grades, subjects, and subjects within grades. A stratified sampling model based on grades, subject areas, and textbooks produced 469 samples of approximately 500 running words of natural language each, drawn from thirty-seven textbooks in seven subjects, to capture a Corpus of approximately a quarter million words (tokens). Computer programs were then developed to produce word lists based on the Corpus and the corpora organized by grade, subject, and textbook. A number of hypotheses related to the distribution of frequently occurring word-types and sentence lengths across the various corpora were tested, and a model was applied to aid in selecting a sub-set of lexically significant vocabulary from the word lists.

The results of the analysis of lexical characteristics indicated that Grade 9 makes significantly greater demands in terms of volume of vocabulary (word-types) than either Grades 8 or 10. However, no apparent pattern across grades emerged in the repeat rate frequency of word-types (Yule's characteristic K). Considerable lexical diversity was exhibited across subjects and textbooks. Analysis by subjects revealed great variation in type and token distribution, with the most striking differences in redundancy in the samples from English textbooks and, to some extent, those from Home Economics and Commerce.

Similar results were obtained in applying Yule's K as a measure of the repeat rate frequency for sentence lengths. Samples from English textbooks, again, exhibited exceptional variability in sentence length and variety. These results were further substantiated by the analysis of other measures of variability based on computation of standard deviations, coefficients of variation, Pearson's skew factor and, to a lesser degree, average number of sentences per 500 word sample. In all instances, samples organized by grade level tended to mask the real inherent variability exhibited by samples organized by subjects.

Chi-square analyses of the distribution of the 100 most frequently occurring words in the representative sub-set of samples revealed significant variations across grades, subjects, and subjects within grades. Little uniformity of word and sentence distribution was exhibited, and further analysis based on the distribution of selected sentence lengths substantiated that gross grade level groupings again masked the inherent variability in the separate subjects. The style and content of prescribed subject print materials are therefore instrumental in affecting the frequency of occurrence of the most common words and of sentence lengths in textbooks.

The utility of the word lists produced in the study was demonstrated by applying an elimination technique, based on omission of the most frequently occurring words and the relatively rare words, to identify a sub-set of significant vocabulary for use in developing word lists based on samples from texts in subject areas.

The major conclusion of this study is that print materials prescribed for use in junior secondary grades exhibit marked variability even when examined on the most straightforward linguistic characteristics such as word and sentence frequency. It is suggested that the variability would be even more pronounced if analyses were developed based on other syntactic and semantic variables. The study suggests that reading instruction to maximize learning from print materials would best be organized by subjects, with subject area and reading specialist expertise combined in developing materials. Such instruction should be based on materials organized by subjects within grades rather than across separate gross grade groupings.

ACKNOWLEDGEMENTS

I wish to express my sincere gratitude to the following people for their role in the completion of this dissertation.

Dr. E.G. Summers, my adviser and thesis supervisor, for his help in initially defining the problem, and for his guidance, counsel, patience, and encouragement during the preparation of the dissertation.

The members of my thesis committee, Dr. E. Bentley, Dr. J. Catterson, Dr. Br. L. Courtney, and Dr. D. Pratt, for their support, advice, and many fruitful suggestions made throughout the course of the study.

Dr. J. Bormuth of the University of Chicago, for the time he spent examining the dissertation and the helpful suggestions he made.

Mr. J. Coulthard, Mr. A. Miller, Inger Nissen and Irene Amiraslany of the Computing Centre, for their help and advice during the computer programming and key-punching of the material used in the study. I would especially like to thank Allan Miller for his enthusiasm and expertise in developing the numerous computer techniques and programs required during the study.

Finally, I would like to thank Dr. E.N. Ellis of the Vancouver School Board, for his help and kindness in obtaining the textbooks used in the dissertation.

TABLE OF CONTENTS

CHAPTER I  THE PROBLEM
    BACKGROUND OF THE PROBLEM
    STATEMENT OF THE PROBLEM AND PURPOSE OF THE STUDY
    TASKS, QUESTIONS AND HYPOTHESES
    SIGNIFICANCE OF THE STUDY
    DEFINITION OF TERMS
    LIMITATIONS
    OVERVIEW OF THE STUDY

CHAPTER II  REVIEW OF THE LITERATURE AND CONCEPTUAL FRAMEWORK
    INTRODUCTION
    WORD LISTS AND THEIR ROLE IN READING RESEARCH
        The Development of Word Lists
        Word Lists and Content Materials
        Word Lists and Readability
    COMPUTER TECHNOLOGY IN LANGUAGE RESEARCH
    RESEARCH INTO THE READABILITY OF INSTRUCTIONAL MATERIALS
        Readability Formulas
            Important language variables
            Recent readability formulas
            New trends in research
        The Cloze Procedure
            A description of Cloze
            Important linguistic variables
            Cloze and readability
    DETERMINING SIGNIFICANT CONTENT MATERIAL
    SUMMARY

CHAPTER III  THE RESEARCH DESIGN
    THE PILOT STUDY
    TASK 1. SELECTION OF MATERIALS
        Sampling Procedures
    TASK 2. INPUT PROCESSING, KEY PUNCHING AND EDITING
        Text Corrections
    TASK 3. PRODUCTION OF THE CORPUS
    TASK 4. PRODUCTION OF WORD LISTS
    TASK 5. DESCRIPTION AND ANALYSIS OF LEXICAL CHARACTERISTICS
    TASK 6. DESCRIPTION AND ANALYSIS OF SENTENCE CHARACTERISTICS
    TASK 7. ANALYSIS OF DISTRIBUTION OF 100 MOST FREQUENT WORD-TYPES
    TASK 8. ANALYSIS OF DISTRIBUTION OF SELECTED SENTENCE LENGTHS
    TASK 9. IDENTIFICATION OF SIGNIFICANT CONTENT MATERIAL

CHAPTER IV  ANALYSIS OF THE DATA AND FINDINGS
    TASKS, QUESTIONS AND HYPOTHESES
        Task 1
        Task 2
        Task 3
        Task 4 (4.1, 4.2)
        Task 5 (5.1, 5.2)
        Task 6 (6.1, 6.2, 6.3)
        Task 7 (7.1, 7.2)
        Task 8
        Task 9 (9.1, 9.2, 9.3)
    SUMMARY
        Computers and Language Analysis

CHAPTER V  DISCUSSION, CONCLUSIONS, AND RECOMMENDATIONS
    DISCUSSION OF MAIN FINDINGS
        Tasks 1 and 2: Sampling and Processing Procedures
        Task 3: Production of the Corpus
        Task 4: Production of Word Lists
        Task 5: Lexical Characteristics
        Task 6: Sentence Characteristics
        Task 7: Common Words
        Task 8: Selected Sentence Lengths
        Task 9: Elimination Technique
    CONCLUSIONS
    RECOMMENDATIONS

BIBLIOGRAPHY

APPENDIXES

LIST OF TABLES

I  A SUMMARY OF WORD LISTS: 1921-1972
II  THE TWENTY-ONE, 500 WORD SAMPLES USED IN THE PILOT STUDY
III  NUMBER OF TEXTS AND SAMPLES FOR EACH GRADE LEVEL AND SUBJECT AREA
IV  NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR GRADE LEVELS AND THE CORPUS
V  NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS ACROSS GRADE LEVELS
VI  NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF GRADE 8
VII  NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF GRADE 9
VIII  NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF GRADE 10
IX  NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF EACH GRADE LEVEL OF THE CORPUS
X  NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE THIRTY-SEVEN TEXTS
XXI  MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE SUBJECT AREAS OF GRADE 8
XXII  MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE SUBJECT AREAS OF GRADE 9
XXIII  MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE SUBJECT AREAS OF GRADE 10
XXIV  MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE THIRTY-SEVEN TEXTS
XXV  PEARSON'S SKEW FACTOR FOR EACH GRADE LEVEL, THE CORPUS, AND SUBJECTS ACROSS THE CORPUS
XXVI  PEARSON'S SKEW FACTOR FOR SUBJECTS IN EACH GRADE LEVEL
XXVII  PEARSON'S SKEW FACTOR FOR EACH TEXT
XXVIII  K FACTORS (SENTENCES) FOR EACH GRADE LEVEL, THE CORPUS, AND SUBJECTS ACROSS THE CORPUS
XXIX  K FACTORS (SENTENCES) FOR SUBJECTS WITHIN GRADE LEVELS
XXX  K FACTORS (SENTENCES) FOR EACH TEXT
XXXI  CHI-SQUARE ANALYSIS OF THE 100 MOST FREQUENT WORD-TYPES IN THE CORPUS ACROSS GRADES, SUBJECTS, AND SUBJECTS WITHIN GRADES
XXXII  SUMMARY OF CHI-SQUARE ANALYSIS OF 100 MOST FREQUENT WORD-TYPES IN THE CORPUS ACROSS GRADES, SUBJECTS, AND SUBJECTS WITHIN GRADES
XXXIII  CHI-SQUARE ANALYSIS OF SELECTED SENTENCE LENGTHS FOR THE GRADES, SUBJECTS ACROSS GRADES, AND SUBJECTS WITHIN GRADES
XXXIV  NUMBER AND PERCENTAGE OF WORD-TYPES ELIMINATED BY POINT A (50% CUTOFF OF TOKENS) AND POINT B (10% CUTOFF OF TOKENS)
XXXV  NUMBER AND PERCENTAGE OF WORD-TYPES BETWEEN POINT A AND POINT B (40% OF TOKENS) FOR THE CORPUS, GRADES, AND SUBJECTS ACROSS GRADES
XXXVI  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD-TYPES ACROSS THE GRADE LEVELS OF THE CORPUS
XXXVII  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD-TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS
XXXVIII  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD-TYPES ACROSS THE SUBJECT AREAS OF GRADE 8
XXXIX  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD-TYPES ACROSS THE SUBJECT AREAS OF GRADE 9
XXXX  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD-TYPES ACROSS THE SUBJECT AREAS OF GRADE 10
XXXXI  DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE GRADE LEVELS OF THE CORPUS
XXXXII  DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE SUBJECT AREAS OF THE CORPUS
XXXXIII  DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE SUBJECT AREAS OF GRADE 8
XXXXIV  DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE SUBJECT AREAS OF GRADE 9
XXXXV  DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE SUBJECT AREAS OF GRADE 10

LIST OF FIGURES

1. PRODUCTION OF VOLUMES C.G. AND C.S. OF THE CORPUS
2. PRODUCTION OF WORD LISTS: VOLUMES C.V., G.V., S.V., S.G.V., AND T.V.
3. MODEL OF A WORD FREQUENCY DIAGRAM
4. APPLICATION OF "ELIMINATION TECHNIQUE" TO THE MODEL OF A WORD FREQUENCY DIAGRAM
5. WORD FREQUENCY DIAGRAM OF THE CORPUS
6. APPLICATION OF THE "ELIMINATION TECHNIQUE" TO THE WORD FREQUENCY DIAGRAM OF THE CORPUS
7. GRAPHS OF SENTENCE LENGTH DISTRIBUTION (7.1 TO 7.66)
8. WORD FREQUENCY DIAGRAMS (8.1 TO 8.11)

LIST OF APPENDIXES

A  INDEX OF TEXTS AND SAMPLES BY GRADE LEVEL
B  SAMPLE SIZES IN ALPHABETICAL ORDER AND ASCENDING RANK
C  COMPUTER FILES AND PROGRAMS USED IN THE STUDY
D  ALPHABETICAL LISTING OF CORPUS VOCABULARY (SAMPLE)
E  RANK LISTING OF CORPUS VOCABULARY (SAMPLE)
F  ASCENDING AND DESCENDING ORDER OF CORPUS VOCABULARY (SAMPLES)
G  SENTENCE LENGTH DISTRIBUTION OF THE CORPUS (SAMPLE)
H  GRAPHS OF SENTENCE LENGTH DISTRIBUTION
I  CHI-SQUARE RESULTS OF DISTRIBUTION OF 100 MOST COMMON WORD-TYPES
J  CHI-SQUARE RESULTS OF DISTRIBUTION OF SELECTED SENTENCE LENGTHS
K  WORD FREQUENCY DIAGRAMS (GRAPHS)

CHAPTER I

THE PROBLEM

BACKGROUND OF THE PROBLEM

Research into the various constituents of the reading act suggests that the development of reading proficiency is a cumulative, life-long learning process requiring continued refinement. Few would argue with the statement that, "The ability to read well constitutes one of the most valuable skills a person can acquire. Our world is a reading world" (Bond and Tinker, 1967). The use of written language gives man a permanent, external memory in striking contrast to the more ephemeral memory of the spoken word. In fact, reading instruction could be defined as the socially planned, guided or aided establishment of competency in dealing with the external print memory system of man.

Unlike certain characteristics that are genetic in origin, reading proficiency is generally an acquired skill which must be relearned by each generation. Thus, developing a basic skill in understanding printed language has been a basic objective in man's educative processes since early recorded history (Dodds, 1967). Einstein suggested that the ability to read was mankind's most amazing feat. In an address to the 1972 convention of the International Reading Association, Jerome Bruner (1972) noted that the capacity to extract meaning from the printed page may represent the ultimate stage in the evolution of homo sapiens. Comprehending information presented in the printed page is the result of complex interaction between physiological and psychological processes still but vaguely understood.

As technology has increased, the demand for literacy has increased. Reading, once the exclusive concern of the elementary school, has become an important area of study in secondary schools. Evidence of its importance is readily available from a variety of sources.

An information base of research on reading has existed for seventy-five years. Such research has explored topics related to: sequencing and developing the process of reading, the products or skills of reading, language development as it relates to reading, the pedagogy of reading instruction at all levels, and the special problems of the disabled reader (Robinson et al, 1967; Summers et al, 1968, 1967, 1968). Roughly 40 percent of the reported research relates to reading beyond the elementary level.

A significant trend in secondary reading in recent years has been the attention given to more systematic development of reading abilities, including the organization of special reading programs, increased emphasis on instruction in reading as it relates to subject classes, and provision of special services for students with serious reading problems (Davis, 1952; Robinson et al, 1960; Summers, 1963, 1964, 1966, 1967; Artley, 1968, 1970; Farr et al, 1970; Hill and Bartin, 1971). In addition, the volume of interest in secondary reading is evidenced by the over fifty information sources cited in a recent bibliographical guide indexing sources for secondary reading (Summers et al, 1973). There has also been a massive outpouring of offerings from over 200 North American publishers of instructional materials specifically designed for use in developing student reading, ranging from preschool through college and adult levels.

The recent interest in secondary reading instruction reflects some old and some new facets. A facet that reflects more interest is that of focussing on linguistic analyses of instructional print material. The most notable studies which have generated word lists based on school materials are those by Thorndike and Lorge (1944), Rinsland (1945), Carroll et al (1971), and Harris and Jacobson (1972). Although based on adult materials, the project by Kucera and Francis (1967) reported the first results of a massive million word, computer-based linguistic analysis, and generated techniques and results applicable to any analysis of school based materials. The study employed a reasonably extensive sampling from a corpus which represented more general linguistic settings, and has also been relevant to the study of word and sentence lengths in instructional materials.

In recent years, computer technology has provided an important tool for reading research, and has transformed the use of natural language text by analyzing and organizing corpora, developing word frequency counts, and more significantly, enabling the analysis of masses of linguistic data across and within sizable bodies of text materials. Computerized data bases facilitate numerous and varied statistical analyses and comparison of the sub-components of a corpus, and make it possible to study the linguistic characteristics of written language. The advantages of the use of computer technology in research are aptly illustrated by Kucera (1969).

Since any useful analysis of language usage has to be based on a large body of textual material, even elementary information could be obtained, before the advent of computers, only with enormous labor. Let us imagine that one wished to determine some very basic lexical properties of a textual corpus containing a million running words. If this were to be done by hand (or, more accurately, by the human brain), the task would require an inordinate amount of time; each of the one million words would have to be inspected individually, and each new word recorded after first checking to make sure it had not already been noted. If the analysis were also to preserve information about the frequency of occurrence of individual words, or perhaps references to the pages or lines of the text where their occurrences were to be found, the assignment would become more formidable still..... Linguists and lexicographers alike have found in the computer a new and useful tool that has not only made the analysis of language less time-consuming but has also opened new insights into important problems in language usage.

New avenues to research in the readability of instructional materials have also been opened by an upsurge in interest in the linguistics of written language. There is great interest in frequency counts and co-occurrence, positional criteria based on the placement of an item within a text, syntactic criteria based on structural relationships between items, and semantic criteria depending on the particular area of discourse and on the larger context within which a given text is placed. As Robinson (1971) pointed out:

"We must study our language before we generate approaches to reading instruction.... we need to learn more about the patterns of specific letters in words, sentence patterns, and the overall organizational patterns of our language."

In a recent article, Jenkinson (1970) analyzed information gaps related to research in reading and outlined vital areas needing further study, including "problems inherent in the language of reading materials." Jenkinson concluded her discussion by stressing the need for "further analysis of the language of textbooks..." The work of Smith (1964) was an attempt to subjectively identify patterns of writing in different subject areas and to relate these to the reading and study skills necessary for comprehension, including skills which were common to the subject areas. The analysis included texts in Literature, Social Studies, Science and Mathematics in Grades 7 through 12.

If instructional materials selected for use in a school program reflect society's major communication sources through print, then it becomes increasingly important to ask, "What are the linguistic characteristics of the print materials prescribed for use with students in current Canadian secondary schools?" Answers to this query may well form the basis for reading instruction that is better adjusted to the real reading ability and needs of secondary students.

This question provides the basis for the present study, which was undertaken as a contribution to research reflecting old and new trends in two main areas: 1) the analysis of certain linguistic characteristics of a sample of natural language text which forms the basis of an existing secondary school curriculum, and 2) the application of computer techniques to the analysis of natural language text.
STATEMENT OF THE PROBLEM AND PURPOSE OF THE STUDY

Numerous studies reported in the literature of linguistics and education illustrate problems in describing and comparing the language content of printed instructional materials. This study attempts to describe and compare certain linguistic features across and within grades and subject areas in a specific sample of print materials, comprising the materials prescribed for use in the various subject areas of Grades 8, 9 and 10 in British Columbia, through the development of a model utilizing computer technology. The basic purposes were to generate a corpus of natural language and to make various linguistic analyses and comparisons (involving word frequency and sentence lengths) of the total body of print and its sub-components by applying the power of computer storage and programming techniques.

The ideal study of the linguistic characteristics of print encountered by Grade 8, 9 and 10 students would draw samples from all possible print sources the student comes in contact with, including regular textbooks, supplementary materials, reference sources, and perhaps even samples of spoken language. However, this study is limited to a single print component and concentrates on the readily available language samples from the textbooks that junior secondary school students are most likely to encounter during their years in subject classes. The study consisted of analysis of a systematically selected, carefully defined sample of approximately a quarter million running words of natural language sampled from texts prescribed for use in subjects in Grades 8, 9 and 10 in British Columbia schools.

TASKS, QUESTIONS AND HYPOTHESES

A complete description of printed instructional materials would be built around variables identified in a fully developed and explicitly defined theory of natural language comprehension. However, such a theory has yet to be developed. Innumerable factors relate to the comprehensibility of print materials. In a recent seminal study, Bormuth (1969) analyzed text materials and identified 169 isolated linguistic variables that relate to and correlate with comprehensibility and readability.

The state-of-the-art is such that it is not yet possible to explicate a complete theoretical account of the comprehension process, determine which linguistic variables in print materials correlate most closely with comprehension difficulty, and develop straightforward pedagogical procedures based on manipulable variables that consistently influence student comprehension of instructional materials. The development of a fully explicated, high level predictive scientific theory of language comprehension will no doubt emerge gradually. Until a fully developed theory can be used to generate studies, descriptive and comparative research based on diverse materials can provide important insights into the comprehension process and contribute evidence related to factors in instructional materials which may influence comprehension. Such descriptive research may produce results which can increase the effectiveness of teaching and the knowledge students acquire from written materials.

In this study, the focus is on describing and comparing word and sentence characteristics of instructional materials prescribed for use in seven secondary subject areas in British Columbia schools.

The descriptive and comparative questions are framed on a stratification model allowing data to be organized to answer questions and test hypotheses based on: the total sample, grade levels (Grades 8, 9, and 10), subject areas (seven subjects), subjects within grades (eighteen subjects), and textbooks (thirty-seven prescribed texts).

Word-types are identified, and frequency counts, length in graphic characters and relative frequency of occurrence of individual words indicated. The repeat rate frequency of words is described and compared with other word characteristics. Sentence characteristics are described and the occurrence of selected sentence lengths is compared across grade levels and subject areas. Finally, an elimination technique, based on a decision theory model, is proposed as an aid in identifying the most significant content vocabulary for subject areas from word lists derived from samples of printed materials.

The study is organized into nine tasks. Tasks 1 to 4 were designed to organize the input data into a total Corpus and sixty-five corpora and to develop the word lists and summary tables necessary for the various analyses. Tasks 5 and 6 were designed to produce descriptive statistics and comparisons of the Corpus and corpora. Tasks 7, 8, and 9 were designed to produce comparative analyses of selected linguistic features of the Corpus and the corpora. The nine tasks and their related questions and hypotheses follow.

Task 1. Develop a Corpus to represent natural language text based on instructional materials prescribed for use in the subject areas of British Columbia junior secondary grades.

Task 2. Organize the Corpus of materials for computer input and manipulation.

Task 3. Generate two volumes of the Corpus: one organized by grade levels and one organized by subject areas, each with a descriptive index.

Task 4. Organize the samples into word lists for the Corpus, the grade corpora (3), the subject corpora (7), the subject within grade corpora (18), and the textbook corpora (37).

4.1  For each of the above, provide an alphabetical and a rank order (descending frequency) listing of word-types to give the following information.

    4.11  The frequency of occurrence of each word-type.

    4.12  The cumulative percentage frequency of each word-type.

    4.13  The relative frequency of occurrence of each word-type per 1,000 tokens.

    4.14  The descriptive statistics for the rank order lists of the Corpus and corpora including: X, FX, SUM FX, FX * X, SUM FX * X, CUM % FX * X. (A full explanation of these terms is given in Chapter III).
samples based other occurrence significant to organize sixty-five of technique, most derived from such comparisons described elimination features rate frequency and subject areas. identifying lists frequency and developed to and and the the produce features of the Corpus and the their and related questions hypotheses follow.

Task 1. Develop a Corpus to represent natural language text based on instructional materials prescribed for use in the subject areas of British Columbia junior secondary grades.

Task 2. Organize the Corpus of materials for computer input and manipulation.

Task 3. Generate two volumes of the Corpus: one organized by grade levels and one organized by subject-areas, each with a descriptive index.

Task 4. Organize the samples into word lists for the Corpus, the grade corpora (3), the subject corpora (7), the subject within grade corpora (18), and the textbook corpora (37).

4.1 For each of the above, provide an alphabetical and a rank order (descending frequency) listing of word-types to give the following information.

4.11 The frequency of occurrence of each word-type.

4.12 The cumulative percentage frequency of each word-type.

4.13 The relative frequency of occurrence of each word-type per 1,000 tokens.

4.14 The descriptive statistics for the rank order lists of the Corpus and corpora including: X, FX, SUM FX, FX * X, SUM FX * X, CUM % FX * X. (A full explanation of these terms is given in Chapter III.)
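The listing described in 4.11 to 4.14 can be sketched in a few lines of Python. This is only an illustration, not the program used in the study: the function names and the sample sentence are hypothetical, and reading X as a frequency value and FX as the number of word-types with that frequency is an assumption, since the thesis defers the definition of those terms to Chapter III.

```python
from collections import Counter


def frequency_list(tokens):
    """Rank-ordered rows: (word_type, frequency, cumulative %, per 1,000 tokens)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    # Descending frequency; ties are broken alphabetically for a stable listing.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    rows = []
    running = 0
    for word_type, freq in ranked:
        running += freq
        rows.append((word_type, freq,
                     100.0 * running / total,   # cumulative percentage frequency
                     1000.0 * freq / total))    # relative frequency per 1,000 tokens
    return rows


def frequency_spectrum(tokens):
    """Map each frequency X to FX, the number of word-types occurring X times.

    Assumption: this is one plausible reading of the X / FX notation.
    """
    counts = Counter(tokens)
    return dict(sorted(Counter(counts.values()).items()))


# Hypothetical eleven-token sample.
sample = "the cat saw the dog and the dog saw the cat".split()
rows = frequency_list(sample)
spectrum = frequency_spectrum(sample)
```

Sorting the same counts by word-type instead of by frequency would give the alphabetical listing that 4.1 also asks for.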
4.2 Construct two summary tables for each of the sixty-six word lists, indicating the word frequency figures in descending order (highest frequency first) and in ascending order (hapax legomena first).

Task 5. Generate comparative and statistical analyses of the lexical characteristics of the Corpus and the corpora based on the data produced in Tasks 1 through 4.

5.1 What are the lexical characteristics of the Corpus; the Grade 8, 9 and 10 corpora; each of the seven subject area corpora across Grades 8, 9 and 10; each of the corpora for subjects within Grades 8, 9 and 10; and each of the thirty-seven textbook corpora in terms of: total number of graphic characters, average number of graphic characters, tokens, and discrete word-types?

5.2 What are the characteristics, in terms of repeat-rate frequency of words (Yule's K), for the Corpus and corpora defined in 5.1 above?

Task 6. Generate comparative and statistical analyses of sentences and sentence lengths for the Corpus and the corpora, based on the data produced in Tasks 1 through 4.
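The repeat-rate measure named in 5.2 can be illustrated with a short sketch. The thesis does not state the formula at this point, so the code assumes the standard form of Yule's Characteristic K, K = 10^4 (sum of X^2 * FX - N) / N^2, where N is the number of tokens and FX is the number of types occurring X times. The function name and sample are hypothetical.

```python
from collections import Counter


def yules_k(items):
    """Yule's K for a sequence of tokens (or of sentence lengths).

    Assumed formula: K = 10**4 * (sum(X**2 * FX) - N) / N**2.
    Higher values indicate more repetition; K is 0 when nothing repeats.
    """
    counts = Counter(items)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("Yule's K is undefined for an empty sample")
    spectrum = Counter(counts.values())          # X -> FX
    second_moment = sum(x * x * fx for x, fx in spectrum.items())
    return 1e4 * (second_moment - n) / (n * n)


# Hypothetical eleven-token sample: the x4, cat x2, dog x2, saw x2, and x1.
sample = "the cat saw the dog and the dog saw the cat".split()
k_words = yules_k(sample)
```

Because the function only counts repeated items, the same call could be applied to a list of sentence lengths rather than word tokens.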
6.1 What are the sentence-length characteristics of the Corpus; the Grade 8, 9 and 10 corpora; each of the seven subject area corpora across Grades 8, 9 and 10; each of the corpora for subjects within Grades 8, 9 and 10; and each of the thirty-seven textbook corpora in terms of: average number of sentences; mean, median and modal sentence length in words; standard deviation; coefficient of variation; and Pearson's skew factor for sentence lengths?

6.2 Produce a set of graphs to illustrate each of the sixty-six sentence length distributions for the Corpus and corpora defined in 6.1 above.

6.3 What are the characteristics, in terms of repeat-rate frequency of sentence-lengths (Yule's K), for the Corpus and corpora defined in 6.1 above?

Task 7. Generate comparative and statistical analyses of the distribution of the 100 most frequently occurring word-types of the Corpus across the three grade levels, the seven subject areas, and the eighteen subject areas within the three grade levels.

7.1 Test the following null hypotheses.

Hypothesis 1. There are no significant differences in the actual distribution of the 100 most frequent word-types of the Corpus when compared to the expected distribution of each word-type for:

Hypothesis 1.1 the three grade levels of the Corpus,

Hypothesis 1.2 the seven subject areas of the Corpus,

Hypothesis 1.3 the subject areas within Grade 8,

Hypothesis 1.4 the subject areas within Grade 9,

Hypothesis 1.5 the subject areas within Grade 10.
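A comparison of actual with expected distributions, as in Hypothesis 1, is commonly made with a chi-square goodness-of-fit statistic. The sketch below assumes that test, which this section does not name, and assumes that the expected count of a word-type in each grade corpus is proportional to the size of that corpus. All counts and names are hypothetical.

```python
def chi_square_statistic(observed, corpus_sizes):
    """Return (statistic, expected counts) for one word-type across corpora.

    Expected counts share out the word-type's total occurrences in
    proportion to corpus size (an assumption made for illustration).
    """
    total_observed = sum(observed)
    total_size = sum(corpus_sizes)
    expected = [total_observed * size / total_size for size in corpus_sizes]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected


# Hypothetical counts of one word-type in the Grade 8, 9 and 10 corpora.
observed = [120, 100, 80]
corpus_sizes = [1000, 1000, 1000]      # hypothetical token totals per grade

statistic, expected = chi_square_statistic(observed, corpus_sizes)

# Critical value of chi-square at the 0.05 level with 2 degrees of freedom.
CRITICAL_05_DF2 = 5.991
reject_null = statistic > CRITICAL_05_DF2
```

With these invented figures the statistic exceeds the critical value, so this word-type would be counted among those that differ significantly across grade levels.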
7.2 Investigate and describe the number of word-types which differ significantly in their distribution across each of the areas tested in 7.1.

Task 8. Do the sentence length distributions of the seven subject areas differ from the sentence length distribution of the Corpus? This task involves testing the following null hypotheses.

Hypothesis 2. There are no significant differences in the actual distribution of short, average, and long sentences when compared to the expected distribution of each of the sentence lengths for:

Hypothesis 2.1 the three grade levels of the Corpus,

Hypothesis 2.2 the seven subject areas of the Corpus,

Hypothesis 2.3 the subject area corpora within Grade 8,

Hypothesis 2.4 the subject area corpora within Grade 9,

Hypothesis 2.5 the subject area corpora within Grade 10.

Task 9. Develop an "elimination technique" for selecting the most significant content words in a word list using the ranked frequency lists developed for the Corpus, the three grade level corpora, and the seven subject area corpora.

9.1 Produce a set of graphs to illustrate the word frequency by rank of the Corpus, the three grade level corpora, and the seven subject-area corpora.

9.2 What is the effect of eliminating the highest frequency words and the lowest frequency words from the total spectrum of words for each of the corpora stated in 9.1?
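The idea behind 9.2 can be shown with a minimal sketch: remove the most frequent word-types and the rarest ones from a ranked list and keep what remains. The cut-off parameters `top_n` and `min_freq`, the function name, and the sample tokens are all hypothetical; the thesis develops its own elimination procedure in later chapters, and this code does not reproduce it.

```python
from collections import Counter


def residual_pool(tokens, top_n=2, min_freq=2):
    """Word-types left after dropping both ends of the frequency range.

    top_n    -- how many of the highest frequency word-types to eliminate
    min_freq -- word-types occurring fewer times than this are eliminated
    Both cut-offs are illustrative assumptions, not values from the study.
    """
    counts = Counter(tokens)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    high_frequency = {word for word, _ in ranked[:top_n]}
    return [(word, freq) for word, freq in ranked
            if word not in high_frequency and freq >= min_freq]


# Hypothetical token sample from a science text.
tokens = (["the"] * 6 + ["of"] * 5 + ["cell"] * 3 + ["energy"] * 2
          + ["mitosis", "osmosis"])
pool = residual_pool(tokens)
```

In this toy sample the function words fall out at the top and the single-occurrence words at the bottom, leaving a small residual from which content words could then be chosen.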
9.3  Can t h e r e s i d u a l o f words r e m a i n i n g a f t e r e l i m i n a t i n g t h e h i g h and low f r e q u e n c y words described i n 9.2 serve as a p o o l f o r s e l e c t i n g t h e most u s e f u l c o n t e n t words f o r t h e C o r p u s , t h e t h r e e g r a d e level corpora,  13  and t h e seven s u b j e c t -area corpora based on r e l a t i v e frequency of subjective criteria?  through analyses occurrence and  SIGNIFICANCE OF THE STUDY The the  comprehension  longer and  linguistic  sentences, is  also  generally  considered  to  word-types,  and  word  graphologically comprehension  research  related  quantity,  teaching  t h e a r r a n g e m e n t o f words  been  example,  t h a n do  comparisons o f the r e l a t i v e  freguency  the  word-types  reader  in  different  types  and  is  factor  measured  that  both  Increasingly,  stimulated  by  enabling  results Now  that  labor,  facilitating multiple  with  influences  phonologically  c a n be p r e d i c t i v e recent  t o such f a c t o r s i s s h a r p e n i n g ,  further research.  bodies of material  i n t e r m s o f l e x i c a l and  i n s y l l a b l e s or l e t t e r s ,  providing  has  with i t .  another  length,  of  since the  the p r o b a b i l i t y the reader  of material,  difficulty.  and  shorter  deriving  technology has g r e a t l y minimized has  For  in  the greater  loading  c o m p r e h e n s i o n . Even  of  complicate  t o come i n c o n t a c t  vocabulary  structural  and  material.  h a s s i g n i f i c a n c e f o r l e a r n i n g and t e a c h i n g ,  an o p p o r t u n i t y  The  influence  of printed  influence  discrete  more a word i s u s e d , had  discourse  i n t e r m s o f word and s e n t e n c e r e p e t i t i o n  Furthermore,  occurrence of  discourse  of written  demands upon memory i n r e a d i n g  Redundancy  information. 
of  difficulty  sentences  make g r e a t e r  components  both  language i n design  implications  for  t h e a d v e n t o f computer such  language  research  enquiry  b a s e d on l a r g e  comparisons across  diverse  14 sub-components. This written  study  language  specific  in  and  to other  turn  can  several  unique f e a t u r e s .  models  The  study  represents  characteristics  materials  p r e s c r i b e d f o r use province.  reflect  the  demands  Canadian students dated  word l i s t s The  data  developing  supply  needs  and  limited  instruction coping  i n the  with  to  by  the  for  be  study made  materials  and  answering from  of  schools  emanating from  study  the  from  of  the  most  a  study  used of  by the  those  c o u l d have s i g n i f i c a n c e of  school  they  The  would  data  have  speakers  in  instructional  m a t e r i a l s . The  of students,  native  with  are  of  reading m a t e r i a l s being  information  E n g l i s h as a s e c o n d the  which  In  instructional  secondary  distinct  the  base.  analysis  language  lists  authors  using  ability. both  extensive  of  use.  capabilities  for  produced  populations  the  1970*s as  w r i t e r s with  D a t a from analyses  in  currently in  reading  be  English  real  teachers  the  first  of  of  generated  and  the i n - p u t d a t a  will  vocabulary  guidelines  materials would  The  about  d e r i v e s i t s major s i g n i f i c a n c e  the  lexical  Canadian  information  become u s e f u l t o o l s i n r a i s i n g  questions.  study  of a l l , a u s e f u l p o o l  extensive  idiosyncratic  further  The  and  first  m a t e r i a l s s e r v i n g as  programs  generalizable  provide,  samples  Corpus of  addition,  and  will  need  word  to  meet  particulary value  lists  in  o f E n g l i s h and  those  the of  planning students  language. 
permit  existing  a number o f word l i s t s  correlational and  word-graded  15 reading  tests  Researchers lists  in  without  sources.  now  The  the  readability  could  make use o f t h e word  on d a t a  of  a readily  development  outmoded  Canada.  or  foreign  a c c e s s i b l e , fundamental  when needed.  the s t u d y  could  secondary  from  and  the  related  word  be u s e d t o f u r t h e r r e s e a r c h instructional  of  both  materials  standardized  t e s t s f o r placement, e v a l u a t i o n of student  into  and  as  and i n f o r m a l progress  and  o f program e f f e c t i v e n e s s .  Improved  the  disciplines  from  statistics  the  estimation  reading  throughout  t h a t c a n be c o n s u l t e d  sentence  reading  use  provide  samples o b t a i n e d  in  extensive  to rely  results  and  input  related  having  The  compilation  in  teaching  comprehension  study.  methods  of the subject-areas  materials  important  p r o c e d u r e s and t h e  geared  t o the unique  s u b j e c t - a r e a . The  results  which  aid  determining  greatly  instructional  materials  of  and  students  in  outcome  would e m p h a s i z e  demands o f each would  instruction i n  d i f f e r e n c e s between t h e b a s i c  develop a l t e r n a t i v e teaching instructional  facilitate  c o u l d be a v i t a l l y  Significant  characteristics  to  in  reflect the  vocabulary  identification  language  t h e need t o  utilization  of  word and s e n t e n c e  could  provide  to  what  data extent  that i s within of  of  reach  words o f s p e c i a l  importance needing s p e c i a l a t t e n t i o n i n teaching.  The  study  findings,  however.  
The study has potential impact beyond its specific findings, however. The model designed for the study makes extensive use of natural-language computer technology and could readily be adapted to facilitate much larger studies, or conversely the model could be used to examine very small units of material. The computer programs generated by the study could be applied in the analysis of other idiosyncratic populations of printed materials. Finally, computer technology was used extensively to produce the dissertation itself: to develop, edit and format the word lists, tables and graphs from raw data, and to print the final copy. Thus the study could serve as a representative model in developing other research projects involved with the computer processing of natural language text.

DEFINITION OF TERMS

For the purposes of the study a number of definitions were developed.

Character
    A letter, digit, or other symbol that is used to organize, control, or represent data.

Coefficient of Variation
    A method of measuring the rate at which sentence types move away from the mean.

Computing Centre Dollar (CC$)
    Used in accounting by the Computer Centre. A CC$ represents an amount of computing resources which costs the University of British Columbia $1.00 to provide.

Conversational Terminal
    A typewriter-like device which enables a user to communicate with MTS.

Corpus
    The total body of 235,107 tokens of natural language text based on the 469 five-hundred-word samples across thirty-seven textbooks prescribed for use in the subject areas of Grades 8, 9 and 10 in British Columbia.
Disk
    A computer storage device used in MTS for sequential and line file storage, batch queue storage, and paging.

File
    Used with MTS to refer to collections of related information residing on direct access devices.

Magnetic Tape
    A storage medium which permits the recording of data as a series of magnetized spots.

MTS
    The Michigan Terminal System designed to run on an IBM model computer.

Pearson's Skew Factor
    A method of measuring the skewness of a distribution curve.

Sentence
    A number of tokens, the first beginning with a capital letter and the last ending with a period, question mark, or an exclamation mark, followed by a blank space or a pair of quotation marks.

Token
    An individual occurrence of a word-type.

Word
    A continuous string of characters bounded left by a blank space and delimited by a blank space or one of the following characters • - ( ) " ; : , ?a>/$#+%=!_0.

Word Type
    A "distinct word" representing a set of identical individual words (tokens).

Yule's Characteristic K
    A method of determining the rate of repetition (word-types or sentences) in a passage of print.

LIMITATIONS

There are three main limitations to the findings of this study.

1. The study is restricted to the use of thirty-seven "A" issue English language textbooks prescribed for use in Grades 8, 9, and 10 in British Columbia secondary schools during 1972-73. Because of the size of the undertaking, not all of the material available was used. Instead, a sampling of between 30 and 40 percent of the prose selections was made.

2. No attempt is made to analyze the content of the material used or to validate the linguistic features utilized in the study. The main focus is on the analysis of lexical and sentence characteristics in all of their forms.

3. The study is limited by the accuracy of the various computer programs which were developed specifically for the project, as well as by the accuracy of keypunching and editing procedures employed in data preparation. A Pilot Study was used to validate the procedures and programs and minimize errors as much as possible.

OVERVIEW OF THE STUDY

The study has three major aspects: 1) the selection and treatment of the text materials used for computer input; 2) the development of the computer programs needed to generate the word-lists and other statistics; 3) the analysis and comparison of the computer generated data in relation to the questions raised and hypotheses stated.

Chapter II presents the review of related literature and the conceptual framework for the study. The design and methodology of the investigation involving the nine tasks is outlined in Chapter III. Chapter IV presents the analysis of the data and the findings of the study. Finally, Chapter V presents a summary of the results, the conclusions for the investigation, and suggests a number of implications for reading instruction and future research.

CHAPTER II

REVIEW OF THE LITERATURE AND CONCEPTUAL FRAMEWORK

INTRODUCTION

This study is concerned with the development, description, comparison, and analysis of a representative sample of printed instructional material used in secondary grades. There have been few reported studies in this area and therefore it has been necessary to make use of research at other educational levels in order to construct the conceptual framework for the present investigation. Some of the studies mentioned are based on empirical research and others report the results of descriptive research.

Although the major aspect of comprehension in reading is concerned with the relationships of phonology, syntax, and semantics, the full use of printed discourse requires the reader to deal first with words. In recent years attention has been given once again to the development of word lists for the analysis of printed materials and several new word lists have been developed (Kucera-Francis, 1967; Carroll et al, 1971; Harris and Jacobson, 1972).

A text recent has innovation been the emphasized that complexities of Harris a and use Jacobson would materials to currently printed and of related to the Finally print the print materials the Cloze and materials for which of of of research which print in to identify be students. word and 1969; in determining that the 1970). having to selected as best means comprehensibilities 1972). words have has sentence Guthrie, passages the The of lexical implications materials.
i n the to  language  instructional  factors  deleted  appears  s e n t e n c e s which can material.  of  Ramanauskas, the  gained  printed materials  constitutes  measuring  1969;  modify  of  main  words  words  a passage of  advantages  Information  technique maintain  of techniques  of  vast  research.  massive c o r p u s of  Mclaughlan,  development and  (1971)  the  language  needs o f i n d i v i d u a l  1968;  comprehension  area  of a  importance of  aspects  an  materials.  readability  the  (Bormuth,  functional  language  Alford  handle  many o f t h e  select  deleted  available  prose  in  reading  (Fry,  randomly  representative  techniques  to  two  I n v e s t i g a t o r s using replace  could  makes use  the  as  difficulty  technology.  computer  printed  stress  characteristics reading  of  into  natural  o f word a n a l y s i s o f f e r e d , e s p e c i a l l y i n  educators  Research  a n a l y s i s of  (1972) o u t l i n e d  t o meet t h e  continued  day  which  enable  a  system  research  the  computer  only  comparative s t u d i e s from  of  modern  computerized  in  linguistic  analysis  of  have p o t e n t i a l c o n c e r n s  the  the  used  » s i g n i f i c a n t ' body  t o summarize  the  of  content  22 The  p u r p o s e of t h i s  framawork these  and  areas,  make a s e l e c t i v e  including:  research;  computer  readability  of  identifying  chapter  to  review  word l i s t s technology  printed  the  is  a  conceptual  of studies that relate  and  their  and  language  materials;  "significant"  organize  and  role  in  content  in  reading  research;  techniques a  the  useful  body  to  cf  in  print  material.  WORD L I S T S AND This section of  word l i s t s ;  and  lists  d e a l s with  word l i s t s  readability.  word  to  The  section.  
research  is  treatment  printed  an  Similarly,  research,  the  will  w i l l be  third  study,  which  use taken  of  word  but  lists  generated the  presented in  use  of  in  a  readability  but  a  fuller  in part four.  vocabulary  used  s i n c e t h e 10,000 words l i s t e d  had  a  over  from  literature,  be  topic,  presented  The_reacherj_s_£ord_Book  prose  outline  t h e r o l e o f word l i s t s  under  development  i n c l u d e computer  s t u d i e s have been made o f the  made  children's  will  the  m a t e r i a l s ; and  up-to-date  m a t e r i a l s i n t h e O.S.A  Thorndike's  RESEARCH  major t o p i c s :  content  topic  READING  i n language r e s e a r c h  of r e a d a b i l i t y  Thorn d i k e ' s ,  running  first  included  Extensive  three  and  present  computer t e c h n o l o g y later  THEIR ROLE IN  a  were  great  published  impact  f o u r and  a half  variety  of  elementary  school  on  in  in in  1921.  educational  million  words o f  sources  including  texts,  commercial  23 materials, added  and t h e B i b l e .  another  collaborated of  book  pioneer  10,000  work  magazine c o n t e n t of  Thorndike there  instructional  m a t e r i a l s based  later,  the  frequency  Lorge  was  10,000 words used mainly  from  a  on  had  real  great  need  language  i n the  adult  lists  (1931) and then  a much more d i v e r s e  f r e q u e n c y . However, t h e p o i n t  additional  Thorndike  (Thorndike and L o r g e ,  and  because  compiled  to  t o produce  significance  functional  years  words  with Lorge  and  Ten  sampling 1944). The  educational  t o have s c h o o l  which  had  a  high  h a s been made t h a t t h e  Thorndike-Lorge  materials  (Harris  list  were  and J a c o b s o n ,  1972) . 
During  this  period  constructing  word  most  to  common  vocabulary isolate  (1926) d e v e l o p e d grades  in  s e r i e s of  reading  spoken  and  researchers  on t h e l a n g u a g e (1924)  were  considered  compiled  special  s c h o o l s u b j e c t s i n an a t t e m p t t o  emphasis  in  language  usage.  Gates  f o r the primary  2,500 o f t h e h i g h e s t f r e q u e n c y words i n  work, 1,000 o f t h e most f r e q u e n t words i n a readers  by  influence  10,000  expression reading  from  textbooks.  of  Pressey  other  a 1,500 word r e a d i n g v o c a b u l a r y  children's  considerable  of  mainly  fifteen  areas of  initial  frequently  number  based  children.  by s e l e c t i n g  Thorndike's  list  lists  lists  specific  a  made  1,000  young c h i l d r e n .  of  (1926)  the  The G a t e s '  on t h e v o c a b u l a r y used  Horn  words  and  in  words  word l i s t had primary  p u b l i s h e d an a d u l t  considered  to  be  i t possible  to  compare  most  basic  vocabulary  for this  grade  written  mode w i t h  and s p e a k i n g v o c a b u l a r i e s .  
Another  major  development  about  this  time  was  the  International  Kindergarten  List  o f 2,596 words c o n s i d e r e d most  widely  known by k i n d e r g a r t e n c h i l d r e n  list  of  769  International Thorndike results  easy  was p r o d u c e d  a  words i n t h e E n g l i s h  other  variables Maki,  developed  a word  language  based  220  words  by  words  common  Word L i s t and  grade  basic  1,000 words o f t h e year,  the  to assess  most o f t e n and how  their  use  (Faucett  Buckingham and D o l c h  Dolch  (1936) c o m p i l e d  (1936)  from  three sources:  a  were common a  list  o f 153 words t a k e n  important  from  list to the  of  t o p r e s c h o o l e r ' s v o c a b u l a r i e s ; the Gates'  first  English  started  in  authors  i n children's  2,596 Primary  reading;  a number o f p r i m e r s and f i r s t  major u n d e r t a k i n g t o d e v e l o p vocabulary  1945  used  i n Canadian  (Stothers,  Jackson  p o i n t e d t o t h e complete  word l i s t s  little in  were used  the  readers.  The  on  both  following  193 words which  o f 1,811 words j u d g e d  a list  to  a  on t h e word knowledge o f c h i l d r e n i n  selecting  most f r e q u e n t words t a k e n  The  influenced  G r a d e s 2 t o 6. About t h e same t i m e of  common  1931,  were p r e s e n t e d , d e s i g n e d  A few y e a r s l a t e r , list  1928) . I n  and t h e f i r s t  i n t h e language  19 32).  were  by D a l e .  complex s t u d y  which  and  which  Kindergarten L i s t  list, of  words  (West,  review  to  also  had been  textbooks  as a d i s t i n c t i v e to  and  reliance  by  or to the r o l e  reading s k i l l .  assess  the  word  carried  uncommon  lists  of vocabulary  The method  used  were t h e n  prepared:  of  that  very  vocabulary development  i n the study  o u t between  vocabulary  educators  stressed the  the  1947). 
The  Canadian  of  of  s c h o o l s was  Minkler,  paid t o the nature  a number o f s u r v e y s  O n t a r i o . Three  elementary  c o n s t r u c t e d i n t h e U.S.A. They  attention  Canadian  a knowledge  1921-1945  was and  r e a d e r s used i n  f o r Grades  1  and  25 2,  Grades  were n e x t check  3  and  4,  examined by  content  distributed In the  a  series  U.S.A of  core  by  in  and the  reading  lists  students  the  materials  5,764  school  i n 1949,  were  to  words  and  to  The  serve  vocabulary  basic core vocabulary  was  children  the f i r s t  c o n s t r u c t e d by  preparation  r e g a r d i n g the  development of r e a d a b i l i t y  lists  presented  R i n s l a n d and  facilitate  The  attempt  of  f o r elementary  vocabulary  to  an  total  D e v e l o p m e n t a l L a b o r a t o r i e s (EDL).  supplementary  addition,  a  1 t o 6 was  developed  6 respectively.  teachers  Finally  a basic vocabulary  designed  teachers  s t u d e n t s and  validity.  was  Educational were  G r a d e s 5 and  a c r o s s Grades  1945,  in  and  EDL  word  of  basic  suggested  and  books.  f o r use  levels  of reading materials  study,  150  the  lists  as a g u i d e  load of  of  in  for In the  (Taylor,  1949) .  In  the  initial  EDL  R e v i s i o n s f o l l o w e d i n 1951 nine  basal readers  lists  were based  a combination the  against list,  from  their  taken  and  (G l i s t i n g )  at  4  (1 944)  and  warranted  to  of  6  the  the  an  additional  primary  grades  intermediate  word  level  against  word«s f r e q u e n c y  measured (1944)  a word. F o r G r a d e s 7 and rechecked  Rinsland  and  the R i n s l a n d was  against  (1945) l i s t s  remainder  f o r G r a d e s 9 t o 13  1  checked  were  i t . The  the Thorndike-Lorge  The  investigated.  of t h e T h o r n d i k e - L o r g e  inclusion  Grades  the core vocabulary  survey.  
the  were  i n 1968,  knowledge o f t h e  list  the  frequency  from  and  b a s a l r e a d e r s and  (1945)  determined  Thorndike-Lorge if  on  the f r e q u e n c y  t h e words  1955,  were added t o t h e  of p u p i l s *  Rinsland  and  sources  of the  and  by  the added  words were  lists.  compiled  8,  Finally using  the  26 highest well  frequency  as  a  vocabulary  use which  number  Kucera-Francis  of  computer  was  unique  invaluable  from  Fifteen  data  a  as  bibliography  of  a 1,014,232 word  i n t h a t i t was  base  g e n r e were u s e d  the  for  adult  The  area  analyses  study to  the  in  the  were  provided use  an in  aspects  of  study  was  not designed of  randomly  each  However, t h e K u c e r a - F r a n c i s was  corpus  USA  words  lexicographical  and  made  Kucera-Francis  researchers  and  materials  or subject  only  i n the  2,000  genre.  other  phonological  from  the  m a t e r i a l p u b l i s h e d i n the  across  E n g l i s h language.  level  to compile  s a m p l e s of a p p r o x i m a t e l y  investigating  grade  at the time  selected  derived  words  (1944) l i s t  (1967) a n a l y s i s o f A m e r i c a n - E n g l i s h  techniques  year.  500  randomly  written  other  sample o f p r i n t e d  and  the T h o r n d i k e - L o r g e  improvement m a t e r i a l s .  calendar  study  of  The  selected one  words from  to  provide  material  being  treated.  Carroll, about  the  et  lexical  involving  Corpus,  9.  as  The  the  to generate  taken  some 1,000  and  from  word  of  American study  different  need  to l e a r n  language i n a massive used  by  called, lists  from  made use over  publications.  
The  talking  which  would  to i t s c h i l d r e n *  frequencies  were  listed  serve  of  5,000,000 AHI  in or  computer words  Corpus  was  judgment  'a  reflection  (Carroll  et  al,  grade l e v e l s  study  Corpus  as  by  more  students  Heritage Intermediate  was  frequency  the  t o p r o v i d e a ' c u l t u r a l frame of r e f e r e n c e f o r  comparison'  culture  emphasized  characteristics  techniques  designed  (1971)  published materials frequently  Grades 3 through AHI  al  1971).  of  the The  thus p r o v i d i n g  27 valuable  information f o r teachers  materials.  The  authors  and w r i t e r s  a l s o noted  of  t h a t word f r e q u e n c y  been u s e f u l i n h e l p i n g t o d e t e r m i n e r e a d a b i l i t y selection  of  texts  English  as a second  lists.  The  analyses  data  but  incorporated  no  characteristics  In  1972,  Harris  attempt  and  reading  widely  i n elementary  for  used  Grades  lists  six  basal  appear Core the  readers,  a  comparative  analyses  lists  provided to  during  series,  found  in  1970. The  school  textbooks  p l u s two s e r i e s o f  (English, included  a  Social General  basal  o f words f o u n d  Studies, Vocabulary  readers  i n three  and  ofthe  made up o f words which  s e r i e s o f books  used.  A  were a l s o i n c l u d e d f o r e a c h o f  (Preprimer  t h r o u g h Grade 6 ) . The a u t h o r s the  basis  for a  be made between word l i s t s , obsolescence,  levels  o f words, word f r e q u e n c y ,  c o n s t r u c t i o n such as s i n g u l a r 1972) .  
sentence were of elementary Core List elements such as content, and length word consisting of words which and an Additional List lists the of subjects The levels stated that their number series and an Additional List basal reader using series in four or more of the fourteen List vocabulary number of statistical textbooks containing common vocabulary textbooks, of published 1 to 6, six basal reader Mathematics). content the the teaching of made to examine Jacobson school for each of the core Science, was vocabularies made use of fourteen texts a and of the material used in the study. elementary study instruction, language, and the compilation Corpus data had levels of the Corpus by grade and subject area frequency length AHI for classroom instructional plural of number including difficulty, and aspects (Harris and of of word Jacobson,

A summary of the most widely known word lists developed between 1921 and 1972 is presented in TABLE I.

TABLE I

A SUMMARY OF WORD LISTS: 1921-1972

Author                          Year    Description
Thorndike                       1921    The Teacher's Word Book contained 10,000 words taken from printed materials in the U.S.A.
Gates                           1926    A Reading Vocabulary for the Primary Grades contained 1,500 words for Grades 1, 2, and 3.
Horn                            1926    A vocabulary stated to be basic for written expression.
Thorndike                       1931    Another 10,000 words added to the 1921 list.
Dale                            1931    A list of 769 easy words which were common to the International Kindergarten List and the first 1,000 words of the Thorndike List.
Buckingham & Dolch              1936    A word list based on vocabularies of children in Grades 2 to 6.
Dolch                           1936    A basic sight vocabulary of 220 words.
Thorndike & Lorge               1944    A much more diverse sampling of book and magazine content in the U.S.A. 30,000 words in the list.
Rinsland                        1945    A Basic Vocabulary of Elementary School Children. Illustrated the frequencies of 14,571 words taken from an analysis of 200,000 written papers.
Stothers, Jackson, & Minkler    1947    The first major undertaking to produce a Canadian word-list for Grades 1-6. A total of 5,764 words used.
Taylor, Frackenpohl & White     1949    A series of core vocabularies developed by the E.D.L. Revised in 1951 and 1955.
Kucera & Francis                1967    An analysis of American-English adult materials using computer techniques to generate a corpus of 1,014,232 words.
Taylor, Frackenpohl & White     1968    An additional nine basal readers were added to the 1955 revision. Lists at the primary, intermediate, and secondary levels were provided.
Carroll et al                   1971    The American Heritage Word Frequency Book. A computer-generated analysis of over 5,000,000 words taken from 1,000 different publications used in Grades 3 to 9.
Harris & Jacobson               1972    Basic Elementary Reading Vocabularies. A set of word lists at the elementary level developed by computer techniques.

Studies concerned materials lists.
Studies concerned with the vocabulary content of instructional materials have often followed the development of frequency word lists. Between 1925 and 1945 a number of researchers investigated the relationship between the vocabulary used in printed school materials in a variety of content areas and the most common words in frequency lists (Powers, 1925; Patty and Painter, 1931; Fries and Traver, 1940).

In 1952, Malsbary measured the understanding that high school students had of business and economic terms selected from the content of newspapers, journals, and newscasts. He found that there was some relationship between student understanding and the frequency of the item. Malsbary also reported that only seventy-nine of the items were understood by 50 percent of the students.

Kyte (1953) conducted a study to determine the core vocabulary required for various instructional programs. He used the 500 most common words from each of Horn's 1926 list, the Thorndike-Lorge 1944 list and the Rinsland 1945 list. A final list of 663 words was presented.

The continued reliance of educational researchers on word lists compiled several decades earlier was reflected in a series of studies made during the early 1960's. Traxler (1963) developed two forms of a fifty-item vocabulary test for high school students and college freshmen by randomly selecting items from the 10,000th to the 20,000th word in the Thorndike-Lorge list. Another research project compared the frequency of structure words found in children's and adults' written expression (Card and McDavid, 1965). The structure words were taken from a number of lists, including Horn's 1926 work, Rinsland's 1945 list, and Dewey's "Relative Frequency of English Speech Sounds" (1923). The following year another study analysed the frequency of the 122 most commonly used structure words in lists including Dewey's and Rinsland's. The results of this study related the frequency of structure words in English to the bias of the corpus (Card and McDavid, 1966).

In 1967, Jacobs compared the results of the 1936 Buckingham-Dolch word list with a study carried out in Oregon. He found that the free-association words elicited from children in Grades 2 through 6 differed significantly from those of the original study. An interesting point to note is that although more children knew the same word in 1966, they also knew fewer words than their predecessors.

In 1971, Johnson made an examination of the Dolch (1936) basic sight word list and its relationship to the Kucera-Francis Corpus. He stated that 82 of the 220 words listed by Dolch were not among the 220 most frequent words in the Kucera-Francis study. Other discrepancies were reported, and Johnson concluded that the Dolch list was no longer suitable as a measuring instrument for the vocabulary content of instructional materials in the 1970's.

An implication from the preceding discussion is the present need for extensive, analytical studies into the vocabulary of instructional materials currently used in Canadian schools. Such studies would give a description of the nature and composition of the language of reading in Canadian education in the 1970's.
Word Lists and Readability

Word lists have been used extensively in the development of readability formulas. Lively and Pressey (1923) used the Thorndike list to give a 'weighting' to materials they had selected from elementary basal readers and college science textbooks. A number of researchers used words that were not included in the Thorndike list as a variable in their work into readability (Vogel and Washburne, 1928; Washburne and Morphett, 1938; Jacobson, 1961).

Gray and Leary (1935) used Dale's 1931 word list in their readability formula, as did Lorge (1944) and Spache (1953). Spache later made use of the Stone (1956) revision of Dale's list in his formula. In 1948, Dale developed the Dale List of 3,000 words, and Dale and Chall used the list as a variable in their readability formula. A later word list by Botel (1962) was also used in readability research.

In recent years, work into the readability of print materials has made more use of language variables other than word frequency. This aspect of readability is discussed later in the chapter.

COMPUTER TECHNOLOGY IN LANGUAGE RESEARCH

In recent years there has been a growing interest in the use of computer techniques to help compile and analyze natural language materials. The studies have concentrated in two main areas: the analysis of materials in specialized areas such as library science, information science, and foreign languages; and the analysis of educational, instructional materials.
A computer-based study of general frequency word-lists in German designed for use at the college level (Siliakus, 1967) indicated that over 30 percent of the original sample text had not been covered by previously developed word-lists. The author pointed out that although most of the untreated words were very low frequency items, numerous high frequency proper nouns and cognates were also found in the portions of the text not covered. This study thus emphasized the very thorough analysis of language possible with the aid of a computer.

A later study by Johnson (1972) examined the implementation of a frequency analysis with Russian language materials on an IBM 360 computer. The procedure made use of a set of frequency groups and an algorithm for the identification and marking of main concepts.

The work of Fuellhart and Weeks (1968) examined twenty-three lexical resources in information science. This study, which made use of the IBM 360 Model 40 computer, was successful in quantifying the terminology and establishing the frequency of occurrence. However, the study also showed that there was no formal structure for the materials examined in the discipline of information science, and that the information tended to reflect the opinions of the authors about the nature of the structure.

Austin (1969) conducted an investigation into the authenticity of a piece of English literature using a computer assisted technique for stylistic discrimination. The Austin study was important in that it illustrated how frequency lists of words and other pertinent linguistic variables could be used to help determine authorship. Later research by Berkeley (1972) showed that the computer could be used to help isolate difficult terminology in a specific discipline. The computer scanned a chapter of a Navy training manual consisting of 9,800 words and classified words of two syllables or longer as either "assumed audience vocabulary" (words the audience would be expected to know), or difficult words needing further clarification. The study by Berkeley made use of a computer scanning technique and a previously defined lexicon, and is a procedure which has important implications for future language research.

The Kucera and Francis (1967) research represented the first attempt to computer-generate a general word-list through the manipulation of a massive (over 1,000,000 words) corpus. Since the Kucera-Francis work, which dealt with adult materials, educational researchers have been developing computer techniques to aid them in their investigation of instructional materials for use at the various grade levels.

A study by Cronnell (1971) developed a lexicon of 9,000 words which were taken from materials used in kindergarten to Grade 3. With the use of a computer, the 9,000 words were systematically arranged both by order of word length and by the introduction of vowels in unstressed syllables. The study was designed to aid the investigation of the spelling-to-sound correspondences needed in beginning reading.

Harris and Jacobson (1972) compiled a number of vocabulary lists taken from 127 books in elementary reading series. This computer-assisted project produced twenty-eight lists in three sets: a General Vocabulary set, a Technical Vocabulary set, and a Total Alphabetical List. The study generated approximately 17,000 word-types (after certain adjustments) from an original 80,000 unique words found in the 4,500,000 running words treated. The authors gave an excellent description of the procedures followed throughout the investigation, including the types of programs which were developed for use with the Burroughs B5500 computer. However, the study did not attempt to make statistical analyses of the material, but was designed to give a description of the language comprising elementary grade level textbooks. The study therefore provided an important basic reference and valuable information on word frequency.

A slightly different approach was presented by Durr (1973), who insisted that there was a need for an up-to-date vocabulary list at the primary level which concentrated on library books that students had selected for themselves for recreational reading. Over eighty library books which were frequently chosen by children were analyzed, and the author made a frequency list of the different word-types found in the 100,000 running words (tokens). Durr concluded from his study that there were several important implications for the teaching of reading, including the need to introduce children very early to high frequency words in their reading experience.
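The distinction drawn in these studies between running words (tokens) and word-types can be illustrated with a short sketch. The code below is only an assumption-laden illustration in Python and is not the procedure used in any of the studies cited: the tokenizing rule (lowercase alphabetic strings, with an optional apostrophe suffix) and the names `tokenize` and `frequency_list` are hypothetical choices made here for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Return the running words (tokens) of a passage.

    Assumption: a word is a run of letters, optionally followed by an
    apostrophe and more letters; case is ignored.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def frequency_list(text):
    """Return (number of tokens, number of word-types, ranked list).

    The ranked list holds (word, count) pairs ordered by descending
    frequency, with ties broken alphabetically.
    """
    tokens = tokenize(text)
    counts = Counter(tokens)
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return len(tokens), len(counts), ranked


sample = "The cat saw the dog. The dog saw the cat's toy."
n_tokens, n_types, ranked = frequency_list(sample)
print(n_tokens, n_types)   # running words and word-types in the sample
print(ranked[:3])          # the three most frequent word-types
```

In this small sample the eleven running words reduce to six word-types, which is the same kind of reduction, on a much smaller scale, as that reported when several million running words yield a list of some thousands of word-types.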
The first major study involving junior secondary materials was presented in the American Heritage Word Frequency Book and the American Heritage School Dictionary (Carroll et al, 1971). The authors stressed that the need for the study, which computer processed over 5,000,000 words taken from books frequently used in Grades 3 through 9, was necessitated by both the types of material used in schools today, and by the rapid increase of new words in the English language. The Carroll study recognized the need to include materials at the junior secondary school level, but did not carry the investigation past the descriptive stage, and dealt only with the characteristics of words in the materials used in the U.S.A.

RESEARCH INTO THE READABILITY OF INSTRUCTIONAL MATERIALS

The various statistical analyses made during this study concentrated on word and sentence characteristics as the two main language variables. The role of these variables in readability research is presented in two sections: the development of readability formulas; and work on the Cloze procedure. The discussion dealing with readability formulas traced the initial attempts by educational researchers to identify and then simplify various combinations of language variables thought to cause difficulty in reading comprehension. The section on the development of the Cloze technique as an instrument concentrated on more recent attempts by researchers to gain an understanding of the relationship of syntactic and semantic variables in the reading comprehension process.

Readability Formulas

Works by Chall (1958) and Klare (1963) summarized early research into readability and pointed to the need for greater expertise in the design of studies, the analysis of results, and the understanding of the linguistic variables involved.

In a study of sixty-six secondary school texts in use in the U.S.A., Aukerman (1965) listed five language variables which he maintained helped account for a book's readability level. Aukerman's variables were sentence length, word difficulty, abstraction, the incidence of verbals, which he termed verbal complexity, and mechanical complexity. He then constructed a table of reading difficulty for each book based on 500 word samples. He claimed only that his findings were tentative, and he stressed that empirical evidence would have to wait until he had an opportunity to engage in further research.

Bormuth (1966) had shown the relative value of counting the number of letters and syllables per word as a measure of passage difficulty. Other aspects of word difficulty which have been investigated include morphological complexity, abstractness, and words originating in Latin (Bormuth, 1967). Until recently, word frequency had not been considered a very important variable in readability analysis. However, Klare's (1968) research led him to believe that word frequency may in fact encompass most of the other word variables.

Coleman (1968) outlined a number of experiments which had studied grammatical relations and readability. He stated that in most cases, when a sentence is rewritten to make it more readable, it usually undergoes a grammatical transformation to some active verb form. Coleman illustrated this by pointing out that much prose is abstract simply because the writer chose nominalizations, the abstract noun derivatives of verbs (e.g. "operation" from "operate"). He concluded his article by suggesting that greater awareness of transformational grammar could lead to improved readability of instructional materials.

Rosenshine (1968) presented a description of several experiments in which similar passages were compared. The findings of this study suggested language variables which affected readability, including vagueness, indeterminate qualifiers, probability words, and irrelevant words and phrases. Bormuth (1968) pointed out that recent research into the readability of instructional materials had attempted to explain the correlations between language measures and reading difficulty through a detailed examination of the psychological and cognitive processes involved.
Several more recent studies have resulted in either sentence length or word difficulty being cited as the most important variables. MacGinitie and Tretiak (1969) applied Yngve's phrase structure measurement and Allen's "sector analysis" to eighty selected passages and compared the results to the Lorge Readability formula and to tests based on the same passages. Sentence length emerged as the variable most closely correlated with test scores.

In an experiment to investigate learnability (new learning from a passage) as well as the readability of text books, Guthrie (1970) used eighteen linguistic variables, including such dimensions as word length, word difficulty, sentence length, grammatical transformations, parts of speech, and certain other stimulus dimensions such as familiarity. Guthrie reported that sentence length and word difficulty were the best predictors of learnability as well as readability. He supported his findings by stating that sentence length was found to correlate .842 with Cloze scores, while word difficulty correlated .815 with multiple choice gain scores.

Recent readability formulas

Early attempts to measure the readability levels of language materials resulted in instruments which required considerable time and effort to apply. Fry (1968) developed a readability formula which used sentence length and syllables as the two variables. Fry's formula was relatively easy to apply and correlated highly with a number of existing readability formulae.

The following year an even more simplistic and purportedly more valid readability formula entitled "SMOG Grading" appeared. McLaughlin (1969), the creator of "SMOG," explained that after considerable research into the problem, he had concluded that polysyllabic words and sentence length were the best measures to use in determining the reading difficulty of materials. He explained his decision by pointing out that in his doctoral dissertation three years earlier he had shown that word length and sentence length were, respectively, the linguistic characteristics most predictive of semantic and syntactic difficulty. By noting that semantic and syntactic variables interact, he was able to reduce several measures to a mere counting of polysyllabic words. McLaughlin stated that his formula consisted of counting the number of polysyllabic words in three sets of ten consecutive sentences, finding the square root of the number obtained, and adding three. He then gave a detailed account of the validity of his instrument, and emphasized that it gave a measure of the grades needed for complete understanding of the material, in contrast to other formulas which measured a 'general understanding' only. For this reason, McLaughlin concluded, the grades obtained by "SMOG Grading" would usually be higher than those of other readability formulas in common use.
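The "SMOG Grading" arithmetic as summarized above (count the polysyllabic words in thirty sentences, take the square root, add three) can be sketched in a few lines of Python. This is a hedged illustration only, not McLaughlin's own procedure in full: the vowel-group syllable counter `count_syllables` is a rough heuristic introduced here, treating a word of three or more syllables as polysyllabic is an assumption of this sketch, and the square root is left unrounded.

```python
import math
import re


def count_syllables(word):
    """Rough syllable estimate (assumption): one syllable per vowel group."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def smog_grade(sentences):
    """Apply the counting rule described in the text to 30 sentences.

    `sentences` is expected to hold three sets of ten consecutive
    sentences, supplied as a flat list of thirty strings.
    """
    if len(sentences) != 30:
        raise ValueError("expected three sets of ten sentences (30 in all)")
    polysyllabic = sum(
        1
        for sentence in sentences
        for word in re.findall(r"[A-Za-z]+", sentence)
        if count_syllables(word) >= 3  # assumed threshold for 'polysyllabic'
    )
    return math.sqrt(polysyllabic) + 3


# Hypothetical sample: 22 simple sentences and 8 sentences that each
# contain two polysyllabic words, giving 16 polysyllabic words in all.
sample = ["The cat sat on the mat."] * 22 + ["An educational experiment."] * 8
print(smog_grade(sample))
```

A heuristic syllable counter of this kind will misjudge some English words (a silent final "e", for example), so the sketch shows the shape of the calculation rather than a validated instrument.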
Problems in research

In a well designed series of studies, Bormuth (1969) illustrated just how far research into readability was from achieving its objective. He stated that it had been anticipated that these analyses would simplify the problem of constructing the theory, and that the failure to realize this expectation was "spectacular". A main objective of Bormuth's studies was to isolate linguistic variables and to determine which features of printed language materials stood in causal relation to comprehension difficulties. The materials consisted of 330 passages, representative of ten subject areas, drawn from Grades 1 through 12. The 169 linguistic variables included eight vocabulary variables (such as letters per word, syllables per word, frequency of words, and the like); fifty syntactic structure variables (factors which indicate how the types of structures a sentence contains might influence comprehension difficulty); eleven variables of syntactic complexity (including Yngve depth, structural complexity, transformational complexity, and length); sixty-two parts of speech variables; and thirty-eight anaphora variables (including numbers of anaphoric structures, density of anaphora, and the interval between the occurrence of an anaphora and its antecedent).

A total of 94 of the 169 variables correlated significantly with passage difficulty. It is interesting to note that all 8 vocabulary variables and 8 of the 11 syntactic complexity variables were included in the number correlating significantly with passage difficulty; the remainder came from the syntactic structure, parts of speech, and anaphora variables. The high number of significantly correlated variables suggested that an overwhelming number of answers might presently be given to the question, "What accounts for the comprehension difficulty of printed materials?" In addition, variables not specified in the study could relate significantly to comprehension difficulty. Some estimates place the total number of possible variables at well over 200.

The 94 significantly related variables were also factor analyzed. Bormuth stated, "To summarize the results from factor analysis, then, a simple factor structure does not seem to underlie the variables correlating with passage difficulty".

In order to facilitate research into the readability of printed materials, Bormuth advocated the use of very large samples of words to enable rarely occurring linguistic variables to be adequately examined. Results obtained from such studies would offer valuable, valid guidance to educators concerned with the construction of instructional materials suitable for students at varying levels of reading ability. The use of computerized technology offers exciting possibilities in this regard in the near future.
The deletion of words from a passage of print materials at regular intervals ensures that both lexical and structural words will be omitted. When Cloze tests have been constructed and administered correctly, the results are said to measure the facility a student has in understanding the syntactic and semantic interrelationships of the material being read.

A description of Cloze

The Cloze technique, which was first used by Ebbinghaus as early as 1897, was developed as an instrument for measuring reading comprehension and readability by Taylor (1953). Basically the Cloze procedure involved five steps:

1. The selection of passages from the material to be evaluated,

2. The deletion of every "nth" word (usually every fifth word) and the insertion of underlined blanks of a standard length,

3. The administering of the mutilated text to students who had not read the original work,

4. The instruction to students to write in the blank spaces the words they thought had been deleted,

5. The marking of correct responses when identical items have been inserted (Bormuth, 1968).

Since the work of Taylor, there have been numerous studies into the application of Cloze as a means of measuring (a) comprehension, (b) readability, and (c) language variables. A survey of some of the studies pertaining to the latter category as it relates to the secondary school level will be presented in this section of the chapter. Much more comprehensive treatments of the Cloze procedure have been organized by Rankin (1965), Weintraub (1968), Potter (1968), Bickley, Ellington and Bickley (1970), Jongsma (1971), Bormuth (1972), and Bailey (1973).

Important linguistic variables

Louthan (1965) noted that when specific words were deliberately deleted, increased emphasis was placed on the meaning of the remaining words in context. The Cloze technique, therefore, made the reader use all the lexical and grammatical clues inherent in the language structure. In a series of experiments, Louthan deleted a variety of linguistic variables, including parts of speech, and tested to see if the use of the Cloze technique improved reading comprehension. The experimental group was compared to a control group which read material which had not been mutilated. The researcher stated that the experimental group which read the material with certain function words (a, the, whose, what, his) deleted showed the most significant gain in comprehension.

Another study attempting to discover predictors of comprehension difficulties examined words, clauses, and sentences independently (Bormuth, 1966). In a study conducted by Weaver and Bickley (1967), deletions of both structural and lexical words were used to compare the recall of language producers (the originators of written material) with that of subjects who had only read the material. It was found that originators could recall about 85 percent of the information days after writing, whereas subjects who had only read the material were less accurate. The importance of core vocabularies pertaining to specialized content areas was an obvious implication to be drawn.

Bickley, et al (1970) pointed out that research in Cloze had been conducted into the effect of sentence length on the comprehension of the reader, and that short sentences were found to be more readable than long sentences.

Cloze and readability

A number of studies designed to explore the suitability of instructional materials using the Cloze technique were reported in 1968. Weintraub (1968) reviewed several studies which showed that the Cloze technique had high reliability and validity in measuring readability. It was suggested that Cloze tests could offer valuable insights into the role of certain function words in the comprehension process.

Bormuth (1968) investigated the relationship between Cloze scores and the amount of information gained by the reader. He stated that readability level did not depend entirely on the reader's prior knowledge of the material. Bormuth's contributions to research in the Cloze technique have aided work into readability tremendously. His early efforts concentrated on the need to develop the Cloze procedure into an effective instrument to use in studies of readability. By this means, Bormuth planned to identify and operationalize those features that serve as stimuli 'for the various comprehension processes' and then move 'towards' a manner 'that is suitable for instruction.' Thus the application of the Cloze procedure would enable a greater understanding to be gained of the causal relationship between specific linguistic variables and the reading comprehension levels of secondary school students.
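The mechanical part of the Cloze procedure described earlier in this section, namely the deletion of every "nth" word and the exact-match marking of responses, can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not a reconstruction of any instrument used in the studies cited: the function names `make_cloze` and `score_cloze` are hypothetical, words are split on white space only, and case and surrounding punctuation are ignored when responses are marked.

```python
def make_cloze(text, n=5, blank="_____"):
    """Replace every nth word with a blank of standard length.

    Returns the mutilated text and the list of deleted words.
    Assumption: words are separated by white space, so punctuation
    attached to a deleted word is removed along with it.
    """
    words = text.split()
    answers = []
    for index in range(n - 1, len(words), n):
        answers.append(words[index])
        words[index] = blank
    return " ".join(words), answers


def score_cloze(answers, responses):
    """Return the proportion of blanks filled with the identical word."""
    if not answers:
        return 0.0

    def normalize(word):
        # Assumption: case and edge punctuation do not affect identity.
        return word.strip(".,;:!?\"'").lower()

    correct = sum(
        1 for answer, response in zip(answers, responses)
        if normalize(answer) == normalize(response)
    )
    return correct / len(answers)


passage = "one two three four five six seven eight nine ten eleven"
mutilated, answers = make_cloze(passage, n=5)
print(mutilated)
print(score_cloze(answers, ["Five", "tan"]))
```

Only identical words are marked correct, following step 5 of the procedure; a scheme that accepted synonyms would require a separate marking rule.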
develop  to i d e n t i f y  of 'the  between  efforts  Cloze procedure of t h e and  to  His early  d i d not  linguistic  causal relationship  levels  of  depend This  other the  concentrated  By  effective  this  means,  features that  Thus  the  have a i d e d  i n t o an  o p e r a t i o n a l i z e those  would e n a b l e  the  material.  work  readability.  for instruction.•  the  by  Bormuth's  the Cloze procedure  in studies  could  f u n c t i o n words i n  importance.  serve  then  processes  the  to in a  application  a greater understanding between s p e c i f i c  comprehension  among  in  process.,  gained  p r i o r knowledge o f t h e role  validity  f o r t h e v a r i o u s c o m p r e h e n s i o n p r o c e s s e s ' and  'towards  the  of  the  scores  the  manner t h a t i s s u i t a b l e of  aspects  showed  that Cloze  t o r e s e a r c h i n the C l o z e t e c h n i q u e  work i n t o r e a d a b i l i t y need  suggested  and  t h e amount of i n f o r m a t i o n  the reader's  suggest  the  into  further  of  were r e p o r t e d  s e v e r a l s t u d i e s which  high r e l i a b i l i t y  investigated  l a n g u a g e s t r u c t u r e was  on  suitability  m a t e r i a l s u s i n g the C l o z e t e c h n i q u e  Weintraub  the  t o e x p l o r e the  to  linguistic  secondary  school  46 An  excellent  summary  of  experiments  technique  to  determine  readability  levels  children  and  adults  presented  Potter  to  his  discussion  on  Pottar  mentioned  content  words may  purposes. a  Geyer  student's  also  to  was the  that  (19 68)  readability  level  results  the  of  comprehension  to  Kulm  English.  that  material.  that  rely  on  had  Kulm  are  English.  work  The  procedure level  of  was  Ramanauskas examples  of  material  components b u t  with  the  the  there  on  that  existing  and  sentence  Houska  instrument  to  An  interesting  conducted with  some o f  second  sample.  
Ramanauskas  second  sample,  as  measured  an  identical the  by  that  reducing  language  measure  mathematical  that  the  the  was  experiment  at  offered using  and  the by two  semantic  rearranged  readability  Cloze  readability  Education  syntactic  of  formulas  tc  with  the  and  readability  approach  sentences  argued  ten  length  determine  that  mathematical  readability  showed  The  (1969)  of  the  use  (1971)  by  Hater  least  effect  easier  showed  improved  at  to  an  study  readability were  a  comprehension.  materials in Industrial  who  and  specialized  at  improved  of  appropriate  level.  (1972)  in  a significant  a viable  school  rewritten  sentence complexity.  that  of  function  for  C l o z e as  significantly  maintained  instruction  secondary  aspect be  not  of  studies,  and  word d i f f i c u l t y  readability  the  comprehend s o c i a l s t u d i e s c o n t e n t  measured  Kulm r e p o r t e d  variables the  (1971)  addition  of  result  and  of  for  predictor  would  not  Cloze  materials  scoring  of  the  ( 1 9 6 8 ) . In  information use  materials  vocabulary d i f f i c u l t y later  the  if  latter  may  valuable  of  aspects  separate  tested  ability  determine  technical the  provide  by  using  in  readability formulas,  of  the the was  47 unchanged.  That  sentences, using  be  there  were  words, s y l l a b l e s ,  the  could  is,  Cloze  exactly  etc.  as  technique that  generally  structure  referential  (Betts  words  clues  readily  latter  distinguishable  identified fifteen  some  154  groups  was  only  by  measure of r e a d a b i l i t y  did  that  (Lefevre,  The  function Dauzat,  from  was  (e.g.  
DETERMINING SIGNIFICANT CONTENT MATERIAL

Words are generally considered to belong to one of two classifications: structure or function words, and lexical or referential words (Betts, 1965; Goodman et al, 1966). Words of the former type have structural meanings and act as clues to grammatical structure, whereas words of the latter type have lexical meanings and are readily distinguishable from structure words (Lefevre, 1964; Dauzat, 1968). Fries (1952) identified some 154 function words (e.g. a, an, at, by, of, or, what, which, very) and categorized them into fifteen groups including auxiliary verbs, conjunctions, prepositions, relative pronouns, and determiners. However, Fries did not claim that his list was exhaustive, and other writers have defined considerably more structure words.

The role of structure words was illustrated by Young (1973), who quoted a portion of Lewis Carroll's poem "Jabberwocky":

     'Twas brillig, _and the_ slithy toves
     Did gyre and gimble _in the_ wabe;
     All mimsy _were the_ borogoves,
     _And the_ mome raths outgrabe...

Young pointed out that the underscored structure words helped generate the ideas which were inherent in the nonsense sentences comprising the poem.

Rogers (1965) stated that although structure words were relatively few in number they were extremely important in language because of their dense distribution. Fries (1952) estimated that structure words accounted for over 30 percent of the words in a sample of 1,000 words, while Kucera and Francis (1967) found that high frequency structure words alone accounted for 50 percent of the words in their corpus.

The Kucera-Francis megaword corpus consisted of samples classed under fifteen genre. The study showed that the vast majority of the 100 most frequently occurring words in each genre were function words. However, the rank order of the structure words differed greatly across the various genre examined. The rank position of a structure word (except for "the", which was first in rank in all cases) was noticeably affected by the type of genre in which it occurred; "of" was the second most frequent word in ten of the genre, while "and" was second in rank in five.

An area of research which is pertinent to this study is the automatic creation of a literature abstract from an analysis of the words in a literary passage. Luhn (1958) outlined the methodology of "auto-abstracting", which involved a word-list compiled in descending order of frequency to give a relative "significance" factor for words, and then an analysis of the position of these words in sentences to determine a "significance" factor for sentences. A combination of the two measurements was then used to give each sentence a "significance" ranking. The "auto-abstract" was finally compiled from the highest ranking or most significant sentences.

Luhn defined the most "significant" words as being neither in the area of highest frequency (these words constituted the 'noise' in the system), nor in the area of low frequency where their extreme rarity would negate their relevance to the subject matter. The "significant" words therefore occur somewhere between the two points, in the middle section of the distribution of words in the material. Luhn hypothesized that it would then be possible to determine the degree of discrimination or "resolving power" of the words making up this region of the distribution. The "significance" factor for sentences was arrived at by identifying the proximity of "significant" words to one another. Sentences which had the greatest number of frequently occurring different words in close proximity to each other were classed as more "significant". These sentences were then ranked and selected on the basis of their rank to form the "auto-abstract" of the excerpt or article. An obvious drawback to the system described by Luhn was the absence of intellectual decisions made by specialists in various disciplines in making a final selection of "significant" content.
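The two "significance" measurements described for Luhn's method can be sketched in a few lines of Python. This is a minimal illustration, not Luhn's program: the function names, the cut-off parameters (`skip_top`, `min_count`) and the cluster rule (the square of the number of significant words in a cluster divided by the length of the cluster, with at most four other words between neighbours) are assumptions based on the usual description of the method and are not stated in the text above.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lower-case a sentence and split it into alphabetic words."""
    return re.findall(r"[a-z]+", sentence.lower())


def significant_words(sentences, skip_top=2, min_count=2):
    """Keep the middle band of the frequency distribution.

    The `skip_top` most frequent words are treated as 'noise' and words
    occurring fewer than `min_count` times as too rare (both cut-offs
    are hypothetical parameters).
    """
    counts = Counter(w for s in sentences for w in tokenize(s))
    noise = {w for w, _ in counts.most_common(skip_top)}
    return {w for w, c in counts.items() if c >= min_count and w not in noise}


def sentence_score(sentence, significant, max_gap=4):
    """Score a sentence by its densest cluster of significant words."""
    words = tokenize(sentence)
    positions = [i for i, w in enumerate(words) if w in significant]
    if not positions:
        return 0.0
    best = 0.0
    start = prev = positions[0]
    n = 1
    for p in positions[1:]:
        if p - prev - 1 <= max_gap:      # still inside the same cluster
            n += 1
        else:                            # close the cluster, start a new one
            best = max(best, n * n / (prev - start + 1))
            start, n = p, 1
        prev = p
    return max(best, n * n / (prev - start + 1))


def auto_abstract(sentences, keep=1, **cutoffs):
    """Return the `keep` highest-scoring sentences."""
    sig = significant_words(sentences, **cutoffs)
    ranked = sorted(sentences, key=lambda s: sentence_score(s, sig), reverse=True)
    return ranked[:keep]
```

In a real application the cut-offs would have to be chosen from the observed frequency distribution, which is where the specialist judgment that the method lacks would enter.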
A similar technique for automatically analyzing printed materials was suggested by Maron (1961), who was concerned with the automatic indexing of documents according to their subject content. Maron's thesis stated that reasonably valid predictions concerning the subject matter of documents could be made on the basis of statistics involving word frequency, word order, location, etc. The main difficulty concerned the selection of clue words which would be neither too rare to be valid predictors, nor belong to the 'logical' class of high frequency structure words which did not supply referential meaning for the material. Maron decided that the structure words which accounted for over 40 percent of the total occurrences in his study should be excluded because of their lack of information about the subject matter. Similarly, high frequency words were next excluded because of their lack of specificity for the subject. Next to be rejected were the words which occurred only once or twice in the corpus. The resulting 1,000 words were then listed and analyzed to determine which of these words were valid predictors of the subject content.

Although the present study did not attempt to make an intensive analysis of the data along the lines suggested by Luhn (1958) or Maron (1961), Task 9 in CHAPTER I suggests a possible strategy for the selection of significant content vocabulary for materials of various grade level and subject area. The corpora generated by this study also offer a readily available sample of materials for further research into the analysis of lexical content in print and for the purpose of developing techniques for the selection of significant content vocabulary from print sources.

SUMMARY

Numerous studies have been made into the vocabulary used in printed English language materials since the 1920's. Most of the studies have originated in the U.S.A. and have concentrated on word-lists designed for use in primary and elementary schools.
The development of frequency counts of words occurring in written discourse has aided researchers in the examination of both linguistic and psychological aspects of language. Vocabulary lists have traditionally been used in readability work and in other types of studies. The latest trends have seen the use of digital computers, which have allowed researchers to deal with much greater amounts of data and more diverse types of printed materials, and the use of well-designed sampling procedures based on random sampling.

Few studies have been made into instructional materials used at the secondary school level, and few have dealt with materials used in Canadian schools. Sources of word lists for the secondary level have been lacking in the past, and the need for such lists remains one of the most important in this area of research. Although a small number of studies have reported word lists organized by grades and by subject area, much of this research relies on subjective procedures, and few lists have been developed from carefully selected samples of representative print materials. Word frequency analyses have been reported, but they do not indicate repeat rate frequency or variability in frequency of occurrence for samples organized by grades or subject areas. In addition, studies of a similar nature indicating sentence length averages and sentence length variability by grades or subject areas have not been reported. Data from which to validate empirically the subjective opinions held with respect to word and sentence characteristics of samples of natural language text have therefore not been available.

Early attempts to construct readability measures resulted in formulas that required fairly lengthy computations. Later research into the most significant variables involved in readability isolated two important variables: word difficulty (often measured as word length and word frequency) and sentence complexity (measured as sentence length). Recent research has emphasized the need to look at linguistic variables which appear to have a much more closely causal relationship to comprehension difficulty. Again, the need to develop larger and more usable data bases consisting of carefully selected, representative samples of instructional materials is of prime importance. Because of the numbers of variables which need to be examined, and the complexity of the samples from natural language text involved in this type of research, data bases should be organized for computer input and processing.

The use of the Cloze technique in readability research has gained considerable attention since the late 1950's. Many researchers feel that the Cloze procedure will be able to measure many aspects of language, including the syntactic and semantic variables of grammar and the structural, lexical and connotative functions of words, and will contribute a great deal to furthering an understanding of the basic features of language. Effective Cloze research at the secondary level involves analyses of the language of instructional materials, and this also is facilitated by the availability of well organized data bases that have known word and sentence characteristics.

The development of techniques to identify significant content material in a printed passage was recommended by several researchers. Many studies have potential in the provision of methodology to generate word lists derived from samples of print materials from subject areas, but little attention is given in such studies to the identification of significant content in the lists.

In summary, the review of the related literature forms the conceptual base for the development of the empirical design, methodology and statistical techniques of the study. Adequate word lists emanate from well defined, representative and adequately selected samples of print materials. The focus of the present study is the lexical analysis of a body of print material prescribed for use in junior secondary grades. The material is represented by a stratified sample consisting of 500 word samples selected from thirty-seven textbooks, covering the seven subject areas previously discussed across three grades, or eighteen subject areas within grades. The data are organized by grades, by subjects, and by subjects within grades, and are analyzed in terms of the relative frequency of occurrence of word-types and the frequency of occurrence of various sentence lengths. A series of comparative tests is made to illustrate the pattern of occurrence of the most common words and sentence lengths.
A technique is proposed which serves as a model for the identification of the most significant content in word lists derived from subject area materials. Finally, computer technology is used throughout in the development, organization, comparison and analysis of the data base and in the production of the final printed copy of the dissertation itself.

CHAPTER III

THE RESEARCH DESIGN

This chapter describes the research design and methodology developed for the study. The study was concerned with "present-oriented" phenomena, and a descriptive survey research approach was used to describe a specific set of phenomena utilizing unobtrusive measures derived from samples of natural language text. The information derived from the samples themselves provides answers to the research questions and hypotheses posed. The research method was developed to make an accurate assessment of the incidence, distribution, and relationships of the phenomena under investigation.

The research design was organized to generate the samples of natural language text, produce the Corpus of materials, develop the computer programs, generate the various word lists, and generate the data necessary to accomplish the nine major tasks, answer the questions raised, and test the specific hypotheses of the study as outlined in Chapter I. A Pilot Study was first conducted to generate and sharpen the methodology and test procedures needed for the study. Following a description of the Pilot Study, the design and methodology for each of the nine major tasks are presented.

THE PILOT STUDY

Before commencing with the study it was necessary to make a trial run with a small sample of instructional prose material. This procedure was utilized to determine:

(a) the time needed for keypunching a set amount of running prose (10,000 words) so that an estimate could be made of the eventual size of the data base to be used in the study,
(b) the incidence of errors in keypunching the materials, to determine whether it was necessary to have the work verified by machines,
(c) the efficiency of existing programs and the need to organize additional programs to make word lists, statistical analyses, etc,
(d) the size of the samples taken from each textbook needed to give a valid representation of the content of the material,
(e) the reliability of using a random, stratified sampling technique within a textbook,
(f) the use of delimiters to determine words and sentences in the material, and
(g) the amount of data that could be feasibly analyzed within the time and resources available.
Twenty-one samples of approximately 500 words were taken from the prescribed "B" issue textbook for Agriculture, Farmer's Shop Book (See Table II). This particular text was chosen for the Pilot Study because in the judgment of the researcher the material contained a good selection of both verbal and symbolic language text likely to be found in the other content areas.

TABLE II

THE TWENTY-ONE, 500 WORD SAMPLES USED IN THE PILOT STUDY

Sample    Pages        Sample    Pages
  01      14-17          12      231-233
  02      34-36          13      261
  03      62-64          14      274
  04      78-81          15      313-316
  05      109-111        16      328-329
  06      129            17      353-355
  07      146-147        18      375
  08      161-164        19      390-391
  09      189-190        20      416-417
  10      216-220        21      433-437
  11      223-275

The rate of error was found to be less than one word per 500 words keypunched, which suggested that machine checking was not warranted. The rate of keypunching was estimated at approximately 1,000 words per hour under ideal conditions.

As a result of the Pilot Study, the following decisions were made:

(a) A Corpus of approximately 235,000 words of running prose taken from 469 samples of 500 words each was feasible for the study.
(b) The Command Operand ")P", which was interspersed throughout the text input to signal a new paragraph, was deleted from the final count of words because of its high rate of frequency of occurrence.
(c) An additional nine delimiters of a word were included to bring the total to twenty-one. These consisted of:
(d) The dictionary size established to deal with word-types was set at 20,000.
(e) A "Repeat Rate Frequency" table designed to illustrate the incidence of similarly occurring frequencies for both word-types and sentence lengths was included for each word frequency list and for the sentence length analysis.
(f) The chi-square and Yule's Characteristic "K" statistics were tested and included in the study for both word frequency and sentence length analyses.
(g) A number of additional programs were developed to enable the data to be generated in the form desired.
(h) The graphs depicting word frequency and frequency of sentence length were plotted by computer programs.
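Yule's Characteristic "K", named in decision (f), can be computed directly from a list of word tokens. The sketch below uses the standard textbook definition, K = 10,000 x (sum of m squared times V(m), minus N) / N squared, where V(m) is the number of word-types occurring m times and N is the number of tokens. The thesis does not state its formula or scaling at this point, so the definition and the function name `yules_k` are assumptions.

```python
from collections import Counter


def yules_k(tokens):
    """Yule's Characteristic K for a sequence of word tokens.

    Uses the standard definition (an assumption here, since the text
    does not give the formula): K = 1e4 * (S2 - N) / N**2, where
    S2 = sum over m of m*m*V(m) and V(m) is the number of word-types
    that occur exactly m times.
    """
    counts = Counter(tokens)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("yules_k needs at least one token")
    spectrum = Counter(counts.values())          # m -> V(m)
    s2 = sum(m * m * v for m, v in spectrum.items())
    return 1e4 * (s2 - n) / (n * n)
```

A text in which every word is different gives K = 0; the value rises as a few word-types account for more of the running words.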
TASK 1. SELECTION OF MATERIALS

The sampling procedures were developed to provide representative lexical data from sufficient quantities of natural language text. The procedure consisted of two phases: an initial subjective decision to determine the number of prescribed texts to be used, followed by a stratified, random sampling procedure to determine the number of samples to be selected within each text. Works of verse or drama were not included on the grounds that they seemed to involve special linguistic problems and did not constitute the usual syntax associated with normal prose. Passages containing special coding techniques such as shorthand and mathematics were excluded for the same reasons.

Sampling Procedures

(a) Textbooks and Samples. Thirty-seven "A" issue textbooks containing samples of English language prose of 500 words or over were included in the study. The total number of textbooks and samples for each content area is presented in TABLE III. Information pertaining to titles, authors and publishers of the books is listed in APPENDIX A.

TABLE III

NUMBER OF TEXTS AND SAMPLES FOR EACH GRADE LEVEL AND SUBJECT AREA

(Table. Rows: Commerce, English, Home Economics, Industrial Education, Mathematics, Science, Social Studies, Grade Totals. Columns: number of texts and number of samples for Grade 8, Grade 9, Grade 10, and Subject Totals. Overall totals: 37 texts and 469 samples.)

The thirty-seven "A" issue textbooks were selected for use because every student in each class or course of study receives a copy of the text. Other texts include those provided in sets to be shared by the students ("B" issue); for teacher use only ("C" issue); and prescribed or allotted for special purposes ("D" and "E" issue). These are described in the booklet, Prescribed Textbooks 1972-73 Grades I-X, published by the Department of Education, Province of British Columbia.

Eleven of the prescribed "A" issue textbooks were not included in the study. The grade level and subject area of the omitted textbooks (with the number of texts in brackets) were as follows: Grade 8 English (2), Grade 8 Social Studies (1), Grade 9 English (2), Grade 9 Mathematics (1), Grade 9 Social Studies (1), Grade 10 Commerce (1), Grade 10 English (1), Grade 10 Mathematics (1), Grade 10 Social Studies (1). The reasons for excluding the textbooks are as follows: the Commerce textbooks contained shorthand exercises; the English textbooks consisted of poetry and blank verse; the Mathematics textbooks contained mainly algebraic and geometric problems; and the Social Studies textbooks were atlases.

The textbooks used in Grade 10 Home Economics and Industrial Education were the same as those prescribed for use in Grade 9 and were included only once, because the repetition of identical material would have distorted the results obtained. A Grade 9 English textbook used in Grade 10 English was not included in the latter total for the same reason.

(b) Sampling. A total of 469 samples, each of approximately 500 words, were selected from the thirty-seven "A" textbooks (see APPENDIX A). Samples of 500 words were used both because research evidence suggested that there was greater diversity of word-types in samples of this size than in larger samples, and because they gave sufficient flexibility in the representation of content materials (Carroll et al, 1971).

The samples consisted of English language running prose and were randomly selected using a table of random numbers. Each sample began with the first complete sentence on the page selected and continued for approximately 500 words. Everything other than running prose was omitted, including titles, heads, footnotes, tables, and picture captions. Once the first sample was selected, selections continued from every twenty pages throughout the text. Two lists of sample sizes, one in alphabetical order and one in ascending rank order, are presented in APPENDIX B. These procedures produced a random sample of approximately 40 percent of the total "A" issue instructional materials prescribed for use in the seven subject areas in Grades 8, 9, and 10 in British Columbia junior secondary schools.
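The page selection and sample extraction described in the sampling procedure can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually coded for the study: a pseudo-random generator stands in for the table of random numbers, the starting page is assumed to be drawn from the first twenty pages, and a sample is assumed to end at the close of the sentence in which the 500th word falls. The names `sample_pages` and `take_sample` are hypothetical.

```python
import random
import re


def sample_pages(first_page, last_page, interval=20, rng=None):
    """Pick a random starting page, then every `interval` pages after it."""
    rng = rng or random.Random()
    start = rng.randint(first_page, min(first_page + interval - 1, last_page))
    return list(range(start, last_page + 1, interval))


def take_sample(running_prose, target=500):
    """Take whole sentences until about `target` words are collected.

    `running_prose` is assumed to start at the first complete sentence
    on the selected page, with titles, footnotes, tables and captions
    already removed.
    """
    sentences = re.split(r"(?<=[.!?])\s+", running_prose.strip())
    chosen, words = [], 0
    for sentence in sentences:
        chosen.append(sentence)
        words += len(sentence.split())
        if words >= target:
            break
    return " ".join(chosen)
```

Ending on a sentence boundary is why sample sizes are only approximately 500 words, which is consistent with the lists of sample sizes reported in APPENDIX B.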
TASK 2. INPUT PROCESSING, KEY PUNCHING AND EDITING

Once the sampling procedures were established, the selections were keypunched onto computer cards using the UBC FORMAT (FMT) program. FMT is a program which enables the rapid printing of materials in upper and lower case and with special characters directly on the printer. The material was formatted in free-form lines. Input to the system program was controlled according to the control cards and command words interspersed throughout the input. The basic command operands and special operand values of the FMT program were used to organize the format of the document and allow for most instances of special arrangement (indenting, centering, underlining, etc) usually required in a formal paper. In addition, a special symbol was placed after a period that did not signify the end of a sentence (e.g. after the period in "Dr.").

Each sample was given a code number consisting of a digit in the first character designating the grade level (1, 2, or 3 for Grades 8, 9, and 10); a letter signifying the subject area: Commerce (B), English (C), Home Economics (D), Industrial Education (E), Mathematics (F), Science (G), Social Studies (H); a two-digit number for the order of the text; the letter "C" to represent the Corpus; and another two-digit number to distinguish the sequence of samples in each text. Thus 2B 01 C 01 designated the first sample in the first textbook listed for Grade 9 Commerce, and 3H 03 C 07 represented the seventh sample in the third textbook listed for Grade 10 Social Studies.

The information on the cards was then transferred in 80 character "card-image" form to the computer and permanently stored on magnetic tape in the computer library. The equipment used was an IBM System/370 Model 168 computer with 2 megabytes of storage. The computer has five 9-channel 1600 bpi IBM 2401 tape drives, 14 ITEL 7330 disk units and four 1100-line-per-minute printers, plus a number of card readers and card punches.
A more detailed explanation of the 209 computer programs and files developed for storage, for processing the input samples, and for conducting the analyses is provided in APPENDIX C. These programs are available at the Computing Centre, University of British Columbia.

Text Corrections. Three methods of text correction were used to ensure accuracy.

1. A preliminary stage of proof-reading took place when the content of each sample had been keypunched onto cards. The cards were scanned by the writer and checked against the original text. The cards were then printed as FMT output and the print-out was again scanned for obvious errors. Verification by machine means was not used because of the small incidence of errors noted in the Pilot Study and also because of the high cost of this method.

2. The second stage of editing was made by use of the Conversational Terminal, which is an IBM 3270 Display Station consisting of a cathode-ray-tube (CRT) and a typewriter-like keyboard. The original input data were displayed on the CRT screen and scanned again for errors, corrections were made, and a revised print-out was obtained for examination. The use of the Conversational Terminal facilitated very fast proof-reading and correction of the material being processed.

3. The final stage of proofing was possible after the Corpus vocabulary had been arranged in descending order of frequency of word-types. This method of editing was by far the most efficient, for it merely entailed checking the hapax legomena (words that occurred once) and words that had occurred twice to quickly identify obvious errors. The chance of a word being incorrectly keypunched more than twice in different parts of the corpus was considered unlikely, and a quick check confirmed this belief.
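The third proofing stage amounts to listing the word-types that occur only once or twice so that they can be read by eye for keypunching errors. A minimal sketch of that idea follows; the function name and the `max_count` parameter are assumptions, and the original work was of course done with the programs listed in APPENDIX C rather than with this code.

```python
from collections import Counter


def proofing_candidates(tokens, max_count=2):
    """Return (word, count) pairs for word-types seen at most `max_count` times.

    With max_count=2 this lists the hapax legomena and the words that
    occurred twice, sorted alphabetically for manual checking.
    """
    counts = Counter(tokens)
    return sorted((word, n) for word, n in counts.items() if n <= max_count)
```

A mis-keyed form such as "teh" would appear in this short list, whereas it would be easy to miss when reading the running text.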
TASK 3. PRODUCTION OF THE CORPUS

Task 3 involved the use of existing programs and the development of new programs to generate two copies of the Corpus: one organized by subjects within each of the three grade levels, and the other organized by subjects across the Corpus. An index which listed a full description of the textbooks and the samples used in the study was also developed for each copy. The two corpora have been produced as separate volumes and are identified as CG (Corpus by Grades) and CS (Corpus by Subjects). The MTS FORMAT computer program was used to produce the print-out of these two copies of the Corpus. The production of the two copies of the Corpus is illustrated in FIGURE 1. A detailed description of the computer programs and the procedures followed to produce the Corpus is presented in a separate document by Allan Miller (1974).
(Flow chart. Extracts are keypunched onto cards with *FMT controls and corrected with the M.T.S. Editor in a repeated correction cycle to give the files AGRICULTURE, COMMERCE, ENGLISH, HOMEC, INDED, MATH, SCIENCE and SOCIALS. The *FMT U.B.C. formatting program prints these files as Vol. II, Subjects of the Corpus. All of the files above except AGRICULTURE, excluding *FMT control commands from the second and subsequent files, are passed to the SPLIT1.S program, which separates the grades into the files Grade8, Grade9 and Grade10; the *FMT U.B.C. formatting program then prints Vol. I, Grades of the Corpus.)

FIGURE 1

PRODUCTION OF VOLUMES C.G. AND C.S. OF THE CORPUS

TASK 4. PRODUCTION OF WORD LISTS

This task involved the generation of word lists based on various combinations of samples into distinctive corpora. Two sub-tasks were involved.

Task 4.1

Existing computer programs were utilized, and new programs developed where needed, to generate two word lists based on each of the following sixty-six corpora: a) the Corpus; b) three grade corpora; c) seven subject corpora; d) eighteen subject within grade corpora; e) thirty-seven textbook corpora.

The first list is an alphabetical arrangement of the word-types (See APPENDIX D). It consists of three columns and a number placed at the top of each column which provides a running total of the word-types to that point. The first column (FREQ) indicates the relative frequency per 1000 tokens of the word-type; the second column (COUNT) states the frequency of occurrence; and the third column (WORD) lists the word-type.

The second list presents the rank-order of each word-type (See APPENDIX E). The rank list also consists of three columns similar to the alphabetical list, except that the first column (FREQ) indicates the cumulative percent of the total corpus accounted for by each word-type.
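The two lists described in Task 4.1 can be illustrated with a short sketch. The column meanings (FREQ, COUNT, WORD) follow the description above; the function names, the rounding to three decimal places and the alphabetical tie-break in the rank list are assumptions made for the illustration.

```python
from collections import Counter


def alphabetical_list(tokens):
    """Rows of (FREQ per 1000 tokens, COUNT, WORD) in alphabetical order."""
    counts = Counter(tokens)
    total = len(tokens)
    return [(round(1000 * n / total, 3), n, word)
            for word, n in sorted(counts.items())]


def rank_list(tokens):
    """Rows of (cumulative percent of corpus, COUNT, WORD) in rank order."""
    counts = Counter(tokens)
    total = len(tokens)
    rows, running = [], 0
    # highest count first; equal counts are ordered alphabetically (assumed)
    for word, n in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
        running += n
        rows.append((round(100 * running / total, 3), n, word))
    return rows
```

Applied to each of the sixty-six corpora in turn, functions of this kind would yield one alphabetical list and one rank list per corpus.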
Task 4.2. Two additional tables were included for the rank-order list which summarized the rank in (a) descending order (i.e. the highest frequency word first and the hapax legomena last); and (b) ascending order (i.e. the hapax legomena first and the highest frequency word last). (See APPENDIX F).

The organization of the tables and the column headings are identical, except that the descending order table has an extra column (RANK) which provides the rank of each word-type. This arrangement makes it possible to quickly locate the rank of any word in the Corpus or the various corpora by matching the frequency of a word in either the alphabetical list or the rank-order list, under the column COUNT, with the same frequency in column X in the descending order table. In cases where the frequency of a number of word-types is the same, the rank range of these words is supplied.

The column headings in the descending and ascending order tables provide the following information.

Column X: The frequency of occurrence of tokens.
Column FX: The number of word-types of the frequency X.
Column SUM FX: The sum of word-types counting from the top of the table.
Column CUM% FX: The sum of word-types as a cumulative percentage of the total number of word-types.
Column FX * X: The number of tokens accounted for by each of the word-types.
Column SUM FX * X: The number of tokens due to the cumulative total of word-types.
Column CUM% FX * X: The previous column as a percentage of the total number of tokens.

The descending and ascending order tables facilitate a rapid analysis of the information contained in the various word lists. For example, the descending order table shows that the first 100 most frequent word-types in the Corpus account for a mere 0.610 percent of the total number of word-types. However, the same 100 word-types account for 48.973 percent of the total number of tokens. The ascending order table for the Corpus, on the other hand, shows that words occurring ten times or less account for 84.505 percent of the total word-types but only 14.705 percent of the total tokens.

The word lists and accompanying tables described in Task 4 have been organized into the following five volumes:

1) Corpus, designated as C.V. (Corpus Vocabulary);
2) Grades, designated as G.V. (Grade Vocabulary);
3) Subjects, designated as S.V. (Subject Vocabulary);
4) Subjects within Grades, designated as S.G.V. (Subjects within Grade Vocabulary); and
5) Textbooks, designated as T.V. (Textbook Vocabulary).

The production of the volumes discussed in Task 4 was accomplished by making use of a number of computer programs as illustrated in FIGURE 2.

[Flow diagram: Grade8, Grade9, Grade10 and control information -> COUNTW.S (word count program) -> SPLIT2.S (program separates grades/subjects) and SPLIT3.S (program separates texts) -> Volumes (see below). Control information -> *SORT (M.T.S. sort) -> TABL.B1.S (descending with rank) -> Tables. Control information -> *SORT (M.T.S. sort) -> TABL.B1.S (ascending without rank) -> Tables. Files: all 67 raw data files listed in APPENDIX C with the exception of AGRICULTURE (the Pilot Study). Volumes: C.V., G.V., S.V., S.G.V., and T.V.]

FIGURE 2
PRODUCTION OF WORD LISTS: VOLUMES C.V., G.V., S.V., S.G.V., AND T.V.

A complete description of the computer programs used is available on Tape #RE0616 at the University of British Columbia Computing Centre.

TASK 5. DESCRIPTION AND ANALYSIS OF LEXICAL CHARACTERISTICS

For Task 5 existing programs were utilized and new programs developed where necessary to generate a number of comparative and statistical analyses for the Corpus, the grade level corpora, the subject-area corpora, the subject within grade corpora, and the textbook corpora.

Task 5.1 was designed to determine the number of characters, tokens, word-types, and the average number of characters per token for each of the following: a) the Corpus; b) three grade corpora; c) seven subject corpora; d) eighteen subject within grade corpora; and e) thirty-seven textbook corpora. Comparative summary tables were developed for this data and are presented in Chapter IV.

Task 5.2 was designed to determine the repeat-rate frequency of word-types for each of the following: a) the Corpus; b) three grade corpora; c) seven subject corpora; d) eighteen subject within grade corpora; and e) thirty-seven textbook corpora. The tables of repeat-rate frequency for word-types for each of the sixty-six corpora are included in the five volumes of C.V., G.V., S.V., S.G.V., and T.V. (See Task 4.2). The first column (REPETITIONS) of each table gives the frequency and the second column (RATE) indicates the number of different word-types that have this frequency.
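The summary tables of Task 4.2 and the repeat-rate tables of Task 5.2 both rest on counting how many word-types share each frequency. The sketch below shows one way such a table might be built. The function name, the dictionary keys, and the toy counts are illustrative assumptions, not the study's TABL.B1.S program.

```python
from collections import Counter


def frequency_table(type_counts, descending=True):
    """Build the X / FX summary rows from a mapping of word-type -> count."""
    fx = Counter(type_counts.values())           # X -> FX (types with frequency X)
    n_types = sum(fx.values())
    n_tokens = sum(x * f for x, f in fx.items())
    rows = []
    sum_fx = 0       # running total of word-types (SUM FX)
    sum_fxx = 0      # running total of tokens (SUM FX * X)
    for x in sorted(fx, reverse=descending):
        sum_fx += fx[x]
        sum_fxx += fx[x] * x
        rows.append({
            "X": x,
            "FX": fx[x],
            "SUM FX": sum_fx,
            "CUM% FX": round(100 * sum_fx / n_types, 3),
            "FX * X": fx[x] * x,
            "SUM FX * X": sum_fxx,
            "CUM% FX * X": round(100 * sum_fxx / n_tokens, 3),
        })
    return rows


# Toy counts, invented for illustration only.
counts = {"the": 4, "cat": 2, "saw": 2, "dog": 2, "and": 1}
descending_rows = frequency_table(counts)
ascending_rows = frequency_table(counts, descending=False)
```

Reading the pairs (X, FX) alone gives the REPETITIONS and RATE columns of the repeat-rate tables; the cumulative columns give figures of the kind quoted for the first 100 word-types of the Corpus.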
Task 5.3 makes use of a statistical parameter, Yule's characteristic K, which is a characteristic of a frequency distribution based on the Poisson probability law and which draws on the basic tables of word frequencies discussed earlier. The use of Yule's characteristic K (Yule, 1944) and the underlying assumptions have been stated theoretically and tested empirically (Kucera and Francis, 1967). In brief, the K factor is said to be independent of sample size when the samples have been collected from a large body of materials.

Formula for K:

\[ K = 10{,}000 \times \frac{S_2 - S_1}{S_1^{2}} \]

where $S_1 = \sum_x f_x X$ is the first moment of the distribution about zero as origin, $S_2 = \sum_x f_x X^2$ is the second moment, and $f_x$ is the number of words occurring X times. The quantity 10,000 is introduced to avoid dealing with small decimals.

Yule's characteristic K was used to provide an indication of the concentration of vocabulary in samples from a particular area. A large K value implies a greater use of commonly occurring vocabulary or words of high frequency of occurrence. A low K value implies that the material contains a greater proportion of rare words or words of low frequency. Summary tables of Yule's K for each of the sixty-six corpora are presented in Chapter IV.
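A minimal sketch of the K calculation follows. It assumes the standard form of Yule's characteristic, K = 10,000 (S2 - S1) / S1 squared, with S1 and S2 taken as the first and second moments defined in Task 5.3. The function name and the toy counts are invented for illustration.

```python
from collections import Counter


def yules_k(type_counts):
    """Yule's characteristic K from a mapping of word-type -> count."""
    fx = Counter(type_counts.values())             # X -> number of types occurring X times
    s1 = sum(f * x for x, f in fx.items())         # first moment: total tokens
    s2 = sum(f * x * x for x, f in fx.items())     # second moment
    return 10_000 * (s2 - s1) / (s1 * s1)


# Toy counts: one very common word raises K; a text of hapax legomena gives K = 0.
concentrated = {"the": 4, "cat": 2, "saw": 2, "dog": 2, "and": 1}
all_hapax = {"one": 1, "two": 1, "three": 1, "four": 1}

k_concentrated = yules_k(concentrated)
k_hapax = yules_k(all_hapax)
```

The two toy cases show the direction of the measure: repetition of a few word-types pushes K up, while a text made up entirely of words used once yields zero.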
TASK 6. DESCRIPTION AND ANALYSIS OF SENTENCE CHARACTERISTICS

A number of existing computer programs were used and others modified for use in the development of the statistical analyses based on the sentence characteristics of the sixty-six corpora of the study. Four major sub-tasks were involved.

Task 6.1 was designed to provide sentence length characteristics, including the number of sentences, mean (average) sentence length in words, median, mode, standard deviation, coefficient of variation, and Pearson's skew factor, for each of the following: a) the Corpus; b) three grade corpora; c) seven subject corpora; d) eighteen subject within grade corpora; and e) thirty-seven textbook corpora. Comparative summary tables were developed for this data and are presented in Chapter IV.

Full details of the sentence length distribution for each of the sixty-six corpora are provided in a volume titled SENT (See APPENDIX G). The volume arranges the data for each table under five column headings. The first column (LENGTH) states the length of the sentence and the second column (REPETITIONS) gives the number of occurrences of this particular sentence length. Column three (CUM. SENT) lists a running total of sentences counting from the top of the table and the fourth column (ACCUM WORDS) serves the same function for words. Column five (% WORDS) gives the percentage of tokens accounted for throughout each sentence length distribution.

Task 6.2. A matching set of sixty-six graphs illustrating the sentence length distributions was printed through the UBC plotting package using a CALCOMP Drum Plotter at the UBC Computing Centre. The graphs are presented as APPENDIX H.

Task 6.3. This task was designed to provide data on the repeat-rate frequency of sentence lengths for each of the following: a) the Corpus; b) the three grade corpora; c) seven subject corpora; d) eighteen subject within grade corpora; and e) thirty-seven textbook corpora. The repeat-rate frequency tables for sentence lengths of the sixty-six corpora are also presented in SENT. A complete description is available on tape #RE0616 at the UBC Computing Centre.

Use was also made of Yule's characteristic K along the lines suggested in Task 5.3 (Kucera and Francis, 1967). In this procedure, X in the statement $S_1 = \sum_x f_x X$ equals the sentence length and $f_x$ equals the number of occurrences of sentence length X. The characteristic K is useful in indicating whether a material contains a great diversity of sentence lengths (low K value), or whether there is a high repetition of commonly occurring sentence lengths (high K value). The implications of the K-factor for differences in writing style are discussed in later chapters. Summary tables of K for each of the sixty-six corpora outlined in Task 6.1 are presented in Chapter IV.
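The sentence length measures listed in Task 6.1 can be computed with Python's standard library, as sketched below. Two points are assumptions, since the text does not specify them: the standard deviation is taken as the population form, and "Pearson's skew factor" is taken as Pearson's second coefficient, 3 (mean - median) / standard deviation. The sample lengths are invented.

```python
import statistics


def sentence_length_summary(lengths):
    """Summary measures for a list of sentence lengths (in words)."""
    mean = statistics.mean(lengths)
    median = statistics.median(lengths)
    mode = statistics.mode(lengths)
    sd = statistics.pstdev(lengths)            # assumption: population standard deviation
    return {
        "sentences": len(lengths),
        "mean": mean,
        "median": median,
        "mode": mode,
        "sd": sd,
        "cv_percent": 100 * sd / mean,         # coefficient of variation
        # Assumption: Pearson's second skewness coefficient.
        "pearson_skew": 3 * (mean - median) / sd,
    }


summary = sentence_length_summary([5, 10, 10, 15, 20, 30])
```

A positive skew value, as in this toy sample, indicates a distribution with a tail of longer sentences, which is the usual shape for running prose.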
TASK 7. ANALYSIS OF DISTRIBUTION OF 100 MOST FREQUENT WORD-TYPES

Task 7 utilized existing computer programs and developed new programs where needed to analyze the distribution of the 100 most frequently occurring word-types across the following corpora: a) three grade levels; b) seven subject areas; c) six subjects in Grade 8; d) seven subjects in Grade 9; and e) five subjects in Grade 10. Two major sub-tasks were involved.

Task 7.1. The chi-square test was used to test whether there were significant differences in the distribution of the 100 most frequent word-types in the five areas described above, using the usual formula:

\[ \chi^2 = \sum \frac{(o - e)^2}{e} \]

where o = the observed frequency of the word-types, and e = the expected frequency of the word-types. (The expected value equals the ratio of the total number of word-types in a corpora to the total number of word-types in the Corpus, multiplied by the total number of Corpus occurrences of the word-type being tested).

The .01 level of significance was chosen to test the hypotheses in order to guard against a type 1 error. That is, it was decided to risk rejecting the null hypotheses when they were true only one time in 100.

Complete details of the chi-square tests for the distribution of word-types are presented in APPENDIX I. The results have been arranged in a series of tables. For each word-type there are three lines of data. The first line gives the observed frequency, the second line lists the expected frequency, and the third line indicates the ratio of the number of occurrences of the specific word-type in the corpora to the total number of all word-types in the corpora, expressed as a percentage. The 100 most frequent word-types are placed in ranked order on the left hand side of the tables.

Task 7.2 was designed to analyze and illustrate whether the 100 word-types were distributed evenly across each of the five areas tested in Task 7.1. A summary of these results is presented in Chapter IV. The table is organized into seven columns with the first column (RANK) giving the rank of each word-type and the second column (WORD) listing the word-type. Columns three to seven indicate whether or not the word-types differed significantly in their distribution across the three grade levels (GRADES); the seven subject areas (SUBJECTS C); the subjects in Grade 8 (SUBJECTS 8); the subjects in Grade 9 (SUBJECTS 9); and the subjects in Grade 10 (SUBJECTS 10).
TASK 8. ANALYSIS OF DISTRIBUTION OF SELECTED SENTENCE LENGTHS

This task involved testing a number of null hypotheses which stated that there were no significant differences in the sentence length distributions of the various corpora in the subject areas when compared to the normal population as expressed by the sentence length distribution of the Corpus.

The chi-square test was used to test these hypotheses using the usual formula:

\[ \chi^2 = \sum \frac{(o - e)^2}{e} \]

where o = the observed frequency of the sentence length, and e = the expected frequency of the sentence length. (The expected value equals the ratio of the total number of sentence lengths in a corpora to the total number of sentence lengths in the Corpus, multiplied by the total number of Corpus occurrences of the respective sentence length being tested). The level of significance used to test these hypotheses was .01.

The chi-square tests were run using five ranges of sentence length: 10, 20, 30, 40, and 50+ words in length. These sentence lengths were chosen to represent a group of short sentences, a range of sentences on either side of the Corpus mean sentence length, and two groups of longer sentences. The last range included all sentences with 50 words or above because of the variety and small number of sentences in this category.

A computer program was developed to test the distribution of occurrence of the five selected sentence lengths across the three grade levels, the seven subject areas, the six subject areas in Grade 8, the seven subject areas in Grade 9, and the five subject areas of Grade 10. A summary of the results of the chi-square tests for the sentence length distribution appears in Chapter IV. Complete details of these tests are presented in APPENDIX J. The results have been arranged into five tables. The format of the tables is the same as that discussed in Task 7, except that the five selected sentence lengths are placed on the left hand side of the tables.
TASK 9. IDENTIFICATION OF SIGNIFICANT CONTENT MATERIAL

The final task in the study involved the development of an "elimination technique" for the selection of the most significant content words in specific areas using the ranked frequency lists developed for the Corpus, the three grade level corpora, the seven subject-area corpora, and the thirty-seven textbook corpora. Three sub-tasks were involved.

Task 9.1. Word frequency graphs were constructed for the eleven corpora described above using the UBC CALCOMP Drum Plotter. The graphs plotted the rank of each word-type along the abscissa and the frequency of each word-type on the ordinate. Because of the magnitude of the quantities being plotted, a one-tenth scale was used. (See APPENDIX K). The word frequency graphs take the general shape of the diagram in FIGURE 3.

[Diagram: word frequency plotted against words ranked in descending order.]

FIGURE 3
MODEL OF A WORD FREQUENCY DIAGRAM

Task 9.2. The "elimination technique" suggested in this study consists of two stages. The first stage is designed to identify the high frequency words in a word list that are considered to be too common to have special significance for the content area under investigation. A cutoff point is determined and these high frequency words are eliminated. The decision was made to use the position on the abscissa where 50 percent of the tokens occurred as the cutoff point. This involves elimination of most of the structure words. The line A in FIGURE 4 represents this cutoff.

The second stage is designed to eliminate words which are too rare or do not have sufficient frequency of occurrence to warrant being considered as significant in the content area. A cutoff point is determined and these low frequency words are eliminated. The decision was made to use the position on the abscissa where approximately the last 10 percent of the tokens occurred as the cutoff point. This results in the elimination in most lists of words that occur only one to three times and which are regarded to be low in significance. The line B in FIGURE 4 represents the cutoff.

Words which fall in the gray area immediately to the left and right of both point A and B could of course also be included as having specific content significance, depending on the individual judgment and the degree of accuracy desired in selecting and designating the words to be eliminated.

Task 9.3. The balance of the words remaining between points A and B (approximately 40 percent of the total tokens) represents, for most purposes, the most significant content of vocabulary in a word list. That is, these are the items that occur neither too frequently to be classed as common words, nor too infrequently to be classed as rare words. It must again be emphasized that subjective judgment by specialists in the content area concerned is vital in making final decisions in the elimination and retention of 'gray' area words and in establishing the general cutoff points for A and B.

[Diagram: the model word frequency curve of FIGURE 3 with cutoff lines A and B marked.]

FIGURE 4
APPLICATION OF "ELIMINATION TECHNIQUE" TO THE MODEL OF A WORD FREQUENCY DIAGRAM
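The two-stage "elimination technique" can be expressed as a short procedure over a ranked list. The sketch below is one possible reading, not the study's own program: a word is dropped as too common while the cumulative share of tokens, including that word, is still within the first 50 percent, and dropped as too rare once the share of tokens before it has reached 90 percent. How the boundary words are treated, the function name, and the toy counts are all assumptions, and the text itself stresses that the cutoffs call for specialist judgment.

```python
def elimination_technique(type_counts, high_cut=0.50, low_cut=0.90):
    """Split a word list into (common, kept, rare) using cumulative token share.

    high_cut corresponds to line A (first 50 percent of tokens);
    low_cut corresponds to line B (last 10 percent of tokens).
    """
    total = sum(type_counts.values())
    ranked = sorted(type_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    common, kept, rare = [], [], []
    running = 0
    for word, count in ranked:
        share_before = running / total
        running += count
        share_after = running / total
        if share_after <= high_cut:
            common.append(word)       # stage one: too frequent
        elif share_before >= low_cut:
            rare.append(word)         # stage two: too infrequent
        else:
            kept.append(word)         # balance between points A and B
    return common, kept, rare


# Toy frequency list totalling 100 tokens, invented for illustration.
toy = {"a": 30, "b": 20, "c": 15, "d": 10, "e": 10,
       "f": 5, "g": 4, "h": 3, "i": 2, "j": 1}
common, kept, rare = elimination_technique(toy)
```

Making the two cutoffs parameters reflects the point made in the text that the positions of A and B, and the treatment of 'gray' area words, are matters for judgment in each content area.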
A complete discussion of the results of applying the "elimination technique" to the Corpus, the three grade level corpora, and the seven subject-area corpora is presented in Chapter IV using the frequency distribution graph of the Corpus as an example.

One final point should be made with respect to the design and methodology of the study. The production of the dissertation was accomplished through the use of computer technology. Initially, each chapter was keypunched onto IBM computer cards, read into the computer memory bank and stored on disk. Each chapter was then revised and very carefully edited numerous times, using a 3270 CRT unit, and finally printed in its present form using the FMT computer program. The appendix graphs and chi-square tables were produced by special programs and reduced for convenience of presentation.

The use of the computer in producing the dissertation had the great advantage of allowing constant revisions to be made to the entire manuscript and multiple copies of the revised dissertation to be obtained quickly. The organization and interpretation of tables and other descriptive statistics, plus the construction of graphs, were also relatively easy with computer techniques and formatting facilities. The major drawback in using the computer was the need for the researcher to edit the print materials in accordance with the very 'logical' but set procedures used by the computer. This involved an understanding of the basic computer language used in generating the computer output, plus some understanding of programming processes.

CHAPTER IV

ANALYSIS OF THE DATA AND FINDINGS

The purpose of the chapter is to present and analyze the results obtained from the completion of Tasks 1-9. The tasks resulted in the production of over 5,500 pages of printed material, including facsimiles of all the instructional materials sampled, the sixty-six word lists, and accompanying tables, graphs, and statistical summaries. These data were then organized into eight volumes and are discussed fully in Tasks 3 and 4.

All of the material generated in the study, including the 200 computer files and over twenty specially developed computer programs, has been placed on magnetic tape (Tape #RE0616) at the Computing Centre, and copies of the tape material (Tape #RE0617) are available from the Special Collections Division of the Library at the University of British Columbia. A technical description of the computer procedures followed in developing and using the various programs used to organize the Corpus is given in the booklet, Programmer's Guide to the ... Corpus, by Allan Miller, available from the Computing Centre at the University of British Columbia.
TASKS, QUESTIONS AND HYPOTHESES

The tasks outlined in Chapter I are restated in this section, the questions answered, the hypotheses tested, and the general findings presented.

Task 1. Develop a representative corpus of natural language text based on instructional material prescribed for use in British Columbia junior secondary grades.

The thirty-seven textbooks and 469 samples of instructional materials used in developing the 235,107 word Corpus are described in APPENDIX A. The sample sizes ranged from 338 words to 657 words with a mean sample size of 501.294 and a standard deviation of 44.187. Two copies of the 469 sample sizes, one organized in alphabetical order, and one ranked in ascending order of size, are presented in APPENDIX B.

Task 2. Organize the Corpus for computer input and manipulation.

The keypunching of computer cards containing the Corpus was accomplished using the UBC FMT (FORMAT) program available from the Computing Centre at the University of British Columbia. The computer cards were read into the computer via a card-reader and placed on disk to await reorganization into the various tasks involved in the study.

Task 3. Generate two volumes of the Corpus: one organized by grade levels, and one organized by subject areas, each with a descriptive index.
Computer programs were utilized to generate the two volumes of the Corpus: 1) C.G., which presents print facsimiles of the instructional material organized by grade levels, and 2) C.S., which organizes the instructional material by subject. A description of the development of the corpora, which includes an index listing full particulars for each text and the samples used, is included in the front of each of the two volumes. A detailed listing of the 209 computer files and programs used in the study is presented in APPENDIX C.

Task 4. Organize the samples into word lists for the Corpus, the grade corpora (3), the subject corpora (7), the subjects within grade corpora (18), and the textbook corpora (37).

4.1 For each of the above, provide an alphabetical and a rank order (descending frequency) listing of word-types to give the following information.
4.11 The frequency of occurrence of each word-type.
4.12 The cumulative percentage frequency of each word-type.
4.13 The relative frequency of occurrence of each word-type per 1000 tokens.
4.14 The descriptive statistics for the rank order lists of the Corpus and corpora including: X, FX, SUM FX, FX * X, CUM % FX * X.
4.2 Construct two summary tables for each of the sixty-six word lists, indicating the word frequency figures in descending order (highest frequency words first), and in ascending order (highest frequency words last).

Task 4.1. The alphabetical and rank order word lists and relevant statistical details for the Corpus and all corpora are organized into five volumes as follows:
1) C.V. Represents the word list for the Corpus (345 pages),
2) G.V. Represents the word lists for the three grade level corpora (550 pages),
3) S.V. Represents the word lists for the seven subject area corpora (730 pages),
4) S.G.V. Represents the word lists for the eighteen subject within grade level corpora (986 pages),
5) T.V. Represents the word lists for the thirty-seven textbook corpora (1,292 pages).

All the lists are set up in two columns per page with fifty words per column for added convenience. (See APPENDIXES D and E). The organization of both the alphabetical and the rank order lists is basically the same for all corpora with one exception. Each word entry is preceded by two figures. For the alphabetical list the first quantity, in the FREQ column, indicates the relative frequency per 1000 tokens of the word entry. For the rank order list the first quantity, in the FREQ column, indicates the cumulative total of the tokens in the Corpus contributed by the word entry. The second figure in each list gives the frequency of the word entry.

The alphabetical list begins with several command symbols plus a complete listing of the alphanumerical indexes of the 469 samples used in the study. All other types that do not begin with letters are placed at the end of the list. The rank order list begins with the highest frequency word in the Corpus and places all other types in descending rank order, with the hapax legomena listed last. A complete listing of the alphanumerical indexes of the 469 samples used in the study appears at the beginning of the hapax legomena entries.

Task 4.2. The descending order word frequency list constructed for each of the sixty-six corpora gives the rank of each word-type in descending order. A sample page from this list is included in APPENDIX F. This list enables the rank of any word to be quickly located by first finding the frequency of occurrence of the word from the alphabetical list and then matching this number with the same frequency in the descending list. For example, if the rank of the word "about" in the Corpus is required, the reader could note from the alphabetical list (APPENDIX D) that the frequency of "about" is 463 and then determine in column X of "The Corpus with Rank in Descending Order" list (APPENDIX F) that X = 463 corresponds to a rank of 55. This means that there are 54 words which have a greater frequency of occurrence and 16,350 words which occur less frequently than the word "about" in the Corpus. A similar procedure could be followed for determining the rank of any word in any of the other sixty-five corpora word lists.

Another service offered by the descending order frequency list involves the relationship between word-types and tokens. The Corpus list indicates that the first 100 most frequent words constitute only 0.610 percent of the word-types in the Corpus yet the same words account for 115,141 tokens or 48.973 percent of the total number of word occurrences in the Corpus.

The ascending order word frequency list developed for each of the corpora gives the rank of each word-type in ascending order. A sample page is included in APPENDIX F. This list is useful in determining the number of tokens accounted for by the rarely occurring word-types. For example, "The Corpus in Ascending Order" list indicates that low frequency word-types occurring ten times or less constitute 84.505 percent of the word-types in the Corpus yet account for only 34,572 tokens or 14.705 percent of the total number of word occurrences in the Corpus.

Task 5. Generate comparative and statistical analyses based on the lexical characteristics of the Corpus, the corpora, and data produced in Tasks 1 through 4.

5.1 What are the lexical characteristics of the Corpus; the Grade 8, 9, and 10 corpora; each of the seven subject area corpora across Grades 8, 9, and 10; each of the subject corpora within Grade 8, 9, and 10; and each of the thirty-seven textbook corpora, in terms of the total number of graphic characters, average number of graphic characters per token, tokens and discrete word-types?
5.2 What are the characteristics in terms of repeat rate frequency (Yule's K) of words for the Corpus and corpora defined in Task 5.1?

Task 5.1

The lexical characteristics of the Corpus and the various corpora are presented in TABLES IV through X.

The total Corpus includes 16,405 word-types across the 469 samples developed for the study. TABLE IV illustrates the relatively large size of the Grade 9 corpus in contrast to those of Grades 8 and 10. Over 50 percent (122,953) of the tokens in the total Corpus are represented by 69 percent (11,401) of the Corpus word-types in the nineteen textbooks used in Grade 9. The Grade 8 (52,867 tokens) and Grade 10 (59,343 tokens) corpora are approximately the same size in terms of both word-types and tokens.

TABLE IV

NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR GRADE LEVELS AND THE CORPUS

Grade       Types     Tokens    Characters   Average
8           7,027     52,867       234,527      4.44
9          11,401    122,953       554,488      4.51
10          7,736     59,343       273,654      4.61
Corpus     16,405    235,107     1,062,411      4.52

The lexical characteristics of the subject corpora of the Corpus across the three grade levels, outlined in TABLE V, indicate that Home Economics (49,257 tokens) is the largest and Mathematics (17,808 tokens) the smallest. English, which is the second largest of the subject corpora (40,300 tokens), has by far the most word-types (7,079), indicating a much greater variety of vocabulary used throughout the eight textbooks in this subject as compared to the other content areas in the junior secondary grades.

TABLE V

NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS ACROSS GRADE LEVELS

Subject            Types     Tokens    Characters   Average
Commerce           3,020     20,155        90,171      4.47
English            7,079     40,300       178,192      4.42
Home Economics     5,529     49,257       221,576      4.50
Industrial Ed.     4,060     31,300       141,176      4.51
Mathematics        1,952     17,808        73,852      4.15
Science            4,833     37,787       173,023      4.58
Social Studies     6,211     38,608       184,727      4.78
Corpus            16,405    235,107     1,062,411      4.52

TABLE VI gives the lexical characteristics of the six subject areas (Commerce is not offered) within Grade 8. Home Economics and Social Studies are the two largest corpora with over 11,000 tokens each, although English has a greater number of word-types (2,388) than Home Economics (2,169). Social Studies, although ranking second in total tokens, has the greatest number of word-types for Grade 8 (2,890 types).

TABLE VI

NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF GRADE 8

Subject            Types     Tokens    Characters   Average
Commerce               -          -             -         -
English            2,388      8,605        37,901      4.40
Home Economics     2,169     11,425        50,472      4.42
Industrial Ed.     1,305      4,624        20,981      4.54
Mathematics        1,164      7,073        30,201      4.27
Science            1,975      9,907        43,363      4.38
Social Studies     2,890     11,205        51,480      4.59
Grade 8            7,027     52,867       234,527      4.44

In Grade 9 (TABLE VII), Home Economics is the largest corpus (37,812 tokens) followed by Industrial Education (26,656 tokens) and English (23,123 tokens). English again has the largest number of word-types. Only one Mathematics text was used (the algebra text was excluded), resulting in a relatively small number of samples of running prose (3,616 tokens). Grade 9 is the largest of the three grade level corpora with a total of nineteen textbooks included.

TABLE VII

NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF GRADE 9

Subject            Types     Tokens    Characters   Average
Commerce           2,208     12,485        55,653      4.46
English            4,920     23,123       103,490      4.48
Home Economics     4,894     37,812       171,040      4.52
Industrial Ed.     3,688     26,656       120,125      4.51
Mathematics          910      3,616        15,460      4.28
Science            2,365     12,278        55,612      4.53
Social Studies     2,065      6,955        32,973      4.74
Grade 9           11,401    122,953       273,654      4.51

TABLE VIII lists the lexical characteristics of the five subjects in Grade 10. The largest subject corpus in Grade 10 is Social Studies (20,428 tokens) and the smallest is Mathematics (7,100 tokens). Social Studies also has the largest number of word-types (3,930). The Grade 9 textbooks for Home Economics and Industrial Education are repeated in Grade 10 but were not used again in the study. Nine textbooks were used to obtain samples and six were excluded because they did not contain sufficiently large quantities of running prose. The six excluded texts included two Commerce books, two English books, a Mathematics (Geometry) text, and an atlas used in Social Studies.

TABLE VIII

NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF GRADE 10

Subject            Types     Tokens    Characters   Average
Commerce           1,746      7,651        34,459      4.50
English            2,489      8,553        36,743      4.30
Home Economics         -          -             -         -
Industrial Ed.         -          -             -         -
Mathematics          912      7,100        28,123      3.96
Science            3,015     15,583        73,990      4.75
Social Studies     3,930     20,428       100,210      4.91
Grade 10           7,736     59,343       273,654      4.61

A summary of all the lexical characteristics for each of the subject areas across the grade levels is presented in TABLE IX. The largest selection of material at the one grade level dealt with in this study was in Grade 9 Home Economics (37,812 tokens), where five textbooks were sampled. Other subject areas containing large amounts of running prose were Grade 9 Industrial Education (26,656 tokens), Grade 9 English (23,123 tokens), and Grade 10 Social Studies (20,428 tokens). The smallest quantities of running prose were located in Grade 9 Mathematics (3,616 tokens), Grade 8 Industrial Education (4,624 tokens), and Grade 9 Social Studies (6,955 tokens). Grade 10 Mathematics contained the smallest recorded 'average length of tokens in characters' (3.96) throughout the study. The largest number of word-types occurs in Grade 9 English (4,920), where four textbooks were sampled. Grade 9 Home Economics is a close second with 4,894 types.

TABLE IX

NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE SUBJECT AREAS OF EACH GRADE LEVEL OF THE CORPUS

Subject           Grade    Types    Tokens   Characters   Average
Commerce            9      2,208    12,485       55,653      4.46
Commerce           10      1,746     7,651       34,459      4.50

English             8      2,388     8,605       37,901      4.40
English             9      4,920    23,123      103,490      4.48
English            10      2,489     8,553       36,743      4.30

Home Economics      8      2,169    11,425       50,472      4.42
Home Economics      9      4,894    37,812      171,040      4.52

Industrial Ed.      8      1,305     4,624       20,981      4.54
Industrial Ed.      9      3,688    26,656      120,125      4.51

Mathematics         8      1,164     7,073       30,201      4.27
Mathematics         9        910     3,616       15,460      4.28
Mathematics        10        912     7,100       28,123      3.96

Science             8      1,975     9,907       43,363      4.38
Science             9      2,365    12,278       55,612      4.53
Science            10      3,015    15,583       73,990      4.75

Soc. Studies        8      2,890    11,205       51,480      4.59
Soc. Studies        9      2,065     6,955       32,973      4.74
Soc. Studies       10      3,930    20,428      100,210      4.91

TABLE X lists the lexical characteristics of each of the thirty-seven textbooks used in the study. A Grade 10 Social Studies text (*3H01), A Regional Geography, contains the largest selection of running prose (14,736 tokens), while a Grade 10 English text (*3C02), Drama IV, has the smallest number of tokens (1,867). A Grade 10 Social Studies textbook has the largest number of word-types (2,913) and a Grade 10 English text has the least (822).

TABLE X

NUMBER OF TYPES, TOKENS, CHARACTERS, AND AVERAGE LENGTH OF TOKENS IN CHARACTERS FOR THE THIRTY-SEVEN TEXTS

Text             Types    Tokens   Characters   Average
*1C01 (Eng)      1,187     3,500       15,601      4.46
*1C02 (Eng)      1,672     5,105       22,300      4.37
*1D01 (H.Ec)     2,169    11,425       50,472      4.42
*1E01 (I.Ed)     1,305     4,624       20,981      4.54
*1F01 (Math)     1,164     7,073       30,201      4.27
*1G01 (Sci)      1,033     4,402       18,926      4.30
*1G02 (Sci)      1,399     5,505       24,437      4.44
*1H01 (S.St)     2,177     7,728       35,235      4.56
*1H02 (S.St)     1,215     3,477       16,245      4.67

*2B01 (Comm)     1,234     5,494       24,022      4.37
*2B02 (Comm)     1,511     6,991       31,631      4.52
*2C01 (Eng)      2,436     9,646       44,736      4.64
*2C02 (Eng)      1,232     3,400       15,122      4.45
*2C03 (Eng)      1,705     5,035       21,839      4.34
*2C04 (Eng)      1,638     5,042       21,793      4.32
*2D01 (H.Ec)     1,872    10,198       46,425      4.55
*2D02 (H.Ec)     1,871    10,755       48,323      4.49
*2D03 (H.Ec)     1,685     6,928       31,352      4.53
*2D04 (H.Ec)     1,467     5,332       24,051      4.51
*2D05 (H.Ec)     1,269     4,599       20,889      4.54
*2E01 (I.Ed)     1,615     6,075       27,547      4.53
*2E02 (I.Ed)     1,638     7,792       34,579      4.44
*2E03 (I.Ed)     2,062    12,789       57,999      4.54
*2F01 (Math)       910     3,616       15,466      4.28
*2G01 (Sci)      1,516     6,748       30,618      4.54
*2G02 (Sci)      1,474     5,530       24,994      4.52
*2H01 (S.St)     1,420     4,408       20,365      4.62
*2H02 (S.St)       984     2,547       12,608      4.95

*3B01 (Comm)     1,017     3,546       15,477      4.36
*3B02 (Comm)     1,170     4,105       18,982      4.62
*3C01 (Eng)      1,946     6,686       27,972      4.18
*3C02 (Eng)        822     1,867        8,771      4.70
*3F01 (Math)       912     7,100       28,123      3.96
*3G01 (Sci)      1,955     8,592       40,616      4.73
*3G02 (Sci)      1,844     6,991       33,374      4.77
*3H01 (S.St)     2,913    14,736       70,766      4.80
*3H02 (S.St)     1,837     5,692       29,444      5.17

Task 5.2

The characteristic repeat-rate frequency tables for the Corpus and corpora are listed in the five volumes C.V., G.V., S.V., S.G.V., and T.V. (See Task 4.2). The results for Yule's K (for words) are presented in TABLES XI through XVI.

As stated earlier in Chapter III, the K value is useful as a measure of the repeat rate of words and provides an indication of the concentration of vocabulary in a passage of printed material. A large K factor suggests a proportionately greater use of high frequency (common) words than a small value of K, which implies more reliance on low frequency (rare) words. The K factor is theoretically independent of sample size when the samples have been randomly selected from the population being used. For this reason it is possible to compare the results of K for the various corpora.

The K factors for each grade level and the Corpus are presented in TABLE XI. Grade 9 has the smallest value of K (106.547) and Grade 10 has the largest K value (112.587), although all grades were close to the K value for the Corpus (108.104).

TABLE XI

K FACTORS (WORDS) FOR EACH GRADE LEVEL AND THE CORPUS

Grade      K Factor
8           109.510
9           106.547
10          112.587
Corpus      108.104

TABLE XII presents the K factors for the ranked subject areas across grade levels. Home Economics (92.572) and English (100.517) have markedly lower values of K, implying that these subjects use a relatively greater number of low frequency (rare) words than the other subjects.

TABLE XII

SUBJECT AREAS ACROSS GRADES RANKED BY K FACTOR (WORDS)

Rank   Subject                  K Factor
1      Home Economics             92.572
2      English                   100.517
3      Corpus                    108.104
4      Commerce                  108.922
5      Mathematics               121.662
6      Science                   129.894
7      Industrial Education      129.922
8      Social Studies            130.372

The K factors for the subject areas within each of the three grade levels are presented in TABLE XIII and their ranked order is listed in TABLE XIV. In English, Mathematics, and Social Studies, the lowest value of K occurs in Grade 9, indicating a greater use of low frequency (rare) words than in either of the other two grades, while in Industrial Education and Science the lowest K values are in Grade 8 and Grade 10 respectively. In Home Economics and Commerce, the lowest K values are in Grades 9 and 10 respectively. Four out of the seven subjects have their lowest K values in Grade 9, with two in Grade 10 and one in Grade 8.

TABLE XIII

K FACTORS (WORDS) FOR SUBJECTS WITHIN GRADE LEVELS

Subject             Gde 8      Gde 9     Gde 10     Corpus
Commerce                -    117.619     99.329    108.922
English           107.175     98.491    104.271    100.517
Home Economics     98.166     91.788          -     92.572
Industrial Ed.    116.973    133.630          -    129.922
Mathematics       123.568    118.672    131.571    121.662
Science           135.992    145.159    118.004    129.894
Social Studies    130.613    127.738    133.350    130.372
Corpus            109.510    106.547    112.587          -

TABLE XIV presents the rank of the subject areas within grades and indicates that Commerce, Home Economics, and English occupy seven of the first eight places among the eighteen positions.

TABLE XIV

SUBJECT AREAS WITHIN GRADE LEVELS RANKED BY K FACTOR (WORDS)

Rank   Subject                 Gde    K Factor
1.     Home Economics            9      91.788
2.     Home Economics            8      98.166
3.     English                   9      98.491
4.     Commerce                 10      99.329
5.     English                  10     104.271
6.     English                   8     107.175
7.     Industrial Education      8     116.973
8.     Commerce                  9     117.619
9.     Science                  10     118.004
10.    Mathematics               9     118.672
11.    Mathematics               8     123.568
12.    Social Studies            9     127.738
13.    Social Studies            8     130.613
14.    Mathematics              10     131.571
15.    Social Studies           10     133.350
16.    Industrial Education      9     133.630
17.    Science                   8     135.992
18.    Science                   9     145.159

The K factors for each individual textbook follow. They are: ranked by subject areas (TABLE XV), listed independently within a grade level (TABLE XVI), and ranked across all subjects and grade levels (TABLE XVII).

The low K values for the textbooks in Home Economics and English are evident in TABLE XV. Only one of the Home Economics texts, Guide to Modern Meals (*2D01), has a K factor over 100, while most of the English texts have K factors approaching 100.

TABLE XV

TEXTS IN SUBJECT AREAS RANKED BY K FACTOR (WORDS)

Text     Subject                 K Factor
*3B01    Commerce                  95.532
*3B02    Commerce                 111.939
*2B01    Commerce                 114.560
*2B02    Commerce                 125.988

*2C03    English                   99.231
*2C04    English                   99.253
*1C02    English                  101.873
*3C01    English                  102.632
*2C02    English                  103.655
*2C01    English                  105.651
*1C01    English                  118.117
*3C02    English                  125.065

*2D04    Home Economics            81.857
*2D03    Home Economics            87.723
*2D05    Home Economics            92.203
*2D02    Home Economics            97.723
*1D01    Home Economics            98.166
*2D01    Home Economics           111.747

*2E01    Industrial Education     113.084
*2E02    Industrial Education     114.428
*1E01    Industrial Education     116.973
*2E03    Industrial Education     169.462

*2F01    Mathematics              118.672
*1F01    Mathematics              123.568
*3F01    Mathematics              131.571

*1G02    Science                  117.664
*3G02    Science                  117.712
*3G01    Science                  128.905
*2G02    Science                  142.048
*2G01    Science                  150.283
*1G01    Science                  167.198

*2H01    Social Studies           126.723
*1H02    Social Studies           127.347
*3H02    Social Studies           128.258
*2H02    Social Studies           134.655
*1H01    Social Studies           137.026
*3H01    Social Studies           137.962

TABLE XVI

K FACTOR (WORDS) FOR EACH TEXT BY GRADES

Text     Subject                 K Factor
*1C01    English                  118.117
*1C02    English                  101.873
*1D01    Home Economics            98.166
*1E01    Industrial Education     116.973
*1F01    Mathematics              123.568
*1G01    Science                  167.198
*1G02    Science                  117.664
*1H01    Social Studies           137.026
*1H02    Social Studies           127.347

*2B01    Commerce                 114.560
*2B02    Commerce                 125.988
*2C01    English                  105.651
*2C02    English                  103.655
*2C03    English                   99.231
*2C04    English                   99.253
*2D01    Home Economics           111.747
*2D02    Home Economics            97.723
*2D03    Home Economics            87.723
*2D04    Home Economics            81.857
*2D05    Home Economics            92.203
*2E01    Industrial Education     113.084
*2E02    Industrial Education     114.428
*2E03    Industrial Education     169.462
*2F01    Mathematics              118.672
*2G01    Science                  150.283
*2G02    Science                  142.048
*2H01    Social Studies           126.723
*2H02    Social Studies           134.655

*3B01    Commerce                  95.532
*3B02    Commerce                 111.939
*3C01    English                  102.632
*3C02    English                  125.065
*3F01    Mathematics              131.571
*3G01    Science                  128.905
*3G02    Science                  117.712
*3H01    Social Studies           137.962
*3H02    Social Studies           128.258

Within Grades 8 and 9, Home Economics has the lowest K value, while Commerce has the lowest value within Grade 10.

TABLE XVII presents the ranked order of the textbooks by K factor (words). Five of the first twelve textbooks are Home Economics, six are English texts, and one is a Commerce text.

TABLE XVII

K FACTOR (WORDS) FOR EACH TEXT RANKED ACROSS SUBJECTS AND GRADES

Rank   Text     Subject                 K Factor
1.     *2D04    Home Economics            81.857
2.     *2D03    Home Economics            87.723
3.     *2D05    Home Economics            92.203
4.     *3B01    Commerce                  95.532
5.     *2D02    Home Economics            97.723
6.     *1D01    Home Economics            98.166
7.     *2C03    English                   99.231
8.     *2C04    English                   99.253
9.     *1C02    English                  101.873
10.    *3C01    English                  102.632
11.    *2C02    English                  103.655
12.    *2C01    English                  105.651
13.    *2D01    Home Economics           111.747
14.    *3B02    Commerce                 111.939
15.    *2E01    Industrial Education     113.084
16.    *2E02    Industrial Education     114.428
17.    *2B01    Commerce                 114.560
18.    *1E01    Industrial Education     116.973
19.    *1G02    Science                  117.664
20.    *3G02    Science                  117.712
21.    *1C01    English                  118.117
22.    *2F01    Mathematics              118.672
23.    *1F01    Mathematics              123.568
24.    *3C02    English                  125.065
25.    *2B02    Commerce                 125.988
26.    *2H01    Social Studies           126.723
27.    *1H02    Social Studies           127.347
28.    *3H02    Social Studies           128.258
29.    *3G01    Science                  128.905
30.    *3F01    Mathematics              131.571
31.    *2H02    Social Studies           134.655
32.    *1H01    Social Studies           137.026
33.    *3H01    Social Studies           137.962
34.    *2G02    Science                  142.048
35.    *2G01    Science                  150.283
36.    *1G01    Science                  167.198
37.    *2E03    Industrial Education     169.462

Task 6

Generate comparative and statistical analyses based on the sentence length distribution of the Corpus, the corpora, and data produced in Task 1 through Task 4.

6.1 What are the sentence-length characteristics of the Corpus; the Grade 8, 9, and 10 corpora; each of the seven subject area corpora across Grades 8, 9, and 10; each of the corpora for subjects within Grades 8, 9, and 10; and each of the thirty-seven textbook corpora in terms of the mean, median, modal sentence length in words, standard deviation, coefficient of variation, average number of sentences, and Pearson's skew factor?

6.2 Produce a set of graphs to illustrate each of the sixty-six sentence length distributions developed during the study.
6.3 What are the characteristics in terms of repeat rate frequency of sentence lengths (Yule's K) for the Corpus and the corpora defined in 6.1 above?

Task 6.1

The various sentence length characteristics of the Corpus and the sixty-six corpora (all measured in number of words) are given in TABLES XVIII through XXIV. Complete details of the sentence-length distribution of the Corpus and each of the corpora are presented in the volume SENT. A sample of the contents of SENT is included in APPENDIX G.

TABLE XVIII illustrates the fairly uniform average sentence length across the Corpus when the samples are organized by grade levels. This pattern is also repeated when the samples are organized by subjects across the three grades (TABLE XIX), although the range in averages increases.

TABLE XVIII

MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR EACH GRADE LEVEL AND THE CORPUS

Grade       Mean      S.D.      Variation   Median   Mode   Average Sentences
8          18.595    9.7745      0.5256     16.764    18         27.33
9          17.824   10.2550      0.5753     15.428    15         28.04
10         17.593    9.8504      0.5599     15.733    10         28.34
Corpus     17.927   10.0480      0.5605     15.743    15         27.96

TABLE XIX

MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR EACH SUBJECT AREA ACROSS GRADES

Subject            Mean      S.D.    Variation   Median   Mode   Average Sentences
Commerce          17.772    9.080     0.510      15.770    13         27.66
English           17.568   13.685     0.779      13.750     7         28.68
Home Economics    18.476    8.633     0.467      16.813    16         27.20
Industrial Ed.    16.683    8.449     0.506      14.550    11         29.78
Mathematics       15.247    8.150     0.534      13.532    14         33.37
Science           18.495    9.785     0.529      16.444    15         27.24
Social Studies    19.973    9.582     0.479      18.207    21         25.10
Corpus            17.927   10.048     0.560      15.743    15         27.96

TABLE XX presents the sentence length characteristics across the grade levels. The smallest average sentence length is in Grade 10 Mathematics (13.894 words) and the largest in Grade 8 Social Studies (21.715 words).

TABLE XX

MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR EACH SUBJECT AREA WITHIN GRADE LEVELS OF THE CORPUS

Subject   Gde    Mean      S.D.     Var.    Median   Mode   Average Sentences
Comm.       9   17.558    9.642    0.549    15.475    17        28.440
Comm.      10   18.087    8.016    0.443    16.300    14        26.437

Eng.        8   17.597   13.049    0.741    14.280    12        28.764
Eng.        9   18.753   14.585    0.777    14.509     8        26.234
Eng.       10   14.953   11.670    0.780    11.230     7        35.750

H.Ec.       8   19.430    8.105    0.417    18.100    16        26.727
H.Ec.       9   18.196    8.738    0.480    16.442    16        27.342

I.Ed.       8   17.511    7.677    0.438    15.530    14        29.333
I.Ed.       9   16.535    8.552    0.517    14.392    15        29.851

Math.       8   17.421    8.872    0.509    16.170    18        29.000
Math.       9   14.406    6.802    0.472    13.190     9        35.857
Math.      10   13.894    7.781    0.560    12.330    10        36.500

Sci.        8   17.081    8.631    0.505    15.170    15        29.000
Sci.        9   17.924    9.173    0.511    15.757    15        28.541
Sci.       10   20.028   10.861    0.542    17.900    18        25.096

S.St.       8   21.715    9.876    0.454    19.700    23        23.454
S.St.       9   21.204   11.137    0.525    18.444    15        25.230
S.St.      10   18.758    8.666    0.462    17.260    10        28.340

The sentence length characteristics for subjects within Grade 8, Grade 9, and Grade 10 are presented in TABLES XXI, XXII, and XXIII respectively.

TABLE XXI

MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE SUBJECT AREAS OF GRADE 8

Subject            Mean      S.D.    Variation   Median   Mode   Average Sentences
Commerce               -        -         -           -      -           -
English           17.597   13.049     0.741      14.280    12        28.764
Home Economics    19.430    8.105     0.417      18.100    16        26.727
Industrial Ed.    17.511    7.677     0.438      15.530    14        29.333
Mathematics       17.421    8.872     0.509      16.170    18        29.000
Science           17.081    8.631     0.505      15.170    15        29.000
Social Studies    21.715    9.876     0.454      19.700    23        23.454
Grade 8           18.595    9.774     0.525      16.764    18        27.330

TABLE XXII

MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE SUBJECT AREAS OF GRADE 9

Subject            Mean      S.D.    Variation   Median   Mode   Average Sentences
Commerce          17.558    9.642     0.549      15.475    17        28.440
English           18.753   14.585     0.777      14.509     8        26.234
Home Economics    18.196    8.738     0.480      16.442    16        27.342
Industrial Ed.    16.535    8.552     0.517      14.392    15        29.851
Mathematics       14.406    6.802     0.472      13.190     9        35.857
Science           17.924    9.173     0.511      15.757    15        28.541
Social Studies    21.204   11.137     0.525      18.444    15        25.230
Grade 9           17.824   10.255     0.575      15.428    15        28.040

TABLE XXIII

MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE SUBJECT AREAS OF GRADE 10

Subject            Mean      S.D.    Variation   Median   Mode   Average Sentences
Commerce          18.087    8.016     0.443       16.30    14        26.437
English           14.953   11.670     0.780       11.23     7        35.750
Home Economics         -        -         -           -     -             -
Industrial Ed.         -        -         -           -     -             -
Mathematics       13.894    7.781     0.560       12.33    10        36.500
Science           20.028   10.861     0.542       17.90    18        25.096
Social Studies    18.758    8.666     0.462       17.26    21        25.928
Grade 10          17.593    9.850     0.559       15.73    10        28.340

The average sentence lengths for the subject areas within grades differ considerably, with Grade 9 exhibiting the greatest range in sentence lengths.

TABLE XXIV lists the sentence length characteristics for the thirty-seven textbooks.
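The columns reported in TABLES XVIII through XXIII can be illustrated with a short sketch. It is not the program used in the study. It assumes a population standard deviation and a plain (uninterpolated) median, neither of which the text confirms, and the function names are hypothetical. The coefficient of variation is taken as S.D. divided by the mean, and the skew factor as (mean - mode) / S.D., a form that appears consistent with the published figures (for Grade 8, (18.595 - 18) / 9.7745 is about 0.061 against the reported 0.060).

```python
import statistics


def sentence_length_summary(lengths):
    """Summarize a list of sentence lengths measured in words.

    Assumptions (not confirmed by the thesis): population standard
    deviation, plain median, first most common value as the mode.
    """
    mean = statistics.fmean(lengths)
    sd = statistics.pstdev(lengths)
    mode = statistics.mode(lengths)
    return {
        "mean": mean,
        "sd": sd,
        "variation": sd / mean,          # coefficient of variation
        "median": statistics.median(lengths),
        "mode": mode,
        "skew": (mean - mode) / sd,      # Pearson's mode skewness
    }


def skew_from_published(mean, mode, sd):
    """Pearson's mode skewness from already tabulated values."""
    return (mean - mode) / sd


if __name__ == "__main__":
    # Hypothetical sample of five sentence lengths.
    print(sentence_length_summary([10, 12, 12, 14, 22]))
    # Grade 8 row of TABLE XVIII: mean 18.595, mode 18, S.D. 9.7745.
    print(skew_from_published(18.595, 18, 9.7745))
```

Working from tabulated means, modes, and standard deviations in this way offers a quick consistency check on any row of the tables without access to the original samples.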
TABLE XXIV

MEAN SENTENCE LENGTH, STANDARD DEVIATION, COEFFICIENT OF VARIATION, MEDIAN, MODE, AND AVERAGE NUMBER OF SENTENCES PER SAMPLE FOR THE THIRTY-SEVEN TEXTS

Text              Mean      S.D.    Variation   Median   Mode   Average Sentences
*1C01 (Eng)      18.717   14.639     0.782       14.60    15        26.714
*1C02 (Eng)      16.904   11.933     0.706       14.00    12        30.200
*1D01 (H.Ec)     19.430    8.105     0.417       18.10    16        26.727
*1E01 (I.Ed)     17.511    7.677     0.438       15.53    14        29.333
*1F01 (Math)     17.421    8.872     0.509       16.25    18        29.000
*1G01 (Sci)      14.673    8.125     0.553       12.87    15        33.333
*1G02 (Sci)      19.661    8.421     0.428       17.90    18        25.454
*1H01 (S.St)     21.348    9.811     0.459       19.13    17        24.133
*1H02 (S.St)     22.578   10.006     0.443       21.14    23        22.000

*2B01 (Comm)     15.652    8.223     0.525       14.13    17        31.909
*2B02 (Comm)     19.417   10.532     0.542       17.07    14        25.714
*2C01 (Eng)      17.762   12.041     0.678       14.60     8        27.145
*2C02 (Eng)      19.101   14.070     0.736       15.40    12        25.428
*2C03 (Eng)      25.429   18.775     0.738       21.20    15        19.800
*2C04 (Eng)      16.057   14.666     0.913       11.20     4        31.400
*2D01 (H.Ec)     20.114    9.761     0.485       17.90    15        24.142
*2D02 (H.Ec)     19.002    8.828     0.464       17.20    17        25.727
*2D03 (H.Ec)     17.451    8.095     0.463       16.40    17        28.357
*2D04 (H.Ec)     18.071    8.379     0.463       16.80    18        29.500
*2D05 (H.Ec)     14.693    6.565     0.446       13.40    13        34.777
*2E01 (I.Ed)     14.194    6.972     0.491       12.40    12        32.920
*2E02 (I.Ed)     16.402    9.069     0.553       18.50     9        28.562
*2E03 (I.Ed)     18.037    8.744     0.484       16.20    15        28.360
*2F01 (Math)     14.406    6.802     0.472       13.20     9        35.857
*2G01 (Sci)      16.828    8.905     0.529       14.90    15        30.846
*2G02 (Sci)      19.472    9.338     0.479       17.70    11        25.818
*2H01 (S.St)     21.822   12.143     0.556       19.40    21        25.250
*2H02 (S.St)     20.214    9.262     0.458       17.40    16        25.400

*3B01 (Comm)     17.214    7.936     0.461       15.50    10        29.428
*3B02 (Comm)     18.917    8.021     0.424       17.00    18        24.111
*3C01 (Eng)      13.701   10.242     0.747       10.10     7        40.666
*3C02 (Eng)      22.226   16.081     0.723       17.00    14        21.000
*3F01 (Math)     13.894    7.781     0.560       12.30    10        36.500
*3G01 (Sci)      18.636    9.441     0.506       17.25    14        27.117
*3G02 (Sci)      22.054   12.384     0.561       19.85    22        22.642
*3H01 (S.St)     17.522    7.517     0.429       16.20    21        28.033
*3H02 (S.St)     22.952   10.761     0.468       21.00    20        20.666

A Grade 10 English text has the lowest average sentence length and a Grade 9 English text the highest.

One final observation should be made about the sentence length characteristics presented in TABLES XVIII through XXIX. Sentence length patterns are relatively consistent across the three grade level corpora. However, considerable range is evident in average sentence lengths as reported for samples organized by subjects across grades, by subjects within grades, and by individual textbooks. In addition to this diversity, a striking feature of the sentence length samples is the considerable variability within samples indicated by the standard deviations and coefficients of variation when the samples are organized by grade, subjects across grades, subjects within grades, and textbooks.

For example, in TABLE XIX, for the samples organized by subjects across grades, the standard deviations for the sentence lengths range from 8.150 for Mathematics to 13.685 for English. For the Mathematics samples, approximately 68 percent of the sentences would range from 7.097 to 23.397 words in length (the mean plus or minus one standard deviation, assuming an approximately normal distribution) with an average of 15.247. For the English samples, 68 percent of the sentences would range from 3.883 to 31.253 words in length with a mean of 17.568.

This variability exists throughout the range of samples, with the exception of grades, and is also evident in the ranges reported for the coefficient of variation and, to some degree, in the ranges reported for average numbers of sentences per 500 word sample. The coefficient of variation indicates the rate at which the items move away from the mean: the lower the coefficient of variation, the greater the degree of homogeneity of sentence lengths in the sample. For example, the coefficients of variation for the samples organized by subjects across grades (TABLE XIX) indicate that the sentence lengths in the English text samples are quite varied. English has, overwhelmingly, the largest coefficient of variation (0.779) for subjects across grades, whereas the sentence lengths in the Social Studies samples, with a coefficient of variation of 0.479, are generally more alike.

The results of Pearson's skew factor for the sentence length characteristics of the various corpora are presented in TABLES XXV through XXVII. A result of zero indicates a normal distribution where the mean and mode coincide. A positively skewed distribution indicates a tailing off to longer sentences, while a negatively skewed distribution indicates a tailing off to shorter sentences in relation to the mean. A normal distribution indicates a generally equivalent distribution of long and short sentences about the mean.

The corpora most closely approximating a normal distribution of sentence lengths were the Corpus (0.029), Grade 8 (0.060), Grade 9 Commerce (0.057), Grade 8 Mathematics (-0.065), three Grade 8 textbook corpora (Mathematics, -0.065; Science, -0.040; Social Studies, -0.042), and a Grade 10 Science textbook with the closest figure to a normal distribution (0.004).

The corpora which had the most skewed distributions included Grade 10 (0.770), English (0.772), Grade 9 English (0.737), Grade 9 Mathematics (0.794), six Grade 9 textbooks (English 0.811 and 0.822; Home Economics 0.847; Industrial Education 0.816; Mathematics 0.794; and Science 0.907), and a Grade 10 Commerce textbook, 0.909.

TABLE XXV

PEARSON'S SKEW FACTOR FOR EACH GRADE LEVEL, THE CORPUS AND SUBJECTS ACROSS THE CORPUS

Grade                 Skew
8                    0.060
9                    0.275
10                   0.770
Corpus               0.029
Commerce             0.525
English              0.772
Home Economics       0.286
Industrial Educ.     0.672
Mathematics          0.153
Science              0.357
Social Studies      -0.107

TABLE XXVI

PEARSON'S SKEW FACTOR FOR SUBJECTS IN EACH GRADE LEVEL

Subject             Grade 8   Grade 9   Grade 10
Commerce                  -     0.057      0.509
English               0.429     0.737      0.681
Home Economics        0.423     0.251          -
Industrial Educ.      0.457     0.179          -
Mathematics          -0.065     0.794      0.500
Science               0.241     0.318      0.186
Social Studies       -0.130     0.557     -0.258
Corpus                0.060     0.275      0.770

TABLE XXVII

PEARSON'S SKEW FACTOR FOR EACH TEXT

Text     Subject                   Skew
*1C01    English                  0.254
*1C02    English                  0.411
*1D01    Home Economics           0.423
*1E01    Industrial Education     0.457
*1F01    Mathematics             -0.065
*1G01    Science                 -0.040
*1G02    Science                  0.197
*1H01    Social Studies           0.443
*1H02    Social Studies          -0.042

*2B01    Commerce                -0.164
*2B02    Commerce                 0.514
*2C01    English                  0.811
*2C02    English                  0.504
*2C03    English                  0.555
*2C04    English                  0.822
*2D01    Home Economics           0.523
*2D02    Home Economics           0.226
*2D03    Home Economics           0.557
*2D04    Home Economics           0.847
*2D05    Home Economics           0.257
*2E01    Industrial Education     0.314
*2E02    Industrial Education     0.816
*2E03    Industrial Education     0.347
*2F01    Mathematics              0.794
*2G01    Science                  0.205
*2G02    Science                  0.907
*2H01    Social Studies           0.677
*2H02    Social Studies           0.455

*3B01    Commerce                 0.909
*3B02    Commerce                 0.114
*3C01    English                  0.654
*3C02    English                  0.511
*3F01    Mathematics              0.500
*3G01    Science                  0.490
*3G02    Science                  0.004
*3H01    Social Studies          -0.462
*3H02    Social Studies           0.274

Task 6.2

To illustrate the distribution of sentence lengths, sixty-six graphs were produced by the U.B.C. plotting package using a CALCOMP Drum Plotter. These are presented in APPENDIX H. The narrow range of standard deviations for the sentence lengths of the Corpus and most of the corpora is indicated by the leptokurtic nature of these graphs (sentences tend to cluster around the mean length). The greater degree of variation in sentence lengths for some corpora (English, for example) is indicated by the mesokurtic plateau of their graphs. The graphs provide a good visual illustration of the relative distribution of short and long sentences in the sixty-six corpora.

Task 6.3

The repeat-rate frequency tables for the Corpus and corpora are listed in volumes C.V., G.V., S.V., S.G.V., and T.V. (See Task 5.2). The results of Yule's K (for sentences) are presented in TABLES XXVIII through XXX. High K values indicate a greater concentration of commonly occurring sentence lengths, while low values indicate a concentration of less frequently occurring sentence lengths.

The K values for the Corpus, each of the three grade levels, and the subject areas across the grade levels are presented in TABLE XXVIII. Grade 8 has the smallest value (326.67), indicating a greater variety of sentence lengths used than in Grades 9 and 10.

TABLE XXVIII

K FACTORS (SENTENCES) FOR EACH GRADE LEVEL, THE CORPUS, AND SUBJECTS ACROSS THE CORPUS

Grade        K Factor
8              326.67
9              344.88
10             334.55
Corpus         336.35
Commerce       364.57
English        296.64
Home Ec.       361.32
Ind. Ed.       399.64
Math.          402.07
Science        334.32
Soc. St.       333.49

The great diversity of the K factor in the various subject areas within the grade levels is shown in TABLE XXIX.

TABLE XXIX

K FACTORS (SENTENCES) FOR SUBJECTS WITHIN GRADE LEVELS

Subject      Grade 8   Grade 9   Grade 10
Commerce           -    357.26     397.36
English       279.02    287.23     360.10
Home Ec.      377.56    359.24          -
Ind. Ed.      434.17    398.78          -
Math.         385.35    465.07     427.39
Science       356.30    343.89     319.42
Soc. St.      312.18    291.12     361.56
Corpus        326.67    344.88     334.55

The K factors range from 279.02 in Grade 8 English to 465.07 in Grade 9 Mathematics.

TABLE XXX presents the K factors for each textbook in the study. These results range from 204.57 to 484.43, the latter for a Grade 9 Industrial Education textbook.
FOR EACH TEXT  Subject  in  (Grade 9 E n g l i s h ) t o  TABLE XXX K FACTORS  used  K Factor  *1C01 *1C02 *1D01 *1E01 *1F01 *1G01 • 1G02 *1H01 *1H02  English English Home E c o n o m i c s I n d u s t r i a l Education Mathematics Science Science Social Studies Social Studies  244.22 297.57 377.56 434.17 385.35 403.33 348.21 311.35 308.65  *2B01 *2B02 • 2C01 *2C02 *2C03 *2C04 *2D01 • 2D02 *2C03 *2D04 *2D05 *2E01 *2E02 • 2E03 • 2F01 *2G01 *2G02 *2H01 *2H02  Commerce Commerce English English English English Home E c o n o m i c s Home E c o n o m i c s Home E c o n o m i c s Home E c o n o m i c s Home E c o n o m i c s I n d u s t r i a l Education I n d u s t r i a l Education I n d u s t r i a l Education Mathematics Science Science Social Studies Social Studies  406.81 324.85 310.80 275.85 204.57 337.54 335.97 344.99 368.25 351.62 449.12 484.43 429.83 365.36 465.07 362.81 342.44 266.15 311.16  116 TABLE XXX K FACTORS  (SENTENCES) FOR EACH TEXT  with f i v e  greatest  number  o f the f i r s t of textbooks  Studies  has f o u r t e x t b o o k s  Science  textbook  textbooks and  with  ranked  t e n textbooks with  a low v a l u e  out of the f i r s t  number  low K v a l u e s  s i x . Four  ranked,  o f K. S o c i a l  t e n and t h e r e of  has  the  i s one  first  ten  a r e i n G r a d e 8 , f o u r a r e i n G r a d e 9,  two a r e i n G r a d e 10.  Industrial Mathematics with with  382.22 409.44 400.35 226.76 427.39 348.48 285.01 385.76 311.85  Comme r c e Commerce English English Mathematics Science Science Social Studies Social Studies  • 3B01 *3B02 *3C01 *3C02 *3F01 *3G01 *3G02 *3H01 *3H02  the  K Factor  Subject  Text  English  (CONT.)  the  Education  with  two o f t h e l a s t  greatest  t e x t a l s o had a h i g h  number  three  s i x t e x t s and  s i x t e x t s a r e t h e two  o f high  K value,  of the l a s t  subjects  K v a l u e s . 
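The K values discussed above can be illustrated with a short sketch. It assumes the commonly cited form of Yule's characteristic, K = 10^4 (sum of m^2 V_m - N) / N^2, where N is the number of sentences and V_m is the number of distinct sentence lengths that occur exactly m times. This section does not restate the formula used in the study, so the scaling constant, the function name `yules_k`, and the sample data are assumptions made for illustration only.

```python
from collections import Counter


def yules_k(items):
    """Yule's K for a sequence of observations (assumed standard form).

    `items` might hold the word count of every sentence in a corpus.
    """
    type_freqs = Counter(items)              # how often each sentence length occurs
    n = sum(type_freqs.values())             # N: total number of observations
    if n == 0:
        raise ValueError("at least one observation is required")
    spectrum = Counter(type_freqs.values())  # V_m: number of lengths seen exactly m times
    s2 = sum(m * m * v for m, v in spectrum.items())
    return 10_000 * (s2 - n) / (n * n)


# Hypothetical sentence lengths, not taken from the Corpus.
sample = [10, 10, 12, 15, 15, 15, 20]
print(round(yules_k(sample), 2))  # 1632.65
```

Under this form a corpus in which a few sentence lengths are repeated often yields a high K, and a corpus in which every length occurs once yields zero, which matches the reading of high and low K values given in Task 6.3.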
Task 7  Generate comparative and statistical analyses of the distribution of the 100 most frequently occurring word-types of the Corpus across the three grade levels, the seven subject areas, and the subject areas within the three grade levels.

7.1  Test the following null hypotheses.

Hypothesis 1:  There are no significant differences in the actual distribution of the 100 most frequent word-types of the Corpus when compared to the expected distribution of each word-type for:

1.1  the three grade levels of the Corpus,
1.2  the seven subject areas of the Corpus,
1.3  the subject areas within Grade 8,
1.4  the subject areas within Grade 9,
1.5  the subject areas within Grade 10.

7.2  Investigate and describe the number of word-types which differed significantly in their distribution across each of the areas tested in Task 7.1.

Task 7.1  In this analysis, a list of the 100 most frequently occurring words (in terms of total occurrence) in the total Corpus was used as the basis for comparison. The basic task was to answer the question, "Do the 100 most frequently occurring words derived from the total Corpus have similar frequencies of occurrence when the samples are organized by grade level, subjects across grades, and subjects within grades?" Acceptance of the null hypotheses would indicate that there is substantial similarity between the frequency of occurrence of the 100 most frequently occurring words for the Corpus and the most frequently occurring words in the various corpora. Chi-square tests were not computed for the thirty-seven textbooks but it would have been possible to do so. A total of 500 chi-squares were computed. Complete data for the chi-square analyses are available in APPENDIX I.

TABLE XXXI provides a summary of the chi-square results. Only two words have similar frequencies of occurrence across all corpora: "as" and "very". The other ninety-eight words exhibit considerable variation in their frequency of occurrence across the various corpora. In all, the null hypothesis was rejected in a total of 372 out of 500 tests, with the level of significance set at .01.

TABLE XXXI
CHI-SQUARE ANALYSIS OF THE 100 MOST FREQUENT WORD-TYPES IN THE CORPUS ACROSS GRADES, SUBJECTS, AND SUBJECTS WITHIN GRADES

Rank   Word
  1    THE
  2    OF
  3    AND
  4    A
  5    TO
  6    IN
  7    IS
  8    THAT
  9    IT
 10    ARE
 11    FOR
 12    YOU
 13    BE
 14    AS
 15    OR
 16    WITH
 17    ON
 18    THIS
 19    BY
 20    WAS
 21    HE
 22    FROM
 23    HAVE
 24    AT
 25    WHICH
 26    ONE
 27    NOT
 28    CAN
 29    YOUR

Grades          **** ** **** **  **** ** ** * -* ** ******  *-*  --  ** **
Subjects (C)    ** ** ** ** ** ** ** ** ** ** ** ** **  **** ** ** ** ** ** ** ** ** ** ** ** ** **
Subjects (8)    ** ** ** ** **  **** ** ** ** ** **  **  ** ** ** ** ** ** ** ** ** ** **  **
Subjects (9)    ** ** ** ** **  ****  ** ** ** **  **-  **** ** ** ** ** ** ** ** **** **
Subjects (10)   ** ** ** ** ** ** ** ** ** ** ** ** **  ****  **  ** ** ** ** ** ** ** ** ** **

TABLE XXXI (CONT.)
CHI-SQUARE ANALYSIS OF THE 100 MOST FREQUENT WORD-TYPES IN THE CORPUS ACROSS GRADES, SUBJECTS, AND SUBJECTS WITHIN GRADES

Rank   Word
 30    THEY
 31    WE
 32    HIS
 33    WILL
 34    IF
 35    AN
 36    WHEN
 37    ALL
 38    BUT
 39    THESE
 40    MAY
 41    THERE
 42    HAS
 43    I
 44    OTHER
 45    SOME
 46    MORE
 47    WHERE
 48    HAD
 49    THEIR
 50    USED
 51    MANY
 52    SO
 53    EACH
 54    TWO
 55    ABOUT
 56    SHOULD
 57    WHAT
 58    THAN
 59    BEEN
 60    INTO
 61    THEM
 62    USE
 63    MAKE
 64    DO
 65    UP
 66    SUCH
 67    THEN
 68    TIME
 69    ITS
 70    WOULD
 71    HOW
 72    NUMBER
 73    MADE
 74    OUT
 75    MOST
 76    ONLY

Grades          ** **  ** **  ** -  **-  -  ** -  **  ** ** **  -  **  **  **  **  **  **  ** -  ** ** **-
Subjects (C)    ** ** ** ** **  ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** **  ** ** ** ** ** ** ** **  ** ** ** ** ** ** ** ** **
Subjects (8)    ** ** ** ** ** **  **  ** ** ** **  --  ** ** ** ** ** ** ** ** ** ** **  -  **  ** ** ** **  -  ** ** ** ** **  **
Subjects (9)    ** ** ** ** **  **  ** ** ** **  **  ** **  ** ** ** ** **  ** ** ** ** ** **  ** ** ** ** **  **  ** ** ** ** **  --
Subjects (10)   ** ** ** ** **  ** ** ** ** ** ** ** ** -  - -  ** ** ** ** ** ** ** ** ** ** ** ** - -  ** **  ** ** -  ** ** ** ** ** **  ** ** **

TABLE XXXI (CONT.)

Rank   Word
 77    NO
 78    MUST
 79    WATER
 80    ALSO
 81    FIRST
 82    VERY
 83    GOOD
 84    HIM
 85    SAME
 86    COULD
 87    WHO
 88    ANY
 89    BECAUSE
 90    SEE
 91    LIKE
 92    MUCH
 93    PEOPLE
 94    CALLED
 95    PLACE
 96    THROUGH
 97    WORK
 98    NEW
 99    SMALL
100    OVER

Grades   Subjects (C)   Subjects (8)   Subjects (9)   Subjects (10)
- ** **
** ** ** **  ** ** -  -  -  ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** ** **  -  ** **  -  -
****  --  ** ** ** ** ** ** ** **
-  -  -  ** ** ** ** ** ** ** ** ** ** ** **  - -  -  ** ** -  ** **  ** ** ** ** **  **  -  -  ** -  ** **  **-  -
** **
** ** ** ** ** ** ** ** **  **
-  -  -

** SIGNIFICANT AT THE .01 LEVEL.
-  NOT SIGNIFICANT.

Task 7.2  A breakdown of the chi-square results for grades, subjects across grades, and subjects within grades is presented in TABLE XXXII. The greatest similarity in the distribution of commonly occurring words appears when the samples are organized by grades. Only 46 out of 100 chi-square tests were rejected. In the case of subjects across grades, 94 chi-square tests were rejected. Similar results are evident for subjects within Grades 8, 9, and 10 with 72, 76, and 81 chi-square tests rejected. These results are also shown in the pattern of rejection in TABLE XXXI.

TABLE XXXII
SUMMARY OF CHI-SQUARE ANALYSIS OF 100 MOST FREQUENT WORD-TYPES IN THE CORPUS ACROSS GRADES, SUBJECTS, AND SUBJECTS WITHIN GRADES

             Gdes in    Subj. in   Subj. in   Subj. in   Subj. in
             Corpus     Corpus     Gde 8      Gde 9      Gde 10
Sig.           46         94         72         76         81
Non Sig.       54          6         28         24         19
Total         100        100        100        100        100
d.f.            2          6          5          6          4

Task 8  Do the sentence length distributions of the three grade level corpora, the seven subject area corpora, and the eighteen subject within grade level corpora differ from the sentence length distribution of the Corpus? This task involved testing the following null hypotheses.
Hypothesis 2:  There are no significant differences in the actual distribution of short, average, and long sentences when compared to the expected distribution of each of the sentence lengths for:

2.1  the three grade levels of the Corpus,
2.2  the seven subject areas of the Corpus,
2.3  the subject area corpora within Grade 8,
2.4  the subject area corpora within Grade 9,
2.5  the subject area corpora within Grade 10.

Task 8  The purpose of this analysis was to determine if the sentence length distributions were similar across the various corpora. It would have been unwieldy to determine the similarity in distributions of all sentence length types across all the corpora involved. For example, 94 different sentence lengths were identified in the Corpus alone, ranging in length from one word to 117 words. The decision was made to select a number of sentence lengths representative of the Corpus as a whole and determine if these sentences exhibited similar distributions across the various corpora. The relative frequency of occurrence distribution for the Corpus (See FIGURE 7.1, APPENDIX H) was carefully scrutinized and five sentence lengths representative of the total Corpus were chosen. Two sentence lengths represent a group on either side of the Corpus mean (10 and 20 words respectively), and three sentence lengths represent groups of larger sentences on the positively skewed end of the distribution curve (30, 40 and 50+ words in length respectively). The 50+ group represents the small quantities and great varieties of sentences over 50 words in length.

The basic task was to answer the question, "Do the five sentence lengths derived from the Corpus have similar distributions when the samples are organized by grade levels, subjects across grades, and subjects within grades?" Acceptance of the null hypothesis would indicate that there is substantial similarity between the distribution of the five selected sentence lengths for the Corpus and the distribution of sentences of these lengths across the various corpora. Chi-square tests were not computed for the thirty-seven textbooks but it would have been possible to do so. Complete data for the chi-square analyses of the five selected sentence lengths across the various corpora are available in APPENDIX J. A total of twenty-five chi-square tests were computed.

TABLE XXXIII provides a summary of the chi-square results. In all tests the null hypothesis was rejected at the .01 level of significance, illustrating the diversity in sentence length distributions for the representative sentences by grades, subjects across grades, and subjects within grades. It should be pointed out that there is greater apparent similarity in the sentence length distributions for the sample organized by grades than when they are organized by subjects across grades or subjects within grades.

TABLE XXXIII
CHI-SQUARE ANALYSIS OF SELECTED SENTENCE LENGTHS FOR THE GRADES, SUBJECTS ACROSS GRADES, AND SUBJECTS WITHIN GRADES

              Grades    Subjects    Subj. in   Subj. in   Subj. in
                        Corpus      Gde 8      Gde 9      Gde 10
X2 value      21.98     152.23      53.33      109.63     41.68
.01 level     18.48      42.98      37.57       42.98     32.00
d.f.              8         24         20          24        16

Task 9  Develop an "elimination technique" for selecting the most significant content words in a word list using the ranked frequency lists developed for the Corpus, the three grade level corpora, and the subject area corpora.

9.1  Produce a set of graphs to illustrate the word frequency by rank of the Corpus, the three grade level corpora, and the seven subject area corpora.

9.2  What is the effect of eliminating the highest frequency words and the lowest frequency words from the total spectrum of words for each of the areas stated in 9.1?

9.3  Can the residual of words remaining after eliminating the high and low frequency words described in 9.2 serve as a pool for selecting the most useful content words for the Corpus, the three grade level corpora, and the seven subject area corpora, through analyses based on relative frequency of occurrence and subjective criteria?

Task 9.1  This task was designed to determine the feasibility of developing an "elimination technique" for selecting the most significant vocabulary from a list of words derived from samples of natural language representative of prescribed subject materials. The graphs illustrating the word frequency distributions for the Corpus, grade level, and subjects across grades corpora used in this analysis are presented in APPENDIX K. The graphs for subjects within grades and the thirty-seven textbooks were not plotted although it would be possible to do so.

The graphs approximate the shape usually found in the tabulation of word frequency data. The graphs for the three grade level and subject corpora have the same general shape as the word frequency distribution for the Corpus. The Corpus graph (see FIGURE 5) illustrates the high frequency for the first 100 most common words, the clustering of word frequencies about the mid point of the graph, the gradual tailing off to words occurring three times or less, and the final tabulation of the hapax legomena.

FIGURE 5
WORD FREQUENCY DIAGRAM OF THE CORPUS
(horizontal axis: Words Ranked In Descending Order)

Task 9.2  The word frequency graph of the Corpus (FIGURE 5) is used to illustrate the effect of applying the "elimination technique" to a word frequency list. Point A represents the cutoff point for 50 percent of the high frequency tokens in the Corpus and Point B represents the cutoff point for 10 percent of the low frequency tokens in the Corpus. The remaining 40 percent of tokens between points A and B are considered to represent the "significant" body of content words for the Corpus. The distribution for these words would most likely approximate a normal curve with a mean frequency of occurrence and proportionate tailing off from both sides of the mean.

FIGURE 6
APPLICATION OF THE "ELIMINATION TECHNIQUE" TO THE WORD FREQUENCY DIAGRAM OF THE CORPUS
(horizontal axis: Words Ranked In Descending Order)

TABLE XXXIV presents the data for determining the number of word-types and the percentage of the total word-types accounted for by both the cutoff points A and B in the Corpus, the three grade level corpora, and the seven subject area corpora.

TABLE XXXIV
NUMBER AND PERCENTAGE OF WORD-TYPES ELIMINATED BY POINT A (50% CUTOFF OF TOKENS) AND POINT B (10% CUTOFF OF TOKENS)

                              Point A                  Point B
                         No. of      % of        No. of      % of       Total No. of
                       Word Types  Word Types  Word Types  Word Types    Word Types
Corpus                     111        0.68       12,695      77.40        16,405
Grade 8                     94        1.33        4,593      65.36         7,027
Grade 9                    109        0.96        7,730      67.80        11,401
Grade 10                   118        1.52        4,906      63.40         7,736
Commerce                    82        2.72        1,904      63.00         3,020
English                     92        1.30        4,010      72.60         7,079
Home Economics              81        1.47        3,832      69.30         5,529
Industrial Education        90        2.22        2,433      60.00         4,060
Mathematics                 53        2.72        1,298      66.50         1,952
Science                     96        1.99        2,968      61.41         4,833
Social Studies             120        1.93        3,144      50.62         6,211

The words up to Point A account for a very small number of word-types in each of the eleven distributions. The Corpus, which had 111 word-types 'eliminated', represented the smallest percentage (0.68), while Commerce and Mathematics with 82 and 53 word-types respectively, both had 2.72 percent of their total word-types deleted.

The great majority of the word-types which would be deleted from the frequency distribution using this technique would be high frequency structure words which are similarly common to the
Corpus and the ten corpora investigated. These words, which constitute 'noise in the system', are not considered distinct enough to have special significance for the content material they represent.

The words up to Point B account for the majority of the word-types in each of the eleven distributions. The numbers of word-types being 'eliminated' ranged from a low of 3,144 (50.62 percent) in Social Studies, to a high of 12,695 (77.4 percent) in the Corpus.

Most of the low frequency words deleted would be items which occur only several times in a distribution. These words are considered to be too rare to have special significance for their respective content materials.

The complete listing of the word-types and the tokens for each of the eleven areas is provided in the following volumes available from the Computing Centre, at the University of British Columbia. The organization of these volumes was described previously under Task 4.

1) Corpus (Volume C.V.)
2) Grade Levels (Volume G.V.)
3) Subject Areas (Volume S.V.)

Task 9.3  The balance of the words remaining between points A and B (approximately 40 percent of tokens) in each of the corpora are considered to be neither too common nor too rare and have the greatest significance for the content material they represent. TABLE XXXV presents the number and percentage of word-types between the A and B cutoff points for the Corpus, grades, and subjects across grades corpora. The vast majority of word-types in this section of the distributions are lexical items which occur seven times or more in the Corpus, and three times or more in most of the corpora.
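The cutoff procedure described for Points A and B can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the program used in the study: the function name `elimination_bands`, the alphabetical tie-breaking, and the rule that the word-type which carries the running total past a cutoff is itself eliminated are all choices made here for the example, and the counts are hypothetical.

```python
def elimination_bands(freqs, high_share=0.50, low_share=0.10):
    """Split a word-frequency table into high, middle, and low bands.

    freqs maps word-type -> token count. Returns three lists of
    (word, count) pairs: the types before Point A, the types between
    Points A and B, and the types after Point B.
    """
    # Rank by descending frequency; ties are broken alphabetically (an assumption).
    ranked = sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))
    total = sum(count for _, count in ranked)

    # Point A: walk down from the top until high_share of all tokens is covered.
    a, running = 0, 0
    while a < len(ranked) and running < high_share * total:
        running += ranked[a][1]
        a += 1

    # Point B: walk up from the bottom until low_share of all tokens is covered.
    b, running = len(ranked), 0
    while b > a and running < low_share * total:
        b -= 1
        running += ranked[b][1]

    return ranked[:a], ranked[a:b], ranked[b:]


# Hypothetical counts, not taken from the Corpus (100 tokens in all).
toy = {"the": 50, "of": 20, "water": 10, "plant": 8,
       "cell": 6, "xylem": 3, "stoma": 2, "hapax": 1}
high, middle, low = elimination_bands(toy)
print([word for word, _ in middle])  # ['of', 'water', 'plant']
```

In the toy data a single word-type supplies half of the tokens, which mirrors the observation that Point A removes very few word-types while Point B removes most of them.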
TABLE XXXV
NUMBER AND PERCENTAGE OF WORD-TYPES BETWEEN POINT A AND POINT B (40% OF TOKENS) FOR THE CORPUS, GRADES, AND SUBJECTS ACROSS GRADES

                         No. of        % of        Total No. of
                       Word-Types   Word-Types      Word-Types
Corpus                    3,599        21.92          16,405
Grade 8                   2,340        33.31           7,027
Grade 9                   3,562        31.24          11,401
Grade 10                  2,712        35.08           7,736
Commerce                  1,034        34.28           3,020
English                   2,977        26.10           7,079
Home Economics            1,616        29.23           5,529
Industrial Education      1,537        37.78           4,060
Mathematics                 601        30.78           1,952
Science                   1,769        36.60           4,833
Social Studies            2,947        47.45           6,211

It is interesting to note that the really significant lexical content, once the common words and the rarely occurring words are 'eliminated', consists of a relatively small number of word-types when compared to the total for the complete sample organized by Corpus, grades, and subjects across grades. Further subjective evaluation of the words remaining between points A and B by subject specialists would no doubt reduce the total even further.

SUMMARY

This chapter has presented the analysis of the data and the findings for the study. Nine tasks were involved and the completion of the tasks resulted in the production of some 5,500 pages of printed material which included facsimiles of the instructional materials sampled plus the sixty-six accompanying word lists and tables, graphs, and statistical summaries.

Task 1  A representative sample of instructional materials was selected and organized. The Corpus consisted of 469 five hundred word samples of natural language taken from thirty-seven prescribed English language textbooks representing seven subject areas.
Task_2 the  The  C o r p u s was keypunched  o n t o IBM c o m p u t e r  FMT computer p r o g r a m and s t o r e d  on d i s k  t o await  cards  using  processing.  T a s k _ 3 Two e d i t i o n s o f t h e C o r p u s were p r o d u c e d .  One e d i t i o n  organized  subject  areas.  o f an a d d i t i o n a l s i x t y - f i v e  corpora  This with  by g r a d e l e v e l s  enabled  the production  t h e samples o r g a n i z e d  subject  areas,  thirty-seven  Tasl___  and one o r g a n i z e d  eighteen  by  the  three  subjects  within  by  grade grade  levels, levels,  was  seven and  textbooks.  Two word  frequency  lists,  one  organized  alphabetically  131 and the  one  organized  by d e s c e n d i n g  C o r p u s and t h e s i x t y - f i v e  r a n k o f word f r e q u e n c y were d e v e l o p e d  Task_5_ based  total  corpora.  Yule's  was  loading of s p e c i f i c  with  were  generated  Corpus  and t h e  K  was computed t o  in  Grade  9 in  as  and  vocabulary Education  and  Social  a much h i g h e r  were o r g a n i z e d Social  which a l s o h a s t h e the  three  grades.  subject  corpora  p r o p o r t i o n of  load. Social  word-  Studies also  Studies  have t h e l a r g e s t  and 8;  have  English  Home  the largest  subject  have  largest  vocabulary  and  the  Home  Economics  l o a d f o r G r a d e 9; a n d .  S t u d i e s have t h e l a r g e s t  vocabulary  grades.  E c o n o m i c s and I n d u s t r i a l  subject corpora  have t h e l a r g e s t  the largest  by s u b j e c t s w i t h i n  Studies  f o r Grade  and S o c i a l  With  by  a c r o s s t h e t h r e e g r a d e l e v e l s by  a greater vocabulary  load  English  Science  measured  p r o p o r t i o n of word-types.  Economics  corpora  also  the  material  word-types  E n g l i s h having  suggesting  a high  order  corpora.  
introduced  When t h e s a m p l e s  and  analyses  E c o n o m i c s and E n g l i s h were t h e two l a r g e s t  types  the  t h e c o n c e n t r a t i o n o f commonly o c c u r r i n g v o c a b u l a r y i n  tokens  subjects,  Home  and a s c e n d i n g  characteristic  when t h e s a m p l e s were o r g a n i z e d  had  presenting  corpora.  of  h e a v i e s t l o a d o f new r e a d i n g  heaviest Home  statistical  were p r o d u c e d f o r  Tables  i n descending  characteristics  each o f t h e s i x t y - s i x The  figures  and  on t h e l e x i c a l  illustrate  corpora.  f o r each o f t h e s i x t y - s i x  Comparative  sixty-five  rank o r d e r ,  subject corpora  and  l o a d f o r Grade 10.  t h e samples o r g a n i z e d  by t e x t b o o k s ,  a Grade  10 S o c i a l  132 S t u d i e s t e x t and a Grade 9 I n d u s t r i a l largest text  subject  corpora  Education  w h i l e t h e same G r a d e  and a G r a d e 9 E n g l i s h  text  have  the  text  have  10 S o c i a l  largest  the  Studies  vocabulary  load. Thus, i t i s e v i d e n t word-type by  and t o k e n  distribution  the v a r i o u s grade, The  application  occurring  had  the lowest  K value  the  greatest  variety  except  English  have  the  Task_6_ based  developed  types  repeated the t h r e e  used  had  the various  Commerce  and  by  textbooks,  English  texts  K values.  statistical  analyses  were  generated  l e n g t h d i s t r i b u t i o n s o f t h e Corpus and t h e  Graphs f o r each  for this  uniform  were  English  d e n s i t y o f commonly o c c u r r i n g words.  organized  and  density  t h a t Grade 9  across  10 where  of  task. Yule's of  the  sixty-six  corpora  K characteristic  was u s e d  commonly  i n t h e C o r p u s and s i x t y - f i v e  Relatively samples  again indicated  vocabulary  describe the concentration  length  the  of  have t h e l o w e s t  corpora.  
Home Economics and English were consistently lowest on the K characteristic as a measure of vocabulary concentration.

Task 6. Comparative analyses of the sentence corpora indicated that there is considerable diversity in average sentence lengths when the samples are organized by subject and text corpora. Fairly uniform average sentence lengths were found when the samples were organized by grades. The standard deviation and coefficient of variation values reported for the samples show considerable variation in sentence length distributions, with the greatest variation occurring in the English samples.

Application of Yule's K characteristic as a measure of the density of commonly occurring sentence lengths indicated that Grade 8 exhibits the lowest K value for samples organized by grades; English had the lowest K value for samples organized by subjects across grades and by subjects within Grades 8 and 9; Science had the lowest K value for subjects within Grade 10; and English texts had the lowest K value for samples organized by textbooks.

Task 7. Chi-square tests were computed to illustrate the distribution of the 100 most commonly occurring word-types of the Corpus across the three grade level corpora, the seven subject area corpora, and the eighteen subjects within grade level corpora. A total of 500 chi-square tests were computed to determine the nature of the distribution of the 100 most common words. The results indicated that there was significantly more variability in the use of the most common vocabulary when the samples were organized by subjects across grades, and subjects within grades, than when they were organized by Grades 8, 9, and 10.

Task 8. Chi-square tests were computed to illustrate the distribution of selected common sentence lengths of the Corpus across the three grade level corpora, the seven subject area corpora, and the eighteen subjects within grade level corpora. An analysis of the chi-square results indicated that there was greater diversity in sentence length distributions when the samples are organized by subject areas or subjects within grades than when they are organized by grades. Significant diversity exists even within a particular grade level.

Task 9. An "elimination technique" for selecting the most "significant" content words in the word lists of the Corpus, the seven subject area corpora, and the three grade level corpora was described. A set of word frequency by rank graphs was constructed to represent each of the eleven areas investigated.

The use of the "elimination technique" to determine "significant" words in a body of print material illustrated the great influence a relatively small number of highly frequent words have on the structure of the word frequency distribution. Once these very common words had been "eliminated", along with a number of rare words, a core of "significant" vocabulary is available for examination and analysis.

Computer Techniques

Computer techniques were used extensively in most aspects of the study. Over 200 specially prepared computer programs were developed and used to generate a Corpus of 235,107 words; sixty-five smaller corpora representing grade levels, subject areas, subject areas within grades, and textbooks; numerous tables, figures, and graphs; statistical procedures to analyze the material; and the final print copy of the dissertation itself.

The data generated by the study occupied over 5,500 pages, which were formatted using the FMT program and organized into twenty-eight volumes. All information pertaining to the study was stored in computer files and on magnetic tapes in the Computing Centre, and a copy was placed in the Special Collections Division of the Library at the University of British Columbia.

The 3270 Conversational Terminal (CRT) was used throughout the study to monitor the input of data, edit the material, and organize the production of word lists and accompanying statistics.

Apart from the magnitude of the data management task required to compile the Corpus, which contained nearly a quarter of a million words, the need to develop the sixty-five corpora further complicated the study. The material used for the Corpus was also used to develop the corpora, and considerable reorganization was necessary to produce two word lists (one in alphabetical order and the other in descending rank order) for the Corpus and for each of the sixty-five corpora. In addition, it was necessary to develop tables illustrating the word and sentence length characteristics of the Corpus and corpora, to provide relevant graphs and tables for a thorough examination of the material, and to test the statistical hypotheses.
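The word-list production summarized above (an alphabetical list, a descending rank order list, type and token counts, and Yule's K) can be illustrated with a short modern sketch. This is not the FMT program or any of the programs written for the study; the tokenization rule, the function names, and the sample sentence are assumptions made purely for illustration. Yule's K is computed with the standard formula K = 10^4 (sum of m^2 V(m) - N) / N^2, where N is the number of tokens and V(m) is the number of word-types occurring exactly m times.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into word tokens.

    The rule used here (runs of letters, with an optional internal
    apostrophe) is an assumption, not the rule used in the study.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def word_lists(tokens):
    """Return (alphabetical, by_rank) lists of (word-type, frequency) pairs."""
    freq = Counter(tokens)
    alphabetical = sorted(freq.items())
    # Descending frequency; ties are broken alphabetically.
    by_rank = sorted(freq.items(), key=lambda item: (-item[1], item[0]))
    return alphabetical, by_rank


def yules_k(tokens):
    """Yule's K: 10^4 * (sum(m^2 * V(m)) - N) / N^2."""
    freq = Counter(tokens)
    n_tokens = sum(freq.values())
    if n_tokens == 0:
        raise ValueError("cannot compute K for an empty sample")
    # spectrum[m] is the number of word-types that occur exactly m times.
    spectrum = Counter(freq.values())
    s2 = sum(m * m * v for m, v in spectrum.items())
    return 10_000 * (s2 - n_tokens) / (n_tokens * n_tokens)


if __name__ == "__main__":
    sample = "The cat sat on the mat. The dog sat."  # hypothetical sample text
    tokens = tokenize(sample)
    alphabetical, by_rank = word_lists(tokens)
    print("tokens:", len(tokens), "types:", len(alphabetical))
    print("rank order:", by_rank)
    print("Yule's K:", round(yules_k(tokens), 2))
```

A higher K indicates a heavier concentration of tokens on a few word-types (more repetition), and a lower K indicates less redundancy. The same functions could be applied to a list of sentence lengths instead of words by passing the lengths as the "tokens".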
The use of existing computer techniques, plus the development of new computer programs, enabled the tasks described above to be carried out rapidly and also provided the necessary statistical information required for the study.

CHAPTER V

DISCUSSION, CONCLUSIONS, AND RECOMMENDATIONS

This chapter presents a discussion of the major findings of the study. A number of conclusions drawn from the findings are given and the relationship of these conclusions to the role of reading in the secondary school is discussed. Finally, recommendations for future research resulting from the study are presented.

The central focus of the study involved the use of computer technology and natural language research techniques to 1) develop a representative Corpus and related corpora from samples of English language instructional materials prescribed for use in British Columbia junior secondary grades, and 2) make a number of descriptive and comparative analyses of selected word and sentence features of the Corpus, and of the grade level, subject area, and textbook corpora.

The study was organized into a series of nine methodological tasks which involved selecting sampling procedures, methods of text recording, data processing and analysis, and the posing of relevant research questions to be answered and null hypotheses to be tested. A Pilot Study was conducted prior to the commencement of the investigation to validate the sampling and data collecting procedures, and to generate needed computer programs.

DISCUSSION OF MAIN FINDINGS

The detailed findings of the study are presented in Chapter IV.
They are discussed here under the headings: Sampling and Processing Procedures, Production of the Corpus, Production of Word Lists, Lexical Characteristics, Sentence Characteristics, Common Words, Selected Sentence Lengths, and Elimination Technique.

Tasks 1 and 2: Sampling and Processing Procedures

The 235,107 word Corpus derived from the 469 five-hundred-word samples taken from the thirty-seven textbooks was developed and prepared for computer processing.

The total sample used in the study provided an adequate data base for the various descriptive, comparative, and statistical analyses performed on the Corpus and sixty-five corpora. The FMT computer program was an ideal instrument for the processing of the natural language samples used in the investigation.

Task 3: Production of the Corpus

Two copies of the Corpus, one organized by grade levels (C.G.) and the other organized by subject areas (C.S.), were produced.

The organization of the two editions of the Corpus (one arranged by grade levels and the other arranged by subject areas) provided useful access to the 469 samples used in the study.

Task 4: Production of Word Lists

The various samples were organized into alphabetical and descending rank order word lists for the Corpus (C.V.), the three grade levels (G.V.), the seven subject areas (S.V.), the eighteen subjects within grades (S.G.V.), and the thirty-seven textbooks (T.V.). Statistical summary tables of types and tokens for each of the sixty-six corpora described above were also developed.

The development of word lists in both alphabetical and descending frequency rank order, along with statistical tables, facilitated the rapid location of specific word-types throughout the Corpus and the sixty-five corpora.

Task 5: Lexical Characteristics

The processing of the total Corpus of 235,107 tokens resulted in the identification of 16,405 specific word-types. These results are proportionately similar to the type and token distributions generated in other recent research based on computer generated corpora of various sizes including: Kucera and Francis (1967) 50,406 types, 1,014,232 tokens; Carroll, Davies, and Richman (1971) 86,741 types, 5,088,721 tokens; and Harris and Jacobson (1972) 80,000 types, 4,500,000 tokens.

A pattern was evident in type and token characteristics for samples organized by grade levels. The Grade 9 corpus was considerably larger, in terms of both tokens and specific word-types, than that for either Grade 8 or Grade 10, and nineteen textbooks were used in Grade 9. This suggests that the reading load for this grade, based on sheer quantity of reading material, is potentially heavier. However, it should be noted that textbooks used in Grade 9 English, Home Economics, and Industrial Education can be used in Grade 10. With this in mind, one might assume that a marked increase in the quantity of reading content and vocabulary exposure occurs in the middle year of the junior secondary grades and most likely continues into Grade 10. Further research would have to be conducted to determine the reading demands in Grades 11 and 12.

With the samples organized by subjects across grades, subjects within grades, and textbooks, no apparent pattern existed in the data except for considerable diversity in word-type and token distribution. However, application of Yule's K characteristic, which provides a statistical index of the concentration of vocabulary within print materials, resulted in clear trends based on grade, subject, and text corpora. Grade 9 also had the least redundancy in vocabulary of the three grades; English had the least vocabulary redundancy in comparison to other subjects within each grade (with the exception of Commerce, which exhibited a low K value); and Home Economics and English textbooks had the least vocabulary redundancy in comparison to other textbooks (with the exception of the low K value for Commerce texts). Considering that K is a measure of repeat rate, based on the degree to which tokens are distributed over the various word-types, English and Home Economics clearly make large vocabulary demands. The token distributions for all subject areas tend to have proportionately different frequency distributions of words, and the word lists also display considerable variation.

These results provide striking evidence for the value of using measures such as the K characteristic to supplement the usual type and token statistics computed in word frequency studies. Determining the relative number of specific word-types and tokens can provide useful data, but a statistical measure of redundancy in the occurrence of those words allows sharper differentiation of the real vocabulary demands of the various subject areas.

Task 6: Sentence Characteristics

No apparent pattern in sentence length distribution was evident for the samples organized by grades. With the samples organized by subjects across grades, subjects within grades, and textbooks, a considerable range of variation in sentence length was apparent. In addition, English overwhelmingly exhibited the largest standard deviations and coefficients of variation, with exceptional sentence length statistics.

Application of the K characteristic indicated that Grade 8 had the least redundancy in repeat rate of sentence lengths across grades; English had the least redundancy for samples organized by subjects across grades and subjects within Grades 8 and 9; Science had the least redundancy for samples organized by subjects within Grade 10; and English had the least redundancy for samples organized by textbooks.

English again, as in the case of vocabulary, makes exceptional demands in terms of sentence length variety.

Although English is focused on here because of its rather significant demands in terms of lack of vocabulary redundancy and minimal sentence length repetition, it should be pointed out that with the data available from this study, it is possible to easily develop useful descriptive statements on vocabulary redundancy and sentence length repeat rate for a great variety of configurations for the samples organized by grades, subjects, and textbooks.

Task 7: Common Words

The type and percentage of "common words" found to be most frequently represented in the samples of this study are relatively similar to the results obtained in other word count
Task_7__Common__ords The  type  frequently relatively  and p e r c e n t a g e represented  similar  in  o f "common words" f o u n d the  samples  to the r e s u l t s obtained  of  this  i n other  t o be study word  most are count  143  studies.  (See  f o r example,  the  three  corpora  referred  to  previously) . The  results  organized  by g r a d e s ,  grades, little  f o r the  provide  statistical  occurring  evidence  w o r d - t y p e s used  groupings  style are of  than  the  significantly  organized grades  by g r a d e s ,  provide  little  samples  samples f o l l o w uniformity by  suggest  In  those that  There  tended to  organized  by  most  gross  areas  the frequency  words f o u n d  i n English.  analysis  for  evidence  samples  f o r the assumption  i n the d i s t r i b u t i o n no  subject  or  pattern.  of  grade  There  across  found  and  within  f o r the  common  content  that  representative  groupings d i d the  tends  to  t h e s a m p l e s o r g a n i z e d by g r o s s g r a d e  subjects  parallel  the  s u b j e c t s a c r o s s g r a d e s , and s u b j e c t s w i t h i n  a homogeneous  with  even  of the separate subject  chi-square  statistical  lengths.  that  Lengths  the  uniformity exists  sentence  than  of  of  instrumental i n affecting  Task.8: ..Selected,Sentence  results  and s u b j e c t s w i t h i n  i n writing.  o c c u r r e n c e o f even t h e most common  The  samples  when t h e y were o r g a n i z e d by s u b j e c t s . The  and c o n t e n t c h a r a c t e r i s t i c s thus  for  f o r the assumption  i n the distribution  be a g r e a t e r u n i f o r m i t y w i t h grade  analyses  s u b j e c t s a c r o s s grades,  uniformity exists  commonly  chi-square  grades.  
The sentence length results parallel the word analysis and again suggest that the style and content characteristics of the separate subject areas are also significantly instrumental in affecting the frequency of occurrence of representative sentence lengths.

Task 9: Elimination Technique

An elimination technique was developed, using the Corpus word list as a model, with cutoff points suggested at 50 percent of the high frequency tokens and 10 percent of the low frequency tokens. This analysis also revealed that even though the large majority of running words are fairly common, the Corpus and the separate grade and subject corpora contain a large number of rare word-types. Full comprehension of print sources would involve knowledge of all word-types. However, this is seldom possible, and the elimination technique is useful in ascertaining the most significant vocabulary for instructional purposes. The elimination technique (based on the elimination of highly frequent and relatively rare words) could be useful in determining the most significant content in a word list when coupled with the application of judgment by subject specialists.

Word frequency can be justified as a measure of word significance. Certain words that serve to tie the writing together are normally repeated as an author develops a topic, and when the most significant words are separated out, vocabulary lists with high content significance result. This is particularly true in expository writing, where there is little probability that a given word is used to reflect more than one idea.
CONCLUSIONS

In Chapter I the question was asked, "What are the linguistic characteristics of the print sources prescribed for use in Canadian secondary schools?" This study provides partial answers to that question for materials prescribed for the junior secondary grades. The major answer to the question can be stated: "Print sources exhibit extremely diverse characteristics when examined in relation to quantity of material prescribed, vocabulary redundancy, sentence characteristics, distribution of common words, and the distribution of representative sentence types."

In fact, little congruity of pattern exists across the samples of the study when the results are organized to reflect the print sources prescribed for grades, subjects across grades, subjects within grades, and the textbooks themselves. The variability is marked even when looking at lexical data based on straightforward variables such as word frequency and sentence length.

In all cases, organization of the samples into the three grade levels masked differences that were obvious when the samples representing the print sources were organized into the various subject groupings across grades, within grades, and by textbooks. It would thus be more precise to speak of reading demands in the junior secondary years in terms of subjects across the three grades, separate subjects within grades, or by text, rather than by gross grade level alone.

The reading materials in each subject area make unique demands as print sources when compared within subjects or to other subject areas across grades. Uniformity is consistently lacking in the distribution of even the most common words comprising 50 percent of running words. The same holds true for the distribution of a representative set of sentence lengths.

While there is considerable variation in vocabulary and sentence demands in all subjects, the unique demands of the English genre must be pointed out. English materials (and to some extent Home Economics and Commerce) exhibit great variability in vocabulary and sentence length characteristics. Other subject area materials tend to have greater homogeneity in vocabulary concentration and sentence length redundancy.

It is assumed here that variability is related to reading difficulty, and that materials exhibiting widely fluctuating patterns of repeat rate and a great variety of uncommon words and diverse sentence lengths are more difficult for the reader to cope with than materials with a more even distribution of these characteristics.

The results of this study are based on samplings from one print source, prescribed texts, and the statistics on lexical and syntactic characteristics result from the examination of only two relatively straightforward features, word frequency and sentence length. The issue of variability would possibly be even more pronounced if samples from all types of print sources (including reference, supplementary, and recreational reading) had been analyzed and included. In addition, if probes were developed on a broader array of semantic variables related to grammatical functioning, syntax, and logical relationships, greater variability in each subject area would be expected.

In conclusion, in describing the reading demands of print materials prescribed for use in junior secondary schools, the subject area and the variability of the word and sentence characteristics within those subject areas are the most obvious factors to be considered. This suggests that realistic reading instruction for the secondary grades must focus on the subject areas and the specific print sources used as tools in presenting the ideas and concepts in each subject. Such instruction may best be viewed as a shared responsibility between the subject teacher and the reading specialist rather than the sole province of the reading specialist. The subject specialist brings unique knowledge and insight of the discipline and its print materials to the instructional team, while the reading specialist contributes knowledge of the underlying processes of the reading act and familiarity with the characteristics of print in general which contribute to problems in comprehending materials.

RECOMMENDATIONS

This study suggests a number of practical recommendations for the immediate implementation of the main findings and also avenues for future research.
1. The word frequency lists produced for the Corpus and the sixty-five corpora provide reading specialists, subject teachers and coordinators, and school administrators at the junior secondary level with a valuable source of language data representing each grade level, subject area, subject area within a grade, and individual textbook. The word lists should be examined and their relevance to subject area instruction in regular classroom settings, adult basic education, and classes for New Canadians determined.

2. A number of correlational analyses could be made with the word lists developed from the present study and the word lists previously developed by Lorge-Thorndike, Kucera-Francis, and Carroll et al. This comparison could identify basic differences between print sources in a number of areas. The study could also provide a comparison between data bases compiled in two different countries and aid in determining basic differences between American and Canadian English.

3. The representative samples generated in the study provide a useful data base for further research in a number of areas. The samples could be used in readability research. For example, it would be relatively easy to modify the computer program to generate mutilated samples for Cloze research by deleting every "nth" word. Research could be undertaken to determine the effect of differing sample length and number of samples in the application of existing readability measures. A further research project could be the development of computer programs and algorithms for syllable counts for use in readability studies. The samples themselves could also be analyzed using transformational grammar techniques and other linguistic measures. Such studies could provide further insight into the structure of print materials and its role in the processing of written language.

4. A thorough analysis of the readability of the various textbooks used in the study could be readily undertaken. The 469 samples of approximately 500 words in length have been carefully selected and described. The data could be added to and updated as new adoptions are made or as the curriculum is revised in subsequent years.

5. An area requiring continued research concerns the different language patterns being used in the subject areas. There is a need to further identify what Bormuth (1969) referred to as "the manipulable linguistic variables which bear a causal relationship to the difficulty of the instructional materials being used". With this information it would be possible to develop teaching strategies to help students cope with the reading demands in their instructional materials.

6. Further analysis should be made into the linguistic characteristics of textbooks within a subject area to determine the specific difficulties inherent in certain types of written expression. For example, a textbook dealing with English instruction may offer suggestions on improving sentence construction in one part of the book and a few pages later present a literary excerpt as an example of good writing style.
7. The use of the "elimination technique" developed in the present study could be refined to produce a core vocabulary for each of the subject areas. These vocabulary lists would provide valuable information in the development of formative, summative, and placement evaluation in reading.

8. Computer techniques should be further developed and modified to allow for more detailed analysis of natural language samples. In addition, a vital need exists for researchers in education to become aware of the advantages of computer procedures in their work, and to gain an understanding of computer analysis in order to communicate their objectives to the computer programmers and technicians who may be available for consultation and advice.

9. The model developed in this study could be modified in a number of ways. Initially, the sample could be enlarged to provide a wider representation of the textbooks used in the junior secondary grades. This would allow a more thorough description and analysis to be made of the linguistic variables students encounter in their reading. Secondly, the model could be extended to encompass Grade 4 to Grade 12, thus enabling studies to be made of printed samples within and across elementary, junior secondary, and senior secondary grade levels. The model could also be applied in other provinces, or adapted to deal with samples across provinces. Finally, the model could be adapted to provide a more detailed analysis of selected features of the language in instructional materials, based on the needs of teachers, researchers, and other specialists, and so supply further insights into reading in the subject areas.
BIBLIOGRAPHY

A. BOOKS AND MONOGRAPHS

Alford, M.H.T. "Computer Assistance in Language Learning and in Authorship Identification". The Computer in Literary and Linguistic Research, ed. R. A. Wisbey, London: Cambridge University Press, 1971.

Artley, A. Sterl. "The Development of Reading Maturity in High School - Implications of the Gray-Rogers Study". Improving Reading in Secondary Schools: Selected Readings, ed. Lawrence E. Hafner, New York: Macmillan Co., 1967.

________. Trends and Practices in Secondary Reading. Newark, Delaware: International Reading Association, 1968.

________. "Implementing a Developmental Reading Program on the Secondary Level". Teaching Reading in High School: Selected Articles, ed. Robert Karlin, Indianapolis, New York: Bobbs-Merrill Co. Inc., 1969.

Aukerman, R.C. Reading in the Secondary School Classroom. New York: McGraw-Hill Book Company, 1972.

Ballou, Stephen V. A Model for Theses and Research Papers. Boston: Houghton Mifflin Company, 1970.

Bond, Guy L., and Eva Bond. Developing Reading in High School. New York: Macmillan Co., 1941.

Bond, Guy L., and Miles A. Tinker. Reading Difficulties: Their Diagnosis and Correction. New York: Appleton-Century-Crofts, 1967.

Bormuth, John R. Readability in 1968. A Research Bulletin, National Council of Teachers of English, 1968.

Botel, Morton. Botel Predicting Readability Levels. Chicago: Follett Publishing Co., 1962.

Buckingham, B.R., and E.W. Dolch. A Combined Word List. Boston: Ginn and Co., 1936.

Burton, Dwight L. "Heads Out of the Sand: Secondary Schools Face the Challenge of Reading". Teaching Reading in High School: Selected Articles, ed. Robert Karlin, Indianapolis, New York: Bobbs-Merrill Co. Inc., 1969.

Campbell, William R. Form and Style in Thesis Writing. Third edition. Boston: Houghton Mifflin Company, 1969.

Carroll, John B. "The Nature of the Reading Process". Theoretical Models and Processes of Reading, ed. Harry Singer, Newark, Delaware: International Reading Association, 1970.

Carroll, John B. "Development of Native Language Skills Beyond the Early Years". The Learning of Language, ed. Carroll E. Reed, New York: Appleton-Century-Crofts, 1971.

Carroll, John B., Peter Davies, and Barry Richman. The American Heritage Word Frequency Book. New York: Houghton Mifflin Company, 1971.

Chall, Jean. Readability: An Appraisal of Research and Application. Bureau of Educational Research Monographs, No. 6, Columbus, Ohio: Ohio State University, Bureau of Educational Research, 1958.

Chomsky, N. Syntactic Structures. The Hague: Mouton, 1957.

________. Aspects of the Theory of Syntax. Cambridge, Mass.: M.I.T. Press, 1965.

Cole, Luella. The Teacher's Handbook of Technical Vocabulary. Bloomington, Illinois: Public School Publishing Company, 1940.

Coleman, E.B. "Experimental Studies of Readability". Readability in 1968, ed. John R. Bormuth, National Council of Teachers of English, 1968.

Coombs, Clyde H. A Theory of Data. New York: John Wiley & Sons, Inc., 1964.

Davis, Frederick B. "Research in Reading in High School and College", Review of Educational Research, 22 (April, 1952), 65-75.

Dechant, E.V. Improving the Teaching of Reading. Second edition. New Jersey: Prentice-Hall Inc., 1970.

Deese, J. The Structure of Associations in Language and Thought. Baltimore: Johns Hopkins Press, 1965.

Ebel, Robert L. (ed.). Encyclopedia of Educational Research. Fourth edition. London: The Macmillan Company, 1969.

Farr, R., et al. "An Examination of Reading Programs in Indiana Schools," Bulletin of School of Education, 45 (1972), 5-92.

Faucett, Lawrence, and Itsu Maki. A Study of English Word-Values Statistically Determined from the Latest Extensive Word Counts. Tokyo: Matsumura Sanshodo, 1932.

Ferguson, Charles A. "Introduction". The Learning of Language, ed. Carroll E. Reed, New York: Appleton-Century-Crofts, 1971.

Fox, David J. The Research Process in Education. New York: Holt, Rinehart and Winston, Inc., 1969.

Francis, W. Nelson. Manual of Information to accompany Kucera, Henry, and W. Nelson Francis. Computational Analysis of Present-Day American English. Providence, Rhode Island: Brown University Press, 1967.

Fries, Charles C. The Structure of English. New York: Harcourt, Brace and World, Inc., 1952.

Fries, Charles C., and A. Aileen Traver. English Word Lists: A Study of Their Adaptability for Instruction. Washington: American Council on Education, 1940.

Gates, Arthur I. A Reading Vocabulary for the Primary Grades. New York: Teachers College, Columbia University, 1926.

________. "Reflection and Return". Reading: A Human Right and a Human Problem, ed. Ralph C.
S t a i g e r and Oliver Andresen, Newark, D e l a w a r e : I n t e r n a t i o n a l R e a d i n g A s s o c i a t i o n , 1968. Glass,  G.V.,  and  J.C.  Stanley. S t a t i s t i c a l . Methods_in_Education Jersey: Prentice-Hall,  l_i_£§I£il2£22Is. Englewood C l i f f s , New Inc77~19707~  Goodman, Kenneth S., e t a l . C h o o s i n _ _ M a t e r i a l s _ t o _ T e ^ D e t r o i t : Wayne s t a t e U n i v e r s i t y P r e s s , 1 9 6 6 . Gray, William y_N_E_S_C_O x  S. 1969.  I--^^gi2„£B2_2l^§§§^-Bg..,^ ^-^g^^ i gr. n  Gray, William S., and Bernice E. Leary. Readable. C h i c a _ o _ _ U n i v e r s i t _ _ o f _ C h Harris, Albert e d i t i o n . New  J. York:  e  s  ®  n  l  What Makes a Eook  2ow_to_Increase_Readi^ David~icKay~Co7 Inc., 1 9 7 0 .  Harris, Albert J . , and R__ii£__y.2£___l_£_ . .s.  :  Fifth  Milton D. J a c o b s o n . Basic_Elementary. Y o r k : The M a c m i l l a n Company, 1972.  ev  155 H a r r i s , C h e s t e r W. Third edition.  (ed). gpcyclo_edia_of_Educational_Research_ New Y o r k : M a c m i l l a n Co., 1960.  Horn, Ernest. Basic_Writing_Voca E d u c a t i o n , F i r s t S e r i e s , No. H) Iowa Iowa, 1926. r  City:  (Monographs University  in of  Huus, Helen, " I n n o v a t i o n s i n Reading Instruction: At Later Levels" . T_g_67th^ea^bgok_of_-.._Mtiona!_Society_for_^h^ Study_of_Education _Part_II e d . H e l e n M. R o b i n s o n , C h i c a g o : The U n i v e r s i t y o f C h i c a g o P r e s s , 1968. x  x  Jenkinson, M.D. " I n f o r m a t i o n Gaps i n R e s e a r c h in Reading Comprehension". ReadJ.ng__£^ George B. S c h i c k and Merrill M. May, Milwaukee: National Reading C o n f e r e n c e , 1970, pp. 179-192. J e w a t t , Arno ( e d . ) . IgpcQY4M_S§§§J-I!9^4B_l]bg_vIl}I!iSI^5i2li^SSi29l?. Washington, D.C: U n i t e d s t a t e s Government P r i n t i n g O f f i c e , 1957. J o h n s o n , M a r j o r i e Seddon. "Word Perception i n the Reading Thinking Process". 
Paper in Reading_and_Think P r o c e e d i n g s o f t h e 22nd A n n u a l R e a d i n g I n s t i t u t e at Temple University, 1965, Temple U n i v e r s i t y , P h i l a d e l p h i a , P a . (ED 015 094) Jones, Lyle V., and Joseph M. Wepman, A_Sgoken_Word_Count. C h i c a g o , I l l i n o i s : Language R e s e a r c h A s s o c i a t e s , 1966. Jongsma, Eugene. T h e _ C l o z e _ P r o c e d u r e _ a s _ a _ Newark, D e l a w a r e : The International Reading 1 971. Klare, George R. T he_ Measure me nt^of__ R e a d a b i l i t y ^ Iowa S t a t e U n i v e r s i t y P r e s s , ? 9 6 3 .  Association, Ames, Iowa:  Kucera, Henry. "Computers in Language Analysis and Lexicography". The_American^Heri SS-iiSk-iaiiasaaSx. B o s t o n : American Heritage Publishing Company, Inc., and Houghton Mifflin Company, 1969, p.xxxviii. K u c e r a , Henry and W. N e l s o n Francis. Computational_Analysis_of Pgg§§flt~MY^ A m e r i c a n ^ EMligh,;, Providence, Rhode Island: Brown U n i v e r s i t y P r e s s , 1967.  156 Luhn, H.P. "The A u t o m a t i c C r e a t i o n o f L i t e r a t u r e A b s t r a c t s " . IBM l°]iE£iI_°l_E®§§S££h_§._^_"2§vil2£S§__ ( A p r i l , 1958), 159-165. Reprinted i n , Ke__Pa£_rs_in_Information_Science_ ed. Arthur W. E l i a s , W a s h i n g t o n , D. C: A m e r i c a n S o c i e t y f o r I n f o r m a t i o n S c i e n c e , (1972) , 87-93. Malmguist, Eve. " R e a d i n g : A Human R i g h t a n d A Human P r o b l e m " . R e a d i n g : A Human R i g h t and A Human P r o b l e m , ed. Ralph C. Staiger and Oliver Andressen, Newark, Delaware: I n t e r n a t i o n a l R e a d i n g A s s o c i a t i o n , 1968. M a l s b a r y , Dean R. "A S t u d y o f t h e Terms that People Need t o U n d e r s t a n d i n O r d e r t o Comprehend and I n t e r p r e t t h e B u s i n e s s and E c o n o m i c News A v a i l a b l e T h r o u g h t h e Mass M e d i a " , S t u d i e s _E_I^U2__i°"ix Thesis Abstract S e r i e s , No. 
4, B l o o m i n g t o n , I n d i a n a : S c h o o l o f E d u c a t i o n , I n d i a n a U n i v e r s i t y , 1952. Karon, M.E. " A u t o m a t i c I n d e x i n g : An E x p e r i m e n t a l Inquiry, Journal_of_the_As§.2£_2_12__f2E_£om£utin^_Machiner__ (1961), 404-417. R e p r i n t e d i n , Ke__Pa_ers_in_Information Science, ed. Arthur W. E l i a s , W a s h i n g t o n 7 D.C: A m e r i c a n S o c i e t y f o r I n f o r m a t i o n S c i e n c e , ( 1 9 7 2 ) , 94-107. McLuhan, Marshall. New Y o r k : S i g n e t  U_derstandin__Media__The Books7"964.  Nason, H.M. " E f f i c i e n t R e a d i n g - A Way t o Permanent E d u c a t i o n " . _e__in__-__he_Ri_ht Milwaukee, Wisconsin: T w e n t i e t h Y e a r b o o k o f t h e N a t i o n a l R e a d i n g C o n f e r e n c e , 1971. Paivio, A. Imagery and v e r b a l P r o c e s s e s . R i n e h a r t and W i n s t o n , I n c . , 1971. Pei,  Mario A. T h e _ _ o r l d _ s _ C h i e f _ L a n _ u London: George A l l e n S Unwin L t d . , 1949.  New  York:  Third  Holt,  edition.  Pratt, Edward. "Reading as a Thinking Process". Vistas_in I_§.^iSa__l2i3i£S_II__Part_I_ ed. J . A l l e n Figurel, Seventh A n n u a l C o n v e n t i o n , I n t e r n a t i o n a l R e a d i n g A s s o c i a t i o n , 1966. Pressey, L u e l l a C. V o c a b _ l a _ _ _ L i s t s _ i Bloomington, I l l i n o i s : P u b l i c School P u b l i s h i n g Rankin,  E  F.  "Cloze  Procedure  -  A  Survey  of  Co., 1924. Research",  !L2_E_ -_____I§!E_22__2l___i_I___2__2_&SS^_£2»£2_£§£S££§x. e  E.L.Thurston and C o n f e r e n c e , 1965.  L.E.Hafner,  Milwaukee:  National  eQ  " » s  Reading  Rinsland, Henry D. A_Basic_Vocabular__of_E lementarj^School C h i l d r e n ^ New Y o r k : The M a c m i l l a n Company, 1945.  157 R o b i n s o n , H.A. " C o m m u n i c a t i o n s and C u r r i c u l u m Change". L a n g u a g e ^ R e a d i n g , and t h e C o m m u n i c a t i o n . P r o c e s s , ed. 
Carl Eraun, Newark, Delaware: International Reading Association C o n f e r e n c e , 1971, p . 2 . Rogers, John Mississippi: 1965.  R. (ed) . L i n g u i s t i c s i n .Read i"3^£n§j_xuc£:iQ5s. U n i v e r s i t y o f M i s s i s s i p p i , The R e a d i n g Clinic,  Russell, D.H. And H.R. F e a . " R e s e a r c h on T e a c h i n g R e a d i n g " . S S I i 5 i 2 2 l S _ ° £ _ S ® § S S £ £ ^ _ ° £ _ T e a c h i n g ed. N. L. Gage, Chicago: x  R a n d ~ M c N a l l y 7 T963.  Singer, H. , and R. B. Ruddell, ( e d s ) . T h e o r e t i c a l _ M o d e l s and ££2£®§§SS_2^_E§5MMi Newark, Delaware: International R e a d i n g A s s o c i a t i o n , 1970. Smith, Nila Delaware:  Banton. American_Reading_Instruct I n t e r n a t i o n a l R e a d i n g A s s o c i a t i o n , 1965.  S t e i n b e r g , S. H.  Fi_e_Hundred_Years_o^^^^^^  Newark,  Bristol:  Penguin  Books, 1966. S t o t h e r s , G.E., R.W.B. J a c k s o n , and F.W. Minkler. _ o r d _ L i s t . . T o r o n t o : T h e R y e r s o n P r e s s , 19 47.  A_Canadian  i  Strang, Ruth., CM. McCullough, Improvement o f R e a d i n g . New Y o r k :  19677"  and A . E . T r a x l e r . The M c G r a w - H i l l Book Company,  Tatsuoka, Maurice M., and D a v i d V. Tiedeman. " S t a t i s t i c s a s a n Aspect of Scientific Method i n Research on T e a c h i n g " , £&-4fa22J£;_gf_gg§gg£gh 2 2 _ T e a c h i n g _ e d . N.L. Gage, C h i c a g o : R a n d ~ M c N a l l y 7 T9637 T a y l o r , S t a n f o r d E . , H e l e n F r a c k e n p o h l , and C a t h e r i n e E. White. i_Kevised_Core_Vocabu §.•,-.&S_M , .il£ggLy22aI !arY._f o c _ G r a d e s _ 9 - 1 3 _ Research and Information B u l l e t i n No. 5 (revised!, H u n t i n g t o n , N.Y.: Educational Developmental Laboratories, 1949, 1969 (revised), ?  a  u  t  T h o r n d i k e , Edward L. T h e _ T e a c h e r _ s _ _ o r g _ B C o l l e g e , C o l u m b i a U n i v e r s i t y , 192*1.  New Y o r k :  Teachers  Thorndike, Edward L. 
&_Sgacher_s__ord_Bogk_of_the_Twenty Thousand _ Word s_Fou R e a d i n g f o r C h i l d r e n and Young P e o p l e . New Y o r k : Teachers C o l l e g e , C o l u m b i a U n i v e r s i t y , 1931.  158 T h o r n d i k e , Edward L., and I r v i n g L o r g e , Th§_l§acherJ_s_Word_Book of_30 000_Words_ New York: Teachers College, Columbia A  UnIversity7~T9iTu.  Toffler,  A. F u t u r e ^ Shock., New  Y o r k : Random House,  1970.  T r a x l e r , A r t h u r E . "Development o f a V o c a b u l a r y T e s t for High School Pupils and College Freshmen", 1962 F a l l Testing Program i n I n d e p e n d e n t S c h o o l s and Supplementary Studies, Iducational_Records_Bulled No. 83 ( F e b r u a r y , 1963), 6773. Webb, Eugene J, et al. U n o b t r u s i v e Measures^..Nonreactiye R e s e a r c h i n the S o c i a l S c i e n c e s . . C h i c a g o : Rand McNally & Co.7~1966. West, Michael. i_S_udy_of_the_Vocabu IStStiSS-Iils^Grade^ Washington: K i n d e r g a r t e n U n i o n , 1928. Yule, G. Udny. The S t a t i s t i c a l C a m b r i d g e , E n g l a n d , 1944.  B.  Ames, Wilbur S. "The Contextual Aids", 1 966).  The  International  Study of L i t e r a r y  Vocabular%«..  PERIODICALS  Development o f a C l a s s i f i c a t i o n Reading_Research CjuarterlXt  2  Scheme o f (fall*  Aukerman, R o b e r t C. " R e a d a b i l i t y o f S e c o n d a r y S c h o o l L i t e r a t u r e Textbooks: A F i r s t Report", E n g l i s h J o u r n a l ^ 54, (September, 1965). . B a r t o n , J o h n s o n D. "Computer F r e q u e n c y C o n t r o l o f V o c a b u l a r y i n Language L e a r n i n g R e a d i n g M a t e r i a l s " , I n s t r u c t i o n a l S c i e n c e ^ 1 (March, 1 9 7 2 ) , 1 2 1 - 1 3 1 . Beier, Ernst G., John A. S t a r k w e a t h e r , and Dan E. M i l l e r . "Analysis of Word F r e q u e n c i e s i n Spoken Language of C h i l d r e n " , _anguage_and_Speech o (1967), 217-227. x  Betts, Emmett A. 
" S t r u c t u r e i n t h e R e a d i n g Program", E n g l i s h j , XL (March, 1 9 6 5 ) , 2 3 8 - 2 4 2 .  Elementary  159 B i c k l e y , A.C., e t a l . "The Cloze Procedure: A Conspectus", J Q u g n a l ^ o f „ J e a d i n g _ B e h a v i o r ^ 2 (Summer, 1 9 7 0 ) , 232-243. Bloomer, R.H. " C o n n o t a t i v e Meaning and t h e R e a d i n g and S p e l l i n g D i f f i c u l t y of Words", T h e _ J o u r n a l _ o f _ E u d c a t i o n a l Research.,. 55 (November, 1 9 6 1 ) , 107-112. Bormuth, John R. " R e a d a b i l i t y : A New _ u a r t _ r l _ _ 1 (1966), 79-131.  Approach",  Readin__Research  Bormuth, John R. "New Developments i n R e a d a b i l i t y R e s e a r c h " , E l e m e n t a r _ _ E n _ l i s h _ 44 (December, 1 9 6 7 ) , 840-845. B o r t n i c k , R., and G.S. L o p a r d o . "An I n s t r u c t i o n a l the Cloze Procedure", J o u r n a l of_Readin__ 1 9 7 3 ) , 296-299.  Application of 16 (January,  Broadbent, D.E. "Word Frequency Effect and Response Psychologic__________ 74 ( J a n u a r y , 1967), 1-10. B r u n e r , J . S. " N e u r a l Mechanisms _ e v i e _ _ 64 ( 1 9 5 7 ) , 340-358.  in  Perception",  Bias",  Psychological  Card, William, and V i r g i n i a McDavid. " F r e q u e n c i e s o f S t r u c t u r e Words i n t h e W r i t i n g o f Children and Adults", Elementary E n _ l i s h _ 42 (December, 1 9 6 5 ) , 878-882,894. Chronister, G.M., and K.M. Ahrendt. "Reading I n s t r u c t i o n i n B r i t i s h Columbia's Secondary Schools", Journal of^Reading (March, 1968) , 425-427. c  Coleman, E.B. " E x p e r i m e n t a l S t u d i e s o f R e a d a b i l i t y " . E _ _ l i _ h _ XLV (March, 1968), 316-323,333.  Elementary  Conway, James A., and Troy V. McKelvey. "The Role of the Relevant Literature: A Continuous Process", The_Journal of Educational__________ 63 (May-June, 1970), 407-4T57 C u l h a n e , J o s e p h W. 
" C l o z e Procedures and Comprehension", S _ _ 3 i n _ _ T e a c h e r _ 23 ( F e b r u a r y , 1970), 410-413.  The  Dale, Edgar., and Jeanne S. C h a l l . "A F o r m u l a f o r P r e d i c t i n g R e a d a b i l i t y " , E d u c a t i o n a l R e s e a r c h B u l l e t i n _ 27 ( 1 9 4 8 ) , 1120. D a l e , E d g a r . "The P r o b l e m o f V o c a b u l a r y i n R e a d i n g " , R e s e a r c h _ B u l l e t i n _ 35 (1956) , 113-123.  Educational  160 "VOCABULARY MEASUREMENT: TECHNIQUES AND MAJOR FINDINGS", 4 2 (December, 1965), 895-901.  ___________________  Dodds, William J. " H i g h l i g h t s from the History of Reading I n s t r u c t i o n " , The R e a d i n g T e a c h e r _ 21 (December, 1 9 6 7 ) , 274280. D o l c h , Edward W. "A B a s i c S i g h t Vocabulary", _ o _ r n a l _ 36 ( 1 9 3 6 ) , 456-460.  Elementar__School  Durr, William K. "Computer Study o f H i g h F r e q u e n c y Words i n P o p u l a r T r a d e J u v e n i l e s " , T h e R e a d i n g T e a c h e r _ 27, ( O c t o b e r , 1973), 37-42. Fry,  Edward. "A R e a d a b i l i t y F o r m u l a T h a t S a v e s Time", R e a d i n g , 11 ( A p r i l , 1968) , 513-516, 575-578.  Journal of " ~~  S l a z e r , Susan Mandal. " I s S e n t e n c e L e n g t h a Valid Measure of Difficulty i n R e a d a b i l i t y F o r m u l a s ? " , The R e a d i n g , T _ e a c h e r 27 ( F e b r u a r y , 1974), 464-468. t  Howes, D a v i s . "A Word Count o f Spoken English", Jour_al_of Z_Ebal______i_g__nd_Verbal_____yior_ 5 ( 1 9 6 6 ) , 572-604. Johnson, Dale D. "The D o l c h L i s t Re-examined", T e a c h e r _ 24 ( F e b r u a r y , 1 9 7 1 ) , 449-457.  The_Reading  J o h n s o n , D. B a r t o n . Computer F r e q u e n c y C o n t r o l o f V o c a b u l a r y i n Language L e a r n i n g R e a d i n g M a t e r i a l s " , I n s t r u c t i o n a l S c i e n c e _ 1, (March, 1972), 121-131. Klare, George Approach",  R. 
"Comments on Bormuth*s R e a d a b i l i t y : A New Re_ding_R_se__ch_2uarterl_ , 4 ( 1 9 6 6 ) , 119-125.  K l a r e , George R. "The R o l e o f Word Frequency i n Readability R e s e a r c h " , E l e _ e _ t a r _ _ _ n g l i _ h , 45 ( J a n u a r y , 1 9 6 8 ) , 12-22. Kyte, George C. "A C o r e V o c a b u l a r y i n t h e Language __i£__Ka._£___ 34 (March, 1953), 231-34.  Arts", Phi  L i v e l y B e r t h a , and S.L P r e s s e y . "A Method of Measuring the Vocabulary Burden o f Textbooks", E d u c a t i o n a l A d m i n i s t r a t i o n ___________ 9 ( O c t o b e r , 1923)7 389-3 98." L o r g a , I r v i n g . "Word L i s t s as Background f o r Communication", T _ a c h e r s _ C o l l e g e _ R e c o r _ _ 45 (May, 1944), 543-52. Louthan, Vincent. "Some S y s t e m a t i c G r a m m a t i c a l D e l e t i o n s and T h e i r E f f e c t s on R e a d i n g C o m p r e h e n s i o n " , E n g l i s h J o u r n a l _ 54 ( A p r i l , 1965), 295-299.  161 M a c G i n i t i e , W a l t e r H., and R i c h a r d Tretiak. "Sentence Depth Measures as P r e d i c t o r s o f Reading Difficulty", leading I .§§a££._,_i2]i§,_:lsEl24. ( S p r i n g , 1971), 364-377. V  e  I  M c L a u g h l i n , H a r r y G. "SMOG G r a d i n g - a New R e a d a b i l i t y 22a££il_2J_S®adin2 (May, 1 9 6 9 ) , 639-646.  Formula",  x  Nyman, P a t r i c i a , e t a l . "An A t t e m p t t o S h o r t e n t h e Word List with the Dale-Chall Readability Formula." Educational 8 2§ arch_Bulletin 40 (September, 1961) , 150-152. e  x  P a l m e r , W i l l i a m S. " R e a d a b i l i t y , R h e t o r i c , and t h e R e d u c t i o n o f Uncertainty", Journal ofjReading , 19 ( A p r i l , 1974), 552558. P a t t y , W.W., and W . I . P a i n t e r . "A Technigue f o r Measuring the Vocabulary Burden o f Textbooks", J o u r n a l , , o f _g d u c a t i o n a l R e s e a r c h ^ 24 (September, 1931), 127-134. Powers, S.R. 
"The V o c a b u l a r i e s of High Textbooks", Teachers C o l l e g e Record_ 26 368-382.  School Science (January, 1925),  Ramanauskas, S i g i t a . "The R e s p o n s i v e n e s s of Cloze Readability Measures t o L i n g u i s t i c V a r i a b l e s O p e r a t i n g Over Segments o f T e x t L o n g e r Than a S e n t e n c e " , R e a d i n g Res e a r c h_p_.ua r t e r l y 8 ( F a l l , 1972), 72-91. A  Robinson, H. Alan and Dan S. Dramer. " H i g h S c h o o l R e a d i n g 1958", J 2 U £ _ a l _ o f _ D e y e l o _ m e n t a l _ R e a d i n g _ 3 (Winter, 1960), 94-105. (See s u c c e s s i v e i s s u e s f o r summaries r e l a t e d t o 1961 t h r o u g h 1966) . Robinson, Helen M., Samuel Weintraub, and H e l e n K. S m i t h . "Summary o f I n v e s t i g a t i o n s R e l a t i n g t o R e a d i n g , J u l y 1, 1966 t o June 30, 1967", Seading_Research_2uarter 3 (Winter, 1967), 151-301. (See s u c c e s s i v e W i n t e r i s s u e s f o r summaries r e l a t e d t o 1968 t h r o u g h 1 9 7 3 ) . R o s s , Ramon R o y a l . " F r a n n i e and F r a n k and t h e F l a n n e l b c a r d " , The E§ading_Teacher 27 ( O c t o b e r , 1 9 7 3 ) , 43-47. x  S i l i a k u s , H.J. " C o m p u t e r - A i d e d 1 967) , 19-21.  Word R e s e a r c h " , B a b e l _  3  (July,  Smith, John M., and M a x w e l l E. McCombs. " R e s e a r c h i n B r i e f : The G r a p h i c s o f P r o s e " , V i s i b l e Language_ V (Autumn, 1971) 365369.  162 Smith, Nila Banton. "What Have We a c c o m p l i s h e d i n R e a d i n g ? A Review o f t h e P a s t Fifty Years". Elementary E n g l i s h , 38 ( M a r c h , 1961), 141-150. "The Many Faces of Reading Comprehension". T e a c h e r , 23 (December, 1969) S p a c h e , G e o r g e D. "A New Reading Materials", 1 9 5 3 ) , 410-413.  The R e a d i n g  R e a d a b i l i t y Formula f o r Primary Elementary S c h o o l _ J o u r n a l , 53  -Grade (March,  S t a u f f e r , R.G. 
"A S t u d y o f t h e P r e f i x e s i n t h e T h o r n d i k e L i s t t o E s t a b l i s h a L i s t o f P r e f i x e s T h a t s h o u l d be T a u g h t i n the Elementary S c h o o l " , J o u r n a l o f _ E d u c a t i o n a l R e s e a r c h , 35, ( 1944) , 453-458. S t o n e , C l a r e n c e R. "Measuring Difficulty of Primary Reading Material: A Constructive Criticism of Spache's Measure", E l e , , , t _ , y _ S c _ , o l _ _ o _ r , a _ _ 57 ( O c t o b e r , 19 56), 36-41. Summers, Edward G. " I m p o r t a n t R e s o u r c e f o r S e c o n d a r y _ o u r n _ l _ o f _ _ e a d i n g _ 10 (November, 1 9 6 6 ) , 88-102.  Reading",  Summers, E.G., Brother Leonard Courtney, and P e t e r Edwards. " G u i d e t o P r o f e s s i o n a l T e x t b o o k s and R e s e a r c h i n Secondary Reading Instruction," The E n g l i s h Q u a r t e r l y ^ 7 (Summer, 1974) , 124-146. Taylor, W. "Cloze Procedure: A New Readability", ___r_alis__2u_rterly_  Tool for Measuring 3 0 , 1953, 414-433.  Townsend, A. "Reading i n the Junior T e a c h e r , 15 (March, 1962), 369-371.  Grades",  The__eading  V o g e l , M a b e l , and W. C a r l e t o n Washbourne. "An O b j e c t i v e Method of Determining Grade Placement of Children's Reading M a t e r i a l s " , E l e m e n t a r y S c h o o l J o u r n a l _ 28 (January, 1928), 373-381. Warfel, Harry R. "A Bag With H o l e s " , J o u r n a l o f R e a d i n g , I I I (Autumn, 1959), 320-333? Washbourne, C a r l e t o n of Children's (January, 1938),  Developmental  W., and Mabel V. M o r p h e t t . "Grade P l a c e m e n t Books", Elementary School J o u r n a l , 38 355-364.  163 Weintraub, Samuel. "The C l o z e P r o c e d u r e " , The 21 (March, 1 9 6 8 ) , 5 6 7 , 5 6 9 , 5 7 1 , 6 0 7 .  Reading  Teacher,  Z i p f , George K i n g s l e y . 
"The Meaning-Frequency Relationship of Words", The_Journal of.General Psychology, 33 (October, 1945) , 2 5 1 - 2 5 6 7  UNPUBLISHED  MATERIALS  Aaronson, S h i r l e y . "Vocabulary Instruction: Challenge of the 70*s". Paper read at the National Reading Conference, December, 1 9 7 1 , Tampa, F l o r i d a . (ED 0 5 8 0 1 6 ) Artley, A. Sterl. T r e n d s and_ P r a c t i c e s . i n ^ S e c o n d a r y S c h o o l Rea_i______Co__a_io Monograph. Bloomington, Indiana: Indiana U n i v e r s i t y , ~ M a r c h , 1*970. 1AA~000 5 0 7 ) --  A u l l s , Mark W. "Toward a S y s t e m a t i c Approach t o How the Reader Uses Context to Determine Meaning". Paper read a t the N a t i o n a l R e a d i n g C o n f e r e n c e , December, 1 9 7 0 , S t . P e t e r s b u r y , F l o r i d a . (ED 0 4 9 0 0 3 ) Austin, Warren B. i_Com£uter-Aided_Techni D i s c r i m i n a t i o n _ _ T h e A u t h o r s h i p of Greene's _Groatsworth _ I t _ _ ~ F i n a I ~ R e p o r t r 1 9 6 9 7 ~ 7 E D 6"lo~322f  o f ~~  -  Bailey, Stephen D. " R e c e n t T r e n d s and D e v e l o p m e n t s i n R e s e a r c h Involving the Cloze Procedure". Unpublished research paper. F a c u l t y of E d u c a t i o n , U n i v e r s i t y o f B r i t i s h Columbia, 1 9 7 3 . Berg, P. C. "The P s y c h o l o g y o f R e a d i n g B e h a v i o r " . P a p e r r e a d a t the National Reading Conference, December, 1968, Los A n g e l e s . (ED 0 2 8 0 5 0 ) Berkeley, Edmund C. _esea_ch_in_Coinp_te_-_ssis_ A________o__a___T_ain Springfield, V i r g i n i a : National T e c h n i c a l Information Service, 1972. (ED 0 7 4 729) Bormuth, John R, "New Data on R e a d a b i l i t y " . P a p e r r e a d a t a m e e t i n g c o s p o n s o r e d by t h e I n t e r n a t i o n a l R e a d i n g A s s o c i a t i o n and t h e American E d u c a t i o n a l Research Association, May, 1 9 6 7 , S e a t t l e . 
(ED 0 1 6 5 8 6 ) Cloze_Rea_ability__roced_re_ Report Number C S E I P - 0 R - 1 , ~ T o s A n g e l e s 7 ~ U n i v e r s i t y o f C a l i f o r n i a , 1 9 6 7 . (ED 0 1 0 9 8 3 )  164 "Empirical Determination of the Instructional Reading Level". Paper r e a d at the I n t e r n a t i o n a l Reading Association C o n f e r e n c e , A p r i l , 1968, B o s t o n . (ED 020 084) "The E f f e c t i v e n e s s o f Current Procedures for Teaching Reading Comprehension". Paper read at the Fifty-Eighth Annual Meeting of the National Council of Teachers of E n g l i s h , November, 1968, M i l w a u k e e . ~U.S.  De velo_menl__of__e^ O f f I c e ~ o f ~ * E d u c a t i o n , ~ M a r c h ~ 1969.  "EDUPLAN Bibliography". U n i v e r s i t y o f C h i c a g o , 1972.,  Project  No.  7-0052,  ~  Unpublished  bibliography.  B o t e l , M. " A s c e r t a i n i n g I n s t r u c t i o n a l L e v e l s " . P a p e r r e a d a t t h e International Reading Association Conference, May, 1967, S e a t t l e . (ED 014 373) B r i t i s h Columbia Department of E d u c a t i o n . P£escribed_Textbooks_ 12Z£rl3__Grades_I-X_I_ Victoria: Curriculum Development B r a n c h 7 19727 B r u n e r , Jerome. " H e l l Begun i s H a l f Done: T h o u g h t s About Early C h i l d h o o d " . A d d r e s s a t t h e I n t e r n a t i o n a l Reading Association A n n u a l C o n v e n t i o n , May, 1972, D e t r o i t , M i c h i g a n . Carver, R o n a l d P. "What i s R e a d i n g C o m p r e h e n s i o n and How S h o u l d it be Measured?". Paper read at the National Reading C o n f e r e n c e , December, 1969, A t l a n t a , G e o r g i a . (ED 038 243) Carroll, John B. " B e h i n d t h e S c e n e s i n t h e M a k i n g o f a C o r p u s B a s e d D i c t i o n a r y and a Word F r e q u e n c y Book". P a p e r read at the m e e t i n g of t h e N a t i o n a l C o u n c i l o f T e a c h e r s o f E n g l i s h , November, 1971, L a s V e g a s , Nevada. 
(ED 056 842) C h a l l , J . "Research i n L i n g u i s t i c s and Reading Instruction: Implications f o r F u r t h e r R e s e a r c h and P r a c t i c e " . P a p e r r e a d at the I n t e r n a t i o n a l Reading A s s o c i a t i o n Conference, April, 1968, B o s t o n , Mass. (ED 028 904) Cooper, J . L. "The R e a d i n g P r o g r a m Spans the T o t a l C u r r i c u l u m " . Paper read at the International Reading Association C o n f e r e n c e , May, 1967, S e a t t l e . (ED 015 824) Cronnell,  Bruce.  D e s i g n i n g a Reading  £iliMM§_i__Ortho_ra£hy_  Dauzat, Two The  Program  19777 ~(ED 057 990)  Based  on  Research  S. V. " S t r u c t u r e Word Usage i n t h e V e r b a l D i s c o u r s e of Groups o f C h i l d r e n " . U n p u b l i s h e d D o c t o r ' s d i s s e r t a t i o n , U n i v e r s i t y of M i s s i s s i p p i , 1968.  165 D u n n - R a n k i n , P e t e r . " A n a l y z i n g t h e Development o f R e a d i n g Skill U s i n g an E r r o r - W o r d P r e f e r e n c e I n v e n t o r y " . P a p e r r e a d a t t h e meeting o f t h e American E d u c a t i o n a l Research A s s o c i a t i o n , F e b r u a r y , 1971, New Y o r k , N.Y. (ED 051 960) Eagan, Sister Ruth Louise. "An I n v e s t i g a t i o n into the Relationship o f t h e P a u s i n g Phenomena i n O r a l R e a d i n g and Reading Comprehension". U n p u b l i s h e d Doctor's dissertation, t h e U n i v e r s i t y o f A l b e r t a , 1973. Fagan, William T. "An I n v e s t i g a t i o n into the Relationship Between R e a d i n g Difficulty a n d t h e Number o f Types o f Sentence Transformations". Unpublished Doctor's d i s s e r t a t i o n , t h e U n i v e r s i t y o f A l b e r t a , 1969. F r o s t , J o e L. " A p p l i c a t i o n o f S t r u c t u r e P r o c e s s Theory to the Teaching o f Reading". 
Paper read a t the N a t i o n a l Conference o f T e a c h e r s o f E n g l i s h , M a r c h , 1970, S t L o u i s , M i s s o u r i . (ED 045 288) F u e l l h a r t , P a t r i c i a 0., and David C. Weeks. Com_ilation_and A n a l y s i s o f Lexical^Resou ReportT Springfield, Virginia: Clearinghouse f o r Federal S c i e n t i f i c and T e c h n i c a l I n f o r m a t i o n , 1968. (ED 021 602) S e y e r , James R. C l o z e _ P r o c e d u r e _ a s _ a i --?'g£Pi3i^§£-.„?,2--il. § - § ^ g _ 3 § t e r i a l s _ Olympia: Washington S t a t e ~ B o a r d f o r " c o m m u n i t y C o l l e g e E d u c a t i o n , 1968. (ED 039 157) n  u  e  Suthrie, J. T. Learnability_Ve Baltimore, Maryland: Center f o r Social Organization of S c h o o l s , John H o p k i n s U n i v e r s i t y , 1970. (ED 042 594) Harris, Jessica ComplexTerms  L. A _ S t u d y _ o f _ t h e _ C o m p u t qccurring_in_a  of  III__III_IIIiIIIsI^  H a t e r , M A. "The C l o z e P r o c e d u r e as a M e a s u r e o f t h e Reading Comprehensibility and D i f f i c u l t y of Mathematical E n g l i s h . " U n p u b l i s h e d D o c t o r ' s d i s s e r t a t i o n , P u r d u e U n i v e r s i t y , 1969.  166 Hill, Walter R., and Norma G. Bartin. Secondar__Readi Programs^ DescriEtdon_a^d_Resea£ch_ ERIC/CRIER Reading Review~Series7~July~197T7~(ED~055~759). Hill, Walter R., and Norma G Bartin. Reading Pr^gr^mj i n §§£ondar__Schools__^ Newark, D e l a w a r e : I n t e r n a t i o n a l R e a d i n g A s s o c i a t i o n , 1971. (ED 071 057) Houska, J.T. "The Efficiency of the Cloze Procedure as a Readability Tool on Technical Content material Used in Industrial E d u c a t i o n at t h e High S c h o o l L e v e l . " Unpublished D o c t o r ' s d i s s e r t a t i o n . U n i v e r s i t y of I l l i n o i s at Urbana and Champaign, 1971. Jacobs, H. Donald. Willamette Valley. 
Eugene: T967.~ED~0"T5 8 457  Association_Word_List_for_the Oregon University, September,  J a c o b s o n , M i l t o n D. " R e a d i n g D i f f i c u l t y o f P h y s i c s and C h e m i s t r y Textbooks in Use i n Minnesota." Unpublished Doctor's d i s s e r t a t i o n . U n i v e r s i t y o f M i n n e s o t a , 1961. J a c o b s o n , M i l t o n D. " D e v e l o p i n g and Comparing E l e m e n t a r y Word Lists by Computer". P a p e r r e a d a t t h e m e e t i n g American Educational Research Association, April, C h i c a g o . (ED 062 102)  School of the 1972,  Jacobson, Milton D., and Mary Ann M a c D o u g a l l . " C o m p u t e r i z e d Model o f Program S t r u c t u r e and L e a r n i n g D i f f i c u l t y " . Paper in Proceedings of the 1969 Convention of t h e American P s y c h o l o g i c a l A s s o c i a t i o n , W a s h i n g t o n , D.C. (ED 040 006) Jongsma, Eugene R. f,!}g_^°2^fr9£§§y£gl.J.§!JOgL9LU§ R e s e a r c h . I n d i a n a U n i v e r s i t y : S c h o o l o f E d u c a t i o n , 1970, (ED 050~893)"~ Kulm, G. " M e a s u r i n g t h e R e a d a b i l i t y o f E l e m e n t a r y A l g e b r a U s i n g the C l o z e T e c h n i q u e . " Paper p r e s e n t e d a t t h e A n n u a l Meeting of t h e American E d u c a t i o n a l Research A s s o c i a t i o n , February, 1971. L e f e v r e , C a r l A. "Language and C r i t i c a l R e a d i n g : The Consummate Reader". Paper read at the N a t i o n a l Reading Conference, December, 1969, A t l a n t a , G e o r g i a . (ED 038 249) Lerner, J. W. A G l o b a l T h e o r y o f Reading... a n ^ _ L i n g u i s t i x s _ Newark, Delaware: international Reading Association, F e b r u a r y , 1968. (ED 023 538)  167 Lott, Deborah, et al. Bl___ionaJ._E_^ C o m b i n a t i o n s , i n t h e . v i s u a l , I d e n t i f i c a t i o n _ p f words. Inglewood, California: Southwest Regional Educational L a b o r a t o r y , 1968. 
(ED 035 516) L y n c h , M e r v i n D., e t a l . "The Building Block Construct as a Possible Model for D e c o d i n g P r o c e s s e s " . P a p e r r e a d a t the N a t i o n a l R e a d i n g C o n f e r e n c e , December, 1970, St. Petersburg, F l o r i d a . (ED 049 002) MacGinitie, W. H., and R. Tretiak. "Measures o f Sentence Complexity as Predictors of the Difficulty of Reading M a t e r i a l s " . In P r o c e e d i n g s of the 7 7 t h A n n u a l C o n v e n t i o n of t h e A m e r i c a n P s y c h o l o g i c a l A s s o c i a t i o n , 1969. (ED 038 254) Miller, Allan. Programmer's G u i d e t o t h e E d w a r d s ' C o r p u s . V a n c o u v e r , B r i t i s h C o l u m b i a : Computing C e n t r e , U n i v e r s i t y o f B r i t i s h Columbia, 1974. Olsen H. C. "Linguistic Principles and the Selection of Materials". Paper read at the International Reading A s s o c i a t i o n c o n f e r e n c e , A p r i l , 1968, B o s t o n , Mass. (ED 022 649) Potter, Thomas C. A_Taxonom__of _Clo_e_ResearcJi__Par^__I_ Readability_and.Reading^Comprehension. I n g l e w o od, C a l i f o r n i a : Southwest R e g i o n a l E d u c a t i o n a l L a b o r a t o r y , 1968. (ED 035 514) Rawson, H i l d r e d I . "A S t u d y o f t h e R e l a t i o n s h i p s and Development of Reading and Cognition". Unpublished Doctor's d i s s e r t a t i o n , The U n i v e r s i t y o f A l b e r t a , 1969. Rosenshine, Barak. "New Listenability." Paper Association conference, 528)  Correlates of Readability and read at the I n t e r n a t i o n a l Reading A p r i l , 1968, B o s t o n , Mass. (ED 024  Seels, Barbara, and Dale Edgar. £eadabilit__and_Re^ _nnotate__Bibliogr_ Newark, Delaware: I n t e r n a t i o n a l R e a d i n g A s s o c i a t i o n , *1971. (ED 049 896) Shima, Fred. _ e s e _ r c h _ o _ Wo r d _ A s s o c i a t i o n ^ Discourse... Inglewood, California: Southwest E d u c a t i o n a l L a b o r a t o r y , 1970. 
(ED 043 470)

Summers, Edward G. An Annotated Bibliography of Selected [illegible] Related to Teach[illegible] School [illegible]. University of Pittsburgh: School of Education, 1963. (ED 010 757).

________. An Annotated Bibliography [remainder of title illegible]. University of Pittsburgh: School of Education, 1964. (ED 010 758).

________. International Reading [illegible] Reports on Secondary Reading. Bloomington, Indiana: ERIC/CRIER, 1967. (ED 013 185)

Summers, Edward G., Charles H. Davis, and Catherine F. Siffin. [illegible] Research Lit[illegible]. ERIC Document Reproduction Service, Bethesda, Md., 1968. (ED 013 970).

________. Published Research Lit[illegible]. ERIC Document Reproduction Service, Bethesda, Md., 1967. (ED 012 834).

________. Published Research L[illegible]. ERIC Document Reproduction Service, Bethesda, Md., 1968. (ED 013 969).

Vernon, Evelyn I. "Words Make For Success". Paper read at the International Reading Association, April-May, 1969, Kansas City, Missouri. (ED 034 662)

Weaver, Wendell W., and A. C. Bickley. "Structural-Lexical Predictability of Materials Which Predictor Has Previously Produced or Read". Paper in the 1967 Proceedings of the American Psychological Association, Division 15. (ED 011 812)

Whipple, Gertrude. "Practical Problems of Schoolbook Selection for Disadvantaged Pupils". Paper read at the International Reading Association Conference, April, 1968, Boston, Mass. (ED 029 750)

Wolfe, Josephine B. "Applying Research Findings in Comprehension to Classroom Practice". Paper read at the International Reading Association Conference, May, 1967, Seattle.
(ED 014 371)

Young, Carol E. Development of Language Analysis Procedures with Application to Automatic Indexing. Columbus, Ohio: Computer and Information Science Research Center, 1973, p. 69. (ED 078 843)


APPENDIX A

INDEX OF TEXTS AND SAMPLES BY GRADE LEVEL


C. ENGLISH 8.
(Total of 17 Samples)

*1C01C  Text:    The Craft of Writing. Don Mills, Ontario: Longmans Canada Ltd., 1965.
        Author:  R.J. McMaster.

Sample  Pages      Sample  Pages
01      2-3        05      96-97
02      25-26      06      119-121
03      48-49      07      132-136
04      71-75

*1C02C  Text:    Short Stories of Distinction. Agincourt: The Book Society of Canada Ltd., 1960.
        Author:  L.H. Newell and J.W. MacDonald (eds).

Sample  Pages      Sample  Pages
01      9-10       06      124-125
02      32-33      07      147-148
03      55-56      08      170-171
04      78-79      09      192-194
05      101-102    10      215-216


D. HOME ECONOMICS 8.
(Total of 22 Samples)

*1D01C  Text:    Teen Guide to Homemaking. Toronto: McGraw-Hill Co. of Canada Ltd., 1968.
        Authors: M.S. Barclay and F. Champion.

Sample  Pages      Sample  Pages
01      6-8        12      230-235
02      34-36      13      247-248
03      52-53      14      262-265
04      71-72      15      278-280
05      85-86      16      306-308
06      108-111    17      334-336
07      124-125    18      342-345
08      153-154    19      366-368
09      168-170    20      392-395
10      180-181    21      406-408
11      218-221    22      428-430


E. INDUSTRIAL EDUCATION 8.
(Total of 9 Samples)

*1E01C  Text:    Exploring Industrial[illegible]. New York: McGraw-Hill Co. of Canada Ltd., 1968.
        Authors: Jes Laustrup, et al.

Sample  Pages      Sample  Pages
01      3-5        06      141-144
02      32-38      07      161-162
03      51-55      08      181-183
04      55-106     09      196-197
05      106-115


F. MATHEMATICS 8.
(Total of 14 Samples)

*1F01C  Text:    Introduction [remainder of title illegible]. Reading, Mass: Addison-Wesley Publishing Co. Inc., 1962.
        Author:  C.F. Brumfiel, et al.

Sample  Pages      Sample  Pages
01      14-16      08      175-179
02      31-35      09      188-197
03      52-56      10      201-207
04      70-74      11      226-228
05      101-102    12      243-244
06      131-133    13      259-260
07      136-141    14      264-268


G. SCIENCE 8.
(Total of 20 Samples)

*1G01C  Text:    Labtext in Science: Book 1. Toronto: The Copp Clark Publishing Co., 1968.
        Authors: G.H. Cannon, et al.

Sample  Pages      Sample  Pages
01      13-16      06      121-124
02      25-27      07      138-140
03      61-62      08      163-164
04      78-80      09      179-180
05      100-101

*1G02C  Text:    Reading About Science 1. Holt, Rinehart & Winston of Canada, Ltd., 1968.
        Authors: Clifford J. Anastasiou, et al.

Sample  Pages      Sample  Pages
01      12-13      07      143-144
02      43         08      157-158
03      57-58      09      176-177
04      72-73      10      200-201
05      90         11      226-227
06      122-123


H. SOCIAL STUDIES 8.
(Total of 22 Samples)

*1H01C  Text:    Man in the Tropics. Scarborough, Ontario: Bellhaven House Ltd., 1968.
        Authors: Gordon E. Carswell, et al.

Sample  Pages      Sample  Pages
01      1-3        09      203-205
02      25-29      10      227-233
03      51-54      11      245-247
04      76-78      12      269-272
05      100-104    13      294-297
06      126-127    14      319-322
07      153-156    15      345-346
08      177-181

*1H02C  Text:    The Shaping of Modern Europe. Toronto: The Macmillan Company of Canada Ltd., 1968.
        Author:  Geoffrey Williams.

Sample  Pages      Sample  Pages
01      5-6        05      99-100
02      28-29      06      117-118
03      50-51      07      140-141
04      76-77


B. COMMERCE 9.
(Total of 25 Samples)

*2B01C  Text:    Personal Typewriting. Toronto: W.J. Gage Ltd., 1967.
        Authors: S.J. Wanous, et al.

Sample  Pages      Sample  Pages
01      preface    07      151-156
02      vi-vii     08      168
03      54-61      09      189-195
04      65-69      10      211-212
05      95-99      11      239-242
06      132-133

*2B02C  Text:    The Junior Clerk. Toronto: Sir Isaac Pitman (Canada) Ltd., 1970.
        Authors: C.A. Trotter and P.C. Glover.

Sample  Pages      Sample  Pages
01      1-3        08      171-174
02      32-35      09      187-189
03      60-61      10      204-213
04      75-77      11      238-239
05      95-97      12      261-263
06      125-126    13      279-280
07      133-148    14      295-298


C. ENGLISH 9.
(Total of 47 Samples)

*2C01C  Text:    Learning English. Toronto: The Macmillan Company of Canada Ltd., 1963.
        Authors: Philip G. Penner and Ruth E. McConnell.

Sample  Pages      Sample  Pages
01      1-3        11      230-234
02      24-26      12      256-263
03      49-50      13      284-288
04      55-57      14      310-314
05      70-73      15      337-340
06      95-101     16      360-363
07      123-124    17      384-386
08      146-158    18      411-412
09      181-183    19      435-441
10      202-208    20      453-455

*2C02C  Text:    The Accomplished Reader. Don Mills, Ontario: Bellhaven House, 1964.
        Authors: Maurice Gibbons and Alan Dawe.

Sample  Pages      Sample  Pages
01      1-2        05      96-97
02      26-27      06      118
03      50-51      07      142-143
04      73-74

*2C03C  Text:    Prose Readings. Ontario: Longmans Canada Ltd., 1964.
        Author:  Jan de Bruyn (ed).

Sample  Pages      Sample  Pages
01      3-4        06      119-120
02      26-28      07      142-144
03      50-51      08      166-167
04      73-74      09      189-190
05      96-97      10      212-213

*2C04C  Text:    The Harrap Book of Modern Essays. Toronto: Clarke, Irwin & Co. Ltd., 1964.
        Author:  J.G. Bullocke (ed).

Sample  Pages      Sample  Pages
01      9-10       06      113-115
02      21-22      07      137-138
03      44-45      08      159-161
04      67-68      09      183-184
05      89-91      10      200-202


D. HOME ECONOMICS 9.
(Total of 76 Samples)

*2D01C  Text:    Guide to Modern Meals. Toronto: McGraw-Hill Co. of Canada Ltd., 1970.
        Authors: D.E. Shank, et al.

Sample  Pages      Sample  Pages
01      2-3        12      246-249
02      31-32      13      267
03      61-61      14      289-291
04      72-73      15      306-307
05      97-98      16      324
06      120-121    17      342-344
07      137-141    18      365-366
08      159-163    19      383-384
09      186-187    20      417
10      206-207    21      426-427
11      223

*2D02C  Text:    Clothes for Teens. Toronto: D.C. Heath Canada Ltd., 1970.
        Authors: E. Todd and F. Roberts.

Sample  Pages      Sample  Pages
01      2-3        12      258
02      35         13      283-284
03      62         14      299
04      81         15      328-329
05      108        16      338-339
06      125        17      359-361
07      147-148    18      376-378
08      167-168    19      400-401
09      194-196    20      439-440
10      214-215    21      460-461
11      240        22      489-490

*2D03C  Text:    Learning About Children. Philadelphia: J.B. Lippincott Co., 1964.
        Authors: R.M. Shuey, et al.

Sample  Pages      Sample  Pages
01      18-20      08      170-171
02      36-39      09      193-194
03      63-64      10      216-219
04      82-83      11      237-240
05      95-97      12      258-259
06      126-128    13      279-281
07      146-148    14      289-290

*2D04C  Text:    Up the Years From 1 to 6. Ottawa: Dept. of National Health & Welfare, Canada, 1967.
        Author:  Not given.

Sample  Pages      Sample  Pages
01      8-10       06      115-116
02      31-33      07      136-139
03      55-56      08      151-152
04      68-69      09      183-184
05      95-96      10      201-202

*2D05C  Text:    So You Are Ready To Cook. Minneapolis: Burgess Publishing Co., 1964.
        Author:  M.A. Duffie.

Sample  Pages      Sample  Pages
01      4-6        06      127-130
02      37         07      141-143
03      59-61      08      162
04      86-87      09      182-183
05      103


E. INDUSTRIAL EDUCATION 9.
(Total of 54 Samples)

*2E01C  Text:    General Woodworking. Toronto: McGraw-Hill Co. of Canada Ltd., 1965.
        Author:  Chris. H. Groneman.

Sample  Pages      Sample  Pages
01      1-2        08      175
02      42-44      09      184-186
03      54-55      10      210-211
04      74-76      11      235-238
05      95-113     12      253-254
06      114-135    13      272-273
07      158-160

*2E02C  Text:    General Metals. Toronto: McGraw-Hill Co. of Canada Ltd., 1965.
        Author:  John L. Feirer.

Sample  Pages      Sample  Pages
01      12-14      09      185-187
02      35-36      10      210-212
03      59-61      11      226-227
04      67-69      12      238-240
05      104-105    13      260-262
06      126-129    14      273-274
07      149-151    15      317-319
08      167-170    16      340-349

*2E03C  Text:    General Power Mechanics. Toronto: McGraw-Hill Co. of Canada Ltd., 1970.
        Authors: Robert M. Worthington, et al.

Sample  Pages      Sample  Pages
01      18-21      14      297-298
02      35-37      15      328-331
03      57-59      16      342-348
04      79-81      17      367-369
05      100-101    18      390-391
06      127-129    19      413-414
07      148-149    20      435
08      169-170    21      458-460
09      192-193    22      473-474
10      210-212    23      500-501
11      237-240    24      521-522
12      260-261    25      546-547
13      281-283


F. MATHEMATICS 9.
(Total of 7 Samples)

*2F01C  Text:    Modern General Mathematics. Don Mills, Ontario: Addison-Wesley (Canada) Ltd., 1966.
        Authors: R.E. Eicholz, et al.

Sample  Pages      Sample  Pages
01      1-70       05      146-161
02      75-106     06      183-214
03      108-127    07      227-331
04      130-143


G. SCIENCE 9.
(Total of 24 Samples)

*2G01C  Text:    [illegible] Science [remainder of title illegible]. Scarborough, Ontario: Prentice-Hall of Canada Ltd., 1968.
        Authors: W.H. Rasmussen and M.C. Schmid.

Sample  Pages      Sample  Pages
01      1-2        08      156-159
02      22-23      09      171-175
03      41-44      10      199-202
04      76-77      11      222-224
05      92-94      12      248-251
06      111-116    13      278
07      136-138

*2G02C  Text:    Reading About Science [remainder of title illegible]. Holt, Rinehart & Winston of Canada Ltd., 1969.
        Authors: M. Forster, et al.

Sample  Pages      Sample  Pages
01      13-14      07      145-146
02      30         08      172-173
03      63-64      09      191-192
04      78-79      10      218
05      100-101    11      241-242
06      124-125


H. SOCIAL STUDIES 9.
(Total of 13 Samples)

*2H01C  Text:    [title illegible]. Scarborough, Ontario: Bellhaven House Ltd., 1969.
        Authors: G.E. Carswell, et al.

Sample  Pages      Sample  Pages
01      1-3        05      95-100
02      22-27      06      111-114
03      61-62      07      132-134
04      82-85      08      155-156

*2H02C  Text:    Our World of Change. Toronto: McGraw-Hill Company of Canada, Ltd., 1969.
        Author:  Hugh B. Innis.

Sample  Pages      Sample  Pages
01      13-14      04      71-72
02      20-21      05      104-106
03      49-51


A. AGRICULTURE 10.
(Pilot Study: Not used in Corpus)
(Total of 21 Samples)

*3A01C  Text:    Farmer's Shop Book. Milwaukee: The Bruce Publishing Co., 1953.
        Authors: L.M. Roehl and A.D. Longhouse.

Sample  Pages      Sample  Pages
01      14-17      12      231-233
02      34-36      13      261
03      62-64      14      274
04      78-81      15      313-316
05      109-111    16      328-329
06      129        17      353-355
07      146-147    18      375
08      161-164    19      390-391
09      189-190    20      416-417
10      216-220    21      433-437
11      223-275


B. COMMERCE 10.
(Total of 16 Samples)

*3B01C  Text:    New Basic Course [remainder of title illegible]. Toronto: Sir Isaac Pitman (Canada) Ltd., 1964.
        Author:  Not given.

Sample  Pages      Sample  Pages
01      viii-ix    05      88-89
02      27-37      06      113-121
03      55-66      07      137-145
04      77-87

*3B02C  Text:    Exploring Business. Toronto: McGraw-Hill Co. of Canada Ltd., 1968.
        Authors: J. Frank Dame, et al.

Sample  Pages      Sample  Pages
01      5-7        06      114-117
02      27-29      07      143
03      54-56      08      178-179
04      85         09      186
05      100-102


C. ENGLISH 10.
(Total of 16 Samples)

*3C01C  Text:    [title illegible]. Don Mills, Ontario: J.M. Dent & Sons (Canada), 1965.
        Authors: Malcolm Ross and John Stevens (eds).

Sample  Pages      Sample  Pages
01      1-3        07      140-141
02      25-26      08      163-165
03      48-49      09      187-188
04      71-72      10      210-211
05      94-95      11      233-234
06      117-118    12      251-252

*3C02C  Text:    Drama IV. Toronto: The Macmillan Co. of Canada Ltd., 1965.
        Author:  Herman Voaden (ed).

Sample  Pages      Sample  Pages
01      2-3        03      226-227
02      142-143    04      383-384


F. MATHEMATICS 10.
(Total of 14 Samples)

*3F01C  Text:    Mathematics: A Modern Approach. Don Mills, Ontario: Addison-Wesley (Canada) Ltd., 1966.
        Authors: M.S. Wilcox and J.E. Yarnelle.

Sample  Pages      Sample  Pages
01      1-5        08      190-191
02      14-20      09      207-209
03      54-57      10      237-263
04      65-66      11      295-297
05      90-91      12      304-305
06      101-102    13      322-324
07      150-153    14      346-347


G. SCIENCE 10.
(Total of 31 Samples)

*3G01C  Text:    Extending Science Co[remainder of title illegible]. Scarborough, Ontario: Prentice-Hall of Canada Ltd., 1970.
        Author:  M.C. Schmid (ed).

Sample  Pages      Sample  Pages
01      1-5        10      204-210
02      31-37      11      253-256
03      55-56      12      259-264
04      79-85      13      291-298
05      106-108    14      311-319
06      126-128    15      323-326
07      149-154    16      329-336
08      164-169    17      372-375
09      193-200

*3G02C  Text:    Reading About Science 3. Toronto: Holt, Rinehart & Winston of Canada, Ltd., 1970.
        Author:  J. Woodrow.

Sample  Pages      Sample  Pages
01      38-42      08      186-188
02      59         09      203-204
03      75         10      227-228
04      100-101    11      234-235
05      128-129    12      254-255
06      139-143    13      277-278
07      155        14      301-302


H. SOCIAL STUDIES 10.
(Total of 42 Samples)

*3H01C  Text:    A [remainder of title illegible]. Toronto: Gage Educational Pub. Ltd., 1970.
        Authors: G.S. Tomkins, et al.

Sample  Pages      Sample  Pages
01      11         16      320
02      26-27      17      327-328
03      53         18      347
04      74         19      372
05      97         20      387
06      112        21      406
07      134-135    22      434
08      159        23      449
09      175        24      469
10      196        25      490
11      210-211    26      514-515
12      236-237    27      536
13      254        28      557-559
14      279        29      581-582
15      299        30      596

*3H02C  Text:    A Nation Developing. Toronto: McGraw-Hill Company of Canada Ltd., 1970.
        Author:  J. A. Lower.

Sample  Pages  Sample  Pages  01 02 03 04 05 06  14-15 35-36 54-55 77-78 92-93 1 15-116  07 08 09 10 11 12  131-132 158-159 178-179 197-198 213-214 230-231  187  APPENDIX  B  SAMPLE SIZES IN ALPHABETICAL ORDER AND ASCENDING  RANK  SAMPLES IW ALPHABETICAL ORDER SAMPLE *1C01C01 *1C01C02 • 1C01C0 3 *1C01C04 •1C01C05 * 1 CO 1 CO 6 •1C01C07 •1C02C01 •1C02C02 *1C02CO 3 •1C02C04 *1C02C05 •1C02C06 * 1 CO 2 CO 7 •1C02C08 *1C02C09 •1C02C10 *1D01C01 *1D01C02 •1D01C03 *1D01C04 •1D01C05 •1D01C06 • 1D01C0 7 *1D01C08 *1D01C09 •1D01C10 • 1D01C1 1 •MD01C12 •1D01C13 *1D01C14 •1D01C15 *1D01C16 *1D01C17 •1D01C18 •1D01C19 •1D01C20 *1D01C21 •1D01C22 * 1 EO 1 CO 1  SIZE 523 507 496 455 485 508 526 470 475 484 575 515 588 450 529 526 493 505 387 49 1 576 384 557 618 584 577 554 573 480 509 560 427 391 535 635 436 466 611 571 464  SAMPLE *1E0 1C02 •1E01C03 *1E01C04 •1E01C05 *1E01C0 6 *1E0 1C07 *1E01C08 *1E01C09 • 1F01C01 *1F01C02 • 1F01C03 *1F0 1C04 • 1F01C0 5 •1F01C06 *1F01C07 •1F01C08 •1F01C09 •1F01C10 • 1F01C11 *1F0 1C12 •1F01C13 *1F0 1C14 •1G01C01 *1G0 1C02 •1G01C03 •1G01C04 *1G01C05 *1G0 1C06 • 1G01C07 *1G0 1C08 •1G01C09 •1G02C01 *1G02C02 •1G02C03 •1G02C04 •1G02C05 •1G02C06 •1G02C07 • 1G02C08 •1G02C09  SIZE 521 612 579 556 505 473 473 441 453 459 498 427 481 566 549 552 54 5 491 541 522 481 509 505 515 514 482 512 453 458 468 495 513 490 524 485 445 470 499 533 495  SAMPLES IN ALPHABETICAL ORDEB SAMPLE *1G02C10 •1G02C11 *1H01C01 *1H01C02 *1H01C03 •1H01C04 •1H01C05 *1H01C06 * 1 HO 1 CO 7 *1H01C08 *1H01C09 *1H01C10 *1H01C1 1 •1H01C12 *1H01C13 •1H01C14 •1H01C15 •1H02C01 *1H02C02 •1H02C03 *1H02C04 *1H02C05 •1H02C06 •1H02C07 *2B01C01 •2B01C02 •2B01C03 •2B01C04 •2B01C0 5 •2B01C06 •2B01C07 •2B01C08 *2B01C09 •2B01C10 • 2B01C1 1 •2B02C01 •2B02C02 •2B02C03 *2B02C04 • 2B02C05  SIZE 526 525 532 522 543 512 495 562 500 537 512 494 512 493 519 499 496 495 508 479 501 490 527 477 451 469 470 361 618 498 530 458 551 608 480 571 528 545 489 448  SAMPLE •2B02C06 *2B02C07 •2B02C08 *2B02C09 •2B02C10 
•2B02C11 •2B02C12 •2B02C13 *2B02C14 •2C01C01 •2C01C02 •2C01C03 *2C01C04 •2C01C05 *2C0 1C06 •2C01C07 •2C01C08 *2C01C0 9 •2C01C10 •2C01C11 *2C0 1C12 •2C01C13 *2C01C14 *2C01C15 •2C01C16 •2C01C17 •2C01C18 *2C01C19 *2C0 1C20 •2C02C01 *2C02C02 •2C02C03 •2C02C04 •2C02C05 •2C02C06 •2C02C07 •2C03C01 *2CO3 CO 2 •2C03C03 •2C03C04  SIZE 498 4 85 471 463 495 523 470 490 515 500 487 499 528 500 465 455 445 453 480 445 500 476 504 547 444 491 509 381 537 498 520 458 494 489 420 521 491 439 562 499  SAMPLES SAMPLE *2C03C35 *2C03C06 •2C03C07 •2C03C08 *2C03C09 *2C03C10 •2C04C01 •2C04C02 •2C04C03 •2C04C04 •2C04C05 •2C04C06 •2C04C07 •2C04C38 •2C04C09 •2C04C10 •2D01C01 *2D01C02 *2D01C03 •2D01C04 •2D01C05 *2D01C06 *2D01C07 •2D01C08 •2D01C09 •2D01C10 *2D01C11 •2D01C12 •2D01C13 *2D01C14 •2D01C15 •2D01C16 •2D01C17 *2D01C18 *2D01C19 *2D01C20 *2D01C21 *2D02C0 1 •2D02C02 *2D02C03  IN ALPHABETICAL ORDER  SIZE 530 470 474 572 483 515 532 451 514 514 503 513 500 508 505 502 534 479 525 452 487 446 455 444 508 457 507 479 473 520 469 508 515 517 504 465 454 513 525 504  SAMPLE •2D02C04 •2D02C05 •2D02C06 •2D02C07 •2D02C08 •2D02C09 *2D02C10 •2D02C11 •2D02C12 *2D02C13 •2D02C14 *2D02C15 *2D02C16 *2D02C17 •2D02C18 •2D02C19 *2D02C20 •2D02C21 •2D02C22 *2D03C01 •2D03C0 2 *2D03C03 •2D03C04 *2D03C05 *2D03C06 •2D03C07 •2D03C08 •2D03C09 •2D03C10 *2D03C11 •2D03C12 •2D03C13 • 2D03C14 *2D04C01 •2D04C02 •2D04C03 *2D04C04 *2D04C05 •2D04C06 •2D04C07  SIZE 549 458 471 470 496 478 526 485 480 489 511 438 485 477 514 407 524 454 501 544 4 96 508 507 516 506 516 502 493 473 435 496 448 488 564 615 501 572 588 469 489  SAMPLES IN ALPHABETICAL SAMPLE •2D04C08 •2D04C09 •2D04C10 •2D05C01 *2D05C02 •2D05C03 •2D05C04 •2D05C05 •2D05C06 •2D05C07 *2D05C08 •2D05C09 •2E01C01 •2E01C02 •2E01C03 *2E01C04 •2E01C05 *2E01C06 • 2E01C07 *2E01C08 *2E01C09 *2E01C10 •2E01C11 •2E01C12 •2E01C13 •2E02C01 •2E02C02 •2E02C03 •2E02C04  SIZE 524 488 522 518 539 524 475 486 506 504 49 1 556 535 511 448 455 657 536 400 446 486 445 338 404 414 503 
508 457 504  •2E02C05  476  *2E02C06 *2E02C07  502 356  *2E02C08 •2E02C09  508 573  • 2E02C10 •2E02C11  484 472  • 2E02C13 •2E02C14 •2E02C15  490 467 471  •2E02C12  528  OBDEB  SAMPLE  SIZE  •2E02C16 •2E03C01  494 511  •2E03C02 •2E03C03 •2E03C04 •2E03C05 *2E03C06 *2E03C07 •2E03C08 *2E03C09 •2E03C10 •2E03C11 •2E03C12 *2E03C13 *2E03C14 •2E03C15 •2E03C16 •2E03C17 •2E03C18 •2E03C19 *2E03C20 *2E03C21 •2E03C22 *2E03C23 •2E03C24 •2E03C25 *2F01C01 •2F01C02 *2F01C03 •2F01C04 *2F01C05 *2F01C06 •2F01C07 •2G01C01 •2G01C02 •2G01C03 •2G01C04 •2G01C05 •2G01C06 •2G01C07  519 467 558 525 525 551 561 514 458 479 538 479 488 488 548 506 523 521 519 567 474 506 447 517 505 503 480 485 561 501 581 507 500 517 543 502 494 508  SAMPLES SAMPLE •2G01C08 •2G01C09 *2G01C10 *2G01C11 *2G01C12 *2G01C13 *2G02C01 *2G02C02 *2G02C03 •2G02C04 •2G02C05 *2G02C06 •2G02C07 *2G02C08 •2G02C09 •2G02C10 *2G02C11 *2H01C01 *2H01C02 •2H01C03 •2H01C04 •2H01C05 •2H01C06 *2H01C07 •2H01C08 *2H02C01 •2H02C02 •2H02C03 *2H02C04 •2H02C05 •3B01C01 •3B01C02 •3B01C03 •3B01C0 4 *3B01C05 *3B01C06 •3B01C07 *3B02C0 1 *3B02C02 •3B02C03  IN ALPHABETICAL ORDER  SIZE 540 512 523 572 511 519 496 516 452 513 484 522 514 511 524 460 538 532 544 638 559 557 508 513 557 507 472 525 497 546 543 552 483 494 573 419 482 482 403 428  SAMPLE •3B02C04 •3B02C0 5 •3B02C06 *3B02C07 *3B02C08 *3B02C09 • 3C01C01 *3C0 1C02 •3C01C0 3 *3C0 1C04 •3C01C05 •3C01C06 *3C01C0 7 *3C01C08 •3C01C09 •3C01C10 *3C01C11 • 3C01C12 *3C02C01 •3C02C02 • 3C02C03 •3C02C04 •3F01C01 *3F0 1C02 *3F01C03 •3F01C0 4 •3F01C05 *3F01C06 *3F01C07 *3F0 1C08 •3F01C09 •3F01C10 •3F01C11 *3F01C12 •3F01C13 *3F01C14 •3G01C01 •3G01C02 •3G01C03 •3G01C04  SIZE 477 435 499 429 495 458 509 570 523 551 564 519 594 612 532 620 538 556 454 522 461 430 499 450 508 484 450 505 549 531 478 492 537 541 546 530 548 477 517 497  193  SAMPLES IN ALPHABETICAL ORDEB SAMPLE •3G01C05 *3G01C0 6 *3G01C07 •3G01C08 •3G01C09 *3G01C10 *3G01C11 •3G01C12 *3G01C13 •3G01C14 *3G01C1 5 •3G01C16 *3G01C17 *3G02C01 
*3G02C02 *3G02C03 *3G02C04 *3G02C05 *3G02C06 •3G02C07 *3G02C08 *3G02C09 *3G02C10 *3G02C1 1 *3G02C12 *3G02C13 •3G02C14 •3H01C01 *3H01C02 *3H01C03 *3H0 1C04 •3H01C05 • 3H01C06 •3H01C07 *3H0 1C08 •3H01C09 *3H01C10 *3H01C11 *3H01C12 •3H01C13  SIZE 4 86 520 470 517 458 532 517 514 543 492 498 510 498 523 451 538 497 495 435 522 493 511 ,469 467 513 522 555 482 568 514 480 525 565 501 535 552 497 567 423 54 6  SAMPLE •3H01C14 *3H01C15 *3H01C16 •3H01C17 •3H01C18 •3H01C19 •3H01C20 •3H01C21 *3H0 1C22 *3H01C23 *3H0 1C24 •3H01C25 *3H0 1C26 *3H01C27 •3H01C28 *3H01C29 *3H01C30 •3H02C01 •3H02C02 *3H02C03 •3H02C04 •3H02C05 •3H02C06 •3H02C07 *3H02C08 *3H02C09 •3H02C10 •3H02C11 •3H02C12  SIZE 437 489 503 478 482 428 431 455 520 439 494 469 466 503 530 480 377 460 449 492 442 490 423 470 447 537 485 541 456  SAMPLES SAMPLE •2E01C1 1 *2E02C07 *2B0 1C04 *3H01C30 •2C01C19 * 1D01C0.5 •1D01C02 •1D01C16 •2E01C07 *3B02C02 •2E01C12 *2D02C19 *2E01C13 • 3B01C06 •2C02C06 *3H02C06 •3H01C12 "MD01C15 •1F01C04 •3B02C03 *3H01C19 *3B02C07 •3C02C04 *3H01C20 *2D03C11 •3G02C0 6 •3B02C0 5 •1D01C19 *3H01C14 • 2D02C1 5 •3H01C23 •2C03C02 *1E01C09 *3H02C04 •2C01C16 *2D01C08 •1G02C05 *2E01C10 •2C01C08 •2C01C11  BANKED IN ASCENDING ORDER  SIZE 338 356 361 377 381 384 387 391 400 403 404 407 414 419 420 423 423 427 427 428 428 429 430 431 435 435 435 436 437 438 439 439 441 442 444 444 445 445 445 445  SAMPLE *2E01C08 •2D01C06 •3H02C08 •2E03C24 • 2B02C05 •2D03C13 *2E01C0 3 *3H02C02 *1 C02C07 *3F0 1C05 •3F01C0 2 •2C04C02 •2B01C01 •3G02C02 •2D01C0 4 •2G02C03 •2C01C09 •1F01C01 *1G01C06 *2D0 1C2 1 •3C02C01 •2D02C21 •3H01C21 *1C01C04 *2E01C04 •2D01C07 •2C01C07 •3H02C12 *2D01C10 •2E02C03 *2D02C05 •2B01C08 *3B02C0 9 *3G0 1C09 •2E03C10 •1G01C07 •2C02C03 •1F01C02 •3H02C01 *2G02C10  SIZE 446 446 447 447 448 448 448 449 450 450 450 451 451 451 452 452 453 453 453 454 454 454 455 455 455 455 455 456 457 457 458 458 458 458 458 458 458 459 460 460  SAMPLES BANKED IN ASCENDING OBDER SAMPLE •3C02C03 •2B02C09 *1E01C01 
*2C01C06 •2D01C20 •3H01C26 •1D01C20 •2E02C14 •2E03C03 •3G02C11 •1G01C08 *2D04C06 •2B01C0 2 *2D01C15 *3G02C10 •3H01C25 • 2C03C06 *1G02C06 •2D02C07 •2B01C03 •2B02C12 *3G01C07 *1C02C01 •3H02C07 *2D02C06 *2E02C15 *2B02C08 •2H02C02 •2E02C11 •1E01C07 •2D03C10 •1E01C08 *2D01C13 •2C03C07 *2E03C22 •2D05C04 *1C02C02 •2E02C05 *2C01C13 •1H02C07  SIZE 461 463 464 465 465 466 466 467 467 467 468 469 469 469 469 469 470 470 470 470 470 470 470 470 471 471 471 472 472 473 473 473 473 474 474 475 475 476 476 477  SAMPLE •2D02C17 •3G01C02 *3B02C04 *3H01C17 •2D02C09 •3F01C09 *2D01C02 *1H02C03 •2E03C11 •2D01C12 •2E03C13 •2F01C03 •2C01C10 *2B01C11 •2D02C12 *3H01C04 *1D01C12 •3H01C29 •1F01C13 *1F01C05 •3B01C07 •3H01C01 •3B02C01 •1G01C04 •3H01C18 •2C03C09 •3B01C03 •2E02C10 •1C02C03 •2G02C05 •3F01C04 •2D02C11 •3H02C10 • 1C01C05 •1G02C04 •2D02C16 •2F01C04 *2B02C07 *3G0 1C05 •2E01C09  SIZE 477 477 477 478 478 478 479 479 479 479 479 480 480 480 480 480 480 480 481 481 482 482 482 482 482 483 483 484 484 484 484 485 485 485 485 485 485 485 486 486  SAMPLES SAMPLE *2 DO 5 CO 5 *2C01C02 *2D0 1C05 •2D04C09 *2E03C14 *2D03C1 4 *2E03C15 *3H01C15 •2B02C04 *2D04C07 *2C02C05 *2D02C1 3 •2E02C13 •2B02C13 •3H02C05 •1G02C02 •1H02C05 *1DO 1CD3 •1F01C10 • 2C03C0 1 • 2C01C17 •2D05C08 •3H02C03 *3G01C14 *3F01C10 *2D03CD9 *1H01C12 "MC02C10 •3G02C08 •3B01C0U •3H01C24 • 2C02C04 •2E02C16 •1H01C10 •2G01C06 •2B02C10 •1H02C01 • 3B02C08 •1G01C09 * 1 HO 1 CO 5  RANKED IN ASCENDING ORDER  SIZE 486 487 48 7 488 488 488 488 489 489 489 489 489 490 490 490 490 490 491 491 49 1 491 491 492 492 492 493 493 493 4 93 494 494 494 494 494 494 495 495 495 495 495  SAMPLE •3G0 2C0 5 •1G02C09 •2D02C08 •1H01C15 •1C01C03 •2G02C01 • 2D03C02 •2D03C12 *3H01C10 •3G01C04 *2H02C04 *3G02C04 •2C02C01 •2B01C06 •2B02C06 *1F01C0 3 *3G01 C17 *3G01C15 •2C03C04 •1H01C14 *2C01C03 •3F01C01 •3B02C06 *1G02C07 • 2C04C0 7 *1H01C07 *2C01C1 2 •2G01C02 •2C01C01 *2C01C05 •3H01C07 •2D04C0 3 •2F01C06 •2D02C22 •1H02C04 *2C04C10 *2G01C05 •2D03C08 •2E02C06 
•2C04C05  SIZE 495 49 5 496 496 496 496 496 496 497 497 497 497 498 498 498 498 498 498 499 499 499 499 499 499 500 500 500 500 500 500 501 501 501 501 501 502 502 502 502 503  SAMPLES RANKED IN ASCENDING ORDER SAMPLE *3H01C16 •2E02C0 1 *2F01C02 • 3H01C27 •2C01C14 •2E02C04 *2D02C0 3 *2D05C07 *2D01C19 •1G01C01 • 2C04C09 •3F01C06 * 1 EO 1 CO 6 •2F01C01 * 1 DO 1 CO 1 *2D05C06 •2E03C17 •2D03C06 *2E03C23 •2H02C01 * 1 CO 1 CO 2 *2G01C01 *2D0 1C1 1 •2D03C04 •2H01C0 6 •2C04C08 *2E02C08 •2E02C02 • 1H02C02 •1C01C06 •2G01C0 7 •2D01C16 *3F01C03 *2D01C09 *2D03C03 *1F01C14 *2C01C18 •1D01C13 •3C01C01 *3G01C16  SIZE 503 503 503 503 504 504 504 504 504 505 505 505 505 505 505 506 506 506 506 507 507 507 507 507 508 508 508 508 508 508 508 508 508 508 508 509 509 509 509 510  SAMPLE •2D02C14 *2G02C08 •2E03C01 •3G02C09 •2E01C02 *2G01C12 *1H01C11 •1G01C05 *1H01C09 *1H01C04 *2G01C09 •2C04C06 •2H01C07 •2D02C01 •2G02C04 *3G02C12 *1G02C01 *2C04C03 •3H01C03 •2E03C09 •1G01C03 *2C04C04 •3G01C12 •2D02C18 •2G02C07 •2B02C14 *1G0 1C02 *2 D01C17 •2C03C10 • 1C02C05 •2G02C02 •2D03C07 •2D03C05 •3G01C03 •2E03C25 *2G01C03 •2D01C18 *3G0 1C08 •3G01C11 *2D05C01  SIZE 511 511 511 511 511 511 512 512 512 512 512 513 513 513 513 513 513 514 514 514 514 514 514 514 514 515 515 515 515 515 516 516 516 517 517 517 517 517 517 518  SAMPLES RANKED IN ASCENDING ORDER SAMPLE *1H.01C1 3 *2E0 3C0 2 *2G01C13 •3C01C06 •2E03C20 *3H01C22 *2C02C02 *3G01C06 *2D0 1C1 4 •2C02C07 * 1EO1 CO 2 •2E03C19 *1F01C12 •3C02C02 •3G02C07 *2D04C10 •2G02C06 •1H01C02 *3G02C13 *1C01C01 *2G01C10 •2E03C18 • 3G02C0 1 •3C01C03 •2B02C11 •1G02C03 •2D02C20 •2G02C09 *2D05C03 *2D04C08 •2H02C03 •2D01C03 *1G02C11 •2E03C05 *3H0 1C0 5 •2D02C02 •2E03C06 •1G02C10 * 1 CO 1 CO 7 • 2D02C1 0  SIZE 519 519 519 519 519 520 520 520 520 521 521 521 522 522 522 522 522 522 522 523 523 523 523 523 523 524 524 524 524 524 525 525 525 525 525 525 525 526 526 526  SAMPLE • 1C02C09 •1H02C06 •2B02C02 •2E02C12 •2C01C04 • 1C02C08 *2B01C07 •2C03C05 •3F01C14 •3H01C28 
*3F0 1C08 •3C01C09 •2H01C01 *2C04C01 *3G0 1C10 • 1H01C01 *1G02C08 *2D01C01 •3H01C08 *1D01C17 •2E01C01 •2E01C06 *3F01C11 •2C01C20 •1H01C08 •3H02C09 •2E03C12 *3 GO 2 CO 3 •3C01C11 •2G02C11 •2D05C02 •2G01C08 •3H02C11 • 1F01C11 •3F01C12 •3B01C01 *3G0 1C13 •2G01C04 *1H01C03 •2D03C01  SIZE 526 52 7 528 528 528 529 530 530 530 530 531 532 532 532 532 532 533 534 535 535 535 536 537 537 537 537 538 538 538 538 539 540 541 541 541 543 543 543 543 544  SAMPLES SAMPLE •2H01C02 *2B02C03 *1F0 1C09 •2H02C0 5 •3F01C13 •3H01C13 *2C01C1 5 • 3G01C0 1 •2E03C16 * 1FO1 CO 7 *2D02C04 *3F0 1C0 7 •3C01C04 •2B01C0 9 *2E03C07 •1F01C08 *3B01C02 •3H01C09 *1D01C10 •3G02C14 *2D05C09 •1E01C05 •3C01C12 *2H01C08 * 1 DO 1 CO 6 •2H01C05 •2E03C04 •2H01C04 • 1D01C14 •2F01C05 •2E03C08 •1H01C06 *2C03C03 *2D04C01 •3C01C0 5 •3H01C06 *1FO 1C0 6 •2E03C21 •3H01C11 •3H01C02  RANKED IN ASCENDING ORDER  SIZE 544 54 5 54 5 546 54 6 546 547 548 54 8 549 54 9 549 551 551 551 552 552 552 554 555 556 556 556 557 557 557 558 559 560 561 561 562 562 564 564 565 566 567 567 568  SAMPLE *3C0 1C02 •2B02C0 1 •1D01C22 • 2G01C11 *2C03C08 •2D04C04 •2E02C09 *3B01C05 *1D01C11 *1C02C04 • 1D01C04 *1D0 1C09 • 1E01C04 *2F01C07 •1D01C08 •1C02C06 •2D04C05 •3C01C07 •2B01C10 *1D01C21 • 3C01C0 8 • 1E01C03 •2D04C02 •2B01C05 •1D01C07 •3C01C10 •1D01C18 •2H01C03 *2E0 1C05  SIZE 570 571 571 572 572 572 573 573 573 575 576 577 579 581 584 588 588 594 608 611 612 612 615 618 618 620 635 638 657  200  APPENDIX C COMPUTER FILES  AND PROGRAMS USED IN THE STUDY  20 1  FILE# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40  F I L E NAME  SIZE  (394) (87) (206) (102) (18) (35) COMMERCE (70) ENGLISH (85) HOMEC (54) INDED (30) MATH (6 7) SCIENCE (72) SOCIALS GRADE8.ENGLISH (16) (20) GRADE8. HOMEC GRADE8.INDED (9) GRADE8. 
MATH (13) GRADE8.SCIENCE (17) GRADE8.SOCIALS (21) GRADE9.COMMERCE (22) GRADE9.ENGLISH (41) (66) GRADE9.HOMEC (47) GRADE9.INDED GRADE9.MATH (7) (23) GRADE9.SCIENCE (14) GRADE9.SOCIALS GRADE10.COMMERCE (14) GRADE10.ENGLISH (15) GRADE10.MATH (12) (2 8) GRADE10.SCIENCE GRADE10.SOCIALS (41) 8E01 (7) 8E02 (9) (2 0) 8H01 8101 (9) (13) 8M01 (8) 8SC01 (11) 8SC02 (14) 8SO01 8SO02 (7) 9C01 (11)  CORPUS GRADE8 GRADE9 GRADE10 AGRICULTURE  DESCRIPTION OF  FILE  COMPLETE ENGLISH LANGUAGE SAMPLE. SAMPLE TAKEN JUST FROM GRADE EIGHT. ti •i II II GRADE NINE. II n II ii GRADE TEN. II it n it AGRICULTURE •i  II  ii  II  COMMERCE  n  II  II  it  II  ••  it  it  II  II  n  II  II  II  ii  •t  II  II  it  •i  n  n  ii  it  ENGLISH HOME EC. INDUSTRIAL ED. MATHEMATICS SCIENCE SOCIALS  GRADE-SUBJECT  SAMPLE •i  II II  ii  n  II  II  II  II  II  II  •i  it  II  II  II  II  •i  II  II  II  II  n  ti  II  •i  II  II  II  II  II  ti  it  II  GRADE-:SUBJECT -TEXT* SAMPLE •i  II  it  II  it  •i  II  II  ii  II  II  n  n  II  •i  II  it  it  II  II  II  n  ti  n  II  ii  II  THE S I Z E COLUMN REFERS TO MACHINE PAGES INSIDE THE COMPUTER, WHICH ARE APPROXIMATELY THE SAME AS TWO PRINTED PAGES EACH.  
FILE #     FILE NAME             SIZE    DESCRIPTION OF FILE

 41        9C02                  (13)    GRADE-SUBJECT-TEXT# SAMPLE
 42        9E01                  (19)    "
 43        9E02                  (7)     "
 44        9E03                  (9)     "
 45        9E04                  (9)     "
 46        9H01                  (18)    "
 47        9H02                  (19)    "
 48        9H03                  (13)    "
 49        9H04                  (10)    "
 50        9H05                  (9)     "
 51        9I01                  (12)    "
 52        9I02                  (14)    "
 53        9I03                  (23)    "
 54        9M01                  (size illegible)  "
 55        9SC01                 (13)    "
 56        9SC02                 (11)    "
 57        9SO01                 (8)     "
 58        9SO02                 (6)     "
 59        10C01                 (6)     "
 60        10C02                 (8)     "
 61        10E01                 (12)    "
 62        10E02                 (5)     "
 63        10M01                 (12)    "
 64        10SC01                (16)    "
 65        10SC02                (14)    "
 66        10SO01                (29)    "
 67        10SO02                (12)    "
 68        WRDSTAT.S             (4)     SENTENCE STATISTICS
 69        WRDSTAT.O             (4)     "
 70        SPLIT1.S              (3)     PROGRAM TO BREAK 'CORPUS' INTO GDES
 71        SPLIT1.O              (2)     "
 72        ST.DEV.S              (1)     STANDARD DEVIATION PROGRAM
 73        ST.DEV.O              (1)     "
 74        TABL.B1.S             (1)     RANK TABLE (DESC. ORDER) PROGRAM
 75        TABL.B1.O             (1)     "
 76        TABL.B4.S             (1)     UNRANKED ASCENDING TABLE PROGRAM
 77        TABL.B4.O             (1)     "
 78        SPLIT2.S              (3)     BREAKS GRADES INTO GRADE-SUBJECTS
 79        SPLIT2.O              (3)     "
 80        SPLIT3.S              (3)     BREAKS GRADE-SUBJECTS INTO TEXTS
 81        SPLIT3.O              (3)     BREAKS GRADE-SUBJECTS INTO TEXTS
 82        COUNTW.S              (7)     WORD COUNT PROGRAM
 83        COUNTW.O              (7)     "
 84        UNSORT.CORP.FREQS     (50)    CORPUS (TABL.B1, & TABL.B4 DATA)
 85        UNSORT.GRD8.FREQS     (21)    GRADE8 "
 86        UNSORT.GRD9.FREQS     (35)    GRADE9 "
 87        UNSORT.GD10.FREQS     (24)    GRADE10 "
 88        UNSORT.COMM.FREQS     (9)     COMMERCE "
 89        UNSORT.ENGL.FREQS     (21)    ENGLISH "
 90        UNSORT.HOME.FREQS     (17)    HOME EC. "
 91        UNSORT.INDE.FREQS     (12)    INDUSTRIAL ED. "
 92        UNSORT.MATH.FREQS     (6)     MATHEMATICS "
 93        UNSORT.SCIE.FREQS     (15)    SCIENCE "
 94        UNSORT.SOCI.FREQS     (20)    SOCIALS "
 95 to 131   P1 to P37           (1) each   P1 TO P37 FOLLOW 8E01 THRU' 10SO02 AND ARE THE INPUT DATA FOR PLOTT.S
132 to 149   PG1 to PG18         (1) each   PG1 TO PG18 FOLLOW GRADE8.ENGLISH TO GRADE10.SOCIALS AND ARE THE INPUT DATA FOR PLOTGS.S
150 to 160   PS1 to PS11         (1) each   PS1 TO PS11 FOLLOW CORPUS TO SOCIALS (EXCL. AGRICULTURE) AND ARE INPUT DATA FOR PLOTG.S, & PLOTS.S
161        CORP.X2               (2)     CORPUS 'WORD' CHI-SQUARE TABLE
162        GRADES.X2             (2)     GRADES "
163        EIGHT.X2              (2)     GRADE8 "
164        NINE.X2               (2)     GRADE9 "
165        TEN.X2                (2)     GRADE10 "
166        CORSENT.X2            (1)     CORPUS 'SENTENCE' CHI-SQUARE TABLE
167        GDSSENT.X2            (1)     GRADES "
168        G8SENT.X2             (1)     GRADE8 "
169        G9SENT.X2             (1)     GRADE9 "
170        G10SENT.X2            (1)     GRADE10 "
171        LENGS                 (1)     LENGTHS DATA FOR SENTENCE CHI-SQUARE
172        WORDS                 (1)     WORDS DATA FOR WORDS CHI-SQUARE
173        CHIS.3                (1)     3 COLUMN CHI-SQUARE PROGRAM
174        CHIO.3                (1)     "
175        CHIS.4                (1)     3 COLUMN CHI-SQUARE PROGRAM (SENTS.)
176        CHIO.4                (1)     "
177        CHIS.7                (1)     7 COLUMN CHI-SQUARE PROGRAM
178        CHIO.7                (3)     "
179        CHIS.8                (1)     7 COLUMN CHI-SQUARE PROGRAM (SENTS.)
180        CHIO.8                (2)     "
181        COUNTW.O              (8)     WORD COUNT (OUTPUTS PLOT DATA TOO)
182        WRDSTAT.S             (4)     SENT. STATS. (OUTPUTS PLOT DATA TOO)
183        WRDSTAT.O             (4)     "
184        PLOTS.S               (1)     PLOT SENT. LENGTH DISTR. FOR SUBJS.
185        PLOTS.O               (1)     "
186        PLOTG.S               (1)     PLOT SENT. LENGTH DISTR. FOR CORPUS & GRADES
187        PLOTG.O               (1)     "
188        CORPUS.INDEX          (7)     TEXT INFORMATION FOR BOOKS
189        GRADES.INDEX          (7)     "
190        CORPUS.INTRODUCTIO    (3)     "
191        GRADES.INTRODUCTIO    (3)     "
192        GRADES.INTRO.INSER    (1)     "
193        PLOTT.S               (1)     PLOT SENT. LENGTH DISTR. FOR TEXTS
194        PLOTT.O               (1)     "
195        PLOTGS.S              (1)     PLOT SENT. LENGTH DISTR. FOR GRADE-SUBJECTS
196        PLOTGS.O              (1)     "
197        PW1                   (3)     PW1 THRU' PW11 FOLLOW PS1 THRU' PS11 AND ARE INPUT DATA FOR PLOTGW.S
198        PW2                   (2)     "
199        PW3                   (3)     "
200        PW4                   (1)     "
201        PW5                   (1)     INPUT DATA FOR PLOTGW.S CONT'D
202        PW6                   (1)     "
203        PW7                   (2)     "
204        PW8                   (1)     "
205        PW9                   (1)     "
206        PW10                  (1)     "
207        PW11                  (1)     "
208        PLOTGW.S              (1)     PLOT WORD-FREQ-DISTR. (CORPUS, GRADES, & SUBJECTS)
209        TABL.B1W.S            (1)     VERSION OF TABL.B1.S TO GIVE DATA FOR PLOTGW.S

           TOTAL SIZE            (2,522)

THE SIZE COLUMN REFERS TO MACHINE PAGES INSIDE THE COMPUTER, WHICH ARE APPROXIMATELY THE SAME AS TWO 8 X 11 PRINTED PAGES EACH.
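The listing above names a word count program (COUNTW.S), rank and ascending table programs (TABL.B1.S, TABL.B4.S) and column chi-square programs (CHIS.3, CHIS.7), but the programs themselves are not reproduced. The following is a minimal modern sketch in Python of the kind of tallies such programs produce; it is not the original code. The function names, the tokenising rule, the alphabetical tie-break and the assumption that expected frequencies are proportional to the number of running words in each group are all illustrative assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Upper-case the text and return its word tokens (letters, digits, apostrophes)."""
    return re.findall(r"[A-Z0-9']+", text.upper())


def rank_list(counts):
    """Rows of (word, count, cumulative % of running words), most frequent first."""
    total = sum(counts.values())
    rows, running = [], 0
    # Ties are broken alphabetically; this tie-break is an assumption.
    for word, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        running += n
        rows.append((word, n, 100.0 * running / total))
    return rows


def frequency_spectrum(counts):
    """Rows of (X, FX, SUM FX, FX*X, SUM FX*X) in ascending X.

    FX is the number of word types that occur exactly X times.
    """
    fx = Counter(counts.values())
    rows, sum_fx, sum_fxx = [], 0, 0
    for x in sorted(fx):
        sum_fx += fx[x]
        sum_fxx += fx[x] * x
        rows.append((x, fx[x], sum_fx, fx[x] * x, sum_fxx))
    return rows


def chi_square(observed, group_sizes):
    """Chi-square of observed counts against expectations proportional to group size."""
    total_obs = sum(observed)
    total_size = sum(group_sizes)
    expected = [total_obs * size / total_size for size in group_sizes]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    sample = "The cat and the hat. The end."
    counts = Counter(tokenize(sample))
    for word, n, cum in rank_list(counts):
        print(f"{cum:8.4f} {n:5d} {word}")
    for row in frequency_spectrum(counts):
        print(row)
    # Two groups of equal size, so each expected count is 15.
    print(round(chi_square([10, 20], [100, 100]), 2))
```

Given one word's counts in each group and the number of running words in each group, `chi_square` returns a single statistic for that word; `rank_list` and `frequency_spectrum` work from the same `Counter`, so one pass over the text feeds both kinds of table.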
APPENDIX D

ALPHABETICAL LISTING OF CORPUS VOCABULARY (SAMPLE)

CORPUS VOCABULARY ALPHABETICAL LIST

  NO.   FREQ    COUNT  WORD                 NO.   FREQ    COUNT  WORD

  501   0.0043      1  ABBEY                551   0.0085      2  ACCELERATING
        0.0043      1  ABBOTS                     0.0085      2  ACCELERATOR
        0.0043      1  ABBREVIATED                0.0043      1  ACCELERATORS
        0.0043      1  ABBREVIATING               0.0128      3  ACCENT
        0.0043      1  ABDICATE                   0.0043      1  ACCENTED
        0.0043      1  ABDICATED                  0.0638     15  ACCEPT
        0.0255      6  ABDOMEN                    0.0213      5  ACCEPTABLE
        0.0085      2  ABE'S                      0.0085      2  ACCEPTANCE
        0.0043      1  ABERDARES                  0.0383      9  ACCEPTED
        0.0043      1  ABIDES                     0.0128      3  ACCEPTING
        0.0213      5  ABILITIES                  0.0043      1  ACCEPTS
        0.0936     22  ABILITY                    0.0043      1  ACCESS
        0.3445     81  ABLE                       0.0043      1  ACCESSIBLE
        0.0043      1  ABNER                      0.0170      4  ACCESSORIES
        0.0085      2  ABNORMAL                   0.0043      1  ACCESSORY
        0.0043      1  ABNORMALITIES              0.0298      7  ACCIDENT
        0.0043      1  ABOARD                     0.0085      2  ACCIDENTALLY
        0.0043      1  ABODE                      0.0043      1  ACCIDENTALS
        0.0043      1  ABOLISHED                  0.0383      9  ACCIDENTS
        0.0043      1  ABOUND                     0.0043      1  ACCLIMATED
        1.9693    463  ABOUT                      0.0043      1  ACCLIMATIZED
        0.4424    104  ABOVE                      0.0043      1  ACCOMMODATE
        0.0043      1  ABRAHAM                    0.0043      1  ACCOMMODATES
        0.0766     18  ABRASIVE                   0.0043      1  ACCOMMODATIONS
        0.0468     11  ABRASIVES                  0.0128      3  ACCOMPANIED
        0.0128      3  ABROAD                     0.0170      4  ACCOMPANIES
        0.0043      1  ABRUPT                     0.0043      1  ACCOMPANIMENT
        0.0043      1  ABRUPTLY                   0.0043      1  ACCOMPANY
        0.0043      1  ABSCURED                   0.0213      5  ACCOMPANYING
        0.0213      5  ABSENCE                    0.0468     11  ACCOMPLISHED
        0.0128      3  ABSENT                     0.0043      1  ACCOMPLISHES
        0.0510     12  ABSOLUTE                   0.0128      3  ACCOMPLISHMENT
        0.0213      5  ABSOLUTELY                 0.0043      1  ACCOMPLISHMENTS
        0.0468     11  ABSORB                     0.0085      2  ACCORD
        0.0510     12  ABSORBED                   0.0170      4  ACCORDANCE
        0.0085      2  ABSORBENCY                 0.1957     46  ACCORDING
        0.0085      2  ABSORBENT                  0.0170      4  ACCORDINGLY
        0.0128      3  ABSORBING                  0.1999     47  ACCOUNT
        0.0255      6  ABSORBS                    0.0128      3  ACCOUNTANT
        0.0085      2  ABSORPTION                 0.0128      3  ACCOUNTED
        0.0170      4  ABSTRACT                   0.0043      1  ACCOUNTING
        0.0043      1  ABSURDITY                  0.0681     16  ACCOUNTS
        0.0213      5  ABUNDANCE                  0.0128      3  ACCUMULATE
        0.0255      6  ABUNDANT                   0.0043      1  ACCUMULATES
        0.0085      2  ABUSES                     0.0043      1  ACCUMULATION
        0.0043      1  ABUTTED                    0.0808     19  ACCURACY
        0.0043      1  ACADEMY                    0.1531     36  ACCURATE
        0.0085      2  ACADIAN                    0.0766     18  ACCURATELY
        0.0043      1  ACCELERATE                 0.0043      1  ACCUSED
        0.0043      1  ACCELERATED                0.0043      1  ACCUSING

APPENDIX E

RANK LISTING OF CORPUS VOCABULARY (SAMPLE)

CORPUS VOCABULARY RANK LIST

  NO.   FREQ     COUNT  WORD         NO.   FREQ     COUNT  WORD

    1    7.4515  17519  THE           51   41.8690    488  MANY
        10.9295   8177  OF                 42.0736    481  SO
        13.6768   6459  AND                42.2761    476  EACH
        16.1952   5921  A                  42.4738    465  TWO
        18.7136   5921  TO                 42.6708    463  ABOUT
        20.8497   5022  IN                 42.8575    439  SHOULD
        22.5140   3913  IS                 43.0396    428  WHAT
        23.4778   2266  THAT               43.2203    425  THAN
        24.4212   2218  IT                 43.4007    424  BEEN
        25.3421   2165  ARE                43.5806    423  INTO
        26.2527   2141  FOR                43.7596    421  THEM
        27.1417   2090  YOU                43.9370    417  USE
        27.9375   1871  BE                 44.1118    411  MAKE
        28.6984   1789  AS                 44.2845    406  DO
        29.4002   1650  OR                 44.4559    403  UP
        30.0318   1485  WITH               44.6261    400  SUCH
        30.6524   1459  ON                 44.7962    400  THEN
        31.2015   1291  THIS               44.9633    393  TIME
        31.7315   1246  BY                 45.1275    386  ITS
        32.2036   1110  WAS                45.2845    369  WOULD
        32.6660   1087  HE                 45.4410    368  HOW
        33.1185   1064  FROM               45.5967    366  NUMBER
        33.5681   1057  HAVE               45.7511    363  MADE
        34.0164   1054  AT                 45.9034    358  OUT
        34.4384    992  WHICH              46.0535    353  MOST
        34.8352    933  ONE                46.2028    351  ONLY
        35.2133    889  NOT                46.3512    349  NO
        35.5778    857  CAN                46.4967    342  MUST
        35.9419    856  YOUR               46.6413    340  WATER
        36.3047    853  THEY               46.7740    312  ALSO
        36.6590    833  WE                 46.9050    308  FIRST
        37.0104    826  HIS                47.0352    306  VERY
        37.3575    816  WILL               47.1640    303  GOOD
        37.6888    779  IF                 47.2895    295  HIM
        38.0172    772  AN                 47.4112    286  SAME
        38.3149    700  WHEN               47.5298    279  1
        38.6045    681  ALL                47.6404    260  COULD
        38.8865    663  BUT                47.7510    260  WHO
        39.1583    639  THESE              47.8595    255  ANY
        39.4055    581  MAY                47.9675    254  BECAUSE
        39.6475    569  THERE              48.0751    253  SEE
        39.8848    558  HAS                48.1793    245  LIKE
        40.1196    552  I                  48.2810    239  MUCH
        40.3501    542  OTHER              48.3814    236  PEOPLE
        40.5785    537  SOME               48.4809    234  CALLED
        40.8035    529  MORE               48.5804    234  2
        41.0239    518  WERE               48.6795    233  PLACE
        41.2408    510  HAD                48.7782    232  THROUGH
        41.4526    498  THEIR              48.8769    232  WORK
        41.6615    491  USED               48.9739    228  NEW

APPENDIX F

DESCENDING AND ASCENDING ORDER OF CORPUS VOCABULARY (SAMPLES)

THE CORPUS WITH
RANK IN DESCENDING ORDER

RANK     X    FX   SUM FX   CUM% FX    FX*X   SUM FX*X   CUM% FX*X

  51    488    1      51     0.311      488     98437      41.869
  52    481    1      52     0.317      481     98918      42.073
  53    476    1      53     0.323      476     99394      42.276
  54    465    1      54     0.329      465     99859      42.473
  55    463    1      55     0.335      463    100322      42.670
  56    439    1      56     0.341      439    100761      42.857
  57    428    1      57     0.347      428    101189      43.039
  58    425    1      58     0.354      425    101614      43.220
  59    424    1      59     0.360      424    102038      43.400
  60    423    1      60     0.366      423    102461      43.580
  61    421    1      61     0.372      421    102882      43.759
  62    417    1      62     0.378      417    103299      43.937
  63    411    1      63     0.384      411    103710      44.111
  64    406    1      64     0.390      406    104116      44.284
  65    403    1      65     0.396      403    104519      44.455
  67    400    2      67     0.408      800    105319      44.796
  68    393    1      68     0.415      393    105712      44.963
  69    386    1      69     0.421      386    106098      45.127
  70    369    1      70     0.427      369    106467      45.284
  71    368    1      71     0.433      368    106835      45.440
  72    366    1      72     0.439      366    107201      45.596
  73    363    1      73     0.445      363    107564      45.750
  74    358    1      74     0.451      358    107922      45.903
  75    353    1      75     0.457      353    108275      46.053
  76    351    1      76     0.463      351    108626      46.202
  77    349    1      77     0.469      349    108975      46.351
  78    342    1      78     0.475      342    109317      46.496
  79    340    1      79     0.482      340    109657      46.641
  80    312    1      80     0.488      312    109969      46.773
  81    308    1      81     0.494      308    110277      46.904
  82    306    1      82     0.500      306    110583      47.035
  83    303    1      83     0.506      303    110886      47.163
  84    295    1      84     0.512      295    111181      47.289
  85    286    1      85     0.518      286    111467      47.411
  86    279    1      86     0.524      279    111746      47.529
  88    260    2      88     0.536      520    112266      47.750
  89    255    1      89     0.543      255    112521      47.859
  90    254    1      90     0.549      254    112775      47.967
  91    253    1      91     0.555      253    113028      48.074
  92    245    1      92     0.561      245    113273      48.179
  93    239    1      93     0.567      239    113512      48.280
  94    236    1      94     0.573      236    113748      48.381
  96    234    2      96     0.585      468    114216      48.580
  97    233    1      97     0.591      233    114449      48.679
  99    232    2      99     0.603      464    114913      48.876
 100    228    1     100     0.610      228    115141      48.973
 101    223    1     101     0.616      223    115364      49.068
 102    220    1     102     0.622      220    115584      49.162
 103    217    1     103     0.628      217    115801      49.254
 105    216    2     105     0.640      432    116233      49.438

THE CORPUS IN ASCENDING ORDER

   X     FX   SUM FX   CUM% FX    FX*X   SUM FX*X   CUM% FX*X

   1   7098     7098    43.267    7098      7098       3.019
   2   2418     9516    58.007    4836     11934       5.076
   3   1240    10756    65.565    3720     15654       6.658
   4    854    11610    70.771    3416     19070       8.111
   5    632    12242    74.624    3160     22230       9.455
   6    453    12695    77.385    2718     24948      10.611
   7    387    13082    79.744    2709     27657      11.764
   8    322    13404    81.707    2576     30233      12.859
   9    251    13655    83.237    2259     32492      13.820
  10    208    13863    84.505    2080     34572      14.705
  11    189    14052    85.657    2079     36651      15.589
  12    156    14208    86.608    1872     38523      16.385
  13    123    14331    87.357    1599     40122      17.065
  14    138    14469    88.199    1932     42054      17.887
  15    113    14582    88.887    1695     43749      18.608
  16    100    14682    89.497    1600     45349      19.289
  17     98    14780    90.094    1666     47015      19.997
  18     98    14878    90.692    1764     48779      20.748
  19     58    14936    91.045    1102     49881      21.216
  20     68    15004    91.460    1360     51241      21.795
  21     51    15055    91.771    1071     52312      22.250
  22     63    15118    92.155    1386     53698      22.840
  23     36    15154    92.374     828     54526      23.192
  24     44    15198    92.642    1056     55582      23.641
  25     37    15235    92.868     925     56507      24.034
  26     44    15279    93.136    1144     57651      24.521
  27     39    15318    93.374    1053     58704      24.969
  28     34    15352    93.581     952     59656      25.374
  29     33    15385    93.782     957     60613      25.781
  30     38    15423    94.014    1140     61753      26.266
  31     27    15450    94.178     837     62590      26.622
  32     30    15480    94.361     960     63550      27.030
  33     21    15501    94.489     693     64243      27.325
  34     39    15540    94.727    1326     65569      27.889
  35     40    15580    94.971    1400     66969      28.484
  36     31    15611    95.160    1116     68085      28.959
  37     26    15637    95.318     962     69047      29.368
  38     25    15662    95.471     950     69997      29.772
  39     21    15683    95.599     819     70816      30.121
  40     26    15709    95.757    1040     71856      30.563
  41     18    15727    95.867     738     72594      30.877
  42     21    15748    95.995     882     73476      31.252
  43     24    15772    96.141    1032     74508      31.691
  44     17    15789    96.245     748     75256      32.009
  45     13    15802    96.324     585     75841      32.258
  46     18    15820    96.434     828     76669      32.610
  47     10    15830    96.495     470     77139      32.810
  48     12    15842    96.568     576     77715      33.055
  49      8    15850    96.617     392     78107      33.222

APPENDIX G

SENTENCE LENGTH DISTRIBUTION OF THE CORPUS (SAMPLE)

SENTENCE-LENGTH DISTRIBUTION OF THE CORPUS

LENGTH   REPETITIONS     CUM.   ACCUM WORDS   % WORDS

   1          36           36          36       0.02
   2          86          122         208       0.09
   3         100          222         508       0.22
   4         176          398        1212       0.52
   5         283          681        2627       1.12
   6         313          994        4505       1.92
   7         415         1409        7410       3.15
   8         471         1880       11178       4.75
   9         522         2402       15876       6.75
  10         581         2983       21686       9.22
  11         577         3560       28033      11.92
  12         616         4176       35425      15.07
  13         623         4799       43524      18.51
  14         632         5431       52372      22.28
  15         655         6086       62197      26.46
  16         634         6720       72341      30.77
  17         584         7304       82269      34.99
  18         580         7884       92709      39.43
  19         505         8389      102304      43.51
  20         471         8860      111724      47.52
  21         432         9292      120796      51.38
  22         424         9716      130124      55.35
  23         384        10100      138956      59.10
  24         343        10443      147188      62.61
  25         294        10737      154538      65.73
  26         266        11003      161454      68.67
  27         252        11255      168258      71.57
  28         239        11494      174950      74.41
  29         192        11686      180518      76.78
  30         193        11879      186308      79.25
  31         171        12050      191609      81.50
  32         124        12174      195577      83.19
  33         109        12283      199174      84.72
  34         103        12386      202676      86.21
  35          81        12467      205511      87.41
  36          75        12542      208211      88.56
  37          59        12601      210394      89.49
  38          43        12644      212028      90.19
  39          50        12694      213978      91.02
  40          48        12742      215898      91.83
  41          50        12792      217948      92.70
  42          36        12828      219460      93.35
  43          24        12852      220492      93.79
  44          22        12874      221460      94.20
  45          19        12893      222315      94.56
  46          23        12916      223373      95.01

APPENDIX H

GRAPHS OF SENTENCE LENGTH DISTRIBUTION

[Graphs; horizontal axis: SENTENCE LENGTH, 0.0 to 100.0. Only the captions are legible.]

FIGURE 7.2  SENTENCE-LENGTH DISTRIBUTION OF GRADE EIGHT
FIGURE 7.3  SENTENCE-LENGTH DISTRIBUTION OF GRADE NINE
FIGURE 7.4  SENTENCE-LENGTH DISTRIBUTION OF GRADE TEN
FIGURE 7.7  SENTENCE-LENGTH DISTRIBUTION OF HOME ECONOMICS
FIGURE 7.8
SENTENCE-LENGTH DISTRIBUTION OF INDUSTRIAL EDUCATION
FIGURE 7.10  SENTENCE-LENGTH DISTRIBUTION OF SCIENCE
FIGURE 7.11  SENTENCE-LENGTH DISTRIBUTION OF SOCIAL STUDIES
FIGURE 7.12  SENTENCE-LENGTH DISTRIBUTION OF GRADE EIGHT ENGLISH
FIGURE 7.19  SENTENCE-LENGTH DISTRIBUTION OF GRADE NINE ENGLISH
FIGURE 7.20  SENTENCE-LENGTH DISTRIBUTION OF GRADE NINE HOME ECONOMICS
FIGURE [number illegible]  SENTENCE-LENGTH DISTRIBUTION OF GRADE TEN MATHEMATICS
FIGURE 7.28  SENTENCE-LENGTH DISTRIBUTION OF GRADE TEN SCIENCE
FIGURE 7.31  SENTENCE-LENGTH DISTRIBUTION OF TEXT #1C02
FIGURE 7.[number illegible]  SENTENCE-LENGTH DISTRIBUTION OF TEXT #2D05
FIGURE 7.53  SENTENCE-LENGTH DISTRIBUTION OF TEXT #2F01
FIGURE 7.54  SENTENCE-LENGTH DISTRIBUTION OF TEXT #2G01
FIGURE 7.55  SENTENCE-LENGTH DISTRIBUTION OF TEXT #2G02
FIGURE 7.56  SENTENCE-LENGTH DISTRIBUTION OF TEXT #2H01
FIGURE 7.57  SENTENCE-LENGTH DISTRIBUTION OF TEXT #2H02
FIGURE 7.58  SENTENCE-LENGTH DISTRIBUTION OF TEXT #3B01
FIGURE 7.59  SENTENCE-LENGTH DISTRIBUTION OF TEXT #3B02
FIGURE 7.60  SENTENCE-LENGTH DISTRIBUTION OF TEXT #3C01
FIGURE 7.61  SENTENCE-LENGTH DISTRIBUTION OF TEXT #3C02
FIGURE 7.[number illegible]  SENTENCE-LENGTH DISTRIBUTION OF TEXT #3H02

APPENDIX I

CHI SQUARE RESULTS OF DISTRIBUTION OF 100 MOST COMMON WORD TYPES

TABLE XXXVI

DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE GRADE LEVELS OF THE CORPUS

THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS %, OF FREQ. TO TOTAL NO. OF WORDS IN GRADE

RANK  WORD             GRADE 8    GRADE 9   GRADE 10     TOTAL

  1.  THE               3859.0     9071.0     4589.0    17519.
                        3938.4     9159.7     4420.9
                         7.299      7.378      7.733
      CHI-SQUARE    8.85

  2.  OF                1949.0     3857.0     2373.0     8177.
                        1838.3     4275.3     2063.5
                         3.687      3.137      3.999
      CHI-SQUARE   94.03

  3.  AND               1462.0     3539.0     1458.0     6459.
                        1452.0     3377.0     1629.9
                         2.765      2.878      2.457
      CHI-SQUARE   25.97

  4.  A                 1436.0     3086.0     1399.0     5921.
                        1331.1     3095.7     1494.2
                         2.716      2.510      2.357
      CHI-SQUARE   14.36

  5.  TO                1326.0     3168.0     1427.0     5921.
                        1331.1     3095.7     1494.2
                         2.508      2.577      2.405
      CHI-SQUARE    4.72

  6.  IN                1108.0     2515.0     1399.0     5022.
                        1129.0     2625.7     1267.3
                         2.096      2.045      2.357
      CHI-SQUARE   18.75

  7.  IS                 891.0     2208.0      836.0     3913.
                         879.7     2045.9      987.4
                         1.685      1.796      1.409
      CHI-SQUARE   36.22

  8.  THAT               567.0     1064.0      635.0     2266.
                         509.4     1184.8      571.8
                         1.073      0.865      1.070
      CHI-SQUARE   25.80

  9.  IT                 532.0     1181.0      505.0     2218.
                         498.6     1159.7      559.7
                         1.006      0.961      0.851
      CHI-SQUARE    7.97

 10.  ARE                489.0     1278.0      398.0     2165.
                         486.7     1132.0      546.3
                         0.925      1.039      0.671
      CHI-SQUARE   59.13

D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 MOST F R E Q U E N T WORD T Y P E S A C R O S S T H E GRADE LEVELS O F THE CORPUS X X X V I TABLE RANK WORD G R A D E S 9 10 8 11. 499.0 FOR 1182.0 4 3 1 . 3 1119.4 0.944 YOU 469.9 1092.7 527.4 0.969 899.0 4 0 1 . 1 932.7 0.766 CHI-SaUARE AS 3 5 1 . 0 1 8 7 1 . 18. THIS LINES 485.0 1 7 8 4 . 19. 45U.2 360.0 1 4 5 9 . 328.0 762.8 368.2 263.0 650.0 378.0 1 2 9 1 . 290.2 675.0 325.8 0.874 ENTRY REPRESENT FRTO. TO TOTAL NU. UF WORDS IN GRADE 346.0 1 2 4 6 . 314.4 0 . 5 1 0 C.583 278.0 419.0413.0 249.5 580.4 2 8 0 . 1 CHI-SQUAKE EACh 0.637 4.27 0.526 FOR 627.0 280.1651.5 0.347 U F FIGUKES 0.529 1 1 . 8 5 CHI-SQUARE WAS 0 . 6 1 7 0.607 0.64 0.516 20. 0.534 759.0 0.817 206.0 1 6 5 0 . 1 4 8 5 . 340.0 273.0 BY 0.70S TOTAL 24.02 CHI-SOUAf(E FRE-JUENCY UF 374.7 0.49T 158.55 IT 776.4 CHI-SQUASE FREQUENCY EXPECTED 333.8 C.643 362.7 4 1 6 . 4 C.698 RATIO 317.0 0.597 0.731 1075.0 370.9 THREE ON 3.95 369.0 THE 17.
46.76  CHI-SQUAKE  870.0  0.896 0 . 5 9 1  CHI-SaUArtE  OR  2090.  978.2 4 7 2 . 1  405.0  298.0  0.564  77.87  0.791  15.  WITH  G R A D E S 9 10  8  CHI-SQUARE 354.0  420.6  AS  16.  540.3  1191.0  4 1 8 . 01102.0  14.  WORD  460.0 2 1 4 1 .  5<-5.0  CHI-S3UAAE  8E  RANK  TOTAL  16.oa  1.031  13.  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 MOST F R E Q U E N T WORD T Y P E S A C R O S S T H t GRADE LEVELS OF T H E CORPUS  0 . 9 6 1 0.775  CHI-SQUARE  12.  X X X V /  TABLE  1 1 1 0 .  0 . 3 4 1 0.696 111.16  THE THREi LINES U F FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY R A T I O A S St O F F R E Q . T O T O T A L N U . ;)F W O R D S I N G R A D E  rO Ul  CO  TABLE  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREUUENT WORD TYPES ACROSS THE GRADE LEVELS OF THE CORPUS  XXXVI  RANK WORD 21. HE  8  244.4  568.3  CHI-SQUARE 22. FROM  23. HAVE  556.3  268.5  237.6  552.6  266.7  0.456 CHI-SQUARE 25. WHICH  487.8  0.391 0.383  287.0  1054.  29.  199.9 464.8  224.3  YOUR  CHI-SQUARt  2.87  223.0  518.7 250.3 0.423  7.50  THE THREE LINES Or FIGURES FOR EACH ENTRY REPRESENT FREUUENCY EXPECTED FREQUENCY KATIU AS *, UF FREQ. TO TUTAL NO. OF WORDS IM GRADE  30.  230.0 THEY  889.  0.345  3.57 857.  216.3  0.417 0.249 31.01 481.0 120.0  853.  191.8 446.0 215.3 0.482  992.  0.400  192.7 448.1  255.0  0.484  251.0  1.51 205.0  0.371  933.  235.4  192.0 492.0  CHI-SQUARE  484.0  CHI-SQUARE  CAN  551.1 266.0  0.394  209.7  TOTAL  196.0 513.0 148.0  0.490  257.0  0.486  28.  9.38  0.428  481.0 227.0  CHI-SQUARE 291.0 1057.  241.0 526.0  225.0  0.363  0.475  503.0  236.9  2.7. NOT  263.0  CHI-SQUAKE  AT  1064.  1.90  0.409  G R A D E S 9 10  8  CHI-SQUAKE  239.2  0.434  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORO TYPES ACROSS THE GRADE LEVELS OF THE CORPUS  0.426  9.71 282.0  0.497  24.  ONE  0.527  534.0  CHl-SQUA-iE  26.  274.3  248.0  0.469  WORO  TOTAL  521.0 313.0 1087.  
0.424  XXXVI  RANK  G R A D E S 9 10  253.0  0.479  TABLE  0.391 0.202 65.75 452.0  171.0  833.  187.3 435.5 210.2 0.435 CHI-SQUARE  0.368  0.288  17.69  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT FREQUENCY EXPECTED FREQUENCY RATIO AS S, OF FREQ. TO TOTAL NO. OF WORDS IN GRADE  TABLE  XXXVI  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREOUENT WORD TYPES ACROSS THE GRADE LEVELS OF THE CORPUS  RAN< WURD 31. WE  8  G R A D E S 9 10  253.0  219.0 361.0  CHI-SQUARE 2C4.0 HIS  CHI-S3UA<E  WILL  407.0  816.  ALL  CHI-SQUARE  24.64 779.  THESE  CHI-SQUARE 187.0  772.  173.6 403.6 194.8 0.333 0.315 0.42  THE THREc LINES OF FIGURES FOR EACH ENTRY REPRESENT FREJUEUCY EXPECTED FREQUENCY RATIO AS %, UF FRcQ. TO TOTAL NU. OF WORDS IN GRADE  0.298  0.288  0.82 163.0  0.290  40.  0.259  0.275  639.  161.3 0.290  1.53  126.0 358.0 MAY  663.  0.53  143./ 334.1 0.280  36.56  681.  171.8  148.0 319.0 172.0  0.388 0.214  176.0 409.0  CHI-SQUARE  $9.  171.0  149.0 346.6 167.3 0.272  0.265  127.0  20.99  144.0 356.0 OUT  175.1 407.3 196.6  0.333  38.  700.  0.226  153.1 356.1 0.272  816.  134.0  0.345  144.0 366.0  CHI-SQUAKC  205.9  CHI-SQUARE  37.  TOTAL  157.4 366.0 176.6 0.269  205.9  133.4 426.6  0.331  AN  WHEN  3.61  175.0 477.0  35.  215.0  G R A D E S 9 10  142.0 424.0  0.331 0.362  0.403  8  CHI-SQUARE  157.0  CHI-SQUARE  IF  36.  240.98  164.0 495.0  0.310  34.  WORD  0.178 0.608  133.4 426.6 0.386  33.  826.  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE GRADE LEVELS OF THE CORPUS  XXXVI  RANK  185.7 431.9 208.4 0.479  32.  TOTAL  TABLE  97.0  581.  130.6 303.8 146.6 0.238 CHI-SQUARE  0.291  0.163  26.63  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT FREQUENCY EXPECTED FREQUENCY RATIO AS Z, OF FREQ. TO TOTAL NO. OF WORDS IN GRAOE  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE GRADE LEVELS OF THE CORPUS  XXXV/  TABLE  RANK WORD 41.  G R A D E S 9 1 0  8  142.0 301.0 126.0  THERE  1 2 7 . 
9 297.5 0.269 CHI-SQUARE  +Z.  0.245  160.0  47. WERE  0.218  CHI-SQUAKE  0.219  THEIR  1 2 0 . 7 280.3  135.5  0.243  0.195  4 . 0 1  The THREE LINES O FFIGURES F O R EACH ENTRY REPRESENT FREQUENCY EXPECTED FREQUENCY R A T I O A S %, U F F R E Q . T O T O T A L N O . C;T W O R D S I N G R A D E  50.  0.300  152.0  4 9 8 .  125.7  0.184  0.256  1 0 . 6 3  117.0 310.0 USED  5 1 0 .  12U.7  0.178  112.0 260.4  CHI-SQUAKE 5 3 7 .  0.332  27.43  0.227  U.226  116.0  0.145  120.0 226.0  3 . 2 0  1 2 2 . 0 299.0  0.231  49.  5 1 8 .  130.7  114.7 266.6  CHI-SQUARE 5 4 2 .  0.197  7 1 . 4 8  0.214  0.303  134.0  133.5  113.0 219.0 178.0 HAD  139.3  121.8 283.4 1 3 o . d  CHI-SQUARE  48.  5 Z 9 .  5 . 7 6  0.270  5 5 2 .  117.0  0.247  1 1 6 . 5 270.8  0.270  180.0  TOTAL  143.0 178.0 197.0  1 6 . 6 2  0.263  SOME  0.204  5 5 8 .  304.0  118.9 276.6  CHI-SQUAKE  139.0 269.0  45.  MORE  140.8  0.241  124.1 288.6  CHI-SOUAKE  OTHER  108.0  7 . 0 6  0.197  44.  46.  G R A D E S 9 10  8  CHI-SQUARE  104.0 268.0 I  5 6 9 .  0.212  125.4 291.7  CHI-SQUAKE  43.  WORO  3 . 7 5  0.193  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE GRADE LEVELS OF THE CORPUS  XXXV/  RANK TOTAL  143.6  102.0 296.0  HAS  TABLE  U 0 . 4 0.221 CHI-SQUARE  256.7  6 4 . 0  4 9 1 .  123.9  0.252  0.108  40.42  to Tht THREE LINES O F FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY R A T I U A S Z, O F F R E Q . T O T O T A L N O . OF W O R D S I N G R A D E  Ul  (JT  XXXV/  TABLE  RAN<  D I S T R I B U T I O N OF O C C U R R E N C E OF T H E 100 MOST F R E Q U E N T WORD T Y P E S A C R O S S THE GRADE L E V E L S OF THE C O R P U S  51. MANY  G R A D E S 9 10  8  WORD  118.0  245.0  109.7  255.I 123.1  0.223 CHI-SQUARE 52.  0.206  4 9 1 .  57.  121.4  CHI-SOUAKE  58. THAN  59.  0.172 CHI-SQUARE  0.216  BEEN  0.179  0.57  95.3 221.7 0.149  60. INTO  107.0  0 . 1 6 0 0.249  2 1 . 2 6  9 6 . 0 237.0  9 0 . 0  95.1 221.2  0.180  4.64  0.176  79.0 197.0 148.04 2 4 .  
CHI-SQUARE  116.8  4 2 5 .  9 5 . 5 222.2 1 0 7 . 2 0.193  3.11  1 0 4 . 1 2 4 2 . 1  0 . 1 5 8 0.202 8.53  CHI-SQUARE  107.04 6 3 -  0.108  102.0 217.0 106.0  0 . 1 8 3 0.207  9 1 . 0 265.0  0.216  9 6 . 2223.8 1 0 8 . 0 0.216  5.21 123.04 6 5 .  4 3 9 .  26.63  CHI-SQUARE 101.V 4 7 6 .  6 4 . 0  114.0 194.0 120.04 2 8 . WHAT  104.5 243. 1 117.3 0.221  ABOUT  0.211  9 5 . 0  TOTAL  9 8 . 7 229.5 1 1 0 . 8  0.206 0 . 1 7 0  1 1 7 . 0 225.0  55.  SHOULD  1 0 7 . 0 248.9 1 2 0 . 1  CHI-SQUARE  TWO  1 0 9 . 0 266.0  1 3 . 4 6  0.231  54.  56.  G R A D E S 9 10  8  0.203 U . 1 6 0  1 2 2 . 0 253.0 EACH  WORD  CHI-SQUARE  1C8.1 251.5  CHI-SilUA< =  D I S T R I B U T I O N OF OCCURRENCE OF T H E 100 MDST F R E Q U E N T WORD T Y P E S A C R O S S T H E GRADE L E V E L S OF THE C O R P U S  XKXVI  RANK  1.06  0.259  53.  1 2 5 . 04 8 8 .  0.199  1 3 7 . 0 249.0 SO  TOTAL  TABLE  0.182 CHI-SQUARE  0.193  4 2 3 .  106.7 0.152  3.77  NJ T H E T H R E E L I N E S O F F I G U R E S FOR E A C H E N T R Y R E P R E S E N T FREQUENCY EXPECTED FREQUENCY R A T I U A S 2 , U F F K E Q . TO T O T A L N O . O F WORDS I N GRADE  THE THREE L I N E S O F FIGURES F O R EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY R A T I O A S %, O F F R E Q . T O T U T A L N O . C F W O R D S I N G R A D E  CTl  TABLE  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 HOST F R E Q U E N T WORD T Y P E S A C R O S S T H E GRADE L E V E L S OF THE CORPUS  XXXVI  RiNK WORD  G R A D E S 9 10  8  61.  109.0 220.0 THEM  94.6  62.  0.183  63.  92.4 0.202 CHI-SQUARE  (4. 00  .  255.0  4 9 . 0  67. THEN  69.9  209.1  IOC.9  0.174 0.184 2 . 6 1  64.0  238.0  69.9  209.1 100.9  0.121  4 1 1 .  68. TIME  '  7 9 . 0  4 0 6 .  69. ITS  218.0  68.4  205.5 9 9 . 2  81.0  240.0  8 2 . 0  90.6  210.7 101.7  4 0 3 .  0.195 0.138 8 . 9 0  THE THREE L I N E S U F F I G U R E S FOR EACH ENTRY R E P R E S E N T FREQUENCY EXPECTED FREQUENCY R A T I O A S i, O F F K L Q . T O T U T A L N O . 
C F W O R D S I N G R A D E  70. WOULD  0.177 0.177  76.0  185.0  86.8  201.8  CHI-SQUARE  7 . 6 3  105.0 3 9 3 .  4 . 9 2  0.144  0.190 0.133  4 0 0 .  0.194 0.165  70.0  CHI-SQUARE  38.66  9 8 . 0  1 1 . 5 4  0.132  0.083  212.3 102.5  0.153  109.04 0 0 .  CHI-SQUARE  91.3  CHI-SQUARc  4 1 7 .  214.9 103.7 0.207  214.0  0.146  0.113  234.0  CHI-SQUARE  UP  6 7 . 0  TOTAL  77.0  CHI-SQUARE  93.0  0.176  65.  SUCH  218.0 105.2 0.206  G R A D E S 9 10  8  46.  1 9 . 6 1  107.0 MAKE  4 2 1 .  WORO  4 . 0 9  43.7  CHI-SQUARE  9 2 . 0  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 MOST F R E Q U E N T WORD T Y P E S A C R O S S T H E GRADE L E V E L S O FTHE CORPUS  XXXVI  0.179 U.155  -97.0 253.0 USE  RANK TOTAL  220.1 106.2  0.206 CHI-SQ'JASE  TABLE  9 7 . 4  0.150 0.211  1 0 . 5 6  94.0  162.0  83.0  192.9  0.176 CHI-SQUARE  125.0 3 8 6 .  90.0 3 6 9 . 9 3 . 1  0.132 0.152 6 . 5 3  t-O Ul  THE THREE LINES UF FIGURES FREQUENCY EXPECTED FREUUENCY R A T I U A S X, U F F R E Q .  FUR EACH  TO TOTAL  ENTRY  NO. O F WORDS  REPRESENT:  IN  GRADE  ~J  TABLE  XXXVI.  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 MOST F R E O U E N T WORD T Y P E S A C R O S S T H E GRAOE L E V E L S O F THE. CORPUS  RANK WORD 71.  G R A D E S 9 10  8  8 5 . 0 204.0 HOW  82.7 192.4 0.161  3 6 8 .  91.6  72.0  0.163  90.3  0.141  0.175  3.37  THE THREE L I N E S OF F I C U U S F O R E A C h ENTRY REPRESENT FREQUENCY EXPECTEO FREQUENCY R A T I O A S Z, O F F R E Q . T O T O T A L N O . C F W O R D S I N G R A D E  0.156  0.150  53.0  76.9 178.8  WATER  86.3  0 . 1 7 6 0.089  0.161  U.152  10.32  60.0 174.0 ALSO  340.  7 6 . 4 1 7 7 . 88 5 . 8 0.098  SO.  342.  20.78  CHI-SQUARE  89.1  88.1  52.0 198.0 90.0  5.00  79.4 184.6 0.144  79.  0.121  76.0 173.0104.0 3 5 3 .  CHI-SQUAKE  MUST  0.138  3 5 8 .  0.148  1.42  CHI-SiUARE  60.5 187.2  CHI-SOUAkE  78.5 182.5  73.0 216.0  17.76  0.161  MOST  78.  0 . 1 7 1 0.096  85.0 201.0  75.  NO  0.132  3 6 3 .  
0.131 9.52  CHI-SQUARE 57.0  88.6  70.0 192.0 89.0 349.  0.256  81.6 139.8  CHI-SQUAKE  OUT  77.  TOTAL  88.0 351.  78.9 183.5 0.193  99.57  0.182  74.  102.0 161.0 ONLY  92.4  0.080  96.0 210.0  G R A D E S 9 10  8  CHI-SQUARE  62.3 191.4  CHI-SQUARE  MADE  76.  2.83  0.223  73.  WORD  0.133  118.0 98.0 152.0 3 6 6 .  NUMBER  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E GRADE L E V E L S OF THE CORPUS  XXXVI  RANK TOTAL  9 2 . 9  0.166  CHI-SQUAKE 72.  79.0  TABLE  78.0  70.1 163.1 0.113 CHI-SQUARE  0.142  312.  78.7 0.131  2.20  THE THREE L I N E S U F F I G U R E S F O R EACH ENTRY R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R A T I U A S 2 O F F R E Q . T O T O T A L N U . C-F W O R D S I N G R A D E t  CO  TA8LE  XXXV/  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 MOST F R E O U E N T WORD T Y P E S A C R O S S T H E GRADE L E V E L S OF THE CORPUS  RANK NCRD 81. FIRST  G R A 0 ES 9 10  8 73.0  142.0  69.2  161.0  0.138 CHI-SCUARE 82. VERY  157.C  be.8  160.0  168.0  68.1  158.4  CHI-SQUARE  HIM  SAME  160.0  66.3  154.2  64.3  149.5  CHI-SQUAPE  69.0  95.0  58.5  135.9  CHI-SQUARE 7a.0 3 ? 6 .  87.  7 7 . 2  WHO  0.077  64.0  132.0  58.5  135.9  CHI-SQUARE 42.0 3 0 3 .  S3.  7 6 . 5  ANY  7 4 . 4  58.0  126.0  57.3  133.3  7 2 . 2  0.111 0.133  144.0  BECAUSE  57.1  132.8  0.113  2 . 2 0  THE THREE L I N E S Of F I G U R E S FOR EACH ENTRY R E P R E S E N T FREQUENCY EXPECTED FREQUENCY R A T I O A S Z, O F F R E O . T O T O T A L N O . O F W O R D S I N G R A D E  SEE  64.0 2 6 0 . 6 5 . 6  0.117  64.3  50.0 2 5 4 . 6 4 . 1 0.084  4 . 1 9  67.0  133.0  56.9  132.3  0.127 CHI-SQUARE  71.0 2 5 5 .  1.10  60.0  90.  0.162  0.102 0.120  89.  CHI-SQUARE 79.0 2 8 6 .  6 5 . 6  0 . 6 8  0.110  2 9 5 .  96.0 2 6 0 .  0.107 0.108  CHI-SQUARE 7 1 . 0  TOTAL  2 8 . 3 1  0.121  0 . 4 6 137.0  G R A D E S 9 10  8  0.131  0.130 0.120  70.0  0.132  COULO  25.20  0.121  85.  
7 7 . 7  86.  0.137 0.071  t>4.0  CHI-SQUAKE  93.0 3 0 8 .  0 . 3 3  0.176  84.  WORO  0.128 0.128  93.0  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 MOST F R E Q U E N T WORD TYPfcS A C R O S S T H E GRADE L E V E L S OF THE CORPUS  RANK  TOTAL  0.115 0.157  73.0  CHI-SQUAKE  GOOO  XXXV/  5 . 4 6  0.138  83.  TABLE  0.108  53.0 2 3 3 . 6 3 . 8 0.089  3 . 6 5  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT FREQUENCY EXPECTED FREQUENCY R A T I U A S %t O F F R E Q . T O T U T A L N O . O F W O R D S I N G R A D E  T ABL E  XXXVI  D I S T R I B U T I O N OF OCCURRENCE OF T H E 1 0 0 MOST F R E Q U E N T WORD T Y P E S A C R O S S T H E GRADE L E V E L S OF THE CORPUS  RAN< WORD 91. LIKE  8  129.0  55.1  128.1  CHI-SQUARE  MUCH  CALLED  104.0  53.1  123.4  95. PLACE  0.085  THROUGH  52.2  121.3  58.5  97. WORK  38.  59.6  NEW  0.121  52.6  122.3  59.0  234.  99. SMALL  0.118  123.0  52.4  121.8  121.3  58.5  233."  58.8  0.100 0.061 17.77  THE THREE LINES OF F I G U R E S FUR EACH ENTRY R E P R E S E N T : FREQUENCY EXPECTED FREUUENCY RATIO AS S t OF F R E Q . TO T O T A L NO. DF WOROS I N GRADE  100. OVER  51.3  119.2  0.044  105.0  0.069  117.0  50.1  116.6  57.5 0.177  0.095  58.0  223.  56.3 0.098  0.15  52.0  123.0  45.0  49.5  115.0  55.5  CHI-SQUARE  228.  52.40  48.0  O.Osb  232.  34.03 05.0  CHI-SQUARE 36.0  0.133  38.0  0.091  3.09  74.0  5 2 .2  CHI-SQUARE 70.0  232.  3.79 26.0  0.072  TOTAL  0.088  163.0  0.031  236.  0.111  43.0  CHI-SCUARE 72.0  lll.O  CHI-SQUARE  52.0  0.115  53.0  0.140  136.0  0.083  239.  G R A D E S 9 10  44.0  6.56  0.090  8  96.  1.65  60.0  CHI-SQUARE  WORD  CHI-SQUARE  60.3  0.094  D I S T R I B U T I O N OF OCCURRENCE OF T H E 1 0 0 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E GRADE L E V E L S OF THE CORPUS  RANK  3.01  125.0  0.100  245.  XXXVI  0.088  53.7  CHI-SOUA^E 94.  0.105  68.0  0.113  TOTAL  61.8  116.0  CHI-SQUARE  PEOPLE  52.0  55.0  0.104  93.  G R A D E S 10  64.0  0.121  92.  
9  TABLE  0.100  2?0.  0.076  2.68  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT FREQUENCY EXPECTED FREQUENCY R A T I O A S S t O F F R E Q . T O T O T A L N G . UF W O R D S I N G R A D E  TABLE  XXXVU  DISTRIBUTION OF OCCURRENCE OF T H E 100 MOST F R E Q U E N T WORD T Y P E S A C R O S S T H E SUBJECT AREAS OF THE CORPUS  RANK WORD 1. THE  8  C  S U B J E C T S D E 2652.0  2767.0  1263.0  3266.0  3289.0  1501.8  30C3.0  3670.4  2332.3  1327.0  2815.7  2876.9  607.0  6.945  845.0  701.0 1401.6  1713.2  1088.6  2.950  2.775  2.700  757.0  1636.0  1782.0  6 1 9 . 41314.2  1342.8  4.251  4.330  287.0  746.0  1190.0  553.7  1353.2  859.9  489.2  1038.1  1060.7  1107.1  3.119  3.356  2.732  1.612  1369.0  797.0  525.0  507.6  1014.9  1240.5  788.3  448.5  8177.  7. IS  1.974  6459.  8.  3.082  2.779  2.546  1033.0  2.948  951.6 2.734  703.0  1046.0  620.0  358.0  799.0  1052.2  668.6  3R0.4  607.1  860.8  1.928  9.  972.3  IT  1.821  975.0  1494.0  595.0  421.0  859.0  507.6  1014.9  1240.5  788.3  448.5  951.6  3.033  2.220  2.364  2.273  878.0 972.3 2.274  100.72  THE THREE L I N E S OF F I G U R E S F C R EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY R A T I O A S %, O F F R E Q . T O T O T A L N O . O F W O R D S I N S U B J E C T  10. ARE  2.114 2.619  903.0  797.0  427.0  595.0  441.0  335.4  670.7  819.8 520.9  296.4  628.9  642.6  1.007  1.833  2.546  2.398  194.3 388.4  1.575 1.142  414.0 204.0  307.0  425.0  234.0  474.7  171.6  364.2  372.1  1. 1 9 6  301.7  0.840  0.652  1.724  1.125  2266  0.606  230.28  176.0 478.0  503.0  337.0  140.0  352.0  190.1 380.2  464.7  295.3  166.0  356.5  1.186  1.021  1.077 0.786  0.932  232.0  2218  364.2 0.601  87.99  213.0 196.0 700.0  364.0  181.0 288.0  241.0  185.6 371.1 453.6  288.2  164.0  355.5  CHI-SQUARE  3913  382.47  482.0  1.057  5022  824.7  406.0  0.873  5921.  1.981 2.010  1011.0  TOTAL  56.11  CHI-SQUARE  599.0  2.124  H  350.0  0.992  5921.  
G  411.0 777.0  200.0 THAT  F  430.5  1.737  108.59  2.419  SUBJECTS D E  CHI-SQUARE  992.0  2.462  C  2.039  280.64  502.0  CHI-SQUARE  IN  B  CHI-SQUARE 855.0  2.972  6.  4.616  1653.0  CHI-SQUARE  TO  WORD  8.519  471.0 12a7.0  2.491  5.  8.643  RANK  422.35  CHI-SQUARE  A  7.092  17519.  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  CHI-SQUARE  1367.0  2.337  4.  8.904  TOTAL  XXXVII  520.19  CHI-SQUARE  AND  5.384  1189.0  3.012  3.  H  2799.0  CHI-SQUARE  OF  G  1463.0  7.259  2.  F  TABLE  0.486  1.421  1.163  348.0  1.016  0.762  2165  0.624  289.44  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  XXXVII  TABLE  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  RANK WORD 11. FOR  C  258.0  317.0  625.0  265.0  183.5  367.0  448.6  285.0  1.280  0.787  CHI-SQUARE 12. YOU  326.0  162.2  344.1  351.6  0.847  0.F14  0.543  2141.  16. WITH  53.0  179.2  358.2  437.9  278.2  158.3  335.9  343.2  1.285  0.447  l.uiO  0.916  2090.  17. ON  185.0  663.0  325.0  121.0  234.0  152.0  160.4  320.7  392.0  249.1  141.7  300.7  307.2  1.346  1.038  0.679  0.619  1871.  18. THIS  135.0  309.0  406.0  214.0  148.0  274.0  303.0  153.4  306.7  374.8  238.2  135.5  287.5  293.8  0.824  0.684  0.831  BY  636.0  319.0  Tl.'o  141.4  282.8  345.7  219.7  125.0  0.412  224.0  200.0  127.3  254.5  311.1  197.7  112.5  236.7  243.9  0.705  1.291  1.019  0.399  198.0 265.2 0.524  122.0  1650.  271.0 0.316  459.29  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  20. WAS  0.739  0.738  0.472  0.593  1485.  
0.518  40.75  165.0  210.0  276.0  215.0  97.0  244.0  252.0  125.1  250.1  305.7  194.2  110.5  234.5  239.6  0.521  0.560  0.687  0.545  0.646  1459.  0.653  26.95  163.0  136.0  148.0  211.0  165.0  230.0  238.0  110.7  221.3  270.5  171.9  97.3  207.5  212.0  0.300  0.337  0.674  0.927  0.609  1291.  0.616  173.81  126.0  166.0  199.0  167.0  112.0  106.8  213.6  261.0  165.9  94.4  0.625  0.785  TOTAL  84.0  0.412  CHI-SQUARE  166.0  CHI-SQUARE  19.  9.34  138.0  0.68s  0.725  1789.  H  231.0  CHI-SQUARE  370.04  G  364.0  0.809  6.394  F  284.0  CHI-SQUARE  191.0  0.767  SUBJECTS D E  98.0  0.819  0.137  601.48  0.459  C  CHI-SQUARE 346.0  0.906  B  0.486  0.844  187.0  CHI-SQUARE  OR  205.0  140.0  0.670  15.  145.0  WORD  TOTAL  633.0  CHI-SQUARE  AS  H  365.0  0.943  14.  C  366.0  CHI-SQUARE  BE  F  167.74  1.816  13.  1.269  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  XXXVII  RANK  SUBJECTS D E  B  TABLE  450.0  95.2  190.3  CHI-SQUARE  0.534  278.0  200.3  204.6  0.629  0.524  1246.  0.720  58.44  24.0  0.119  0.404  198.0  39.0 232.6  1.117  0.079  41.0  59.0  136.0  147.8  84.1  178.4 182.3  0.131  0.3)1  0.360  361.0  1110.  0.935  838.81  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  TABLE  XXXVII  WORD 21.  B  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  C  D  79.0 520.0 HE  93.2 186.3 227.7 0.392  1.290 0.380  CHI-SQUARE 22.  187.0  0.367  CHI-SQUARE 23.  0.384  0.447  CHI-SQUARE  WHICH  82.3 0.236  174.7 0.363  141.7 0.438  55.0 242.0 80.6 171.0 0.309  0.640  0.589  0.243  80.1 0.640  0.365  0.337  1064  27.  174.7 0.526  79.8  0.420  0.319 0.275  169.4  28.  84.0 CAN  0.295  0.58S  29. 
YOUR  0.363  85.0 170.0 207.8  75.1 159.4  0.263  0.402  132.1  0.412 0.472  205.0  0.888  992  162.9  0.529 0.531  49.56  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  30.  0.499  0.479  G  H  TOTAL  77.0  0.246  86.0 240.0  163.0  0.213 0.487  CHI-SQUARE  0.679  83.0  150.0 0.476  153.2 0.344  122.0 102.0  67.3 142.9 0.'«94  85.0  0.323  162.0  64.9 137.7  0.521 0.477  889.  146.0 0.264  0.429  37.0  857.  140.7 0.096  155.02 23.0  0.337  0.658  0.073  37.0 138.0 64.6 0.208  19.0  853.  137.1 140.1 0.365  0.049  460.80 75.0  71.4 142.8 174.5 110.9 0.293  70.7  933.  63.18  60.0 171.0 266.0 THEY  0.345  73.1 146.2 178.7 113.6  CHI-SQUARE 84.0 200.0  F  51.61  179.0 136.0 324.0  62.59  70.0 106.0 198.0 129.0  0.369  73.5 146.9 179.5 114.1 0.417  173.1  0.350  76.2 152.4 186.3 118.4 0.313  140.0 1054  SUBJECTS D E  63.0 201.0 236.0 NOT  C  80.0 159.9 195.5 124.2  CHI-SQUARE  140.3  0.556  B  68.0 141.0 182.0 108.0 121.0 180.0 133.0  0.260  169.9 173.6  90.4 180.7 220.8  CHI-SQUARE  26. ONE  76.0 114.0 138.0 114.0 1057  49.0 222.0  0.347  WORD  176.5  203.0  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  RANK  CHI-SQUARE  100.0  CHI-SQUARE 25.  42.0 137.0 108.0 1087  207.0  0.55b  TOTAL  124.45  112.0 224.0 AT  H  CHI-SQUARE  90.6 181.2 221.5 140.7 0.719  24.  0.045  G  53.98  145.0 180.0 290.0 HAVE  144.7  189.0 137.0  91.2 182.4 222.9 0.447  14.0  F  XXXVII  780.81  90.0 148.0 FROM  E  TABLE  0.424  0.540  0.240  23.0 148.0 110.0 63.1  133.9  0.129 0.392  833.  136.8 0.285  99.18  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  TABLE  RANK  WORD  31.  B  C  SUBJECTS D E  115.0 134.0 WE  70.8  H  72.0  WORD  TOTAL  42.0  826.  36. 
WHEN  132.8 135.6  2.325  70.0  139.9  171.0 108.6  61.8  0.859  0.357  72.0  70.0  139.9  0.035  77.0  113.0  816.  37. ALL  131.1 134.0  0.157  0.204  63.0  128.0  171.0  61.8  131.1 134.0  108.6  0.510 0.383  0.354  0.339  46.0  816.  38. BUT  91.0  66.8  133.5 0.226  102.0  90.0  118.0  163.2  103.7  59.0  125.2 127.9  0.564  0.326  0.505  0.312  32.0  THESE  50.U  135.0 124.0  66.2  132.3  161.7 102.8  58.5  124.1 126.8  0.335  0.288  120.0 146.7  53.0  112.5 115.0  0.268  0.387  0.281  0.357  772.  0.321  7.97  40. MAY  0.392  93.2 0.377  0.253  0.328  49.0  700.  0.127  62.82  86.0  136.0 159.0  61.0  49.0  58.4  116.7  90.7  51.6  142.7  0.337  0.323  86.0  104.0  681.  109.5 111.8  0.195 0.?75  0.228  0.269  33.52  36.0  195.0  135.0  47.0  34.0  56.8  113.6  138.9  88.3  50.2  0.484  0.274  96.0  120.0  106.6 108.9  0.150 0.191  0.254  0.311  69.0  64.0  103.0  76.0  87.0  126.0  54.8  109.5  133.9  85.1  48.4  102.7 104.9  0.159 0.209  0.243  0.489  0.333  114.0  67.56 26.0  277.0  65.0  39.0  60.0  46.0  49.8  99.6  121.7  77.3  44.0  93.4  95.4  CHI-SQUARE  639.  0.295  68.0  0.337  663.  92.70  CHI-SQUARE  142.0 121.0  TOTAL  60.0  0.342  182.96  135.0  CHI-SQUARE  39.  0.083  65.0  0.323  779.  H  124.0  CHI-SQUARE  278.0  G  45.0  0.179  0.119  F  108.0 193.0 118.0  CHI-SQUARE  191.64  68.0  SUBJECTS D E  63.0  0.427  0.293  251.0 120.0  0.179  C  0.313  0.191 0.109  436.17  136.0  B  CHI-SQUARE 28.0  CHI-SQUARE  AN  0.045  11.0  0.337  35.  0.085  176.0  CHI-SQUARE  IF  62.6  346.0  0.675  34.  173.1 110.0  75.0  CHI-SQUARE  WILL  414.0  G  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  XXXVII  2277.49  0.372  33.  RANK F  14.0  0.333  CHI-SQUARE  HIS  42.0  141.6  0.571  32.  TABLE  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  XXXVII  0.065  0.562  0.208  0.219  581.  
0.159 0.119  299.16  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  TABLE  XXXVII  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  RANK WORD 41. THERE  B  C  SUBJECTS D E 117.0  77.0  49.0  63.0  103.0  48.8  97.5  119.2  75.8  43.1  91.5  93.4  0.323  127.0  47.8  95.6  116.9  42.3  89.7  91.6  0.176  94.6  0.207  7 4 . J 0.323  0.146  0.212  10.0  4.0  57.0  23.0  115.6  73.5  41.8  88.7  90.6  0.020  0.032  0.022  0.151  142.0  69.0  46.0  89.0  46.5  92.9  113.6  72.2  41.1  87.1  47. WERE  552.  48.  0.220  0.258  0.236  95.0  49.  89.0  THEIR  0.246  165.0  50.0  50.0  77.0  94.0  46.0  92.0  112.5  71.5  40.7  86.3  88.2  0.335  76.0  26.0  75.0  105.0  45.3  90.7  110.8  70.4  40.1  85.0  86.9  0.186  0.160  0.281  0.204  0.243  44.53  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  50. USED  0.243  0.146  0.198  136.0  27.0  28.0  32.0  67.0  212.0  44.4  88.8  108.5  69.0  39.2  83.3  85.1  43.7  0.337  0.055  0.089  0.180  0.177  322.78  258.0 87.4 0.640  34.0  19.0  17.0  59.0  116.0  106.8  67.9  38.0  82.0  83.7  0.069  0.061 0.095  0.156  510.  0.300  479.54 102.0  126.0  22.0  13.0  80.0  126.0  42.7  85.4  104.3  66.3  37.7  80.0  81.6  0.253  0.256  0.070  0.073  0.212  498.  0.332  85.43  29.0  34.0  141.0 150.0  39.0  63.0  42.1  64.2  102.9  3T.2  78.9  CHI-SQUARE  518.  0.549  27.0  0.144  529.  0.272  16.0  0.134  537.  0.272  TOTAL  19.09  CHI-SQUARE  69.0  H  134.0  0.035  542.  
G  75.0  7.0 HAD  F  38.0  0.079  0.060  32.0  CHI-SQUARE  558.  19.78  0.171  SUBJECTS D E  CHI-SQUARE  62.0  0.288  C  0.189  1389.72  0.154  B  CHI-SQUARE  39.0  0.159  MORE  0.329  10.0  1.047  CHI-SQUARE  SOME  46.  39.02  422.0  0.194  WORD  CHI-SQUARE 80.0  47.3  569.  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  RANK  0.267  26.0  CHI-SQUARE  45.  0.167  102.0 101.0  0.129  OTHER  0.275  71.0  26.0  44.  0.246  TOTAL  XXXVII  28.74  CHI-SQUARE  I  0.238  51.0  0.253  43.  H  130.0  CHI-SQUARE  HAS  G  30.0  0.149  42.  F  TABLE  0.084  0.286  65.4 0.479  0.219  0.180  30.0  491.  30.6 0.078  191.07  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  XXXVII  TABLE  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  RANK WORD 51. MANY  B  C  SUBJECTS D E  38.0  115.0  75.0  53.0  84.0  41.8  83.6  102.2  65.0  37.0  78.4  0.094  CHI-SQUARE  SO  82.0  41.6  41.2  82.4  100.8  64.0  36.4  77.3  79.0  0.273  0.268  0.163  0.230  SHOULD  37.0  87.0  109.0  40.8  81.6  99.7  63.4  36.1  76.5  481.  57. WHAT  0.195 0.118  0.489  0.288  44.0  476.  58.  78.2  THAN  0.114  77.0  52.0  75.0  67.0  79.0  82.0  39.9  79.7  97.4  61.9  35.2  74.7  76.4  0.240  0.376  0.209  75.2  465.  59. BEEN  0.212  238.0 92.0  0.102  96.0  70.0  35.0  79.0  88.0  39.7  79.4  97.0  61.6  35.1  74.4  0.112  0.444  0.233  62.0 76.0 0.161  83.75  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  60. INTO  G  38.0  29.0  31.0  58.4  33.3  70.6  0.121  0.163  H 11.0  TOTAL 439.  
72.1  0.082  39.0  121.0  55.0  12.0  55.0  107.0  36.7  73.4  69.7  57.0  32.4  68.8  0.300  0.028  0.112  0.038  0.309  39.0  428.  70.3  0.283  0.101  130.87  24.0  59.0  101.0  64.0  31.0  65.0  36.4  72.8  89.0  56.6  32.2  68.3  0.146  0.205  0.204  0.174  81.0  425.  69.8  0.172 0.210  11.46  41.0  94.0  58.0  42.0  26.0  63.0  100.0  36.3  72.7  38.8  56.4  32.1  68.1  69.6  0.203  463.  0.483  F  333.82  0.233  CHI-SQUARE  33.0  0.142  37.6  0.119  54.55  0.238  41.0  CHI-SQUARE  33.0  0.106  SUBJECTS D E  51.0  0.194  154.13  0.191  C  CHI-SQUARE  96.0  0.089  B  0.253  47.87 36.0  CHI-SQUARE  56.  0.217 0.106  67.0  0.164  WORD  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  CHI-SQUARE 41.0  CHI-SQUARE  ABOUT  RANK  0.233  51.0  0.164  55.  0.222  488.  80.1  132.0  CHI-SQUARE  TWO  0.298  90.0  110.0  0.332  54.  0.240  TOTAL  XXXVII  38.49  CHI-SQUARE  EACH  0.233  H  24.0  0.119  53.  G  33.0  0.164  52.  F  TABLE  0.118  0.134 0.146  0.167  424.  0.259  36.05  19.0  105.0  54.0  82.0  11.0  74.0  36.3  72.5  88.6  56.3  32.0  68.0  0.094 CHI-SQUARE  0.261  0.110  0.262  0.062  78.0  423.  69.5  0.196  0.202  63.42  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  TABLE  XXXVII  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  RANK WORD  61. THEM  B  C  SUBJECTS D E  35.0  59.0  36.1  72.2  56.0  31.9  67.7 69.1  68.2  0.241 0.260  55.0  56.0  54.0  35.7  71.5  87.4  55.5  31.6  67.0 68.5  0.144 0.246  UP  27.0  0.176 0.314 0.143  67. THEN  173.0  72.0  21.0  36.0  35.2  70.5  86.1  54.7  31.1  66.1 67.5  0.099  0.351  0.230  0.118  0.095  411  68. TIME  0.075  62.0 115.0  33.0  45.0  6 6 . 
0  34.8  69.6  54.1  30.8  65.3 66.7  85.1  0.203  0.233  0.105 0.253  0.175  20.0  406.  69. ITS  0.052  64.0  82.0  7.0  34.5  84.4  53.7  30.5  69.1 0.273  34.3  68.6  83.8  53.3  30.3  64.3 65.7  0.119 0.209  0.130 0.262  0.039  52.0  45.0  403.  64.8 66.2 0.138  0.117  73.66  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  70. WOULD  70.0  0.157 0.168 0.172  11.21 94.0  62.0  61.0  47.0  69.0  34.3  68.6  83.8  53.3  30.3  64.3 65.7  0.233  0.126 0.195 0.264  22.0  0.183  58.19 67.0  109.0  30.0  16.0  52.0  33.7  67.4  82.3  52.3  29.8  63.2 64.5  0.166 0.221 0.096  0.090  0.138  61.0  393.  0.158  44.23  21.0  77.0  40.0  55.0  4.0  33.1  66.2  80.9  51.4  29.2  0.191 0.081 0.176 0.022  98.0  91.0  386.  62.0 63.4 0.259  0.236  81.76  36.0  85.0  40.0  29.0  50.0  92.0  31.6  63.3  77.3  49.1  27.9  59.3 60.6  CHI-SQUARE  400.  0.057  58.0  0.179  400.  0.181  45.0  CHI-SQUARE  43.0 110.0  TOTAL  65.0  0.104  63.22  H  30.0  CHI-SQUARE  45.0  G  49.0  0.268  145.87  F  103.0  0.223  29.0  SUBJECTS D E  48.0  CHI-SQUARE  40.0  0.213  417  C  35.0  0.174  0.070  40.0  CHI-SQUARE  SUCH  64.96  CHI-SQUARE 65.  66.  B  CHI-SQUARE  121.0  0.223  421  WORD  0.114  58.0  CHI-SQUARE  DO  0.197 0.156  46.0  0.198  64.  0.064  44.0  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  RANK  60.34  CHI-SQUARE  MAKE  TOTAL  20.0  0.223  63.  H  97.0 128.0  CHI-SQUARE  USE  G  33.0  0.189  62.  F  XXXVII  TABLE  0.211 0.081 0.093  0.281 0.243  37.0  369.  0.096  78.94  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. 
OF WORDS IN SUBJECT  TABLE  RANK  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  XXXVII  WORD  71. HOW  B  C  NUMBER  73. MADE  75.0  21.0  31.5  63.1  77.1  49.0  27.9  59.1  60.4  0.159  0.444  0.198  ONLY  7.0  22.0  14.0  31.4  62.7  76.7  48.7  0.017  CHI-SQUARE  0.045  228.0  0.045  27.7 1.280  47.0 58.8  16.0  9d.O  5.0  38.0  47.0  31.1  62.2  76.1  48.3  27.5  58.3  59.6  NO  0.313 0.028  363.  78. MUST  108.0  67.0  51.0  15.0  39.0  44.0  30.7  61.4  75.0  47.7  27.1  57.5  58.8  0.136  0.163  0.084  358.  79. WATER  23.0  47.0  74.0  61.0  8.0  50.0  30.3  60.5  74.0  47.0  26.7  56.7  0.117  0.150  37.0  86.0  54.0  30.1  60.2  73.5  46.7  26.6  56.4  57.6  0.159 0.124  0.195  0.045  0.132  90.0  353.  58.0 0.233  40.56  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  80. ALSO  0.102 0.208  0.228  351.  0.140  32.54  32.0  112.0  68.0  18.0  28.0  52.0  29.9  59.8  73.1  46.5  26.4  56.1  0.278  0.138  0.058  0.157  45.0 57.3  66.48 57.0  73.0  83.0  18.0  44.0  29.3  58.6  71.7  45.5  25.9  55.0  0.141  0.148  0.265  0.1U1  0.116  17.0  0.044  77.40 30.0  51.0  34.0  5.0  146.0  68.0  29.1  58.3  71.2  45.3  ?5.6  54.6  55.8  0.074  0.104  0.109 0.028  0.386  340.  0.176  212.75  23.0  21.0  84.0  68.0  180.0  42.0  56.0  26.7  53.5  65.4  41.5  23.6  50.1  51.2  CHI-SQUARE  342.  56.2  6.0  0.114  349.  0.138 0.117  50.0  CHI-SQUARE  51.99  TOTAL  32.0  0.030  0.103 0.114  H  61.0  CHI-SQUARE  34.0  G  64.0  0.248  0.101 0.122  F  17.0  0.159  101.70  0.268  SUBJECTS D E  CHI-SQUARE  109.0  CHI-SQUARE  77.  0.124 0.041  42.0  0.114  366.  60.1  24.0  0.221  C  0.084  0.054  1596.28  0.104  B  CHI-SQUARE  38.0  CHI-SQUARE  MOST  0.105  76.  135.99  0.169  75.  0.116  368.  
DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  XXXVII  WORD  TOTAL  79.0  CHI-SQUARE  OUT  H  33.0  0.119  74.  G  57.0  0.189  F  64.0  CHI-SQUARE 72.  RANK  SUBJECTS D E  39.0  0.194  TABLE  0.052  0.171  0.217  1.011  312.  0.111 0.145  1078.83  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  TABLE  XXXVII  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  RANK WORD 81. FIRST  B  42.0  26.4  52.3  64.5  41.0  23.3  49.5 50.6  SUBJECTS WORD  86. COULD  66.0  26.2  52.5  64.1  40.7  23.2  49.2 50.2  37.0  306.  87. WHO  51.0 140.0  33.0  7.0  26.0  51.9  40.3  23.0  0.105 0.039  16.0  15.0  303.  88.  48.7 49.8 0.042  ANY  0.039  76.0  1.0  12.0  25.0  25.3  61.8  39.3  22.3  47.4 48.4  50.6 0.397  0.154 0.003  0.067  0.066  9.0  295.  40.0  53.0  88.0  24.5  49.0  59.9  38.1  21.7  46.0 47.0  0.124  0.040  0.093  50.0  22.3  44.6  54.5  34.6  19.7  41.8 42.7  0.206  0.128 0.298  0.233  18.0  0.047  127.22  0.073  0.253  52.0  70.0  63.0  11.0  4.0  22.3  44.6  54.5  34.6  19.7  0.132  0.174 0.128 0.035  0.022  19.0  41.0  0.050  0.106  96.56 37.0  59.0  25.0  33.0  47.0  21.9  43.7  53.4  33.9  19.3  41.0 41.9  0.092  0.120 0.080  23.0  0.185 0.124  26.83 32.0  10.0  45.0  BECAUSE  21.8  43.5  53.2  33.3  19.2  40.8 41.7  0.074  0.164  0.102 0.056  30.0  27.79 65.0  37.0  22.0  33.0  62.0  21.7  43.4  53.0  33.7  19.2  40.7 41.5  0.069  254.  0.119 0.078  14.0  CHI-SQUARE  255.  0.060  81.0  SEE  260.  41.8 42.7  30.0  90.  260.  0.104  31.0  0.129  285.  0.024  40.0  114.95  CHI-SQUARE  46.0  TOTAL  26.0  331.87 16.0  H  89.  
0.023  25.0  G  45.0  CHI-SQUARE  12.0 160.0  F  23.0  0.154  159.60  E  12.0  CHI-SQUARE  41.0  0.127 0.284  D  83.0  0.258  10.22  C  7.0  0.035  0.119 0.132 0.137 0.112 0.175 0.096  63.5  B  CHI-SQUARE 20.0  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  RANK  11.92 43.0  CHI-SQUARE  308.  XXXVII  0.161  65.0  CHI-SQUARE  SAME  0.110 0.131 0.197 0.111  48.0  0.060  85.  0.117  62.0  27.0  CHI-SQUARE  HIM  TOTAL  35.0  0.203  84.  H  41.0  CHI-SQUARE  GOOD  G  54.0  0.134  83.  F  47.0  CHI-SQUARE  VERY  SUBJECTS D E  27.0  0.134  82.  C  TABLE  0.161 0.075  0.070  20.0  253.  0.185 0.164 0.052  54.76  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  TABLE  RANK  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  XXXVII  WORD  91. LIKE  B  C  SUBJECTS D E  MUCH  13.0  33.0  30.0  21.0  42.0  51.3  32.6  18.6  39.4  40.2  0.179  0.101  14.0  40.0  59.0  20.5  41.0  50.1  31.8  18.1  38.4  39.2  0.077  0.085  0.0e9  0.079  53.0  7.0  8.0  16.0  84.0  20.2  40.5  49.4  31.4  17.9  37.9  38.8  0.022  0.045  0.042  62.0  30.0  THROUGH  19.9  39.8  48.6  30.9  17.6  37.3  38.1  97.  98. NEW  44.0  41.0  57.0  28.0  20.1  40.1  49.0  31.2  17.7  37.6  38.4  0.141  0.230  0.151  99. SMALL  16.0  22.0  61.0  37.0  20.0  59.0  18.0  20.0  39.9  48.8  31.0  17.6  37.4  36.3  0.055  0.124  0.118  0.112  0.156  233.  
0.047  36.49  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  100. OVER  0.164  13.0  20.0  21.0  19.9  39.8  48.6  30.9  17.6  37.3  38.1  0.052  0.118  0.192  0.073  0.053  73.36 42.0  27.0  12.0  9.0  29.0  86.0  19.5  39.1  47.8  30.4  17.3  36.6  37.4  0.104  0.055  0.038  0.051  0.077  228.  0.223  39.19  17.0  14.0  44.0  37.0  10.0  52.0  49.0  19.1  38.2  46.7  29.7  16.9  35.8  36.6  0.035  0.089  0.118  0.056  223.  0.138 0.127  31.83  10.0  44.0  45.0  38.0  6.0  33.0  44.0  18.9  37.7  46.1  29.3  16.7  35.4  36.1  CHI-SQUARE  232.  0.054  22.0  0.050  232.  0.078  60.0  CHI-SQUARE  71.87  0.045  58.0  0.084  0.073  0.173  21.0  0.109  234.  0.069  TOTAL  50.75  CHI-SQUARE  18.0  0.037  0.084  39.0  0.194  93.18 29.0  CHI-SQUARE  8.0  0.218  17.0  H  54.0  WORK  236.  G  34.0  0.050  239.  F  34.0  CHI-SQUARE  40.0  0.072  SUBJECTS D E  10.0  0.106 0.153  28.0  0.108  C  96.  16.11  0.099  B  CHI-SQUARE 28.0  0.079  WORD  0.078  42.0  CHI-SQUARE  PLACE  0.073  31.0  0.084  95.  0.042  245.  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF THE CORPUS  XXXVII  39.27  CHI-SQUARE  CALLED  0.106  25.0  0.139  94.  TOTAL  13.0  CHI-SQUARE  PEOPLE  H  52.0  0.124  93.  G  72.0  CHI-SQUARE 92.  RANK F  27.0  0.134  TABLE  0.109  0.091  0.121  0.034  0.087  220.  0.114  16.53  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  TABLE  XXXVIII  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE EIGHT  RANK WORD  1. THE  B  C  SUBJECTS D E  516.6 723.5 818.3  834.4  7 . 
1 8 25.252  2b9.0  0.0 317.4 421.4 170.6  260.9  365.4 413.3  0.0  4.007  4.239 4.150  3.028  152.0 130.0  0.0 267.4  355.0  143.7 219.8 307.9  3.195 3.287  IN  465.0 1949.  8.  348.2  0.0  THAT  138.0 181.0 311.0 234.0 1636.  0.0 266.4  353.7  143.2 219.0 306.7  2.993  2.964  2.559  9.  346.9  IT  3.139 2.088  127.0 171.0 199.0 261.0  0.0 215.9 286.7  116.0 177.5 248.6 281.2  3.098  2.747  2.-.10  2.009  2.329  28.43  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  1326.  10.  ARE  TOTAL 1108  235.0  97.0 148.3 207.7  1.925 2.121 1.898 2.311.  76.2 116.6 163.3  12.0 133.0  1.303  97.0 70.0  49.6 75.9 106.3  1.592 1.033 0.260  1.B80 0.979  567  120.2 0.625  115.06 61.0 51.0 101.0  0.0  46.6 71.2  86.6 115.0  99.7  60.0 532 112.8  1.429 1.190 1.319 0.721 1.019 0.535 54.04  0.0  27.0 183.0  42.0 76.0  0.0  79.6 105.7  42.8  65.5 91.7  0.314 1.602 0.908  1.075 0.666  CHI-SQUARE  134.7  2.361 1.554  0.0 123.0 136.0  0.0  871  124.15  0.0 92.3 122.6  CHI-SQUARE  0.0 214.0 354.0  H  89.0 150.0 188.0 259.0  1.602 3.136  137.0 118.0  0.0  48.97  2.487  0.872  CHI-SQUARE  342.0  G  9.02  141.6 188.3  0.0  1.G33 1.817 3.445  F  75.0 183.0 145.0 167.0 154.0 146.0  0.0  0.0  239.6  1.871 2.284  0.0  0.0 230.0  CHI-SQUARE  7. IS  180.0 386.0 1642.  SUBJECTS D E  0.0 180.4 0.0  95.92  2.673  C  CHI-SQUARE  365.0  2.894  B  0.0 161.0 261.0  6.  87.41  0.0 249.0  0.0  WORD  CHI-SQUARE 420.0  3 . 
4 1 72.530  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE EIGHT  RANK  6.514  340.0  CHI-SQUARE  TO  6.617 8.913  140.0  0.0  5.  7.266  3859.  XXXVIII  128.23  CHI-SQUARE  A  954.0  337.7  0.0  4.  883.0  0.0 528.5  CHI-SQUARE  AND  TOTAL  4o8.0  0.0 294.0  3.  H  336.0  CHI-SQUARE  OF  G  0.0 618.0 600.0  0.0  2.  F  TABLE  66.0 95.0 489 103.7 0.848  100.89  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  XXXVIII  TABLE  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE EIGHT  RANK WORD  11. FOR  B  C  SUBJECTS D E  175.0  44.0  04.0  43.0  0.0  81.3  107.9  43.7  66.S  93.6 105.8  0.628  14. AS  63.0  116.0  0.0  83.8  117.8  47.7  73.0  102.2  OR  2.127 0.670  0.891  16. WITH  49.0  54.0  0.0  68.1  90.4  36.6  56.0  78.4  0.465  1.654 0.800  0.693  0.545  69.0  110.0  39.0  47.0  67.0  0.0  66.0  67.6  35.4  54.2  75.9  9.0  545.  17.  115.6  ON  0.080  49.0  418.  18.  88.6  THIS  6.437  0.843  0.664  0.676  73.0  19.  85.9  BY  0.651  35.0  153.0  34.0  18.0  71.0  0.0  60.1  79.8  32.3  49.4  69.2  1.339 0.735  0.254  58.0  42.0  49.0  63.0  0.0  48.5  64.4  26.1  39.9  55.9  63.2  0.593  78.2  0.717 0.518  102.99  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  20. WAS  0.578  0.584  0.594  0.495  44.0  67.0  31.0  25.0  78.0  0.0  55.4  73.5  29.8  45.5  63.7  0.511 0.586  0.670  0.353  0.787  21.0  21.0  21.0  79.0  60.0  0.0  42.8  56.9  23.0  35.2  49.3  0.244  0.184 0.454  1.117 0.606  0.562  95.0  340.  72.1 0.848  61.0  263.  
55.8 0.544  91.21  0.0  26.0  40.0  19.0  62.0  35.0  0.0  44.5  59.0  23.9  36.5  51.2  0.302  0.350  0.411 0.877  0.353  91.0  273.  57.9 0.812  56.58  0.0  101.0  14.0  6.0  14.0  38.0  105.0  0.0  45.3  60.1  24.3  37.7  52.1  59.0  CHI-SQUARE  298.  22.67  0.0  0.0  TOTAL  1.15  0.0  0.0  369.  H  27.0  CHI-SQUARE  0.0  G  66.0  0.0  405.  F  51.0  0.0  10.19  0.407  SUBJECTS D E  0.0  CHI-SQUARE  0.0  0.963  C  0.0  145.36  0.802  B  CHI-SQUARE 37.0  CHI-SQUARE  WORD  240.65 189.0  0.0  RANK  0.884  1.171  40.0  CHI-SQUARE 15.  0.965  0.0  0.0  499.  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE EIGHT  CHI-SQUARE 31.0  CHI-SQUARE  TOTAL  XXXVIII  83.08 243.0  0.0  99.0  1.188 0.434  83.0  CHI-SQUARE  BE  1.532 0.952  0.0  0.0  13.  H  54.0  CHI-SQUARE  YOU  G  0.0  0.0  12.  F  TABLE  1.174 0.123 0.130 0.198 0.384  278.  0.937  172.05  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  TABLE  RANK  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE EIGHT  XXXVIII  WORD  B  21. HE  41.2 54.7 22.1 33.9 47.4  0.349  0.298  0.497  0.240  0.696  27. NOT  52.6  0.858  0.260  0.594  0.303  28. CAN  55.8  0.0  47.0 57.0 10.0 13.0 74.0 40.0  0.0  39.2 52.1 21.1 32.3 45.2 0.546  0.499  0.216  0.184  0.747  241.  29. YOUR  51.1  22.0 42.0 31.0 35.0 59.0 68.0  0.0  41.9 55.6 22.5 34.4 48.2 0.368  0.670  0.495  0.596  54.5 0.607  21.73  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  257.  30. 
THEY  TOTAL  0.280  0.368  0.763  0.535  0.321  31.01  0.0  31.3 41.5 16.8 25.7 36.0 0.407  0.433  0.216 0.452  0.273  20.0 60.0 14.0 39.0 55.0  0.0  31.9 42.4 17.2 25.2 36.7 0.232  0.525  0.303  0.551  0.555  40.7 0.339  8.0 196. 41.6 0.071  54.73  0.0  36.0  0.0  41.5 55.1 22.3 34.1 47.8  128.0  0.418  1.120  10.0 14.0 62.0  0.216 0.198  0.626  5.0 255. 54.1 0.045  164.45  0.0  49.0 66.0  0.0  37.5 49.7 20.1 30.8 43.1  CHI-SQUARE  192.  8.91  0.0  0.0  225.  47.7  35.0 50.0 10.0 32.0 27.0 38.0  CHI-SQUARE  H  0.0  0.0  0.357  40.11  0.256  0.363  CHI-SQUARE  50.25  G  36.6 48.7 19.7 30.1 42.2  0.0  0.330  F  0.0  CHI-SQUARE 263.  SUBJECTS D E  33.0 32.0 17.0 54.0 53.0 36.0  0.0  0.669  C  0.0  0.0  38.27  0.511  B  CHI-SQUARE  42.8 56.9 23.0 35.2 49.3  CHI-SQUARE  26.  144.00  0.0  0.0  WORD  ONE  248.  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE EIGHT  XXXVIII  0.678  44.0 98.0 12.0 42.0 30.0 37.0  CHI-SQUARE  WHICH  0.464  0.0  0.0  25.  0.184  253.  53.7  40.4 53.6 21.7 33.2 46.5  CHI-SQUARE  AT  13.0 46.0 76.0  0.0  0.0  24.  0.022  TOTAL  30.0 34.0 23.0 17.0 69.0 75.0  CHI-SQUARE  HAVE  0.166  H  0.0  0.0  23.  G  0.0  1.139  1.0  RANK F  98.0 19.0  CHI-SQUARE  FROM  SUBJECTS D E  0.0  0.0  22.  C  TABLE  0.569  0.578  8.0  5.0  0.173 0.071  36.0 66.0  0.363  230.  48.8 0.589  45.05  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY, EXPECTED FREQUENCY, RATIO AS % OF FREQ. TO TOTAL NO. 
OF WORDS IN SUBJECT  TABLE  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE EIGHT  XXXVIII  RANK  S WORD  B  31. WE  41.2  54.7  22.1  33.9  47.4  53.7  0.123  0.130  2.460  0.232  WORD  TOTAL  0.0  253.  36.  WHEN  0.071  0.0  69.0  26.0  0.0  33.2  44.1  0.802  5.0 17.9  0.228  0.108  10.0  36.0  58.0  27.3  38.2  43.3  0.141  0.363  204.  37.  ALL  0.518  19.0  67.0  15.0  24.0  25.0  14.0  0.0  26.7  35.5  14.4  22.0  30.7  34.8  0.586  0.324  21.0  62.0  8.0  0.0  28.5  37.8  15.3  0.339  0.252  164.  38.  BUT  38.0  0.0  23.1  30.7 12.4  0.232  0.543  0.173  31.0  43.0  10.0  23.4  32.8  37.1  0.438  0.434  175.  39. THESE  0.089  18.0  45.0  27.0  16.0  33.0  37.0  0.0  28.7  38.1  15.4  23.6  33.0  37.3  0.394  0.584  0.226  0.333  0.330  16.40  40.  MAY  30.0  19.0 26.6 0.254  0.303  H  TOTAL  19.0  0.170  30.0  12.0  23.0  17.0  36.0  0.0  23.5  31.1  12.6  19.3  27.0  30.5  0.302  0.263  0.260  0.325  32.0  28.0  0.0  23.5  31.1  0.321  0.372  0.245  5.0  17.0  21.0  41.0  12.6  19.3  27.0  30.5  0.108  0.240  0.212  144.  0.366  13.21  0.0  16.0  19.0  0.0  24.1  32.0 13.0  0.186  0.166  8.0  0.173  48.0  27.0  19.8 27.7 0.679  0.273  30.0  143.  31.4 0.268  50.09  0.0  5.0  64.0  15.0  17.0  0.0  20.5  27.2  11.0  16.9  CHI-SQUARE  0.172  144.  5.75  0.0  0.0  142.  30.1  26.0  0.0  176.  0 . 
3 6 8  1 8 . 0  G  8 . 4 2  CHI-SQUARE  0 . 0  0 . 3 3 3  1 7 . 0  F  0 . 0  0 . 0  6.125  4 6 . 3 1  CHI-SQUARE  2 0 . 0  CHI-SQUARE  0 . 0  0 . 2 0 9  0 . 0  0.0  43.98  0.0  S U B J E C T S D t  CHI-SQUARE  0 . 0  0 . 2 4 4  C  CHI-SQUARE  7 1 . 3 4  0 . 2 2 1  B  0.0  6 7 7 . 5 6  C H 1 - S 3 U A 3 E  AN  H  G  8 . 0  0.0  35.  F  2 3 . 0  0 . 3 2 5  D I S T R I B U T I O N OF OCCURRENCE O F T H E 1 0 0 MOST F R E Q U E N T WORD T Y P E S A C R O S S T H E S U B J E C T AREAS O F GRADE E I G H T  XXXVM/  RANK  S  E  1 7 4 . 0  CHI-SQUARE  IF  T  6 . 0  0 . 0  34.  C  1 4 . 0  CHI-SQUARE  WILL  E  2 8 . 0  0.0  33.  J D  C H I - S I J A R E  HIS  B  0 . 0  0.0  32.  U  C  TABLE  0 . 0 5 8  0 . 5 6 0  0 . 3 2 4  0 . 2 4 0  8 . C 2 3 . 6 0 . 0 8 1  1 7 . 0  1 2 6 .  2 6 . 7 0 . 1 5 2  7 6 . 6 3  NJ THE  THREE  L I N E S  OF  F I G U R E S  F O R EACH  ENTRY  R E P R E S E N T :  FREQUENCY EXPECTED RATIO  AS  FREQUENCY %,  OF  FREQ.  TO  TOTAL  N O .  UF  WORDS  I N  SUBJECT  THE THREE L I N E S O FF I G U R E S F O R EACH ENTRY R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R A T I O A S Z, O F F R E Q . T O T U T A L N O . O F W O R D S I N S U B J E C T  J>  TABLE  RANK  XXXVIH  WORD  41. THERE  B  DISTRIBUTION OF OCCURRENCE OF T H E 1 0 0 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E SUBJECT AREAS O F GRADE EIGHT  C  S U B J E C T S D E 29.0  3.0  24.0  13.0  45.0  0.0  23.1  30.7  12.4  19.0  26.6  30.1  0.325  8.0  16.0  0.0  16.6  22.1  8.9  13.7  19.1  0.151  0.149  0.454  0.113  MORE  0.402  1.0  0.0  3.0  0.0  0.0  16.9  22.5  9.1  13.9  19.5  1.046  0.009  0.0  0.042  0.0  20.0  42.0  10.0  12.0  24.0  0.0  22.6  30.1  12.2  18.a  26.1  27.0  102.  47. WERE  21.6  10.0  104.  48.  22.1  HAD  0.089  0.216  0.170  0.242  31.0 29.5  49. THEIR  0.277  13.0  34.0  4.0  0.0  19.9  26.4  l u . 7  0.298  7.0  14.0  17.0  0.0  17.6  23.4  9.5  14.5  20.2  0.174  0.087  15.0  23.0  16.3  22.9  0.212  0.232  33.0 25.9 0.295  10.82  50. 
USED  0.151 0.198  27.0  8.0  0.0  9.0  28.0  65.0  0.0  23.3  30.9  12.5  19. 1  26.8  30.3  0.372  0.070  0.0  0.127  0.283  143.  0.580  77.84  0.0  39.0  7.0  0.0  9.0  17.0  0.0  18.4  24.4  9.9  15.1  21.2  0.453  0.061  0.0  0.12 7  0.172  41.0  113.  24.0 0.366  60.80  0.0  21.0  31.0  4.0  8.0  12.0  44.0  0.0  19.5  25.9  10.5  16.1  22.5  25.4  0.244  0.271  0.087  C.113  0.121  120.  C.393  27.59  0.0  6.0  26.0  34.0  21.0  19.0  0.0  19.1  25.3  10.2  15.7  21.9  CHI-SQUARE  108.  0.172 0.241  32.0  0.0  TOTAL  22.9  0.0  0.0  122.  0.245  H  3.21  CHI-SQUARE  0.0  G  28.0  0.0  139.  F  15.0  0.0  8.03  0.151  S U B J E C T S D E  CHI-SQUARE  0.0  0.368  C  0.0  0.0  379.48  0.232  . B  CHI-SQUARE  90.0  CHI-SQUARE  46.  0.162 0.241  0.0  0.0  WORD  22.46  CHI-SQUARE  SOME  142.  RANK  DISTRIBUTION OF OCCURRENCE OF T H E 1 0 0 M J S T F R E Q U E N T WCRD T Y P E S A C R O S S T H E S U B J E C T AREAS OF GRADE E I G H T  CHI-SQUARE 21.0  0.0  45.  0.131  17.0  CHI-SQUARE  OTHER  0.339  13.0  0.0  44.  0.065  TOTAL  XXXVl/l  23.92  CHI-SOUARE  I  0.254  0.0  0.0  43.  H  28.0  CHI-SQUARE  HAS  G  0.0  0.0  42.  F  TABLE  0.070  0.228  0.735  0.297  0.192  11.0  117.  24.8 0.098  74.01  N) THE THREE LINES O r FIGURES FUR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY R A T I O A S Z, O F F R E Q . T O T U T A L N U . CF W O R D S I N S U B J c C T  THE THREE L I N E S UF F I G U k E S FOR EACH ENTRY R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R A T I O A S Z, O F F R t Q . T O T U T A L N O . U F W O R D S I N S U B J E C T  (Jl  TABLE  RAN<  XX  WORD  51. MANY  XV  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 MOST F R E Q U E N T WORO T Y P E S A C R O S S T H E SUBJECT AREAS OF GRADE EIGHT  IU  S U B J E C T S D E  8  C  0.0  5.0  44.0  8.0  12.0  27.0  22.0  0.0  19.2  25.5  10.3  15.8  22.1  25.0  0.0  0.058  CHI-SQUARE 52. SO  43.0  3.0  0.0  22.3  29.6  12.0  0.350  0.065  25. 
0  18.3  25.7  0.269  0.252  12.0  11.0  34.0  34.0  0.0  19.9  26.4  10.7  16.3  22.9  16.0  56. SHOULD  0.116  0.105  137.  57. WHAT  0.143  0.238  0.481  0.343  21.0  122.  58.  25.9  THAN  0.187  9.0  23.U  29.0  31.0  0.0  19. 1  25.3  10.2  15.7  21.9  24.8  0.195  0.325  0.293  117.  59. BEEN  0.277  10.0  10.0  3.0  14.0  26.0  28.0  0.0  14.8  19.7  6.0  12.2  17.1  19.3  0.088  0.065  0.198  0.262  0.250  18.30  THE T h R E E L I N E S OF F I G U R E S F O R E A C H ENTRY R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R A T I O A S 1, O F F R E Q . T O T O T A L N O . U F W O R D S I N S U B J E C T  60. INTO  H 2.0  6.0  10.0  6.0  0.0  17.8  23.6  9.5  14.6  20.4  0.128  0.648  0.130 0.141  TOTAL  0.061 0.018  142.72 24.0  20.0  2.0  22.0  42.0  4.0  0.0  18.6  24.6  10.0  15.3  21.4  24.2  0.279  0.175  0.043  0.311  0.424  13.0  25.0  12.0  15.0  23.0  0.0  16.6  22. 1  8.9  13.7  19.1  0.036  0.151  0.219 0.260  0.212  14.0  102.  21.6  0.232  C.125  5.85  0.0  14.0  10.0  9.0  11.0  8.0  27.0  0.0  12.9  17.1  6.9  10.6  14.8  16.8  0.163  0.088  0.195 0.156  79.  0.031 C.241  13.08  0.0  30.0  2.0  8.0  6.0  20.0  30.0  C O  15.6  20.8  8.4  12.9  18.0  20.4  CHI-SQUAKE  114.  48.56  C O  0.0  109.  23.1  0.0  C O  91.  G  74.0  CHI-SQUARE  C O  F  11.0  0.0  20.70  0.116  S U B J E C T S D E  CHI-SQUARE  7.0  0.061  C  0.0  C O  38.19  0.209  B  C O  29.1  18.0  CHI-SQUARE  WORD  0.196  0.0  0.0  118.  RANK  CHI-SQUARE  10.0  CHI-SQUARE  ABOUT  0.376  19.0  0.0  0.0  55.  TOTAL  22.07  CHI-SQUARE  TWO  0.273  H  CHI-SQUARE  31.0  0.0  54.  0.170  G  DISTRIBUTION OF OCCURRENCE OF T H E 1 0 0 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E S U B J E C T AREAS OF GRADE E I G H T  XXXVIII  26.79  CHI-SQUARE  EACH  0.173  0.0  0.0  53.  0.385  F  TABLE  0.349  0.018  0 . 1 7 3 0.085  0.202  96.  
0.268  38.61  N) THE THREE L I N E S O F F I G U R E S FOR EACH ENTRY R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R A T I O A S t, O F F R E Q . T O T O T A L N O . O F W O R O S I N S U B J E C T  O  TABLE  RANK WORD 61. THEM  B  C  S U B J E C T S D E  USE  37.0  0.0  16.0  14.0  0.0  17.3  23.6  9.5  14.6  20.4  0.244  UP  4.0  0.0  15.8  21.0  8.5  13.0  18.2  20.6  0.116  0.315  0.260  0.325  0. 121  SUCH  97.  67. THEN  49.0  11.0  13.0  11.0  15.0  0.0  17.4  23.1  9.4  14.3  20.1  22.7  0.238  0.184  107.  68. TIME  0.0  12.0  34.0  8.0  15.0  18.0  6.0  0.0  15.1  20.1  8.1  12.4  17.4  19.7  0.139  0.298  0.173 0.212  0.182  93.  69. ITS  22.0  0.0  12.5  16.6  6.7  10.3  14.4  16.3  0.116  0.0  17.0  12.0  7.0  2.0  19.0  24.0  0.0  13.2  17.5  7.1  10.8  15.2  17.2  0.105 0.151  0.028  0.192 0.214  13.72  81.  70. WOULO  0.158  0.260  0.085  18.0  0.0  17.fi 23.6  7.0  0.209  0.061  4.0  10.0  17.0  8.0  9.5  14.6  20.4  23.1  0.087  0.141  109.  0.172 C.071  26.77 16.0  15.0  0.0  3.0  12.0  20.0  0.0  11.4  15.1  6.1  9.4  13.1  14.8  0.0  0.042  0.186 0.131  70.  0.121 0.178  14.20  0.0  13.0  4.0  7.0  1.0  26.0  25.0  0.0  12.4  16.4  6.7  10.2  14.2  16.1  0.151  0.035  0.151 0.014  0.262  76.  0.223  32.31  0.0  20.0  17.0  2.0  13.0  27.0  0.0  15.3  20.3  8.2  12.6  17.6  CHI-SQUARE  77.  0.C91 0.196  0.0  0.0  TOTAL  10.55  0.0  CHI-SQUARE  20.34  0.198  9.0  0.0  0.054  H  6.0  CHI-SQUARE  41.12  G  12.0  0.0  0.111 0.134  F  18.0  CHI-SQUAKE  8.0  0.429  S U B J E C T S D E  10.0  0.0  37.52  C.093  C  0.0  0.0  0.036  0.0  CHI-SQUARE  66.  B  CHI-SCUARE 12.0  0.0  WORD  0. 1 8 7  23.0  CHI-SQUARE 65.  0.141  12.0  0.0  109.  23.1  36.0  CHI-SQUARE  DO  0.226  21.0  10.0  0.0  64.  0.0  TOTAL  DISTRIBUTION OF OCCURRENCE OF THE 1 0 0 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E S U 8 J E C T AREAS OF GRADE E I G H T  XXXVIII  20.15  CHI-SQUARE  MAKE  0.324  H  0.0 •  0.0  63.  G  21.0  CHI-SQUARE 62.  
RANK F  0.0  0.0  TABLE  DISTRIBUTION OF OCCURRENCE OF THE 1 0 0 HOST F R E Q U E N T WORO T Y P E S A C R O S S T H E SUBJECT AREAS OF GRADE EIGHT  XXXVIII  0.232  0.149  0.043  0.184  0.273  15.0  94.  19.9 0.134  12.92  to THE T H R E E L I N E S OF F I G U R E S F O R E A C H ENTRY R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R A T I O A S S t OF F R E Q . T O T O T A L NO. OF WORDS I N S U B J E C T  THE THREE L I N E S OF F I G U R E S F O R EACH ENTRY R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R A T I O A S %, O F F R E Q . T O T O T A L N O . O F W O R D S I N S U B J E C T  «0  TABLE  RANK  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 M O S T F R E O U E N T WORO T Y P E S A C R O S S T H E SUBJECT AREAS O F GRADE E I G H T  XXXV///.  WCRO  71. HOW  B  C  S U B J E C T S 0 E  0.0  1 8 . 0  0.0  1 3 . 8 1 8 . 4  0.0  0.209  CHI-SQUARE 72.  0.0  NJMEER  0.0 0.0 CHI-SQUARE  73.  0.0 MADE  0.0 0.0  OUT  7.4  1 1 . 4 1 5 . 9  0.C65  2.0  6.0  3.0  0.023  0.053  0.065  8 5 .  76. ONLY  1 8 . 0  9.0  3 2 . 0 1 5 . 0  15.6 2 0 . 6 0.105  0.280  0.131  9 9 . 0  5.0  2.0 1 1 8 .  77. NO  1.40C  0.C50  8.4 0.324  0.151  16.6 2 2 . 1 0.198  l . C 1 8 . 0 2 1 . 0 1 2 . 9 1 8 . 0 0.014  9 6 .  78. MUST  2 0 . 4  7.0  1 1 . 0 2 1 . 0  8 5 .  WATER  1 1 . 4 15.9 ' 13.0 0.099  0.111  79.  0.187  1 2 . 4 16.4  6.7 0.151  5.0  1 8 . 0 2 1 . 0  10.2 1 4 . 2 0.M71  7 6 .  16.1  0.162 0.187  7.60  THE THREE L I N E S OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY R A T I O A S *, C F F R E Q . T O T O T A L N O . O F W O R D S I N S U B J E C T  80. ALSO  0.079  0.0  17.0  0.0  11.4 15.1  G  H  TOTAL  9.0  27.0 2 7 . 0 1 3 . 0  8.9  1 3 . 7 1 9 . 1  0.195  0.332  0.273  0.198  8.0  0.070  2.0 6.1  9.4  0.043  0.156  1 3 . 1 0.111  0.116  7 0 .  1 4 . 8 0.178  1 1 . 3 1 2 0 . 0 19.0  6.0  7.C  1 2 . 0  0.0  11.9 15.8  6.4  9.M  1 3 . 7  0.130  0.099  0.232  0.166  0.121  9.0  7 3 .  1 5 . 
5 0.060  9.92  0.0  6.0  3.0  5.0  0.0  0.0  3.5  11.2  4.6  7.0  9.7  1 1 . 0  0.0  0.070  0.026  0.108  0.0  0. 151  0.205  1 5 . 0 2 3 . 0  5 2 .  29.60  0.0  2.0  27.0  7.0  5.C  7.0  0.0  9.8  13.0  5.3  3.0  1 1 . 2  0.0  0.023  0.236  0.151  0.071  CHI-SQUARE  1 0 2 .  2 1 . 6  11.0 1 1 . 0 2 0 . 0  0.0  0.0  0.182 0.187  9.0  F  27.47  CHI-SQUARE 7.0  0.158  0.0  CHI-SQUARE  1 8 . 0  0.0B1  S U B J E C T S D E  17.0  0.0  0.018  1 1 . 8 1 7.0  C  0.0  0.0  0.027  25.04  0.279  B  CHI-SQUARE  7.4  CHI-SQUARE  3.0  WORD  503.28  13.8 1 8 . 4  0.0  0.293  TOTAL  1 9 . 2 2 5 . 5 1 0 , 3 1 5 . 8 22. 1 2 5 . 0  0.0  0.0  0.325  H  CHI-SQUARE  7.0  0.0 MOST  2 3 . 0 2 9 . 0  2 4 . 0 1 5 . 0  CHI-SQUARE  G  3.0  0.0  0.0  75.  0.079  RANK F  D I S T R I B U T I U N O F O C C U R R E N C E O F T H E 100 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E S U B J E C T AREAS O F GRADE E I G H T  XXXV///  43.79  CHI-SQUARE 74.  9.0  TABLE  1 2 . 0  6 0 .  1 2 . 7  0.071 0.107  24.72  THE THREE LINES O F FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY R A T I O A S %, O F F R E Q . T O T O T A L N O . O F W O R O S I N S U B J E C T  IV) »J  CO  TABLE  D I S T R I B U T I O N OF OCCURRENCE OF T H E 1 0 0 MOST F R E O U E N T WORD T Y P E S A C R O S S T H E SUBJECT AREAS O F GRADE E I G H T  XXXV/M  RANK WORD 81. FIRST  B  C  S U B J E C T S D E 14.0  5.0  13.0  16.0  0.0  11.9  15.8  6.4  9.8  13.7  0.108  0.134  CHI-SQUARE 82. VERY  10.0  19.0  14.0  0.0  11.9  15.8  6.4  9.R  13.7  15.5  0.108  0.141  0.081  SAME  0.158  63.0  6.0  5.0  4.0  4.0  0.0  15.1  20.1  8.1  12.4  17.4  19.7  0.551  0. 1 3 0  0.071  0.040  WHO  93.  88. ANY  13.0  0.0  3.0  12.0  5.0  0.0  10.4  13.8  5.6  8.6  12.0  13.6  0.0  0.042  0.114  0.121  64.  
5.0  15.0  34.0  7.0  0.0  11.4  15.1  6.1  9.4  13.1  14.8  0.108  0.212  0.023  0.061  1.0  16.0  12.0  15.0  0.0  11.2  14.9  6.0  9.2  12.9  14.6  0.022  0.226  0.121 0.134  0.209  0.343  0.062  53.06  THE THREt LINES DF FIGURES FUR EACh ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY R A T I O A S Z, O F F R E Q . T O T O T A L N O . O F W O R D S I N S U B J E C T  70.  0.061  0.0  13.0  18.0  2.0  2.0  7.0  22.0  0.0  10.4  13.8  5.6  8.6  12.0  13.6  C.043  0.028  0.151  0.158  16.55  0.0  9.0  17.0  0.0  12.0  16.0  4.0  0.0  9.4  12.5  5.1  7.8  10.9  12.3  0.0  0.105  0.0  0.170  0. 162  0.149  17.01 10.0  4.0  8.0  12.0  BECAUSE  0.0  9.8  13.0  5.3  8.0  11.2  12.7  0.0  0.046  0.216  0.057  0.193  60.  0.081 0.107  16.99  0.0  8.0  9.0  2.0  12.0  32.0  4.0  0.0  10.9  14.5  5.9  9.0  12.6  14.2  0.0  .0.093  0.043  0.170  CHI-SQUARE  58.  0.036  22.0  SEE  64.  0.071 0.196  4.0  90.  69.  17.51  CHI-SQUARE  7.0  TOTAL  0.0  55.31 2.0  H  89.  0.045  0.0  G  7.0  CHI-SQUARE  31.0  F  18.0  0.0  0.036  0.0  CHI-SQUAKE  87.  120.53  0.360  S U B J E C T S D E  CHI-SQUARE  11.0  0.0  7 3 .  0.192 0.125  0.0  0.1 2 8  C  0.0  0.0  4.83  CHI-SQUAKE 85.  COULD  B  CHI-SQUARE 5.0  0.0  86.  0. 1 1 6  18.0  CHI-SQUARE  HIM  73.  15.5  7.0  0.0  94.  13.0  0.162  WORD  TOTAL  2.36  CHI-SQUAKE  GOOD  0.123  H  C O  0.0  83.  G  12.0  0.139  DISTRIBUTION OF OCCURRENCE OF T H E 1 0 0 MOST F R E Q U E N T WCRD T Y P E S A C R O S S T H E S U B J E C T AREAS O F GRADE E I G H T  XXXV///  RANK F  0.0  0.0  TABLE  0.079  0.323  67.  0.036  43.84  -J THE THREE L I N E S OF F I G U R E S F O R EACH ENTRY R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R ' A T I O A S Z, O F F R E Q . T O T O T A L N O . O F W O R D S I N S U B J E C T  ^  TABLE  RAN<  WORD  B  91.  0 . 0  LIKE  MUCH  PEOPLE  CALLEO  5 . 6  8 . 6 1 2 . 0  0 . 0  0 . 0 6 5  0 . 0 4 2  0 . 2 0 9  PLACE  0 . 1 0 5  1 4 . 0  0 . 0  6 . 0  0 . 0 0 . 0  1 7 . 0  3 . 0  5 . 0  9 . 0 1 1 . 9  4 . 8  7 . 4  0 . 
0 7 0  0 . 0 6 5  0 . 0 7 1  0 . 1 4 9  9 . 0  6 4 .  1 3 . 6  1 5 . 0  1 0 . 3 0 . 0 9 1  7 . 0  0 . 0  9 . 8  1 3 . 0  5 . 3  8 . 0 1 1 . 2  0 . 0  0 . 0 7 0  0 . 0 2 2  0 . 0 9 9  5 5 .  6 . 0  0 . 0 6 1  5 . 0  3 . 0  0 . 0  8 . 6  1 1 . 5  0 . 0  0 . 0 5 8  0 . 0 2 6  4 . 6  1 4 . 0  7 . 1  1 3 . 0  1 3 . 0  9 . 0  THROUGH  0 . 0  7 . 2  9 . 5  3 . 9  5.9  8 . 2  9 . 3  0 . 0  0. 128  0 . 0 4 4  0. 108  0 . 0 1 4  0 . 1 3 1  0 . 0 8 0  97.  98.  NEW  1 2 . 7 0 . 1 7 8  9 . 9  5 3 .  99. SMALL  1 1 . 2  0 . 2 1 6 0 . 1 9 8 0 . 1 3 10 . 0 7 1  2 5 . 0  0 . 0 1 2 . 1 1 6 . 0 0 . 0 3 5  11 . 3 4  0 . 0  4 . 0  8 . 0  15.0  5.0  3 . 0  8 . 0  0 . 0  7.0  9 . 3  3.8  5.8  8 . 1  9.  0 . 0  0 . 0 4 6  0 . 0 7 0  0 . 3 2 4  0.071  0 . 0 3 0  0 . 0 7 1  6 . 0  6 . 5  1 1 . 0  2 0 . 0  9 . 9 1 3 . 9  0 . 2 1 9 0 . 1 3 0 0 . 1 5 6  0 . 2 0 2  9 . 0  1 5 . 7 0 . 0 8 0  1 7 . 5 7  ThE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY RATIU AS %, UF FREQ. TO TOTAL NO. OF WOROS IN SUBJECT  7 4 .  100. OVER  38 . 4 4 6 . 0  7.0  0 . 0  5.0  9 . 0  0 . 0  6 . 2  8 . 2  3 . 3  5.1  7. 1  8.1  0 . 0  0 . 0 7 0  0 . 0 6 1  0 . 0  0.C71  0 . 0 9 1  0 . 0 9 8  1 1 . 0  38  5 . 0 8  0 . 0  3 . 0  10.0  3 . 0  7.0  9 . 0  16.0  0 . 0  7.8  10.4  4 . 2  6.4  9 . 0  1C.2  0 . 0  0 . 0 3 5  0 . 0 6 5  0 . C 9 9  0 . 0 9 1  3 . 0  4 . C  4 . 6  7 . 0  9 . 7  0 . 0 5 7  0 . 1 3 1 0 . 1 5 2  0 . 0 8 8  48  0 . 1 4 3  6 . 7 1  0 . 0  8 . 0  0 . 0  8 . 5  0 . 0  0 . 0 9 3  CHI-SOUARE  43,  1  0 . 0  CHI-SQUARE  2 2 . 5 7 3 . 0  44.  1.0  0 . 1 3 4  8 . 0  TOTAL  5 . 0  CHI -SQUARE 1 0 . 0  H  5 . 0  WORK  6 0 .  G  1 1 . 0  1 1 . 7  2 0 . 0  F  0 . 0  1 5 . 4 5  0 . 0  S U B J E C T S D E  CHI -SQUARE 1 . 0  0 . 1 7 5  C  96.  5 . 7 3 2 0 . 0  CHI-SQUARE  1 4 . 0  B  CHI'-SQUARE  6 . 0  0 . 0  WORD  TOTAL  0 . 1 4 1 0 . 1 2 5  0 . 0  0 . 0  H  1 0 . 9 3  CHI-SQUARE 9 5 .  1 2 . 0  G  0 . 0 1 0 . 4 1 3 . 8  CHI-SQUARE 94.  1 8 . 0  RANK F 3 . 0  CHI-SQUARE 93.  
S U B J E C T S D E 3 . 0  CHI-SQUARE 9 2 .  C  DISTRIBUTION OF OCCURRENCE OF THE 1 0 0 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE EIGHT  TABLE XX.XVIII  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRAOE EIGHT  XXXVIII.  7 . 0  1 1 . 2  0 . 0 6 1 0 . 0 6 5  1 3 . 0  1 7 . 0  1 1 . 0  7 . 7 3  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTEO FREQUENCY RATIO AS %, OF FREQ. TO TUTAL NO. OF WORDS IN SUBJECT  5 2  TABLE  XXX  D I S T R I B U T I O N  IX  MOST  SUBJECT  RANK  S WCRD  1. THE  B  5. TO  F  G  H  9 2 1 . 3  1 7 0 6 . 3  2 7 9 0 . 3  1 9 6 7 . 0  266.E  9 0 6 . 0  513.2  5 . 4 2 7  9.195  7 . 6 0 5  9 . 3 9 1  9 0 7 1 .  RMNK  S WORD  6. IN  1 0 7 7 . 0  7 0 4 . 0  1 5 2 . 0  5 5 6 . 0  391.7  7 2 5 . 5  1 1 8 6 . 4  8 3 6 . 4  113.5  3 8 5 . 2  2 . 8 4 8  2 . 6 4 1  4 . 2 0 4  3 8 5 7 .  7.  2 1 8 . 2  4 . 5 2 8  IS  7 3 2 . 0  1 2 3 8 . 0  7 0 3 . C  5 8 . 0  2 4 6 . 0  3 5 9 . 2  6 6 5 . 3  1 0 6 8 . 0  7 6 7 . 0  1 0 4 . 0  3 5 3 . 3  3. 166  3 . 4 0 6  2 . 6 3 7  1 . 6 0 4  3 5 3 7 .  8.  2 0 0 . 1  2 . 0 0 4  THAT  5 4 8 . 0  1 0 2 7 . 0  6 5 9 . 0  3 1 3 . 4  5 8 0 . 5  9 4 9 . 3  6 6 9 . 2  2 . 7 1 6  2 . 4 7 2  111.0  3 0 6 . 0  9 0 . 8 3 . 0 7 0  2 5 5 . 4  4 73. 1  7 7 3 . 6  5 4 5 . 4  74.0  2 5 1 . 2  3 6 0 . 0  5 6 8 . 0  1 1 4 0 . 0  5 6 8 . 0  7 2 . 0  3 0 4 . C  3 2 1 . 8  5 9 5 . 9  9 7 4 . 5  6 8 7 . 0  9 3 . 2  3 1 6 . 4  3 0 8 6 .  9. IT  1.754  3 . 0 1 5  2 . 1 3 1  1.991  179.2  2 . 4 7 6  2 . 2 4 3  10. ARE  1.992  G  1.715  7 1 9 . 0  6 5 1 . 0  6 8 . 0  1 8 2 . 0  2 2 4 . 3  4 1 5 . 3  6 7 9 . 2  4 7 8 . 8  6 5 . 0  2 2 0 . 5  1 . 2 2 0  1 . 9 0 2  2 . 4 4 2  1.831  TOTAL  175.0  2 5 1 5 .  1 4 2 . 3 2 . 5 1 6  1 0 2 . 0  2 2 0 8 .  1 2 4 . 9  1.482  1 . 4 6 7  1 2 0 . 1 7  1 0 2 . 0  2 5 1 . 0  2 9 6 . 0  1 9 2 . 0  4 4 . 0  1 4 8 . 0  108.1  2 0 0 . 1  3 2 7 . 3  2 3 0 . 7  31.3  1 0 6 . 3  1 . 0 8 5  0 . 7 8 3  0 . 7 2 0  1.217  3 1 . 
0  1 0 6 4 .  6 0 . 2  1.205  0 . 4 4 6  5 8 . 4 5  1 0 9 . 0  2 3 6 . 0  3 6 7 . 0  2 7 6 . 0  3 2 . 0  1 0 7 . 0  119.9  2 2 2 . 2  3 6 3 . 3  2 5 6 . 1  3 4 . 7  118.0  1.021  0 . 9 7 1  1.035  0 . 8 8 5  5 4 . 0  1181.  6 6 . 8  0 . 8 7 1  C . 7 7 6  7.14  134.0  1 5 3 . 0  517.0  3 0 4 . 0  2 9 . 0  1 2 9 . 8  2 4 0 . 4  393.1  2/7.1  37.6  1.073  H  1 . 9 8 7  2 8 2 . 0  0 . 8 7 3  3 1 6 8 .  2 . 0 7 6  F  11.95  CHI-SQUARE 156.0  E  2 0 3 . 0  0 . 8 1 7  1 7 4 . 6  2 . 4 9 2  2 8 . 7 1  CHI-SQUARE  1 2 2 . 0  3 0 8 . 2  CHI-SQUAKE  2 . 4 5 6  2 4 4 . 0  CHI-SQUARE  3 1 3 . 0  2 . 3 7 0  6 2 . 0  1.946  100  THE  S  5 3 1 . 0  3 . 2 2 1  1 1 8 . 7 2  T  7 8 5 . 0  1.626  2 2 4 . 0  C  T H E  NINE  4 5 0 . 0  CHI-SQUARE  2 3 8 . 0  E  GRADE  2 6 8 . 0  4 . 4 7 2  1 6 3 . 9 3  J  OF  OF  ACROSS  D  2.147  311.0  B  AREAS  T Y P E S  C  CHI-SQUAKE  7 0 4 . 0  3 . 0 4 5  U  OCCURRENCE  WORO  B  6 . 3 3 9  4 0 5 . 1 8  3 5 2 . 0  2 . 8 8 3  TOTAL  OF  FREQUENT  SUBJECT  5 8 0 . 0  6 . 8 1 1  MOST  S E  D I S T R I B U T I O N  NINE  1 1 5 3 . 0  2 . 5 0 7  .  T  XXXIX  TABLE  THE  2 7 5 . 0  CHI-SQUA/<E  A  C  100  T H E  2 4 5 1 . 0  .2.307  4.  E  GRADE  OF  ACROSS  2 0 5 2 . 0  CHI-SQUARE  AND  J  U F  TYPES  1 5 7 5 . 0  2 . 8 1 9  3 .  AREAS  D  OCCURRENCE  WORO  9 8 5 . 0  CHI-SQUARE  OF  B  C  7 . 8 8 9  2.  U  OF  FREQUENT  0 . 6 6 2  1.367  1.140  7 9 . 0  6 2 . 0  127.6  0 . 0 0 2  1 2 7 8 .  7 2 . 3  0 . 6 4 3  0 . 8 9 1  6 2 . 8 9 CHI-SQUARE  9 5 . 5 3  N) THE THREE FREQUENCY EXPECTED RATIO  AS  L I N E S  OF  F I G U R E S  FOR  EACH  ENTRY  THE THREE FREQUENCY  FREQUENCY X,  CF  F R E C .  CD  R E P R E S E N T :  TO  TOTAL  NO.  OF  WORDS  I N  S U B J E C T  L I N E S  UF  EXPECTED  FREQUENCY  RATIU  i,  AS  UF  F I G U R E S  FREQ.  TO  FOR  TOTAL  EACH  NO.  
OF  ENTRY  WORDS  REPRESENT:  IN  SUBJECT  TABLE  XXXIX  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE NINE  RANK WORD 11. FOR  B  C  28.0  120.1 222.3  363.6  256.3  34.0 118.1 66.9  0.813 1.190 0.829  121.0 224.0  366.4  258.3  35.0 119.0 67.4  1.003 1.031 0.409  15. OR  239.0  32.4 110.1 62.4  0.493  339.0  1.254 1.080 0.387  68.0  0.054  26.0 1102.  18. THIS  87.0 179.0 296.0  175.0  37.0  78.0  194.9  26.4  89.8 50.9  0.783  0.657  1.023 0.635  47.0  19. BY  0.676  25.6  86.9 49.2  0.705  285.0  15".0  56.0  22.0 1075.  109.2 202.2  233.1  31.6 107.4 60.8  1.277 1.069 0.415 0.456 0.316  179.79  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY RATIO AS %, OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  20.  0.765  0.470  0.603  44.0  164.0  29.0  93.0  77.1 142.8 233.5  164.6  22.3  75.8 42.9  0.463  0.553  0.690  0.602  0.757  27.0  39.68 25.0  78.0  66.0 122.3 199.9 141.0  19.1  64.9 36.8  0.381 0.336  0.713 0.691 0.635  37.0  80.74 14.0  70.0  63.7 117.9 192.9 136.0  18.4  62.6 35.5  CHI-SQUARE  650.  0.532  87.0 106.0 159.0 148.0  0.144  759.  0.388  88.0 127.0 190.0  42.6  870.  0.633  110.0 107.0 209.0  0.458  0.421 0.555  0.387  43.0  627.  0.570 0.618  20.30  18.0 222.0 WAS  0.788  TOTAL  13.87  CHI-SQUARE  101.0 113.0 483.0  0.489  188.7  0.697  10.25  330.7  88.4 163.7 267.6  0.841  899.  H  74.0  CHI-SQUARE  91.3 169.1 276.5  G  17.0  105.0  0.374  F  204.0  0.881  153.90  0.774  SUBJECTS D E  CHI-SQUARE  111.9 207.3  CHI-SQUARE  ON  247.80 14.0  0.809  17.  C  70.0 163.0 298.0  0.561  1.549 1.132 0.359  288.0  CHI-SQUARE  WITH  25.0 1191.  118.0 114.0 474.0  0.697  16.  B  56.0 139.0  CHI-SQUARE  AS  WORD  CHI-SQUARE 109.0  0.945  RANK  0.978  390.0  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE NINE  XXXIX  65.91  CHI-SQUARE  14.  0.554  TOTAL  68.0 1182.
232.0  1.922  BE  0.774  68.0  H  221.0  240.0  13.  G  450.0  CHI-SQUARE  YOU  F  159.0 188.0  1.274  12.  SUBJECTS D E  TABLE  25.0  35.0  24.0  30.0  78.8 128.9  90.9  12.3  41.9 23.7  0.960  0.066  0.131 0.664  0.244  65.0  419.  0.935  478.70  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY RATIO AS %, OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  TABLE  XXXIX  RANK WORD 21.  B  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE NINE  SUBJECTS C D E  28.0 244.0 HE  52.9 0.224  CHI-SQUARE 22.  65.0  FROM  13.0  10.0  45.0  98.0 160.3 113.0  15.3  52.0 29.5  1.055 0.444  0.359  0.410 0.428  WHICH  0.387  0.635  25.0  51.1  14.£  50.2 28.5  94.6 154.7 109.1 0.476  0.508  534.  27.  0.240  0.442  25.0  66.0  98.9 161.8 114.1  15.5  52.5 29.8  0.397  48.9  90.5 148.0 104.3  14.1  48.0 27.2  52.1 0.352 CHI-SQUARE  14.0  0.338  0.387  0.538  20.0  526.  29.  48.9 0.961  21.12  CHI-SQUARE  49.0  65.0 156.0 98.0  3.0  49.2  91.0 148.9 105.0  14.2  0.281 0.413 0.368  120.0  YOUR  0.288  0.083  65.0  48.0 484.  48.3 27.4 0.529  0.690  38.38  30. THEY  H  60.0  44.0 CAN  G  18.0  0.256  28.  F  87.0 150.0 91.0  50.0  0.318 0.359  90.0  SUBJECTS C D E  0.376  0.397  0.341 0.498  30.0 481.  0.489 0.431  10.0  92.5 151.3 106.7 14.5 0.532  0.492  0.251 0.277  39.0  35.0  0.318 0.503  44.50 62.0 180.0 149.0 19.0 96.5 157.8 111.2  15.1  0.268  0.525  0.476  0.559  52.0  7.0  513.  51.2 29.0 0.424 0.101  47.27 87.0 196.0  13.0  16.0  35.0  90.5 148.0 104.3  14.1  48.0 27.2  0.376  0.518 0.049  0.442  14.0 481.  0.285 0.201  209.47 88.0 200.0  67.0  11.0  34.0  45.9  85.0 139.0 98.0  13.3  45.1 25.6  CHI-SQUARE  492.  49.1 27.8  27.0  0.216  TOTAL  6.47  32.0 123.0 186.0 67.0 NOT  503.  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE NINE  45.0  0.360  33.83  0.497  B  CHI-SQUARE 39.0  CHI-SQUARE  ONE  18.21 16.0  0.392  26.
0.359  57.0 110.0 192.0 64.0  CHI-SQUARE  WORD  CHI-SQUARE  53.3 30.2  0.569  RANK  0.367 0.187  15.7  53.4  521.  XXXIX  330.08  71.0 115.0 150.0  25.  0.277  54.2 100.4 164.3 115.8  CHI-SQUARE  AT  0.049  TOTAL  13.0  78.0  0.457  24.  H  14.0  CHI-SQUARE  HAVE  G  83.0 155.0 114.0  0.521  23.  168.0  F  TABLE  0.381 0.529  0.251 0.304  0.277  25.0  452.  0.359  47.60  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY RATIO AS %, OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  TABLE  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE NINE  XXXIX  RANK  SUBJECTS C D E  B  WORD  40.0 83.0 28.0  31.  WE  22.2  41.2 67.4 47.5  0.320  41.3  27.0 27.0 6.4  H  TOTAL  6.0  219.  21.9 12.4  CHI-SQUARE  6.0  0.722  8.0  20.0 22.0 407.  36.  0.397 0.023 0.221 0.163  CHI-SQUARE 34.  ALL  IF  BUT  0.400 CHI-SQUARE  0.242 0.571 0.353 0.332  AN  CHI-SQUARE  0.394 0.257 0.353 0.364 0.367 0.388 10.78  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY RATIO AS %, OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  3.0  35.0 36.0 356.  77.2 10.5 35.6  20.1  THESE  79.98  45.0 44.0 84.0 68.0  7.0  47.0 24.0 319.  32.4 60.0 98.1 69.2  9.4  31.9 18.0  0.360 0.190 0.222 0.255 0.194 0.383 0.345 CHI-SQUARE  41.5 76.9 125.8 88.7 12.0 40.9 23.1 0.352  39.  0.309 0.158  11.0 45.0 27.0 409.  39.0 19.0 366.  0.136 0.502 0.283 0.158 0.083 0.285 0.518  53.01  44.0 91.0 97.0 94.0  35.  36.2 67.0 109.5  0.358 6.187  47.6 27.0  2.0  27.10  17.0 116.0 107.0 42.0  38.  CHI-SQUARE  48.4 89.7 146.7 103.4 14.0  TOTAL  0.400 0.337 0.341 0.184 0.055 0.318 0.273  0.316  38.0 11.0 477.  H  37.2 68.8 112.6 79.4 10.8 36.6 20.7  CHI-SQUARE  12.0  G  13.60  50.0 78.0 129.0 49.0  92.51  50.0 56.0 216.0 94.0  F  43.1 79.8 130.4 91.9 12.5 42.3 24.0  37.
201.53  0.169 0.487 0.394 0.332  SUBJECTS C D E  0.272 0.290 0.410 0.379 0.415 0.293 0.230  50.3 93.1 152.3 107.3 14.6 49.4 28.0 0.785  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES. ACROSS THE SUBJECT AREAS OF GRADE NINE  34.0 67.0 155.0 101.0 15.0 36.0 16.0 424.  WHEN  12.0 40.7 23.0  98.0 39.0 184.0 105.0 12.0 44.0 13.0 495. WILL  B  CHI-SQUARE  76.6 125.2 88.3  0.272  33.  RANK WORD  182.54  34.0. 167.0 150.0 HIS  G  XXX IX  0.359 0.074 0.030 0.747 0.220 0.086  CHI-SQUARE 32.  8.0  F  TABLE  •0.  45.0 MAY  20.98 18.C 213.0 50.0  6.0  18.0  36.4 67.3 110.1 77.6 10.5 35.8 0.36U CHI-SQUAKE  0.078  0.563 0.188 0.166 0.147  8.0  353.  20.3 0.115  162.34 ro  THE THREt LINES UF FIGURES FUR EACH ENTRY REFRESENT: FREOUENCY EXPECTED FREQUENCY RATIO AS %, OF FREQ. TO TOTAL NO. OF WORDS IN SUBJECT  co  J>  T A B L E  IX  XXX  D I S T R I B U T I O N MOST  SUBJECT  RANK  S WORD  4 1 .  B  U  C  OF  FREQUENT  B  AREAS  J  E  C  D  OCCURRENCE  WORD OF  T  TYPES  GRADE  OF  5 . 0  1 6 . 0  3 0 . 6  5 6 . 6  9 2 . 6  6 5 . 3  8 . 9  3 0 . 1  CHI-SQUARE 4 2 . HAS  8 5 . 0  8 0 . 0  8 . 0  3 0 . 0  30.1  55.7  91.1  6 4 . 2  8 . 7  2 9 . 6  0 . 2 1 2  5 0 . 4  0 . 2 2 1  2 5 . 0  8 . 0  8 2 . 4  5 8 . 1  7 . 9  2 6 . 8  15.2  0 . 0 2 4  0 . 0 3 8  0 . 0  0 . 2 0 4  1 3 . 0  . 2 7 . 0  2 7 . 3  5 0 . 6  8 2 . 7  5 8 . 3  7.9  2 6 . 9  0 . 1 6 8  0 . 2 6 4  WERE  4 8 .  9 2 . 0 0 . 3 4 6  OF  0 . 2 2 1  0 . 3 6 0  14.0  HAD  FREQUENCY  RATIO  %,  OF  0 . 2 2 0  4 6 . 0  0 . 0  17.0  6 4 . 8  8 . 8  2 9 . 9  0 . 1 7 3  2 6 9 .  5 7 . 2  T H E I R  0 . 2 0 1  0 . 0  19.0 16.9  0 . 1 3 8  $0. USED  0 . 2 7 3  FOR  EACH  ENTRY  R E P R E S E N T :  TO  TOTAL  NO.  OF  WORDS  I N  SUBJECT  GRADE  T S E  OF  100  THE  NINE  F  G  H  19.0  4 6 . 0  0 . 0  17.0  6 4 . 8  8 . 8  2 9 . 9  0.173  T H E  ACROSS  0 . 0  TOTAL  2 9 9 .  16.9  0 . 1 3 8  0 . 2 7 3  106.0  6 9 . 0  7.0  2 6 . 0  9 3 . 5  6 5 . 9  8 . 9  3 0 . 4  0 . 2 8 0  0 . 2 5 9  2 3 . 0  3 0 4 .  17.2  0 . 
1 9 4  0 . 2 1 2  5 . 0  0.331  7.41 7 4 . 0  18.0  2 8 . 0  5 . 0  33.5  5 4 . 8  3 8 . 6  5.2  0 . 3 2 0  0 . 0 4 8  0 . 1 0 5  0 . 1 3 8  178.  10.1  0 . 0 4 1  0.561  1 7 3 . 4 6  1 3 0 . 0  2 7 . 0  19.0  1.0  2 5 . 0  2 2 . 2  41.2  6 7 . 4  4 7 . 5  6 . 4  2 1 . 9  0 . 5 6 2  0 . 0 7 1  0.071  0 . C 2 8  14.0  2 1 9 .  12.4  0 . 2 0 4  C . 2 0 1  2 5 4 . 6 1  6 . 0  6 0 . 0  9 5 . 0  1 8 . 0  4 . 0  1 4 . 0  2 3 . 0  4 2 . 5  6 9 . 5  4 9 . 0  6 . 6  2 2 . 6  0 . 0 4 8  3 9 . 0  17.8  3 . 0  0 . 0 2 4  2 9 9 .  C  0 . 3 4 6  0 . 2 0 8  16.1  4-9.  E  9 2 . 0  0 . 2 5 9  CHI-SQUARE  F I G U R E S  F R E Q .  3 0 . 9  0 . Q 7 2  15.2  FKEQUENCY E X P E C T E D  4 8 . 0  9 . 0  0.115  4 0 . 4 5  L I N E S  2 5 . 0  0 . 2 0 0  2 6 8 .  J  OF  TYPES  4 0 . 4 5  CHI-SQUARE  131.0  0 . 2 1 2  C H I - S Q U A R E  AS  47.  1 3 . 4 6  5 6 . 2  THREE  2 9 6 .  B  OCCURRENCE  WORD  AREAS  131.0  0 . 2 1 2  CHI-SQUARE 5 9 . 0  3 0 . 4  5 6 . 2  0 . 2 1 6  0 . 0  1 0 0 . 0  4 9 . 0  3 0 . 4  U  CHI-SQUARE  3 4 . 0  21.0  4 9 . 0  0 . 1 6 8  16.7  0 . 2 4 4  2 2 . 0  CHI-SQUARE  THfc  0 . 3 0 0  15.0  1 0 . 0  0.147  S D  C  21.0  0 . 2 8 8  6 7 9 . 2 4  0 . 1 7 6  SOME  MORE  17.0  9 . 0  0 . 9 3 0  CHI-SQUARE  4 5 .  0 . 2 2 5  2 1 5 . 0  0 . 0 0 8  OTHER  4 6 .  5 . 3 8  2 7 . 2  44.  3 0 1 .  B  CHI-SQUARE  4 9 . 0  1.0  2 0 . 0  0 . 1 3 0  2 9 . 0  CHI-SQUARE  I  0 . 1 3 8  WORD  TOTAL  19.21  0 . 2 3 2  4 3 .  0 . 2 7 8  H  OF  FREQUENT  SUBJECT  G  7 4 . 0  0 . 2 3 3  D I S T R I B U T I O N  RANK F  8 8 . 0  0 . 3 2 9  XXX/X  MOST  S  E  7 6 . 0  0 . 1 7 6  TABLE  100  THE  N I N E  2 2 . 0 THERE  THE  ACROSS  0 . 2 5 1  0 . 0 6 8  O.lll  2 9 . 0  2 2 5 .  12.8  0 . 1 1 4  0 . 4 1 7  7 3 . 5 4  ts) CO THE  THREE  L I N E S  OF  FIGURES  FREQUENCY EXPECTED FREQUENCY RATIO AS S, OF FREQ.  TO  FOR  TUTAL  EACH  NO.  OF  ENTRY  WORDS  R E P R E S E N T :  I N  SUBJECT  JJ,  XXX'X  TABLE  D I S T R I B U T I O N MOST  SUBJECT  RANK  S WCRD  B  51. MANY  U  31.5  5 8 . 
3  TABLE  6 7 . 2 0 . 4 3 5  H  MOST  TOTAL  9 . 0  17.0  5.0  9.1  3 1 . 0  17.5  0 . 2 4 9  0 . 1 3 8  2 4 . 9  4 6 . 1  75.4  53.1  7.2  2 4 . 5  0 . 1 8 8  0 . 2 5 1  0 . 6 9 1  19.0  SHOULD  2 4 5 .  57.  0.138  WHAT  0 . 2 7 3  4 8 . 0  8.C  2 2 . 0  5 . 0  2 5 . 3  4 6 . 8  7 6 . 6  5 4 . 0  7.3  2 4 . 9  14.1  0 . 2 5 5  0 . 2 3 5  0 . 1 8 0  0.221  0.179  2 4 9 .  58. THAN  0 . 0 7 2  2 4 . 0  8 4 . 0  2 6 . 0  2 6 . 0  3 4 . 0  9 . 0  2 5 . 7  4 7 . 6  77.8  5 4 . 9  7.4  2 5 . 3  14.3  0 . 1 0 4  0 . 2 2 2  0 . 0 9 8  0 . 7 1 9  0 . 2 7 7  2 5 3 .  5 9 . BEEN  0.129  3 8 . 0  4 5 . 0  6 6 . 0  2 2 . 9  4 2 . 3  6 9 . 2  4 8 . 8  0 . 1 6 4  CHI-SQUARE  0.119  0 . 2 4 8  11.0  2 5 . 0  6 . 6  16.0  2 2 . 5  0 . 3 0 4  12.7  0 . 2 0 4  L I N E S  OF  225.  6 0 . INTO  0 . 2 3 0  19.06  IHREE  4 3 . 0  2 6 . 9  4 9 . 8  81.5  57.5  7.8  0 . 2 5 5  FOR  EACH  ENTRY  R E P R E S E N T :  3 3 . 0  2 0 . 0  164.0  2 7 . 0  5 0 . 0  81.8  0 . 0 8 6  FREOUENCY  100  THE  N I N E  F  0 . 1 2 0  G  H  33.0.  17.0  2 6 . 5  1.189  TOTAL  15.0  0 . 2 6 9  0 . 2 4 4  0 . 4 3 4  3 2 . 0  5.0  1 0 . 0  2 . 0  57.7  7.8  2 6 . 6  15.1  0 . 1 2 0  0.138  0 . 0 8 1  8 3 . 0  35.0  10.0  8 . 0  3 5 . 0  5.0  19.7  36.5  59.7  42.1  5.7  19.4  11.0  0 . 3 5 9  0 . 0 9 3  0 . 0 3 8  0.221  0 . 2 8 5  Xt  OF  FREQ.  1 1 0 . 8 4  14.0  3 7 . 0  76.0  5 2 . 0  9.0  2 1 . 0  8 . 0  2 2 . 0  4 0 . 8  66.7  47.1  6.4  2 1 . 7  12.3  0.160  0 . 2 0 1  0.195  0 . 2 4 9  0.171  TOTAL  NO.  OF  WORDS  I N  SUBJECT  5 6 . 0  4 8 . 0  3 3 . 0  7.0  2 4 . 0  2 0 . 0  37.1  6 0 . 6  4 2 . 7  5.8  19.7  RATIO  AS  217.  0.115  7.68  19.0  EXPECTED TO  194.  0 . 0 7 2  0 . 2 4 2  0.127  0 . 1 2 4  0.194  10.0  197.  11.1  0.195  0.144  15.88  L I N E S  OF  FIGURES  FOR  EACH  ENTRY  R E P R E S E N T :  FREQUENCY FREQUENCY  2 6 6 .  0 . 0 2 9  18.0  THREE  2 6 5 .  1 3 5 . 9 9  0.152  THE  THE  1 8 0 . 8 1  CHI-SQUARE  FIGURES  E  0.159  OF  ACROSS  S  3 2 . 0  CHI-SQUARE  2 4 . 0  T  6 0 . 
0  0.112  1 0 1 . 6 2  C  GRADE  5 9 . 0  CHI-SQUARE  5 0 . 0  E  OF  TYPES  21.0  0.144  14.19  J  AREAS  D  CHI-SQUARE  8 9 . 0  B  OCCURRENCE  WORD  C  0 . 2 6 4  6 0 . 8 5  U  OF  FREQUENT  B  0.168  13.9  5 9 . 0  AS  56.  0 . 0 7 2  18.0  RATIO  310.  S WORD  CHI-SQUARE 17.0  0 . 1 3 4  DISTRIBUTION SUBJECT  G  2 5 . 0  EXPECTED  XXXIX  RANK F  6 7 . 0  0.192  THE  100  THE  S  71.0  CHI-SQUARE  ABOUT  T H E  N I N E  3 1 . 0  0 . 4 0 0  55.  GRADE  OF  ACROSS  15.0  CHI-SQUARE  TWO  TYPES  7 5 . 4 7  0.144  54.  T  116.0  0 . 3 0 4  CHI-SUUARE  EACH  C  OF  E  9 5 . 4  0 . 1 1 2  0 . 1 2 0  53.  E  115.0  CHI-SQUARE  SO  J  OCCURRENCE  WORD  AREAS  D  2 6 . 0  0.176  5 2 .  B  C  2 2 . 0  OF  FREOUENT  FREQUENCY %,  OF  F R E Q .  TO  TOTAL  NO.  OF  WORDS  I N  SUBJECT  00  RANK  WORO  61. THEM  B  C  S U B J E C T S 0 E  USE  1.0  24.0  15.0  24.1  44.6  72.9  51.4  7.0  23.7  13.4  0.247  20.0  6.0  15.0  8.0  22.3  41.4  67.7  47.7  6.5  22.0  12.4  0.247  0.241  0.075  0.166  85.0  43.0  20.0  21.0  7.0  25.7  47.6  77.8  54.9  7.4  25.3  14.3  0.225  0.161  0.553  THEN  253.  68. TIME  124.0  61.0  5.0  11.0  4.0  25.9  48.0  78.4  55.3  7.5  25.5  14.4  0.328  0.229  0.138  0.090  255.  69. ITS  11.0  24.4  45.1  73.8  52.0  7.1  24.0  13.6  0.255  27.0  61.0  81.0  25.0  12.0  22.0  6.0  23.8  44.0  72.0  50.7  6.9  23.4  13.2  0.214  0.094  0.332  0.179  234.  0.086  29.03  The THREE L I N E S OF F I G U R E S FOR EACH ENTRY R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R A T I O A S Z, O F F R E Q . T O T O T A L N O . O F W O R D S I N S U B J E C T  70. WOULD  0.138  0.281  0.111  0.106 0.158  17.0  34.0  65.0  37.0  7.0  20.0  14.0  21.7  40.3  65.8  46.4  6.3  21.4  12.1  0.147  0.225  0.139  0.194  9.96 51.0  55.0  57.0  5.0  26.0  7.0  24.2  44.8  73.2  51.6  7.0  23.8  13.5  0.221  0.145  0.214  0.138  238.  
0.212 0.101  16.65  27.0  33.0  94.0  26.0  4.C  18.0  16.0  22.1  41.0  67.1  47.3  6.4  21.8  12.3  0.143  0.249  0.098  0.111  0.147  213.  0.230  25.68  9.0  44.0  36.0  48.0  0.0  35.0  13.0  18.8  34.8  56.9  40.1  5.4  18.5  10.5  CHI-SQUARE  214.  0.163 0.201  37.0  0.072  240.  27.78  CHI-SQUARE  56.69  0.264  13.0  0.216  0.058  TOTAL  4.0  CHI-SQUARE  23.0  H  75.0  0.296  30.02  G  52.0  0.136  0.171 0.101  27.0  CHI-SQUARE  67.  F  59.0  CHI-SQUARE  46.0  0.216  220.  0.122 0.115  31.0  0.099  S U B J E C T S 0 E  26.0  0.208  33.88  0.199  C  CHI-SQUARE  91.0  CHI-SQUARE  UP  SUCH  0.195 0.216  57.0  0.216  65.  0.028  B  66.  28.92  CHI-SQUARE  DO  0.278  23.0  0.248  64.  0.138  237.  D I S T R I B U T I O N OF OCCURRENCE O F T H E 1 0 0 MOST F R E Q U E N T WORD T Y P E S A C R O S S T H E S U B J E C T AREAS OF GRADE N I N E  XXXIX.  WORO  TOTAL  74.0  CHI-SQUARE  MAKE  H  52.0  0.184  63.  G  57.0  CHI-SQUARE 62.  RANK F  14.0  0.112  TABLE  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST F R E Q U E N T WORO T Y P E S A C R O S S T H E SUBJECT AREAS OF GRADE NINE  X X X / X  TABLE  0.190  0.095  0.180  0.0  0.285  185.  0.187  37.59  THE THREE LINES OF FIGURES FOR EACH ENTRY R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R A T I O A S %, O F F R E Q . T O T O T A L N O . O F W O R D S I N S U B J E C T  Jj  TABLE  RANK  XXXIX  WORO  71. HOW  B  D I S T R I B U T I O N OF OCCURRENCE OF T H E XOO M O S T F R E O U E N T WORD T Y P E S A C R O S S T H E SUBJECT AREAS OF GRADE NINE  C  S U B J E C T S D E 27.0  10.0  30.0  6.0  16.5  30.5  49.8  35.1  4.8  16.2  9.2  0.208  31.0  2.0  20.7  38.4  62.8  44.2  6.0  20.4  11.5  0.173 0.127  0.113 0.940  0.252  76. ONLY  204.  77. NO  15.0  10.0  18.0  15.0  3.0  10.0  18.4  30.1  21.3  2.9  9.8  5.5  0.498  0.122  0.043  0.038  98.  78. MUST  24.0  77.0  83.0  1.0  5.0  4.0  21.3  39.5  64.6  45.5  6.2  21.0  11.9  0.311  0.028  0.041  210.  79. 
WATER  0.058  57.0  52.0  44.0  3.0  19.0  10.0  20.4  37.8  61.8  43.6  5.9  20.1  11.4  0.247  0.138  2.0  9.0  11.0  17.6  32.5  53.2  37.5  5.1  17.3  9.8  0.134  0.165 0.083  201.  0.155 0.144  13.92  THE THREE L I N E S OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY R A T I O A S tt O F F R E Q . T O T O T A L N O . O F W O R O S I N S U B J E C T  80. ALSO  0.148  0.203  0.055  0.073  34.0  52.0  23.0  4.0  21.0  15.0  16.4  30.3  49.5  34.9  4.7  16.1  9.1  0.147  0.138  0.086  0.111  151.  0.171 0.216  11.23  20.0  76.0  59.0  15.0  2.0  15.0  4.0  19.5  36.1  59.1  41.6  5.6  19.2  10.9  0.329  0.156 0.056  0.055  0.122  192.  0.058  68.70  36.0  29.0  54.0  77.0  0.0  18.0  2.0  21.9  40.6  66.4  4 6 . »  6.4  21.6  12.2  0.125  0.143  0.289  0.0  0.147  216.  0.029  49.59  1.0  15.0  48.0  29.0  5.0  94.0  6.0  20.1  37.2  60.9  42.9  5.8  19.8  11.2  CHI-SQUARE  173.  0.158  12.0  0.008  TOTAL  16.72  CHI-SOUARE  16.0  H  54.0  0.288  62.35  G  56.0  CHI-SQUARE  16.0  F  31.0  0.1.60  163.07  0.204  S U B J E C T S D E  CHI-SQUARE  3.0  0.104  C  10.0  0.096  152.33  0.013 0.040  8  0.080  0.029  33.0  CHI-SQUARE  WORD  DISTRIBUTION OF OCCURRENCE OF T H E 1 0 0 MOST F R E O U E N T WORD T Y P E S A C R O S S T H E S U B J E C T AREAS OF GRADE NINE  CHI-SQUARE 34.0  0.128  162.  RANK  0.086  30.0  CHI-SQUARE  MOST  0.244  48.0  0.128  75.  0.277  40.0  CHI-SQUARE  OUT  0.101  19.0  0.264  74.  0.061  TOTAL  XXXIX  45.20  Chl-SOUARE  MADE  .  23.0  0.152  73.  H  48.0  CHI-SQUARE  NJK.8ER  G  18.0  0.144  72.  F  TABLE  0.065  0.127 0.109 0.138  0.766  198.  0.066  319.80  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY R A T I O A S %, O F F R E Q . T O T O T A L N O . O F W O R D S I N S U B J E C T  CO CO  TABLE  XXX/X  DISTRIBUTION OF OCCURRENCE OF T H E 1 0 0 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E S U B J E C T AREAS O F GRADE NINE  RANK WORO 81. 
FIRST  8  C  S U B J E C T S D E 57.0  61.0  0.0  15.0  11.0  17.7  32.7  53.5  37.7  5.1  17.4  9.8  0.074  4.0  12.0  12.0  14.4  26.7  43.7  30.8  4.2  14.2  8.0  0.095  SAME  0.106  0.135  0.111  0.098  47.0  38.0  0.0  22.0  6.0  15.9  29.5  48.3  34.0  4.6  15.7  8.9  0.147  0.124  0.143  0.0  0.179  142.  87. WHO  157.  88. ANY  77.0  27.0  0.0  4.0  6.0  17.1  31.6  51.7  36.4  4.9  16.8  9.5  0.101  0.0  0.033  25.0  5.0  13.9  25.8  42.1  29.7  4.0  13.7  7.8  0.048  1.0  1.0  10.0  2.0  16.3  30.1  49.2  34.7  4.7  16.0  9.1  0.004  0.028  0.081  0.029  122.51  0.131  0.221  0.204  41.0  5.0  22.0  8.0  11.0  8.0  9.6  17.9  29.2  20.6  2.8  9.5  5.4  0.083  0.221  0.090  0.177  0.013  70.98  22.0  41.0  45.0  9.0  2.0  4.0  9.0  13.4  24.8  40.6  28.6  3.9  13.2  7.5  0.177  0.119 0.034  0.055  0.033  37.59 25.0  4.0  13.0  4.0  BECAUSE  12.8  23.7  38.8  27.3  3.7  12.6  7.1  0.104  0.111  0.094  0.111  0.106  0.058  23.0  59.0  22.0  3.0  13.0  8.0  14.6  27.1  44.3  31.2  4.2  14.4  8.1  CHI-SQUARE  125.  2.00  16.0  0.128  132.  0.129  42.0  SEE  95.  0.115  24.0  90.  137.  0.072  0.0  0.112  160.  0.103  TOTAL  23.89  CHI-SOUARE  63.0  0.167  8.0  14.0  32.90 76.0  H  89.  0.086  7.0  G  35.0  0.176  168.  F  39.0  CHI-SQUARE  31.0  0.329  S U B J E C T S D E  11.0  0.0  0.086  23.0  0.204  C  14.0  0.112  11.49  0.134  B.  CHI-SQUARE  34.0  CHI-SQUARE  COULO  0.173  10.0  0.056  86.  4.50  CHI-SQUARE 85.  WORD  CHI-SQUARE 36.0  0.184  174.  RANK  0.122 0.158  40.0  CHI-SQUARE  HIM  0.0  22.0  0.080  84.  0.229  TOTAL  DISTRIBUTION OF OCCURRENCE OF T H E 1 0 0 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E S U B J E C T AREAS OF GRADE NINE  XXXIX  28.95  CHI-SQUARE  GOOD  0.151  16.0  0.128  83.  H .  17.0  CHI-SQUARE  VERY  G  13.0  0.104  82.  F  TABLE  0.099  0.156  0.083  0.083  144.  0.106 0.115  8.85 M  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FRCQUENCY EXPECTED FREQUENCY R A T I O A S X, O F F R E Q . 
T O T O T A L N O . O F W O R D S I N S U B J E C T  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY R A T I O A S Z, O F F R E Q . T O T O T A L N O . O F W O R D S I N S U B J E C T  00 ^  TABLE  RANK  XXX'X  WORO  91. LIKE  B  S U B J E C T S  RANK  D  20.0  10.0  13.5  25.0  40.9  28.8  3.9  13.3  7.5  0 . 1 6 9 0.074  17.0  9.0  13.1  24. 3  39.7  28.0  3.8  12.9  7.3  0 . 1 7 3 0 . 1 0 6 0.038  0.111  0.138  129.  25.0  25.0  0.0  16.0  10.0  11.8  21.8  35.7  25.2  3.4  11.6  6.6  0.094  0.0  0.130  33.0  6.0  1.0  2.0  29.0  10.6  19.6  32.0  3.1  10.4  5.9  0.117.  0.087  0.023  0.028  0.016  25.0  2.0  THROUGH  12.5  23.1  37.8  26-7  3.6  12.3  7.0  0.104  97.  98. NEW  99. SMALL  15.0  34.0  13.0  17.0  7.0  11.3  20.9  34.1  24.1  3.3  11.1  6.3  0.040  0.128 Q.360  111.  0 . 1 3 8 0.101  49.11  THE THREE L I N E S OF F I G U R E S FOR EACH ENTRY REPRESENT: FKEwUENCY EXPECT ED FREQUENCY R A T I O AS S , OF F R E Q . TO T O T A L N O . OF WORDS I N SUBJECT  OVER  0.029  29.0  49.0  1.0  24.0  11.0  13.8  25.6  41.8  29.5  4.0  13.6  7.7  0.074  0.077  0.184 0.C28  37.00 16.0  50.0  45.0  6.0  9.0  7.0  16.6  30.7  50.1  35.3  4.8  16.3  9.2  0.069  0.132 0.169 0 . 1 6 6 0.073  24.66 21.0  20.0  12.0  2.0  8.0  11.0  8.6  16.0  26.1  18.4  2.5  8.5  4.8  0.055  0.065  0.091  0.053  0.045  85.  0.158  14.01  6.0  9.0  34.0  34.0  1.0  20.0  13.0  11.9  22.0  36.0  25.4  3.4  11.7  6.6  CHI-SQUAKE  163.  0.101  11.0  0.048  136.  0.1950.158  30.0  0.088  100.  0.204  123.  21.84  CHI-SQUARE  18.0  0.116 0.055  17.0  0.2-40  115.95  0.078  0.061 0.095  5.0  0.040  0.417  7.0  TOTAL  H  2.0  CHI-SQUARE  22.6 -  G  31.0  0.144  104.  F  36.0  WORK  116.  E  14.0  10.13 27.0  D  CHI-SQUARE  21.0  6.0  S U B J E C T S  13.0  0.129  12.0  0.066  C  96.  24.76  0.091  B  CHI-SQUARE 4.0  CHI-SQUARE  133.  
WORD  19.58 10.0  0.056  TOTAL  D I S T R I B U T I O N OF OCCURRENCE OF T H E 1 0 0 MOST FREQUENT WORD T Y P E S ACROSS THE S U B J E C T AREAS OF GRADE NINE  XXX'X  0.144  40.0  CHI-SQUARE  PLACE  0.138 0.163  40.0  0.043  95.  0.075  9.0  CHI-SOUARE  CALLED  H .  5.0  0.096  94.  G  20.0  CHI-SQUARE  PEOPLE  F  28.0  0.072  93.  E  39.0  CHI-SQUARE  MUCH  TABLE  11.0  0.088  92.  C  D I S T R I B U T I O N OF OCCURRENCE OF T H E 1 0 0 MOST FREOUENT WORD T Y P E S ACROSS THE S U B J E C T AREAS OF GRADE NINE  0.039  0.090  0.128 0.028  0.163  117.  0.187  27.44  THE THREE L I N E S OF F I G U R E S FOR EACH ENTRY REPRESENT: FREQUENCY E X P E C T E D FREQUENCY R A T I O AS tt OF F R E O . TO TOTAL NO. OF WOROS IN S U B J E C T  O  TABLE  XXXX  S U B  RAN< WORD 1. THE  606.0  591.9 661.7 6.248  7.085  CHI-SQUAKE 2. OF  1230.0  0.0  0.0  549. 3 1205.6  0.0  0.0  7.324  1755.0  4589.  7.893  IN  306.1  342.2  0.0  0.0  284.0  623.4  817.3  0.0  0.0  3.718  4.229  2373.  7. IS  0.0  0.0  99.0  320.0  188.1 210.2  0.0  0.0  174.5  383.0  0.0  0.0  1.394  580.0 1 4 5 8 .  8. THAT  502.1  2.054  189.0 214.0  0.0  0.0  233.0  4 1 6 . 0 347.0 1 3 9 9 .  180.5 201.7  0.0  0.0  167.5  367.5  0.0  0.0  3.282  9.  184.1 205.8 3.124 CHl-SQUARc  2.257  0.0  O.C 167.5 367.5  0.0  0.0  1.941  IT  481.8  2.670  0.0  178.0  356.0  0.0  0.0  170.8  374.9  0.0  0.0  2.507  461.0  1 4 2 7 .  491.5  2.285  2.257  20.33  THE THREE L I N E S O F F I G U R t S F O R EACH ENTRY REPRESENT: FREwUEN'CY EXPECTED FREQUENCY R A T I O A S Z, O F F R E Q . T O T O T A L N O . O F W O R D S I N S U B J E C T  10. ARE  2.056  577.0 1 3 9 9 . 481.8  2.355  2.825  0.0  0.0 191.0 258.0  192.0 8 3 6 .  107.8 120.5  0.0  0.0 100.1 219.6  287.9  0.0  0.0  0.561  2.690  1.656  C.940  178.45  9 8 . 0 9 4 . 0  0.0  0.0 130.0  180.0 133.0 6 3 5 .  8 1 . 9 9 1 . 6  0.0  0.0  166.8 218.7  1.099 0.0  0.0  76.0  1.631  1.155 0.651  76.20  119.0  6 5 . 
1 72.8  CHI-SQUAKE  0.0  367.0  TOTAL  35.66  0.876  1.699  H  1 4 6 . 04 8 . 0  67.0  7 0 . 9 1 193.0  180.5 201.7  Chl-SQUAKE  75.84  G  0.0 146.0  1.281  2.839  F  0.0  CHI-SQUARE  183.0 276.0  2.502  S U B J E C T S D E  143.0 166.0  1.903  4-920  123.12  3.227  C  CHI-SQUARE 1005.0  2.221  B  1.869  8.591  659.0  239.0  WORD  6.  I5e0.4  264.0  CHI-SQUARE  TO  520.0  0.0  2.470  5.  0.0  0.0  CHI-SOUARE  A  0.0  190.0  2.392  4.  TOTAL  H  254.0-  CHI-SQUARE  AND  RANK  E C T S E  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 MOST F R E O U E N T WORD T Y P E S A C R O S S T H E S U B J t C T AREAS O F GRADE T E N  X X X X  47.9o  3.320  3.  J D  B 478.0  TABLE  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 MOST F R E O U E N T WORD T Y P E S A C R O S S T H E SUBJECT AREAS O F GRADE T E N  1.391  0.0  0.0  57.0  144.0 118.0 5 0 5 .  0.0  0.0  60.4  132.7 173.9  0.0  0.0  0.803  0.924  0.578  48.48  7 9 . 0 16.0  0.0  0.0  76.0  143. C  5 1 . 3 5 7 . 4  0.0  0.0  47.6  104.6 137.1  0.0  0.0  1.033 CHI-SQUAKE  0.187  1.070  8 4 . 0  3 9 3 .  0.918 0.411  96.32  to THE THREE L I N E S O F F I G U R E S FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY R A T I O A S S , O F F R E Q . T O T O T A L NO. O F WORDS I N S U 8 J E C T  DISTRIBUTION  XXXX  TABLE  MOST  SUBJECT  RANK  S WORD  B  11. FOR  U  C  B  AREAS  J  E  C  OCCURRENCE  WORD OF  T  TYPES  GRADE  OF  THE  ACROSS  E  G  H  TOTAL  0.0  33.0  94.0  159.0  59.3  66.3  0.0  0.0  55.1  120.8  158.4  0.0  0.0  0.603  16.  45.7  51.0  1.647  0.585  U  0.0  68.0  91.0  0.0  0.0  42.4  93.0 121.9  0.0  0.0  0.958  19.0  0.584  J  E  C  T  OF  GRADE  E  F  40.9  45.7  0.0  0.0  37.9  0.0  0.0  ON  H  101.0  93.0  83.3  0.352  TOTAL  317.  109.2  0.648  0.455  27.55  55.0  59.0  C O  0.0  43.C  73.0  130.0  46.4  51.9  0.0  0.0  43.1  94.6  124.0  0.0  0.0  0.719  0.093  THE  G  25.0  17.  100  THE  S  D  0.0  0.813  OF  ACROSS  TEN  0.0  0.366  354.  
AREAS  TYPES  70.0  CHI-SQUARE  0.0  B  C  OCCURRENCE  WORD  28.0  0.778  243.79  CHI-SQUARE  B  WITH  42.46  126.0 . 50.0 YDU  0.465  460.  S WORD  OF  FREQUENT  SUBJECT  RANK F  0.0  CHI-SQUARE  DISTRIBUTION MOST  S  D  X X X X  TEN  75.0  0.877  TABLE  100  THE  99.0  1.294  12.  OF  FREQUENT  0.690  CHI-SQUAKE  0.606  0.468  360  0.636  7.76  i  13.  BE  73.0  31.0  0.0  0.0  58.0  112.0  77.0  45.3  50.6  0.0  0.0  42.0  92.2  120.9  0.0  0.0  0.954  0.352  AS  0.0  0.0  64.0  129.0  183.0  62.6  69.9  0.0  0.0  53.1  127.4  167.0  0.0  0.0  0.713  0.828  0.0  38.0  71.0  42.0  26.6  29.7  0.0  0.0  24.7  54.1  70.9  0.0  C O  OF  EXPECIEO  FREOUENCY  RATIO  %,  OF  61.0  92.0  48.8  54.5  0.0  0.0  45.2  99.3  0.0  0.0  0.316  0.535  0.456  206.  20. WAS  FOR  EACH  ENTRY  REPRESENT:  TO  TOTAL  NO.  OF  WORDS  IN  SUBJECT  THE  0.590  0.685  34.0  0.0  0.0  36.0  93.0  144.0  44.6  49.9  0.0  0.0  41.4  90.9  119.2  C O  C O  0.398  0.507  0.597  346  0.705  11.71  6.0  127.0  0.0  0.0  21.0  68.0  191.0  53.3  59.6  0.0  0.0  49.4  108.5  142.2  C O  0.0  THREE  378  13C.2  39.0  0.078  0.206  0.859  140.0  22.40  1.485  0.296  0.436  413  0.935  166.53  CHI-SQUARE  FIGURES  FREQ.  0.0  0.510  FRECUENCY AS  19. BY  33.00 LINES  0.0  CHI-SQUAKE  0.0  THREE  485.  0.896  18.0  CHI-SQUARE THE  0.901  37.0  0.210  27.0  0.758  6.60  0.484  58.0  CHI-SOUARE  61.0  CHI-SQUARE  OR  THIS  ol 377  48.0  0.627  15.  0.719  18.  50.84  CHI-SQUARE  14.  0.817  351.  LINES  OF  FIGURES  FOR  EACH  ENTRY  REPRESENT:  FREQUENCY EXPECTED  FREQUENCY  RATIO  %,  AS  OF  FREQ.  TO  TOTAL  NU.  OF  WOROS  IN  SUBJECT  TA8LE  XXXX  D I S T R I B U T I O N O F OCCURRENCE O F T H E1 0 0 H O S T F R E O U E N T WORD T Y P E S A C R O S S T H E SUBJECT AREAS O F GRADE T E N  RANK WORD  B  21. HE  C  S U B J E C T S D E  19.0 4 6 . 0 19.0  4 0 . 4 4 5 . 1  0.0  0.0  3 7 . 5 82.2  2.081 0.0  0.0  0.0  0.0  3 3 . 8 7 4 . 1  0.0  0.0  0.610  9 7 . 1  0.0  5 6 . 0 6 9 . 0 5 2 . 
0  3 7 . 5 4 2 . 0  0.0  0.0  3 4 . 8 76.5100.2  0.0  0.0  0.789  0.443  0.0  2 2 . 0 8 2 . 0 8 0 . 0  3 7 . 0 4 1 . 4  0.0  0.0  3 4 . 4 7 5 . 4  0.0  0.0  0 . 3 1 0 0.526  YOUR  0.392  0.0  4 6 . 0 7 6 . 0 8 9 . 0  3 2 . 4 3 6 . 2  0.0  0.0  3 0 . 0 6 5 . 9  0.0  0.0  CHI-SQUAKE  0.648  0.488  8 6 . 4 0.436  22.25  L I N E S  OF  FIGURES  F U R EACH  ENTRY  REPRESENT:  FREQUENCY OF  FREQ.  0.0  0.0  0.246  2 5 1 .  30. THEY  TO  TOTAL  N O .  UF  WORDS  I N  SUBJECT  0.690  0.43C  0.328  25.60 0.0  0.0  4 6 . 0 5 6 . 0 2 9 . 0  2 6 . 4 2 9 . 6  0.0  0.0  2 4 . 5 5 3 . 9  0.0  0.0  0.503  0.648  0.359  2 0 5 .  7 0 . 6 0.142  50.27 4.0  19.1 2 1 . 3 0.047  0.0  0.0  27.C 5 5 . 0 2 2 . 0  0.0  0.0  1 7 . 7 3 8 . 9  0.0  0.0  0.380  0.353  7.0  4 1 . 0  1 4 8 .  5 1 . 0 0.108  6 5 . 0 1  5 9 . 0 1 3 . 0  0.0  0.0  1'5.5 1 7 . 3  0.0  0.0  0.152 0.0  0.0  0.099  7.0  14.4 3 1 . 5 0.263  0.0 1 2 0 . 4 1 . 3 C O  171.39  3 3 . 0 3 4 . 0  0.0  0.0  2 2 . 1 2 4 . 7  0.0  C O  0.0  0.0  0.431  2 2 7 .  7 8 . 2  3 1 . 0 4 3 . 0  CHI-SQUARE  FREQUENCY S ,  4 9 . 0 6 7 . 0 6 7 . 0 2 7 . 2 5 9 . 6  CHI-SQUARE  0.0  0.222  0.0 0.0  0.771  1 9 . 3 1  2 1 . 0 1 9 . 0  AS  29.  9 8 . 8  TOTAL  0.0  0.523  2 8 7 .  H  0.0  4 0 . 0 CAN  G  2 9 . 3 3 2 . 7  CHI-SQUARE  0.0  RATIO  28.  0.255  4 1 . 0 6 2 . 0  EXPECTED  2 9 1 .  F  2 3 . 0 2 1 . 0  0.405  110.71  0.725  S U B J E C T S D E  CHI-SQUAKE  0.0  THREE  NOT  0.504  8 8 . 0 2 6 . 0  0.274  THE  0.338  27.  1 3 . 4 3  0.304  C  CHI7SOUARE  3 6 . 4 4 0 . 7 0.409  B  0.301  2 4 . 0 95.0 103.0 2 8 2 .  CHI-SQUARE  WHICH  26.  0.093  0.0  0.536  25.  WORD  ONE  0.0  CHI-SQUARE  AT  0.295  D I S T R I B U T I O N O F OCCURRENCE O F T H E1 0 0 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E S U B J E C T AREAS O F GRADE T E N  RANK  107.8  2 5 . 0 3 5 . 0  1.150  24.  0.268  3 1 3 .  X X X X  492.15  CHI-SQUARE  HAVE  TOTAL  0.0  0.327  23.  H  0.0  CHI-SQUAkE  FROM  G  51.0 178.0  0.667  22.  F  TABLE  0.398  7 8 . 
0 19.0  2 0 . 5 4 4 . 9 0.099  0.501  1 7 1 .  5 8 . 9 0.093  6 9 . 2 1  T H E T H R E E L I N E S U F F I G U R E S F D R E A C H E.NTRY R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R A T I O A S i . U F F R E Q . T O T O T A L NU. O F WORDS I N S U B J E C T  LO  TABLE  RANK  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100. MOST F R E Q U E N T WORD T Y P E S A C R O S S T H E SUBJECT AREAS OF GRADE TEN  XXXX.  WCRD  31. WE  B  C  S U B J E C T S D E  75.0  23.0  0.0  0.0  213.0  46.6  52. 1  0.0  0.0  43.2  0.0  0.0  0.980  0.269  CHI-SQUARE 32. HIS  27.7  0.0  0.0  25.7  56.5  74.0  0.0  0.0  31.0 1.286  0.141  WHEN  215.  37. ALL  0.0  0.0  27.0  59.0  19.0  20.3  22.6  0.0  0.0  18.3  41.2  54.1  0.0  0.0  0.380  0.379  157.  38.  BUT  0.093  14.0  0.0  0.0  47.0  37.0  11.0  16.4  18.3  0.0  0.0  15.2  33.4  43.7  0.0  0.0  0.662  0.237  127.  39.  THESE  C.C54  26.0  0.0  0.0  23.0  57.0  60.0  24.1  27.0  0.0  C O  22.4  49.1  64.4  0.0  0.0  0.304  14.0  17.3  19.3  C O  0.0  16.0  35.2  46.1  C O  C O  0.246  0.324  0.366  187.  0.294  2.02  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREUUENCY R A T I O A S It O F F R E Q . T O T O T A L N U . tip WORDS I N S U B J E C T  40. MAY  0.169  0.372  0.069  36.0  32.0  C O  0.0  24.0  30.0  49.0  22.1  24.7  0.0  C O  20.5  44.9  58.9  C O  0.0  0.374  0.338  0. 1 9 3  18.23 47.0  C O  0.0  14.0  40.0  43.0  21.0  23.5  0.0  C O  19.5  42.8  56.1  C O  0.0  0.550  0.197  0.257  163.  C. 2 1 0  28.50  24.0  4.0  C O  C O  32.0  52.0  60.0  22.2  24.8'  0.0  0.0  20.6  45.2  59.2  0.047  C O  0.0  0.451  0.334  172.  0.2 9 4  24.96  23.0  3.0  C O  C O  16.0  34.0  21.0  12.5  14.0  C O  C O  11.5  25.5  33.4  0.0  0.0  CHI-SQUARE  171.  0.240  19.0  0.301  134.  
46.26  CHI-SQUAKE  21.0  TOTAL  58.0  0.314  92.59  H  12.0  CHI-SQUAKE  18.0  G  C O  0.248  52.82  F  0.0  CHI-SQUARE  14.0  0.164  S U B J E C T S D E  21.0  0.471  262.31  0.164  C  29.0  0.379  0.135 0.162  38.0  CHI-SQUAKE  36.  B  CHI-SQUARE 33.0  0.274  WORD  0.141 0.137  21.0  CHI-SQUAKE  AN  361.  RANK  94.8 124.3  10.C  0.235  35.  26.0  0.0  CHI-SQUAKE  IF  TOTAL  0.0  0.497  34.  22.0  H  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E S U B J E C T AREAS OF GRADE T E N  X X X X  831.29  CHI-SQUARE  WILL  3.000  G  41.0. 110.0  0.536  33.  F  TABLE  0.035  0.225  0.218  97.  0.103  26.53  vO THE T H R E E L I N E S O F F I G U R E S F O R E A C h E N T R Y R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R A T I O A S tt O F F R E Q . T O T O T A L N O . O F W O R D S I N S U B J E C T  J>  TABLE  XXXX  RAN< WORD  B  41. THERE  DISTRIBUTICN OF OCCURRENCE OF THE 1 0 0 HOST FREQUENT WORO TYPES ACROSS THE SUBJECT AREAS OF GRADE TEN  TABLE  S U B J E C T S  RANK  C  20.0  34.0  38.0  25.4  28.4  0.0  0.0  23.6  51.8  67.8  0.0  0.0  0.304  0.0  10.C  34.0  85.0  20.6  23.1  CO  CO  19.2  42.0  55.1  0.0  0.0  0.141  0.218  117.0  0.0  0.0  1.0  32.0  23.2  26.0  0.0  0.0  21.5  47.3  CC  0.0  WERE  1.368  0.014  0.205  5.0  180.  48. HAD  62.0 .  8.0  0.0  0.0  21.0  38.0  50.0  . 17.3  19.3  0.0  CO  16.0  35.2  46.1  0.0  CO  0.094  0.296  0.244  134.  49. THEIR  55.0  15.1  16.9  CO  0.0  14.0  30.7  40.3  CO  0.0  0.140  11.0  7.0  0.0  CO  19.0  37.0  42.0  15.0  16.7  0.0  CO  13.9  30.5  40.0  0.0  0.0  0.260  0.237  0.206  10.09  THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY  EXPECTED FREQUENCY  RATIO AS I, OF FREQ. TO TOTAL NO. CF WORDS IN SUBJECT  116.  50. USED  0.070  0.205  117.  0.269  7.0  30.0  CO  CO  13.0  34.0  108.0  25.4  28.4  0.0  0.0  23.6  51.8  67.8  0.0  CO  0.254  0.351  0.218  197.  
0.529  44.60  4.0  89.0  0.0  0.0  7.0  17.0  61.0  23.0  25.7  0.0  0.0  21.3  46.8  61.3  0.0  CO  0.C99  1.041  0.109  178.  C.299  200.48  21.0  21.0  CO  0.0  1.0  54.0  55.0  19.6  21.9  CO  CO  18.2  39.9  52.3  CO  0.0  0.014  0.246  0.347  152.  0.269  21.48  7.0  2.0  CO  0.0  9.0  32.0  14.0  8.3  9.2  CO  CO  7.7  16.8  22.0  0.091  0.023  0.0  0.0  0.127  CHI-SQUAKE  TOTAL  12.90  CHI-SQUARE  8.72  0.082  32.0  C274  0.245  H  5.0  CHI-SQUARE  396.43  G  CO  0.052  0.024  F  CO  0.091  0.416  17.0  CHI-SQUARE  47.  E  12.0  0.170  160.  0  13.0  CHI-SQUARE  25.0  0.144  MORE  30.80  CHI-SQUARE  SOME  46.  C  CHI-SQUARE  0.0  0.105  B  0.186  9.0  0.222  45.  0.218  TOTAL 197.  S U B J E C T S WORD  31.90  CHI-SQUARE  OTHER  0.232  22.0'  0.327  44.  H  0.0  CHI-SQUARE  I  G  0.0  0.286  43.  F  26.0  CHI-SQUARE  HAS  E  8.0  0.105  42.  0  DISTRIBUTION OF OCCURRENCE OF THE 100 MOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE TEN  XXXX  0.205  64.  0.069  22.74  THE THREE LINES OF FIGURES FOR FACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY RATIO AS Z, OF FREO. TO TOTAL NO. OF WORDS IN SUBJECT  NJ vD  Ul  TABLE  D I S T R I B U T I O N  XXXX  MOST  SUBJECT  RANK  S WORD  B  5 1 .  16.1  4 0 . 0  18.0  0 . 0  0 . 0  15.0  3 2 . 8  0 . 0  0 . 0  3 5 . 0  13.7  0 . 0  0 . 0  11.4  2 5 . 0  0 . 0  0 . 0  0.197  0 . 2 4 0  0 . 0  0 . 0  2 7 . 0  4 1 . 0  13.0  14.6  0 . 0  0 . 0  12.1  2 6 . 5  0 . 0  0 . 0  2 0 . 0  9 5 .  5 7 . WHAT  3 2 . 7  0 . 3 8 0  14.0  101.  5 8 .  0 . 2 6 3  15.9  .17.7 0 . 2 4 6  0 . 0  3 3 . 0  2 5 . 0  0 . 0  0 . 0  14.7  3 2 . 3  0 . 0  0 . 0  0 . 4 6 5  35.0  5 9 . BEEN  4 2 . 4  0 . 1 6 0  12.0  2 7 . 0  0 . 0  0 . 0  2 2 . 0  2 9 . 0  13.8  15.4  0 . 0  0 . 0  12.8  2 8 . 1  0 . 0  0 . 0  0 . 1 5 7  0 . 3 1 6  CHI-SQUAKE  0 . 3 1 0  17.0 3 6 . 9  0 . 1 8 6  L I N E S  OF  6 0 . INTO  FOR  EACH  ENTRY  R E P R E S E N T :  F  G  H  TOTAL  0 . 0  0 . 0  7.7  16.8  2 2 . 0  0 . 2 3 5  0.117  0 . 0  0 . 0  0 . 
1 9 7  0 . 0 9 6  0 . 0 3 4  2 1 . 0  14.0  0 . 0  0 . 0  2 5 . C  3 0 . 0  15.5  17.3  0 . 0  0 . 0  14.4  31.5  0 . 0  0 . 0  0 . 3 5 2  0 . 0  0 . 0  7.0  2 1 . 0  0 . 0  0 . 0  12.7  2 7 . 8  0 . 0  0 . 0  0 . 0 9 9  3 . 0  0.164  3 0 . 0  EXPECTED  FREOUENCY  RATIU  St  OF  F R E Q .  TO  TOTAL  NO.  OF  WOROS  I N  SUBJECT  0 . 1 9 3  0.147  13.65 9 . 0 15.3 0 . 1 0 5  5 9 . 0  106.  3 6 . 5  0 . 1 3 5  0 . 2 8 9  2 1 . 6 6  2 2 . 0  2 4 . 0  0 . 0  0 . 0  19.1  21.3  0 . 0  0 . 0  0 . 0  0 . 0  0.113  0 . 2 8 1  3 1 . 0  17.7  6 3 . 0  3 8 . 9  148.  51.0  0 . 1 9 9  0 . 3 C 8  1 0 . 5 4 18.0  0 . 0  0 . 0  4 . 0  3 0 . 0  13.0  0 . 0  0 . 0  10.8  2 3 . 6  0 . 0  0 . 0  0 . 0 6 5  0 . 2 1 0  0 . 0 5 6  3 3 . 0  9 0 .  31.0  0 . 1 9 3  0.162  11.80  THREE  L I N E S  OF  EXPECTED  FREQUENCY  RATIO  St  AS  120.  41.3  FIGURES  FOR  EACH  ENTRY  REPRESENT:  FREQUENCY  FREQUENCY  6 4 .  2 7 . 2 7  11.6  THE  E  9.2  CHI-SOUARE  F I G U R E S  S  D  8.3  5 . 0  0 . 0 8 3  2 6 . 2 3  THREE  107.  T  TEN  7.0  CHI-SQUARE  2 9 . 1 9  C  100  THE  1 5 . 0  0 . 2 8 8  0.171  E  GRAOE  T H E  14.0  0.131  123.  J  OF  OF  ACROSS  0 . 0  CHI-SQUARE  0 . 0  AREAS  TYPES  0 . 0  13.7  0 . 0 6 9  B  OCCURRENCE  WORD  10.0  10.0 THAN  3 4 . 8  OF  FREQUENT  18.0  0 . 2 7 4  0 . 0 9 8  5 0 . 7 4 21.0  U  C  CHI-SQUARE  2 . 0  AS  56. SHOULD  4 3 . 0  0 . 2 2 5  17.0  0 . 0 2 3  B  CHI-SQUARE 14.0  CHI-SQUARE  THE  125.  S WORD  TOTAL  4 9 . 0  0 . 2 5 7  0 . 0  0.118  ABOUT  0 . 2 2 5  H  15.68  9 . 0  55.  G  0 . 0  CHI-SQUAKE  .  F  2 0 . 0  0 . 2 2 2  MOST  RANK  16.0  0 . 2 3 4  D I S T R I B U T I O N  X X X X  SUBJECT  S  0 . 0  CHI-SQUARE  TWO  T  TABLE  100  THE  TEN  0 . 0  0 . 0 7 8  54.  C  T H E  16.92  12.3  EACH  E  GRADE  2 . 0  0 . 0 2 3  6 . 0  53.  J  OF  OF  ACROSS  E  CHI-SQUARE  SO  AREAS  TYPES  D  0 . 2 3 5  52.  B  OCCURRENCE  WORD  C  1 8 . 0 MANY  U  OF  FREOUENT  OF  F R E O .  TO  TOTAL  NO.  
OF  WOROS  I N  SUBJECT  Cjl  TABLE  RANK  X X X X  WORD  61. THEM  B  MAKE  00  0.0  1 1 . 0 2 4 . 2  0.0  0.0  0.193  1 8 . 0 SUCH  3 1 . 7  2.0  0.0  0.0  0.235  0.073  1 3 . 0 2 1 . 0 16.0  8.6  9.7  0.0  0.0  8.0  0.196  0.023  0.0  0.0  0.183  1 7 . 6 0.135  6 7 .  67.  8.0 THEN  2 3 . 1  0.0  0.0  3.0  14.0  10.0  6.3  7.1  0.0  0.0  5.9  12.9  16.9  0.170  0.105  0.0  0.0  0.042  0.090  49.  68. TIME  C.049  0.0  0.0  18.0  26.0  8.0  10.2  11.4  0.0  0.0  9.5  20.8  27.2  0.0  0.254  0.105  0.0  0. 167  79.  69. ITS  0.039  29.09  17.0  34. 0  0.0  0.0  1.0  20.0  10.0  10.6  11.8  0.0  0.0  9.6  21.5  28.2  0.0  0.0  0.014  0.398  0. 128  C.049  82.  70. WOULD  17.0 3 6 . 0 3 4 . 0  0.0  0.0  13.0 2 8 . 6  0.0  0.0  0.239  3 7 . 5  1 3 . 2 6  0.0  1 1 . 7 2 5 . 7  0.0  0.0  0.451  9.0  0.292  0.167  7.0  0.0  0.0  1 3 . 5 15.1  0.0  0.0  0.0  0.0  0.127  3.0  0.210  3 3 . 8 C.034  2 2 . 0 2 5 . 0  1 2 . 6 2 7 . 6  1 0 5 .  3 6 . 2  0.141 0.122  28.63  1 2 . 0 2 0 . 0  0.0  0.0  16.1 1 8 . 0  0.0  0.0  0.0  0.0  0.234  3 7 . C 5 3 . 0  1 5 . 0 3 2 . 8 0.042  0.237  1 2 5 .  4 3 . 0 0.259  1 3 . 6 6  1 8 . 0 17.0  0.0  0.0  2 7 . U 3 5 . 0 1 6 . 0  1 4 . 6 1 6 . 3  0.0  0.0  1 3 . 5 2 9 . 7  0.0  0.0  CHI-SOUARE  9 8 .  66.29  3 1 . 0 1 8 . 0  0.235  1 0 9 .  0.231 C.166  0.0  CHI-SQUARE  TOTAL  0.0  1 2 . 6 1 4 . 1  0.157  H  0.0  32.C 2 6 . 0  CHI-SQUARE  9.0  0.047  G  0.0  0.405  11 . 8 9  18.0  4.0  F  0.0  CHI-SQUAKE  1 6 . 6 7 9.0  S U B J E C T S D E  2 5 . 0  0.105  0.078  13.0  C  14.1 15.7  CHI-SQUARE  1 5 . 0  0.222  B  66.  1 3 . 8 6  CHI-SQUARE  UP  0.183  9 2 .  DISTRIBUTION OF OCCURRENCE O F T H E 1 0 0 MOST F R E Q U E N T WORO T Y P E S A C R O S S T H E S U B J E C T AREAS O F GRADE T E N  X X X X  WORD  TOTAL  0.0  0.235  65.  H  1 1 . 9 1 3 . 3  CHI -SQUARE 64.  G  1 3 . 0 3 0 . 0 15.0  0.222  TABLE  RANK F  0.0  CHI-SOUARE 63.  S U B J E C T S 0 E 0.0  CHI-SQUARE  USE  C  1 5 . 0 1 9 . 0  0.196  62.  
DISTRIBUTION OF OCCURRENCE O F T H E 1 0 0 H O S T F R E O U E N T WORD T Y P E S A C R O S S T H E S U B J E C T AREAS O F GRADE T E N  0.199  0.380  0.225  1 1 3 .  3 8 . 9 0.078  28.70  Ni THE THREE L I N E S O F F I G U R E S FOR EACH ENTRY R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R A T I U A S ; » O F F R E Q . T O T O T A L NO. O F WORDS I N S U B J E C T  THE THREE LINES O F F I G U R E S FOR EACH ENTRY R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R A T I U A S X, O F F R E Q • T O T O T A L N O . Jf W O R D S I N S U 8 J E C T :  LO «J  TABLE  RAN<  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 MOST F R E O U E N T WORO T Y P E S A C R O S S T H E SUBJECT AREAS O F GRADE T E N  XXXX  WORD  71.  B 2 0 . 0  HOW  C  S U B J E C T S D E  6.0  1 0 . 2 1 1 . 4 0.261  0.07C  CHI-SQUARE 72.  4.0  NUMBER  CHI-SQUARE 73. MADE  1.0  0.012  2 2 . 0 1 5 . 0 1 6 . 0  0.0  9.5  0.0  0 . 0  0.310  2 0 . 8 0.096  0.0  0.0 110.0  0.0  0.0  0.0  0.0  1 8 . 2 3 9 . 9  ONLY  1.549  0.167  1 5 2 .  77. NO  5 2 . 3  8.2  0.0  0.0  6.S  1 5 . 0  0.105  0.105  0.0  0.0  0.042  0.096  78. MUST  1 9 . 6 0.108  10.4 0.316  0.0  0.0  5.C  9.0  0.0  0.0  8.6  1 8 . 9  0.0  C C  0.070  0.058  13.0  7 2 .  79.  1 3 . 4 1 5 . 0 0.105  1 1 . 4 1 2 . 7  0.0  C O  0.0  0.0  WATER  2 4 . 8  0.0  0.0  C O  0.0  C O  1.0  2 3 . 0 5 8 . 0  1 2 . 4 2 7 . 3 0.014  0.148  3 5 . 8 0.284  27.36  THE THREc LINES O F FIGURES FOR EACH ENTRY REPRESENT: FREQUENCY EXPECTED FREQUENCY R A T I O A S X, O F F R E Q . T O T U T A L N O . U F W U R O S I N S U B J E C T  1 0 4 .  80.  0.244  0.127  14.0 2 5 . 0 2 0 . 0  1 1 . 5 12.8  C O  0.0  10.7 2 3 . 4  0.210 0.0  C O  0.197  8.0  0.0  0.0  6.8  7.6  C O  C O  6.3  0.183  0.094  C O  0.0  0.155  C O  C O  0.0  30.7 0.098  11.0 1 4 . 0  0.0  0.0  C O  C O  1 3 . 9 0.090  6.0  5 3 .  1 8 . 3 0.029  1 9 . 1 7 9.0  1 1 . 6 13.0 0.105  10.K 0.0  3 7 . 0 39.0 2 3 . 6 0.237  9 0 .  3 1 . 0 0. 
1 9 1  25.37 2.0  10.1 11.2  CHI-SQUARE  0.160  8 9 .  6.96  1 4 . 0  0.131  8 8 .  3 0 . 3  0.0  10.0 ALSO  TOTAL  3 8 . C 2 6 . 0  10.5 2 3 . 1 0.085  H  1 5 . 7 0  CHI-SQUARE  0.0  6.0  G  C O  C.065  0.064  0.152  F  11.0 18.0  5.0  4 7 . 1 0 9.0  0.0  CHI-SQUARE  2 . 5 6  S U B J E C T S D E 0.0  CHI-SQUARE 5 7 .  C  13.0  0.144  0.049  7.4  CHI-SQUARE  5.0  0.065  2 6 . 0 10.0  1 5 . 0 2 2 . 0  0.170  76.  B  CHI-SOUARE  3.0  1 3 . 0  WORD  0.078  0.0  9.3  7 9 .  2 7 . 2  0.0  CHI-SQUARE  MOST  0.0  TOTAL  9.C  0.235  75.  0.0  H  534.74  1 8 . 0 2 7 . 0 OUT  0.0  G  8.0  CHI-SQUARE 74.  RAN.< F  DISTRIBUTION OF OCCURRENCE O F T H E 1 0 0 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E S U B J E C T AREAS O F GRADE T E N  X X X X  34.85  1 9 . 6 2 1 . 9 0.052  TABLE  0.023  C O  0.0  13.0 2 0 . 0 3 3 . 0  C O  C O  9.3  C O  C O  0.183  2 0 . 5  7 3 .  2 6 . 9  0.128 0.162  1 0 . 4 5  THE THREE L I N E S U F F I G U R E S FOR EACH ENTRY R E P R E S E N T : FREQUENCY EXPECTED FREQUENCY R A T I U A S i, O F F R L Q . T O T O T A L N O . U F W O R D S I N S U B J E C T  to vO CO  TABLE  DISTRIBUTION OF OCCURRENCE OF THE 1 0 0 HOST FREQUENT WORD TYPES ACROSS THE SUBJECT AREAS OF GRADE TEN  XXXX-  RANK WORD 81. FIRST  B  GOOD  HIM  0.0  0.0  1 1 . 1 2 4 . 4  0.152 0.0  0.0  0.254  0.090  93.  3 2 . 0  0.0  0 . 0  9.8  1 1 . 0  0.0  0.0  9.1  0.222  0.062  0.0  0.0  0.141  2 0 . 0 0.160  8b. COULD  0.0  2.0  8.0  5.4  6.1  0.0  0.0  5.0  1 1 . 0  0.235  0.105  0.0  0.0  0.028  0.051  87.  0.083  5.0  4 2 .  88. ANY  1 4 . 5 C.024  1 2 . 4 1 3 . 8  0.0  0.0  1 1 . 5 2 5 . 2  0.0  0.0  0.296  0.173  0.0  0.0  0.0  8.0 1 6 . 8  0.281  5 3 . 0  0.0  0.0  8.0  3.0  2.0  7 1 .  9.2  10.2  0.0  0.0  8.5  1 8 . 7  2 4 . 5  0.065  0.620  0.0  0.0  0.113  0.019  O.01O  89. BECAUSE  8.3  9.7  0.0  U.O  7.7  0.392  0.187  0.0  0.0  0.0  0.0  0.0  3 0 . 0 2 9 . 0  0.0  0.0  9.5  2 0 . 8  0.0  0 . 0  0.423  0.186  6.0  7 9 .  2 7 . 2 0.029  SEE  0.051  3 3 . 
1 0.083  I C O  6 4 .  2 2 . 0 0.049  8 1 . 1 0 4.0  0.0  0.0  9.2  1 0 . 2  0.0  0.0  8.5  1 8 . 7  0.222  0.C47  0.0  0.0  0.239  0.116  1 7 . 0 1 8 . 0 1 5 . 0 7 1 . 2 4 . 5 0.073  22.70  10.0  3.0  0.0  0.0  3.0  2 4 . 0 1 0 . 0  6.4  7.2  0.0  0.0  6.0  1 3 . 1  0.131  0.035  0.0  0.0  0.042  0.154  5 0 .  17.2 0.049  1 7 . 9 1  3.0  16.0  0.0  0.0  6.8  7.6  0.0  0.0  6.3  1 3 . 9  0.039  0.210  0.0  0.0  0.225  0.064  CHI-SOUARE  70.68  rHE T H R E E L I N E S U F F I G U R E S FREQUENCY EXPECTEO FREQUENCY  90.  TOTAL  2 7 . 0 1 7 . 0 9 6 .  17.0  CHI-SQUARE  214.28  H  25.59  CHI-SQUAKE  5.0  G  21.0  3 0 . 0 1 6 . 0 WHO  2 6 . 2  F  0.0  0.091  39.50  0.035  S U B J E C T S D E 0.0  CHI-SOUARE  0.0  1 0 . 2 11.4  C  2 4 . 0  7.0  1 1 . 2 9 9.0  CHI-SQUARE  B  0.181  1 0 . 0 2 5 . 0 1 7 . 0 7 6 .  1 8 . 0  0.144  WORD  CHI-SQUARE  7.0  3.0  DISTRIBUTION OF OCCURRENCE O FT H E 100 M O S T F R E Q U E N T WORD T Y P E S A C R O S S T H E SUBJECT AREAS O FGRAOE T E N  XXXX  9 . 5 6  1 7 . 0  1 1 . 0 SAME  TOTAL  1 2 . 0 1 3 . 4  CHI-SOUARE 85.  H  1 8 . 0 1 4 . 0 3 7 . 0  CHI-SQUAKE 84.  G  0.0  CHI-SQUARE 83.  F  0.0  CHI-SQUARE  VERY  RANK  S U B J E C T S D E  1 1 . 0 1 3 . 0  0.144  82.  C  TABLE  1 6 . 0 1 0 . 0  6.0  5 3 .  1 8 . 3 0.029  40.22  ro F O REACH  R A T I U A S %i O F F K c O . T O T U T A L  ENTRY  REPRESENT:  N O . O F WOROS I N S U B J E C T  vO THE THREE L I N E S O F F I G U R E S F O R EACH ENTRY R E P R E S E N T : FREQUENCY EXPECTEO FREQUENCY RAT IU A S S , U F F R E Q . T O T O T A L NO. U F WORDS I N S U B J E C T  vO  TABLE  RANK  WORD  91. LIKE  B  C  S U B J E C T S D E  MUCH  PEOPLE  6.0  7.0  7.0  6.7  7.5  0.0  0.0  6.2  13.7  17.9  0.235  0.164  0.0  0.0  0.085  PLACE  52.  96.  5.0  6.0  0.0  0.0  6.0  25.0  10.0  THROUGH  6.7  7.5  0.0  0.0  6.2  13.7  17.9  0.065  0.070  0.0  0.0  0.C65  0. 160  0.049  0.0  2.0  15.0  34.0  8.8  9.8  0.0  0.0  8.1  17.9  23.4  0.170  0.047  0.0  0.0  0.028  0.096  68.  97. 
WORK  0.166  22.0  7.0  0.0  0.0  0.0  8.0  9.3  10.4  0.0  0.0  8.6  18.9  0.0  0.0  0.0  35.0  72.  98. NEW  24.8  10.0  6.0  0.0  0.0  14.0  27.0  9.0  10.I  0.0  0.0  8.4  18.4  0.0  0.0  0.197  0.070  0.0  0.0  2.0  8.0  6.0  3.4  3.7  0.0  0.0  3.1  6.8  9.0  0.118  0.012  0.0  0.0  0.028  0.051  0. 0 2 9  0.173  13.0  70.  99. SMALL  24.1 0.064  12.0  15.0  0.0  0.0  2.0  12.0  64.0  13.5  15.1  0.0  0.0  12.6  27.6  36. 2  0.0  0.0  0.028  0. 1 7 5  3.0  5.0  0.0  0.0  7.0  14.0  4.6  5.2  0.0  0.0  4.3  9.5  0.039  0.058  0.0  0.0  0.099  0.090  7.0 12.4 0.034  6.80  THE THREE L I N E S OF F I G U R E S F O R EACH ENTRY R E P R E S E N T : FRECUENCY EXPECTED FREQUENCY R A T I O AS S , OF F R t Q . T O T O T A L NO. OF WORDS I N S U B J E C T  36.  100. OVER  0.077  26.  105.  0. 3 1 3  11.0  2.0  0.0  0.0  2.0  23.0  20.0  7.5  8.4  0.0  0.0  6.9  15.2  20.0  0.144  0.023  0.0  0.0  0.028  0. 148  58.  0.098  13.97  6.0  10.0  0.0  0.0  2.0  9.0  18.0  5.8  6.5  0.0  0.0  5.4  11.8  15.5  0.078  0.117  0.0  0.0  0.026  CHI-SQUARE  52.  39 .30  CHI -SQUARE  14.68  TOTAL  13.09  CHI -SQUARE  37.62  H  1.0  0.157  0.051 0.171  G  9.0  CHI -SQUARE  15.35  F  13.65  CHI -SQUARE  0.0  CHI-SQUARE  J E C T S E D  S U B  0.034  4.0  CHI-SQUAKE 95.  0.045  13.0  0.131  C  34.55  0.032  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 MOST F R E Q U E N T WORD T Y P E S A C R O S S T H E S U B J E C T AREAS OF GRADE T E N  XXX  B  WORD  TOTAL  0.0  CHI-SQUAKE  CALLED  H  0.0  0.286  94.  G  14.0  CHI-SQUARE 93.  X  RANK F  18.0  CHI-SUUARE 92.  TABLE  D I S T R I B U T I O N O F O C C U R R E N C E O F T H E 100 MOST F R E Q U E N T WORO T Y P E S A C R O S S T H E SUBJECT AREAS OF GRADE TEN  XXXX-  0.058  45.  0. 0 8 8  5.11  THE THREE L I N E S OF F I G U R E S FOR EACH ENTRY R E P R E S E N T : FREQUENCY EXPECTEO FREQUENCY R A T I O AS S i OF F R E Q . TO T O T A L NO. 
OF WORDS IN SUBJECT

APPENDIX J

CHI SQUARE RESULTS OF DISTRIBUTION OF SELECTED SENTENCE LENGTHS

TABLE XXXXI

DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE GRADE LEVELS OF THE CORPUS

                       GRADES
RANK  LENGTH        8        9       10    TOTAL  CHI-SQUARE
1.    10        105.0    294.0    182.0     581.       11.02
                126.0    305.6    149.4
                3.693    4.262    5.396
2.    20        115.0    227.0    128.0     470.        3.76
                101.9    247.2    120.9
                4.045    3.291    3.795
3.    30         50.0     96.0     46.0     192.        2.16
                 41.6    101.0     49.4
                1.759    1.392    1.364
4.    40         16.0     24.0      8.0      48.        4.60
                 10.4     25.2     12.3
                0.563    0.346    0.237
5.    50+        29.0     81.0     39.0     149.        0.44
                 32.3     78.4     38.3
                1.020    1.174    1.156

THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT:
  FREQUENCY
  EXPECTED FREQUENCY
  RATIO AS %, OF FREQ. TO TOTAL NO. OF SENT-LENGTH IN GRADE

TABLE XXXXII

DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE SUBJECT AREAS OF THE CORPUS

                                     SUBJECTS
RANK  LENGTH        B        C        D        E        F        G        H    TOTAL  CHI-SQUARE
1.    10         52.0     87.0    107.0     92.0     67.0     92.0     84.0     581.        8.72
                 50.2    101.6    118.1     83.1     51.7     90.5     85.6
                4.586    3.793    4.014    4.904    5.736    4.503    4.346
2.    20         38.0     57.0    107.0     66.0     49.0     73.0     81.0     471.       12.47
                 40.7     82.4     95.8     67.4     41.9     73.4     69.4
                3.351    2.485    4.014    3.518    4.195    3.573    4.190
3.    30         18.0     35.0     38.0     31.0      6.0     25.0     39.0     192.       12.72
                 16.6     33.6     39.0     27.5     17.1     29.9     28.3
                1.587    1.526    1.425    1.652    0.514    1.224    2.018
4.    40          4.0     14.0      3.0      5.0      1.0      6.0     15.0      48.       20.61
                  4.2      8.4      9.8      6.9      4.3      7.5      7.1
                0.353    0.610    0.113    0.267    0.086    0.294    0.776
5.    50+         7.0     70.0     15.0      9.0      4.0     25.0     20.0     150.       97.71
                 13.0     26.2     30.5     21.5     13.4     23.4     22.1
                0.617    3.051    0.563    0.480    0.342    1.224    1.035

THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT:
  FREQUENCY
  EXPECTED FREQUENCY
  RATIO AS %, OF FREQ. TO TOTAL NO. OF SENT-LENGTH IN SUBJECT

TABLE XXXXIII

DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE SUBJECT AREAS OF GRADE EIGHT

                                     SUBJECTS
RANK  LENGTH        B        C        D        E        F        G        H    TOTAL  CHI-SQUARE
1.    10          0.0     19.0     23.0      6.0     17.0     26.0     14.0     105.        4.16
                  0.0     18.1     21.7      9.8     15.0     21.4     19.1
                  0.0    3.885    3.912    2.273    4.187    4.483    2.713
2.    20          0.0     16.0     26.0      7.0     25.0     18.0     23.0     115.        8.16
                  0.0     19.8     23.8     10.7     16.4     23.5     20.9
                  0.0    3.272    4.422    2.652    6.158    3.103    4.457
3.    30          0.0     11.0      8.0      6.0      5.0      7.0     13.0      50.        4.94
                  0.0      8.6     10.3      4.6      7.1     10.2      9.1
                  0.0    2.249    1.361    2.273    1.232    1.207    2.519
4.    40          0.0      4.0      4.0      1.0      1.0      1.0     11.0      22.       17.08
                  0.0      3.8      4.6      2.0      3.1      4.5      4.0
                  0.0    0.818    0.680    0.379    0.246    0.172    2.132
5.    50+         0.0     12.0      2.0      0.0      2.0      4.0      8.0      28.       18.99
                  0.0      4.8      5.8      2.6      4.0      5.7      5.1
                  0.0    2.454    0.340      0.0    0.493    0.690    1.550

THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT:
  FREQUENCY
  EXPECTED FREQUENCY
  RATIO AS %, OF FREQ. TO TOTAL NO. OF SENT-LENGTH IN SUBJECT

TABLE XXXXIV

DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE SUBJECT AREAS OF GRADE TEN

                                     SUBJECTS
RANK  LENGTH        B        C        D        E        F        G        H    TOTAL  CHI-SQUARE
1.    10         23.0     26.0      0.0      0.0     37.0     40.0     57.0     183.        4.11
                 22.9     31.0      0.0      0.0     27.7     42.2     59.1
                5.437    4.545      0.0      0.0    7.241    5.141    5.234
2.    20         23.0     13.0      0.0      0.0     13.0     32.0     48.0     129.        9.81
                 16.2     21.9      0.0      0.0     19.5     29.8     41.6
                5.437    2.273      0.0      0.0    2.544    4.113    4.408
3.    30          7.0      7.0      0.0      0.0      1.0     11.0     21.0      47.        7.83
                  5.9      8.0      0.0      0.0      7.1     10.8     15.2
                1.655    1.224      0.0      0.0    0.196    1.414    1.928
4.    40          1.0      2.0      0.0      0.0      1.0      2.0      3.0       9.        0.26
                  1.1      1.5      0.0      0.0      1.4      2.1      2.9
                0.236    0.350      0.0      0.0    0.196    0.257    0.275
5.    50+         1.0     12.0      0.0      0.0      2.0     17.0      7.0      39.       19.67
                  4.9      6.6      0.0      0.0      5.9      9.0     12.6
                0.236    2.098      0.0      0.0    0.391    2.185    0.643

THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT:
  FREQUENCY
  EXPECTED FREQUENCY
  RATIO AS %, OF FREQ. TO TOTAL NO. OF SENT-LENGTH IN SUBJECT

TABLE XXXXV

DISTRIBUTION OF OCCURRENCE OF FIVE SELECTED SENTENCE LENGTHS ACROSS THE SUBJECT AREAS OF GRADE NINE

                                     SUBJECTS
RANK  LENGTH        B        C        D        E        F        G        H    TOTAL  CHI-SQUARE
1.    10         29.0     42.0     84.0     86.0     14.0     26.0     13.0     294.        8.20
                 30.3     52.6     88.6     68.7     10.7     29.2     14.0
                4.079    3.406    4.042    5.335    5.578    3.796    3.963
2.    20         15.0     28.0     81.0     59.0     11.0     23.0     10.0     227.       10.88
                 23.4     40.6     68.4     53.0      8.3     22.5     10.8
                2.110    2.271    3.898    3.660    4.382    3.358    3.049
3.    30         11.0     17.0     30.0     25.0      1.0      7.0      5.0      96.        2.95
                  9.9     17.2     28.9     22.4      3.5      9.5      4.6
                1.547    1.379    1.444    1.551    0.398    1.022    1.524
4.    40          3.0      8.0      3.0      4.0      1.0      3.0      2.0      24.        7.08
                  2.5      4.3      7.2      5.6      0.9      2.4      1.1
                0.422    0.649    0.144    0.248    0.398    0.438    0.610
5.    50+         6.0     45.0     13.0      9.0      0.0      4.0      4.0      81.       80.52
                  8.3     14.5     24.4     18.9      2.9      8.0      3.9
                0.844    3.650    0.626    0.558      0.0    0.584    1.220

THE THREE LINES OF FIGURES FOR EACH ENTRY REPRESENT:
  FREQUENCY
  EXPECTED FREQUENCY
  RATIO AS %, OF FREQ. TO TOTAL NO. OF SENT-LENGTH IN SUBJECT

APPENDIX K

WORD FREQUENCY DIAGRAMS (GRAPHS)

[Each diagram plots FREQUENCY against RANK (X10^1), with the rank axis running from 0.0 to 100.0. The plots are not reproduced; the legible captions follow. The first diagram has no legible caption.]

FIGURE [number illegible]
WORD-FREQUENCY DIAGRAM OF GRADE NINE

FIGURE [number illegible]
WORD-FREQUENCY DIAGRAM OF GRADE EIGHT

FIGURE 8.5
WORD-FREQUENCY DIAGRAM OF COMMERCE

FIGURE 8.6
WORD-FREQUENCY DIAGRAM OF ENGLISH

FIGURE 8.7
WORD-FREQUENCY DIAGRAM OF HOME ECONOMICS

FIGURE 8.8
WORD-FREQUENCY DIAGRAM OF INDUSTRIAL EDUCATION

FIGURE 8.9
WORD-FREQUENCY DIAGRAM OF MATHEMATICS

FIGURE 8.10
WORD-FREQUENCY DIAGRAM OF SCIENCE

FIGURE 8.11
WORD-FREQUENCY DIAGRAM OF SOCIAL STUDIES

PROFESSIONAL WRITING

A. ARTICLES

"Aren't We Being Conned?" The B.C. Teacher, 48, (May-June, 1969), 327-328.

"Patterns in Literature" B.C. English Teacher, 10, (June, 1970), 102-106.

"PANORAMA - A Study Technique" Journal of Reading, 17, (November, 1973), 132-135.

"Some Problems in Secondary Reading" Fourth Annual Reading Conference, U.B.C., Vancouver, Centre for Continuing Education, 1973.

"The Effect of Idioms on Children's Reading and Understanding of Prose" Teachers, Tangibles, Techniques: Comprehension of Content in Reading, ed. Bonny Schulwitz (Newark: International Reading Association, 1973).

"Idioms and Reading Comprehension" Journal of Reading Behavior, 10, (Fall, 1974), 30-36.

B. REFERENCE TEXTS

Summers, Edward G., Brother Leonard Courtney, and Peter Edwards. Guide to Professional Textbooks and Research in Secondary ..., University of British Columbia, Information Research Centre, 1973.

C. RESEARCH REPORTS

A Study of the Effectiven... Centre Program. Research Report 73-10, Vancouver, Board of School Trustees, June 1973.

An Interaction-Network Instrument for Measuring Pupil ... Research Report 73-15, Vancouver, Board of School Trustees, July 1973.

An Evaluation of the Com... Britannia Secondary Scho... Research Report 73-16, Vancouver, Board of School Trustees, July 1973.

An Evaluation of the ... Secondary School. Research Report 73-19, Vancouver, Board of School Trustees, July 1973.

An Evaluation of the Ada... to the University Hill Secondary Program. Research Report 73-22, Vancouver, Board of School Trustees, July 1973.
