Open Collections

UBC Theses and Dissertations

Towards spoken English : a computer based synthesizer for a reading machine for the blind Yeung, James Man Chung 1974

TOWARDS SPOKEN ENGLISH: A COMPUTER BASED SYNTHESIZER FOR A READING MACHINE FOR THE BLIND

by

JAMES MAN CHUNG YEUNG
B.Eng., McMaster University, 1971

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in the Department of Electrical Engineering

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
July, 1974

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Electrical Engineering
The University of British Columbia
Vancouver 8, Canada

Date [illegible]

ABSTRACT

A speech synthesizer has been developed to synthesize real-time speech directly from letter information. The synthesizer was implemented on a PDP-12 digital computer and its peripheral devices. Synthetic speech was produced by the concatenation of 33 basic phonemes. These phonemes were processed by a segmentation program which was developed to extract, record, display and store them: the original speech samples were treated similarly. The letter information from the output of a letter recognizer was stored in an input stack (or buffer) until a complete word had been received.
Then the input word was synthesized by a synthesis program which assigned phonemes to the word according to the stored dictionary and the general grapheme-phoneme correspondence rules written in the program. A dictionary containing words that were not pronounced according to the above general rules, and in addition, a list of phoneme equivalents, were stored in a disk file system. The general rules were set up according to the most frequent grapheme-phoneme transcriptions and a frequency study of the first 1,000 words of the Thorndike word list. The synthesizer was capable of synthesizing over 90% of spoken English speech accurately and at the same time could be understood by the listener after a short period of training in spite of the mechanical sounding dialect which resulted from the incorrect pronunciation of some words.

An intelligibility test using isolated words, sentences, and short passages generated by the synthesizer was given to six subjects. The results showed that over 90% of the isolated words, 95% of the sentences and 98% of the short passages could be recognized by the test subjects after approximately one hour of training.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF ILLUSTRATIONS
LIST OF TABLES
ACKNOWLEDGEMENT
1. INTRODUCTION
2. SEGMENTATION AND SYNTHESIS
   2.1 The Computer System
   2.2 The Segmentation Process
   2.3 The Synthesis Process
       2.3.1 Input Routine
       2.3.2 Dictionary Method
       2.3.3 General Grapheme-Phoneme Rule Method
       2.3.4 Output Routine
   2.4 An Example of Word Synthesis
3. INTELLIGIBILITY TEST EXPERIMENT
   3.1 Preparation of Test Materials
   3.2 Experimental Design and Testing Procedure
   3.3 Results and Discussions
4. SUMMARY, DISCUSSIONS AND SUGGESTIONS
   4.1 Summary
   4.2 Discussions
       4.2.1 Preparation of Materials
       4.2.2 Storage of Data
       4.2.3 Synthetic Speech and Subjects
       4.2.4 Synthetic Speech and Other Speech Devices
   4.3 Suggestions
REFERENCES
APPENDIX A Description of Phonemes Used
APPENDIX B Data Codes
APPENDIX C Summary of the Frequency Study of the First 1,000 Words of the Thorndike Word List for Vowels
APPENDIX D Examples of Monosyllable Words
APPENDIX E Examples of PB Sentences
APPENDIX F Example of a Short Passage

LIST OF ILLUSTRATIONS

Figure
1 Block diagram of synthetic speech reading machine system
2 Computer system for speech synthesis
3 Flow chart for the segmentation program
4 a-f Time-amplitude waveforms of phoneme segment /m/
5 Flow chart for word synthesis
6 Word test scores of subjects as a function of test sessions
7 Mean word test scores as a function of test sessions
8 Sentence test scores of subjects as a function of test sessions
9 Mean sentence test scores as a function of test sessions
10 Short passage test scores of subjects as a function of test sessions
11 Mean short passage test scores as a function of test sessions
12 Discontinuous phonemic boundary of the /mi/ time-amplitude waveform
13 Continuous phonemic boundary of the /mi/ time-amplitude waveform

LIST OF TABLES

Table
I Grapheme-phoneme rules
II Means and standard deviations of the percent correct scores in the word test of subjects as a function of test sessions
III Confusion matrix of phonemes in word test after 1/2 hour of training
IV Confusion matrix of phonemes in word test after 1 hour of training
V Confusion matrix of phonemes of word test after 1 1/2 hours of training
VI Confusion matrix of phonemes of word test after 2 hours of training
VII Confusion matrix of phonemes of word test after 2 1/2 hours of training
VIII Means and standard deviations of the percent correct scores in the sentence test of subjects as a function of test sessions
IX Means and standard deviations of the percent correct scores in the short passage test of subjects as a function of test sessions

ACKNOWLEDGEMENT

The author is grateful to Dr. M.P. Beddoes, supervisor of the project, for his advice and encouragement throughout the course of the work, and he also wishes to express his appreciation to Rodney George for his technical assistance, and to everyone who participated in the experiment.

The author wishes to acknowledge also financial support from the National Research Council of Canada, the Medical Research Council of Canada, and the Vancouver Foundation.

1. INTRODUCTION

In recent years, a variety of reading machines have been proposed and some have actually been built which produce audible sounds from which the blind can read normal printed material.
Simple machines such as the Optophone, which generates buzz tones[1][2], and the Lexiphone, which produces musical tones[3], are portable and inexpensive, but require up to one year of training to master the code tones. In addition, the reading rate of these machines is slow. Other complex machines such as the terminal-analog synthesizer[4] and formant synthesizer[5] can produce speech of relatively good quality but they are expensive and complicated to control.

A machine called the spelled speech generator[6], which produces the letter sounds of the alphabet, is easy to master. The spelled speech generator uses 18 stored phoneme segments to produce the necessary spelled speech output by concatenation. This method is relatively simple and can be implemented easily in a portable machine for use in a talking typewriter[7]. The limit speed for spelled speech is about 80 words per minute; normal speech rate is 150 to 200 words per minute.

The long term purpose of this project is to produce a relatively simple and inexpensive speech synthesizer. The speech was controlled by error-free letter information which was presumed to come from a keyboard or from a letter recognizer or other input device. The immediate purpose of the project was to create a computer-based system for examining a proposed method of speech synthesis. By not requiring excessive naturalness, the synthesizer should economically produce a highly intelligible English dialect. 'Highly intelligible' means that an error rate with connected speech of less than 3 to 4% was achieved after an hour or so of training. The speech signal was produced by the concatenation of the stored phoneme segments.
The system was implemented on a PDP-12 digital computer with a core memory capacity of 16 kilo-'words' of 12 bits each ('word' means a computer memory word), along with its peripherals, such as the RK08 disk system, the magnetic tape transport, the CRT precision display, the teletype, the auditory output and the programmable clock (Fig. 1).

A photocell scanner was used to scan the printed text and extract letter features for letter recognition. After a letter was recognized, it was stored in the storage buffer until the whole word had been accepted. Then the word was processed by the synthesis program, which determined the phoneme equivalents of the word, and the word could then be synthesized by the concatenation of the stored phoneme segments. No intonation or stress was added to the speech output.

[A segmentation program was written to extract the phoneme] segments from recorded speech samples after sampling and quantizing the analog signals using an analog-to-digital converter with a sampling rate set by the internal clock of the computer. The program also displayed, both graphically and audibly, the speech samples for accurate and efficient extraction of the phoneme segments. Another synthesis program was then written to synthesize speech from the recognized letter inputs by means of a stored dictionary and general grapheme-phoneme correspondence rules.

An intelligibility test was carried out to study the quality of the speech in isolated words, sentences, and short passages. The experiment was conducted using six subjects, three blind and three sighted. The thesis is concluded with a summary, discussions and suggestions.
All letters in quotations represent graphemes of words, except when specified; for example, the graphemes of the word BET are shown by "bet". Letters in slashes represent the phonemic symbols; for example, the phonemic symbols of the word "bet" are /bet/.

[Fig. 1. Block diagram of synthetic speech reading machine system: printed text, photocell scanner, letter recognizer, digital storage of word, phoneme translator, digital storage of phonemes, D/A converter and amplifier, output speech.]

2. SEGMENTATION AND SYNTHESIS

2.1 The Computer System

The computer system for the segmentation and synthesis processes is shown in Fig. 2. The system consisted of a PDP-12 digital computer with 16K words of core memory of 12-bit word length. A teletype was used for simulating the input device as well as for printing and entering program instruction statements and data, and was the chief means of commanding the computer. The console switches on the front panel controlled programming functions. Expanded storage and transfer of programs and data were provided by the LINC magnetic tapes and the RK08 disk system. Each RK08 disk is organized into the equivalent of 6 additional LINC tape units. The disk system has a fast data transfer rate of 4096 'words' in 80 milliseconds. A CRT precision display unit was used to display processed speech signals or other data stored in the computer. It also allowed the speech signals to be displayed during the segmentation and synthesis processes.
Low-pass filters were used to limit the bandwidth of the audio signal entering the analog-to-digital (A/D) converter and coming from the digital-to-analog (D/A) converter. The A/D converter has 10-bit resolution while the D/A converter has 9-bit resolution. The input signal to the low-pass filter came from a tape-recorder. The output signal from the low-pass filter was directly connected to a loudspeaker, headphone, or tape-recorder.

[Fig. 2. Computer system for speech synthesis]

2.2 The Segmentation Process

Segmentation of the speech samples enabled the basic phonemes to be extracted, and reduced the data storage for the phonemes which were required for the production of the synthetic speech. By storing the digital data of the pitch periods of the vowels and vowel-like consonants, storage for these phonemes was further reduced. For some consonants, only part of the phoneme needed to be stored; for example, the silent interval (SI) preceding the burst of all the stop consonants was eliminated when being stored, and the SI was added when the stop consonant was being synthesized. For further storage reduction, only 6 bits of the original 10-bit data were stored and two 6-bit data elements were packed into one 12-bit 'word', which reduced the data storage by half.

A segmentation program was written for extracting the basic phonemes and displaying them graphically and audibly during the segmenting.
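The storage reduction described above (keep 6 of the 10 sample bits, then pack two 6-bit elements into one 12-bit word) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PDP-12 program itself: the function names are hypothetical, the 6-bit selection is modelled as the top bit of the sample plus a sliding 5-bit window (one reading of the "bits 2-7", "bits 2, 4-8" selections in the caption of Fig. 4, assuming bit 0 is the most significant bit of the 12-bit word and the 10-bit sample occupies bits 2-11), and placing the first sample in the upper half of the word is likewise an assumption.

```python
from __future__ import annotations

SIX_BIT_MASK = 0o77


def select_six_bits(sample10: int, skip: int = 0) -> int:
    """Reduce a 10-bit sample to 6 bits: the top bit plus a 5-bit window.

    skip=0 keeps the six most significant bits; each increment of skip
    moves the 5-bit window one position toward the least significant end.
    (Assumed selection scheme, see the lead-in.)
    """
    if not 0 <= sample10 < (1 << 10):
        raise ValueError("sample must fit in 10 bits")
    if not 0 <= skip <= 4:
        raise ValueError("skip must be between 0 and 4")
    top = (sample10 >> 9) & 1
    window = (sample10 >> (4 - skip)) & 0b11111
    return (top << 5) | window


def pack_pair(first: int, second: int) -> int:
    """Pack two 6-bit elements into one 12-bit word (first in the upper half)."""
    return ((first & SIX_BIT_MASK) << 6) | (second & SIX_BIT_MASK)


def unpack_pair(word12: int) -> tuple[int, int]:
    """Recover the two 6-bit elements from a 12-bit word."""
    return (word12 >> 6) & SIX_BIT_MASK, word12 & SIX_BIT_MASK


def pack_samples(samples6: list[int]) -> list[int]:
    """Pack a sequence of 6-bit samples, padding an odd tail with zero."""
    padded = samples6 + [0] * (len(samples6) % 2)
    return [pack_pair(padded[i], padded[i + 1]) for i in range(0, len(padded), 2)]
```

Packing eight 6-bit samples with `pack_samples` yields four 12-bit words, which is the halving of storage that the text describes.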
The program also enabled filing of the original speech samples and the basic phonemes under file names on magnetic tapes and/or disk for future reference. The flow chart for the segmentation program is shown in Fig. 3.

[Fig. 3. Flow chart for the segmentation program (TT1). After the sampling period is set, the program waits for one of five selections: 1. Listen & Display, 2. Record, 3. Read Data, 4. Write Data, 5. Write Phoneme.]

The input speech signal was sampled and stored in the data buffer. The size of the data buffer was predetermined by the software program. The sampling rate was 12.5 kHz (or 80 microseconds per sample period). For a data buffer of 8 kilo-words, a 640-millisecond speech sample could be stored. A threshold, which was set higher than the noise in the system, was used to automatically detect the beginning of the incoming speech signal during the sampling process. This system has the following features:

1) Listen and Display

This feature allowed the amplitude-time waveform of the speech segment in the data buffer to be displayed on the CRT, and the output was heard using the amplifier-loudspeaker unit. By controlling the teletype keyboard and the control switches of the computer, and generating the digital phoneme data through the D/A converter into the audio amplifier-loudspeaker and the CRT display, accurate and efficient selection of the phoneme segment was possible. The input signal amplitude was limited so that it could be represented by 6 binary bits for each sample. Since a packed byte array of two 6-bit data elements in a 12-bit 'word' was the desired final form, a proper selection of the 6 bits was required in order to retain maximum information from the original data. With the aid of the sense switches on the computer, a comparison of the original 10-bit data and the selected 6-bit data was made so that the final form of the phoneme data would not be distorted. The relative amplitudes of the phonemes were adjusted by this feature. Figs. 4 a-f show the comparison of the original data with various selections of the 6-bit data of the phoneme /m/.

2) Record

The speech signal coming from a microphone or tape-recorder was band-limited by a 6.25 kHz low-pass filter, and was then stored in the data buffer after it had been sampled and quantized. By the setting of the threshold, the recording could be started automatically once the beginning of the signal was detected. The limit of the data buffer was 8K words for each speech sample recording, or approximately 640 milliseconds.

3) Read Data

The data of individual speech samples stored under different file names in the disk system could be read into the data buffer for re-examination or modification. There were 37 files for the original data recordings which were readily accessible.
4) Write Data

The original speech sample recording could be stored under a file name in the disk system for future reference. The transfer was done by writing the whole data buffer onto a disk file.

5) Write Phoneme Segment

After a phoneme segment was selected, it was stored under a phoneme file name in the disk system. The speech synthesis program used these phoneme segments for speech generation. The phoneme segments were packed into a 6-bit array in the disk files and were transferred to the main computer memory during the speech synthesis process.

[Figs. 4a-f. Time-amplitude waveforms of phoneme segment /m/. (a) Original waveform. (b) Selected 6-bit waveform, bits 2-7. (c) Selected 6-bit waveform, bits 2, 4-8. (d) Selected 6-bit waveform, bits 2, 5-9. (e) Selected 6-bit waveform, bits 2, 6-10 (distorted). (f) Selected 6-bit waveform, bits 2, 7-11 (distorted).]

2.3 The Synthesis Process

Three stages were needed to produce speech from text: the letter recognizer; the letter (or grapheme) to phoneme map; and the translation from the phoneme to the auditory output. The letter recognizer is not dealt with in this thesis. The translation was dealt with by Suen[6]. Synthetic speech was generated by the concatenation of digital phoneme data through the D/A converter to the amplifier-loudspeaker system. There were two methods of mapping graphemes to phonemes and they will be discussed below.

Thirty-three basic phonemes (10 vowels, 22 consonants, and 1 silent interval) were prepared by the segmentation program and stored in the disk system. The list of these phonemes is shown in Appendix A.
A synthesis program was developed to synthesize speech by the concatenation of these phonemes. A flow diagram for the word synthesis is shown in Fig. 5. A string of words, or a sentence, was synthesized word by word, that is, by assigning phonemes to each word individually. The phoneme information was stored and there was no output until the output command was given: a whole passage could be accepted and synthesized before continuous speech was generated. Phonemes were assigned to an input word only after the whole word had been accepted from the input device.

[Fig. 5. Flow chart for word synthesis (ZZDICT)]

The first map was in the form of a dictionary. It applied to words which could not be constructed according to the rules of correspondence between the graphemes and phonemes of English, as specified by the 'general' method described below. The second method was the 'general' grapheme-phoneme rule method. It was a set of general rules specifying the grapheme-phoneme transcription of English words. After a word was accepted in the input buffer, a search for the word in the dictionary was initiated. If a match was found, its phoneme equivalent specified in the dictionary would be used. If there was no match, the general grapheme-phoneme rules would apply, assigning the phonemic equivalent of that word.
Details of these two methods and some of their important features will be discussed in the following paragraphs.

2.3.1 Input Routine

Each input word was accepted and stored one at a time in the input stack via the input device. In the present system the input was simulated on a teletype (Model 33 ASR Teletype). The teletype keyboard also controlled the erasure and modification of the input word. It also controlled the output, which was monitored by the audio and graphic displays. The input stack was reserved for every grapheme word entry until the word had been assigned the appropriate phonemes. Every new word entry would automatically erase the old entry. The input stack was a packed byte array of two 6-bit ASCII codes in each 12-bit 'word'. A list of the word codes is shown in Appendix B. For example, the word "beast" is in the form:

0205 ("be" in the 1st location of the input stack)
0123 ("as" in the 2nd location of the input stack)
2400 ("t" in the 3rd location of the input stack)

This form was essential for the search process of the dictionary.

2.3.2 Dictionary Method

A dictionary containing words which were not pronounced according to the general grapheme-phoneme rules was prepared and stored in the disk system. It contained both the words and their phoneme string equivalents. There were about 300 words in the dictionary. These words were prepared on the basis of the first 1000 words of the Thorndike list. There were 26 files under file names A to Z in the dictionary. Each file contained the words beginning with one letter, arranged in order as in an ordinary dictionary.
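Both the input stack of Section 2.3.1 and the dictionary word codes rely on packing two 6-bit letter codes into each 12-bit word. A minimal Python sketch of that packing follows. The function names are hypothetical, and taking the low six bits of the upper-case ASCII code is an assumption about what "6-bit ASCII" means here; it does reproduce the octal values given for "beast", with a zero code padding an odd-length word.

```python
from __future__ import annotations


def letter_code(ch: str) -> int:
    """Return the 6-bit code of a letter: 1 for 'a' up to 26 (octal 32) for 'z'."""
    if len(ch) != 1 or not ch.isascii() or not ch.isalpha():
        raise ValueError(f"not a letter: {ch!r}")
    # Low six bits of the upper-case ASCII code (assumed encoding).
    return ord(ch.upper()) & 0o77


def pack_word(word: str) -> list[int]:
    """Pack a word into 12-bit words, two letter codes per word."""
    codes = [letter_code(ch) for ch in word]
    if len(codes) % 2:
        codes.append(0)  # pad the last half-word with zero
    return [(codes[i] << 6) | codes[i + 1] for i in range(0, len(codes), 2)]


def as_octal(words: list[int]) -> list[str]:
    """Format 12-bit words as four octal digits, as printed in the text."""
    return [format(w, "04o") for w in words]
```

For instance, `as_octal(pack_word("beast"))` gives the three octal words 0205, 0123 and 2400 listed in the example of Section 2.3.1.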
There were three types of data in the dictionary: (1) the word codes, (2) the phoneme codes, and (3) the indicator codes for the previous two types and the termination of the file. Details of these data codes are shown in Appendix B. For example, the word "decide" has the format:

7777        indicator code for word code
0405        "de"
0311        "ci"       word codes
0405        "de"
0000        indicator code for phoneme code
0411        /di/
2334        /saee/     phoneme codes
0477        /d/
7777        indicator code for next word
...
7777 6666   indicator code for file termination

The word codes were ordered from 1₈ to 32₈ (1₁₀ to 26₁₀), representing the 26 letters of the alphabet. The phoneme codes ranged from 1₈ to 56₈ (1₁₀ to 46₁₀), representing 46 different combinations of the 33 phonemes. The codes were packed in such a way that the storage for the dictionary was minimal and they were compatible with the input stack format.

As soon as the first letter of the word was received, the appropriate dictionary file would be transferred to the working area for searching. If a match was found, the equivalent phonemes were used and stored in the output stack. If there was no match, the word would then be transcribed by the general grapheme-phoneme rules.

The dictionary was searched by comparing the words in the dictionary with the word in the input stack. The search was linear and started from the beginning of the file.
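The record layout and the linear search can be sketched as follows. This is a hedged illustration, not the thesis program: all names are hypothetical, the half-word 77 (octal) is assumed to be a filler because the "decide" record ends with 0477 for /d/, and the phoneme codes are left as numbers because their meanings are defined in Appendix B. The search stops early once a dictionary word sorts after the target, which is the termination condition described in Section 2.4.

```python
from __future__ import annotations

WORD_MARK = 0o7777     # indicator: word codes follow
PHONEME_MARK = 0o0000  # indicator: phoneme codes follow
FILE_END = 0o6666      # indicator: end of the dictionary file
FILLER = 0o77          # assumed filler for an unused half-word of phoneme data


def split12(word12: int) -> tuple[int, int]:
    """Split a 12-bit word into its two 6-bit halves."""
    return (word12 >> 6) & 0o77, word12 & 0o77


def decode_letters(words: list[int]) -> str:
    """Turn packed letter codes (1..26) back into a lower-case string."""
    codes = [c for w in words for c in split12(w) if c]
    return "".join(chr(c + 0o100).lower() for c in codes)


def decode_phonemes(words: list[int]) -> list[int]:
    """Return the phoneme codes, skipping the assumed filler value."""
    return [c for w in words for c in split12(w) if c not in (0, FILLER)]


def parse_file(stream: list[int]) -> list[tuple[str, list[int]]]:
    """Parse one dictionary file into (word, phoneme codes) records."""
    records: list[tuple[str, list[int]]] = []
    letters: list[int] = []
    phonemes: list[int] = []
    mode = None
    for value in list(stream) + [FILE_END]:
        if value in (WORD_MARK, FILE_END):
            if letters:  # close the record collected so far
                records.append((decode_letters(letters), decode_phonemes(phonemes)))
            letters, phonemes, mode = [], [], "word"
            if value == FILE_END:
                break
        elif value == PHONEME_MARK and mode == "word":
            mode = "phoneme"
        elif mode == "word":
            letters.append(value)
        elif mode == "phoneme":
            phonemes.append(value)
    return records


def lookup(records: list[tuple[str, list[int]]], target: str):
    """Linear search from the start of the file with early termination."""
    for word, phonemes in records:
        if word == target:
            return phonemes
        if word > target:  # ordered file: the target cannot appear later
            return None
    return None


# The "decide" record from the text, followed by the file terminator.
D_FILE = [0o7777, 0o0405, 0o0311, 0o0405, 0o0000,
          0o0411, 0o2334, 0o0477, 0o7777, 0o6666]
```

A caller would fall back to the general grapheme-phoneme rules whenever `lookup` returns `None`.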
Since the words in the dictionary were ordered and the vocabulary was small, the search time for any particular word was short. Section 2.4 demonstrates these two methods.

2.3.3 General Grapheme-Phoneme Rule Method

In the English language, most of the consonant graphemes have a fixed or favorite phoneme transcription; for example, the grapheme "t" is mostly pronounced as the phoneme /t/ except when combining with other graphemes such as "th", which is pronounced as /ð/ or /θ/. But for vowel graphemes, there are often exceptions to the pronunciations of vowels in the context of other graphemes; for example, "e" can be pronounced as /ɛ/ as in "bed", /i/ as in "evening", or silent as in "site". In order to construct a set of general rules for the vowels as well as for the consonants, a frequency study of Thorndike's first 1000 words was done for the vowels and some common combinations of vowels such as "ee", "oo" or "ei". From this study the most commonly used grapheme-phoneme transcriptions for the vowels were used as the general rules. A summary of the study is shown in Appendix C.
A total of 72 general grapheme-phoneme correspondence rules were derived and were then implemented by a software program. The rules are listed in Table I. With some graphemes such as "a", the phonemic transcription /ei/ was used rather than the more commonly used /a/ because of the distinctiveness of the /ei/ pronunciation, which was easier to recognize even though in some cases the pronunciation might have been incorrect. Some other common double grapheme transcriptions were added in the rules because of the possible confusions which might arise when these combinations appear; for example, "ew" was pronounced as /u/ rather than /ew/, which was not so common. These rules were not exhaustive, yet with the aid of the dictionary, the majority of common words could be correctly synthesized by the combination of these two methods.

The duration of the silent interval, the time interval between the termination of the vowel-formant transition preceding the stop and the onset of the transition to the following vowel, is an important factor in the perception and discrimination of the voicing of the stop consonants. The silent intervals of all the stop consonants were measured: /p/, 117.64 milliseconds; /k/, 113.09 milliseconds; /t/, 102.04 milliseconds; /b/, 85.58 milliseconds; /g/, 79.97 milliseconds and /d/, 68.58 milliseconds[9]. These measurements of the silent intervals for the stop consonants were used as rules when the stops were generated.
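The general rule method of this section can be illustrated with a short Python sketch. It is not the thesis program: the rule table below is only a small subset chosen for illustration, the names `RULES` and `transcribe` are hypothetical, and the longest-match-first scan is an assumed reading of how a letter "and its neighbors" are examined for grapheme combinations before the single-letter rule is applied. Context-dependent rules (word-final graphemes, the "if no other vowel" conditions) are left out.

```python
from __future__ import annotations

# Illustrative subset of grapheme-to-phoneme rules (assumed symbols).
RULES = {
    "th": "ð", "ck": "k", "ee": "i", "ea": "i", "oo": "u", "ew": "u",
    "b": "b", "c": "k", "e": "ɛ", "k": "k", "m": "m", "n": "n",
    "s": "s", "t": "t",
}
MAX_GRAPHEME = max(len(g) for g in RULES)


def transcribe(word: str) -> list[str]:
    """Assign phonemes left to right, preferring the longest grapheme."""
    word = word.lower()
    phonemes: list[str] = []
    i = 0
    while i < len(word):
        # Try multi-letter combinations first, then the single letter.
        for length in range(min(MAX_GRAPHEME, len(word) - i), 0, -1):
            grapheme = word[i:i + length]
            if grapheme in RULES:
                phonemes.append(RULES[grapheme])
                i += length
                break
        else:
            raise ValueError(f"no rule for {word[i]!r} in this subset")
    return phonemes
```

With this subset, `transcribe("tent")` assigns /t/, /ɛ/, /n/, /t/ in turn, which mirrors the worked example of Section 2.4.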
2.3.4 Output Routine

The phoneme information stored in the output stack was used to generate the synthetic speech. The output stack specified the order and location of the phonemes which were to be uttered. By storing all the output phoneme information, generation of the speech could be done quickly and efficiently, and no delays at phonemic boundaries, which would cause 'clicks', would appear in the output speech. The output stack also allowed modification and repetition of an accepted input without re-entering the whole message. The output rate of the synthetic speech was maintained at 80 microseconds per sample in order to produce the same voice and pitch quality as the original recording.

TABLE I
Grapheme-Phoneme Rules

Graphemes   Phoneme Equivalents   Remarks
a           ei
ae          ei
ai          ei
au          [illegible]
ar          a
aw          o
b           b
c           k
ch          tʃ
ce-         s                     "-" represents space
ces         s
ck          k
[illegible entry]
e           e
ea          i
ee          i
ed          d                     except "ted" and "ded"
es          s                     except "ses", "xes" and "zes"
ei          i
eo          i
er          [illegible]
eu          ia
ew          u
e-          #                     if no other vowel in the same word
f           f
g           g
ges-        dʒes
ge-         [illegible]
h           h
-i-         aee
i           i
ie          i
ir          [illegible]
[illegible entry]
ing         [illegible]
ive         iv
iCe         aee                   except when C (consonant) is "v"
j           [illegible]
k           k
l           l
m           m
n           n
o           o
oa          o
oe          o
oi          oi
oo          u
ou          u
ong         [illegible]
p           p
q           [illegible]
r           r
s           s
sh          ʃ
ses         ses
t           t
th          ð
tes         tes
u           ʌ
ua          uei
ue          [illegible]
ui          [illegible]
ur          [illegible]
v           v
w           w
wh          w
x           s
xes         ses
y           j                     if first letter of a word
y-          [illegible]           if other vowel present in the same word
y-          aee                   if no other vowel in the same word
z           z
zes         zes

2.4 An Example of Word Synthesis

The word "tent" is used to demonstrate the synthesis process.
As soon as the first letter "t" was accepted, the dictionary file under file name "T" was transferred to the working area while the rest of the word was being accepted. This simultaneous scheduling of the transfer of the dictionary and the acceptance of the input enabled the process to operate at a faster rate than sequential operation would allow. After the whole word had been accepted, the search for the matched word commenced from the beginning of the file. This search would terminate if the word being examined in the dictionary had a lower order than the input word or the file had ended. A 'lower order' means that the word being searched is arranged lower in the dictionary list than the input word would be, should the input word be in the dictionary; for example, "take" is ordered higher than "tent" while "test" is lower than "tent". If the word "tent" had been found in the dictionary, its phoneme string equivalent in the dictionary would have been stored in the output stack. Then the computer would have waited for a further command from the input device, i.e., whether to output the stored message or accept the next word, etc. Since no match was found, control was transferred to the general rule routine. First, the letter "t" and its neighbors were examined for possible grapheme combinations, namely "th" and "tes", specified by the general rules. Since these were not present, the computer took the first letter "t" and assigned the phoneme /t/ to it. Again, since there were no other possible combinations of graphemes starting with "e", /ɛ/ was assigned to "e".
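The dictionary search with early termination and the fallback to the general rules can be sketched as follows. This is only an illustrative reconstruction: the dictionary entries in T_FILE, their phoneme strings, the small RULES subset, and the choice to try longer graphemes before shorter ones are all assumptions made for the example, not details of the original program.

```python
# Hypothetical miniature dictionary file for the letter "t", kept in
# alphabetical order as (word, phoneme list). Entries are illustrative only.
T_FILE = [("take", ["t", "ei", "k"]), ("two", ["t", "u"])]

# A small assumed subset of grapheme-phoneme rules, enough for the word "tent".
RULES = {"th": ["ð"], "tes": ["t", "ɛ", "s"], "t": ["t"], "e": ["ɛ"], "n": ["n"]}
MAX_GRAPHEME = max(len(g) for g in RULES)


def search_dictionary(word, entries):
    """Scan from the start; stop once an entry sorts after the input word."""
    for entry_word, phonemes in entries:
        if entry_word == word:
            return phonemes
        if entry_word > word:  # a 'lower order' entry was reached: stop early
            return None
    return None  # end of file reached without a match


def apply_rules(word):
    """Assign phonemes left to right, trying longer graphemes first (assumed)."""
    phonemes, i = [], 0
    while i < len(word):
        for size in range(min(MAX_GRAPHEME, len(word) - i), 0, -1):
            grapheme = word[i:i + size]
            if grapheme in RULES:
                phonemes.extend(RULES[grapheme])
                i += size
                break
        else:
            raise ValueError(f"no rule for {word[i]!r}")
    return phonemes


def transcribe(word):
    """Dictionary first; general rules only when no match is found."""
    found = search_dictionary(word, T_FILE)
    return found if found is not None else apply_rules(word)


if __name__ == "__main__":
    print(transcribe("tent"))
```

In this sketch the search for "tent" stops at "two", the first entry ordered after it, and the rules then assign one phoneme per letter.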
Similarly, the next letter "n" was examined, and the final transcription for the word "tent" was /tɛnt/. The order and location information of these phonemes was stored in the output stack. If an output command was given, these phonemes would then be output through the audio and graphic displays by concatenation. Furthermore, if another word was accepted as input, it would be transcribed using either one of the methods as before, and its phoneme information would be stored following that of the previous word in the output stack. Each word was separated by a silent interval of about 200 milliseconds. Connected speech was produced by storing and then generating the phonemes of a word string in succession.

3. INTELLIGIBILITY TEST EXPERIMENT

3.1 Preparation of Test Materials

In order to assess the effectiveness of the system, intelligibility tests using isolated words, phonetically balanced (PB) sentences, and short passages were given to test subjects. Three types of test materials were used because the perception of isolated words is different from that of words in context. In grammatical structures, a listener apparently perceives continuous speech in terms of phrases or longer elements. He may delay interpretation of words rather than interpret each word as it occurs. Nongrammatical structures, such as isolated words, cannot be processed this way.
They must be perceived in terms of shorter temporal elements. The mechanisms and time for processing on-going contextual information may be considerably different from those for isolated words, even though both are speech sounds [10]. Continuous speech produces complex temporal patterns that are perceived as a whole.

Uniformity is a basic requirement for the test materials used in intelligibility tests; different lists employed under different listening conditions should produce comparable results [11]. The word test lists contained synthesized words of the consonant-vowel-consonant (CVC) type [10][11][12]. The purpose was to determine qualitatively whether the phonemes would be intelligible within a word. The word lists were phonetically balanced, and the test words were common and familiar. The phonemic content occurred with the same frequency as the phonemes found in everyday speech. Examples of these monosyllabic words are shown in Appendix D.

Lists of PB sentences were also used in the intelligibility test [13][14]. They were equally difficult and were reasonably short and homogeneous. Short lists are desirable so that the articulation score will not be unduly influenced by factors that tend to raise or lower the articulation score over a given test. In sentence items, which were scored in terms of the meaning conveyed, psychological factors were still more important in determining the articulation score than when single words are used.
The percentage of key words correctly recorded depended not only upon the articulation values of these individual words, but also upon the relation they bore to the other words of the sentence. Examples of the PB sentences used in the experiment are shown in Appendix E.

Short passages from a textbook [15], newspapers and magazines were also presented to subjects. The psychological factors which would [illegible] the sentence test, except that the contents of the passages are more predictable than the sentences and this will subsequently affect the scores. The materials covered by the passages have to be equally familiar to each test subject so that the test results are comparable. The materials used in all tests were not restricted to the 1,000-word vocabulary. For example, only 70% of the sentences and short passages shown in Appendices E and F were in the present vocabulary. The use of texts, such as history books, containing frequent appearances of names which are difficult to pronounce, was avoided.

3.2 Experimental Design and Testing Procedure

Six subjects, 3 sighted and 3 blind, aged 20 to 27, were used in the experiment. All subjects had at least high school education; 4 of them had completed undergraduate college education. No one had formal training in articulatory phonetics. The experiment was designed to test the confusion among the phonemes in isolated words and the intelligibility scores of the synthetic speech in isolated words, PB sentences, and short passages.
Each subject was given half an hour per session to go over a short list of 200 isolated words (Appendix D) and to become familiar with the synthetic speech. After the brief training period, the subjects were given the test material. The material had been previously recorded on a tape recorder (Sony TC-165) and was played back to the subjects in a soundproof room through an amplifier-loudspeaker system (Ampex 2044). Five sets of 50 isolated words each, five sets of 10 PB sentences each, and five short passages were individually presented to the 6 subjects. Examples of these test materials are shown in Appendices D, E and F. Each word and each sentence was repeated three times during the presentation. There were rest periods in between test sessions to ensure that the subjects would remain alert throughout the whole test. After this series of tests, the subjects were given another training period followed by another series of tests as before. A total of 5 sessions were given. Contextual material was presented at a rate of 150 words per minute.

3.3 Results and Discussions

The intelligibility scores of the six subjects in the word test with respect to the number of training sessions are shown in Fig. 6. Each session was about one-half hour long. The means and standard deviations of the scores of the subjects in the word test are shown in percent in Table II. The mean scores are shown in Fig. 7. The standard deviations show that initially the scores were widely spread among the subjects.

Fig. 6 Word test scores of subjects as a function of test sessions.
Test Session   Mean   Standard Deviation
1              74.9   9.45
2              84.9   5.80
3              89.0   3.01
4              90.8   1.82
5              92.3   2.33

Table II Means and standard deviations of the percent correct scores in the word test of subjects as a function of test sessions.

Fig. 7 Mean word test scores as a function of test sessions.

Rapid learning in the first couple of sessions is indicated by the rise in Fig. 7, but the rate of learning decreased as the subjects became more familiar with the 'dialect' of the synthetic speech. The errors in these tests were caused mainly by a few phonemes. From the confusion matrix of the word test shown in Tables III-VII, confusions among the phonemes /b, k, p, dʒ, ð, ʃ, h, r, j/ were the major errors, although these phonemes formed only about 3% of the phonemes presented in the word test. The majority of the errors were with continuants and mostly occurred within a class of phonemes with the same manner of articulation; for example, one fricative was confused with another fricative. Few errors, for example only about 3% for /u/, were made in the recognition of the vowels. Consonants were more difficult to perceive correctly than the vowels because the perception of consonants depends upon the vowels that follow [10]. Also, the high-frequency components of the consonants are one of the main cues for consonant perception. In the sampling and quantization process of the speech signals, details of these components were subject to more distortion than the lower-frequency components which exist in the intense first formant of the vowel sounds. There are other factors that are important to the perception of the consonants which the concatenation method tends to distort.
Formant transition is one of the major problems; for example, in stop consonant-vowel formant transitions, the second formant transition effectively cues the discrimination of /p, t, k/ versus the voiced cognates /b, d, g/. The distinction between the unvoiced and voiced cognates is also made partly by the first formant transition. Recognition of consonants in the initial position was poorer than for those in the final position in a word; for example, from the test results, the consonant /v/ was poorer in the initial position than in the final position. Also, the fricative /ð/ was a consistent source of error. In addition, /ʃ/ was often heard as [illegible], and /r/ as /w/ or /v/. The affricative /dʒ/ was often confused with nil as well. The nasals /m/ and /n/ were sometimes confused in the initial positions. The continuant /j/ in the initial position was easily confused with the lateral /l/. The plosive /b/ was often heard as the alveolar plosive /d/. The final position errors were chiefly among the nasals /m, n, ŋ/.

Table III Confusion matrix of phonemes in word test after 1/2 hour of training (% trace = 88.6%)

Table IV Confusion matrix of phonemes in word test after 1 hour of training (% trace = 93.5%)

Table V Confusion matrix of phonemes in word test after 1 1/2 hours of training (% trace = 95.3%)

Table VI Confusion matrix of phonemes in word test after 2 hours of training (% trace = 96.6%)

Table VII Confusion matrix of phonemes in word test after 2 1/2 hours of training (% trace = 97.5%)
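The "% trace" figure quoted with each confusion matrix is presumably the share of all responses that fall on the diagonal, i.e. the stimuli that were identified correctly. A minimal sketch under that assumption is given below; the function name percent_trace and the 3 x 3 example matrix are hypothetical and are not data from the experiment.

```python
def percent_trace(matrix):
    """Percentage of all responses that lie on the diagonal of a square matrix.

    matrix[i][j] is the number of times stimulus i was heard as response j.
    """
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix has no responses")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total


if __name__ == "__main__":
    # Illustrative counts only: rows are stimuli, columns are responses.
    example = [
        [8, 2, 0],
        [1, 9, 0],
        [0, 0, 10],
    ]
    print(percent_trace(example))
```

A rising trace over successive sessions then corresponds directly to fewer off-diagonal confusions.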
The results shown by the confusion matrix indicated rapid learning initially by the test subjects. They learned to discriminate phonemes which were difficult at the beginning of the test, such as [illegible] and /ð/, /w/ and /r/ and /j/.

The scores of each subject in the PB sentence tests are shown in Fig. 8. The means and standard deviations of the scores for the whole group are shown in Table VIII. The mean scores are shown in Fig. 9. All the subjects acquired high scores within a short period of time. Most of them, after only two or three training sessions, could recognize over 95% of the sentences. The high scores in the sentence test were due to the fact that the sentence context reduced the number of alternative words among which the listener must decide. During the course of the test, subjects often tried to look for certain key words in a sentence, and sometimes even if they missed some words they could still 'guess' the sentence by deriving the meaning from the key words.

Fig. 8 Sentence test scores of subjects as a function of test sessions.

Test Session   Mean   Standard Deviation
1              83.8   7.34
2              89.8   2.85
3              93.6   2.84
4              94.6   2.5
5              97.0   1.64

Table VIII Means and standard deviations of the percent correct scores in the sentence test of subjects as a function of test sessions.

Fig. 9 Mean sentence test scores as a function of test sessions.

In the intelligibility test with short passages, the words in the same sentence as well as the previous one affect the recognition of the presented material. The results of the test with short passages given to the subjects are shown in Fig. 10. The means and standard deviations of the scores are shown in Table IX. The mean scores are plotted in Fig. 11. The scores for this test were higher than for the sentence test; most of the subjects could recognize 98% of the passages being presented. The content of the material was usually given by frequent 'key' words; for example, in science textbooks, the words "measure", "example", or "conclude" appeared quite often.

Fig. 10 Short passage test scores of subjects as a function of test sessions.

Test Session   Mean   Standard Deviation
1              86.6   2.83
2              92.6   3.02
3              97.5   1.14
4              97.8   1.12
5              98.5   0.94

Table IX Means and standard deviations of the percent correct scores in the short passage test of subjects as a function of test sessions.

Fig. 11 Mean short passage test scores as a function of test sessions.

4. SUMMARY, DISCUSSIONS, AND SUGGESTIONS

4.1 Summary

A segmentation program was written for the PDP-12 digital computer to record, display and listen to the speech samples. Using the sampled and digitized speech samples, 33 basic phoneme segments were extracted. Synthetic speech was produced by the concatenation of these phonemes with a dictionary stored in the disk and a software program for the general grapheme-phoneme correspondence rules.
The dictionary contained those 300 words which violated the correspondence rules on the basis of a frequency study of the first 1,000 words of the Thorndike word list. A total of 72 general correspondence rules was used. After an input word was accepted, the dictionary was searched first to try to find a match. If a match was found, the equivalent phonemes defined in the dictionary were used to synthesize the word. If not, the general rules were used to assign phonemes to the input word.

Test materials of isolated words, PB sentences, and short passages generated as synthetic speech were presented to 6 subjects in intelligibility tests. The results showed that the subjects could learn to identify the speech within a short period of training. The scores were relatively high, especially with the contextual material. Most errors were caused by a few phonemes which were in the same manner class of articulation. The meaning of the test materials was easily understood by the subjects at the normal speech rate of 150 words per minute.

4.2 Discussions

4.2.1 Preparation of Materials

During the preparation of the phonemes, it was found that the time-amplitude waveform of every phoneme segment must have the same positive slope at the start and at the end of the waveform. The waveform must start and end with zero values. These conditions are required for continuity at the transition between adjacent phonemes. Any discontinuity would result in distortion at the phoneme boundary, and a noticeable 'click' or 'bump' would be heard in the output speech, which is highly undesirable and interferes with intelligibility.
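One way to obtain segments that satisfy these boundary conditions is to cut each recorded phoneme at rising zero crossings. The sketch below is an assumed automatic procedure offered only for illustration; the text does not say how the boundaries were actually chosen, and the function names are hypothetical.

```python
def rising_zero_crossings(samples):
    """Indices i where the waveform passes from <= 0 at i to > 0 at i + 1."""
    return [
        i for i in range(len(samples) - 1)
        if samples[i] <= 0 < samples[i + 1]
    ]


def trim_to_rising_crossings(samples):
    """Keep the span from the first to the last rising zero crossing.

    The result starts at or just below zero heading upward and ends at or
    just below zero heading upward, so two trimmed segments can be joined
    with matching (positive) slope at the boundary.
    """
    crossings = rising_zero_crossings(samples)
    if len(crossings) < 2:
        raise ValueError("need at least two rising zero crossings")
    return samples[crossings[0]:crossings[-1] + 1]


if __name__ == "__main__":
    recorded = [3, -2, 0, 4, 1, -3, -1, 2, 5, -4]
    print(trim_to_rising_crossings(recorded))
```

Because every trimmed segment begins and ends on the same kind of crossing, any pair can be concatenated in either order without a slope reversal at the join.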
An example of this case can be shown by joining the phonemes /m/ and /i/ as illustrated in Fig. 12. Since the two phonemes have different slopes in their waveforms, this would produce distortion in the output sound /mi/. One means of correcting this phenomenon is to store both /m/ and /i/ with the same starting and ending slopes and with a zero crossing at both the beginning and the end of the waveform, as shown in Fig. 13. All the phonemes must have positive slopes so that any combination of the phonemes can be joined without any discontinuity problem.

Fig. 12 Discontinuous phonemic boundary of the /mi/ time-amplitude waveform

Fig. 13 Continuous phonemic boundary of the /mi/ time-amplitude waveform

The recording of the speech samples was done by one speaker in order to maintain consistency in the vocal quality of the speech, especially with vowels, which were kept at equal pitch throughout the preparation of the basic phonemes. The vowels are important for consistency in the identity of the speaker, but since consonants are less distinctive from one speaker to another, it is not so important that the same voice be used to record the consonant phonemes. Consonant phonemes recorded by a second speaker were interchanged with the original speaker's corresponding phonemes: little difference was noticed.

4.2.2 Storage of Data

A total of 186.8 kilo-bits would be required for storing all the 33 basic phonemes if the whole basic phoneme were to be recorded and stored in a 12-bit 'word' per sample. But for the vowels and the vowel-like phonemes, only the samples within one pitch period were stored, and the vowel was produced later by repeating the stored samples 30-40 times. This process reduced the storage to 163.2 kilo-bits. Further, the basic phonemes were packed into 6 bits per sample with two samples in each 12-bit 'word', thus further reducing the storage by half, to 81.6 kilo-bits. All phonemes were stored in the core memory of the computer. Approximately 7,000 12-bit 'words' were used for phoneme storage, among which only 500 'words' were used for storing vowels and semi-vowels. Almost 6.5 kilo-'words', or 93% of the total data storage, were devoted to consonant storage, especially for the fricatives /v, f, ð, z, s, ʃ, h/, the affricatives /tʃ, dʒ/, and the continuants /w, l, r, j/. Often one of these phonemes, such as /h/, would require 500 'words' of packed byte storage; storing a pitch period of a typical vowel such as /I/ requires only 32 'words'. A cheap means of storing these phonemes was tried using an RK08 disk.
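The packing of two 6-bit samples into each 12-bit 'word' described in Section 4.2.2 can be sketched as below. The bit layout (first sample in the upper six bits), the unsigned 0-63 sample range, and the function names are assumptions made for illustration; the original packing format is not specified.

```python
def pack(samples):
    """Pack 6-bit samples (0..63) two per 12-bit word; pad an odd tail with 0."""
    if any(not 0 <= s <= 63 for s in samples):
        raise ValueError("samples must fit in 6 bits")
    padded = list(samples) + [0] * (len(samples) % 2)
    return [(padded[i] << 6) | padded[i + 1] for i in range(0, len(padded), 2)]


def unpack(words, count):
    """Recover the first `count` 6-bit samples from packed 12-bit words."""
    samples = []
    for word in words:
        samples.append((word >> 6) & 0o77)  # upper six bits
        samples.append(word & 0o77)         # lower six bits
    return samples[:count]


if __name__ == "__main__":
    words = pack([1, 2, 63])
    print(words, unpack(words, 3))
    # 81.6 kilo-bits at 12 bits per word is 6,800 words, close to the
    # "approximately 7,000" words quoted in the text.
    print(81_600 // 12)
```

With this layout, the 32 'words' quoted for one vowel pitch period hold 64 samples, which at 80 microseconds per sample is about 5.12 milliseconds of waveform.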
Word synthesis was done directly with phonemes stored on the disk, but because the disk access time ranges from 2.35 to 477 milliseconds, delays of 50 to 150 milliseconds often appeared between adjacent phonemes and caused 'bumpy' sounds in the generated speech, which were judged to be unacceptable.

4.2.3 Synthetic Speech and Subjects

All the subjects employed in this experiment were young adults aged from 20 to 27. Among them, 4 subjects had had university education and the other two had finished high school. However, from the test results, educational background did not seem to have any effect on individual ability in learning the speech.

The test subjects were chosen at random. There was no intention of restricting the use of the speech to a certain group of people. The results did not show any significant difference in performance between the sighted and blind subjects. The materials presented were relatively simple and did not demand great understanding ability. Two elderly people were used in a pilot experiment. Their intelligibility scores in the isolated word test were lower than those obtained with the young subjects. The elderly subjects were more confused with phonemes which had important cues in high-frequency components, such as /tʃ/. But with the sentence and short passage tests, they performed as well as the young subjects.

4.2.4 Synthetic Speech and Other Speech Devices

The synthetic speech can be presented at a normal speaking rate of 150-200 words per minute [illegible] adjusted to individual [illegible]. Its rate is twice as fast as the spelled speech.
Also, it does not require the listener to memorize the spelling of the words or to further decode them into a meaningful message. The speech is perceived as any normal speech, which reduces the number of discrete elements per word and permits the transmission of information at a rate that is rapid and well within the resolving power of the ear. In comparison with spelled speech, the speech synthesized by the method discussed above can be identified with far greater accuracy and speed and is highly distinctive. It has been shown that this speech could be learned within a short period of time. It is basically English, almost a special dialect at times, which is nevertheless easily recognizable. The speech is much more intelligible than the Spelltalk by Longini [16]. There are other machines which produce better quality speech than the present synthesizer, but these systems are more complex and expensive. Many improvements are possible for the present system, but the performance of the system at this stage is adequate.

4.3 Suggestions

In order to improve the intelligibility and the quality of the synthetic speech, one method is to increase the number of stored phonemes in the system. More refined phonemes, such as the schwa /ə/, for example the "a" in the word "about", can be added to the present phoneme data. These phonemes will increase the naturalness of the speech quality and will also make some of the words easier to recognize.
Since the present dictionary covers about 80% of the commonly used English words, a larger dictionary can be compiled to cover more words by a frequency study of the first 3,000 words, which are about 95% of the most frequently used words. By doing so, more rules and dictionary storage will be required.

Word stress and pitch can be added to the speech by varying the frequency and the duration of the output phonemes. These variations may be implemented by a software program. The stress assignment by machine depends on a classification system of the English vocabulary into predictable and probable stress classes. English words show a favorite pattern, such as high-low for two-syllable words, as in "cattle", high-low-mid for three syllables as in "catalogue", and mid-low-high-low for four syllables as in "catatonic" [17]. These favorite patterns could be implemented with general rules. But there are frequent exceptions to these rules, which make it equally advisable to store the exceptions in the form of a stress dictionary. This method involves problems in programming, and work in these areas has yet to be done.

REFERENCES

1. P.W. Nye and J.C. Bliss, "Sensory aids for the blind: A challenging problem with lessons for the future", Proc. IEEE, 58, 1878-1898, 1970.
2. J.G. Linvill, "Development progress on a microelectronic tactile facsimile reading aid for the blind", IEEE Trans. Audio and Electroacoustics, 17, 271-274, 1969.
3. D. Ramsay, "Design of a simple reading machine for the blind", M.A.Sc. thesis, Dept. Elec. Engg., U. of British Columbia, Oct. 1968.
4. J.L. Flanagan, "Note on the design of terminal-analog speech synthesizer", J. Acous. Soc. Am., 29, 306-310, 1957.
5. J.L. Flanagan, G.H. Coker and C.M. Bird, "Computer simulation of a formant-vocoder synthesizer", J. Acous. Soc. Am., 35, 2003(A), 1963.
6. C.Y. Suen and M.P. Beddoes, "Development of a digital spelled-speech reading machine for the blind", IEEE Trans. Biomed. Eng., BME-20, 452-459, 1973.
7. M.P. Beddoes, C.Y. Suen, B.A. Dixon and R.G. George, "Spellex: An automatic talking typewriter for the blind", Digest of 10th Int. Conf. on Med. and Bio. Eng., 2, August 1973.
8. E.L. Thorndike and I. Lorge, "The teacher's word book of 30,000 words", Bureau of Pub., Columbia U., N.Y., 1963.
9. C.Y. Suen and M.P. Beddoes, "The silent interval of stop consonants", Language and Speech, Sept. 1974 (to be published).
10. J.L. Flanagan, "Speech analysis, synthesis and perception", Academic Press, N.Y., 1965.
11. J.P. Egan, "Articulation testing methods", Laryngoscope, 58, 955-991, 1948.
12. G.E. Peterson and I. Lehiste, "Revised CNC lists for auditory tests", J. of Speech and Hearing Disorders, 27, 69-70, 1962.
13. "1965 Revised list of phonetically balanced sentences (Harvard sentences)", IEEE Trans. Audio and Electroacoustics, 17, 239-246, 1969.
14. H. Fletcher and J.C. Steinberg, "Articulation testing methods", Bell System Technical J., 8, 806-854, 1929.
15. N.B. Smith, "Be a better reader foundations", Prentice-Hall, N.Y., 1968.
16. R.L. Longini, "Spelltalk: a new approach to reading machine for the blind", AFB Research Bulletin, No. 24, 153-157, March 1972.
17. J.H. Gaitenby, G.N. Sholes and G.M.
Kuhn, "Word and phrase s t r e s s by r u l e f o r a reading machine", I n t . Congr. on Speech Comm. and P r o c , Boston, A p r i l 24-26, 1972. 49 Appendix A D e s c r i p t i o n of Phonemes Used Consonants \ . P lace of ^ ^ a r t i c u l a t i o n Manner of a r t i c u l a t i o n >v Bilabial Labio-dental Dental Alveolar Palatal-alveolar Palatal Velar Glottal P l o s i v e b P d t g k Nasal m n 0 A f f r i c a t i v e t j d 5 L a t e r a l 1 F r i c a t i v e V f 0 z s I h Continuant w r j S i l e n t # Vowels Front Back Y fa A /o a 50 Appendix B Data Codes  Word Codes The word codes are 6 -b i t ASCII codes represent ing the graphemes of words stored i n the input stack and the d i c t i o n a r y . The f o l l o w i n g tab le shows the 6 - b i t codes represented each by a 2 - d i g i t o c t a l number. Codes • Graphemes Codes Graphemes 01 A 20 P 02 B 21 Q 03 C 22 R 04 D 23 S 05 E 24 T 06 F 25 U 07 G 26 V 10 H 27 W 11 I 30 X 12 J 31 Y 13 K 32 Z 14 L 15 M 16 N 17 0 00 40 72 terminat ion of word codes, used i n d i c t i o n a r y o n l y . space used i n input stack o n l y . output mode code, used i n input stack o n l y . Ind ica tor Codes Ind ica tor codes are used i n the d i c t i o n a r y for the s t a r t and the end of e i t h e r a word code or phoneme code. 0000 - s t a r t of the phoneme codes or end of the word codes 7777 - s t a r t of the word codes or end of the phoneme codes 6666 - end of the d i c t i o n a r y f i l e Phoneme Codes The phoneme codes are 6 - b i t codes represented d i f f e r e n t combinations of the 33 bas ic phoneme stored i n the speech synthesis system. The codes are used i n the d i c t i o n a r y represent ing the phoneme equiva lent of the words. Codes Phonemes Codes Phonemes 01 / e ' / 30 . 
/ s / 02 /#b/ 31 / J / 03 mi 32 / s / 04 I**I 33 /#/ 05 ' I^I 34 /aee / 06 ni 35 / « o / 07 IHI 36 nli 10 /hi 37 ill 11 IM 40 12 l&V 41 I*I 13 l#k/ 42 fe/ 14 IM 43 lol 15 M 44 IQI 16 In/ 45 / i a / 17 lof 46 /"/ 20 Ihl 47 / a / lo ]/ 21 l#9l 50 22 frl 51 / A U / 23 1*1 52 / u 6 / 24 in/ 53 IV 25 IN 54 1^1 26 /w/ 55 27 / s / 56 IV 77 terminat ion of the phoneme codes used i n the d i c t i o n a r y only Appendix C Summary of The Frequency Study of The F i r s t 1,000 Words of The Thorndike Word L i s t for Vowels The fo l lowing tables summarizes the percentage of the grapheme to phoneme r e l a t i o n s h i p counts f o r vowels and vowel combinations i n the f i r s t 1,000 words of the Thorndike 's l i s t . "a" / e i / 22.9% hi 35.1% Izl 32.1% hi 13.7% "a i" / e i / 84.8% / £ / 15.2% "aii" hi 80.0% hi 20.0% "e" hi 3.1% / £ / 40.7% IM 14.2% "ea" IM .61.8% /Y/ 5.9% Izl 22.1% / a / 1.5% / i a / 8.8% "ee" IM 100.% "ei" IM 30.0% /aee/ 30.0% lei 30.0% / e i / 10.0% "eo" IM 100.% "er" hi 50.8% / u / 3.3% hi 37.7% /y 8.2% "ew" NI 100.% I I IM 70.1% /aee / 27'. 1% hi 2.3% IV-0.5% "ie" IM 58.3% /aee / 33.3% / i e / 8.4% "io" / I 3 / 100.% " i r " / V 100.% "o" / a u / 1.2% hi 45.6% / o / 28.2% / a / 14.5% / u / 7.9% / A / 2.1% l\l 0.4% "oa" lol 80.0% /o/ 20.0% "oe" / A / 50.0% hi 50.0% "oi" Io1/ 100.% "oo" lul 76.2% / o / 19.0% hi 4.8% "ou" / a u / 25.5% / a / 21.5% / a / 11.8% / u / 13.7% / A / 17.7% lol 7.8% "ow" / a u / 90.0% lol 10.0% "u" IM 5.1% / A / 41.0% hi 35.9% iy 15.4% hi 1.3% / i U / 1.3% " u i " / i / 37.5% hi 50.0% /dee/ 12.5% "ue" / i U / 42.9% / u / 42.9% / i e / 14.2% Appendix D Examples of Monosyllable Words 1. pet 26. f i n 2. bet 27. f i g 3. pub 28. bed 4. l e d 29. k i d 5. l e t 30. put 6. d i d 31. pet 7. t i d 32. hush 8 - got 33. zone 9. load 34. t h i s 10. net 35. t h i n 11. met 36. shave 12. red 37. l a t e 13. j e t 38. mate 14. hot 39. hate 15. heap 40. chase 16. yet 41. f a t e 17. wet 42. make 18. not 43. p o l e 19. h e a l 44. 
v e t 20. t i c k 45. c o o l 21. p i c k 46. soup 22. peace 47. box 23. cape 48. mean 24. s i l l 49. neat 25. s i n 50. jeep 55 Appendix E Examples of PB Sentences 1. Name a s ta te which has no sea coast . 2. What substance i s a good conductor of heat . 3. E x p l a i n the d i f f e r e n c e between import and export . 4. E x p l a i n the purpose of t h i s t e s t . 5. At what times do the t ides come i n . 6. What knowledge i s covered by the study of astronomy. 7 . Name a good p lace to eat i n t h i s d i s t r i c t . 8. What i s the importance of l arge windows i n s t o r e s . 9. How are the pages of a book he ld together . 10. Name a f r u i t which grows i n bunches. 11. Why d id people conserve food dur ing the war. 12. N a m e a n insec t that has no hard s h e l l . 13. What candy i s b lack and good f o r c o l d . 14. A pot of tea helps to pass the evening. 15. I t i s easy to t e l l the depth of the w e l l . 16. The s a l t breeze came across from the sea . 17. The boy was i n the room when the sun r o s e . 18. Note c l o s e l y the s i z e of the gas tank. 19. Mend the coat before you go to the show. 20. He pays the b i l l s at the end of every month. 56 Appendix F Example of A Short Passage[15] Thousands of people l i v e i n the b i g c i t i e s i n our country . Some of these people work i n f a c t o r i e s . Some work i n s t o r e s , o f f i c e s , h o s p i t a l s , theaters , and h o t e l s . A l l of these people have to eat good food i n order to keep themselves s trong and healthy and able to do t h e i r work. In most c i t i e s , there are very l i t t l e land on which food can be r a i s e d . C i t y workers have to depend upon others to r a i s e t h e i r food f o r them. Some food i s shipped in to the c i t y from d i s t a n t p laces by t r u c k , t r a i n , boat , and a i r p l a n e . C i t y d w e l l e r s , however, l i k e to have t h e i r m i l k , eggs, f r u i t and vegetables very f r e s h . 
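
As an illustration of the word codes in Appendix B, the following Python sketch converts a word to and from its 6-bit codes. It is not the PDP-12 implementation described in the thesis: the function names (`encode_word`, `decode_word`, `as_octal`) are hypothetical. The sketch only assumes what the table shows, namely that each letter's code equals its position in the alphabet (A = 01 through Z = 32 octal) and that 00, 40 and 72 are reserved as listed.

```python
# Sketch of the Appendix B word codes. Names are illustrative, not from the thesis.

WORD_END_CODE = 0o00     # termination of word codes (dictionary only)
SPACE_CODE = 0o40        # space (input stack only)
OUTPUT_MODE_CODE = 0o72  # output mode code (input stack only)


def encode_word(word):
    """Return the list of 6-bit word codes for an alphabetic word."""
    codes = []
    for ch in word.upper():
        if not ("A" <= ch <= "Z"):
            raise ValueError(f"no word code for {ch!r}")
        # A -> 1, B -> 2, ..., Z -> 26 (32 octal), as in the table
        codes.append(ord(ch) - ord("A") + 1)
    return codes


def decode_word(codes):
    """Inverse of encode_word; stops at the dictionary terminator 00."""
    letters = []
    for code in codes:
        if code == WORD_END_CODE:
            break
        if not (1 <= code <= 0o32):
            raise ValueError(f"not a letter code: {code:02o}")
        letters.append(chr(code - 1 + ord("A")))
    return "".join(letters)


def as_octal(codes):
    """Format codes as the 2-digit octal numbers used in the table."""
    return " ".join(f"{c:02o}" for c in codes)


print(as_octal(encode_word("cat")))  # 03 01 24
```

In the dictionary, such a code sequence would sit between the indicator codes 7777 and 0000. How the 6-bit codes are packed into machine words is not stated here, so the sketch leaves them as a plain list.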
