UBC Theses and Dissertations

Lexiphone : an experimental reading machine for the blind 1966

THE LEXIPHONE: AN EXPERIMENTAL READING MACHINE FOR THE BLIND

CHARLES GARRY AKERMAN CAPLE
B.A.Sc., University of British Columbia, 1963

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in the Department of Electrical Engineering

We accept this thesis as conforming to the required standard

Members of the Department of Electrical Engineering

THE UNIVERSITY OF BRITISH COLUMBIA
February, 1966

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Electrical Engineering
The University of British Columbia
Vancouver 8, Canada
Date: February 15, 1966

ABSTRACT

An experimental reading machine for the blind has been built to test a proposed multidimensional audible code. This device, patterned after the popular Optophone reader, can generate either the multidimensional code or a simulated version of the Optophone code. The results of tests carried out with two blind subjects show that multidimensionally-encoded letters and words can be learned and "read" with reasonable accuracy, even when entirely different dimensions of the code are utilized.
A comparative evaluation of the multidimensional and Optophone codes, based on the performance of 52 sighted persons, suggests that the multidimensional code provides a better basis for letter discrimination.

A detailed study of the discrete print signals produced by this machine is presented. The results of this study suggest that this particular print scanning system does not lend itself to automatic letter recognition, but that, with some pre-processing of the print information, some optimization of the audible code can be achieved. It is also demonstrated that the information produced by this machine is highly redundant, and that the discrete nature of the print translation process may psychologically limit the maximum reading speed, regardless of the audible code employed.

TABLE OF CONTENTS

List of Illustrations
List of Tables
Acknowledgement

1. INTRODUCTION
   1.1 The Problem
   1.2 Previous Attempts to Solve the Problem
   1.3 The Proposed Code
   1.4 Scope of the Project
   1.5 Thesis Outline
2. A REVIEW OF READING MACHINES FOR THE BLIND
   2.1 Historical Review
   2.2 Classification of Reading Machines for the Blind
   2.3 Class Characteristics
       2.3.1 Direct-Translation Machine
       2.3.2 Recognition Machine
       2.3.3 Cost
   2.4 Introduction to the Lexiphone
3. DESCRIPTION OF THE LEXIPHONE
   3.1 Principle of Operation
   3.2 The Light Source
   3.3 The Motorized Platform
   3.4 The Reading Tube
   3.5 Quantizer and Switching Unit
   3.6 The Lexiphone Coder
   3.7 The Optophone Coder
4. EXPERIMENTAL EVALUATION OF THE LEXIPHONE CODE
   4.1 Introduction
   4.2 Blind Reading Experiments with the Lexiphone Code
       4.2.1 Letter-Reading Tests With the Blind Subjects
       4.2.2 Word-Reading Tests With the Blind Subjects
       4.2.3 Comments on the Blind Reading Experiments and Further Tests
       4.2.4 Summary of Lexiphone Tests With the Blind Subjects
   4.3 Lexiphone-Optophone Code Comparison With Sighted Subjects
       4.3.1 Test Results
       4.3.2 Summary of the Lexiphone-Optophone Code Comparison
5. STATISTICAL STUDY OF LEXIPHONE-CODED LETTERS
   5.1 Introduction
   5.2 Quantizing and Recording the Print Signals
   5.3 Study of Correlation Between Pairs of Cells
   5.4 Characteristics of the Quantized Letters
       5.4.1 Letter-Independent Characteristics
       5.4.2 Letter-Dependent Characteristics
   5.5 Recognition Effectiveness of Characteristics
6. LIMITATIONS OF READING MACHINES IN RELATION TO THE INFORMATION-CHANNEL
   6.1 Introduction
   6.2 Lexiphone Source Entropy
   6.3 Psychological Source Entropy
   6.4 Human Channel Capacity and Maximum Reading Rates
7. SUMMARY AND CONCLUSIONS
APPENDIX I - THE THREE-LETTER WORD VOCABULARY
APPENDIX II - SENTENCE LIST
REFERENCES

LIST OF ILLUSTRATIONS

Figure 2-1  Nye's Comparison of Several Audible Codes
Figure 2-2  The Reading Machine Hierarchy
Figure 3-1  Photograph of the Lexiphone
Figure 3-2  Block Diagram of the Lexiphone
Figure 3-3  Lexiphone Reading Tube
Figure 3-4  Relation of Photocells to Print
Figure 3-5  Photocell Signal Quantizing Circuit
Figure 3-6  Photocell-Relay Quantizing Characteristic
Figure 3-7  Control of Lexiphone Code Dimensions
Figure 3-8  Optophone Coding Scheme
Figure 4-1  Letter-Reading Test Results
Figure 4-2  Letter Confusion Matrices
Figure 4-3  Word-Reading Test Results (Three-letter words)
Figure 4-4  Training and Testing Presentation Used for Each Letter
Figure 4-5  Letter Confusion Matrices (Groups pooled)
Figure 5-1  Quantized Facsimiles of the Letters "a" and "f"
Figure 5-2  Binary Matrix and Hexadecimal Row Matrix Representing the Quantized Letter "a"
Figure 5-3  One-Dimensional Representation of Quantized Letter on a Time or Space Scale

LIST OF TABLES

Table 2-1  Classification of Reading Machines for the Blind
Table 3-1  Values of the Lexiphone Code Dimensions
Table 4-1  Group Mean Scores and Standard Deviations of Code Tests
Table 4-2  Significance of Lexiphone-Optophone Code Test Variables
Table 5-1  Correlation Coefficients Computed from the Quantized Photocell Signals
Table 5-2  Goodness Measures G Computed for the Characteristics C

ACKNOWLEDGEMENT

The author would like to thank Dr. M.P. Beddoes, the supervisor of this project, for his encouragement and assistance throughout the course of the work.

The author is indebted to Professor E.S.W. Belyea for his considerable assistance with the code comparison experiments.

To Mrs. Eva Williams and Miss Cynthia Moffatt, who so willingly worked with the reading machine code, and to the volunteers of the Canadian Institute of the Blind who provided transportation, sincere thanks are offered.

Special thanks are given to my wife, Marcia, for her help and encouragement.

Acknowledgement is gratefully given to the National Research Council for financial assistance, and for their co-sponsoring, with the Mr. and Mrs. P.A. Woodward Foundation, the work described here.

1. INTRODUCTION

1.1 The Problem

For more than a half century the workers of many varied disciplines have laboured, but with limited success, to provide a simple, personal machine with which the blind might "read" ordinary print.
A multitude of simple reading machines, known as direct-translation machines, have been proposed and built, but all have failed in one important respect: no direct-translation machine tested to date has enabled a blind person to read more quickly than 30 words per minute. In fact, average machine reading rates have been closer to 10 words per minute. Researchers in this field consider that in order for a reading machine to be truly useful and widely accepted, it should provide a reading speed of at least 60 to 100 words per minute. Consequently, the reading machine problem still remains to be solved: to construct a simple, cheap ($500 - $1000), and portable machine capable of translating print into a coded form easily assimilated by a blind person at speeds greater than 60 words per minute.

1.2 Previous Attempts to Solve the Problem

Most of the earlier reading machines presented the print information in terms of some audible code. Each new machine was based on a different letter scanning principle, and each different principle inherently gave rise to a new type of audible code. While the character of the resulting codes differed considerably, each code was founded on the assignment of various audio frequencies to certain parameters of the print detected by its scanner. The machine user had to associate with each letter a characteristic pattern of frequencies.
While none of these codes provided the reading performance expected, it still seemed reasonable that the bottleneck to higher reading speeds lay in the type of audible code employed; it was simply a matter of discovering a code well-suited to the human channel.

1.3 The Proposed Code

An audible code easily assimilated by the human processes is that of speech. While many of the components of speech arise from pure frequencies or tones (vowels and voiced consonants), a great number of the components (fricatives, sibilants, plosives, clicks, etc.) have a completely non-tonal basis. This observation suggests that an audible code more likely to yield the higher reading rates desired would, unlike the earlier codes which utilized only a single auditory variable (frequency), be one that incorporates the use of several simultaneous variables. Hence we propose just such a new code, called a multidimensional code, whose components are derived from several auditory variables, or dimensions.

An auditory dimension is defined as a stimulus variable whose value can be manipulated independently of any other stimulus variable. For example, the frequency and amplitude of a tone are considered to be independent dimensions because the value of one variable may be altered without affecting the value of the other.
Of course, certain dimensions may to some extent be subjectively interdependent. This consideration does not affect the definition of a dimension, but it does control the choice of dimensions. Only experiments can show which dimensions are most subjectively independent and therefore most effective.

The decision to investigate a multidimensional code is also supported by the experimental results of many workers. Pollack and Ficks have demonstrated that human subjects are able to assimilate a greater amount of information per audible stimulus when the stimuli vary along several dimensions than when they vary along only a single dimension. This means that, for a given number of alternative stimuli, a multidimensionally-encoded stimulus is more easily discriminated than one encoded in a single dimension.

If it can be demonstrated that a person working with a multidimensional machine code is able to "read" more accurately than a person working with a unidimensional code produced by the same machine, then there is a reasonable possibility that the multidimensional code can also provide higher reading rates. We intend to make just such a comparison between our multidimensional code and a popular unidimensional code known as the Optophone code.
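The discrimination argument above can be illustrated with simple counting. The sketch below is only a combinatorial illustration, not a model of perception and not a procedure taken from the thesis: it assumes every dimension offers the same number of distinguishable levels, and the function name `levels_needed` is hypothetical. It shows how few levels per dimension are needed to obtain a given number of alternative stimuli as dimensions are added.

```python
def levels_needed(alternatives: int, dimensions: int) -> int:
    """Smallest number of levels per dimension such that
    levels ** dimensions >= alternatives (equal levels assumed)."""
    if alternatives < 1 or dimensions < 1:
        raise ValueError("alternatives and dimensions must be positive")
    levels = 1
    while levels ** dimensions < alternatives:
        levels += 1
    return levels


if __name__ == "__main__":
    # 26 alternatives (one per letter) is an illustrative choice.
    for dims in (1, 2, 4):
        print(dims, "dimension(s):", levels_needed(26, dims), "levels each")
```

In this idealized count, a single dimension must supply 26 separately distinguishable values, whereas four dimensions need only three values each. Whether listeners actually benefit from such a trade is an empirical question of the kind the thesis sets out to test.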
The multidimensional machine code we propose is based on the audible dimensions successfully employed by Beddoes et al. (2) in a letter recognition experiment. These investigators used the dimensions of tone frequency, tone timbre, hiss bandwidth, and click to encode each letter with a single multidimensional stimulus. These same dimensions are appropriate for our purposes, where we wish to encode each of many elements of a letter with a single multidimensional stimulus. These elements are those normally encoded by the direct-translation machine into the frequency dimension alone.

1.4 Scope of the Project

A versatile direct-translation reading machine was constructed for this project. Patterned after one of the more popular reading machines known as the Optophone (discussed in detail in Chapter 2), our machine was designed to produce either of two audible codes: the original Optophone code or the multidimensional code. The valuable feature of this machine is that the components of each code are controlled by exactly the same print signals. Consequently, if the print noise is the same for both codes, any difference exhibited between the two codes, in terms of reading performance, would be due to the codes alone, independent of the print scanning technique.

Both absolute and comparative evaluations of the multidimensional code were carried out using coded material produced by this machine.
The reading performance of two blind subjects provided the absolute evaluation, while the relative performance of two groups of sighted students working with the two codes offered a comparative evaluation of the multidimensional and Optophone codes.

The print signals produced by our experimental machine were subjected to a statistical study for the purposes of suggesting audible code simplification, and of determining the amount of informational redundancy and psychological entropy produced by a machine of this type.

1.5 Thesis Outline

A review of past and continuing research in the reading machine field is presented in Chapter 2. Chapter 3 presents a detailed description of our experimental machine and of the two audible codes. Chapter 4 deals with the entire multidimensional code evaluation. In Chapters 5 and 6 the results of the statistical study are presented and discussed. In addition, psychological limitations associated with reading machines of this type are considered in Chapter 6. The final chapter, Chapter 7, summarizes the results of Chapters 4, 5, and 6, and presents appropriate conclusions and recommendations.

2. A REVIEW OF READING MACHINES FOR THE BLIND

This chapter is devoted to a discussion of the considerable effort that has been made to produce a practical reading machine for the blind.
A short historical review of previous and continuing research in the reading machine field is given. It is seen that all machines fall broadly into two classes according to their over-all sophistication of print translation, and that these classes can be subdivided into groups corresponding to the human sense modality employed. The practicality and code problems of the simple direct-translation machine are discussed as an introduction to the Lexiphone.

2.1 Historical Review

Research into reading machines for the blind was launched by Fournier d'Albe of England when, in 1914, he invented the Type-Reading Optophone, a machine designed to transform print into sound (3). In this first machine, called the White-sounding Optophone, a narrow vertical bar of pulsed light, one letter in height, falls on the print. Eight regions constituting the bar are pulsed at eight audio frequencies by means of a spinning disc. Light reflected from the illuminated area falls on a selenium cell and causes corresponding frequencies to be heard in an earphone. Letters passing horizontally under the illuminated area delete certain frequencies according to the regions occupied by segments of a letter. Consequently, each letter creates a characteristic sound pattern as it is scanned.

However, with this machine, the presence of all eight tones in the absence of print was considered to be a serious drawback. The problem was solved in a modified version of the Optophone produced in 1920 by the English firm of Barr and Stroud.
This model is called the Black-sounding Optophone. It utilizes two selenium cells in a bridge which is balanced when no ink is detected. Tones are generated, rather than deleted, by letter segments intersecting the illuminated area. Silence is therefore present during the spaces between letters. This model employs only five tones (sol, doh, ray, me, sol) in place of the original eight, and its sound code consists of a pattern of chords for each letter. The term "chord" here means the simultaneous sounding of one or more tones.

From its creation to the present, the optophone* has been subjected to continuous evaluation and modernization, perhaps more so than any other reading machine invented since. Reading speeds attained by blind subjects using the optophone have not improved over the years, even with the use of up-dated models. Speeds of 30 words per minute have been recorded for exceptional persons, while speeds of 2 to 10 words per minute are more typical of average performance. Such limited success with the optophone led many research workers to search for other reading machine schemes. These other schemes are now reviewed.

* The generic term, "optophone", is now used in the literature to describe a reading machine whose reading mechanism and acoustic code are similar in principle to the modified (Black-sounding) Optophone.

The first tactile reading machine, the Visagraph (5), was invented in 1928 by R.E. Naumburg. This machine produces an enlarged embossed facsimile of printed letters on aluminum foil. It embosses the foil in six rows corresponding to the print information sensed by a vertical line of six photocells. This particular type of print sensing device is similar in function to that of the Optophone, and identical to the photocell sensing device used in later Optophone simulations.
The principal difference between the Optophone and Visagraph, then, is the human sense stimulated. The Optophone is designed to communicate to the aural sense; the Visagraph, to the tactile sense. However, it turned out that embossed letters were even more difficult to "read" than Optophone-coded letters.

During World War II, the Committee on Sensory Devices was formed by the U.S. Veterans Administration in order to stimulate and sponsor research leading to the development of sensory aids, particularly reading machines for the blind. Under the auspices of the Committee, during the period 1944-49, several machines were built and tested. This work was carried out, primarily, at the Haskins Laboratories and at RCA.

The RCA Labs first developed the "Type A-2 Reading Machine" (6). It employs a small point of light rapidly oscillating up and down to scan the vertical extent of a letter. A variable frequency oscillator is coupled to the oscillating light beam in such a way that a high audible frequency is generated when the spot is at the top of its scan and a low frequency at the bottom. This audio frequency is keyed ON whenever the spot of light intersects a letter segment. An optophone-like sound code results when the scanning beam is drawn slowly over a line of print, but superfluous noise is added to the normal code by the keying process.

A later development, the RCA Recognition Machine (7), utilizes a flying-spot scanner as the input to a 26-letter recognition circuit. A letter is scanned horizontally in eight bands, and the number of intersections is counted and logically processed to trigger an audible output. The spelled pronunciation of each letter, prerecorded on magnetic tape, is heard as the corresponding output. This type of audible code, later termed "spelled speech," has since undergone much serious investigation.
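The intersection-counting idea attributed above to the RCA Recognition Machine can be sketched in a few lines. This is an illustrative assumption, not the RCA circuit: the letter is taken to be a small binary grid with one row per scan band (1 = ink), the toy letter shapes are invented, and the names `count_crossings` and `band_signature` are hypothetical.

```python
def count_crossings(band):
    """Count ink strokes met along one horizontal band (0 -> 1 transitions)."""
    crossings = 0
    previous = 0
    for cell in band:
        if cell == 1 and previous == 0:
            crossings += 1
        previous = cell
    return crossings


def band_signature(letter):
    """Tuple of stroke counts, one per band, from top to bottom."""
    return tuple(count_crossings([int(ch) for ch in row]) for row in letter)


# Toy 8-band letter shapes, invented for illustration only.
LETTER_I = ["00100"] * 8
LETTER_H = ["10001"] * 3 + ["11111"] * 2 + ["10001"] * 3

if __name__ == "__main__":
    print("I:", band_signature(LETTER_I))
    print("H:", band_signature(LETTER_H))
```

A recognition stage would then have to map each signature to a letter. Different letters can share a signature, so counts alone are unlikely to be sufficient, which is consistent with the text's remark that the counts are "logically processed" before an output is triggered.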
Research at the Haskins Labs included the simulation and evaluation of previous reading machines, a thorough investigation of various audible codes for use with optophone-type and recognition machines, and the development of the FM-Scan reading machine (8). This latter machine utilizes a frequency-modulated audio tone to indicate the letter scanned. An illuminated narrow vertical slit, one letter in height, is passed over a line of print. As various parts of a letter pass beneath the slit, the total reflected light, which varies according to the proportion of slit area darkened by print, is sensed by a photocell. The resulting analogue signal is used to frequency-modulate an audio oscillator whose output frequency varies between 100 cps (no print "seen" in the slit) and 4000 cps (slit entirely black). A "Zero-suppressor" detects the absence of print between letters or words and squelches the 100 cps output during these periods. After a 90 hour training period with this machine, the average reading speed attained was 4.2 words per minute.

In 1952, P.E. Argyle of Royal Oak, B.C., invented the Argyle Reader. This machine employs a rapidly moving point of light which scans the vertical extent of one letter from top to bottom at a uniform rate and returns instantaneously to repeat the operation. This operation is repeated at 200 cycles per second. The total instantaneous light reflected is collected by a photocell whose signal current is amplified and fed to a loudspeaker. As letters pass horizontally under the flying spot of light, each letter intercepts the spot and generates a pattern of audible transients whose spectral components are multiples of 200 cps. The result of this process is a directly generated audible code in which each scanned letter is represented by a characteristic "growl."
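The statement that the Argyle Reader's output contains only multiples of 200 cps follows from the periodicity of the scan. The sketch below checks this numerically under stated assumptions: the letter is held stationary so that every scan sees the same column, the photocell signal is idealized as 1 for ink and 0 for paper, and the sampling rate and the column pattern are invented for illustration. The function names are hypothetical.

```python
import cmath

SCAN_RATE_HZ = 200       # vertical scans per second (cps), from the text
SAMPLE_RATE_HZ = 3200    # assumed sampling rate: 16 samples per scan
SAMPLES_PER_SCAN = SAMPLE_RATE_HZ // SCAN_RATE_HZ

# One invented top-to-bottom scan of a letter column (1 = ink, 0 = paper).
COLUMN = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0]


def photocell_signal(column, n_scans):
    """Idealized photocell output: the same column repeated n_scans times."""
    assert len(column) == SAMPLES_PER_SCAN
    return list(column) * n_scans


def dft_magnitudes(signal):
    """Normalized magnitudes of a direct DFT for bins 0 .. n/2."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))) / n
        for k in range(n // 2 + 1)
    ]


def active_frequencies(signal, threshold=1e-9):
    """Frequencies (Hz) whose DFT magnitude exceeds a small threshold."""
    bin_hz = SAMPLE_RATE_HZ / len(signal)
    return [k * bin_hz
            for k, m in enumerate(dft_magnitudes(signal)) if m > threshold]


if __name__ == "__main__":
    # Ten scans give 160 samples, i.e. a 20 Hz analysis resolution.
    print(active_frequencies(photocell_signal(COLUMN, 10)))
```

Every frequency reported is a multiple of the 200 cps scan rate (including zero, the average light level). In a real reader the letter moves under the spot, so the column changes slowly from scan to scan and the harmonics vary in strength over time, which is presumably what gives each letter its characteristic sound.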
This machine was later evaluated at the National Physical Laboratory by Clowes et al. (9), who found that the reading facility afforded by the Argyle Reader was equivalent to that given by the optophone.

Since 1957, the Veterans Administration has sponsored reading machine research on a broad front. Many new transistorized models of the optophone, now employing nine photocells, have been developed and thoroughly tested for the Administration by the Battelle Memorial Institute, Ohio. An extensive training programme has also been developed by Battelle for users of their device. The Mauch Laboratories, Ohio, have pursued the development of a "multiple snapshot" character recognition machine with spelled-speech output (12). The Haskins Laboratories have concentrated on the synthesis of speech, with the idea of providing a speechlike code for use with recognition-type reading machines (13). Also under VA auspices, Professor Metfessel at the University of Southern California has evolved techniques by which synthetic spelled speech may be produced in smooth coalescent form, intelligible at 90 words per minute. His continuing work offers a reasonably simple and familiar output code for character recognition machines.

Nye (16) has carried out a thorough investigation of audible outputs for reading machines, including multidimensional codes, an artificial larynx code, and a synthesized speech code known as "Wuhzi," originally developed by the Haskins Labs. The results of Nye's code tests, in which the same vocabulary of eight four-letter words was used for each, are shown in Figure 2-1. These were not speed tests, but tests for comparative code performance.

Beddoes (2) has studied the multidimensional auditory coding of single letters in connection with a Braille reader. With four-dimensional coding of letters, he found that 80% of the letters could be identified correctly.
He has also demonstrated that artificially compressed spelled speech can be assimilated comfortably at 110 words per minute with 82% accuracy (17).

Other workers have investigated the value of output codes employing the tactile and kinesthetic senses, and of codes utilizing several sensory modalities simultaneously. J.G. Linvill recently devised a tactile reading aid which displays an enlarged facsimile of the print on an 8 x 5 array of vibrating piezoelectric reeds (20, 21). The entire device, similar in principle to earlier designs (described by Freiberger and Murphy (22)), can be contained in a hand-sized package. Reading speeds of 15 words per minute have been recorded for one user of this device.

Many varied character recognition schemes have been proposed. Because of size or cost limitations normally imposed by the average blind consumer, most of these schemes are of value only in high-speed business applications.

The characteristics of the reading machines reviewed here are discussed in groups in Section 2.3.

[Figure 2-1. Nye's Comparison of Several Audible Codes. Horizontal axis: Number of Trials (1 to 8).]

2.2 Classification of Reading Machines for the Blind

Reading machines can be categorized according to overall function and sensory output. Functionally, the machines tend to divide into two distinct classes: direct-translation machines and character recognition machines. These classes can also be subdivided into the two output code groups, audible and tactile-kinesthetic, corresponding in name to the senses most often used in supplanting loss of vision. Table 2-1 summarizes this classification scheme.

Output     DIRECT-TRANSLATION (Ref.)            RECOGNITION, output code (Ref.)
---------  -----------------------------------  ----------------------------------
Audible    d'Albe Optophone (3)                 Haskins' "Wuhzi" (8)
           RCA A-2 Reader (6)                   Metfessel's spelled speech (14)
           FM-Scan Machine (8)                  Other speechlike codes (16)
           Argyle Reader (9)                    Multidimensional codes (19)
           Other uni- and multi-dimensional
           optophones

Tactile    Naumburg Visagraph (5)               Tactile systems (18, 21)
           Mauch Tactile optophone (12)
           Linvill's device (20, 21)
           Other tactile devices (21, 22)

Table 2-1. Classification of Reading Machines for the Blind

The direct-translation machine, as its name suggests, generates output signals directly in accordance with the changing contours of print. No internal processing of the print information takes place. The output signals and the print scanning technique are so closely related that little choice exists concerning the types of sounds that may be generated. In the design of a direct-translation machine, the scanning method and the output code are, unfortunately, effectively inseparable factors.

On the other hand, a character recognition reading machine permits the separate optimization of its two principal functions, which are:

1. To scan the print and identify each letter discretely, and
2. To trigger, on the basis of this identification, a circuit which generates an output sensory stimulus corresponding to the letter identified.

This basic letter recognition machine generates a distinct output for each letter identified, for example, the spoken sound of each letter. More sophisticated recognition schemes utilize sequences of identified letters to trigger the generation of pre-stored voiced phonemes, syllables, or even entire words.
Of course, such sophistication causes machine complexity and associated expense to mushroom.

A third class of reading machines has been proposed (5), although no complete model has been demonstrated. This machine, termed the "intermediate" reading machine, lies in terms of complexity and cost somewhere between the simple direct-translation machine and the recognition machine. Such a machine theoretically retains the instrumental simplicity of the direct-translation device, but operates upon the derived print information in such a way that speechlike or simple sound units are generated directly from letter features. Mauch (26) initiated work on a machine of this type but soon found that the desirability of a speechlike sound output necessitated a recognition-type input.

2.3 Class Characteristics

To each functional class of reading machines considered in the previous section can be ascribed a set of characteristics typical of all machines in that class.

2.3.1 Direct-translation Machine

The direct-translation machine is generally simple to instrument, inexpensive, and can usually be made portable. However, its simple translation process generates an abundance of code units for each letter, ignoring the fact that the translated graphical information is highly redundant. (See Chapter 6.) An audible code produced by direct translation cannot be made similar to the efficient natural code, speech, owing to the fact that no unique relationship exists between the shape of a letter and its phonetic equivalent. As a consequence, an audible or tactile code produced by such a machine is not easily learned, and in fact, as experience has shown, limits reading speed to about 30 words per minute. In addition, with the direct-translation machine, an appreciably different code must be learned for every new font style encountered.

2.3.2 Recognition Machine

The character recognition machine allows much higher reading rates to be attained by assigning no more than one output stimulus to each letter. For example, speeds of 90 wpm are possible using spelled speech. At these higher reading rates the fatigue associated with reconstructing contextual material from coded letters is also reduced. With a letter-reading recognition machine only 26 stimuli need be employed (compared with the 52 stimulus patterns generated by the direct-translation machine). In addition, the same 26 stimuli can be associated with a number of font styles, given a multifont recognition machine. The code training period may be made almost negligible with a recognition machine by employing a sensory code consisting of familiar code units such as those of spelled speech.

Of course, the drawback with such an otherwise ideal machine is that it requires relatively complex instrumentation; it is therefore costly to implement and difficult to make portable.

2.3.3 Cost

A functional classification of reading machines for the blind is illustrated in Figure 2-2, showing the relationship of a class to an approximate level of data processing and output code sophistication. For each level is indicated the estimated cost of a complete machine produced in quantity. The first three estimates are quoted from the literature, while estimates 4 and 5 are pure guesses. In fact, the possibility of even instrumenting the components of level 4 is in question. Also, level 4 may not be a desirable or necessary step to the realization of level 5. Cost can be used as a rough index of instrumental complexity and of machine dimensions.

[Figure 2-2. The Reading Machine Hierarchy. Data processing levels (1) to (5), from print upward: Transducer and Code Generator (direct class); Feature Detector and Code Generator (intermediate class); Letter Recognizer and Letter Generator, Syllable Recognizer and Phoneme or Syllable Generator, Word Recognizer and Word Generator (recognition class); with an estimated cost range for each level.]

2.4 Introduction to the Lexiphone

Figure 2-2 immediately indicates why so many workers have pursued the development of a simple direct-translation machine, rather than a more sophisticated machine. It is by far the least expensive machine to produce. Also, as pointed out earlier, it is easily made portable. Hence, such a machine is well suited for and well within the means of the average blind person, and consequently deserves much attention.

For these reasons it was decided to investigate further the direct-translation machine, in an attempt to resolve the major problem experienced with machines of this type; namely, the low reading speeds normally attained. With the optophone, it was felt the problem lay in the type of audible code employed. If, in place of the normal optophone "chord" code, a multidimensional auditory code were employed, then perhaps an increase in reading speed might be possible. This would be due to the fact that auditory stimuli varying along several dimensions are more easily identified than those stimuli arising from a single dimension.

In order to test reading performance with a multidimensional sound code, a direct-translation machine was built. While its print pickup device is similar to that of the optophone, its output code employs several dimensions in sound. For the sake of brevity, this machine, a multidimensional optophone, is called the Lexiphone.* Its description and performance characteristics form the subject of subsequent chapters of this thesis.

* This name, meaning "words in sound," was suggested by Professor E.S.W. Belyea.

3. DESCRIPTION OF THE LEXIPHONE

The Lexiphone (Fig. 3-1), although designed principally to generate the multidimensional Lexiphone code, was constructed to generate as well a simulated version of the optophone code. The latter code provides a suitable standard against which the hopefully superior Lexiphone code can be evaluated. The two code generators, while controlled by the same print information signals and machine components, develop distinct codes by assigning different acoustic variables to the same signals. Discussed in this chapter are the system components used to produce these signals, and the audible codes themselves.

Figure 3-1. Photograph of the Lexiphone

3.1 Principle of Operation

The print information signals are generated by a linear array of six photocells, situated at the image plane of a lens which is focused on the printed page. Oriented in a direction perpendicular to the horizontal line of print, the six cells together span the vertical extent of letters in a line. As the page is transported horizontally from right to left, the stationary cells scan a line of print and convert the print into six electrical signals. When a photocell detects the presence of ink it activates a corresponding relay. The instantaneous state of all six relays determines the code sound generated.

Illustrated in the block diagram of Figure 3-2 are the system components which constitute the Lexiphone. They are discussed individually in the sections that follow.

[Figure 3-2. Block Diagram of the Lexiphone: light source, printed page on the motorized platform, opto-electrical transducer (reading tube), quantizer and switching unit, Lexiphone code generator, optophone code generator, output.]

3.2 The Light Source

Initially, a stroboscopic light source was used to illuminate the print. Flashing illumination was employed so that the photocell signals could be ac-amplified, thus circumventing the temperature drift problem associated with dc-connected transistor amplifiers.
However, this type of light source was discarded in favour of a constant-intensity incandescent source because the xenon flashtube in this unit generated a distracting amount of acoustic noise, and was subject to short life and variable light output. As well, the stroboscopic unit was undesirably bulky. It was found that if the high-impedance photocells used were properly matched, the temperature drift effect would be negligible with respect to the total photocell signal. Thus, simple, reliable dc-connected circuits could safely be employed between photocells and relays.

The present print illumination source proved to be the simplest, most compact incandescent source of those investigated. It utilizes a ring of six pre-focused flashlight bulbs positioned just above the page so as to illuminate an area the size of one letter. The bulbs, rated at 2.2 volts, are connected in parallel and driven by a regulated dc voltage supply whose output is continuously variable between 1.4 and 2.2 volts, is constant to within 5 millivolts once set, and has less than 1% peak ripple.

These last two supply voltage characteristics are important for reliable operation of the machine. First, the illumination level, once set, must be kept within certain bounds in order that the range of photocell voltages generated is constant. These bounds dictate that bulb voltage must drift no more than 30 millivolts. The supply drift is well below this figure. Second, the thermal time-constant of the bulb filament is so short that any 120-cycle ripple voltage impressed across the filament is reproduced as illumination ripple. This ripple is detected by the photocells and causes the relays to "chatter" when switching. The 1% supply ripple is well below that necessary to cause "chattering."
3.3 The Motorized Platform

To ensure the reliable translation of print into code, the page of print is transported at a uniform rate on a flat, motor-driven platform. A small induction motor moves the platform from right to left until a complete line of print is scanned. A microswitch then reverses the motor, mutes the audible output, and returns the platform to the opposite end, where a ratchet assembly moves the page up one line. This cycle is repeated until the entire page is scanned. There are vernier adjustments to make the lines of print parallel to the motion (horizontal alignment) and to position the print directly under the photocells (vertical alignment).

The scanning speed is continuously variable in two ranges: 1 to 2 and 2 to 5 characters per second (corresponding to the total equivalent speed range of 12 to 60 words per minute). A variable-frequency phase-shift oscillator supplies the ac voltage which is amplified by a 100-watt amplifier and which in turn drives the induction motor. The two speed ranges are supplied by a rack-and-double-pinion gear assembly.

In Figure 3-1, pictured on the left are the platform and the cantilevered unit containing the light source and reading tube. On the mounting board at the front are situated three controls for the user: a platform reversing switch, a drive-release lever, and a motor switch. The speed control knob is located on the cabinet front.

For reasons of portability, a commercial model of the Lexiphone would probably substitute a hand-held reading probe for the bulky motorized platform and cantilevered unit. However, in our experimental model, code uniformity, not machine portability, was the factor of greater concern.

3.4 The Reading Tube

Figure 3-3 shows the Lexiphone lens-photocell assembly that transforms print into electrical signals.
This assembly, called the reading tube, contains six cylindrical photocells, each 0.082 inches in diameter, located at the image plane of a two-element magnifying lens. The lens provides print magnification sufficient to ensure reasonable photocell resolving power. For letters of the standard Pica typewriter font, the font that this reading tube was designed to scan, resolution sufficient to distinguish all significant letter features is obtained when the standard letter width (0.1") is magnified to 9.5 times the photocell diameter. This leads to a print magnification of 7.8 (9.5 x 0.082" / 0.1" = 7.8, to two figures).

[Figure 3-3. Lexiphone Reading Tube, showing the printed page at the object plane.]

[Figure 3-4. Relation of Photocells to Print]

A top view of the reading tube in Figure 3-4 shows the position of the photocells with respect to correctly aligned, magnified print. The print is drawn horizontally past the cell array as shown. Cells 2 to 5 detect print information concentrated in the area defined by the lower-case main body height. All lower- and upper-case letters cause these cells to be activated. The uppermost cell (#1) detects risers, dots, and the top of all upper-case letters, while the lower cell (#6) detects descenders. This particular cell configuration is called the even-cell configuration.

The seventh cell location shown provides for an odd-cell configuration in which cell 7 substitutes for cells 3 and 4. This 5-cell configuration is necessary to distinguish between the letters e and c. When cell 7 is not substituted, the horizontal line of the e falls between cells 3 and 4 and fails to activate either of them, and therefore e is translated as c.

3.5 Quantizer and Switching Unit

When a photocell detects segments of print on the illuminated page its resistance varies between the approximate values of 100 kilohms and 1 megohm, corresponding to no-print and print.
These variations produce an analogue voltage signal which is quantized by a subminiature relay into a succession of binary states: "white" or OFF, and "black" or ON.

One of the six identical quantizing circuits is shown in Figure 3-5. In this circuit the photocell voltage appears as the output voltage v of a Darlington circuit. Resistor R1 sets the "white" output voltage v_W. With the photocell seeing white, R1 is increased until the Darlington buffer is just turned off (v_W approximately 0.5 volts). Resistor R2 sets the switching threshold voltage v_T (approximately 2.0 volts). The "black" output voltage v_B, which depends upon the ratio of R1 to the photocell "black" resistance and upon the Darlington leakage current, generally exceeds 5 volts. These voltage values are summarized in Figure 3-6.

[Figure 3-5. Photocell Signal Quantizing Circuit (transistors: 2N1381).]

[Figure 3-6. Photocell-Relay Quantizing Characteristic: Darlington output voltage (volts) against relay position, showing the OFF (white) region, the transition (grey) region, the relay hysteresis, and the ON (black) region.]

Because the transition region (Fig. 3-6) is passed through so quickly in practice, and the relay hysteresis region is so small a fraction of the entire output voltage swing, the use of a Schmitt trigger in this circuit is unnecessary. Voltage drift of v (0.2 volts) due to ambient temperature variation is also insignificant compared with the total voltage swing.

Each relay is fitted with four type-C (SPDT) contact sets, two of which are allotted to general code switching functions and appear on a patch board. The third set of each relay is wired into the Lexiphone code circuit to switch on the audible output if any relay is ON and to mute the output if all relays are OFF. An individual pilot light for each relay is activated by the fourth set.
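The quantizing action described in this section can be illustrated with a short sketch. It is not a model of the circuit of Figure 3-5. It works with voltage magnitudes in volts, takes the switching threshold of about 2.0 volts from the text, and uses a hypothetical release threshold of 1.6 volts to stand in for the relay hysteresis, whose width the text does not state. The names quantize, v_on and v_off are illustrative only.

```python
from typing import List, Sequence


def quantize(voltages: Sequence[float], v_on: float = 2.0, v_off: float = 1.6) -> List[bool]:
    """Convert a sequence of photocell output voltages into relay states.

    True means ON ("black"), False means OFF ("white").
    v_on is the pull-in threshold (about 2.0 V in the text).
    v_off is an ASSUMED drop-out threshold representing relay hysteresis.
    """
    state = False  # relay starts released, as when the cell sees white paper
    states = []
    for v in voltages:
        if not state and v >= v_on:
            state = True   # voltage has risen through the switching threshold
        elif state and v <= v_off:
            state = False  # voltage has fallen below the release threshold
        states.append(state)
    return states


# A sweep from white (about 0.5 V) up past the threshold and back down again.
sweep = [0.5, 1.8, 2.1, 1.8, 1.5, 0.5]
relay_states = quantize(sweep)
```

In this sweep the relay stays ON at 1.8 volts on the way down although it was OFF at 1.8 volts on the way up, which is the hysteresis behaviour indicated in Figure 3-6. Because the band between the two thresholds is narrow compared with the swing from 0.5 volts to more than 5 volts, the sketch is consistent with the remark that a Schmitt trigger is unnecessary.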
The functional simplicity and sufficient switching speed (< 10 ms) of subminiature relays adequately suited Lexiphone switching requirements. Faster semiconductor switches were considered unnecessary; in fact, the need for versatile switching facilities in this experimental machine precluded their use.

3.6 The Lexiphone Coder

Four acoustic dimensions are employed in the production of the Lexiphone code: tone frequency, tone timbre, hiss bandwidth, and click. The value of each dimension is controlled by the state of one or two photocells (relays), according to the scheme of Figure 3-7. Referring to Figures 3-4 and 3-7, one can see that particular letters or letter features control two of the dimensions. Specifically, risers and capital letters cause a change in timbre; descenders produce a click. Of course, all letters cause variation in the tone frequency and hiss bandwidth because cells 2 to 5 are activated by every letter.

[Figure 3-7. Control of Lexiphone Code Dimensions: photocell numbers mapped to the dimensions timbre, frequency, hiss and click, and to the output mute.]

Listed in Table 3-1 are the dimensional values employed in the Lexiphone code. The four frequencies are generated by an astable multivibrator whose feedback capacitor is switched to four different values by relays 2 and 5. The output of the astable is a spike waveform whose timbre is "harsh". A smoothing filter is switched in by relay 1 to produce "soft" timbre.
A noisy transistor generates wideband hiss which is amplified and passed through one of four filter connections, determined by the state of relays 3 and 4, to yield four hiss bandwidths. When the 5-cell configuration is employed, only two values of hiss can be used: wideband hiss and no hiss, corresponding to the two states of photocell 7. The click is produced upon energization of relay 6, which applies a step voltage to an RC differentiator connected to the output. The various components of the code are summed in a mixer, amplified, and fed to a loudspeaker.

Dimension                   Dimension Values
Tone Frequency (cps)        250, 298, 334, 420
Tone Timbre                 Soft (sinewave), Harsh (spiked wave)
Hiss Bandwidth (6 cells)    Low-pass (1000~), Band-pass (500-1000~), High-pass (500~), No hiss
Hiss Bandwidth (5 cells)    All-pass, No hiss
Click                       Click, No click

Table 3-1. Values of the Lexiphone Code Dimensions

3.7 The Optophone Coder

The optophone code employs only one acoustic dimension: frequency. A distinct audio frequency is assigned to each photocell, and that frequency is heard whenever its corresponding cell senses print. With six photocells, up to six frequencies may be heard simultaneously. This control scheme and the code frequencies used in the optophone simulation are indicated in Figure 3-8.

In our machine, the six frequencies are spaced at equal logarithmic intervals, after the Battelle Optophone. Six phase-shift oscillators provide these individual frequencies, and the sinusoidal output of each oscillator is clipped to produce a square waveform similar to that of the original Fournier d'Albe Optophone.
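The two coding schemes of Sections 3.6 and 3.7 can be contrasted in a minimal sketch that maps the same six relay states to each code. Several details are assumptions rather than facts from the text. The pairing of each Table 3-1 value with a particular relay combination is not given and is chosen arbitrarily here. Hiss is taken to be selected by relays 3 and 4, the two cells that cell 7 replaces. The optophone band limits f_low and f_high are hypothetical, as is the choice of giving cell 1 the highest frequency. The function and variable names are illustrative.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Tone frequencies in cps from Table 3-1. Which (relay 2, relay 5) state
# selects which frequency is an ASSUMPTION; the text gives only the values.
TONE_CPS: Dict[Tuple[bool, bool], int] = {
    (False, False): 250,
    (False, True): 298,
    (True, False): 334,
    (True, True): 420,
}

# Hiss settings for the six-cell configuration from Table 3-1. The pairing
# with (relay 3, relay 4) states is likewise ASSUMED.
HISS: Dict[Tuple[bool, bool], str] = {
    (False, False): "no hiss",
    (False, True): "low-pass",
    (True, False): "band-pass",
    (True, True): "high-pass",
}


def lexiphone_code(relays: Sequence[bool], relay6_was_on: bool = False) -> Optional[dict]:
    """Return the four Lexiphone dimensions for relay states of cells 1..6."""
    r1, r2, r3, r4, r5, r6 = relays
    if not any(relays):
        return None  # all relays OFF: the output is muted
    return {
        "tone_cps": TONE_CPS[(r2, r5)],
        "timbre": "soft" if r1 else "harsh",  # relay 1 switches in the smoothing filter
        "hiss": HISS[(r3, r4)],
        "click": r6 and not relay6_was_on,    # click only when relay 6 becomes energized
    }


def optophone_code(relays: Sequence[bool], f_low: float = 400.0,
                   f_high: float = 1600.0) -> List[float]:
    """Return the frequencies sounding together in the simulated optophone code.

    f_low and f_high are HYPOTHETICAL band limits, not values from the thesis.
    """
    n = len(relays)
    step = (f_high / f_low) ** (1.0 / (n - 1))  # constant ratio = equal log spacing
    # ASSUMPTION: cell 1 (top of the letter) is given the highest frequency.
    return [f_low * step ** (n - 1 - i) for i, on in enumerate(relays) if on]


# Cells 1, 2 and 5 see ink, as for part of a letter with a riser.
state = [True, True, False, False, True, False]
lex = lexiphone_code(state)
opt = optophone_code(state)
```

For this relay state the Lexiphone sketch yields one tone whose timbre is soft, with no hiss and no click, while the optophone sketch yields three simultaneous frequencies. The contrast reflects the design difference between the codes: one sound varying along several dimensions, against a chord whose only variable is which frequencies are present.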
Each oscillator signal is connected to the output mixer only when its corresponding relay is energized; if no print is sensed, all relays are OFF and the optophone output is silent. The necessary relay contacts are connected into the optophone circuits by means of the patchboard evident in Figure 3-1.

[Figure 3-8. Optophone Coding Scheme: photocells 1 to 6, each switching one frequency (cps) to the output.]

4. EXPERIMENTAL EVALUATION OF THE LEXIPHONE CODE

4.1 Introduction

In order to establish the value of the Lexiphone multidimensional code, it was necessary to determine the answers to two questions:

1. Is it possible for a blind person to "read" letters and words presented in the Lexiphone code? and if so,
2. Is the Lexiphone code more effective than the optophone code in conveying print information?

Two series of experiments were performed to answer these questions. In the first series, two blind subjects worked with the Lexiphone code over a period of two months (about 36 hours training time). They learned to read a large number of Lexiphone-coded words constructed from a restricted alphabet of nine letters.

The second series of experiments was designed to demonstrate any difference in coding efficiency that might exist between the Lexiphone and optophone codes. Based on the same nine-letter alphabet, training and test tape-recordings were prepared for each of the two codes. These tapes were used to test the code learning ability of two groups of sighted students. It was intended that the quantitative results of these tests be used to assess the relative value of the two audible codes.

In these experiments the nine-letter alphabet consisted of the letters a e s c r g y h t.
These letters were chosen to represent those found most often in printed English (e t a o n r i s), those with risers and descenders (g y h t), and those normally found most difficult to decode because of their apparent similarity in code (a e s c). The need to conserve code training time necessitated our resorting to such an abbreviated alphabet. However, difficult letters were included to make the decoding task more realistic.

It became apparent during tests with the blind subjects that the six-cell configuration (Sec. 3.4) provided insufficient information for them to distinguish between letters e and c. With a 52-letter alphabet this difficulty might have been overlooked, but with a 9-letter alphabet, the decoding problem was severely aggravated. For this reason, throughout the audible-code experiments the five-cell configuration was employed.

4.2 Blind Reading Experiments with the Lexiphone Code

For a period of two months, two blind persons, both female, volunteered an hour a day to work with the Lexiphone code. During the first ten sessions of code training they learned to recognize the nine coded letters presented individually. For the remaining twenty-two hours, they concentrated on decoding words composed of these letters. Some additional time was spent in testing their ability to perceive certain parameters of the Lexiphone code. Each training session comprised fifty minutes of code instruction and ten minutes of testing. This training method made possible a continuous measure of their progress.

4.2.1 Letter-Reading Tests With the Blind Subjects

For the letter-reading experiments, code training proceeded as follows. The experimenter announced the letter to be heard in code. Then the Lexiphone presented the coded letter three times in succession, allowing a space of one second between repetitions. This training cycle was repeated throughout the instruction period, with each of the nine letters appearing randomly. At the end of the instruction period a short test was given in which letters were presented randomly and without the voice clue. During the 10 seconds of silence following each presentation the subjects were asked to record in Braille their response to each coded letter. Their test results are illustrated in Figure 4-1.

[Figure 4-1. Letter-Reading Test Results: percent correct (0 to 100%) against hours of training (0 to 10) for Subject A and Subject B, at 1.0 and 0.67 seconds per letter.]

The Lexiphone scanning speed was initially set at one letter per second; that is, each single letter occupied one second in time. The rate at which successive letters were heard, on the other hand, was determined by the pause between letters. After eight hours training at a speed of one second per letter, both subjects could identify coded letters with almost no error (Fig. 4-1). The scanning speed was then increased to yield a duration of 0.67 seconds per letter. Within two hours both subjects managed to identify perfectly letters presented at this higher speed. Because of these encouraging results, the more realistic task of word-decoding was begun (Subsection 4.2.2).

[Figure 4-2. Letter Confusion Matrices for Subject A and Subject B: stimulus letters (rows) against responses (columns) for the letters a e s c r t h g y and no response.]

Shown in Figure 4-2 are the letter confusion matrices for each blind subject, compiled from 100 responses made during the last four letter tests. The matrices indicate different confusion patterns for the two subjects. For example, while Subject A sometimes gave the responses e, s and h for the stimulus letter g, Subject B confused stimulus g only with y. (The reason for these differences is discussed shortly.)
Matrix asymmetry indicates that confusions were often non-reciprocal. For instance, letter a was sometimes given as a response to stimulus letter s, but the reverse confusion never occurred. The main diagonals of the matrices summarize the relative difficulty of decoding each letter. Those letters with risers or descenders (t h g y) were most easily decoded, while s was the letter most readily confused.

4.2.2 Word-Reading Tests With the Blind Subjects

A vocabulary of 46 three-letter words (Appendix I), constructed from the nine-letter alphabet, provided the material used for word training and testing. Words of equal length were employed in these tests to exclude the possibility of identifying a word by its length.

During the training period of each session a code-voice-code technique was employed. The coded version of a word was presented, followed by a five-second pause; the experimenter then announced the word and, finally, the coded word was repeated. During instruction successive words were often chosen to illustrate certain important code differences or similarities. A twenty-word test was given at the end of each training session, and the results of these tests are illustrated in Figure 4-3.

[Figure 4-3. Word-Reading Test Results (Three-letter words), plotted against hours of training.]

Similar to the procedure employed in the letter-reading experiments, the subjects learned to decode individual words presented at two scanning speeds. For the first five sessions the presentation speed was one letter per second, corresponding to a duration of three seconds per word. For the remaining seventeen sessions the higher speed of 0.67 seconds per letter was used.

For the first seven word-training sessions, as for all letter-training sessions, the Lexiphone was utilized to produce the coded material directly from typewritten sheets.
However, the remaining fifteen word-training sessions were conducted with tape-recorded material in order to ensure code uniformity, to exclude operating noises of the Lexiphone, and generally to expedite the sessions.

4.2.3 Comments on the Blind Reading Experiments and Further Tests

The consistently superior performance of Subject A in these Lexiphone code experiments is obvious from the curves of Figures 4-1 and 4-3. Her facility with the code is attributed to her exceptional musical ability. She experienced no difficulty in recalling the exact pattern of pitch variation corresponding to each letter. This particular ability is important where two coded letters (for example, a and s) are characterized principally in the frequency dimension, and to distinguish between them requires detecting a subtle difference in the order of frequencies. Subject A stated that she relied heavily on the dimension of frequency and rarely upon that of hiss or click. This fact explains why she sometimes confused the letter g, whose descender triggers the click, with letters not having descenders but whose musical patterns are somewhat similar to the pattern generated by g.

Subject B, on the contrary, depended almost entirely on the dimensions of hiss and click. The variations of pitch not reinforced by a change in timbre seemed to escape her entirely. A short test to determine the extent of her pitch perception problem was given, and it showed that she was unable to distinguish between various patterns of two notes unless the notes differed in frequency by nearly an octave. Apart from this partial tone-deafness, her hearing seemed normal. In fact, in another similar test with both subjects, this time using two bandwidths of hiss, Subject B made no errors in identifying different hiss patterns while A did. Subject B's clear perception of clicks and hiss accounts for the fact that, unlike Subject A, she confused letter g only with y.
Both letters have descenders and therefore both cause a click to be heard. She (B) always heard the click and hence knew the letter to be either g or y.

Another short and informal experiment was performed during the last few minutes of each of the final four sessions. Coded sentences consisting of two to eight words were presented to the subjects at an equivalent rate of 12 words per minute (1 letter per second). The component words, varying in length from one up to six letters each, were constructed from the nine-letter alphabet. The majority of these words had never before been heard in coded form by the subjects (Appendix II). Both subjects demonstrated an amazing ability to decode nearly every word upon its first presentation. Sometimes a word, usually a long word, had to be repeated. Also surprising is the fact that Subject B, whose word-reading ability averaged about 50% throughout the formal word-training sessions compared with A's 90%, usually responded more quickly to these new words than did A.

Although no quantitative results were obtained from the sentence-reading tests, the qualitative results are apparently encouraging, but may indicate no more than that new words of varying lengths, presented at one letter per second, can be easily decoded on a letter-by-letter basis. It was not possible, of course, for the subjects to decode each word as a single sound pattern or word-unit because for most of these words no previous training had been given. The subjects themselves could offer no specific reasons for their unusual performance during the sentence-reading tests.

4.2.4 Summary of Lexiphone Tests with the Blind Subjects

The results of our machine reading tests show that these two blind persons can indeed read Lexiphone-coded words from the nine-letter alphabet.
Although one subject demonstrated superior ability with the code, both subjects seemed equally well motivated and interested throughout the period of training. The fact that Subject B, not having a musical ear, performed as well as she did (50%) helps justify the choice of a multidimensional code: its non-musical dimensions can provide useful information.

While the Lexiphone experiments were not designed to test reading speed as such, nevertheless we were interested in determining the effect of an increased scanning speed upon performance. At the two speeds employed, 1.0 and 0.67 seconds per letter, performance did not alter perceptibly.

This phase of the Lexiphone code experiments was concluded because the blind subjects were no longer available. Also, it was desired at this point to proceed with the comparative evaluation of the Lexiphone and optophone codes. Unless the Lexiphone code could be shown to be significantly superior to the optophone code, further Lexiphone training of the blind subjects would be of token interest only.

4.3 Lexiphone-Optophone Code Comparison with Sighted Subjects

Test performances of two comparable groups of sighted university students provided the data for the Lexiphone-optophone code comparison. Each group, consisting of 26 students, was trained and tested with each of the codes. Group 1 spent one hour learning the Lexiphone code, and Group 2, the optophone code; the groups then switched and worked with the second code during an additional hour one week later. This technique of reverse code presentation was intended to control the effect that the learning of the first code might have upon the learning of the second.

Single letters of the restricted nine-letter alphabet were used in these tests, and were presented at a speed corresponding to 1.0 seconds per letter.

The following outlines the training and testing procedure employed in each of the hourly sessions. Each session was divided into two parts.
In that part given to training, the code-voice-code technique was used (Figure 4-4(a)). Each letter was sounded three times in code, then a seven-second pause was followed by the letter spoken once, then the coded letter was sounded twice again. During the pause, the subjects recorded their identification of the coded letter, as a means of reinforcing their learning of the code and of checking their own progress. Ten groups of nine letters each were presented in this way with each letter occurring randomly, yet appearing ten times in all.

[Figure 4-4. Training and Testing Presentation Used for Each Letter. (a) Training method: code sounded three times, pause for response, spoken letter, code sounded twice. (b) Testing method: code sounded three times, pause for response.]

In the second part of the session, the students' code learning abilities were tested with 27 coded letters presented in random order, with each of the nine different letters appearing equiprobably. Each letter was sounded three times only, without a voice clue, and during the pauses between letters the students recorded their responses (Figure 4-4(b)).

4.3.1 Test Results

Table 4-1 summarizes the results of the Lexiphone-optophone code tests. The tabulated scores shown are the average number of letters correctly identified by each group out of the total of 27 letters presented during each code test, and the standard deviations serve to indicate how individual scores in a group varied.

                           Lexiphone               Optophone
                        Avg. Score   S.D.       Avg. Score   S.D.
Group 1 (Lex.-Opt.)       10.51      4.40         11.69      3.04
                             (N = 26)                (N = 26)
Group 2 (Opt.-Lex.)       11.50      4.47          8.57      3.55
                             (N = 26)                (N = 26)
Code Average              11.01 (N = 52)          10.12 (N = 52)

Maximum possible score = 27.

Table 4-1. Group Mean Scores and Standard Deviations of Code Tests

Comparing the results of the first code tests, it is seen that group performances differed somewhat: Group 1 working with the Lexiphone code achieved slightly better results (mean = 10.51), compared with Group 2 working with the optophone code (mean = 8.57). The statistical significance of the difference between these means (D = 1.94) was examined by applying a single-tail t-test. The value of t computed (2.01) indicates the difference in performance is just significant (at the a = .05 level, df = 25).

To include the results of the second code tests and to evaluate other variables, further examination of the results was carried out with a special t-test (27) that utilizes the mean difference in performance experienced by individual subjects between their first and second code tests. This t-test offers a measure of the three important variables operating in this series of code experiments: the code effect, the practice effect, and the effect of interaction between code and test-order. Table 4-2 summarizes the results derived from this t-test (single-tail).

Effect                          Mean Difference            t        P (df = 50)   Significance (a = .05)
1. Code effect                  Lex. - Opt. = 1.019        1.66     > .05         Not signif.
2. Test-order effect            1st - 2nd = -1.904        -3.11     < .005        Very signif.
   (Practice effect)
3. Interaction (Code x          (L-O) - (O-L) = -3.810    -3.58     < .001        Very signif.
   test-order effect)

Table 4-2. Significance of Lexiphone-Optophone Code Test Variables

The positive value of the mean difference in line 1 of the Table indicates that performance with the Lexiphone code, averaged over both groups, was slightly superior to that with the optophone code. Unfortunately, the corresponding value of t reveals that this superiority is so slight as to be statistically insignificant (P > .05).

Lines 2 and 3 of the Table show two secondary effects which are quite significant. The t-value calculated for the test-order effect indicates that, regardless of which code was learned first, experience acquired with the first code significantly improved performance with the second.
The codes were similar enough in character that practice with the first code could successfully be applied to the second. The third calculation, which measures the interaction between code performance and code order, shows that greater improvement in performance occurred between the first and second tests when the optophone code preceded the Lexiphone code.

Confusion matrices, constructed from the 1,404 pooled test responses for each code, are shown in Figure 4-5. Confusions are widely dispersed in both matrices, demonstrating that the students had not received enough training to enable them to narrow their confusion field. Note, however, that the optophone matrix may be partitioned to yield two high-confusion submatrices, a e s c and r t h g y. This observation (supported by observations of the students' progress during their training period) may indicate that optophone code learning had reached some kind of plateau even at the end of one hour of training.

[Figure 4-5. Letter Confusion Matrices (Groups pooled). Two panels, Optophone Code and Lexiphone Code; each tabulates the letters a, e, s, c, r, t, h, g, y against the responses a, e, s, c, r, t, h, g, y and a no-response column.]

The optophone submatrices suggest that the students had learned to group coded letters into two classes, but they then found, perhaps, that the optophone code offered no further clue with which they could accurately identify individual letters.
Or, perhaps more likely, after listening to the optophone code continuously for one hour, the students became bored with the "monotonous" code (according to their comments) and therefore made no effort to identify letters more accurately. After a period of rest, it is possible that performance with the optophone code would have seen some improvement.

The partitioning phenomenon is certainly not evident in the Lexiphone matrix; confusions are more uniformly distributed than in the optophone matrix. This fact suggests that the students had learned to classify each Lexiphone-coded letter into its own category and not just into one of two or three multi-letter categories. The main diagonal values show that the average number of Lexiphone confusions was less than the number of optophone confusions. No evidence of Lexiphone code saturation is exhibited in these results, and hence, we conclude that a second Lexiphone code training session would have improved performance quite markedly. The remarks of the following paragraph also bear out the conjecture that the students could have experienced more substantial gains in performance with the Lexiphone code than with the optophone code given further training time.

At the conclusion of the code comparison experiment the students were asked to record their opinion of the two codes. The majority preferred the Lexiphone code, saying that the optophone code quickly became monotonous, or that all optophone-coded letters came to have an annoying "sameness." One student stated specifically that the hiss and click dimensions of the Lexiphone code provided the only distinguishing clues for him. He admitted not having a musical ear. Although the remainder of students preferring the Lexiphone code were not so specific, they explained their preference by saying that the Lexiphone code components were more "variable."
At the same time, some of them complained that the "noise" (hiss) present in the Lexiphone code tended to obscure the "real variation," thus revealing their inability to appreciate the hiss dimension. With further training this unpleasant dimension likely could have become a subjectively worthwhile variable to those disturbed by it in the first training session.

4.3.2 Summary of the Lexiphone-Optophone Code Comparison

All of the results obtained from this Lexiphone-optophone comparison experiment suggest that the Lexiphone code is the superior code. Unfortunately, we have not been able to show that the difference in performance between the two codes is statistically significant, but this failing is attributed to the short period of code training involved. The difficulty of correctly interpreting the short-term training results is outlined in the following paragraph.

Both groups performed better on their second code test, despite the fact the second code differed from the first. Had the same code been presented in both training sessions, the improvement in performance certainly would have been at least as great. Hence, the significant improvement that was noticed demonstrates that at the end of the first hour of training the code learning process was not yet complete; the students' ability to improve their decoding performance had not yet reached a saturation point. (Average score was 40%.) It is therefore not valid to extrapolate the results of these short one-hour training sessions as truly representing the results that would be obtained after extended code training. As an illustration of the danger of such extrapolation, the initial learning rate with code A might be greater than that with code B, but after sufficient training the learning curves could level off with code B in the superior position.
Although desirable, it was not practical to carry out a more decisive code comparison experiment involving several hours training with each code.

5. STATISTICAL STUDY OF LEXIPHONE-CODED LETTERS

5.1 Introduction

This chapter describes a study carried out to determine the exact print information signals produced by the Lexiphone, and to ascertain whether letter characteristics derived from these signals lead to a simple scheme for letter or letter-feature recognition. If simple feature-detection circuits based on these characteristics could economically be added to the Lexiphone, then the sound code generated by the machine could be made less complicated and consequently more efficient.

It was implicit in this study that the signals be obtained from the existing six-cell print scanner. The idea was merely to establish the possibility of incorporating simple improvements in the present machine, not to suggest the design of a new print scanning method. Also it was intended to use the statistics compiled from the signals of the present scanner to study the quantity of information generated by a simple direct-translating machine of the Lexiphone type. This informational study is the subject of Chapter 6.

5.2 Quantizing and Recording the Print Signals

As print passes beneath the Lexiphone scanner (reading tube), six binary signals are generated by six relays connected to the photocells. At any instant, each of the six relays is in one of two conditions: ON (binary state "1"), or OFF (binary state "0"). We call the instantaneous condition of all six relays the relay "state." Physically, each state registers the instantaneous distribution of ink detected in a narrow vertical sample of that letter, and the entire letter is then described by the succession of states generated during its scanning.
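As a small illustration of the relay "state" just defined, the sketch below packs six ON/OFF relay readings into a single integer between 0 and 63 and recovers the active cells from such an integer. It assumes relay 1 is the most significant bit and relay 6 the least significant, which is the convention this section later adopts for its hexadecimal display; the function names `pack_state` and `active_cells` are hypothetical and are not part of the original Alwac program.

```python
def pack_state(relays):
    """Pack six relay readings (relay 1 first) into one integer state, 0..63.

    Assumption: relay 1 is the most significant bit, relay 6 the least.
    """
    if len(relays) != 6:
        raise ValueError("expected six relay readings")
    state = 0
    for reading in relays:
        # Shift the bits collected so far and append the next relay as 0 or 1.
        state = (state << 1) | (1 if reading else 0)
    return state


def active_cells(state):
    """Return the relay/photocell numbers (1..6) that are ON in a state."""
    return [n for n in range(1, 7) if state & (1 << (6 - n))]


# Relays 2 and 5 ON, all others OFF.
example = pack_state([0, 1, 0, 0, 1, 0])
print(format(example, "02x"), active_cells(example))
```

With this bit order, relays 2 and 5 together give the two-digit hexadecimal label 12, and the zero state corresponds to blank paper.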
There are 64 possible states, each of which can be characterized by a six-digit binary number denoting the binary position of the relays. The zero state, all relays OFF, occurs when no print is detected. Any non-zero state signifies the presence of print. If the six binary digits of a state are considered to form a column vector, and the successive column vectors describing the letter are placed next to each other, a binary array or matrix results, and it constitutes a quantized facsimile of the letter (Figure 5-1).

[Figure 5-1. Quantized Facsimiles of the Letters "a" and "f". Rows correspond to relay numbers 1 to 6; binary zeros are represented by dots.]

In this study, the 52 letters of the standard Pica typewriter font were used. Ten samples of each were typed on a sheet with an electric typewriter in preparation for Lexiphone scanning. Because the horizontal letter-spacing in this font is constant (0.1 inches per letter) independent of the letter typed, we define this length (0.1 inches) to be the "letter-space." The contours of a letter, however, actually span just a proportion of the letter-space, and this proportion defines the "length" of a letter (Fig. 5-3).

An Alwac III-E digital computer performed the state sampling and storing functions. Because the input unit of the Alwac conveniently reads six-bit characters, the six Lexiphone relay contacts were connected directly to it, thus allowing the computer to sample the instantaneous relay state with a single READ instruction.

It was determined experimentally that twenty equally-spaced samples per letter-space were sufficient to resolve the shortest state durations occurring during the scanning of a letter. The sampling process was therefore carried out as follows. A convenient Lexiphone scanning speed of one letter-space per second (0.1 in./sec.) was chosen, and the computer was programmed to sample the relay state every 50 milliseconds once the presence of print was detected. The white space before each letter generated the zero state. When this state was encountered by the computer, it cycled back to the READ instruction every 4 milliseconds. The first non-zero state read-in constituted the first sample of a letter, and it initiated the 50 millisecond sampling cycle for the remaining nineteen equispaced samples of that letter. Each of the 520 letters was sampled in this way and stored.

This sampling method, in which the samples refer to print information detected at specific locations of a letter left-registered in the letter-space (Fig. 5-1), seemed to be the simplest method that might be instrumented in connection with the Lexiphone. The samples are simply and reliably synchronized to each letter because the sampling process is triggered by the left-hand edge of a letter. In the actual machine sampler, the sampling cycle rate would be tied directly to the scanning speed and this would ensure the correct spacing of samples.

For the purposes of displaying the quantized letters in an abbreviated form, the 64 six-bit states are most easily represented by means of hexadecimal numbers (0,1,...,9,a,b,...,f). If position 6 in Figure 5-1 is considered the least significant bit, and position 1 the most significant, then each state can be described by a two-digit hexadecimal number between 00 and 3f (corresponding to the decimal equivalents 0 and 63). The binary matrix shown in Figure 5-1 for a quantized letter can then be represented as a row matrix of twenty elements, as indicated in Figure 5-2. The binary matrix is spread out in Figure 5-2 to allow for the clear designation of each state.

The row matrix of Figure 5-2 can be considered a one-dimensional representation of the two-dimensional spatial information contained in the printed letter "a".
The row matrix is therefore a one-dimensional code for that letter, and the decimal values of the matrix can be plotted on a scale, as in Figure 5-3, to yield a graphical display. Such a display illustrates the signal variation which would be encountered with a unidimensional sound code where there exists a one-to-one correspondence between the 64 state and code values.

[Figure 5-2. Binary Matrix and Hexadecimal Row Matrix Representing the Quantized Letter "a". Hexadecimal row matrix: 04 0e 0e 1e 1e 12 12 12 12 1e 0e 0e 0e 02 02 02 00 00 00 00.]

[Figure 5-3. One-dimensional Representation of Quantized Letter on a Time or Space Scale. Decimal state value (0 to 60) plotted against state sample number m (0 to 20) for the quantized letter "a", with the letter-space and the length of letter "a" marked.]

5.3 Study of Correlation Between Pairs of Cells

A preliminary study was carried out on the quantized print data to establish the possibility of optimizing the Lexiphone code with respect to photocells 2, 3, 4, and 5. These cells are activated by all letters and therefore generate the bulk (two-thirds) of the information produced by the print scanner. Because these cells are located symmetrically with respect to the lower-case body (the vertical extent of lower-case letter x), it seemed possible that some of the symmetry evident in letters such as d, b, p, q, g, e, c, o, and x might cause cells 2 to 5 to duplicate each other to some extent. If the coding action of two of the cells were highly correlated, then one of them could be ignored and the sound code made consequently simpler, or the less well correlated cells could be made to control the more effective dimensions of the sound code.
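To make the notion of two "highly correlated" cells concrete, the following is a minimal sketch of the ordinary product-moment correlation coefficient applied to two binary cell signals. The sample signals are hypothetical, the function name `binary_correlation` is illustrative only, and the code is not the program used in the study.

```python
from math import sqrt


def binary_correlation(x1, x2):
    """Product-moment correlation between two equal-length 0/1 signals."""
    n = len(x1)
    if n == 0 or n != len(x2):
        raise ValueError("signals must be non-empty and of equal length")
    mean1 = sum(x1) / n
    mean2 = sum(x2) / n
    # Average product of the deviations from the means (the covariance).
    covariance = sum((a - mean1) * (b - mean2) for a, b in zip(x1, x2)) / n
    s1 = sqrt(sum((a - mean1) ** 2 for a in x1) / n)
    s2 = sqrt(sum((b - mean2) ** 2 for b in x2) / n)
    if s1 == 0 or s2 == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return covariance / (s1 * s2)


# Hypothetical samples from two cells over eight scanner columns.
cell_a = [1, 1, 0, 0, 1, 0, 1, 0]
cell_b = [0, 0, 1, 1, 0, 1, 0, 1]
print(binary_correlation(cell_a, cell_b))
```

A value near +1 or -1 would mean one cell largely predicts the other, so one of the pair adds little new information; values near zero mean each cell contributes separately.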
Pairs of cells expected to exhibit a high degree of correlation were selected, and from their binary signals were computed correlation coefficients r,

$$ r = \frac{\overline{(x_1 - \bar{x}_1)(x_2 - \bar{x}_2)}}{s_1 s_2} $$

where x_i is the binary signal of the i-th cell considered, and s_i is the standard deviation of the cell signal. The correlation was also measured between two groups of two cells each, in which case x_i refers to the combined signal of the i-th pair of cells.

In Table 5-1 are summarized the correlation results.

Between Cells No.     r
2 & 5               -0.41
2 & 3               -0.36
3 & 4               -0.28
4 & 5               -0.46
2&3 and 4&5         +0.30
3&4 and 2&5         +0.24

Table 5-1. Correlation Coefficients Computed From the Quantized Photocell Signals

The results indicate that all pairs of cells are correlated to about the same extent and that the degree of correlation is not very great. Each cell appears to contribute an approximately equal proportion of the coded information generated by cells 2 to 5. No special assignment of audible code dimensions is therefore indicated.

5.4 Characteristics of the Quantized Letters

A statistical study was carried out on the quantized letter data to test for the presence of certain characteristics. The characteristics evaluated were those that

1. might lead to a letter recognition scheme, and
2. could be measured by a simply-instrumented device connected to the Lexiphone scanner.

Some additional statistics were calculated for the whole field of data, independent of particular letters, which indicate the average distribution of print information in the letter space.

The statistical calculations tabulated in these sections were performed by an IBM 7040 digital computer. Punched cards containing the quantized Lexiphone data were produced for the 7040 by first converting the normal six-bit punched paper tape output of the Alwac to corresponding tapes coded in the five-bit telegraphic code.
These latter tapes were then converted to cards by an IBM Type 47 Tape-to-Card Converter.

We first introduce terminology to be used in this and the following sections. The instantaneous relay state, denoted by s_i, may be any one of the 64 possible states (i = 0,1,2,...,63). The j-th character of the 52-letter alphabet is denoted by Y_j (j = 1,2,...,52). Every Y_j is described in quantized form by twenty particular states s_i occurring in the twenty successive locations m (m = 1,2,...,20) of the letter-space. (See Figure 5-3.) We refer to location m as column number m; for if the 520 row matrices of the ensemble of letters (10 examples of each letter) are placed one beneath the other to form an ensemble array (520 x 20), all locations numbered m fall in a column. The rows of the binary matrix for each letter are numbered n (n = 1,2,...,6), corresponding to the photocell and relay number. C_i(k_i) refers to the i-th characteristic of the data, and depends on the parameter k_i (k_i = 1,2,...,K_i). When the probability of Y_j is written it is assumed the probability is evaluated over all ten examples of that letter.

5.4.1 Letter-Independent Characteristics

Some trends of the whole field of data were determined first without regard to the letters generating the data. The trends evaluated were: the average state density, P(s_i); the preference of state s_i for column m, P(m|s_i), and vice versa, P(s_i|m); the probability P(C_1(k_1)|s_i) that state s_i occurs exactly k_1 times per letter; the probability P(C_2(k_2)) that the average letter length is k_2 columns long; the probability P(C_3(k_3)|n) that k_3 "1" states occur in row n of the binary matrix (in other words, that the integrated "black" information detected by photocell n per letter amounts to k_3); and finally, the probability P(C_4(k_4)|n) that there are k_4 intersections per letter in row n of the binary matrix (that is, photocell n, on the average, intersects k_4 distinct parts of a letter).

The following trends are evident from the results. Forty-three of the non-zero states occur, of which three in particular occur most frequently by quite a margin: state 22 hex (cells 1 and 5), state 12 hex (cells 2 and 5), and state 3e hex (cells 1, 2, 3, 4, and 5). Some states show a preference for certain groups of columns m, but for the most part are widely distributed across the letter-space. Averaging characteristic C_1(k_1) over all states shows that states occur once, twice, three times, and four times per letter with about the same frequency, although a small peak at k_1 = 3 is evident. Characteristic C_2(k_2) shows that the average length of a letter lies in the range k_2 = 11 to 20 columns, with a definite average peak at k_2 = 18. No particular patterning of the integrated information given by C_3(k_3) is evident, except for the obvious fact that cells 1, 2 and 5 encounter a greater amount of information than do the other cells. The results obtained from C_4(k_4) show that a cell encounters no more than three intersections per letter, and that all cells encounter single and double intersections with equal frequency.

5.4.2 Letter-Dependent Characteristics

The power of the above four characteristics C_i(k_i) to discriminate between letters was determined by computing the conditional probabilities P(Y_j | C_1(k_1), s_i), P(Y_j | C_2(k_2)), P(Y_j | C_3(k_3), n), and P(Y_j | C_4(k_4), n). Based on these statistics, the grouping of letters according to their characteristics is certainly possible if small deviations between samples of the same letter are ignored.

Characteristic C_2, the letter length, offers classification of the alphabet into two groups only because of the inconsistency and similarity of letter lengths: lengths 11 to 16 - ijsclgleftozarpSCZgnXTHGQ; lengths 17 to 20 - nOTHGQdbuvxLBhmERPJPYDkywAMUWX.

Characteristic C_3, the integrated information per letter per cell, leads to nineteen exclusive groups of letters. The groups are determined by first representing the integrated information per cell with the digits 0, 1, or 2 (no units of information, 1 to 10 units, or 11 to 20 units), and second, collecting together all letters characterized by the same set of six digits. The nineteen groups (some consisting of just single letters) are: ESTKLIPYZDX, srzmnux, RPHGBN, CJTJV, gpy, fli, khd, eco, tb, a, j, q, v, w, A, M, O, Q, W.

Characteristic C_4, the number of intersections per cell, when summarized over all cells as for C_3, leads to twenty-five exclusive groups of letters: escrv, flilY, EODZB, tCLJ, hdbG, aoz, kKR, gy, pq, FT, PY, MN, u, j, m, n, w, x, A, H, Q, U, V, W, X.

The classification of letters according to C_1 is unreliable; most of the groups resulting from such classification overlap.

A discrete partial letter recognition scheme based on characteristics C_2, C_3, and C_4 would be simple to instrument: an integrator and comparator for C_2, six sets of integrators and comparators for C_3, six sets of two-bit counters for C_4, a sampling clock pulse, and necessary AND and OR gates to make the discrete letter-grouping decisions.
Those letters not classified uniquely could be coded conventionally (modified according to the classification group into which they fall), while letters identified individually could be coded with a single characteristic sound.

The drawback with such a discrete GO, NO-GO recognition process is that imperfect letters are readily rejected, or classified in the wrong group. A more sophisticated identification scheme based on "maximum-likelihood decoding" circumvents this imperfection problem to some extent. In this scheme, J summing junctions are established corresponding to the J letters Y_j to be identified. The outputs of each characteristic calculator C_i(k_i) (k_i = 1,2,...,K_i) are individually connected to each Y_j junction through a weighting attenuator whose attenuation value depends on the statistical importance of C_i(k_i) to the identification of Y_j. After an unknown letter is scanned, the Y_j junction with the highest signal is selected. If its value is above a certain threshold and is sufficiently greater than the value of its nearest competitor, that Y_j is the letter identified; if not, the unknown letter is rejected as being unidentifiable. The advantage of this scheme is that small imperfections in a letter do not appreciably reduce the possibility of its being correctly identified.

In the following section, we examine the value of the characteristics C_2, C_3, and C_4 in such a maximum-likelihood recognition process. A "goodness" measure calculated for each characteristic establishes its relative merit and allows the effectiveness of all characteristics operating together to be established.

5.5 Recognition Effectiveness of Characteristics

For each characteristic C_i is calculated a goodness measure G_i, a single non-negative number measuring the correlation between C_i and the Y_j's.

$$ G_i = \sum_{k_i} \sum_{j} P(C_i(k_i), Y_j) \, \log \frac{P(Y_j \mid C_i(k_i))}{P(Y_j)} \tag{5-1} $$

In information theory terms, G_i is precisely the average information about the Y_j's given by C_i. Consequently, individual characteristics or groups of characteristics may be evaluated, for letter recognition purposes, by comparing individual G_i's or sums of G_i's, respectively. Lewis points out that if the characteristics chosen are statistically independent, then a linear relationship exists between the percentage recognition attainable by a system employing the C_i's (i = 1,2,...,l) and the corresponding sum of the G_i's. This means that a quick evaluation of proposed characteristics (and groups of characteristics) is possible without the problem of computing l-th order conditional joint probabilities P(Y_j | C_1, C_2, ..., C_l). He also suggests that if the argument of the log term of equation (5-1) is very near unity (that is, C_i(k_i) by itself is not very effective in identifying Y_j), then the log term can accurately be represented by the first term of its power series expansion, so that equation (5-1) becomes

$$ G_i \approx \sum_{k_i} \sum_{j} P(C_i(k_i), Y_j) \left[ \frac{P(Y_j \mid C_i(k_i))}{P(Y_j)} - 1 \right] \tag{5-2} $$

Utilizing the IBM 7040 computer, G_i values were computed by means of equation (5-2) for C_2(k_2), C_3(n)(k_3), and C_4(n)(k_4) (n = 1,2,...,6), where n is the row number of the binary matrices. The values are listed in Table 5-2. It is seen from the G_i values that characteristic C_3 (integrated "black" information) is about twice as effective as characteristic C_4 (number of intersections). This difference is actually overrated. The value k_3 of C_3 computed for a letter depends on the sensitivity of the Lexiphone scanner, which can change slightly from one day to the next. C_3 is therefore not a reliable characteristic. Characteristic C_2 (letter length) is unreliable for the same reason that C_3 is unreliable, although not to the same extent. C_4, however, is quite reliable; the number of intersections counted for a letter is reasonably independent of scanner sensitivity.

Characteristic            n      G_i(n)
C_2 (letter length)       -      1.63
                          Sum    1.63
C_3 (integrated info.)    1      1.84
                          2      1.60
                          3      1.15
                          4      1.02
                          5      1.56
                          6      0.50
                          Sum    7.67
C_4 (intersections)       1      0.94
                          2      0.78
                          3      0.82
                          4      0.73
                          5      0.64
                          6      0.44
                          Sum    4.35
Sum of all G_i                  13.65

Table 5-2. Goodness Measures G_i Computed for the Characteristics C_i

To relate G_i to the percentage recognition P_i possible employing a single characteristic C_i, equation (5-3) was evaluated for each C_i(n).

$$ P_{i(n)} = \sum_{k_i} P(C_{i(n)}(k_i)) \, P(Y_j \mid C_{i(n)}(k_i)) \tag{5-3} $$

where Y_j is that letter for which P(Y_j | C_i(n)(k_i)) is a maximum. Plotting the P_i(n) against the corresponding G_i(n) indicates that 10% recognition per unit of G_i is obtained for characteristics C_2 and C_3, while C_4 offers about 5% recognition per unit G_i. If we take the 5% value to be the more realistic, then the total percentage recognition possible with a system using characteristics C_2, C_3, and C_4 together, assuming the characteristics to be independent and identification errors to be equally weighted, is (13.65) x (0.05) = 68%.

Although time did not allow further investigation of the characteristics, it is possible to determine whether or not characteristics are independent. To do this, one simulates a maximum-likelihood recognition system and measures the percentage recognition obtained for a given set of characteristics. In each of several successive trials another characteristic is added to the previous set. For each trial, the sum of the G_i's corresponding to the C_i's employed is plotted against the percentage recognition. If the resulting curve formed by the series of trial points is linear, the characteristics chosen are statistically independent. If a dependent characteristic is added, the curve exhibits saturation.
Of course, eventually, the curve must saturate at 100% recognition.

The cursory examination we have given the above characteristics suggests that they by themselves cannot yield a perfect letter-recognition system. (Lewis required thirteen independent characteristics to achieve 82% recognition of fifteen alphabets.) Perhaps the introduction of registration-dependent characteristics (dependent on m) would improve recognition performance. Nevertheless, as demonstrated in Section 5.4, these simple characteristics can be used to identify some letters uniquely or, at the least, to classify them into groups. By delaying the audible coding of a letter, this classification information could be used to make the output coding more efficient. The conventional sound-coded version of that letter could be modified (or simplified) according to the group into which it is classified.

6. MACHINE READING RATES - INFORMATIONAL AND PSYCHOLOGICAL CONSIDERATIONS

6.1 Introduction

The assimilation of printed language presents no problem to normal sighted readers. Typically, they can manage to read 150 to 400 words per minute. Yet when the same print is translated by a simple reading machine and presented to blind persons in a different sensory form, only 30 words per minute or less can be handled. This comparison indicates that the print recoding process performed by the machine must introduce certain speed limiting factors not normally present or operative in the direct visual process. While the existence of these factors is self-evident, it is not obvious how many there are nor upon what they depend. The choice of a particular audible code, the factor with which we have primarily been concerned up to this point, may be only one of several, more significant factors. Naturally, the reading machine problem would be greatly clarified if the identity of these limiting factors were established.
In this chapter we speculate upon the effect of several informational and psychological variables, some or all of which may be limiting factors responsible for low machine reading rates. Selective and psychological order source entropies are calculated for Lexiphone-coded print, taking Lexiphone characteristics as typical of simple translating devices. When the total psychological entropy calculated is related to the human channel capacity, maximum machine reading rates similar to recorded experimental results are obtained.

6.2 Lexiphone Source Entropy and Coding Redundancy

Consider the Lexiphone to represent an information source capable of transmitting any one of a set of n source events x_i (i = 1,2,...,n). If event x_i occurs with probability p_i, then the average entropy H associated with the x_i is given by

$$ H(x) = -\sum_{i=1}^{n} p_i \log_2 p_i \quad \text{(bits/event)} \tag{6-1} $$

where H(x) denotes specifically source entropy. If the x_i are equiprobable, H(x) attains a maximum value and equation (6-1) simplifies to

$$ H(x)_{max} = \log_2 n $$

Let us first calculate the maximum amount of coded information per letter that could be transmitted by the Lexiphone in the optimal coding situation. Maximum H occurs only when the 63 states generated by the machine are used equiprobably in the coding of letters. If this is done, then H(x)_max = log_2 63 = 5.98 bits per state. From the statistical data collected for Chapter 5, it has been calculated that a letter scanned by the Lexiphone generates, on the average, 8.4 distinct states. This leads to a maximum possible source entropy of

$$ H(x)_{max} = 5.98 \times 8.4 = 50.3 \ \text{bits/letter} $$

Now let us compare this maximum with the source entropy actually transmitted by the Lexiphone (or for that matter, by any reading machine that presents no more than one letter per stimulus). The source material from which the reading machine entropy arises is simply the set of letters making up the 52- (or 26-) letter alphabet. In the case of a simple reading machine there are generated 52 different stimulus patterns corresponding to the 52 input letters. If the 52 letters are assumed to occur equiprobably and independently, the source entropy generated by the machine must be just the zeroth-order approximation to the entropy of printed English, H_0:

$$ H(x) = H_0 = \log_2 52 = 5.7 \ \text{bits/letter} $$

Of course print "noise" can cause the actual machine source entropy to rise above H_0 by causing the machine to generate more than one stimulus pattern per letter, but under normal circumstances this rise is not serious. The values of actual and maximum Lexiphone source entropy are now compared by calculating the coding redundancy R:

$$ R = 1 - \frac{H(x)}{H(x)_{max}} $$

Hence, the Lexiphone coding redundancy is

$$ R_{Lex} = 1 - \frac{5.7}{50.3} = 89\% $$

Shannon (29) points out that printed English is about 50% redundant, and that much of this redundancy is useful in reducing error. (In a non-redundant language, every misprinted letter gives rise to a new word that is perfectly meaningful in the sentence, but a sentence whose meaning is different from that intended.) With the Lexiphone code it is a question of whether or not all of the redundancy is useful. If the Lexiphone user is able to appreciate and use to his advantage the patterning and interdependence of code sounds, then the code redundancy is useful. If he cannot appreciate (or ignore) the redundant parts of the code, the redundancy is not useful, and he experiences a consequent increase in effective source entropy. The result of such an increase is to reduce his maximum reading rate, as is demonstrated in Section 6.3.
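The redundancy arithmetic above can be retraced with a short sketch. The state count (63), the average number of distinct states per letter (8.4) and the alphabet size (52) are taken from the text; the function names are illustrative only. Because the logarithms are not rounded before multiplying, the intermediate values may differ slightly in the last digit from the rounded figures quoted above.

```python
from math import log2


def max_entropy_per_letter(n_states, states_per_letter):
    """Upper bound on bits per letter if every state were used equiprobably."""
    return log2(n_states) * states_per_letter


def coding_redundancy(actual_entropy, maximum_entropy):
    """Redundancy R = 1 - H / H_max, returned as a fraction between 0 and 1."""
    return 1.0 - actual_entropy / maximum_entropy


h_max = max_entropy_per_letter(63, 8.4)   # maximum source entropy, bits/letter
h_zero = log2(52)                         # zeroth-order entropy of a 52-letter alphabet
r_lex = coding_redundancy(h_zero, h_max)

print(f"H_max = {h_max:.2f} bits/letter")
print(f"H_0   = {h_zero:.2f} bits/letter")
print(f"R_Lex = {r_lex:.0%}")
```

The same two functions can be reused with a 26-letter alphabet, or with a different average number of states per letter, to see how sensitive the redundancy estimate is to those assumptions.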
We have not included alphabetic or contextual redundancy in the above calculation of Lexiphone source entropy because this type of redundancy is effective regardless of what reading machine code is used; the machine merely translates the alphabetic information it receives. Unequal letter probabilities evident in printed English reduce the average entropy per letter; for a 26-letter alphabet, the entropy is reduced from $H_0 = \log_2 26 = 4.7$ bits per letter to 4.1 bits per letter. Further reduction occurs when higher-order redundancies (digram, trigram, ..., and word frequencies) are taken into account. The redundancy we have calculated for the Lexiphone code is over and above that language redundancy present in the source material.

6.3 Psychological Source Entropy

If the discrete sounds of the Lexiphone code comprising a letter cannot be perceived by the human subject as a single image or Gestalt, then a psychological limitation suggested by Crossman (33) may tend to increase the effective source entropy. The limitation deals with the cost of confusing, on recall, the exact serial order in which the code sounds were perceived. It is not unlikely that this limitation is effective here because many Lexiphone-coded letters are differentiated only by a subtle difference in the order of certain code sounds.

Consider a coded letter to be a message consisting of $n$ sequentially ordered message-units (which are merely the $n$ states generated by the letter). After a message has been received by the human subject, he recalls the set of message-units from immediate memory, and to make a correct identification, he must recall exactly their original serial order. The cost of his preserving the serial order of $n$ units in a message can be considered the psychological order source entropy, $H_o(x)$, given by

$$H_o(x) = \log_2 \frac{n!}{n_1!\, n_2! \cdots n_r!} \qquad (6\text{-}2)$$

where $n!$ is the number of permutations of $n$ message units, $n_i$ is the number of times message-unit $i$ appears in the message if $i$ appears more than once, and $r$ is the number of different units appearing more than once per message. The denominator of equation (6-2) is present because we consider that two identical units may be transposed without any loss of order information. As we would expect, the cost of preserving serial order increases with the length of the message.

Equation (6-2) was evaluated for each of our 520 letter-samples, and the average order source entropy calculated was

$$H_o(x) = 13.7 \ \text{bits/letter}$$

(When the individual letter entropies are weighted according to the frequency of occurrence of letters in English, $H_o(x)$ is reduced to 12.0 bits per letter.)

Crossman shows that the total psychological source entropy $H_t(x)$ is simply the sum of the original selective source entropy $H(x)$ and the order source entropy $H_o(x)$:

$$H_t(x) = H(x) + H_o(x) \qquad (6\text{-}3)$$

If we assume that all of the order source entropy is effective as far as a Lexiphone user is concerned, then, by inserting in equation (6-3) this value of $H_o(x)$ and the value of $H(x)$ determined in Section 6.2, we arrive at the following amount of total psychological source entropy:

$$H_t(x) = 5.7 + 13.7 = 19.4 \ \text{bits/letter}$$

We are now in a position to calculate the maximum possible reading rate based on this estimate of source entropy.
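As a rough illustration of equation (6-2), the sketch below computes the order entropy of a single coded letter from its sequence of states. It is not the procedure used for the 520 letter-samples, whose data are not reproduced here; the function name `order_entropy_bits` and the example state sequence are hypothetical. Units that occur only once contribute a factor of 1! = 1, so dividing by the factorial of every unit's count gives the same result as dividing only by the counts of the repeated units.

```python
import math
from collections import Counter

def order_entropy_bits(states):
    """Order source entropy of equation (6-2) for one message.

    Returns log2( n! / (n_1! * n_2! * ... * n_r!) ), where n is the message
    length and n_i is the number of occurrences of each repeated unit.
    """
    n = len(states)
    orderings = math.factorial(n)
    for count in Counter(states).values():
        # Exact integer division: each partial quotient is still an integer.
        orderings //= math.factorial(count)
    return math.log2(orderings)

# Hypothetical letter: 8 states, one state appearing 3 times, another twice.
example_letter = [3, 17, 3, 42, 8, 17, 3, 60]
h_order = order_entropy_bits(example_letter)   # log2(8! / (3! * 2!)) = log2(3360)

# Total psychological source entropy of equation (6-3) for this one letter,
# using the selective entropy of 5.7 bits/letter quoted in the text.
h_total = 5.7 + h_order

print(f"H_o = {h_order:.2f} bits, H_t = {h_total:.2f} bits")
```

With no repeated states the expression reduces to log2(n!), which grows quickly with message length, in line with the remark that the cost of preserving serial order increases with the length of the message.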
6.4 Human Channel Capacity and Maximum Reading Rates

Peak information transfer rates $I_{\max}$ attained by human subjects performing well-practised sequential tasks are reported to be in the range 15 to 44 bits per second (32-35). The lowest figure is typical of tasks requiring motor-control, such as piano-playing and typing; intermediate values correspond to tasks such as casual conversation, oral reading, and expert mental arithmetic; while the upper limit is an estimate of the peak rate attained for the motorless task of silent reading.

If we suppose a person reading by means of the Lexiphone is so highly experienced with the machine that his errorless reading rate has reached a saturation speed of $L_{\max}$ letters per second, then it is reasonable to assume he is processing information at a rate $I$ approaching his channel capacity $C$ (bits/sec.) for this particular task. Let the channel capacity of our subject assume the generous value of $C = 40$ bits per second. We will assume that his equivocation rate $\dot{H}_y(x)$ is negligible; that is, that

$$C = \max \dot{I}(x;y) = \max \left[ \dot{H}(x) - \dot{H}_y(x) \right] = \max \dot{H}(x) = \dot{H}(x)_{\max} \qquad (6\text{-}4)$$

where the dot denotes a time-rate quantity. His maximum reading rate is then given by

$$L_{\max} \ \text{(letters/sec.)} = \frac{C \ \text{(bits/sec.)}}{H(x) \ \text{(bits/letter)}} \qquad (6\text{-}5)$$

Substituting into equation (6-5) the value of total psychological source entropy calculated in the previous section in place of $H(x)$, and the value of $C = 40$ bits per second, we obtain the following estimate for the maximum Lexiphone reading rate (assuming 4.5 letters per word):

$$\text{Lexiphone } L_{\max} = \frac{40}{19.4} = 2.06 \ \frac{\text{letters}}{\text{sec.}} = 27.5 \ \frac{\text{words}}{\text{min.}}$$

Another similar calculation is appropriate at this point. Consider a person who reads by means of spelled speech (say, the audible output from a recognition-type reading machine).
The source entropy he encounters is just the selective entropy of one of 26 equiprobable letters, $\log_2 26 = 4.7$ bits per letter, considering that no psychological order entropy is created by a single letter sound. If we assume his channel capacity for this task is also $C = 40$ bits per second, then his maximum spelled-speech reading rate is, according to equation (6-5),

$$\text{Spelled Speech } L_{\max} = \frac{40}{4.7} = 8.5 \ \frac{\text{letters}}{\text{sec.}} = 110 \ \frac{\text{words}}{\text{min.}}$$

Although these maximum reading rates have been arrived at using rough informational approximations, the rates correspond quite closely to the experimental rates recorded for human performance with Lexiphone-type reading machines and spelled speech. How accurate the approximations are, or how valid this informational approach is, is difficult to say, for the human information channel does not easily lend itself to informational analysis in absolute terms. We can, however, offer comment on the informational (and some of the psychological) parameters involved.

The maximum reading rate $L_{\max}$ given by equation (6-5) employs the ratio of channel capacity $C$ to source entropy $H(x)$. The value of $C = 40$ bits per second chosen for our hypothetical subject is certainly an upper limit under any condition. Decoding errors and lack of familiarity with the task will reduce $C$ to a value well below 40 bits per second. The upper limit of $L_{\max}$ is therefore dependent only on the lower bound of $H(x)$, or, in the case of the human channel, $H_t(x)$. So one asks, how can the components of $H_t(x) = H(x) + H_o(x)$ be reduced in magnitude psychologically?

The true statistical value of $H(x)$ cannot of course be reduced below the average amount of information necessary to differentiate 52 possible letters, but a good proportion of the information in excess of this minimum (i.e. the excessive coding redundancy) can certainly be excluded with psychological advantage.
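The dependence of the maximum reading rate on source entropy can be made concrete with a small sketch of equation (6-5). The code is illustrative only; the function name `max_reading_rate` is hypothetical, and the inputs (C = 40 bits per second, 19.4 and 4.7 bits per letter, 4.5 letters per word) are the values assumed in the text. For spelled speech the unrounded result is a little above 113 words per minute, which the text quotes as 110.

```python
def max_reading_rate(capacity_bits_per_sec, entropy_bits_per_letter,
                     letters_per_word=4.5):
    """Equation (6-5): L_max = C / H, also converted to words per minute.

    Returns a tuple (letters_per_second, words_per_minute).
    """
    letters_per_sec = capacity_bits_per_sec / entropy_bits_per_letter
    words_per_min = letters_per_sec * 60.0 / letters_per_word
    return letters_per_sec, words_per_min

C = 40.0  # assumed channel capacity, bits per second

lexiphone = max_reading_rate(C, 19.4)   # total psychological entropy H_t(x)
spelled = max_reading_rate(C, 4.7)      # selective entropy of 26 letters

print(f"Lexiphone     : {lexiphone[0]:.2f} letters/sec, {lexiphone[1]:.1f} words/min")
print(f"Spelled speech: {spelled[0]:.2f} letters/sec, {spelled[1]:.1f} words/min")

# Effect of lowering the source entropy while C is held fixed.
for h in (19.4, 15.0, 10.0, 5.7):
    print(f"H = {h:4.1f} bits/letter -> {max_reading_rate(C, h)[1]:5.1f} words/min")
```

Because the rate varies inversely with entropy, halving the psychological source entropy doubles the estimated reading rate for a fixed channel capacity.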
To achieve minimum $H(x)$, a character recognition machine is obviously indicated; but near-minimum $H(x)$ can be achieved by transmitting only those states or patterns of states which are almost non-redundant. In other words, near-optimum coding (minimum psychological $H(x)$) can be realized only with some pre-processing of the print information.

The psychological order source entropy $H_o(x)$ depends on the number of distinct units per message (sounds per letter) that must be recalled serially by the human subject. $H_o(x)$ may be reduced in two ways. First, we can minimize the actual number of discrete sound units generated per letter, even to one sound per letter, which again necessitates a character recognition device. Or, second, we can associate with the signal units generated by a letter a coalescent audible code whose components combine in time to produce a single psychological sound unit. Of course it is just such a coalescent code for which reading machine workers have unconsciously, or otherwise, been searching: a code which gives rise to easily remembered audible patterns.

Most of the simple reading machine codes previously studied have generated sensory code units direct from discrete print signals. So it would be worthwhile to investigate a multidimensional audible code that is controlled by analogue electrical signals derived from the print. Such a code, whose components would be smoothly varying and necessarily coalescent, should encourage the perception of a letter as no more than a single psychological unit.

7. SUMMARY AND CONCLUSIONS

An experimental direct-translation reading machine, based on the print scanning technique of the Optophone, has been built and its operation studied. This machine has made possible the evaluation of a four-dimensional audible code (the Lexiphone code), and the determination of print signals produced by a machine of this type.
Two blind subjects were trained for 36 hours to "read" both Lexiphone-coded letters and words selected from a limited alphabet of nine lower-case letters. Their reading performance was measured in terms of the percentage of letters or words correctly identified. While one subject performed consistently better, both managed to attain 100% at some point during the training period. It was demonstrated that blind persons can learn to "read" multidimensionally-encoded words, and more significantly, that they can do so by utilizing entirely different dimensions of the code. This latter fact alone is sufficient reason, in terms of wide appeal, to employ a multidimensional code in preference to a unidimensional code.

Also carried out was a comparative evaluation of the Lexiphone and Optophone codes. Two comparable groups of sighted persons were trained with single coded letters of the two codes for one hour. The results of tests given at the end of the training hour showed performance with the Lexiphone code to be slightly superior. While this superiority was not statistically significant, other factors indicated that further code training would widen the performance gap in favour of the Lexiphone code. First, the results indicated that Optophone performance had reached a partial saturation point even during the first training hour. Second, the majority of subjects concurred in the opinion that while the Lexiphone code was not as pleasant to listen to, it provided a better basis for letter discrimination.

The results of these audible code experiments lead to the conclusion that with a machine of the Lexiphone type a multidimensional code is to be preferred to a unidimensional code.
The further question of whether or not multidimensionally-encoded stimuli can yield reading rates higher than those experienced with unidimensional stimuli of the Optophone type can be settled only by carrying out an exhaustive code training program. The danger and inadequacy of interpreting short-term training results has been pointed out.

A statistical study of the quantized print signals was carried out to determine possible sound code simplifications, and to establish the redundancy and amount of information presented by the machine. It was shown that, with a small amount of logic circuitry, characteristics derived from the print signals could be used to classify letters into groups. If the encoding of letters into sound were delayed, then this classification information could be used to optimize the encoding of particular letters or groups of letters. The use of these characteristics in a maximum-likelihood letter recognition scheme was also investigated. Although it was demonstrated that 68% recognition could be achieved if the characteristics acted independently, this low figure and the fact that some of the characteristics are rather unreliable lead to the conclusion that this particular print scanning system is not well-suited to automatic letter recognition.

To help in hypothesizing why simple reading devices can offer only low reading speeds, the Lexiphone-human combination was treated as an information processing channel. Depending upon the validity of certain assumptions, it was shown that higher reading rates could be obtained by reducing the psychological value of the two components of source entropy. The first component, selective source entropy, may be reduced by decreasing the coding redundancy, which in the case of the Lexiphone is about 90%.
That proportion of the redundant information the subject is unable to use to his advantage or to ignore becomes useless information he must digest to the detriment of reading speed. The reduction of this component can be realized by first determining the redundant print signals and then processing the print information in such a way that these signals are deleted. The second component, psychological order source entropy, can be minimized by employing a sound code without discrete components, one which gives rise to just a single psychological sound pattern per letter. It was pointed out that both of these information components could together be reduced by utilizing a character recognition device.

APPENDIX I

THE THREE LETTER WORD VOCABULARY

sag sat say set she shy sty
car cat cry
act art ash ass ate age ace aye are
eat ear erg eye egg etc
rat ray rag rye
her hat hay hag has
tee tar tea the tag
gas gee gat get gay gag
yet yea yes

APPENDIX II

SENTENCE LIST

1. her tears are rare
2. the races that he sees are easy
3. her saga rates cash
4. he has a scar that each eye sees
5. she says that he acts gay yet cagy
6. she hates her rash
7. she acts her age at teas
8. there are the three tarts he eats
9. yes she teases
10. the crate sags at the stress
11. teach her that cats eat grey rats

REFERENCES

1. Pollack, I., and Ficks, L., "Information of Elementary Multidimensional Auditory Displays," J. Acoust. Soc. Amer., 26, 155-158; 1954.
2. Beddoes, M.P., Belyea, E.S.W., and Gibson, W.C., "A Reading Machine for the Blind," Nature, 190, 874-875; 1961.
3. Fournier d'Albe, E.E., "The Type-Reading Optophone," Nature, 94, p. 4; 1914.
4. Fournier d'Albe, E.E., "The Optophone: An Instrument for Reading by Ear," Nature, 105, 295-296; 1920. Fournier d'Albe, E.E., The Moon Element: An Introduction to the Wonders of Selenium, D. Appleton and Co., New York; 1924.
5. Cooper, F.S., "Research on Reading Machines for the Blind," in P.A. Zahl (ed.), Blindness: Modern Approaches to the Unseen Environment, Princeton Univ. Press, Princeton, N.J., 512-543; 1950.
6. Zworykin, V.K., and Flory, L.E., "Reading Aid for the Blind," Electronics, 19, 84-87; August, 1946.
7. Zworykin, V.K., Flory, L.E., and Pike, W.S., "Letter Reading Machine," Electronics, 22, 80-86; June, 1949.
8. Cooper, F.S., and Zahl, P.A., "Research on Reading Machines for the Blind," Progress Report to the Committee on Sensory Devices of work done at the Haskins Laboratories, New York; June 30, 1947.
9. Clowes, M.B., Ellis, K., Parks, J.R., Rengger, R., Communication from the National Physical Laboratory, U.K.; February 22, 1961.
10. Abma, J.S., Laymon, R.S., Mahler, D.S., Mason, L.J., Rice, D.R., and others, "The Development and Evaluation of Aural Reading Devices for the Blind," Battelle Memorial Institute Research reports to the Veterans Administration; 1958, 1959, 1960, 1961, 1962, 1963.
11. Coffey, J.L., "The Development and Evaluation of the Battelle Aural Reading Device," Proc. Int. Cong. on Technology and Blindness, Vol. 1, A.F.B., New York, 343-360; 1963.
12. Mauch, H.A., "The Development of a Reading Machine for the Blind," Summary Report to the Veterans Admin. from Mauch Laboratories, Inc., Dayton, Ohio; June 30, 1964.
13. Freiberger, H., "Summary Report of Developments in the Veterans Administration Research Program on Aids for the Blind," Veterans Administration, New York; October, 1962.
14. Metfessel, M.P., "Experimental Studies of Human Factors in Perception and Learning of Spelled Speech," Proc. Int. Cong. on Technology and Blindness, Vol. 1, A.F.B., 305-308; 1963.
15. Metfessel, M.P., and Lovell, C., "Controlled Association in Learning the Auditory Code of Spelled Speech," paper read at the 72nd Annual Convention of the American Psychological Association, Los Angeles; September 7, 1964.
16. Nye, P.W., "An Investigation of Audio Outputs for a Reading Machine," Report from the National Physical Laboratory, U.K.; February, 1965.
17. Beddoes, M.P., "Possible Uses of a Printed Braille Reader with Spelled Speech Output," Proc. Int. Cong. on Technology and Blindness, Vol. 1, A.F.B., New York, 325-341; 1963.
18. Bliss, J.C., "Kinesthetic-Tactile Communications," IRE Trans. on Inf. Theory, IT-8, 92-98; 1962.
19. Donaldson, R.W., "Multimodality Sensory Communications," Ph.D. Thesis, M.I.T.; June, 1965.
20. _______, "Medical electronics for blind readers," Electronics, 38, 35-36; January 25, 1965.
21. Shrager, P.G., and Susskind, C., "Electronics and the Blind," Adv. in Electronics and Electron Physics, 20, 281-292; 1964.
22. Freiberger, H., and Murphy, E.F., "Reading Machines for the Blind," IRE Trans. on Human Factors in Electronics, HFE-2, 8-19; March, 1961.
23. Fischer, G.L., Pollack, D.K., Radack, and Stevens (ed.), Optical Character Recognition, Spartan Books, Washington, D.C.; 1962.
24. Scharff, S.A., "Letter Scanning for Character Recognition," Proc. Int. Cong. on Technology and Blindness, Vol. 1, A.F.B., New York, 227-244; 1963.
25. Clemens, J.K., "Optical Character Recognition for Reading Machine Applications," Ph.D. Thesis, M.I.T.; September, 1965.
26. Emmanuel, A.F., and Mauch, H.A., "The Development and Evaluation of a Personal Type Reading Machine for the Blind," Mauch Laboratories report to the Veterans Administration; September, 1958.
27. Scott, W.A., and Wertheimer, M., Introduction to Psychological Research, John Wiley and Sons, Inc., New York, 258-262; 1962.
28. Lewis II, P.M., "The Characteristic Selection Problem in Recognition Systems," IRE Trans. on Info. Theory, IT-8, 171-178; 1962.
29. Shannon, C.E., "A Mathematical Theory of Communication," Bell Syst. Tech. J., 27, 379-423; 1948.
30. Quastler, H., "A Primer on Information Theory," in H.P. Yockey (ed.), Symposium of Information Theory in Biology, Pergamon Press, New York, 3-49; 1956.
31. Shannon, C.E., "Prediction and Entropy of Printed English," Bell Syst. Tech. J., 30, 50-64; 1951.
32. Attneave, F., Applications of Information Theory to Psychology, Holt, Rinehart, and Winston, New York; 1959.
33. Crossman, E.R.F.W., "Information and Serial Order in Human Immediate Memory," in C. Cherry (ed.), Information Theory, Fourth London Symposium, Butterworths, London, 147-161; 1960.
34. Murdock, B.B., "The Immediate Retention of Unrelated Words," J. Exp. Psychol., 60, 222-234; 1960.
35. Pierce, J.R., and Karlin, J.E., "Reading Rates and Information Rate of a Human Channel," Bell Syst. Tech. J., 36, 497-516; 1957.

