UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Machine recognition of independent and contextually constrained contour-traced handprinted characters Toussaint, Godfried T. 1969

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
831-UBC_1970_A7 T68.pdf [ 3.36MB ]
Metadata
JSON: 831-1.0093327.json
JSON-LD: 831-1.0093327-ld.json
RDF/XML (Pretty): 831-1.0093327-rdf.xml
RDF/JSON: 831-1.0093327-rdf.json
Turtle: 831-1.0093327-turtle.txt
N-Triples: 831-1.0093327-rdf-ntriples.txt
Original Record: 831-1.0093327-source.json
Full Text
831-1.0093327-fulltext.txt
Citation
831-1.0093327.ris

Full Text

MACHINE RECOGNITION OF INDEPENDENT AND CONTEXTUALLY CONSTRAINED CONTOUR-TRACED HANDPRINTED CHARACTERS  by  GODFRIED T. TOUSSAINT B.Sc,  U n i v e r s i t y o f T u l s a , 1968  A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE  REQUIREMENTS FOR THE DEGREE OF  MASTER OF APPLIED SCIENCE i n the Department o f Electrical  We accept  this  Engineering  t h e s i s as conforming to the  required  standard  Research S u p e r v i s o r Members o f Committee  A c t i n g Head o f Department  Members o f the Department of THE  Electrical  Engineering  UNIVERSITY OF BRITISH COLUMBIA  December, 1969  In  presenting  this  thesis  an a d v a n c e d d e g r e e the L i b r a r y I  further  for  agree  scholarly  by h i s of  shall  this  written  at  the U n i v e r s i t y  make i t  tha  freely  permission  for  It  of  Columbia,  British for  for extensive  gain  permission.  Depa r t m e n t  /  of  by  Columbia  shall  the  requirements  reference copying  of  I agree and this  that  not  copying  or  for that  study. thesis  t h e Head o f my D e p a r t m e n t  is understood  financial  The U n i v e r s i t y o f B r i t i s h V a n c o u v e r 8, Canada  fulfilment  available  p u r p o s e s may be g r a n t e d  representatives. thesis  in p a r t i a l  or  publication  be a l l o w e d w i t h o u t  my  ABSTRACT  A c o n t o u r - t r a c i n g technique was m o d i f i e d  and used w i t h  to r e c o g n i z e  upper case h a n d p r i n t e d  o r i g i n a l l y d i v i s e d by Clemens and Mason  several different  c l a s s i f i e r s of v a r y i n g  alphabetic characters.  comparison of the v a r i o u s c l a s s i f i e r s , w i t h  complexity  An a n a l y s i s and  the m o d i f i c a t i o n s i n t r o d u c e d to  handle v a r i a b l e l e n g t h f e a t u r e v e c t o r s , i s p r e s e n t e d . On independent c h a r a c t e r s , one e a s i l y r e a l i z e d  suboptimum  parametric  c l a s s i f i e r y i e l d e d r e c o g n i t i o n a c c u r a c i e s which compare f a v o u r a b l y w i t h published  results.  A d d i t i o n a l simple  improved r e s u l t s s i g n i f i c a n t l y  t e s t s on commonly  optimum Bayes c l a s s i f i e r  c a p a c i t y than a non-  are l i m i t e d .  The optimum d e c i s i o n on a s t r i n g of m c o n t e x t u a l l y  computationally and  constrained  a variable-length feature vector, i s derived.  efficient  a l g o r i t h m , based on t h i s e q u a t i o n , was  A  developed  t e s t e d w i t h monogram, bigram and t r i g r a m c o n t e x t u a l c o n s t r a i n t s of E n g l i s h  text.  A marked improvement i n r e c o g n i t i o n a c c u r a c y  was noted over the case  when c o n t e x t u a l c o n s t r a i n t s were not used, and a t r a d e - o f f was not  In  and performs s i g n i f i c a n t l y b e t t e r than the  optimum c l a s s i f i e r when t r a i n i n g and t e s t i n g data  c h a r a c t e r s , each having  characters  as d i d use of c o n t e x t u a l c o n s t r a i n t s .  a d d i t i o n , the above c l a s s i f i e r uses much l e s s s t o r a g e parametric  confused  other  only between the order  measurements taken,  of c o n t e x t u a l  observed  i n f o r m a t i o n used and the number of  but a l s o between the o r d e r  parameter ds which i n d i c a t e s the complexity  of context  and the v a l u e of a  of the c l a s s i f i c a t i o n  algorithm.  TABLE OF CONTENTS Page ABSTRACT TABLE OF CONTENTS  i  L I S T OF I L L U S T R A T I O N S  II.  i  i  '  v i  INTRODUCTION  1  1.1  Purpose of Research  1  1.2  Review o f P r e v i o u s Research  1  1.3  Scope o f t h e T h e s i s  3  THE PATTERN-RECOGNITION PROBLEM  '.  5  2.1  Introduction  5  2.2  Transducers  7  2.3  Feature  7  2.4  2.5 II.  i  v  ACKNOWLEDGEMENT I.  i  Extraction  2.3.1  Introduction...  7  2.3.2  Contour  8  2.3.3  Smoothing  11  2.3.4  Gestalt Variables  12  2.3.5  Feature  12  Tracing  Vector Formation  Classification  15  2.4.1  Introduction  15  2.4.2  The T a b l e L o o k - u p M e t h o d . . .  16  2.4.3  The P a r a m e t r i c B a y e s A p p r o a c h  17  2.4.4  Minimum E u c l i d e a n D i s t a n c e from  2.4.5  A Non-Euclidean  2.4.6  Some I m p o r t a n t  2.4.7  Handling  2.4.8  A Parallel-sequential  Classifier  23  2.4.9  Compound D e c i s i o n s f o r S t r i n g s o f D e p e n d e n t C h a r a c t e r s  23  EXPERIMENTS  Distance C l a s s i f i e r Interrelationships  Feature Vectors  T r a i n i n g and S t o r a g e  t h e Means  i n Multispace  18 18 19 21  Requirements  25  ...  27  i i i  IV.  V.  3.1  Introduction  27  3.2  D e s c r i p t i o n of the Data  28  3.3  Experiments  on I n d e p e n d e n t C h a r a c t e r s  30  3.4  Experiments  on Dependent C h a r a c t e r s  31  RESULTS  33  4.1  Introduction  33  4.2  Independent Characters  33  4.3  Dependent C h a r a c t e r s  37  D I S C U S S I O N AND CONCLUSIONS , 5.1  40  E v a l u a t i o n of Results  40  5.2  Suggestions  44  5.3  Summary o f C o n c l u s i o n s  f o r Further Research  46  REFERENCES  •  iv  47  LIST OF ILLUSTRATIONS FIGURE  Page  1  The g e n e r a l p a t t e r n - r e c o g n i t i o n problem  2  I l l u s t r a t i n g the contour t r a c e . Search f o r the c h a r a c t e r b e g i n s a t S. The CONTOUR mode begins a t C  10  CODE and COORD words f o r "C"  14  3 4  6  Some d e c i s i o n boundaries between two p a t t e r n - c l a s s - m e a n s i n two d i m e n s i o n a l space  22  A l p h a b e t s from seven p e r s o n s . The s i x t h and seventh a l p h a b e t s a r e from PERSONS 1 and 2, r e s p e c t i v e l y  29  6  M i s c l a s s i f i c a t i o n and r e j e c t p r o b a b i l i t i e s i n per c e n t . .  34  7  Scans and d e c i s i o n t r e e s f o r r e s o l v i n g c h a r a c t e r confusions  36  5  8  9  10  Per cent e r r o r p r o b a b i l i t y as a f u n c t i o n o f ds f o r the 4-PAD t e s t i n g data  38  Per cent e r r o r p r o b a b i l i t y as a f u n c t i o n of the o r d e r o f c o n t e x t u a l i n f o r m a t i o n used  39  I l l u s t r a t i n g a technique f o r c l o s i n g broken  characters  s m a l l breaks i n 45  v  ACKNOWLEDGEMENT G r a t e f u l acknowledgement i s g i v e n of Canada, The  Defence Research Board of Canada, and  Telephone Company f o r the f i n a n c i a l support and DRB  to the N a t i o n a l Research C o u n c i l  2801-30, and  the B r i t i s h  Columbia  r e c e i v e d under Grants  NRC  A-3308,  f o r the Graduate S c h o l a r s h i p , r e s p e c t i v e l y .  I would l i k e  to thank Dr.  Robert W.  t h i s p r o j e c t , f o r h i s generous c o u n s e l , and  Donaldson, the s u p e r v i s o r of  constant  a v a i l a b i l i t y and  encourage-  ment. I would l i k e and  to thank Dr. M i c h a e l  f o r h i s h e l p f u l suggestions.  I am  P. Beddoes f o r r e a d i n g  the h a n d p r i n t e d  recognition tests.  I am  for participating  a l s o g r a t e f u l to Messrs. J . Cavers, M.  f o r the many i n t e r e s t i n g d i s c u s s i o n s over N+1  and  U n i v e r s i t y of B r i t i s h  c h a r a c t e r samples and  I would a l s o l i k e  to thank my  f o r o p e r a t i n g the p r o j e c t o r d u r i n g  manuscript  g r a t e f u l to the graduate s t u d e n t s  the E l e c t r i c a l E n g i n e e r i n g Department of The providing  the  of  Columbia f o r i n the human  I t o and  S.  Hussain  cups of c o f f e e .  w i f e , N.inozka, f o r her  encouragement  the t e d i o u s p r e p a r a t i o n of the  data,  Miss B e v e r l y Harasymchuk f o r t y p i n g the manuscript,-and Messrs. John C o s s a l t e r , Tom  F l e t c h e r , and Toomas Vilmansen, f o r p r o o f r e a d i n g  vi  the  manuscript.  1  I. 1.1  INTRODUCTION  Purpose o f Research The  of a simple  purpose of t h i s r e s e a r c h was t o :  (1) a s s e s s  the s u i t a b i l i t y  c o n t o u r - a n a l y s i s f e a t u r e - e x t r a c t i o n scheme f o r r e c o g n i z i n g hand-  p r i n t e d upper case  c h a r a c t e r s when the f e a t u r e v e c t o r i s operated  on by  classifiers  o f v a r y i n g degrees of c o m p l e x i t y ;  classifiers  above by u s i n g c o n t e x t u a l c o n s t r a i n t s and a d d i t i o n a l simple  class-dependent  (2) improve the r e s u l t s o f the  t e s t s to d i f f e r e n t i a t e commonly confused  c h a r a c t e r s ; and  (3) observe t r a d e - o f f s , i f any, between the number of measurements,  complexity  of the b a s i c c l a s s i f i e r and o r d e r of c o n t e x t u a l i n f o r m a t i o n used when the a v a i l a b l e storage  1.2  i s limited.  Review o f P r e v i o u s The  Research  great s p a t i a l v a r i a b i l i t y  among samples from the same person,  of handprinted  c h a r a c t e r s , even  has l e a d many r e s e a r c h e r s  methods of f e a t u r e e x t r a c t i o n and c l a s s i f i c a t i o n . p r e p r o c e s s i n g . i s of c o n s i d e r a b l e complexity  I n many o f these  and  [1-3].  Greanias  Some o f the r e s e a r c h e r s such as Kamentsky  et a l . [6] have c o n s i d e r e d  such as Roberts  [ 7 ] , Highleyman  unique  cases the  and the d i m e n s i o n a l i t y o f the  f e a t u r e v e c t o r s i s l a r g e , t y p i c a l l y g r e a t e r than 100, making difficult  to e x p l o r e  classification [ 4 ] , Bakis  et a l .  [5],  o n l y numeric c h a r a c t e r s e t s , and o t h e r s  [8] and Gr imsdale  e t a l . [9] used the same  data s e t f o r t r a i n i n g and t e s t i n g which i s now known to y i e l d  an o v e r o p t i m i s t i c  p r e d i c t i o n o f performance. In 1965 Clemens  [10, 11] d e v i s e d a r e l a t i v e l y  a l g o r i t h m to r e c o g n i z e m a c h i n e - p r i n t e d  characters.  simple  contour  tracing  The r e l a t i v e ease o f imple-  mentation o f the scheme and the low d i m e n s i o n a l i t y o f the f e a t u r e v e c t o r s , about 20, a t t r a c t e d r e s e a r c h e r s of h a n d p r i n t e d 1966  character recognition.  Chodrow e t a l . [12] r e p o r t e d poor r e s u l t s i n a p p l y i n g Clemens'  to a l p h a b e t i c h a n d p r i n t e d  characters.  In 1968 Munson [13] o b t a i n e d  typically  In  techniaue e q u a l l y poor  r e s u l t s u s i n g a s l i g h t m o d i f i c a t i o n of Clemens' t e c h n i q u e . r e s u l t s have been r e p o r t e d by Johnson et a l . [14], Munson Knoll  [16] u s i n g more c o m p l i c a t e d  a r r i v i n g at very  techniques,  but  s a t i s f a c t o r y r e s u l t s without  Since then,letter[13, 15]  and  t h e r e seems l i t t l e hope of  the use of c o n t e x t u a l  information  In the p a s t , the use of c o n t e x t u a l c o n s t r a i n t s to improve c h a r a c t e r r e c o g n i t i o n has been s c a n t and method and  the Markov approach.  character confidences of  l i m i t e d g e n e r a l l y to the d i c t i o n a r y look-up In 1959  Bledsoe  of a word t o g e t h e r w i t h  the--length i n q u e s t i o n  and  Browning  a vocabulary  to make a d e c i s i o n on the word.  [1] used  the  of E n g l i s h words Cornew  [17]  a p p l i e d an e x t e n s i v e d i c t i o n a r y look-up method to s p e l l i n g c o r r e c t i o n and Alter  [18] used a s e q u e n t i a l decoding  method f o r d e t e r m i n i n g  p o s s i b l e symbol sequences i s i n some sense most l i k e l y g i v e n the sequence a c t u a l l y r e c e i v e d . .  The  t h a t a d i c t i o n a r y of p r a c t i c a l s i z e takes up i n f o r m a t i o n i s used from the measurements. on the output category  which of many  to have been  disadvantages  intended  of these methods a r e  too much s t o r a g e , and not In f a c t  enough  these methods o f t e n  of a c l a s s i f i e r which c o n t a i n s the d e c i s i o n , r a t h e r than  l i k e l i h o o d s , f o r the i n p u t c h a r a c t e r .  operate the  The Markov approach i s based  on the assumption that the t r u e i d e n t i t y of a c h a r a c t e r i s dependent on t r u e i d e n t i t i e s of n e i g h b o u r i n g approach to  replace missing  most probable in  trigrams.  a machine-printed  d e c i s i o n and  characters.  Carlson  [19] used the Markov  c h a r a c t e r s from names on the b a s i s of  Edwards  and  Chambers  c h o i c e based on the two  hoods, a c t u a l l y r e s u l t i n g  i n a decrease  the  [20] used c o n d i t i o n a l bigrams  c h a r a c t e r r e c o g n i t i o n scheme,which uses the  a present  the  c a t e g o r i e s having  previous highest  i n performance f o r h i g h i n i t i a l  likeli recog-  n i t i o n accuracies. In 1966 use  of c o n t e x t  Abend  [21, 22]  theoretically  u s i n g compound d e c i s i o n t h e o r y .  solution is virtually  u n r e a l i z a b l e and  s o l v e d the problem of optimum P r a c t i c a l l y , however,  this  s i m p l i f y i n g assumptions have to be made.  One  way  to s i m p l i f y the problem i s to assume Markov dependence among  characters  to be  recognized  to make a d e c i s i o n on one  and  then use  c h a r a c t e r at  s e q u e n t i a l compound d e c i s i o n a  the problem i s to have as the c l a s s i f i e r e g o r i e s having Hart  [24]  the h i g h e s t  i n a handprinted  confidence.  the  time  output  [23].  Another way  theory  to s i m p l i f y  only a f i x e d number of c h o i c e c a t -  T h i s s i m p l i f i c a t i o n was  made by Duda~-  c h a r a c t e r r e c o g n i t i o n scheme u s i n g both syntax  and.semantics.  However, the study.was made o n l y on  the FORTRAN language  the c o n f i d e n c e s  used were not  likelihoods.  equal  to the c a t e g o r y  s t u d i e s need to be made on  the much more d i f f i c u l t  language such as E n g l i s h .  In a d d i t i o n , , s i m i l a r  some f e a t u r e e x t r a c t i o n schemes such as the one  problem of h a n d l i n g  d e c i s i o n algorithms  Scope of the  used i n t h i s  and  Similar natural  need to  be d e r i v e d f o r h a n d l i n g v a r i a b l e - l e n g t h f e a t u r e v e c t o r s which r e s u l t  1.3  and  from  thesis.  Thesis  In t h i s t h e s i s , the complete p a t t e r n r e c o g n i t i o n system shown i n Fig.  1 i s considered.  The  i n p u t to the  alphabetic characters, binary quantized t h i s case, types  system c o n s i s t s of upper case on 50 x 50 a r r a y s .  c o n s i s t s of an e n l a r g e r p r o j e c t o r and  t r a c i n g based on Clemens' t e c h n i q u e .  used i n the c l a s s i f i e r  stage  e i t h e r by  transducer,  keypunch o p e r a t o r s .  of f e a t u r e e x t r a c t i o n are performed on the d a t a ,  method of contour  The  handprinted  Two  f o r comparison, by Various  themselves or making use  algorithms of  in  a are  contextual  i n f o r m a t i o n or t e n t a t i v e d e c i s i o n making f o l l o w e d by a d d i t i o n a l f e a t u r e e x t r a c tion before  proceeding  The  to a f i n a l d e c i s i o n .  t h e o r e t i c a l aspects  of the techniques  r e c o g n i t i o n problem are t r e a t e d i n Chapter I I . contour  trace  not  treated before  Some a s p e c t s  i n the l i t e r a t u r e ,  f o r i t s implementation, are d i s c u s s e d reviews some of the simple  used to s o l v e the  i n s e c t i o n 2.3.  classification  techniques  regarding  i n c l u d i n g the S e c t i o n 2.4  pattern the  square,  equations briefly  used i n the past  and  describes  the necessary  Subsection author's  2.4.5 d e s c r i b e s  a non-Euclidean  k n o w l e d g e , no c h a r a c t e r  reported. not  changes needed t o h a n d l e v a r i a b l e l e n g t h  Subsection  observed  interesting  theorem and p r o o f  2.4.5.  classification designed. ture,  Subsection  f o r which  some o f t h e i m p o r t a n t  between t h e v a r i o u s  concerning  d e c i s i o n boundary and t h e n o n l i n e a r subsection  very  previously  interrelationships, and p r e s e n t s  an  between a l i n e a r  d e c i s i o n boundary o f t h e c l a s s i f i e r o f  2.4.8 d e s c r i b e s  to obtain  a p a r a l l e l - s e q u e n t i a l mode o f  the equation,  f o r t h e optimum d e c i s i o n on a s t r i n g  an.apparently  f o r which, to the  classifiers,  the equivalence  vectors.  t h e s c a n s a n d d e c i s i o n t r e e s o f s e c t i o n 4.2 w e r e  I t was n e c e s s a r y  length feature vectors.  classifier  r e c o g n i t i o n e x p e r i m e n t s have been  2.4.6 d e s c r i b e s  i n the l i t e r a t u r e ,  distance  feature  This  equation  not a v a i l a b l e i n the l i t e r a -  of m characters having  i s derived  i n subsection  variable  2.4.9 a n d  s u c c e s s f u l m o d i f i c a t i o n i s made f o r d e c r e a s i n g  t h e amount o f  computation. The e x p e r i m e n t s a r e d e s c r i b e d have been used w i t h data new  efficiently experimental  methods  above.  conclusions  relatively  i n Chapter I I I .  small data  sets; the f i r s t  and t h e second needs a v e r y  In the past,  two m e t h o d s  does n o t use t h e  l a r g e amount o f c o m p u t a t i o n .  A  p r o c e d u r e i s d e s c r i b e d w h i c h i s a c o m p r o m i s e b e t w e e n t h e two The r e s u l t s a r e g i v e n  a r e t r e a t e d i n C h a p t e r V.  i n Chapter IV and t h e d i s c u s s i o n and  5  II.  2.1  THE PATTERN-RECOGNITION PROBLEM  Introduction The g e n e r a l  pattern  r e c o g n i t i o n problem i s expressed  d i a g r a m i n F i g . 1.  The r e a l w o r l d  number  w h i c h a r e t o be c l a s s i f i e d  of patterns  in decision.space. a pattern the h i g h  The t r a n s d u c e r  data  s e t i s composed  converts  i n image s p a c e by d i g i t i z a t i o n dimensionality  a pattern  strong  intra-category similarities  and s t r o n g  inter-category  The c l a s s i f i e r  the help Let  the features  X* = ( x ^ , x ^ , . . .  •  Feature there  exist  dissimilarities  and f e a t u r e e x t r a c t o r a r e  a c t s on m e a s u r e m e n t  decision either solely  of contextual  space.  t h e m a p p i n g i n s u c h a way t h a t  difficult  with  into  e x t r a c t o r maps t h e i m a g e  measurement  I n some c a s e s t h e t r a n s d u c e r  a tentative or f i n a l  from t h e r e a l world  The f e a t u r e  among f e a t u r e v e c t o r s . t o separate.  infinite  and q u a n t i z a t i o n which c o n t r i b u t e t o  o f t h e image s p a c e .  i s e f f e c t e d by d e s i g n i n g  of a p o s s i b l y .  i n t o a f i n i t e number o f c a t e g o r i e s  s p a c e i n t o t h e , u s u a l l y much l o w e r d i m e n s i o n a l , extraction  i n the block  s p a c e a n d makes e i t h e r  on t h e b a s i s o f t h e f e a t u r e v e c t o r s o r  information. f r o m p a t t e r n c l a s s C. b e d e s c r i b e d • i  I f P ( C ) and P(x|c  by a v e c t o r  ) denote, r e s p e c t i v e l y , the p r o b a b i l i t y  o f C_^ a n d t h e p r o b a b i l i t y o f % c o n d i t i o n e d  o n C_^, t h e n t h e p r o b a b i l i t y o f  misclassification  i t o m a x i m i z e any monotone  i s m i n i m i z e d by choosing  increasing  function of R. = P(X"|C.) P ( C . ) = P ( x , x n  I n most h a n d p r i n t e d d of I  i s large; typically  learning  the s t a t i s t i c s  r e q u i r e s many x^  training  character  d > 100.  ,..,x,|C.)  P(C.) .  r e c o g n i t i o n schemes  the dimensionality  Even i f t h e components o f ^ a r e b i n a r y ,  o f 2 ^ " d i f f e r e n t p r o b a b i l i t i e s P(]^|C ) s a m p l e s a n d much s t o r a g e  of 5 are s t a t i s t i c a l l y  (1)  capacity.  f o r each i  When t h e d c o m p o n e n t s  independent, however,  P($|C.) P(C.) = X X  d n k=l I  T  P ( x , |C.) P ( C . ) K. X X  (2)  CONTEXTUAL INFORMATION  IMAGE  MEASUREMENT  SPACE  TRANSDUCER  DECISION  SPACE  SPACE  x, FEATURE  /a  FINAL  DECISION  CLASSIFIER EXTRACTOR  /A  /-- 7  TENTATIVE DECISION  Fig.  I  The g e n e r a l  pattern-recognition  problem.  /?  7  i n which case i t i s necessary f o r e a c h i . Two not  be  possibilities  strongly interdependent,  P(X|C  ) sufficiently  best  t o l e a r n and  t o assume  now  and  arise.  to use  (A)  i f there  w e l l when t h e  ( 2 ) , or  store only I f the  are not  i s not  are  P(x|c\)  i s known a s u b o p t i m u m c l a s s i f i e r w h i c h r e q u i r e s l e s s s t o r a g e  by  fewer samples than are needed to e s t i m a t e  c o n t e x t u a l i n f o r m a t i o n and  b o t h of w h i c h can 18,  23,  24].  essential only but  do the  ).  they  simplify  relative the  the  (B)  Unused s t o r a g e  can  classifier  independence of  d  small are  capacity  t h e n be  occupied  algorithms,  those  the x  's e n c o u r a g e s t h e u s e  and of  [13,  features  a t t r a c t i v e , because  i n terms of l e s s s t o r a g e  be  E v e n when  improvement i n r e c o g n i t i o n a c c u r a c y  T h u s , f e a t u r e e x t r a c t i o n schemes w h i c h r e t a i n  not  computation,  (2), further  problem.  Transducers  o n l y d i g i t i z a t i o n and p a t t e r n i n the actual world  roughly  divided into  directly  two  q u a n t i z a t i o n a r e p e r f o r m e d and  r e a l world  and  groups. the  i n t o measurement space.  transducing,  I n one  kind  d i f f e r e n c e between  i n image s p a c e i s m i n i m i z e d .  f e a t u r e e x t r a c t i o n i s done d u r i n g  contour  In the o t h e r  thus  mapping the  E i t h e r scheme c a n be  used  the  kind real  with  tracing.  Feature 2.3.1  Extraction Introduction  There e x i s t s of  P(X|C  more s o p h i s t i c a t e d c l a s s i f i c a t i o n  considerable  T r a n s d u c e r s can be  2.3  desirable.  f o r r e c o g n i t i o n while keeping  simplifying 2.2  effect  may  suboptimum c l a s s i f i e r w h i c h can  on  be  estimate  known, i t  trained  t h a n t h e o p t i m u m c l a s s i f i e r may  |C.) r  k  components  enough samples to  true d i s t r i b u t i o n  some o t h e r  the d terms P(x  features  for a given  justification  t o d a y no  general  theory  that w i l l  p a t t e r n r e c o g n i t i o n problem.  i n allowing designers  to use  an  optimal  Hence t h e r e  i s some  features which  yield  they  t h i n k might  set  8  extract has  the  significant  been a c h i e v e d  totypes  but  the  It the  through i n t u i t i v e  field  r e m a i n s an  i s w e l l known t h a t  i n f o r m a t i o n needed  extraction algorithms reasonably their but  legible  B and  D.  mostly  between the  X w h i c h was  final  latter  case,  simple  r e c o g n i t i o n of  contours Mason  11,  characters.  2 5 - 2 7 ] and  by  a f e a t u r e v e c t o r of  Contour All  easily  contour  The  Troxel  [28,  29]  t o be  of  the  first  introduced  X,  First,  i s not  random,  below  0 and  of  0,  conand  used to g e n e r a t e  a  a s i n g l e c h a r a c t e r , or on  t h e one  easy to implement, the  the b a s i s of to  effect  of  character  u s e d by  for recognizing  as  Clemens  X.  and  machine-printed  transformation  small dimensionality.  are  similar  t r a c e of the b l a c k - w h i t e  p u r p o s e and  traced  t o use  feature,  the b a s i s  transformation  i s a m o d i f i c a t i o n of  a r e , h o w e v e r , s e v e r a l ways o f i m p l e m e n t i n g t h e  patterns  on  distinguished solely  tracing algorithms  some m a n n e r , a s t e p w i s e  different  was  of  Tracing  contour  slightly  solely  B., K and  e i t h e r as  t h e unknown c h a r a c t e r .  I n a d d i t i o n to b e i n g  2.3.2  A and  most  observations.  c l a s s - d e p e n d e n t t e s t s were d e s i g n e d  into binary vectors  [10,  yields  not  two  The  experiments described  outside  then c l a s s i f i e d  of a group of c h a r a c t e r s  In t h i s  characters  pro-  [25].  between c h a r a c t e r s  example, i n the  Accordingly, a character's  binary vector  recognizable  success  from b i o l o g i c a l  10-12, 25-32],  t h e s i s a r e b a s e d on  Second, c o n f u s i o n For  ideas  some  of a p a t t e r n contain  i t [ 5 , 6,  are  the past  than a s c i e n c e  the contours  Roman c h a r a c t e r s  highly structured. arose  art rather  used i n t h i s  In  a p p l i c a t i o n of  to recognize  e x t e r n a l contour.  fusions  one  i n f o r m a t i o n from p a t t e r n s .  [6, 10,  circular  hexagonal t r a c i n g ,  29,  tracing.  31-33].  stepwise  effect,  I n h i s Ph.D.  in There  t r a c e each s e r v i n g  l o c a l p r o p e r t i e s of  Greanias  a simple, o p e r a t i o n  they  b o u n d a r y .of a p a t t e r n .  requiring different 26,  i n that  e t a l . [6] was  t h e s i s Clemens  requiring l i t t l e  the one  [10] logic.  a  L a t e r Mason and The  Clemens  [26] used the square t r a c e which i s used i n t h i s  a l g o r i t h m c o n s i s t s of making a l e f t  a right  turn upon e n t e r i n g a b l a c k square  t u r n upon e n t e r i n g a white square as i l l u s t r a t e d  R(x,y) be  the image m a t r i x  'zero' or  'one', where the s e t of  some p a t t e r n P(p  ,p  thesis.  i n F i g . 2.  composed of elements which can  take on  'ones' i s the q u a n t i z e d  and  Let  the  values  r e p r e s e n t a t i o n of  ,...,p ).  Definition Two  elements of R are adj acent  i f they d i f f e r by  1 i n one  of  their  coordinates. Definition Two both of t h e i r  elements of R are d i a g o n a l l y a d j a c e n t  i f they d i f f e r by  1 in  coordinates.  Definition A p a t t e r n P i s connected i f , and and p  p  , t h e r e e x i s t s a sequence of elements  = p. such t h a t p j m F  them w i t h  The  and  p  Equations  be  o n l y i f , g i v e n any  those  p ,-. are a d j a c e n t m+1  F  J  =  for 1 < m < — -  of the p r e s e n t  and  ^  +  the 2 - s t a t e  the f o l l o w i n g 3 - s t a t e  k  =  y  k  +  { [ x  k  1  X k  k- k-i x  tracing patterns thus combatting to  (4) needed to implement  equations  combat j i t t e r  Y^)  a n  ~(  x k  _^> Y^-l^  l o c a t i o n of the scanning  ] [ 2 R ( x  spot.  equations:  i n [10] .  The  i n the r e a l world  1 ] }  ( 4 )  than the hexagonal t r a c e , the  i t are s l i g h t l y more.complex  than  hexagonal method i s more s u i t e d to  contour  because p o i n t s a r e i n t e r r o g a t e d o n l y  once,  the. e f f e c t s of j i t - t e r . effects also.  (3)  k  k'V-  Although the square t r a c e i s g e o m e t r i c a l l y s i m p l e r (3) and  Cx^,  d  to compare  = ^+'ny -y _ ][l-2R( ,y )]}  1  ^k i  past  Let  a n  p^  Z—l.  d e s c r i b i n g the square t r a c e were d e r i v e d i n order  next l o c a t i o n on R i s g i v e n by  equations  elements  (p^»P2>•••»P^) where P j P ^  f o r hexagonal t r a c i n g i n [10].  the c o o r d i n a t e s  two  Troxel  [29] m o d i f i e d  the square t r a c e  One. advantage of the square t r a c e i s that i t  can  c o m p l e t e l y contour t r a c e c e r t a i n  only  d i a g o n a l l y adjacent  method. in  which cannot be  arbitrary.  square t r a c e .  above, and An  the  containing  traced  However, both schemes w i l l contour  the sense d e f i n e d  strictly  patterns  choice  s e c t i o n s which  c o m p l e t e l y by  t r a c e any  pattern  of the a l g o r i t h m  analogous theorem to that g i v e n  the  are  hexagonal  that i s connected  in this  thesis i s  i n [10] h o l d s f o r  the  11 Theorem If  the square t r a c i n g r o u t i n e i s s t a r t e d at the lowest  of a non-white column, always r e t u r n to  ( x , y ) , i t s f i r s t move i s to  (x-1, y) a f t e r  (x-1, y) and  t r a c i n g the o u t s i d e contour  black  point  the t r a c e  will  of a connected  p a t t e r n e x a c t l y once. T h i s theorem can be c o n s i s t i n g of o n l y one by adding o t h e r be g i v e n  square and  adjacent  s t a r t i n g with  a pattern  c o n s t r u c t i n g a g e n e r a l connected  b l a c k squares.  pattern  A f o r m a l p r o o f , however, w i l l  not  here.  2.3.3  Smoothing The  Clemens  i n d u c t i v e l y proved by  same two-dimensional h y s t e r e s i s smoothing technique  [10, 25,  26] was  used here.  L e t x^ and  x^-fi be,  used  by  r e s p e c t i v e l y , the s  present be,  and  future x-coordinate  r e x p e c t i v e l y , the p r e s e n t  trace. I f  described with  \  \ i -V '  v i =\ i - V '  <  2  t h e n  k + 1  >  -V \  +  ::  x  +  +  i  +  T  x  / 2  >  t  h  e  n  X  k 1 +  V> 2  r e l a t i v e s i z e of T^ character.  The and  The  t h e n  " k 1 X  +  i s the t h r e s h o l d d i s c u s s e d  extraneous extrema.  L e t x^ and  of the smoothed  x  ^ ^ +  character  three i n t e r r o g a t i o n s .  +  ± k± k i  2  trace.  2  +  i s used f o r the y d i r e c t i o n .  of the  future x-coordinate  a l g o r i t h m can be  <  where T^  and  character  The  " * I f  of the raw  s  +  <+i = v V> 2  i n subsection  effect  ( 5 )  2.3.5.  of t h i s a l g o r i t h m  A similar  operation  i s to smooth out a l l  s i z e of the extraneous extrema i s determined by T^ w i t h r e s p e c t  to the h e i g h t  and  the  width, r e s p e c t i v e l y ,  .12  2.3.4  Gestalt  Variables  V a r i a b l e s w h i c h do n o t p r o v i d e reconstruct important  fairly  t h e k i n d o f i n f o r m a t i o n needed t o  w e l l the o r i g i n a l p a t t e r n but which nevertheless  abstract  p r o p e r t i e s o f t h e s h a p e a s a w h o l e a r e known a s g e s t a l t v a r i a b l e s .  E x a m p l e s o f g e s t a l t v a r i a b l e s a r e , t h e number o f s i d e s i n a p o l y g o n , height the  to width  (Ii/W) o f a c h a r a c t e r , o r t h e r a t i o  square root of the area  successfully overlapping H/W  ratio  of a pattern  on m a c h i n e - p r i n t e d distributions.  characters which y i e l d e d four v i r t u a l l y  a n d s o P//K,  o f a s h a p e , was t r i e d  as w e l l .  as f a r a s o v e r l a p p i n g  distributions  n e e d e d f o r i t was t o o g r e a t i n separating  i t was d e c i d e d to-black  resulting  2.3.5  Feature During  point, repeats  very non- '  characters  which i s a measure o f the d i s p e r s i o n P//~Twas s l i g h t l y b e t t e r t h a n  H/W  a r e c o n c e r n e d , t h e amount o f c o m p u t a t i o n  to j u s t i f y  i t s improvement.  In addition  c a t e g o r i e s w h i c h d i d n o t need h e l p .  P//K  Accordingly  s c a n s o f s e c t i o n 4.2 from contour  Vector  to help with  t h e commonly  confused  t r a c i n g alone.  Formation  t h e SEARCH mode ( F i g . 2) t h e s c a n n i n g  spot  moves, p o i n t by'  from t h e b o t t o m t o t h e t o p o f t h e l e f t most column, and s u c c e s s i v e l y this  previously point,  Although  f o r handprinted  to  t o abandon g e s t a l t v a r i a b l e s and l o o k i n s t e a d toward t h e w h i t e -  transition  characters  of the perimeter  [ 3 0 ] . C l e m e n s u s e d H/W  I t was a n t i c i p a t e d t h a t  w o u l d n o t be s a t i s f a c t o r y  only helped  (P//A)  the  p r o c e d u r e on t h e c o l u m n i m m e d i a t e l y  scanned, u n t i l  the scanner enters  according  the f i r s t  black point  to the r i g h t i s found.  o f t h e column  Upon l o c a t i n g t h i s  t h e CONTOUR mode, i n w h i c h t h e s c a n n i n g  to the square t r a c e described  completes i t s t r a c e around the o u t s i d e  earlier  and t e r m i n a t e s  of a character  spot  moves  when i t  and r e t u r n s  to i t s starting  point. After  t h e c h a r a c t e r has been t r a c e d once t h e r e c t a n g l e  surrounding  13 the  character  size  d e p e n d s on t h e h e i g h t  research cipated due  i s divided into  only  e i t h e r four or s i x equal-sized subrectangles  and w i d t h  of the l e t t e r  t h a t l e s s i n f o r m a t i o n from t h e m i d d l e o f c h a r a c t e r s would be l o s t ,  quantization manner t h a t  a six-rectangle subdivision.  errors the subdivisions are assigned  threshold  equal  to one-half  to one-half  f o r a second time.  e i t h e r an x max  of  the scanning  spot.  tracing;  of t h e b i n a r y  coincides with  the order  1 denotes x-extrema, w h i l e  labels of the rectangles  which extrema f a l l  to the y  spot  t o a p o i n t one i s designated  co-ordinate  T h e s t a r t i n g p o i n t o f t h e CONTOUR mode i s r e g a r d e d  T h e CODE w o r d f o r a c h a r a c t e r  d i g i t s whose o r d e r  direction  equal  i s contour  of the scanning  extremum, t h e r e s u l t i n g p o i n t  o r x . . A n a l o g o u s comments a p p l y mm  A y  and an x t h r e s h o l d  and t h e c h a r a c t e r  e x t r e m u m a n d moves i n t h e o p p o s i t e  as  as.an x'. . mm  are defined  t o t h e number  and/or h o r i z o n t a l l y .  W h e n e v e r t h e x. c o - o r d i n a t e  t h r e s h o l d away f r o m t h e r e s u l t i n g  the  3 - b i t l a b e l s i n such a  each r e c t a n g l e ' s h e i g h t  each r e c t a n g l e ' s w i d t h  reaches a l o c a l  vector  To f u r t h e r r e d u c e  t h e Hamming d i s t a n c e b e t w e e n a n y two l a b e l s i s e q u a l  o f r e c t a n g l e s b e t w e e n t h e two l a b e l s , v e r t i c a l l y  contour  Fig. 3). In previous  t h e f o u r - r e c t a n g l e s u b d i v i s i o n h a s b e e n u s e d b u t i t was a n t i -  to quantization, with  traced  (see  whose  c o n s i s t s o f a 1 f o l l o w e d by b i n a r y i n which extrema occur 0 denotes y-extrema.  i n accordance with  CODE a n d COORD w o r d s f o r "C".  w o r d . g e n e r a t i o n appear elsewhere  o f t h e CODE a n d COORD w o r d s .  Further  details  [10, 1 1 , 25-29].  The o r d e r i n g  the rectangles i n  i n e v e n t s e q u e n c e c o n s t i t u t e s t h e COORD w o r d .  c o n s i s t s .of t h e c o n c a t e n a t i o n  during  relating  The f e a t u r e F i g . 3 shows  t o CODE a n d COORD  14  01  11  110  Xm ax\ Ymax /  A/  / / \  V.. . \ v  \ ^ Xmin  111  OYmin  Ymln\  \  /Ymax  101  1 0 0  Ymin Xmin  "*£ Am(2X >  00  Ymax  PYmax *&&Xmax  10  ooo  001  START "CONTOUR" MODE CODE  WORD  CODE  10111 COORD  1001010010 WORD  COORD  0 0,11,1100. 10 /  /  WORD  WORD  000, 111, 111, 111,110,000, 001,001,001,000  / ~ -v  Fig.  3  CODE and COORD words f o r "C".  15  2.4  Classification 2.4.].  Introduction  A great in  the f i e l d  the  parallel  deal  of theory  of c l a s s i f i c a t i o n or sequential  has been developed  theory  type.  t a k e n on a p a t t e r n  the  and t h i s  sification  [34] i n w h i c h a f t e r t a k i n g  pattern  b a s e d on t h e measurements Every c l a s s i f i e r  each measurement  themselves are i m p l i c i t l y  t a k e n up t o t h a t  or e x p l i c i t l y  I n p a s t work Clemens  section the basic  R discriminant  estimated  scheme  time.  ( nonparametric  tions g,(X), J.  g (X), z  classifiers  the d e c i s i o n surface  2.4.5  that  the true  category.  a l l feature  vectors  i twill  [13] a l l In  used, as w e l l  are looked  For R categories  there  a t from are  of which the l a r g e s t i s Given R . discriminant  b e t w e e n t h o s e two c a t e g o r i e s  For s i m p l i c i t y  (parametric  F o r c l a r i t y , more m e a n i n g f u l  g (X) , f o r any two o f R c a t e g o r i e s R  0  g_^(X) - g (X) = 0.  o f view [ 3 5 ] .  training)  t o the' f e a t u r e v e c t o r s .  yielding R discriminants  a l w a y s chosen as r e p r e s e n t i n g  of  of the  [ 1 2 ] , a n d Munson  parametric  are described.  function point  functions  clas-  i n which e i t h e r the d i s t r i b u t i o n s  [ 1 0 ] , Chodrow  form of three  t h e t a b l e look-uD method,  discriminant  classifier.  a d e c i s i o n i s made  c o m p a r i s o n , and g e o m e t r i c i n t e r p r e t a t i o n , t h e c l a s s i f i e r s the  type of  of  u n d e r g o e s some k i n d o f t r a i n i n g p h a s e , when t h e  applied a table look-up c l a s s i f i c a t i o n  as  this  The b u l k  i n w h i c h some p a r a m e t e r s o f t h e d i s t r i b u t i o n s a r e e s t i m a t e d  this  into either  a s e t of measure-  o r t o d e c i d e on t h e i d e n t i t y  d i s t r i b u t i o n s a r e n o t known a p r i o r i ,  training).  ten years  i s c o s t l y i t i s d e s i r a b l e t o use s e q u e n t i a l  t o t a k e a n o t h e r measurement  or  fall  a n d t h e n a d e c i s i o n i s made.  either  pattern  classifier  t h e s i s are concerned mainly with  When t a k i n g m e a s u r e m e n t s  the past  Most c l a s s i f i e r s  In the p a r a l l e l  ments i s f i r s t references  [27].  during  i , j the  i s given  equation  by  be a s s u m e d i n s u b s e c t i o n s  h a v e t h e same d i m e n s i o n a l i t y  func-  2.4.3  and t h a t a l l  -  16  c a t e g o r i e s have equal  a priori probabilities.  These s p e c i a l cases w i l l be  t r e a t e d i n 2.4.8 and 2.4.9.  2.4.2  The Table  Look-up Method  The t a b l e look-up method case of the F i x and Hodges N of t r a i n i n g p a t t e r n s  n o n p a r a m e t r i c method  2  to C  a r e pooled  zero-Hamming d i s t a n c e with  of a l l the zero-Hamming d i s t a n c e n  n  2  R  to C . R  and a search  i s made of a l l f e a t u r e  the a r b i t r a r y p a t t e r n .  feature vectors  Suppose t h a t  found,n^ belong to C ,  We then s e t g (X)'= x  n  ' g (?) = n 2  g (X) R  and  number  Nov?, to c l a s s i f y an a r b i t r a r y p a t t e r n )t, the f e a t u r e  i n the R t r a i n i n g subsets  vectors having  [36]. Given some f i x e d  i n each of R c a t e g o r i e s a t a b l e i s made up c o n t a i n i n g  the NR f e a t u r e v e c t o r s . vectors  ( a l s o exact match c l a s s i f i e r ) i s a s p e c i a l  = n  ±  (6)  2  R  a d e c i s i o n i s made by s e l e c t i n g the l a r g e s t d i s c r i m i n a n t .  I f no  zero-Hamming  ->-  distance  feature vectors  c a t e g o r i e s have some s e t up by u s i n g  a r e found then the p a t t e r n X i s r e j e c t e d .  nonuniform a p r i o r i d i s t r i b u t i o n then the t a b l e can be  t r a i n i n g subsets  of s i z e s p r o p o r t i o n a l to the a p r i o r i  p r o b a b i l i t i e s or the t a b l e can be c o n s t r u c t e d be m u l t i p l i e d by the a p r i o r i p r o b a b i l i t i e s . n p(C ) , n p ( C „ ) , . . . , X  X  lities  Z  Z  P(C^).  as above and the d i s c r i m i n a n t s can  Then n , n , n would become 1 2 R n p(C ) where p(C.) a r e e s t i m a t e s of the a p r i o r i p r o b a b i K  K  1  0  1  C l e a r l y t h i s method i s an attempt to e s t i m a t e  P(x|c_^)P(C^) f o r i = l ,  R.  For a low r e j e c t r a t e a  samples a r e needed, e s p e c i a l l y w i t h p a t t e r n s handprinted  I f the  characters.  the v a l u e s of  l a r g e number of t r a i n i n g  of h i g h v a r i a b i l i t y  This requires a great deal of storage  such as  as w e l l as a  rapid-access  [35].  memory  2.4.3  This  The P a r a m e t r i c  method w i l l  be r e f e r r e d  t o as a l g o r i t h m ' 8 ' .  Bayes Approach ->  Given a feature  vector  X i t i s desired  t o maximize the a  posteriori  , -> P ( C |X) o v e r i o r a n i n c r e a s i n g m o n o t o n i c f u n c t i o n o f i t a s i n ( 1 ) .  probability From  ( 1 ) and ( 2 ) , a s s u m i n g e q u i p r o b a b l e c h a r a c t e r s  feature  vectors  and t a k i n g l o g a r i t h m s  a l l with  d  we get. t h e d i s c r i m i n a n t  dimensional function  d  g. (X) =  R  P^^lc^) f° i l >  where t h e  r  distributions  P(x|c_j.).  R  =  a n  as  logic  a linear  1  ^ k = l , . . . ,d a r e t h e p a r a m e t e r s o f t h e  When a d i s c r i m i n a n t  boundary i s a h y p e r p l a n e and i t can be v e r y shold  (7)  E l n P ( x . |C.) k=l  1  function easily  i s linear, i t s decision  implemented i n terms o f t h r e -  elements  [35].  equation  i n terms o f t h e components o f X as  . Any l i n e a r  discriminant  f u n c t i o n can be expressed  d g. (X) =  I  co  k=l  1  x  »  k  k  + co. 1  >  (8)  d + 1  where t h e o r i e n t a t i o n o f t h e h y p e r p l a n e i s d e t e r m i n e d by t h e x^eights co. , f o r k = l , w e i g h t co_j, [ ^ ' c  +  d, and t h e p o s i t i o n o f t h e h y p e r p l a n e i s d e t e r m i n e d by t h e I t c a n b e shown  [37]  that  f o rbinary  measurements  (7)  can  be w r i t t e n i n t h e f o r m o f ( 8 ) w h e r e w  = i,d+l  d E  l n P ( x . =0|C.) k i 1  and co. , = l n P ( x = 1 C.) l ,k k I 1  - l n P ( x = 0 C . ) . k I 1  (9)  18  2.4.4  Minimum E u c l i d e a n In the Euclidean  Distance  f r o m t h e Means  distance c l a s s i f i e r  a m e t r i c , monotonic to t h e E u c l i -  dean d i s t a n c e s u c h as t h e s q u a r e o f t h e E u c l i d e a n G\ , i = l ,  R [35, 3 8 ] . This m e t r i c  t h e mean o f e a c h c a t e g o r y  X  ..  i s taken  d i s t a n c e , i s minimized  between t h e g i v e n  L e t t h e mean X  . = ( x , . ,x  C..  J  A d e c i s i o n i s t h e n made b y  p a t t e r n X and  . , , . , x , .) w h e r e  1,X 2 , i ' d,r feature vectors f o r category  m,:i_ m,i x, . i s t h e mean o f t h e k ' t h c o m p o n e n t o f t h e t r a i n i n g k, :i. b  over  minimizing  2 I ( x , - x .) k=l ^ w h i c h can be p u t i n t h e l i n e a r d i s c r i m i n a n t f u n c t i o n form o f (8) where d  W  i , d  +  X  1 = - |  tHx =l|c.)]  (10)  2  k  k=l  and W  i,k  =  P  \  (  =  1  l i>  x-jhen t h e m e a s u r e m e n t s a r e b i n a r y v a l u e d . derived  i n this  2.4.5  A Non-Euclidean Distance difficult  distance  i n classification  signals  E k=l  be  i n general,  to a linear  theory  b u t used i n  i s the decoding distance described i n  _ |x  - x  |.  (12)  '  (12) implements a n o n l i n e a r  shown i n t h e n e x t s u b s e c t i o n  valent  n o t be  i s d e f i n e d as d  Although,  (11), however, w i l l  Classifier  to find  s e q u e n t i a l decoding f o r telephone This  Equation  thesis.  A metric  [39].  (11)  C  d e c i s i o n boundary, i t w i l l  t h a t , f o r b i n a r y measurements, i t i s e q u i -  discriminant function closely  related  t o (7)  19  2.4.6  Some I m p o r t a n t I n t e r r e l a t i o n s h i p s For  the three  mean p o i n t  f o r each  categories  C^,C  respect  category  there  lies  (7), (10) a n d (12) t h e p r o t o t y p e  somewhere i n s i d e a h y p e r c u b e . .  feature vectors.  ( 7 ) , (10) a n d (12) u n d e r  following  or  F o r a n y two  a r e two mean p o i n t s w h i c h c a n h a v e c e r t a i n s y m m e t r i e s  t o t h e s i d e s and v e r t i c e s o f t h e h y p e r c u b e  tions of the binary between  classifiers  i n t e r e s t i n g theorem  with  depending on t h e d i s t r i b u -  Certain interesting relationships arise  c e r t a i n t y p e s o f symmetry.  to c l a s s i f i c a t i o n  theory  In addition, the  i s proved.  Theorem The minimum d e c o d i n g d i s t a n c e implements  a nonlinear  classifier,  d e c i s i o n boundary  (12), w h i c h , i n g e n e r a l ,  i s equivalent  f u n c t i o n when t h e m e a s u r e m e n t s a r e b i n a r y  to a l i n e a r  and t h e c a t e g o r i e s  discriminant  have equal  a  priori  probabilities.  Proof For b i n a r y  measurements x, K  . = P(x =1|C.). j  The minimum d e c o d i n g d i s t a n c e w r i t t e n as  1  K.  (13)  X  classifier  f o r equiprobable  categories  c a n be  ^ M  - { X  E  |x -P(x =l| k  k  C i  )|}.  K.— X  When x =1, |x - P ( x = l | c . ) | = P ( x = 0 | c ) . K.  When x =0, K.  rC  |x K.  tC  X  K.  X  - P ( x =l|c.)| = P ( x = l | c ) . K.  X  rC  X  . . (14) c a n be w r i t t e n as Min i  { E i kea  P ( x =0|C.) + E P ( x =1|C.)} k ' l . , k I keb  w h e r e a i s t h e s.et o f k ' s f o r w h i c h x =1 k and b i s t h e s e t o f k ' s f o r w h i c h x  =0. K.  (15)  20 Since  E P ( x =0 C ) i s i n v e r s e l y monotonic i n i t o , k I '  E - P ( x =0 C.) and t o k l  Y. [1-P(x = 0 | c . ) ] and s i n c e E [1-P(x = 0 | c . l = E P ( x = l | C ) . Ic 1, k w r i t t e n as k  1  k  Max .  1  (15) can be  k  { E P ( x =1 |C.) + E P ( x =0 |C.) } , it 1 , .. K 1 k ea k eb  1  which i s e q u a l t o the d i s c r i m i n a n t f u n c t i o n g (X) =  d E P(xJC,). k I k=l  (16)  1  (16) can be w r i t t e n as g (X) ±  =  E k=l  [x P(x =l|C.) k  + (l-x )P(x =0|C )]  k  k  k  (17)  i  Recombining terms y i e l d s d  g.(X) 1  -  E [P(x =1|C.) - P ( x =0|C.)]x, + E P ( x = 0 | C . ) , - i K l K . 1 K , . , K X k=l k=l  v?hich i s a l i n e a r d i s c r i m i n a n t f u n c t i o n . 'TL'.  Note i t s s i m i l a r i t y t o ( 7 ) .  Q.E.D.  (18)  T h i s a l g o r i t h m w i l l be c a l l e d  I t can be generated by t a k i n g t h e a n t i -  l o g a r i t h m o f every term i n the summation o f ( 7 ) . The r e l a t i o n s h i p between ( 7 ) , ( 1 0 ) , (12) and TL can e a s i l y be observed when t h e symmetries a r e viewed  i n two-dimensional  f o l l o w i n g symmecries be d e f i n e d f o r any i and j , i ^ j : = P(x =0 C )  P(x =l  1  1  J  'A' symmetry P(x =l C ) = P(x =l l 2  2  P(x =l c ) 2  'B' symmetry  J  1  1  J  2  J  X  P ( x = l C ) = -p( =o c . ) 2  1  X;L  J  P ( x = l c.) = P ( x = 0 c . ) 1 J P(x =0 c.) = P ( x = l c.) x  x  'D' symmetry  2  P ( x = l c.) = P(x =0 c.) x  symmetry  = P(x =0 c.)  P ( x = l c.) = P ( x = l c . ) x  'C  1  2  1  2  J  space.  Let the  21  It  can be shown t h a t  (10) and TL implement the same d e c i s i o n l i n e when e i t h e r  A, B, C or D symmetries a r e observed,  ( 7 ) , (10) and TL implement  the same d e c i s i o n  l i n e when e i t h e r A, B or C symmetries a r e observed, and ( 7 ) , ( 1 0 ) , (12) and TL all  implement  the same d e c i s i o n l i n e when A or B symmetries a r e observed.  If  none o f the. above symmetries a r e observed a l l f o u r d e c i s i o n b o u n d a r i e s a r e d i f f e r e n t but some may s t i l l For example,  effect  the same d e c i s i o n f o r b i l i a r y f e a t u r e v e c t o r s .  (12) and TL always implement  the same d e c i s i o n f o r b i n a r y  v e c t o r s , and ( 7 ) , ( 1 0 ) , (12) and TL do so under C symmetry. be extended to t r e a t hyperspace. are l i k e l y to occur for  algorithms.  These i d e a s can  i n r e a l i s t i c problems these  i n f r e q u e n t l y i t i s reasonable,  the t h r e e d i f f e r e n t  boundaries,  Since  feature  symmetries  to expect d i f f e r e n t  results  F i g . 4 shows an example o f the d e c i s i o n  implemented by ( 7 ) , ( 1 0 ) , (12) and TL, f o r the two-dimensional  case when the above symmetries a r e n o t observed.  2.4.7  Handling In o r d e r  to be made i n o r d e r  Feature  Vectors  t o implement  ( 7 ) , (10) o r (12)., c e r t a i n m o d i f i c a t i o n s have  to handle v e c t o r s i n m u l t i s p a c e ,  arises i n character recognition. problem a r e taken.  i n Multispace  L e t x,  . .  In t h i s  a problem which seldom  t h e s i s t h r e e approaches t o the  P(x, |C.,L.) and. L. stand  k,i,j, k ' t h component o f the i ' t h c a t e g o r y  k  1  j  I  f o r the mean of the  3  y i e l d i n g f e a t u r e v e c t o r s o f the j ' t h l e n g t h ,  the p r o b a b i l i t y o f the k ' t h component c o n d i t i o n e d on c a t e g o r y and the j ' t h l e n g t h , r e s p e c t i v e l y . L. .  i of length j ,  In a d d i t i o n l e t M be the maximum v a l u e o f  3  In the f i r s t have l e n g t h s  equal  approach f e a t u r e v e c t o r s of l e n g t h L^ < M were made to  to M by s e t t i n g x^ = 0 f o r L  < k <_ M.  Index i was then  chosen to maximize W. and minimize Y., where i i  .M W. = 1  E lnP(x, |C.) + lnP(C.) k=l • k  1  (19)  22  2  x  Fig.  4  A  Some d e c i s i o n b o u n d a r i e s b e t w e e n two s i o n a l space.  M  Y. = l  probability  theoretic  bias  development,  terms  1  i was  G\  containing  then chosen  Eq.  (12)  TL  (10)  (7)  p a t t e r n - c l a s s - m e a n s i n two  approach,  ( 7 ) and  (12), respectively, with a  In W the b i a s  from a  and  1  , only the discriminants  V.,  decision  reasonable.  g i v e n an unknown c h a r a c t e r y i e l d i n g  length L  t o m i n i m i z e U. L.  term f o l l o w s  term i s a r b i t r a r y but  a p r o t o t y p e o f t h e same L  U . = ^  dimen-  (20)  1  added.  feature v e c t o r of a p a r t i c u l a r categories  Eq.  _  i n Y the b i a s  In the second  Alg.  E Ix-x, .I- lnP(C.) , _ k k,x l k=l  A l g o r i t h m s W a n d Y a b o v e a r e t h e same a s priori  Eq.  a  f o r the  were c a l c u l a t e d .  Index  where  1  |x -x k  k j i j  .|  -lnP(C.)  (21)  L  -  1  V  =  E  1  Algorithms added.  (x -x  j  k  U a n d V a b o v e come f r o m  Again,  although  the bias  2  k,l,_)  )  - lnP(C  probability belonging in  third  terms a r e a r b i t r a r y ,  bias  terms  they a r e reasonable  and  priori.  approach c o n s i s t s o f maximizing P(XC.L.), the j o i n t  that a feature vector  to category  (21) and (22) o n l y  totype  (22)  (12) and (10) r e s p e c t i v e l y w i t h  e a s y t o i m p l e m e n t b e c a u s e t h e P ( C ) a r e known a The  ) 1  c o m p o n e n t s and  X has L  C^,, u n d e r s t a t i s t i c a l  comes f r o m a  i n d e p e n d e n c e among c o m p o n e n t s .  the discriminants f o r the categories  o f t h e same L . w e r e c a l c u l a t e d .  character  containing  a  As  pro-  I n d e x i was t h e n c h o s e n t o m a x i m i z e  3  T., w h e r e X  T. = 1  Li . 3  E l n P ( x . |C.,L.) + l n P ( L . | C . ) + l n P ( C . ) k=l 3  2  (23)  1  -y  Equation  (23) f o l l o w s d i r e c t l y  from P(XCJL ) under s t a t i s t i c a l  i n d e p e n d e n c e among  components. 2.4.8  A Parallel-Sequential Classifier I n t h e p a r a l l e l - s e q u e n t i a l mode o f c l a s s i f i c a t i o n  formerly  t a k e n as f i n a l  i n algorithms  some o f t h e d e c i s i o n s  T a n d U, w e r e t a k e n a s t e n t a t i v e a n d  the  scans of F i g . 7 ( a ) , f o r a l g o r i t h m  the  d e c i s i o n t r e e s o f F i g . 7 ( b ) . The s c a n s a n d d e c i s i o n t r e e s o f F i g . 7 a r e  explained  2.4.9  of  Compound D e c i s i o n s  solely  This  on S t r i n g s o f  discussed  on t h e b a s i s  those characters.  characters  to  i n s e c t i o n 4.2.  In the algorithms characters  U, w e r e t a k e n s e q u e n t i a l l y a c c o r d i n g  Dependent  s o f a r a d e c i s i o n was made o n  individual  o f t h e measurements and t h e a p r i o r i p r o b a b i l i t y  mode o f d e c i s i o n i g n o r e s  may h a v e i n a g i v e n  Characters  sequence.  the dependencies  that  I t i s d e s i r a b l e t o make u s e o f t h e s e  d e p e n d e n c i e s by m a k i n g d e c i s i o n s on a s t r i n g  of characters  rather  t h a n on  characters  individually.  d e c i s i o n on a s t r i n g derived using  In this  subsection  the equation  f o r t h e optimum  of characters, given variable-length feature vectors, i s  certain underlying  assumptions, and a procedure f o r  reducing  t h e amount o f c o m p u t a t i o n i s g i v e n . We w i s h characters  t o maximize the a p o s t e r i o r i p r o b a b i l i t y  g i v e n by  P(c|,...,C^,L^,...,LJ 1  where on  i s the category  i values,  can  take  o f a sequence o f m  m  l  m  1  |X  ...., X )  1  of the n'th character  (24)  m  i n t h e sequence and c a n take  i s the length o f the feature vector o f the n'th and X  on i v a l u e s , .  i s the feature vector  n  we do n o t c a r e w h a t l e n g t h s  are associated with  n  of the n'th  (24).  Using  m  Since  1  C^, C^, 1 2  C m 1  B a y e s ' r u l e ( 2 4 ) becomes  P(X ,...,X |c .•.,C ,L ,•. 1  character.  the decisions C , the n  optimum d e c i s i o n i s a r r i v e d a t b y s e l e c t i n g t h e c a t e g o r i e s which maximize  c h a r a c t e r and  1 >  m  1  , L  M  )  P(C ,...,C ,L ,..•,L 1  m  1  M  )  (25)  E x p a n d i n g t h e n u m e r a t o r o f (25) and d e l e t i n g t h e d e n o m i n a t o r b e c a u s e i t does not  depend on i y i e l d s  P(2 ,./.,X |c^,...,^ 1  Assuming 1)  .  m  (26)  that, X ^ i s independent o f X ^ ,  and  f o r n=l,...,m; h = l , . . . , m ; £=l,...,m; n^h^£ , 2)  L i s independent o f L , and C , n h h f o r n = l , . . . , n ; h=l,...,m; n ^ h ,  (26)  c a n be w r i t t e n as f o l l o w s : m n  — n=l  _ P ( X  . | C \ L  J  .  m  )  n  n ' n n  P ( L  ., n=l  J  | C  1  ) P ( C ^ , . . . , C  n'n  1  1  ) .  m  (  2  7  )  Taking  n a t u r a l logarithms y i e l d s m  m  E lnP(X  , n=l  n  . 1 + E lnP(I. IC ) + InP(c'\ . . . , 0 ^ , n'n 1 m n=l  \C ,L ) X  n  1  .  J  3  n  I f i t i s now assumed t h a t the components o f X_ = (x ,x , ...,x  (28)  .) a r e independent,  n o f m c h a r a c t e r s i s made by m a x i m i z i n g Ij  then t h e optimum d e c i s i o n on t h e sequence  G ( X . . . , X ) over R™ p o s s i b l e sequences, where 1 m . l5  m  G(X.,...,$ ) = I  m  E  n=l  L-  .  1  [ E  . .  lnP(x. I C , ^ ) + ]*P (Ir I C ) ]  n  1  , _ k=l  k ' n n  1  1  n'n  + lnPCC^",...^ ). 1 m  (29)  1  One can v a s t l y reduce t h e computation of b r a c k e t e d ,  time by assuming t h a t i n an o r d e r e d  list  [ ] , terms i n (29) over C , L , t h e c o r r e c t i d e n t i t y o f an unknown n n 1  3  c h a r a c t e r has a h i g h p r o b a b i l i t y o f b e i n g i n the top ds e n t r i e s o f t h e o r d e r e d l i s t , where ds i s c a l l e d t h e 'depth o f search' and G. i s maximized over ( d s ) sequences. 2.5  m  F o r bigrams m-2 and f o r t r i g r a m s m=3.  T r a i n i n g and Storage  Requirements  P r o b a b i l i t i e s P ( X | C . ) , P ( x (c.)  and P ( x |C.,L.) were e s t i m a t e d o r K. X J t h e r e l a t i v e number o f times a v e c t o r X o r component  1  l e a r n e d by d e t e r m i n i n g  X  rC  x, o c c u r r e d , g i v e n the event C. o r t h e -joint event C.,L.. k x i j J  S i n c e we do n o t  know t h e p r o b a b i l i t y d i s t r i b u t i o n s o f t h e p r o b a b i l i t y v a l u e s themselves  we  .cannot make optimum e s t i m a t e s o f those v a l u e s and we a r e j u s t i f i e d i n u s i n g the maximum l i k e l i h o o d e s t i m a t e s d e s c r i b e d above. A l g o r i t h m S must l e a r n and s t o r e P(X|C.)P(C.) f o r v i r t u a l l y a l l X T needs o n l y P(x. | c . , L . ) , P(L.|C.) k i J J i I n a d d i t i o n t o P(C_^), f o r b i n a r y measurements, U and V need  which occur i n a t e s t s e t . and P(C_^).  Algorithm  P(x. |C.,L.) w h i l e W and Y need P(x, | c . ) . k ' l j k ' l P ( L . | c . ) and P ( C ^ , . . . , C ) . 1  _J  X  X  TTI  A l g o r i t h m G needs P(x, |C.,L.), k ' l j  Note t h a t , under the assumptions made, G l e a r n s the  same a b o u t t h e m e a s u r e m e n t s priori  probabilities  as does T b u t needs t o s t o r e  and n e e d s  R  instead  of R a  t o s e a r c h o v e r t h e same number o f d i s c r i m i n a n t s .  III. 3.1  EXPERIMENTS  Introduction Although  the whole system of  F i g . 1 can  t o o much c o m p u t a t i o n w o u l d be  duplicated  Accordingly,  the  The  contour  cards  each  s e c t i o n of  t r a c e r was  containing  containing  the  simulated  a l l the  data  feature vectors  p e r f o r m e d o n c e f o r 4-PAD and simulated a file. all on  the  on  an  Since  training  a great  The compare the 4-PAD, 2)  deal  on  an  the  the  storage  and  individual with  e r r o r on t h a t on  i n the  that  the-  data,  T, stored  Hence the b i g r a m the  and  feature vectors,  reducing  algorithms  thus  the bigram  and  part.  e x t r a c t i o n scheme  1)  to  with  i n terms of  per-  e f f e c t s o f monogram,  to o b t a i n reasonable and  were  stored.in  for  c o n s t r a i n t s i n terms of performance, r e q u i r e m e n t s , 4)  cards  classifiers  thesis is five-fold;  to observe the  punched  was  T e x p e r i m e n t s were  f o r t h e most  classification  5)  to compare  classifier predictions performance  population.  f o r m a n c e o f a r e c o g n i t i o n scheme by  4-PAD and  same as  c o m p u t a t i o n and  In e a r l y experiments researchers  *  operation  f e a t u r e v e c t o r s were  l i k e l i h o o d s of  r e a l world  separately.  of which punched  A l l the  (6-PAD) f e a t u r e  r e q u i r e m e n t s , 3)  added s t o r a g e  p r o b a b i l i t y of  out  G i s the  operations  division  program,  experiments.  simulated  Hence t h i s  experiments i n t h i s  trigram contextual  complexity,  on  purpose of  t o compare the v a r i o u s  b i g r a m and  and  trigram s t a t i s t i c s .  of c l a s s i f i c a t i o n  s i x - p a r t area  f o r m a n c e and  the  been read,  for algorithm  i n one  IBM-7044 c o m p u t e r i n t o w h i c h  were o b t a i n e d .  t r i g r a m programs to c o m b i n a t i o n a l  of  an  o n c e f o r 6-PAD o n l y .  t r i g r a m programs operated o n l y saving  using  had  w i t h b i g r a m and  t h e many  r e c o g n i t i o n p r o b l e m was  l i k e l i h o o d s f o r each f e a t u r e v e c t o r  tape together  simulated  i n performing  IBM-360/67 c o m p u t e r w h e r e t h e the  be  6-PAD r e f e r t o f o u r and  i n the  field  predicted  o b t a i n i n g a sample of d a t a ,  s i x - p a r t area  the  per-  training  the  division, resnectively.  a l g o r i t h m on  t h a t sample  known as t h e R m e t h o d It  i s now  and  testing  i t on t h e same s a m p l e .  (resubstitution)  w e l l known  [42] .  [40] t h a t  o p t i m i s t i c p r e d i c t i o n of performance. large fixed  samples  s e t and a t e s t i n g  with  that  set.  the t e s t i n g  an o v e r p e s s i m i s t i c  Highleyman  t o as t h e H method  [42,  showed t h a t  c o n s i s t s of t r a i n i n g  s e t s h o u l d n e v e r be s m a l l e r Highleyman's  (holdout).  f o r s m a l l sample the c l a s s i f i e r  I n 1968  Lachenbruch  and  on a l l b u t one p a t t e r n  formance.  3.2  the data s i z e  training,  a variation  The m e t h o d and  repeating  this  pro-  pattern. training  t o as t h e U m e t h o d .  o f t h e U method i s used t o p r e d i c t  i t s advantages  are described  i n section  per-  3.3.  D e s c r i p t i o n of the Data A total  o f twenty upper  case a l p h a b e t s were o b t a i n e d  d i f f e r e n t p e r s o n s , a l l o f whom w e r e r e q u i r e d paper  performance  i s very s m a l l , a great deal of  i n t h i s method r e f e r r e d  thesis  Mickey  i n t h e s e t and  e v e r y p a t t e r n o f t h e d a t a has been used as a t e s t  this  training  [401. T h i s p r o c e d u r e i s  cedure u n t i l  c o m p u t a t i o n i s needed  the performance,  m e t h o d b r e a k s down a n d  on t h e p a t t e r n l e f t  Needless to say, unless  of the set  s i z e s a much b e t t e r e s t i m a t e o f  out d u r i n g  over-  for very  than the  then t e s t i n g  In  a biased  [ 4 1 ] showed t h a t  e s t i m a t e of performance  referred 43]  procedure y i e l d s  set i n order to best predict  For s m a l l d a t a s e t s , however,  yields  this  o f d a t a t h e r e e x i s t s an optimum p a r t i t i o n i n g  into a training the r e s u l t  T h i s method i s  i n s q u a r e s 0.40  p e n was  used.  i n . h i g h by 0.50  E a c h p e r s o n was  c h a r a c t e r unbroken.  of magnification  on g o o d  f r o m Z-A.  Two  A Paper Mate  t e n was  three  lower case alphabets.  used  to p r o j e c t  each  An  graph  thin-felt-tip  t h e a l p h a b e t once  persons p r i n t e d  seven  quality  asked to leave the o u t s i d e contour of  a l p h a b e t s by c o p y i n g t h r e e , r a n d o m i z e d projector  i n . wide.  A l l seven persons p r i n t e d  shown i n F i g . 5, a n d o n c e  to p r i n t  from  each  f r o m A-Z,  additional enlarger-  c h a r a c t e r onto  an  as  29  1. --]-  I  i I  . 1 .  i  H1^ i I i 1 I. i  l•I  i ! ! , I ; • i ! I i i i ; I i •; I  .1 •  JJ—U |  T  J  I i  • i I ; I !  j ; » I ;  j  • i  • M  I • ; I  J > ! ' '•  I -1-—  i.L ;  ! I  ,  ! I  T_7 r-l-  in •i.J-.l.._.U',-',  . 1.1  I. i _  I.! I l i I,!  ! I.  i i i | ! ! !...-.. I . L i  rn-r-rrn T  j L_J_  __! i _  — T r ? - " T r n n - r r — n ~-|~; -  -4 4-r  A'  I. I  1  -n-r-  ! ••  1  1  1  r  .u^p T  :  , r  .. .-j .-; l  r  (  j y-  'j- ' '  r  fete •-i  ,1 J  :l!.  , -.~'|—j~r'i i ' : • iJ •.  -i_ :•.  J 1  1  •  1  i  1  ;  1  1  1  i.: i ! ! i I i,  I  1  !  .  ;  '  1  !.....  ..!...  1  • 1 i • !  /<  1  '  1  ; •. ! .  ! : I  , : i > t . i •i •: • • I  I..  '.J  Fig.  5  I  : .  M A/  \s  i !  I  A l p h a b e t s from seven p e r s o n s . The s i x t h and seventh a l p h a b e t s a r e from PERSONS 1 and 2, r e s p e c t i v e l y .  30  area d i v i d e d was  into  2500 0.1 i n . s q u a r e s .  b l a c k t h e s q u a r e was b l a c k e n e d  I f more t h a n  a n d a s s i g n e d a v a l u e 1.  s q u a r e was l e f t w h i t e a n d a s s i g n e d a v a l u e 0. ter  was t h e n e n c o d e d o n t o The  legibility  students to i d e n t i f y student responded at  c l o s e range  which  lasted  averaged  over  o f t h e d a t a was e s t i m a t e d b y a s k i n g 10  t o c h a r a c t e r s viewed  f o r as l o n g as d e s i r e d .  approximately  12 m i n u t e s  Averaging  per student.  over a l l 10'viewers  experiments  using the f i r s t  a l p h a b e t s ' f r o m each  were averaged i+l,  o f a few p e r cent would t h e r e f o r e r e s u l t  PERSONS 7 a n d 1 w e r e u s e d f o u r a l p h a b e t s were used r e s u l t s were a v e r a g e d  over  two a l p h a b e t s f r o m  trials.  the t e s t d a t a f o r each  f o rtesting. fortraining  trial.  from weaknesses i n  t h e p o p u l a t i o n (POP) w e r e  f i v e persons  In the i ' t h t r i a l ,  f o rtesting.  five  recognition  Characters  on t h e t e s t d a t a o v e r  i = l , 2 , . . . , 6 , were used  alphabets  or both.  o f t h e two r e m a i n i n g p e r s o n s  over'seven  accuracy  a n d o v e r a l l 10  Machine  feature extraction or c l a s s i f i c a t i o n ,  two  persons  9 9 . 8 % and 1 0 0 % , r e s p e c t i v e l y .  either  performed  The r e c o g n i t i o n  the five  on I n d e p e n d e n t  Each  i n random o r d e r  and o v e r  i n excess  The  graduate  A l l v i e w i n g was d o n e a t o n e s e s s i o n  errors  Experiments  charac-  quantized characters.  one-at-a-time  two a l p h a b e t s o f a l l s e v e n  f r o m PERSONS 1 a n d 2 y i e l d e d  3.3  quantized  IBM p u n c h c a r d s b y e x p e r i e n c e d k e y p u n c h o p e r a t o r s .  verbally  v i e w e r s was 9 9 . 7 % .  area  Otherwise the  Each s p a t i a l l y  a l l 520 m a g n i f i e d . s p a t i a l l y  the f i r s t  50% o f any s q u a r e ' s  trials  f o r t r a i n i n g , and  f o rtesting.  The r e s u l t s  d a t a f r o m PERSONS i a n d  I n the seventh  t r i a l data  from  F o r t e s t s on d a t a f r o m one i n d i v i d u a l , and t h e r e m a i n i n g one f o r t e s t i n g . i n which  This procedure  a different  alphabet  The  s e r v e d as  f o r e s t i m a t i n g the performance i s  a compromise between t h e H method and t h e U method  [40].  Although  w o u l d be a b e t t e r  e s t i m a t e and w o u l d p r o b a b l y p r e d i c t a b e t t e r  t o o much t r a i n i n g  computation  t h e U method  performance,  would be r e q u i r e d w i t h t h e d a t a s e t above.  31  The m e t h o d u s e d h e r e , than It  i n addition  t h e H method, i n v o l v e s f a r l e s s  t o b e i n g more e c o n o m i c a l  training  computation  than  on d a t a  t h e U method.  i s , t h u s , a r e a s o n a b l e method t o u s e and i f a n y t h i n g p r o b a b l y y i e l d s a  slightly  n e g a t i v e l y b i a s e d estimate o f performance.  t r a i n i n g data used.  i n t h e POP t e s t s  I n t e s t s , on a l p h a b e t s  the f i r s t  were  a l p h a b e t s were  i . e . , the probability  of error  used.  likelicondi-  To c a l c u l a t e  P ( e ) , f o r each a l g o r i t h m , t h e e s t i m a t e s o f  into 26  1  Z P(e|c.)P(C.)  P(e) =  i=l For the experiments 1/26  each p e r s o n  , f o r a l g o r i t h m s S, T, U, V, W a n d Y.  of error,  P(e|c.)were substituted  from  on t h e  d e s c r i b e d a b o v e y i e l d e d maximum  h o o d e s t i m a t e s o f B(e|c ), i = . l , 2 , . . . , 26,  the t o t a l p r o b a b i l i t y  two a l p h a b e t s  f r o m PERSONS 1 a n d 2 a l l f i v e  The e x p e r i m e n t a l p r o c e d u r e s  t i o n e d on p a t t e r n c l a s s  For experiments  on e q u i p r o b a b l e  1  c h a r a c t e r s ( E Q P ) , P ( G \ ) was made e q u a l t o  f o r a l l i i n (30) and i n t h e c l a s s i f i c a t i o n  experiments  on E n g l i s h a p r i o r i  (30)  1  probability  algorithms.  For the  c h a r a c t e r s (ENG), 5 - d e c i m a l  digit  e s t i m a t e s o f t h e v a l u e s o f P(C^) were o b t a i n e d from P i e r c e [ 4 4 ] . 3.4  Experiments  on Dependent  Characters  A l g o r i t h m G, e q . ( 2 9 ) , was u s e d w i t h b i g r a m contextual constraints.  Maximum l i k e l i h o o d  (m=2) a n d t r i g r a m (m=3)  e s t i m a t e s f o r t h e b i g r a m and  t r i g r a m a p r i o r i p r o b a b i l i t i e s were o b t a i n e d by d i v i d i n g occurrence,  given i n Pratt  304 a n d 2,510 b i g r a m combinations  frequency of  t o t a l number o b s e r v e d .  and t r i g r a m e n t r i e s , r e s p e c t i v e l y .  of search  Accordingly,  ( d s ) was s o s m a l l t h a t a l l t h e p r o s p e c t i v e s t r i n g s o f  m c h a r a c t e r s were i l l e g a l , using context.  There were  A l l other possible  o f two a n d t h r e e c h a r a c t e r s w e r e c o n s i d e r e d i l l e g a l .  when t h e d e p t h  in  [ 4 5 ] , by t h e i r  their  a d e c i s i o n was made o n i n d i v i d u a l  However, t h e l i k e l i h o o d  t h e s e a r c h made by G r a p i d l y  characters without  o f o b s e r v i n g l e g a l bigrams and t r i g r a m s  i n c r e a s e s , v a r y i n g w i t h t h e square  and cube  32  of  ds, respectively. The e x p e r i m e n t s o n t h e t e s t d a t a w e r e p e r f o r m e d  a l p h a b e t s from f i v e persons f o r t r a i n i n g  using  t h e f i r s t two  and f o r m i n g samples o f b i g r a m s and  t r i g r a m s o u t o f t h e two a l p h a b e t s f r o m e a c h o f t h e two r e m a i n i n g p e r s o n s f o r t e s t i n g . T h e r e s u l t s were averaged o v e r seven t r i a l s In  each  t r i a l 2 ( 2 ) samples o f bigrams m  w h e r e 2™ s a m p l e s a r e f o r m e d person.  a s d e s c r i b e d i n s e c t i o n 3.3.  o r t r i g r a m s were formed  f o r each  f r o m t h e two c h a r a c t e r - s a m p l e s o f e a c h  F o r e x p e r i m e n t s on t h e t r a i n i n g  each o f t h e seven persons were used  data the f i r s t  entry,  testing  two a l p h a b e t s f r o m  f o r t r a i n i n g a n d 7(2)™ b i g r a m o r t r i g r a m  samples p e r bigram o r t r i g r a m e n t r y were formed, as d e s c r i b e d above, f o r testing. The e x p e r i m e n t a l p r o c e d u r e s d e s c r i b e d a b o v e y i e l d e d e s t i m a t e s o f • P (e | C i ^ , Cl^ , . • . , C i ) f o r i ^ l >••-•> 26; i 2 l >•••> 2 6 ; . . . : i =1, . . . , 26 , w h e r e =  any  combination of i ^ , i 2 , . . . i  probability  was a l e g a l  o f e r r o r was t h e n c a l c u l a t e d 26  P(e)  =  =  E i = l 1  26  string  of m characters.  using  26  E ... E P(e|Ci ,Ci_,...,Ci i = l l =1 2 m 1  over a l l the l e g a l s t r i n g s of m The p r o b a b i l i t y  )P(Ci ,Ci-,...,ci 1  )  (31)  characters.  o f e r r o r was c a l c u l a t e d  depth of s e a r c h , f o r bigrams  f o r various values of the  up t o ds=16 a n d f o r t r i g r a m s up t o d s = 5 , u s i n g POP  t r a i n i n g d a t a w i t h 4-PAD f e a t u r e e x t r a c t i o n . o p t i m u m v a l u e o f d s , P ( e ) was c a l c u l a t e d 6-PAD TR a n d TS c a s e s .  The t o t a l  H a v i n g found an e x p e r i m e n t a l  f o r t h e 4-PAD TS c a s e a n d f o r t h e  33  IV. 4.1  Introduction A l t h o u g h the  set, the on  RESULTS  results  on  difference the  testing  the  results  of main i n t e r e s t  t r a i n i n g set  are  included  between the  performance of  set  smaller  i s , the  the  In o t h e r words, i f b o t h methods y i e l d confident  that  e n o u g h d a t a has  when d a t a i s d i f f i c u l t sifiers the  perform equally  classifier  large  data set  previously  previous test  p e r f o r m a n c e on  i s an  p e r f o r m a n c e on  the  p e r f o r m a n c e on  4.2  Independent  the  future  the  P(C_^) a s s u m e d t h e  and  training set, s e c o n d , and  indicate do  occur;  6.  set.  the the  the  the  can  be  do  be  s m a l l set  characters.  For  clason  Since i f a to  classifier subsequently  small data  characters  know  results  t r a i n i n g s e t may  future  to  i f two  r e s u l t s , the  I f the  becomes.  to d e c i d e  set  and  reasonably  helpful  well  test  smaller  t r a i n i n g set  t r a i n i n g set.  (3)  u s i n g a l g o r i t h m s S,  EQP  set  t o come t h e n  the  i s a reasonable estimate  o t h e r comments s e e  respectively, third  w h i l e 6 and  U,  V,  text.  The  W and  4 apply  TS to  squares apply, r e s p e c t i v e l y , numbers w h i c h a r e  m i s c l a s s i f i c a t i o n (error rejects  T,  Y of  means e q u i p r o b a b l e c h a r a c t e r s ; ENG  probabilities in English  PERSON 2.  i f no  i t may  t r a i n i n g set  testing  one  T h i s may  set  The  d a t a needed a l t o g e t h e r  would expect the  p e r f o r m a n c e on the  the  Given a small data set,  testing  and  on  (1)  test  also  for  [46j.  Characters  appear i n F i g .  PERSON 1 and  set  three reasons.  comparable r e s u l t s  p e r f o r m a n c e on  t r a i n i n g data of  Results obtained  first,  the  unbiased e s t i m a t e of  the  2.4.7  on  t h o s e coming from the  a classifier  amount o f  (2)  becomes a v a i l a b l e - one  yielding better  better  available  collect.  well  for  been c o l l e c t e d .  yielding better  l i e between the  yield  to  are  occur  plus reject)  neither  and  the  TR  subsection  means  that  denote t e s t  area d i v i s i o n .  to d a t a from the i n b r a c k e t s nor  p r o b a b i l i t i e s when  the-numbers i n d i c a t e , e r r o r  set The  population, parenthesis  rejects  probabilities.  Numbers  34  6  TR EQP  n  4  7-7  •  75  12  12  58 52 (48) (45) 4 54 59 (36) (41)  45 (40) 43 (31)  6  TS  6  TR  6  TS  6  TR EQP  _ TR  6  6"  6  TS  4  6_ 4  Fig. 6  L  TS I £  T  24  TR  49 (46) 50 (36)  36 (33) 33 (21)  DQ]  12 8-5 [5] [o-o]  32  18  16  30  23  20 (08)  14 4-4  ENG  14  14  19  10  17 (0-4)  30  25 19 (1-6)  23  8-5 4-6  31  20  14  M i s c l a s s i f i c a t i o n and r e j e c t  6  TS I 4  EQP  TS | £  21  15 (OB)  30 17 (OB) 4-3 2-0 30 P-BJ [0-0][0-0] 33  12  7-5 8-0  20  12  13 (0-8)  19 23 (0-8) 18  23  16  15  27  21  19 (0-8)  38  29 24 (OB)  12 53  TR ENG  27  18 9-2 7-6  TR  7-5  21  L  4  7-1 8-0  33 43 (0-8) 26 4  ENG  45 (41) 41 (26)  MI  EQP  4-0 1-5 3-0  77 ENG  10 I 7-0 4-5 [3-5j\[0-8] 17 JL 13 12  6-2 4-5  6  17  10  19  10  TS 2 5  Is).  7-5 8-9  (Ml 19  \  41 \ 26 19 4  46  probabilities' in  28  19  per cent  35  i n p a r e n t h e s e s denote the the  error probability  For  algorithms  W and  reject  when t h e Y,  p r o b a b i l i t y .and  numbers i n b r a c k e t s  scans of F i g . 7 were used  results  are  for equiprobable  denote  to r e s o l v e  confusions.  characters  on  the  l e n g t h was  1 7 and  training  set. O v e r POP's 1 4 a l p h a b e t s  The k.  p r o c e d u r e used  |C.,L.) makes r e j e c t i o n 1 j  P(X|C_^)P(C^) = of  the  26  merely 26  zero  to o b t a i n f i n i t e o f any  for a l l i ,  characters  as b e i n g  characters;  the  that  r e j e c t e d i n s u c h cases..  testing,  different  when t e s t i n g  and  f o r T,  f r o m any  i s done on  different U and  the  of  two  classified  on  the b a s i s of  The the  7.  Fig. 7-(a)  w e r e u s e d on  f r o m any  during  the  shows t h e  training  scans  begins  at  the  leftmost black point  and  S3,  and  ter width  and  height  the height  at  or width  the is W  the  l e n g t h of  For  No  the  the  confused  4 - P A D , and  and  H,  and  i n the  respectively.  at which  occurs  vector as  unknown c h a r a c t e r  for resolving character  black point  training  expected,  a l l a l g o r i t h m s , when t h e  easily  top  the  algorithm  r e j e c t s occur,  discriminant i n alphabetic  i n the  of  characters  test  diswas  order.  characters  appear i n  confusions  that  equiprobable  e x t e r n a l to the boxes i n d i c a t e that  lowest  fraction  unknown  encountered during  d a t a w i t h a l g o r i t h m U,  S i n g l e headed arrows  one  indicated for  training.  data.  first  ters.  S2  i s any  one  in Fig. 4  number i n p a r e n t h e s i s  o r more c a t e g o r i e s were e q u a l , the  For  optimum c h o i c e  V whenever  training  ) and i e x a m p l e , when  o p t i m u m d e c i s i o n i s t o s e l e c t any  Some s c a n s u s e d t o d i f f e r e n t i a t e Fig.  unnecessary.  Thus, r e j e c t s are  encountered  criminants  of  25  of P(XIC  estimates  i n many r e c o g n i t i o n s c h e m e s , h o w e v e r , t h e  S whenever a f e a t u r e v e c t o r during  character  correct.  i n d i c a t e s the p r o b a b i l i t y  . w o u l d be  is  average vector  4-PAD and' 6-PAD, r e s p e c t i v e l y .  for  P(x  the  the  charac-  scan  bottom rows, r e s p e c t i v e l y , leftmost  column of  Double headed arrows  the. s c a n s o c c u r .  F i g . 7-(b)  S8.  Charac-  accompany  shows  Jt>  W/2  3W/5  \H/3 SI  S4  S5  S9  S10  W/4  \H/3 H/4\ W/3 S7  S6  S3 (a) -D  •R  S3  AO  •0  S2  [6.0]  R  -A  S1  S2 S6  [1-4]  •B  -J  •F  S7 D  S6 B [2-8]  0 [19]  SS  Blorl  -2  S4  [0.8]  •B  Q  V  S8  [0.8]  0  S1 S9  3  •B  1 I 1 J S6 •V  2 (b)  D ABD  1  S2  K  S1  [7.7]  S10  [0.8]  -B  H  2  K  (c)  Fig.  7  S c a n s and  decision  trees  for resolving character  confusions  37  the  decision trees  the  number o f w h i t e - t o - b l a c k  at  the previous = A and  node.  i s B.  Thus, i f U  error  WB  t r a n s i t i o n s occur,  followed,  f o r PERSON 2 f r o m 8.5% t o 0.8%.  tree appears i n square  The ABD d e c i s i o n t r e e  U reduced reduced  A p r o c e d u r e s i m i l a r t o t h e one  numbers i n s q u a r e b r a c k e t s  i n F i g . 6, a l t h o u g h cases.  and s c a n s t o t h e ENG-6-PAD t r a i n i n g d a t a w i t h  yielded recognition rates  Use o f  algorithm  s c a n s , t r e e s , and t r e e b r a n c h e s w e r e n e e d e d i n t h e s e o t h e r trees  final  i f n e c e s s a r y , by  on t h e t r a i n i n g d a t a w i t h  from 24% t o 10%.  F i g . 7 y i e l d e d the other  decision  t h e s c a n shown  F i g 7 - ( c ) shows t h e d e c i s i o n t r e e s u s e d f o r PERSON'2.  e r r o r p r o b a b i l i t y f o r POP  indicates  ( 2 1 ) i s minimum f o r b o t h  i f three  O t h e r w i s e S2 i s a p p l i e d  t h e s e s c a n s and d e c i s i o n t r e e s  in  i n eauation  The d e c r e a s e i n e r r o r p r o b a b i l i t y f o r a n y g i v e n  brackets.  the  The number on a t r e e b r a n c h  (WB) t r a n s i t i o n s r e s u l t i n g f r o m  = 0, t h e n S I i s a p p l i e d :  classification S3.  f o r the population.  fewer  Applying  algorithm  T  o f 9 8 . 2 % , 1 0 0 % a n d 1 0 0 % on t h e d a t a f r o m t h e p o p u l a =  t i o n , PERSON 1 a n d PERSON 2, r e s p e c t i v e l y .  4.3  Dependent  Characters  Results Figs.  8 a n d 9.  obtained  using  algorithm  G •••ith m=2  and m=3  appear i n  F i g . 8 shows t h e p e r c e n t e r r o r a s a f u n c t i o n o f d s f o r t h e  4-PAD t e s t i n g d a t a .  F o r t h e b i g r a m c a s e t h e p e r c e n t e r r o r was  for  o f 16.  ds up t o a v a l u e  Note that  there  calculated  i s no n e e d t o i n c r e a s e  ds any  f u r t h e r s i n c e , f o r t h e d a t a i n t h e s e e x p e r i m e n t s , 16 was t h e l a r g e s t number o f categories  that  could  have p r o t o t y p e  t h a t o f t h e unknown v e c t o r feature vector  vector, was 16.  feature  i n question.  vectors  In other  t h e l a r g e s t number o f p o s s i b l e For the t r i g r a m  words, given  category  A FORTRAN p r o g r a m w r i t t e n t w i c e  as  any t e s t  l i k e l i h o o d s f o r that  case c a l c u l a t i o n of the e r r o r r a t e  a t ds=5 b e c a u s e t h e amount o f c o m p u t e r t i m e n e e d e d been too l a r g e .  o f t h e same l e n g t h  f o r ds > 5 w o u l d  stopped have  f o r the sake of e f f i c i e n c y  21 A  DEPTH OF SEARCH Fig.  8  P e r c e n t e r r o r p r o b a b i l i t y a s a f u n c t i o n o f d s f o r t h e 4-PAD t e s t i n g data.  t o o k a l m o s t 2 h o u r s o f CPU t i m e o n t h e IBM-360/67 f o r t h e t r i g r a m c a s e w i t h ds=5. Both e r r o r rates  on F i g . 8 d e c r e a s e a t an a c c e l e r a t i n g r a t e  a minimum a t d s = 4 , a f t e r w h i c h t h e y i n c r e a s e  at a decelerating  b i g r a m P ( e ) r a p i d l y becomes u n i f o r m a t a v a l u e distributions  assumptions o f algorithm  grams o n e w o u l d e x p e c t t h a t a similar fashion Fig. the  Since  the data  the trigram  G a r e made f o r b i g r a m s a n d t r i e r r o r c u r v e f o r d s > 5.  t o the bigram error  9 shows t h e p e r c e n t P ( e ) ,  used.  Zero order  represents  would behave  curve. f o r t h e 4-PAD a n d 6-PAD c a s e s o n  t r a i n i n g and t e s t i n g data s e t s , as a f u n c t i o n o f t h e order  information  The  a r e t h e same f o r t h e b i g r a m a n d t r i g r a m c a s e s a n d s i n c e t h e  same u n d e r l y i n g  in  o f 17.9%.  rate.  reaching  of contextual  equiprobable characters,  first  order  39  ORDER  9  Fig.  Percent error p r o b a b i l i t y i n f o r m a t i o n used.  represents results G but  monogram s t a t i s t i c s ,  come f r o m a l g o r i t h m  c a n n o t make u s e  of  OF  as  etc..  CONTEXTUAL  a f u n c t i o n of  Note that the  T which requires  character  of P ( E )  are  t h o s e f o r ds=4.  yields  86%  98%  c o r r e c t r e c o g n i t i o n on  respectively. zero for 50%.  to t h i r d  With trigrams the  A marked improvement i s noted order  contextual  e x a m p l e , r e d u c e s f r o m 33%  the  dependencies.  values  and  information. t o 16%,  the  INFORMATION  order  zero  and  of  contextual  first  order  same m e a s u r e m e n t t r a i n i n g The  b i g r a m and  and  6-PAD a l g o r i t h m  testing  and  trigram  training  G  data  i n a l l four cases i n going The  i . e . , an  from  e r r o r r a t e f o r TS-4-PAD,  error reduction  as  o f more  than  V. 5.1  Evaluation  of  DISCUSSION  S y i e l d e d the lowest  independent characters,  suggests that  CONCLUSIONS  Results  Although algorithm with  AND  t h e d i f f e r e n c e b e t w e e n S and T i s s m a l l .  the assumption of s t a t i s t i c a l  components i s r e a s o n a b l e .  P ( e ) on t h e t r a i n i n g s e t  Algorithms  i n d e p e n d e n c e among f e a t u r e  T, U and V a l l p e r f o r m e d much  may  such c i r c u m s t a n c e s suboptimum c l a s s i f i e r s be s i g n i f i c a n t l y  training.  This  w h i c h was  better  conclusion  having r e l a t i v e l y  t h a n optimum c l a s s i f i e r s  f r o m a l l p o s s i b l e p r o b l e m s , t h e o p t i m u m number feature vectors characters  decreases with.the  T performs b e t t e r  amount  of t r a i n i n g data.  t h a n U a n d V.  2.4.6), that  This  inferior  vectors  I n a d d i t i o n i t c a n be  and  shown  binary  t h e P ( L |C^,) t e r m f r o m T a n d  are s t a t i s t i c a l l y  I f , i n a d d i t i o n , t h e compo-  i n d e p e n d e n t t h e n U must  be  t o T. When t h e P ( C ) o f E n g l i s h  and  For•equiprobable  i s reasonable because both  the a n t i l o g a r i t h m of the remaining terms.  nents of the feature  [47]  a t random  f o r equiprobable characters  measurements, U c a n be g e n e r a t e d by o m i t t i n g taking  extensive  of t h e o r e t i c a l l y p o s s i b l e  U a n d V do n o t h a v e P ( L . C.) a v a i l a b l e a s d o e s T. j i (theorem i n s u b s e c t i o n  requiring  r e c o g n i t i o n problem i s s e l e c t e d  that  few p a r a m e t e r s  i s s i m i l a r t o t h e one r e a c h e d by Hughes  t h a t when a p a t t e r n  vector  better  t h a n S when t h e t r a i n i n g and t e s t d a t a w e r e d i s j o i n t , w h i c h i n d i c a t e s in  This  t e x t a r e u s e d t h e d i f f e r e n c e b e t w e e n T, U  V on t h e t e s t i n g d a t a W i t h 6-PAD becomes m i n i m a l a n d , i n f a c t , U and  V p e r f o r m e q u a l l y w e l l and b o t h p e r f o r m s l i g h t l y indicates  t h a t when s t o r a g e  contextual  statistics  to optimum c l a s s i f i e r s  contextual  constraints.  t h a n T.  i s limited a trade-off  c o n s t r a i n t s and measurement  w h i c h use. l e s s m e a s u r e m e n t superior  capacity  better  statistics.  w h i c h make l i t t l e  clearly  e x i s t s between  Suboptimum  a n d more c o n t e x t u a l  This  classifiers  d a t a may w e l l he-  o r no u s e o f e x i s t i n g  41  C o m p a r i s o n o f t h e r e s u l t s f o r T v s . W a n d U v s . Y shows t h a t the  information  equal It  i n length  should  contained through  i n the length  made l o n g e r  by a d d i n g  o f t h e f e a t u r e v e c t o r s , b y m a k i n g them  the addition of zeros,  be k e p t i n m i n d  that  although  decision indeed  makes r e c o g n i t i o n more d i f f i c u . l t .  the feature vectors  i n W and Y a r e  z e r o s , W a n d Y i n f a c t use. up much l e s s s t o r a g e  T and U b e c a u s e they s t o r e p r o b a b i l i t i e s For  f o r C_^ . r a t h e r t h a n C^,L  T and U r e c o g n i t i o n a c c u r a c y  tress increases.  Although  than  categories.  i m p r o v e s a s t h e number o f s c a n s a n d  t h e e x p e r i m e n t s show t h a t  improve r e c o g n i t i o n , t o f u l l y  neglecting  appreciate  their  the scans can  capability,  more  e x p e r i m e n t s on l a r g e t e s t i n g s e t s w o u l d h a v e t o b e p e r f o r m e d . Considerable  improvement r e s u l t s u s i n g  with  a l l algorithms  e x c e p t S.  I t was e x p e c t e d  with  6-PAD o n t h e t e s t i n g d a t a  binary 260  [10] used t h e b i n a r y digits  S was u s e d ) .  vector  characters  Even on t h e t r a i n i n g  x and y t h r e s h o l d s  equal  S would perform  have t o be l e a r n e d  r e s u l t i n g f r o m 4-PAD, a l o n g  d e s c r i b i n g o n e o f t h e f o u r H/W  upper case a l p h a b e t i c  that  ratios  data  (algorithm  a n d 1 0 1 i n F i g . 3, a n d t h e s e e x t r e m a a r e m o r e a c c u r a t e l y  occur i n squares l o c a t e d by a  In addition, characters  that are  commonly c o n f u s e d u n d e r 4-PAD, s u c h a s P a n d D, d u e t o t h e f a c t t h a t occurs  i n s q u a r e 10 f o r b o t h P a n d D, a r e n o t c o n f u s e d a n y more w i t h  because the x applying  max  lies  i n rectangle  Clemens' technique  characters  Munson  when 2340 c h a r a c t e r s  from seven  the x max 6-PAD  1 0 1 f o r P a n d 0 0 1 f o r D i n F i g . 3.  t o t h e r e c o g n i t i o n o f upper case  [13] o b t a i n e d  by u s i n g  t h e e r r o r was r e d u c e d t o  100  COORD w o r d .  recognize  15% e r r o r r e s u l t e d although,  t o 3/16 t h e l e t t e r h e i g h t ,  6-PAD.  w i t h two  f r o m 10 d i f f e r e n t t y p e f o n t s  I m p o r t a n t e x t r e m a o n many o f t h e u p p e r c a s e c h a r a c t e r s  than a f o u r - p a r t  worse  with  observed, to  3%.  six-part  t h a n 4-PAD  b e c a u s e S h a s no g e n e r a l i z a t i o n c a p a b i l i t y a n d  a l a r g e r number o f d i f f e r e n t f e a t u r e v e c t o r s Clemens  6-PAD r a t h e r  42% m i s c l a s s i f i c a t i o n writers  In  handprinted  o f w h i c h 1 9 % was r e j e c t  were used f o r t r a i n i n g  and an  (apparently)  c o m p a r a b l e number o f s a m p l e s f r o m d i f f e r e n t These r e s u l t s  a r e f o r x and y t h r e s h o l d s  respectively. tion  to using  a s i x - p a r t area,  On. c o m p l e t e l y this  "R"  t h e s i s may  unconstrained  using  simply  characters  experiments,  some was r e p o r t e d All  with  samples were  broken c h a r a c t e r s  problem are suggested  such as i n the next  can' a l s o b e s o l v e d b y  them, a c o n s t r a i n t w h i c h  d i d not cause t h e p r i n t e r  real  difficultv  apparently  although  [12].  algorithms  the p o p u l a t i o n algorithms  in  i f , i n addi-  t h e f e a t u r e e x t r a c t i o n scheme  the. p r o b l e m o f b r o k e n c h a r a c t e r s  r e q u i r i n g that people avoid p r i n t i n g  these  improve c o n s i d e r a b l y  and h e i g h t ,  a l g o r i t h m T r a t h e r t h a n S.  e n c o u n t e r some d i f f i c u l t y  Of c o u r s e  width  e i t h e r a l a r g e r number o f t r a i n i n g  i n F i g . 1 0 . Two m e t h o d s o f a t t a c k i n g t h i s  section.  in  o f 1/10 a c h a r a c t e r ' s  P e r h a p s Munson's r e s u l t s w o u l d  used o r t h e experiment were repeated  in  i n d i v i d u a l s was u s e d f o r t e s t i n g .  f o r a l l c a s e s p e r f o r m e d b e t t e r on i n d i v i d u a l s t h a n on  a n d i n some c a s e s t h e i m p r o v e m e n t was p r o n o u n c e d .  T and U y i e l d e d r e c o g n i t i o n a c c u r a c i e s  F o r some c a s e s  o f 1 0 0 % on t h e t r a i n i n g  data  o f PERSONS 1 a n d 2 when t h e s c a n s w e r e u s e d . As was e x p e c t e d b i g r a m a n d t r i g r a m i n f o r m a t i o n w i t h creased  P ( c ) f o r a l l cases.  Looking  a l g o r i t h m G de-  a t t h e TS 4-PAD a n d 6-PAD e r r o r  o f F i g . 9 one can s e e t h a t they  tend  This  amount o f new c o n t e x t u a l i n f o r m a t i o n b e c o m e s  suggests that a decreasing  to follow a decreasing  curves  a v a i l a b l e by i n c r e a s i n g f u r t h e r t h e o r d e r In fact  the curves  by  going  beyond  as  the order  exponential  of the contextual information  suggest t h a t P ( e ) would not.be decreased  3rd order  of context  contextual information.  increases  6-PAD a n d 4-PAD d e c r e a s e s a l t h o u g h 6-PAD w i t h b i g r a m s r e s u l t s  path.  used.  significantly  The c u r v e s  a l s o show t h a t  t h e d i f f e r e n c e i n performance between i ts t i l l  i n a lower  remains  important,  P ( e ) t h a n 4-PAD w i t h  noting  trigrams.  that In  a d d i t i o n i t i s much e a s i e r t o i m p l e m e n t 6-PAD w i t h b i g r a m s b e c a u s e t h e s t o r a g e , saving  i n going  from t r i g r a m s  t o b i g r a m s i s much g r e a t e r  than the e x t r a  storage  43 n e e d e d by t h e s l i g h t l y  longer  importance of balancing and  from context  is  also important  used w i t h  to balance  that obtained  These r e s u l t s  appropriately the information  as s u g g e s t e d by R a v i v  ds P ( e ) i s l o w e r .is lower  feature vectors.  [23].  from t h e measurements  I n a d d i t i o n , F i g . 8 shows t h a t i t  contextual information  from t h e depth of search.  f o r trigrams  i l l u s t r a t e the  from t h e order Although  f o r every  than f o r bigrams, f o r bigrams w i t h  than f o r trigrams w i t h  ds=3.  of the context value of  ds=4 i t  I n a d d i t i o n , much l e s s s t o r a g e  i s needed  f o r b i g r a m s , a n d t h e c l a s s i f i c a t i o n , c o m p u t a t i o n i s r e d u c e d since»for t h e former case,  while,for  2 4 =16 a p o s t e r i o r i p r o b a b i l i t i e s 3  the l a t t e r It  with respect  case,  i s interesting  a r e searched  3 =27 o r 9 p r o b a b i l i t i e s to note that  t o P ( e ) does n o t c o r r e s p o n d  order  A s was n o t e d  l a b e l s (characters) with respect It  i s reasonable  low values  i n subsection  trigrams  only  necessary  in  2.4.9 t h e of the  to ds.  t o e x p e c t P ( e ) t o d e c r e a s e a t an a c c e l e r a t i n g r a t e  o f ds b e c a u s e f o r v e r y  m a t i o n i s made u s e o f .  o f ds  2.4.9 i n t h e d e r i v a t i o n o f ( 2 9 )  o f F i g . 8 a r e e x p e c t e d t o d e p e n d a l s o On t h e d i s t r i b u t i o n  correct  searched.  optimum v a l u e  However, t h e a s s u m p t i o n s a r e n e v e r t h e l e s s  t o make t h e s o l u t i o n f e a s i b l e .  curves  for  valid.  are  t h a t p r e d i c t e d by eq. ( 2 9 ) . T h i s  s u g g e s t s t h a t t h e a s s u m p t i o n s made i n s u b s e c t i o n are not e n t i r e l y  per character  the experimental with  or 8 per character,  low values  o f ds l e s s c o n t e x t u a l  F o r e x a m p l e , o u t o f t h e 17,576 t h e o r e t i c a l l y  2,510 a r e E n g l i s h e n t r i e s i n t h e s e  experiments;  infor-  possible  w"hen d s h a s a  3 value that of  of 2 only  2  o r 8 o u t o f t h e p o s s i b l e 17,576 a r e l o o k e d  there i sa s i g n i f i c a n t  t h e 15,066 i l l e g a l  x^ithout  probability  trigrams  the use of context.  when ds = 2,  resulting  the 8 trigrams  looked  a f t e r w h i c h a d e c i s i o n i s made  T h i s was o b s e r v e d  ds=2. a n d l e s s f r e q u e n t l y f o r ds=3. context  that  f r o m t h e u s e o f monograms.  One c a n s e e at are part  individually  t o happen q u i t e f r e q u e n t l y f o r  I n s p i t e o f the. f r e q u e n t  P ( e ) f o r b i g r a m s was s t i l l  at.  slightly  lower  disregard f o r than  that  44 I t would seem from t h i s data t h a t t h e b u l k o f t h e c o r r e c t l a b e l s occurs f o r ds <_ 4. ds > 4.  T h i s would h e l p t o e x p l a i n the s l i g h t i n c r e a s e i n P(e). : for-  I f t h e g r e a t e r p a r t o f the c o r r e c t l a b e l s occurs f o r ds < 4, i n c r e a s i n g  ds f u r t h e r , would o n l y i n c r e a s e t h e chance o f f i n d i n g an erroneous b i g r a m o r t r i g r a m w i t h a h i g h e r a p o s t e r i o r i p r o b a b i l i t y than the c o r r e c t one, and thus increase P(e).  5.2  Suggestions  f o r F u r t h e r Research  S i n c e t h e r e q u i r e m e n t t h a t p e o p l e a v o i d p r i n t i n g broken c h a r a c t e r s may n o t be a d e s i r a b l e c o n s t r a i n t , i n view o f t h e f a c t t h a t some p r i n t e r s [12] found d i f f i c u l t y i n overcoming t h i s c o n s t r a i n t , two ways o f h a n d l i n g unc o n s t r a i n e d d a t a a r e suggested.  One c o r r e c t i v e measure i s t o  by b l a c k e n i n g a l l w h i t e squares a d j a c e n t p r o c e d u r e a c e r t a i n number o f t i m e s .  c l o s e s m a l l breaks  t o a b l a c k square and r e p e a t i n g  this  However, i f t h e b r e a k i s more than a few  u n i t s wide t h e c h a r a c t e r can be d i s t o r t e d c o n s i d e r a b l y by r e p e a t i n g t h e above procedure  enough times  disadvantage one  t o c l o s e t h e break.  A second a l t e r n a t i v e w i t h o u t  i s t o move a b a r o f l e n g t h b a l o n g t h e o u t s i d e c o n t o u r ,  end o f t h e b a r a g a i n s t t h e contour  this  keeping  and m a i n t a i n i n g an a n g l e o f 90° between  the b a r and t h e l i n e tangent t o t h e c o n t o u r  at the bar-contour  point of contact.  Whenever t h e o t h e r end o f t h e b a r e n c o u n t e r s b l a c k d u r i n g t h e c o n t o u r  trace,  the l i n e d e f i n e d by t h a t p o r t i o n o f t h e b a r between t h e two b l a c k p o i n t s i s made p a r t o f t h e c h a r a c t e r , as i n F i g . 10.  The b a r must be l o n g enough t o  c l o s e most b r e a k s b u t s h o r t enough t h a t i n t e n t i o n a l b r e a k s t h a t d i f f e r e n t i a t e p a t t e r n c l a s s e s remain. More e x p e r i m e n t s w i t h and w i t h o u t formed on l a r g e u n c o n s t r a i n e d One source present  t h e above methods s h o u l d be p e r -  data s e t s .  of feature vector v a r i a b i l i t y w i t h i n p a t t e r n c l a s s e s i n the  scheme i s t h e mode o f s e a r c h i n g f o r t h e c h a r a c t e r .  I n many c h a r a c t e r s  45  Fig.  10  Illustrating characters.  a technique  f o rclosing  small breaks i n broken  the  l e f t m o s t b l a c k p a r t o f t h e c h a r a c t e r may o c c u r  the  character  slant  i n various  i n d i f f e r e n t ways.  character  a t the top o r the bottom o f  s a m p l e s o f t h e same c h a r a c t e r , e s p e c i a l l y This  causes t h e v e r t i c a l  sometimes a t t h e u p p e r - l e f t m o s t  search  i f printers  mode t o e n c o u n t e r t h e  p a r t and sometimes a t t h e l o w e r -  leftmost part, resulting  i n different  cases,  mode a t 1 3 5 d e g r e e s w o u l d e n c o u n t e r t h e l o w e r - l e f t m o s t  a diagonal  search  part of a character As  feature vectors.  I n most o f t h e s e  only.  c a n b e s e e n i n F i g . 6, w i t h  English a priori probabilities U  f o r m e d b e t t e r o n TS-6-PAD t h a n T, e v e n t h o u g h U i s a s i m p l e r less  storage.  Similar  classifier  methods t o a l g o r i t h m G f o r b i g r a m s and t r i g r a m s  per-  and uses should  46 be t e s t e d u s i n g  5.3  U t o compare w i t h  Summary o f  final  conclusions  should  wait  s e t s , i t appears that the parametric  on b i n a r y v e c t o r s yield  results.  Conclusions  Although large data  G's  obtained  from contours  f o r e x p e r i m e n t s on s e v e r a l c l a s s i f i e r s T, U a n d V  of characters  r e c o g n i t i o n a c c u r a c i e s much b e t t e r t h a n t h o s e  optimum n o n p a r a m e t r i c recognition accuracies probabilities  obtainable  c l a s s i f i e r S, a n d t h a t i f 6-PAD i s u s e d can be achieved,  of the characters  particularly  are incorporated.  to differentiate  significantly  b e t w e e n commonly  b e t t e r than those  reasonably  good  i f English a p r i o r i also  suggest  c l a s s dependent f e a t u r e  confused  which use only  as i n F i g . 3  using the  The r e s u l t s  t h a t r e c o g n i t i o n schemes w h i c h u s e a d d i t i o n a l s i m p l e extraction  quantized  operating  c h a r a c t e r s may  perform  one f e a t u r e e x t r a c t i o n m e t h o d  common t o a l l p a t t e r n c l a s s e s . " I n a d d i t i o n , i t a p p e a r s t h a t a l g o r i t h m G u s i n g b i g r a m and t r i g r a m s t a t i s t i c s , w i t h a d e p t h o f s e a r c h can  s i g n i f i c a n t l y improve the r e c o g n i t i o n accuracy.  those  of Bakis  tual  equal  These c o n c l u s i o n s  t o 4, support  e t a l . [5] t h a t c u r v e - f o l l o w i n g f e a t u r e s e x t r a c t t h e s i g n i f i c a n t  i n f o r m a t i o n from h a n d p r i n t i n g , that using  value  and those  of Raviv  t h e measurements o f n e i g h b o u r i n g  [ 2 3 ] a n d Duda a n d H a r t [2<']  characters  c o n s t r a i n t s g r e a t l y improves'the r e c o g n i t i o n  together  accuracy.  with  contex-  4 7 REFERENCES 1.  W.W.  B l e d s o e and J . B r o w n i n g , " P a t t e r n r e c o g n i t i o n and r e a d i n g by m a c h i n e , " P a t t e r n R e c o g n i t i o n , L. U h r . , E d . , pp. 3 0 1 - 3 1 6 , W i l e y , New Y o r k , 1966.  2.  L.  Uhr and C. V o s s l e r , "A p a t t e r n - r e c o g n i t i o n p r o g r a m t h a t g e n e r a t e s , e v a l u a t e s and a d j u s t s i t s own o p e r a t o r s , " C o m p u t e r s and T h o u g h t , F e i g e n b a u m and F e l d m a n , E d s . , pp. 2 5 1 - 2 6 8 , M c G r a w - H i l l , New Y o r k , 1963.  3.  W.H.  Highleyman, " L i n e a r d e c i s i o n f u n c t i o n s w i t h a p p l i c a t i o n to p a t t e r n r e c o g n i t i o n , " P r o c . I R E , V o l . 5 0 , No. 6, pp. 1 5 0 1 - 1 5 1 4 , J u n e , 1962.  4.  L.A. K a m e n t s k y , "The s i m u l a t i o n o f t h r e e m a c h i n e s w h i c h r e a d r o w s o f h a n d w r i t t e n a r a b i c n u m b e r s , " I R E T r a n s , on E l e c t r o n i c C o m p u t e r s , V o l . E C - 1 0 , No. 3, pp. 4 8 9 - 5 0 1 , S e p t e m b e r , 1961.  5.  R.  6.  E.C. G r e a n i a s , e t - a l . , "The r e c o g n i t i o n o f h a n d w r i t t e n n u m e r a l s b y c o n t o u r analysis," IBM J o u r n a l , V o l . 7, No. 1, pp. 1 4 - 2 1 , J a n u a r y , 1963.  7.  L.G.  8.  W.H.  B a k i s , N.M. H e r b s t , and G. N a g y , "An e x p e r i m e n t a l s t u d y o f m a c h i n e r e c o g n i t i o n of handprinted numerals," I E E E T r a n s , on S y s t e m s S c i e n c e and C y b e r n e t i c s , V o l . SSC-4, p p . 1 1 9 - 1 3 2 , J u l y , 1968.  R o b e r t s , " P a t t e r n r e c o g n i t i o n w i t h a n a d a p t i v e n e t w o r k , " IRE I n t e r n a t i o n a l C o n v e n t i o n r e c o r d , pp. 6 6 - 7 0 .  on  1960  H i g h l e y m a n , "An a n a l o g m e t h o d f o r c h a r a c t e r r e c o g n i t i o n , " I R E T r a n s . E l e c t r o n i c C o m p u t e r s , V o l . E C - 1 0 , pp. 502.-512 , S e p t e m b e r , 1 9 6 1 .  9.  R.L. G r i m s d a l e , e t a l . , "A s y s t e m P r o c . I E E , V o l . 106B, No. 2 6 ,  10.  J.K. C l e m e n s , " O p t i c a l ' C h a r a c t e r R e c o g n i t i o n f o r P l e a d i n g M a c h i n e A p p l i c a t i o n s , " Ph.D. t h e s i s , D e p a r t m e n t o f E l e c t r i c a l E n g i n e e r i n g , M.I.T., C a m b r i d g e , Mass., A u g u s t , 1965.  11.  J.K. C l e m e n s , " O p t i c a l c h a r a c t e r r e c o g n i t i o n f o r r e a d i n g m a c h i n e a p p l i c a t i o n s , " M.I.T. E l e c t r o n i c s R e s e r a c h L a b . , C a m b r i d g e M a s s . , Q u a r t . P r o g r e s s R e p t . 79, pp. 2 1 9 - 2 2 7 , O c t o b e r , 1965.  12. M.M.  f o r the automatic r e c o g n i t i o n of p a t t e r n s , " pp. 2 1 0 - 2 2 1 , M a r c h , 1959.  C h o d r o w , W . A . B i o v o n a and G.M. W a l s h , "A S t u d y o f H a n d - P r i n t e d haracter R e c o g n i t i o n T e c h n i q u e s , " Rome A i r D e v e l o p m e n t C e n t e r T e c h n i c a l R e p o r t No. RADC-TR-65-444, F e b r u a r y , 1966.  13.  J.H. M u n s o n , "The L. K a n a l , E d . ,  r e c o g n i t i o n of handprinted t e x t , " P a t t e r n R e c o g n i t i o n , Thompson B o o k Co., W a s h i n g t o n , D . C , 1968.  14.  D.D.  15.  J.H. M u n s o n , " E x p e r i m e n t s i n t h e r e c o g n i t i o n o f h a n d - p r i n t e d t e x t : Part I C h a r a c t e r r e c o g n i t i o n , " F a l l J o i n t Computer C o n f e r e n c e AFIPS P r o c . , V o l . 3 3 , p t . 2, pp. 1 1 2 5 - 1 1 3 8 , T h o m p s o n , W a s h i n g t o n , D . C , 1968.  J o h n s o n , C.F. H a u g h , and K.P. L i , "The a p p l i c a t i o n o f a few h y p e r p l a n e . d e c i s i o n t e c h n i q u e s to h a n d p r i n t e d c h a r a c t e r r e c o g n i t i o n , " P r o c . NEC, V o l . 2 2 , pp. 8 6 9 - 8 7 4 , 1966.'  48 16.  A.L. K n o l l , " E x p e r i m e n t s w i t h ' c h a r a c t e r i s t i c l o c i ' f o r r e c o g n i t i o n o f h a n d p r i n t e d c h a r a c t e r s , " I E E E T r a n s , on E l e c t r o n i c C o m p u t e r s , V o l . EC-18, pp. 366-372, A p r i l , 1969.  17.  R.W. C o r n e w , "A s t a t i s t i c a l m e t h o d o f e r r o r c o r r e c t i o n , " C o n t r o l , No. 2, F e b r u a r y , 1968.  18.  R.  A l t e r , " U t i l i z a t i o n of c o n t e x t u a l c o n s t r a i n t s i n automatic speech r e c o g n i t i o n , " I E E E T r a n s , on A u d i o and E l e c t r o a c o u s t i c s , M a r c h , 1968.  19.  G.  C a r l s o n , " T e c h n i q u e s f o r r e p l a c i n g c h a r a c t e r s t h a t a r e g a r b l e d on i n p u t , " A F I P S C o n f e r e n c e P r o c e e d i n g s , V o l . 28, 1966 S p r i n g J o i n t C o m p u t e r C o n f . , pp. 189-192.  20.  A.W. E d w a r d s and R.L. C h a m b e r s , "Can a p r i o r i p r o b a b i l i t i e s h e l p i n c h a r a c t e r r e c o g n i t i o n ? " J . A s s o c . C o m p u t i n g M a c h i n e r y , V o l . 11, pp. 465-470, O c t o b e r , 1964.  21.  K.  A b e n d , "Compound d e c i s i o n p r o c e d u r e s  NEC,  V o l . 22, pp.  777-780,  Information  and  f o r pattern r e c o g n i t i o n , " Proc.  1966.  22.  K.  A b e n d , "Compound d e c i s i o n p r o c e d u r e s f o r unknown d i s t r i b u t i o n s f o r d e p e n d e n t s t a t e s o f n a t u r e , " P a t t e r n R e c o g n i t i o n , L. K a n a l , Thompson B o o k Co., W a s h i n g t o n , D . C , 1968.  23.  J . R a v i v , " D e c i s i o n making i n Markov chains a p p l i e d to the problem of p a t t e r n r e c o g n i t i o n , " I E E E T r a n s , on I n f o r m a t i o n T h e o r y , V o l . I T - 1 3 , pp. 536-551, O c t o b e r 1967. .  24.  R.O. Duda a n d P.E. H a r t , " E x p e r i m e n t s i n t h e r e c o g n i t i o n o f h a n d p r i n t e d t e x t : P a r t I I - c o n t e x t a n a l y s i s , " 1968 F a l l J o i n t C o m p u t e r C o n f . A F I P S P r o c . V o l . 33, p t . 2, W a s h i n g t o n , D . C : T h o m p s o n , 1968, pp. 1139-1149.  25.  M.D. Levine, "Feature e x t r a c t i o n : A survey," Proc. of IEEE, V o l . No.. 8, pp. 1391-1407, A u g u s t , 1969.  26.  S . J . M a s o n and J.K. C l e m e n s , " C h a r a c t e r r e c o g n i t i o n i n a n e x p e r i m e n t a l r e a d i n g m a c h i n e f o r t h e b l i n d , " R e c o g n i z i n g P a t t e r n s , P.A. K o l e r s and M. E d e n , e d . , The M.I.T. P r e s s , C a m b r i d g e , M a s s . , 1968, pp. 156-167.  27.  G.  Nagy, " S t a t e o f t h e a r t i n p a t t e r n r e c o g n i t i o n , " P r o c .  pp. 836-862, May,  and Ed.,  57,  IEEE, V o l .  56,  1968.  28.  D.E. T r o x e l , F.F. L e e , and S . J . M a s o n , " R e a d i n g M a c h i n e f o r t h e B l i n d , " Q u a r t e r l y P r o g r e s s R e p o r t No. 89, R e s e a r c h L a b o r a t o r y o f E l e c t r o n i c s , M.I.T., A p r i l 15, 1968, pp. 245-248.  29.  D.E. T r o x e l , " P a g e r e a d e r f o r a r e a d i n g m a c h i n e f o r t h e b l i n d , " Q u a r t e r l y P r o g r e s s R e p o r t No. 94, R e s e a r c h L a b o r a t o r y o f E l e c t r o n i c s , M.I.T., J u l y 15, 1969, pp. 233-246.  30.  F. A t t n e a v e . a n d M.D. A r n o u l t , "The. q u a n t i t a t i v e s t u d y o f s h a p e and p a t t e r n p e r c e p t i o n , " P a t t e r n R e c o g n i t i o n , L. U h r , E d . , pp. 123-141, W i l e y , New Y o r k , 1966.  49  31.  J.A. B r a d s h a w , " L e t t e r r e c o g n i t i o n u s i n g c a p t i v e s c a n , " I E E E T r a n s , E l e c t r o n i c C o m p u t e r s , E C - 1 2 , p. 26, F e b r u a r y , 1 9 6 3 .  on  32.  A.G. S e m e n o v s k i y , " R e c o g n i t i o n o f h a n d w r i t t e n c h a r a c t e r s by: means o f a s e r v o s c a n , " C h a r a c t e r R e a d e r s and P a t t e r n R e c o g n i t i o n , V.A. Kovalevsky E d . , S p a r t a n B o o k s , New Y o r k , 1 9 6 8 , pp. 2 1 3 - 2 2 2 ^  33.  CM. A u s t i n , " D e s i g n and C o n s t r u c t i o n o f a n Opaque O p t i c a l C o n t o u r T r a c e r f o r C h a r a c t e r R e c o g n i t i o n R e s e a r c h , " M.A.Sc. T h e s i s , U n i v e r s i t y o f B r i t i s h C o l u m b i a , D e c e m b e r , 1967.  34.  K.S. F u , " S e q u e n t i a l M e t h o d s i n P a t t e r n R e c o g n i t i o n and A c a d e m i c P r e s s , New Y o r k and L o n d o n , 1968.  35.  N.J. N i 1 s s o n , ''Learning Machines: Foundations c l a s s i f y i n g S y s t e m s , " M c G r a w - H i l l , 1965.  36.  E.  37.  CK. Chow, " S t a t i s t i c a l i n d e p e n d e n c e and t h r e s h o l d , f u n c t i o n , " I E E E E l e c t r o n i c C o m p u t e r s , V o l . E C - 1 4 , pp. 6 6 - 6 8 , F e b r u a r y , 1 9 6 5 .  38.  G.S. S e b e s t y e n , " D e c i s i o n - m a k i n g p r o c e s s e s i n p a t t e r n The M a c m i l l a n Company, New Y o r k , 1962  39.  J.M. W o z e n c r a f t a n d I.M. J a c o b s , " P r i n c i p l e s o f C o m m u n i c a t i o n E n g i n e e r i n g , " J o h n W i l e y & S o n s , I n c . , New Y o r k , 1 9 6 5 , pp. 4 5 9 - 4 6 0 .  40. '  L. K a n a l and B. C h a n d r a s e k a r a n , "On d i m e n s i o n a l i t y and s a m p l e s i z e i s s t a t i s t i c a l p a t t e r n c l a s s i f i c a t i o n " , N a t i o n a l E l e c t r o n i c s Conference, C h i c a g o , 1 9 6 8 , pp. 2-7.  41.  W.H. H i g h l e y m a n , "The d e s i g n and a n a l y s i s o f p a t t e r n r e c o g n i t i o n e x p e r i m e n t s , " The B e l l S y s t e m T e c h n i c a l J o u r n a l , M a r c h , 1 9 6 2 , pp. 7 2 3 - 7 4 4 .  42.  P.A. L a c h e n b r u c h and M . R . ' M i c k e y , " E s t i m a t i o n o f E r r o r R a t e s i n D i s c r i m i n a n t A n a l y s i s , " T e c h n o m e t r i c s , V o l . 1 0 , No. 1, F e b r u a r y , 1 9 6 8 , pp. 1-11.  43.  W.C C o c h r a n , "Commentary on " E s t i m a t i o n o f E r r o r R a t e s i n D i s c r i m i n a n t A n a l y s i s , " T e c h n o m e t r i c s , V o l . 1 0 , No. 1, F e b r u a r y , 1 9 6 8 , pp. 2 0 4 - 2 0 5 .  44.  J.R. P i e r c e , " S y m b o l s , S i g n a l s and N o i s e : The n a t u r e and c o m m u n i c a t i o n , " H a r p e r & Row, New Y o r k , 1 9 6 1 , p. 2 8 3 .  45.  F. P r a t t , " S e c r e t and U r g e n t , t h e s t o r y o f c o d e s and c i p h e r s , " R i b b o n B o o k s , G a r d e n C i t y , New Y o r k , 1942. '  46.  C T . T o u s s a i n t and R.W. Donaldson, "Algorithms f o r r e c o g n i z i n g contourt r a c e d h a n d p r i n t e d c h a r a c t e r s , " I E E E T r a n s a c t i o n s on C o m p u t e r s , i n press.  Machine Learning,''  of T r a i n a b l e P a t t e r n -  F i x , and J . L . Hodges, J r . , " D i s c r i m i n a t o r y A n a l y s i s , N o n p a r a r a e t r i c Discrimination: C o n s i s t e n c y P r o p e r t i e s , " P r o j e c t 2 1 - 4 9 - 0 0 4 , USAF S c h o o l of A v i a t i o n M e d i c i n e , R a n d o l p h F i e l d , T e x a s , F e b r u a r y , 1951. Trans.  recognition,"  process  of  Blue  F. H u g h e s , ''On t h e mean a c c u r a c y o f s t a t i s t i c a l p a t t e r n r e c o g n i z e r s , I E E E T r a n s , on I n f o r m a t i o n T h e o r y , V o l . I T - 1 4 , No. 1, J a n u a r y 1 9 6 8 , pp. 5 5 - 6 3 .  

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0093327/manifest

Comment

Related Items