UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

An investigation of the ridge function as a pattern descriptor for character recognition Brown, Lachlan Hugh Thomas 1969

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
831-UBC_1969_A6_7 B76.pdf [ 5.94MB ]
Metadata
JSON: 831-1.0052033.json
JSON-LD: 831-1.0052033-ld.json
RDF/XML (Pretty): 831-1.0052033-rdf.xml
RDF/JSON: 831-1.0052033-rdf.json
Turtle: 831-1.0052033-turtle.txt
N-Triples: 831-1.0052033-rdf-ntriples.txt
Original Record: 831-1.0052033-source.json
Full Text
831-1.0052033-fulltext.txt
Citation
831-1.0052033.ris

Full Text

AN INVESTIGATION OF THE RIDGE FUNCTION AS A PATTERN DESCRIPTOR FOR CHARACTER RECOGNITION by LACHLAN HUGH THOMSON BROWN B.A.Sc, University of B r i t i s h Columbia, 1963 A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in the Department of Computer Science We accept t h i s thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA August, 1969 In p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t o f the r e q u i r e m e n t s f o r an advanced degree a t the U n i v e r s i t y o f B r i t i s h C olumbia, I a g r e e t h a t the L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and Study. I f u r t h e r a g r e e t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g of t h i s t h e s i s f o r s c h o l a r l y purposes may be g r a n t e d by the Head o f my Department or by h i s r e p r e s e n t a t i v e s . I t i s u n d e r s t o o d t h a t c o p y i n g o r p u b l i c a t i o n of t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l not be a l l o w e d w i t h o u t my w r i t t e n p e r m i s s i o n . Department o f Computer S c i e n c e The U n i v e r s i t y o f B r i t i s h Columbia Vancouver 8, Canada Date September 1969 ABSTRACT The problem of character recognition is used as a vehicle for an investigation of the properties of a particular descriptor of planar patterns. The descriptor arises in the work of Connor [ 4 ] on the a c t i -vity of photosensitive receptors connected in a lateral inhibitory net-work. It results from a mapping of the planar pattern into a periodic function of one variable. The function is found to be a general descrip-tor of pattern contour which denotes extrema in contour curvature and is independent of pattern orientation. Based on conjectures regarding human perception of the elements of a pattern set representing alphabetic characters, a recognition machine based on a sequential decision strategy is developed and implemented. Results of the application of the strategy to a mixed-font environment show the pattern descriptor to be capable of high recognition rates. However, these anticipated results may be obtained only through the modi-fication of the current operating system. A number of such changes are proposed and their probable effects discussed. i i TABLE OF CONTENTS Page ABSTRACT i TABLE OF CONTENTS . . . 1 1 LIST OF TABLES i v LIST OF FIGURES v i LIST OF SYMBOLS USED ACKNOWLEDGEMENT . x i i 1. INTRODUCTION . . . 1 2. THE DESCRIPTOR SET . 5 2.1 The Area Operator . . . . 5 2.2 The Discrete Receptor F i e l d 8 2.3 The Ridge Function 11 2.4 Noise E f f e c t s and Selection of Operator Size . . 18 2.5 S e l e c t i o n of the Pattern Set 28 3. . RIDGE FUNCTION PREPROCESSING 30 3.1 Contour Length and Interpolation 30 3.2 Noise Suppression . . . 32 3.2.1 The Average Value Smoothing Algorithm 33 3.2.2 The Average Gradient Smoothing Algorithm . . . . 33 4. DECISION PROCESSES 38 4.1 The Codeword Decision Strategy 38 4.2 Results of the Codeword Algorithm .41 4.2.1 Abbreviated Pattern Set 41 4.2.2 Complete Pattern Set 45 11X TABLE OF CONTENTS (Continued) Page 4.3 A Sequential Decision Strategy . . . 52 4.4 The Threshold Detection Algorithm . . . 55 4.5 Results of the F i r s t Level of the Sequential Logic 66 4.6 Second Level Strategy of the Sequential Logic. . . 80 4.7 Results of the Second Level Strate gy 88 4.8 Results on the f u l l y Trained Machine 94 5. CONCLUSIONS 1 0 3 APPENDIX 107 BIBLIOGRAPHY 1 1 3 i v LIST OF TABLES Page Table 4.2.1 Codewords and t h e i r Character Class Associations f o r the Abbreviated Pattern Set 42 Table 4.2.2 Confusion Sets with Corresponding Codeword . . . . 45 Table 4.5.1 The Table MT Derived from the Set of Training Patterns 68 Table 4.5.2 Success of the Codeword Strategy by Subset Size -Training Mode 72 Table 4.5.3 Results of the Codeword Strategy by Character Class - Training Mode 74 Table 4.5.4 Identity Candidate Subsets with an Average f o r each Element and the Occurrence of each Subset as a Proportion of the Number of Samples from each C 1 - Training Mode . 75 Table 4.7.1 Success of Sequential Strategy by Subset Size -Training Mode 89 Table 4.7.2 Results of the Sequential Strategy by Character Class - Training Mode 91 Table 4.7.3 Weights which Result from the f u l l T r a i n i n g Set . . 93 Table 4.8.1 Results by Subset Size of the Sequential Strategy on the f u l l y Trained Machine - I d e n t i f i c a t i o n Mode. 94 Table 4.8.2 Results of the Codeword Strategy by Character Class - I d e n t i f i c a t i o n Mode 96 V LIST OF TABLES (Continued) Page Table 4.8.3 Table 4.8.4 Table 4.8.5 Identity Candidate Subsets with an Average s^ for each Element and the Occurrence of each Subset as a Proportion of the Number of Samples from each C 1 - I d e n t i f i c a t i o n Mode . 97 Results of the Sequential Strategy by Character Class on the f u l l y Trained Machine - I d e n t i f i c a t i o n Mode . . 99 Results of the Sequential Strategy by Font Style . . . 100 VI LIST OF FIGURES Page F i g . 2.1.1 Area Operator Value i n the Region of an Illumination Contour 6 F i g . 2.2.1 The Pattern from which the Receptor A c t i v i t y Functions of Figs. 2.2.2 and 2.2.3 are Derived. . . 9 F i g . 2.2.2 Receptor A c t i v i t y Function V n(x.,y, ) due to the Polygon of F i g . 2.2.1 . . . . ' 10 F i g . 2.2.3 Receptor A c t i v i t y Function V„(x.,y,) due to the £ X iC Polygon of F i g . 2.2.1 11 F i g . 2.3.1 Combined Receptor A c t i v i t y Function due to the Polygon of F i g . 2.2.1 . . 13 F i g . 2.3.2 Examples of Ridge Function Derived with Operator V^ . 15 F i g . 2.3.3 Examples of Ridge Function Derived with Operator V^ . 17 F i g . 2.3.4 Example of Ridge Function Derived with Operator. V^ with and without Inversion of Extrema Resulting from Concave Corners 18 F i g . 2.4.1 Examples of Contour Features which E f f e c t the Choice of Area Operator Size 19 F i g . 2.4.2 S i m i l a r i t y Between the Ridge Functions for D i f f e r i n g Patterns . . . . . 21 F i g . 2.4.3 Ridge Points Selected on Various Sections of Contour 22 F i g . 2.4.4 Receptor A c t i v i t y Function V„(x.,y.) With Noise J X xC Denoted 24 v i i LIST OF FIGURES (Continued) Page F i g . 2.4.5 High Noise Level and Poor S e p a r a b i l i t y which Result from 'Thin' Patterns 26 Fi g . 3.2.1 Ridge Function Noise Reduction by the Average Gradient Method 35 Fi g . 3.2.2 The Ridge Functions of F i g . 2.4.2 f u l l y Preprocessed . 37 Fi g . 4.2.1 Patterns of Like Character Class which Result i n Di f f e r e n t Codewords . . . . 44 F i g . 4.2.2 Examples of Patterns and Ridge Functions from the C - U Confusion Set 47 Fi g . 4.2.3 Examples of Patterns and Ridge Functions from the L - V Confusion Set 49 Fi g . 4.2.4 D i s t r i b u t i o n i n the Separations of Adjacent Peaks . . 51 F i g . 4.4.1 Ap p l i c a t i o n of the Threshold Detection Algorithm with m = 2 57 F i g . 4.4.2 Adjacent Peak Separations from F i g . 4.4.1 59 Fi g . 4.4.3 Flow Chart for the Trai n i n g Mode of Operation of the F i r s t Level of the Sequential Strategy 64 Fi g . 4.4.4 Flow Chart f o r the I d e n t i f i c a t i o n Mode of Operation of the F i r s t Level of the Sequential Strategy . . . . 67 Fi g . 4.5.1 Growth of the Table MT with Training 71 Fi g . 4.5.2 Examples of Confusions Caused by Poor Feature S e p a r a b i l i t y and Loss of Low Magnitude Peaks 78 v i i i LIST OF FIGURES (Continued) Page F i g . 4.5.3 Tra i n i n g Curve Giving the Success of the Codeword Decision Strategy for 6 = 0 . 3 80 F i g . 4.6.1 Flow Chart f o r the I d e n t i f i c a t i o n Mode of Operation of the Sequential Strategy 86 F i g . 4.6.2 Flow Chart f o r the Training Mode of Operation of the Sequential Strategy 87 Fig.-4.7.1 Training Curve Giving Success of the Combined Strategies 92 ix L I S T OF SYMBOLS USED a ( s ) t h e g e n e r a l r i d g e f u n c t i o n ( c o n t i n u o u s c a s e ) . ex.. ( s ^ ) t h e r i d g e f u n c t i o n due t o o p e r a t o r V.. ( d i s c r e t e c a s e ) . 3 t h e a r e a o p e r a t o r d i a m e t e r . 6 a f r a c t i o n o f t h e s c o r e s w h i c h d e f i n e s t h e t h r e s h o l d L . Y s 0 t h e t r a n s f o r m a t i o n from p a t t e r n space t o d e s c r i p t o r s p a c e . m t h e number o f components x ^ i n t h e v e c t o r X... -(f) ._.the.number .of r i d g e f u n c t i o n peaks d e t e c t e d . t h e i t ^ 1 c h a r a c t e r c l a s s o f t h e e n v i r o n m e n t . c. t h e number o f c o n t o u r s i n t h e p a t t e r n s o f t h e c l a s s C 1. l d s e p a r a t i o n between a d j a c e n t f e a t u r e s on t h e c o n t o u r . d g r i d s e p a r a t i o n c o n s t a n t , g d s e p a r a t i o n o f a d j a c e n t r i d g e f u n c t i o n peaks o f l i k e s i g n . K. D^ a v e r a g e d i f f e r e n c e v e c t o r r e l a t i n g an unknown i n p u t p a t t e r n w i t h t h e c l a s s C 1. PpjD^ l i s t s o f d^ f o r p o s i t i v e and n e g a t i v e peak p a i r s r e s p e c t i v e l y . e v a l u a t i o n o f t h e c l a s s C 1 as an i d e n t i t y f o r an unknown i n p u t , ev r i d g e f u n c t i o n v a l u e c o r r e s p o n d i n g t o s t r a i g h t c o n t o u r s e c t i o n s , f t h e number o f d i f f e r e n t f o n t s y l e s i n t h e e n v i r o n m e n t . f ^ , f _ ^ ( s ^ ) r i d g e p o i n t v a l u e a f t e r i n t e r p o l a t i o n , f ^ r i d g e p o i n t v a l u e a f t e r n o i s e r e d u c t i o n . f . . , f . t h e f r e q u e n c y o f o c c u r r e n c e o f k. i n t h e m p a s s e s o v e r t h e i n p u t f u n c t i o n w i t h i d e n t i t y known and unknown r e s p e c t i v e l y . ^ i ' ^ i L ' ^ i N a v e r a £ e g r a d i e n t computed from a s m a l l number of a d j a c e n t r i d g e f u n c t i o n p o i n t s . L I S T OF S Y M B O L S U S E D (Continued) a subset of the C 1 of candidates for the i d e n t i t y of an unknown input pattern. k. 3 the j*"* 1 codeword i n the table MT. L d ' L k magnitude thresholds f o r ridge function peak detection. LL' \ j l i m i t s set on the threshold L . a L s a threshold used for i n t e r p o l a t i o n and noise reduction (PP 32) L s a threshold used to define the subset I '(pp 65). 1. 3 a l i s t of t r i p l e t s i , p!., c , associated with each k. J i i J i n the table MT. M a parameter de f i n i n g window width f o r the gradients g_^ . MT a table derived from the t r a i n i n g set i n which are described associations between the k. encountered and the classes C 1. J m the number of passes made over each input ridge function. N the number of data points defining the ridge function a f t e r i n t e r p o l a t i o n . n the number of character classes i n the environment. ' p r o b a b i l i t y of occurrence of k^ i n the t r a i n i n g set given a sample from the cla s s C 1. p j ^ p r o b a b i l i t y , based on the t r a i n i n g set, that a given k_. r e s u l t s from a sample of the class C^ a source pattern of the cla s s C 1. P(k^) p r o b a b i l i t y of occurrence of the codeword k^ . . P(C X) p r o b a b i l i t y of occurrence of a sample of the class C 1. x i L I S T OF SYMBOLS USED (Continued) R the l i s t of k and f ^ a r i s i n g from the m passes over an input pattern from the class C 1 during t r a i n i n g . R' the l i s t R with input i d e n t i f i c a t i o n unknown. r the number of d i s t i n c t k. i n the l i s t R or E'. J s^ ridge points on the d i s c r e t e f i e l d (pp 8) . s_^  a score associating each C 1 with the unknown input (pp 65). s!^  ridge points a f t e r i n t e r p o l a t i o n . s, ridge points at which the ridge function has extrema. K. s the l a r g e s t score s^ occurring for a p a r t i c u l a r input. T.. an average reference vector,with components t, , which i s com-j i ° k puted from the X.. a s s o c i a t i n g the k^ with the c l a s s C 1. T a threshold used for peak merging, m T ,T„ l i m i t s on the threshold T . a 3 m T the period of the ridge function. V.,V.(s.) the.value of one of three area operators defined on e i t h e r the d i s c r e t e or continuous receptor f i e l d . V. the value of the area operator V . on an i l l u m i n a t i o n boundary, jmax j X. a vector of components x., each giving a d i f f e r e n t metric evaluation of the ridge function. ACKNOWLEDGEMENT I wish to extend my sincere thanks to Dr. J . S. MacDonald fo r the guidance and encouragement he has given me during h i s super-v i s i o n of t h i s t h e s i s . I would l i k e also to g r a t e f u l l y acknowledge the e f f o r t s of Dr. D. J . Connor i n supplying data from h i s own experi-ments. His many suggestions have proven invaluable. To Dr. J . M. Kennedy and Mr. A. G. Fowler, I o f f e r my deep gratitude f o r the f a c i l i t i e s they have made av a i l a b l e to me as well as f o r t h e i r arrangement for my f i n a n c i a l support through teaching a s s i s t a n t s h i p s . F i n a l l y , I am most appreciative of the generous f i n a n c i a l support which I have received at the hand of I. B. M. of Canada Ltd., i n the form of a Summer Fellowship. 1. I N T R O D U C T I O N The problem of automatic pattern or character recognition may be considered separable into two parts [9, 6]. Given an environment consisting of some pattern set the first consideration is the selection of some set of measurable descriptors a = {a . } J which characterize the environment in a way suitable for classification of its elements into a preassigned set of classes C = {c1}. This may be considered as a transformation a = 0(P) from the pattern space to the descriptor space [9], The second part of the problem requires the definition of decision processes which perform the classification on the basis of the a by assigning each to a C1. This division of the problem does not imply an independence between the component parts [9], In general, the decision strategies of the second part reflect the characteristics of both the environment and the descriptor set. A generalized pattern recognition system has been proposed which is composed of three units [ 7 ] . The first two perform the trans-formation 0 . In turn, they 1 ) input a source pattern from P and 2 subject i t to some p a r t i a l transformation and then 2) perform the bulk of the transformation into a f i n a l input pattern defined i n the descrip-tor space. These units w i l l be termed, r e s p e c t i v e l y , the simulator and the preprocessor. The input pattern i s supplied to the t h i r d , or de- c i s i o n u n i t , which i s the implementation of the decision process defined for the second part of the general recognition problem. This thesis presents an experimental i n v e s t i g a t i o n of a par-t i c u l a r d escriptor set a , l a r g e l y i n the context of the character recognition problem. The purpose of the work i s to study the d e s c r i p t i v e content of the chosen set a and how t h i s information may be extracted. The pattern descriptor on which the i n v e s t i g a t i o n centers evolves from the work of Connor i n a study of the v i s u a l system of the horseshoe crab, Limulus [4]. Based on a mathematical model developed f o r t h i s system, a simulator has been constructed to f a c i l i t a t e the study [5,3], The output from the machine i s a d i s c r e t e receptor a c t i v i t y function which, a f t e r preprocessing i n a general d i g i t a l computer, y i e l d s a set a , of ridge functions. The ridge function i s the r e s u l t of a mapping of the pattern contour into a function of one v a r i a b l e , the contour length. In the following chapter a one-to-one correspondence w i l l be shown to e x i s t between ridge function features and the points on the contour at which there occur l o c a l extrema i n curvature. Considerable work i n the f i e l d s of physiology and experimental psychology has indic a t e d the importance of curvature information to the operation of animal and the human v i s u a l systems [4,1]. Attneave, on hi s work with human subjects, observes, " I t i s evident that redundant v i s u a l stimulation r e s u l t s from e i t h e r a) an area of homogeneous colour (brightness) or b) a contour of homo-geneous d i r e c t i o n s or slope. In other words, information i s concentrated at those points on a contour at which i t s d i r e c t i o n changes most r a p i d l y , ( i . e . , at angles or peaks of curvature)" [1]. As a pattern descriptor s u i t a b l e for pattern c l a s s i f i c a t i o n , contour curvature has been shown to have merit. Masnikosa reports a successful implementation of a pattern recognition algorithm employing curvature [ 8 ] . Further, the d i f f e r e n t i a l geometry gives the r e s u l t that a planar curve i s defined uniquely and independently of coordinate system by i t s cur-vature function, provided the function i s parameterized by contour length alone. This arises from the s o l u t i o n i n the plane of the Serret-Frenet equations. Since i t contains information regarding contour curvature and, i n p a r t i c u l a r , extrema i n curvature, a case i s b u i l t , by means of the r e s u l t s reported above, for the i n v e s t i g a t i o n of the ridge function as a general pattern descriptor. I t s generality a r i s e s l a r g e l y from the fa c t that i t i s a function only of contour length and i s thus quite independent of pattern o r i e n t a t i o n . As w e l l , with one proviso, the choice of pattern set for the experiments i s completely open. The r e s t r i c t i o n , which i s , i n degree, dependent upon the simulator, i s that the patterns must e x h i b i t some 'thickness' or, i n other words, ' l i n e ' drawings are not acceptable. The pattern set chosen for the experiments reported i n t h i s thesis i s composed of block c a p i t a l alphabetic characters of a number of d i f f e r i n g font s t y l e s . This s e l e c t i o n of operating environ-ment i s a r b i t r a r y and i s based on convenience and on the large v a r i e t y 4 a v a i l a b l e i n font s t y l e s . In the large, the elements of the pattern set s a t i s f y the above r e s t r i c t i o n regarding pattern 'thickness' although some 'borderline' cases are encountered. Further discussion on the s e l e c t i o n of the pattern set w i l l be found i n section,2.5. A b r i e f d e s c r i p t i o n of those aspects of Connor's work which bear on t h i s thesis i s presented i n the following chapter. Included i s an abstr a c t i o n of the t h e o r e t i c a l r e s u l t s with some mention of the operation of the simulator and, as w e l l , a d e t a i l e d account of the ridge function. 2. THE DESCRIPTOR SET 2.1 The Area Operator Given an i l l u m i n a t i o n boundary impinging on an array of receptors connected i n a l a t e r a l i n h i b i t o r y network, i t has been shown that the receptor a c t i v i t y may be modelled by an area operator which i s small r e l a t i v e to the f i e l d of i l l u m i n a t i o n [4], For ease of pro-cessing, three area operators have been defined. The value of any operator at a point (x,y) of the receptor f i e l d i s V. = V.(x,y) 2.1.1 J 3 where the subscript, which may assume the values j = 1,2, or 3, denotes which of the three area operators i s under consideration. Provided the i l l u m i n a t i o n i n t e n s i t y at i t s center exceeds a preset threshold, the value of the operator, i n the case of the unit disk with point center, i s independent of i l l u m i n a t i o n i n t e n s i t y and i s proportional only to the d i f f e r e n c e between the i l l u m i n a t i o n i n t e n s i t y at the center and a weighted sum of i n t e n s i t i e s computed over the remainder of the area of the operator. This causes the V to be dependent only upon regions of contour i n i l l u m i n a t i o n i n t e n s i t y which l i e within the bounds of the operator. The properties of each area operator w i l l be discussed i n the neighborhood of an i l l u m i n a t i o n boundary incident on a continuous array of receptors. In F i g . 2.1.1 (a) a contour i s represented which l i e s separating two regions and having i l l u m i n a t i o n s i ^ < i2 respec-t i v e l y . A normal i s drawn to the contour at S. 6 - A (b) (c) F i g . 2.1.1 A r e a O p e r a t o r V a l u e i n t h e R e g i o n o f a n I l l u m i n a t i o n C o n t o u r . I f a r e a o p e r a t o r V ^ ( x , y ) i s c e n t e r e d a t A a n d t h e n m o v e d a l o n g t h e n o r m a l t o w a r d s A ' , i t s v a l u e c h a n g e s a s s h o w n i n F i g . 2.1.1 ( c ) . W h e n w h o l l y c o n t a i n e d i n t h e i l l u m i n a t i o n i n t e n s i t y i s c o n s t a n t o v e r t h e e n t i r e o p e r a t o r a n d t h u s i t s v a l u e i s z e r o . A s t h e a r e a o f i n t e r -s e c t i o n o f t h e o p e r a t o r w i t h i n c r e a s e s , i t s v a l u e i n c r e a s e s p r o -p o r t i o n a l l y u n t i l a m a x i m u m V l m a x " V ^ s 2 - 1 - 2 -i s a t t a i n e d a d j a c e n t t o t h e c o n t o u r . W h e n t h e c e n t e r o f t h e o p e r a t o r 7 enters the region of l e s s e r i n t e n s i t y i t s value, by d e f i n i t i o n , becomes zero. At the point S the area of i n t e r s e c t i o n of the operator, with the region not containing i t s center, achieves i t s greatest value. I f the area operator ^^(xjy) i s moved along the normal from A' -towards A, i t s value changes as shown i n F i g . 2.1.1 (b). The properties of t h i s operator are analogous to those of with the exception that i t i s non-zero i n the region of lesser i n t e n s i t y . Since, i n the region of S, M£ i s 'convex', the maximum values are V > V Imax 2max* I f , l o c a l to S, the contour were s t r a i g h t , then the two values would equal TJ- , the s t r a i g h t edge value. In t h i s way the value of the area operator i s c h a r a c t e r i s t i c of the contour curvature within unit distance of S. To define curvature sense, sections of a closed contour which have non-zero curvature and centers of curvature a l l l y i n g within the bounded region are termed l o c a l l y convex. Conversely, i f the centers of curvature a l l l i e outside the region, the contour i s l o c a l l y concave. Thus, i f M2 of F i g . 2.1.1 i s a bounded region, Vlmax >_~2 i f i s l o c a l l y convex, V lmax = ~2 ^ ^ ^S l°ca-'--'-y s t r a i g h t , and V lmax <_~2 i f i s l o c a l l y concave. I f the value V. i s derived for every normal to the closed contour, j max a continuum of points S r e s u l t . These are termed ridge points since they define a r i d g e l i n e i n the receptor a c t i v i t y function. In the following section t h i s r i d g e l i n e w i l l be demonstrated and shown to be 8 a descriptor of closed contours. 2.2 The Discrete Receptor F i e l d In order to study l a t e r a l i n h i b i t i o n and the area operator model, a s p e c i a l purpose computer has been constructed [3], This device w i l l serve as the simulator unit for the character recognition system to be investigated i n th i s t h e s i s . I t employs a fl y i n g - s p o t scanner to develop a 9x9 array of 'receptors', the center of which i s locatable at i n t e g r a l points (x.,y ) on a fixed r e c t i l i n e a r g r i d which forms the pattern f i e l d . On t h i s g r i d i s projected the s i l h o u e t t e of the source pattern. The t r a n s i t i o n from the continuous to the d i s c r e t e pattern f i e l d r e s u l t s i n an approximation to the continuum of ridge points S by a f i n i t e set {s^}. The elements of the l a t t e r set may deviate from the i l l u m i n a t i o n boundary by as much as the g r i d separation d , a value which w i l l remain f i x e d throughout the course of the work of the th e s i s . What e f f e c t t h i s has on the receptor a c t i v i t y function, i n the l i g h t of the r e s u l t s of section 2.1, i s demonstrated below. The simulator approximates the receptor a c t i v i t y function by a d i s c r e t e function ^ ( x ^ y ^ ) computed at each point of the quantized pattern f i e l d . To demonstrate t h i s function, the polygonal pattern of F i g . 2.2.1 i s processed by the simulator. The i l l u m i n a t i o n i n t e n s i t y i s considered to be greater within the closed contour than without. The corresponding receptor a c t i v i t y functions are shown i n F i g s . 2.2.2 and 2.2.3 f o r j = 1 and j = 2 r e s p e c t i v e l y . In agreement with e a r l i e r discussion, the function V 1(x.,y,) i s seen to be zero at every point F i g . 2.2.1 The Pattern from which the Receptor A c t i v i t y Functions of F i g s ; 2.2.2 and 2.2.3 are Derived. outside the contour, where the i l l u m i n a t i o n i n t e n s i t y i s minimal, and non-zero within the contour, i n the region of the greatest i n t e n s i t y . Each of the a c t i v i t y functions exhibits a continuous r i d g e l i n e which follows c l o s e l y the boundary of the pattern. In traversing the ridge-l i n e , three d i s t i n c t types of feature are encountered. These are 1) sharp l o c a l maxima which resemble peaks, 2) sections of h o r i z o n t a l ridge and 3) l o c a l minima. Reference to F i g . 2.2.2, for the function V 1 ( x . , y , ) , shows that the l o c a l maxima and the minimum correspond to corners i n the contour where the pattern i s l o c a l l y convex and l o c a l l y concave, r e s p e c t i v e l y . In F i g . 2.2.3 for ^ ( x ^ y ^ ) . t n e converse i s true. The h o r i z o n t a l sections of r i d g e l i n e i n both functions occur over regions i n which the contour i s l o c a l l y s t r a i g h t . Their average amplitude above the plane gives a measure of the s t r a i g h t edge value ev. Fig. 2,2.2 Receptor Activity Function ^ ( x ^ y ^ ) ' - due to the Polygon of Fig. 2.2.1. The. ridgeline i s defined by the set of ridge points {s^} whose locus provides an approximation to the pattern contour. It is apparent that the essential attributes of the area operator are preserved under the quantization invoked by the simulator. In addition, i t has become evident that the ridgelin« in the receptor activity function i s a pattern descriptor which preserves contour features at which the curvature has extrema. By traversing the ridge-line from an arbitrary starting point, a function of one variable, contour F i g . 2.2.3 Receptor A c t i v i t y Function V_(x.,y,) due to Polygon of F i g . 2.2.1. length, may be derived. The properties of t h i s , the ridge function, are discussed i n the following s e c t i o n . 2.3 The Ridge Function By t r a c i n g the locus of the ridge points, the ridge function a. = a . ( s . ) 1 = 1,2 or 3 2.3.1 j 3 i i s derived. In the continuous case, i t i s a function^ derived from the sequence of points V^max a n d i s > i t s e l f , continuous with the exception of at most one point. I t i s a descriptor of an i l l u m i n a t i o n boundary which i s dependent only upon the shape of the boundary. In p a r t i c u l a r , 12 i t e xhibits l o c a l maxima or minima i n correspondence to l o c a l extrema i n contour curvature. The function i s p e r i o d i c with period corresponding to the distance about the closed contour and the point at which the ridge t r a c i n g ' s t a r t s ' i s a r b i t r a r y . Thus, as a pattern descriptor, the ridge function i s independent of pattern o r i e n t a t i o n on the pattern f i e l d . Although the examination of Figs. 2.2.2 and 2.2.3 has shown these properties to carry over to the d i s c r e t e case, some dependence of the ridge function on pattern o r i e n t a t i o n does occur. This w i l l be d i s -cussed i n the next section. The extraction of the ridge function from the d i s c r e t e receptor a c t i v i t y function i s performed, through a ridge t r a c i n g technique, by the preprocessor unit of the generalized recognition system. While the l o c a l maxima on the r i d g e l i n e , as w e l l as the h o r i z o n t a l ridges, are r e a d i l y detectable, the l o c a l minima, because they define saddle points on the plane, are not. To overcome t h i s d i f f i c u l t y , the t h i r d , or combined, area operator V 3(x,y) = maxCV^x.y), V 2(x,y)) 2.3.2 i s defined. An example of the d i s c r e t e a c t i v i t y function due to t h i s operator i s shown i n F i g . 2.3.1. This, again, corresponds to the poly-gon of F i g . 2.2.1. The ridge function a^Cs^) exhibits l o c a l maxima i n correspondence to a l l extrema i n contour curvature. Unlike cc^(s^) and a 2 ^ S i ^ ' ^ t S v a-*- u e ^ s n e v e r l e s s than the s t r a i g h t edge value. The use of the combined operator requires that an extra b i t of information be kept for each s^ i n order to i d e n t i f y i t s source and thus d i f f e r e n t i a t e between ridge function features caused by convex extrema i n contour curvature and those due to concave extrema. The points which define Fig. 2.3.1 Combined Receptor Activity Function due to Polygon of Fig. 2.2.1. a^Cs^) are selected from either or depending upon which achieves the greater value in each small integrally spaced region along the contour. Where the contour is locally straight, this is equivalent to the selection of the points which l i e closest to the contour. The ridge functions a ^ ( s ^ ) a n ^ a 2 ^ s i ^ a r e » * n a sense, the inverse of one another. By subtracting the straight edge value ev from both and inverting a„(s.), a ridge function a(s_^) = (a^(s^) - ev) =-(a 2(s ) - ev) 2.3.3 r e s u l t s . This i s negative at a l l points corresponding to regions of contour i n which the pattern i s l o c a l l y concave. The function a(s^) may also be defined from a 3 ( s ^ ) by f i r s t subtracting the edge value and then making negative a l l points which were contributed by V^. An algorithm has been implemented which sel e c t s ridge points from the d i s c r e t e a c t i v i t y function and traces out t h e i r locus to provide approximations to both the contour and the ridge function. Figures 2.3.2, 2.3.3 and 2.3.4 show example outputs from the ridge t r a c i n g programme for the case of V^, V^, and r e s p e c t i v e l y . A prominent c h a r a c t e r i s t i c of each ridge function shown i n Figs. 2.3.2 and 2.3.3 i s t h e i r high noise content. The frequency of the noise varies from a low value giving the sawtooth:effect apparent i n Figs. 2.3.2 (a) and 2.3.3 (a) to a much higher value which causes sharp, low amplitude spikes. This v a r i a t i o n i n frequency implies some dependence of the noise content of the ridge function on the o r i e n t a t i o n of the contour on the pattern f i e l d . The dark spot on each pattern contour defines the point at which the trace was begun. By following the contour i n a clockwise d i r e c t i o n , while simultaneously moving from l e f t to rig h t on the ridge function, a one-to-one correspondence i s found between the features of the function and extrema i n contour curvature. In so traversing the contour and ridge function of F i g . 2.3.2 (a), the l a t t e r i s seen to o s c i l l a t e about zero while the s t r a i g h t back of the 'J* i s traced. When the curve at the bottom of the contour i s encountered, 15 (a) (b) F i g . 2.3.2 Examples of Ridge Function Derived with Operator V 1 the ridge function increases to a non-zero, p o s i t i v e average value. Then, due to the end-of-line feature of the pattern, the function exhibits two p o s i t i v e maxima or peaks which are c l o s e l y followed by the negative peak corresponding to the convex section of the 'hook' of the ' J ' . Beyond t h i s , the ridge function increases to an average magnitude of zero as the s t r a i g h t edge i s traversed i n approaching the second end-of-line feature and the end of the trace. Here a l s o , a p a i r of p o s i t i v e peaks occur. S i m i l a r l y , i n F i g . 2.3.2 (b), a close correspondence i s apparent between the magnitude and r e l a t i v e l o c a t i o n of ridge function features and the l o c a l extrema i n contour curvature. It i s notable, i n reference to the noise mentioned above, that the two shallow concave corners marked i n the figure cause ridge function features which are not pronounced r e l a t i v e to the adjacent noise spikes. Also, the removal of these corners from the pattern, by making them 1 6 increasingly.shallow, causes the pattern to become a manifestation of the character 'V'. Thus, i f a high noise l e v e l were to render such peaks undetectable, the ridge function may f a i l to adequately describe the. pattern for the purpose of character recognition. Further discussion of t h i s point i s pursued i n the next section. Examples of ridge function due to the operator are shown i n F i g . 2.3.3. These have not yet undergone the inversion implied i n equation 2.3.3. In t r a c i n g the con-tour of F i g . 2.3.3 (a) i n the d i r e c t i o n indicated, a f t e r f i r s t t r a v e r s i n g a short length of s t r a i g h t edge, two consecutive corners are encountered, the f i r s t being much less pronounced than the second. Corresponding to these are the f i r s t two negative peaks of the ridge function (they w i l l be p o s i t i v e i n the f i n a l version) of which the f i r s t i s of l e s s e r magnitude than the second. Further inspection of the ridge function v e r i f i e s the correspondence between the angle subtended at a contour corner and the magnitude of the r e s u l t i n g ridge function peak. The example of F i g . 2.3.3 (c) i s of two parts, one for each closed contour of the pattern. The point of union of the two ridge functions i s i n d i -cated. They are kept separate through the l o g i c of the system. In F i g . 2.3.4 i s shown the ridge function ^ ( s ^ ) before and a f t e r the i n v e r s i o n of points due to the operator V^. The l a t t e r can be seen to correspond to the regions of the contour which contain the concave corners. In comparing the ridge functions of F i g s . 2.3.3 (b) and 2.3.4 (b), a difference i s apparent i n the extrema of each which correspond to end-of-line features i n the contours; that i s , the l o c a l minimum which separates the two peaks corresponding to each end-of-line feature i s l e s s pronounced i n the former than i n the l a t t e r , where i t s Examples of Ridge Function Derived with Operator V„ 18 (a) (b) F i g . 2.3.4 Example of Ridge Function Derived with Operator with and without Inverstion of Extrema Resulting from Concave Corners value i s zero. This i s due to a combination of fa c t o r s , besides the choice of area operator, which i s discussed i n the following se c t i o n . 2.4 JNoise E f f e c t s and Selection of Operator Size . It i s most desirable that the transformation 0 from source pattern to input pattern leave notable d e t a i l of the former, i n some way, apparent i n the l a t t e r . In other words, a one-to-one correspon-dence i s required between a l l extrema i n contour curvature and detec-table features of the ridge function. To achieve t h i s , both r e l a t i v e magnitude and l o c a t i o n of features must be conserved. To aid i n the meeting of these requirements, the s i z e of the area operator may be chosen according to the properties of the pattern set at hand. The c r i t e r i o n on which the choice i s made w i l l be discussed with reference to f i g u r e 2.4.1 which i l l u s t r a t e s two regions from some pattern contour. 19 (a) (b) F i g . 2.4.1 Examples of Contour Features which E f f e c t Choice of Area Operator Size Each contour section exhibits two dominant features separated by a distance d . The ridge function for the pattern containing the rectangular section has two peaks corresponding to the 90° corners. These w i l l be separated by a l o c a l minimum of zero provided the diameter of the area operator 8 i s 6 4 d. . 2.4.1 In that case, the contour features are termed s t r i c t l y separable. If the condition i s not met, an area operator placed midway between the corners w i l l encompass them both and w i l l thus have a value with magnitude greater than the edge value. Therefore, the l o c a l minimum which separates the two ridge function peaks w i l l be non-zero. The pattern i n F i g . 2.3.3 (b) ex h i b i t s two pairs of adjacent extrema i n contour curvature, neither of which are s t r i c t l y separable. These form the end-of-line features shown at (1) and (2) i n the f i g u r e . Considering e i t h e r feature, i t i s observed that the condition of s t r i c t 20 s e p a r a b i l i t y i s not met. This i s by v i r t u e of the non-zero magnitude of the ridge function at the l o c a l minimum which l i e s between the two adjacent peaks. I f , for example, the magnitudes of the two peaks are h^ 4 ^ 2 ' a n < ^ t n a t °f the minimum i s ah^, then the degree of s e p a r a b i l i t y of the corners i n the contour from which the peaks a r i s e i s (1 - a). The contour section of F i g . 2,4.1 (b) poses a s i t u a t i o n i n which the ridge t r a c i n g programme may f a i l to traverse the e n t i r e contour. I f the diameter of the area operator i s chosen greater than d, the ridge point values in the neighborhood of e i t h e r corner w i l l be influenced by the existence of the other. At some value of 3/d > 1, the separation of the two sections of contour w i l l become i n d i s t i n g u i s h a b l e to the programme. In section 2.3 the noise content of the ridge function was observed and i t s possible relevance to the r o l e of the ridge function as a pattern descriptor acknowleged. Reference to F i g . 2.4.2 serves to r e i t e r a t e these points.' A n t i c i p a t i n g a necessity to automatically detect ridge function features, i t i s apparent that, i n the regions (1) and (2) of the f i g u r e , i t may prove d i f f i c u l t to detect, i n some cases, the desireable peaks shown i n the upper figure without some spurious pick-up from the function i n the lower f i g u r e . The great s i m i l a r i t y between the ridge functions i s r e a d i l y seen. The ridge function noise i s a d i r e c t consequence of the quantization of the pattern f i e l d . In order to i l l u s t r a t e the cause of the noise on the input pattern, the ridge points selected by the programme along four sections of contour are shown i n F i g . 2.4.3. • 21 ( 2 ) — \ — { I ) I s fcS Fig, 2.4,2 S i m i l a r i t y Between the Ridge Functions for D i f f e r i n g Patterns With the exception of F i g . 2.4.3 (b), a l l r e s u l t s are due to operator V^. The exception i s due to ei t h e r or V^. The s t r a i g h t contour of Fig. 2.4,3 '.(b) has orientation s i m i l a r to those of the l e t t e r s ' J ' of Fig. 2.3.2 and 'K' of F i g . 2.3.3. The programme, having selected an 22 F i g . 2.4.3 Ridge Points Selected on Various Sections of Contour 23 i * " * 1 ridge point s^, finds the next, s^ +^, by conducting a s p i r a l l i n g sweep of the region surrounding and s e l e c t i n g the f i r s t point en-countered which s a t i s f i e s c e r t a i n magnitude c r i t e r i a . The extent of the sweep i s l i m i t e d and thus points are selected intermediate to the three, s , s, and s , which would best approximate the contour. Since a b c r a l l points are confined to the g r i d l i n e s of the pattern f i e l d , the locus of the ridge points must follow a g r i d l i n e u n t i l the one adjacent to i t 'appears' from behind the pattern. Since the l a t t e r l i n e l i e s c l o s e s t to the contour, the locus changes to i t and follows i t u n t i l yet another g r i d l i n e i s encountered, within the l i m i t s of the sweep, which l i e s even closer to the contour. The value of the area operator i s proportional to the distance separating i t s center from the contour and thus, i n traversing the locus of the ridge points, commencing with point s^ of F i g . 2.4.3 (b) , the corresponding section of the ridge function decreases monotonically u n t i l such time as the locus changes grid l i n e s . When th i s occurs (at points s^ and s c ) i t i s accompanied by a sharp increase i n ridge function magnitude. This, then, corresponds to one cycle of the sawtooth noise which was noted i n Figs. 2.3.2 (a) and 2.3.3 (a). Figure 2.4.4 provides an example of t h i s noise form i n the r i d g e l i n e of the receptor a c t i v i t y function V..(x.,y, ) for a pattern representing the character *H'. As X X K. the i n c l i n a t i o n of the s t r a i g h t contour of F i g . 2.4.3 (b) i s increased towards 45°, the apparent frequency of the noise increases. This i s because the g r i d l i n e s are followed f o r an in c r e a s i n g l y shorter distance before an adjacent l i n e becomes a v a i l a b l e i n closer proximity to the contour. 24 Fig. 2.4.4 Receptor Activity Function V»(x.,y.) j X K. vith Noise Denoted. Along straight sections of contour oriented either parallel .to or at 45° to the grid lines, the ridge points l i e near equidistant from the contour. This fact results in relatively smooth sections of ridge function. In-regions of non-zero curvature, a^(s) l i e s always on the side of the contour nearest the evolute of the curve. Along straight edges, cross-overs occur with frequency dependent upon contour orien-tation. The effect of these on the ridge function depends on the relative proximity of the contour to the ridge points on either side. The 25 r e s u l t i n g noise i s minimal when these sets of points are equidistant from the contour. Unless t h e i r separation i s an i n t e g r a l multiple of the g r i d separation, p a r a l l e l edges i n a pattern r e s u l t i n ridge points of un-equal values since the edges l i e at unequal distances from adjacent g r i d l i n e s . S i m i l a r l y , the ridge function peaks corresponding to l i k e corners i n a contour are generally of varying magnitude for the reason that the corners usually l i e at d i f f e r e n t distances from the nearest sampling points on the pattern f i e l d . This e f f e c t i s equivalent to d.c. noise. The noise l e v e l i s reducible through an increase i n the s i z e of the area operator so that i t encompasses a greater number of sampling points on the pattern g r i d . Thus, i t i s evident that simultaneous im-provement i n feature s e p a r a b i l i t y and noise l e v e l may be achieved only by a reduction i n g r i d spacing, a recourse unavailable to t h i s work. The compromise to be made between these two considerations i s demon-strated i n Figure 2.4.5 which i l l u s t r a t e s the r e s u l t s obtained from very ' t h i n ' patterns. If the area operator diameter i s made small, so as to render a l l contour features separable, the noise l e v e l may become s u f f i c i e n t l y large so as to obscure desireable peaks of low magnitude and/or give r i s e to the detection of spurious peaks. On the other hand, the choice of a r e l a t i v e l y large area operator diameter, causes s u b s t a n t i a l reduction i n noise but r e s u l t s i n a low degree of s e p a r a b i l i t y of adjacent features. In addition, r e l a t i v e l y shallow contour features stand to be overlooked e n t i r e l y . 26 27 A reduction i n the quantization l e v e l of the pattern f i e l d , through the r e s u l t i n g reduction i n the d.c. noise l e v e l , w i l l cause the amplitude of the ridge function, at function extrema, to be a more s e n s i t i v e measure of the curvature or angle subtended at the corres-ponding contour features. Thus, by a decrease i n gri d spacing, r e l a t i v e to pattern dimensions, peak amplitude w i l l , i n the large, increase and the s i t u a t i o n i l l u s t r a t e d i n Figs. 2.4.2 and 2.3.2 (b) may be considerably improved by the combination of the two b e n e f i c i a l e f f e c t s : reduction of noise l e v e l of a l l frequencies and increase i n peak magnitude. The discussion to date has reviewed a number of r e s u l t s from Connors' work. Paramount i s 'the correspondence between the turning points of the ridge function and the outstanding features i n the i l l u m i n a t i o n boundary from which the function i s derived. E f f e c t s due to the quantization of the pattern f i e l d are foreseen to degrade the success of the ridge function i n the r o l e of a pattern descriptor. These may be overcome through a modification to the simulator. Before turning to the f i n a l stages of the preprocessor unit of the recognition system, the pattern set used for the experiments reported i n t h i s thesis w i l l be defined. 28 2.5 'Selection of the Pattern Set The ridge function has been shown to be a general pattern descriptor. In theory, i t may be equally well applied to any pattern set with the s i n g l e proviso given i n the introduction. That i s , that the patterns must exhibit s u f f i c i e n t 'thickness', r e l a t i v e to the g r i d separation d , to insure a high degree of s e p a r a b i l i t y of a l l prominent contour features. Due to the p r e v a i l i n g l e v e l of quantization of the pattern f i e l d , best r e s u l t s are to be expected from a pattern set which i s characterized by s t r a i g h t edges which i n t e r s e c t at well defined corners. This i s not a stringent requirement and i t may be relaxed provided s u f f i c i e n t sampling density over the pattern f i e l d i s possible. The pattern set used f o r the experiments reported i n t h i s thesis i s composed of block c a p i t a l alphabetic characters of 10 d i f f e r i n g font s t y l e s of the Letraset s e r i e s . A l l are s a n s - s e r i f . The choice of pattern set i s based on convenience, a v a i l a b i l i t y and on the great v a r i e t y which e x i s t s among font s t y l e s . Samples from t h i s environment are shown i n the Appendix. The set of input patterns on which the decision unit of the recognition system operates i s obtained by processing once, i n the simulator and the preprocessor, each of the 26 characters of each type s t y l e . A small number of the r e s u l t s are unacceptable for the reasons given i n section 2.4 i n conjunction with F i g . 2.4.1 (b). In a l l , 234 patterns make up the environment. Comment w i l l be offered i n a l a t e r s e ction regarding the adequacy of t h i s quantity. Before the patterns of a p a r t i c u l a r font s t y l e are processed, they are normalized, each by the same fa c t o r , so that at least one of 29 t h e i r number f i l l s the 64 x 128 unit gri d of the pattern f i e l d i n e i t h e r coordinate d i r e c t i o n . Thus, the normalization factor varies between the elements of each of the character classes. It i s apparent that the recognition machines to be used i n t h i s study of the ridge function w i l l be operating across d i f f e r i n g font s t y l e s rather than a number of samplings of a s i n g l e font s t y l e . No test w i l l be made of the consistency of the simulator. The e x t r a c t i o n of the ridge function from the receptor a c t i v i t y function accounts f o r the greater part of the preprocessing stage of the generalized pattern recognition system proposed i n the introduction. I t also marks the i n t e r f a c e between the work of Connor and that of t h i s t h e s i s . Unlike most p r a c t i c a l systems, the one on which t h i s work i s based has no feedback l i n k from the preprocessor or d e c i s i o n units to the simulator. A l i n k of t h i s nature provides for the extrapolation towards optimum values of simulator parameters for any given environment. Also, the techniques of averaging or cross-c o r r e l a t i n g the input patterns derived from repeated processing of each source pattern become a v a i l a b l e . This would allow a s u b s t a n t i a l reduction i n the quantization e f f e c t s . In l i e u of these c a p a b i l i t i e s , further preprocessing i s employed i n the form of data smoothing. This w i l l be a subject of discussion df the next chapter. 30 3. RIDGE FUNCTION PREPROCESSING 3.1 Contour Length and Int e r p o l a t i o n From a sequence of ridge points s^ defined on a r e c t i l i n e a r g r i d with corresponding values V\ (s^) j = 1,2 or 3, the ridge function i s obtained (equation 2.3.3) only a f t e r approximations are made to a nominal edge value and to the length of the contour. Related to the l a t t e r i s the p a r t i c u l a r need to e s t a b l i s h the r e l a t i v e l o c a t i o n of the ridge function features. Also, i n a n t i c i p a t i o n of l a t e r requirements, the ridge function i s redefined i n terms of equally spaced points. The most successful choice of edge value for equation 2.3.3 has proven to be an average, computed for each pattern, over a l l points with magnitude l e s s than an a r b i t r a r y value. This value i s chosen to provide a compromise between the high noise l e v e l which accompanies too low an edge value and the loss of pertinent information when i t i s too high. The contour length i s approximated simply by the sum of the distances separating adjacent ridge points: s t = * I s r s i - i l i=2 - .V ( x i " x i - i ) 2 + ( y i - y i - i ) 2 ] 1 / 2 3-1-1 1=2 where n i s the number of points which define the function. This approxi-mation to the true pattern contour i s i l l u s t r a t e d i n F i g . 2.4.3 (c). In the context of the a p p l i c a t i o n of the ridge function to pattern recognition, 31 the error incurred by t h i s approximation i s n e g l i g i b l e . This assertion follows from the v a r i a t i o n which ex i s t s between samples of l i k e pattern and from the d i s t r i b u t i o n of the ridge points on the contour. The l a t t e r i s sparse i n regions of s t r a i g h t edge where cross-over may occur and i s bunched i n regions of high curvature where the locus of points i s always on one side of the contour (Fig. 2.4.3). This arises as an i n t r i n s i c feature of the ridge t r a c i n g programme. To e s t a b l i s h equally spaced data points, a modified f i r s t order Lagrangian i n t e r p o l a t i o n i s used. A number of points, N, i s selected which exceeds the s i z e of the l a r g e s t input data set. From t h i s an increment As = S /N 3.1.2 i s derived so that the l o c a t i o n of the i generated data point i s Provided s. 1 = i*As, 3.1.3 I s . , <• s . ' < P . 3.1.4 J-1 = i =• j then the value of the function at s.' i s l 8.* ~ S f. = f ( s . ' ) = — L i • a(s.) 1 1 S . - S . T J J J-1 s. - s , ' + - J — • a(s. .) 3.1.5 s . — S , i 1-1 J J-1 j=l,n 1=1,N 32 In order to preserve i t s information content, i t i s important that the preprocessing of the ridge function does not degrade the amplitude of i t s extrema. Unless N i s chosen very large, t h i s w i l l generally r e s u l t from the i n t e r p o l a t i o n but i t may be minimuzed by a modification to the standard procedure. This change involves the detection of each l o c a l extremum i n ridge function magnitude and the assignment of t h i s value to the nearest point among those which have been generated by means of the l i n e a r i n t e r p o l a t i o n . A threshold L g i s established at approximately 40% of the p o s i t i v e maximum of the function. During the i n t e r p o l a t i o n , points s, are detected such that °^ Sk-l^ < 0 1 ^ k"* > °^ Sk+l^ 3.1.6 and l « ( s k ) | > L s The magnitude ( s^) l s then assigned to the nearest neighbor to s^ among the generated s^'. This introduces a s h i f t of less than As/2 i n the l o c a t i o n of extrema i n the ridge function. However, t h i s inherent error i n the d e f i n i t i o n of feature l o c a t i o n i s small r e l a t i v e to the error which arises through the approximation involved i n equation 3.1.1. The success of the method i n maintaining ridge function peak amplitude during the i n t e r p o l a t i o n i s v e r i f i e d by experimental r e s u l t s . 3.2 Noise Suppression The detection of ridge function features has been anticipated to be a requirement of the decision processes to be discussed i n a l a t e r chapter. This may prove d i f f i c u l t i n the face of small amplitude peaks 33 which are e a s i l y confused with noise or spurious spikes. A stringent requirement of a l l preprocessing i s that neither peak amplitude nor r e l a t i v e l o c a t i o n be degraded. In the implementation of the two noise suppressing algorithms to be discussed below, the former consideration i s paramount, e s p e c i a l l y i n regions having low amplitude peaks. 3.2.1 The Average Value Smoothing Algorithm A 'running average' technique i s used which employs a window that i s passed over the ridge function. Provided t h e i r value i s less than a f i x e d threshold, the fun c t i o n a l points within the window are averaged and the average assigned to the cen t r a l point. Experiments have been performed i n making t h i s assignment with each of the following provisos: 1. that the difference between the f i r s t and l a s t points within the window i s les s than some threshold and, 2. that a fi x e d number of alternations of gradient sign occur within the window, where the gradient i s based on two adjacent points. The window s i z e i s approximately 45% of the greatest value attained by the ridge function i n height and 2% of the contour length T i n width. I t i s symmetric about the abscissa. A f t e r three or more passes over the function by the programme, i t s e f f e c t becomes n e g l i g i b l e . A s i m i l a r technique has been used to advantage on continous waveforms having a superimposed noise of a p a r t i c u l a r character [2], 3.2.2 The Average Gradient Smoothing Algorithm The 'running average gradient' technique i s s i m i l a r to the 34 preceedirig with the exception that, rather than average ridge point values, an average gradient g, i s computed within the bounds of the til window. If the i ridge point i s the f i r s t contained by the window, at any l o c a t i o n on the abscissa, then i+M 8 i = i 1 ( W 7 ^ - ^ i=l,N-M 3.2.1 j=i+l J where M i s the width of the window. A value i s then assigned to the ( i + l ) S t point of f i i l = I t f i + ( w l g i + W 2 % - l ) + W i = 1> N- M 3'2-2 where w^  and are weights which are defined below. P r i o r to the a p p l i c a t i o n of these algorithms to the ridge function, the function i s rotated (undergoes a phase s h i f t ) so as to b r i n g a p o s i t i v e peak co-incident with the o r i g i n and thus f. > L N - M < i < N . 3.2.3 i s = = In achieving the objectives set out above, the l a t t e r method proves to be the more succe s s f u l . This i s a t t r i b u t e d to the ' i n e r t i a ' induced by the weighted averaging of the mean gradients g^ and Best r e s u l t s are obtained with the r a t i o of weights w^/w^ ~ 3/2. The very l i m i t e d success achieved by either smoothing method may be explained by considering the ridge function analogous to a continuous p e r i o d i c waveform on which there i s superimposed a noise s i g n a l . The l a t t e r has dominant frequency components coincident with some of those c h a r a c t e r i s t i c of the main waveform. The noise, then, cannot be removed without de-grading the desired s i g n a l . In F i g . 3.2.1 i s shown a ridge function before and a f t e r noise reduction by the second method. The example i s 35 F i g , 3.2.1 Ridge Function Noise Reduction by the Average Gradient Method t y p i c a l of patterns containing smooth arcs of non-zero but low curvature, I t i s evident that, given a s i n g l e sample of the ridge function for a p a r t i c u l a r pattern, the noise content cannot be markedly reduced. However, provided the c a p a b i l i t y to multiply process the source pattern a t various positions on the pattern f i e l d , the ridge function noise would be found to be non-stationary. Thus, by means of an averaging or 36 c o r r e l a t i o n technique, not only could the noise l e v e l be reduced but des i r a b l e , low magnitude peaks could be enhanced. Since t h i s technique i s not a v a i l a b l e i n the system at hand, the second noise reduction method described above w i l l be used i n the preprocessing of a l l functions used as input to the decision u n i t . Figure 3.2.2 shows further examples of the f u l l y preprocessed input pattern. These are the f i n a l form of the ridge functions shown i n F i g . 2.4.2. I t w i l l be noted, i n comparing the figures, that the small negative peaks (regions (1) and (2) i n F i g . 2.4.2) i n the ridge function associated with the character 'Y' are retained with l i t t l e degradation of peak amplitude. Much of the noise on the ridge function corresponding to the character 'V' remains and with maximum amplitude comparable to the peaks j u s t mentioned. What confusion that may a r i s e due to th i s condition w i l l be seen i n the next chapter. 38 4. DECISION PROCESSES 4.1 The Codeword Decision Strategy Before considering the problem of character recognition, a preliminary i n v e s t i g a t i o n w i l l be reported. The object of the study, which finds i t s d i r e c t i o n i n the conjecture of Attneave quoted i n the introduction, i s to determine the properties of the classes into which the environment i s p a r t i t i o n e d by a decision process which employs a pattern descriptor that represents the elements of the environment i n terms of the number, type (convex or concave), and sequence of the extrema i n curvature which are encountered as t h e i r contours are traversed. This information i s obtained from the ridge function by detecting i t s turning points. No metric evaluation of the ridge function i s made. The r e s u l t s of t h i s i n v e s t i g a t i o n are expected to cast some l i g h t on a possible approach to the character recognition problem and, further, to test the ease with which pertinent information may be extracted from the ridge function. The ridge function w i l l be abstracted i n a codeword which i s an ordered sequence of b i t s with l ' s and O's corresponding to p o s i t i v e and negative features r e s p e c t i v e l y . A peak detection algorithm i s implemented which makes a s i n g l e pass over each input pattern and from this generates a s i n g l e codeword k_.. For ease of i d e n t i f i c a t i o n , each pattern w i l l be named according to the character class i t represents. A t o t a l of 5 font s t y l e s w i l l be sampled to y i e l d 119 input patterns. Each character class i s represented and a l l i n equal proportion to one another. 39 Associated with each character class C 1 , one or more codewords w i l l occur with p r o b a b i l i t y P i j = P ( k j ' i : > -t i l This i s the p r o b a b i l i t y that, given a pattern which represents the i character c l a s s , i t w i l l r e s u l t i n the j t l a codeword. The set of patterns upon which these s t a t i s t i c s are based w i l l be known as a t r a i n i n g set. Conversely, associated with each k. encountered i s a l i s t 1. of proba-3 3 b i l i t i e s P'... This l i s t i s of the form 1. = {p! . , c.} i = l , n 4.1.2 3 * J i i where c. i s the number of closed contours associated with patterns of the x class C 1 and n i s the number of C 1 contained by the environment. The p r o b a b i l i t y p'.. = P(i|k.) = 4 # 1 . 3 *jx 1 3 n •I p!. i-1 J 1 i s the ' l i k e l i h o o d ' that, given no further information, the input pattern from which a codeword k^  has been derived i s of the class C 1. The peak detection algorithm i s of the curve following v a r i e t y , which detects the turning points i n the ridge function by noting a l t e r n a -tions i n l o c a l gradient sign. I t i s s i m i l a r to the noise reduction algorithm described i n section 3.2.2. In the d e s c r i p t i o n of the algorithm to follow, a number of constant thresholds, L,, L, and L , are used. ' d k m These w i l l be defined below. Two running average gradients are maintained. One, the gradient g. T, i s computed (equation 3.2.1) for a window of width M = T , a value to be defined below. The window i s moved along the abscissa from i = 0 to a point at which 40 g . L > L d 4.1.4 For a second window of width M = T^, the gradient g i s computed for j increasing from j = i to a point at which 8 j N < - L d 4>1.5 I f the condition g i L * g j N * \ 4 ' 1 - 6 holds, then the point s^ i s accepted as defining a peak candidate. I f furthermore | f ( s ! ) | > L 4.1.7 j m then the candidate i s considered a v a l i d peak. This process i s repeated u n t i l one e n t i r e cycle of the ridge function has been covered. The number of data points contained by the windows i s a de-creasing function of ridge function magnitude. The constants used above to define window width are a r b i t r a r i l y chosen to be T L = F ( f | ) , ' T N = F(f^) 4.1.8 where F(f.p = .02T - int ( / J f J 7 Y and T i s the ridge function period. This choice i s based on the d i f f e r i n g nature of high and low magnitude peaks. While the former are steepsided and w e l l defined, the l a t t e r tend to be broad with a r e l a t i v e l y high noise content. The various thresholds are fi x e d on the basis of experi-mental r e s u l t s . S a t i s f a c t o r y r e s u l t s are achieved with L,, = 0.80; L = 1.6; L = 0.15«max(f) j = 1,N 4.1.9 d k m 3 41 4.2 Results of the Codeword Algorithm 4.2.1 Abbreviated Pattern Set The implementation of the algorithm proves most successful on ridge functions having a low noise content and well defined l o c a l extrema, namely those corresponding to source patterns which consist only of i n t e r -s ecting s t r a i g h t edges. The characters represented by t h i s class of pattern are found i n the left-hand column of table 4,2.1. Ridge functions not associated with t h i s class r e s u l t i n the detection of spurious peaks when programme thresholds are lowered to allow the detection of desirable low magnitude peaks. Results are divided into two parts. F i r s t , for the 13 characters associated with patterns of the former c l a s s , with the character p a i r s N-Z and M-W considered to be manifest i n i d e n t i c a l patterns, r e s u l t s are given i n table 4.2.1. The second part, the r e s u l t s for a pattern set containing elements from a l l 26 character classes, w i l l be presented i n the following section. The abbreviated pattern set consists of 76 patterns from which are generated 22 unique codewords with an error rate, based upon the t o t a l number of k^ generated, of 17%. This f i g u r e includes those k^ which are inconsistent within themselves and f a i l to acknowledge a l l prominent contour features. The case of a pattern which contains two s i m i l a r and diametric features, of which only one i s detected, exemplifies such an inconsistency. The error rate breaks down to 11.0% due to poorly separated end-of-line, features, 4,6% from f a i l u r e to detect low magnitude peaks and 1.4% due to detection of spurious peaks caused by noise. 42 Character Codeword Occurrence Candidate and P r o b a b i l i t y Codeword k. P • • c 1 p! • Erroi J A 11111001 .40 A 1.00 1111001 .60 A .57 (J .24) (u .19) E 111100110011 1.00 E 1.00 F 1111100110 1.00 F 1.00 H 100111100111 1.00 H 1.00 I 1-111 1.00 I .59 (c .29) (0 .12) K 111001101101 .20 K 1.00 11101101101 .60 K 1.00 1010101 .20 K 1.00 AAA L 111110 .80 L .68 Y .18 V .14 1111110 .20 V .46 Y .36 L .18 *** M,W 1101111011110 .20 M .50 W .50 1110101110 .60 M .50 W .50 11010110 .20 M .50 W .50 AAA N,Z 1011110111 .50 N .50 z .50 10111011 .33 N .50 z .50 1001111011 .17 N .50 z .50 AAA T 11101101 1.00 T .83 Y .17 V 111110 .17 L .68 Y .18 V .14 1111110 .50 V .46 Y .36 L .18 11110 .33 (J .51) V .23 (U,P .13) *** X 110110110110 .80 X 1.00 1101010110 .20 X 1.00 Y 111110 .20 L .68 Y .18 V .14 *** 10110110.1 .20 Y 1.00 1111110 .40 V .46 Y .36 L .18 *** 11101101 .20 T .83 Y .17 • AAA Table 4.2.1 Codewords and t h e i r Character Class Associations for the Abbreviated Pattern Set. • . ' 43 An attempt to c l a s s i f y the source patterns on the basis of the largest p!. e 1. gives J 1 J 56.9% correct r e s u l t s by character class with r>! . = 1.00 and J 1 31.9% correct with p!. < 1.00 for a t o t a l of 88.8% of the abbreviated pattern set c o r r e c t l y i d e n t i f i e d . Inspection of the p'. i n each l i s t 1. shows the errors i n the r e s u l t s to involve only the characters ' L ' , 'V' and 'Y'. The error rates by character class are L - 15% V - 50% Y - 80% These figures are the percentages of the t o t a l number of samples from each character class i n c o r r e c t l y i d e n t i f i e d . Table 4.2.1 shows that 8 of the 13 characters are associated with two or more k_.. Why many of these associations a r i s e may be seen i n F i g . 4.2.1 which contains a number of d i f f e r e n t patterns with the codeword f or each. In the f i g u r e , examples b) and c ) , e) and f) as well as j ) and k) are s i x d i s t i n c t patterns associated with three character classes. Each pattern r e s u l t s i n a d i f f e r e n t k ^. Patterns a) and b) as we l l as d) and e) are e s s e n t i a l l y the same i n t h e i r contour features, however they r e s u l t i n four d i f f e r e n t codewords. Patterns g), h) and i) f a l l i nto t h i s category as w e l l . The reasons for t h i s p r o l i f i c generation of codewords i s f i r s t l y , the poor s e p a r a b i l i t y of the features at (1) i n example a) and secondly, the shallow angles at (2) i n examples d), g) and i) which cause low magnitude ridge function peaks. In summary, a (1) 111101111010 (a) 1111011110110 (b) 1110111010 . ( c ) 1101111 110111 (j) 0 0 F i g . 4.2,1 Patterns of Like Character Class which Result * i n D i f f e r e n t Codewords. 45 m u l t i p l i c i t y of codewords corresponding to one character class r e s u l t s from a) character manifestation i n more than one d i s t i n c t pattern, b) contour features which are poorly separated or c) contour features which r e s u l t i n very low magnitude ridge function peaks that are not detected. 4.2.2 Complete Pattern Set The complete pattern set of 119 input patterns, representing a l l 26 character classes, r e s u l t i n 40 d i f f e r e n t codewords which allow 53% of the pattern set to be c o r r e c t l y i d e n t i f i e d with p!. = 1.00 and ... J 1 another 18% with p ^ < 1.00. The 29% error rate l a r g e l y involves samples from character classes which are named i n the same 1. with near equal J p!.. This implies that the same k. occurs f o r each of these classes with near equal P^j* Such groups of character classes are termed confusion  sets. Examples of these are found i n table 4.2.2. The aptness of t h i s terminology i s demonstrated with Codeword Confusion Set 11011 c, u 1111 c, I 11110 J, u, v* 111011 v, Y* 1011 u, G* 1111110 v, Y* L 1111001 u, J * indicates associations due to erroneous codeVord generation. Table 4.2.2 Confusion Sets with Corresponding Codeword 46 reference to.Figs. 3.2.2 and 4.2.2 which show sample patterns associated with these sets along with t h e i r corresponding ridge functions. The confusion between the characters 'V and 'Y' has been found to a r i s e from a loss of d e t a i l i n processing some patterns associated with the l a t t e r character c l a s s . F i g . 4.2.2 shows two d i f f e r e n t patterns for each of the characters 'C and 'U1. The s i m i l a r i t i e s between the two character classes i n t h e i r ridge functions leads to the confusion i n t h e i r i d e n t i -f i c a t i o n by means of the codeword algorithm. Thus, the elements of the confusion sets are generally manifest i n patterns which have s i m i l a r i t i e s and whose ridge functions are likewise s i m i l a r . In t h i s category, the N-Z and M-W p a i r s have already been acknowledged. To these are now added the pairs C-U, C-I, J-U and L-V. Were i t a v a i l a b l e , information regarding the o r i e n t a t i o n of the source pattern r e l a t i v e to the pattern f i e l d g r i d would render many of the confusion sets r e a d i l y separable. This, then, would greatly enhance the r e s u l t s when viewing the problem as that of character recog-n i t i o n . Since, however, the purpose of these experiments i s p r i m a r i l y to i n v e s t i g a t e the r o l e of the ridge function as a general pattern descriptor, no attempt i s made to define pattern o r i e n t a t i o n . Results are presented according to character class for the sake of convenience. A number of conclusions are drawn on these r e s u l t s . For one, given a pattern set selected from the environment defined i n section 2.5 i n which a l l C 1 occur with equal p r o b a b i l i t y , the codeword strategy, together with information regarding the number of closed contours contained i n each pattern, allows a recognition rate of over 70%. The error rate F i g . 4.2.2 Examples o f Patterns and Ridge Functions f r o m the C-U Confusion Set. 48 i s l a r g e l y a t t r i b u t a b l e to a correctable state of the simulator; the degree of quantization of the pattern f i e l d . The portion of the errors that does not f a l l into t h i s category i s due to the manifestation of d i f f e r e n t characters i n very s i m i l a r patterns. To overcome the l a t t e r , a decision process based upon some metric information derived from the ridge function i s required. This need i s emphasized by reference to F i g . 4.2.3 which shows two sample patterns representing the characters 'L' and 'V' along with the corresponding ridge functions. The k.. derived from the two patterns are i d e n t i c a l . Besides o r i e n t a t i o n , the most notable differences between the patterns are i n the one short 'arm' of the 'L' and i n the angle subtended by the two 'arms'. This information i s con-tained by the ridge functions i n f i r s t , the r e l a t i v e l o c a t i o n of the peaks and, secondly, i n the r e l a t i v e peak amplitude. For instance, the si n g l e negative peak caused by the convex angle of each pattern i s centered, i n the case of the 'V, between the two adjoining p o s i t i v e peaks while, i n the 'L'; i t i s o f f center, on the side of the-right-hand neighbor. Also, due to the sharpness of the concave corner of the 'V, the negative peak i s of greater magnitude i n the upper i l l u s t r a t i o n than i n the lower where i t corresponds to the more shallow 90° corner. Figure 4.2.3 i n d i c a t e s , as w e l l , a t h i r d p ossible metric measure which may be derived from the ridge function to aid i n the sepa-r a t i o n of the 'L-V' p a i r . This rests i n a d i f f e r e n c e which i s common among the end-of-line features found i n patterns representing these characters. In the case of the 'L', these features usually involve two near 90° corners, whereas, i n the pattern for the 'V, the two angles 50 are generally far from equal. Although the pattern exhibits poor feature s e p a r a b i l i t y , the peaks on either side of the negative peak i n the ridge function corresponding to the 'V' are seen to be considerably less i n magnitude than t h e i r close and dominant neighbors. The use of such a measure presupposes the a b i l i t y to detect both of the peaks of each p o s i t i v e p a i r . The pattern of F i g . 2.3.2 (b) suggests a case i n which th i s measure may not be u s e f u l . Further discussion of a decision process which employs metric evaluations of the ridge function of the type mentioned above i s contained i n sections 4.3 and 4.6. The r e s u l t s of the experiment on the codeword algorithm i n d i c a t e that the occurrence, i n the chosen pattern set, of poorly separated contour features i s frequent. To further study this phenomenon, an i n v e s t i g a t i o n i s made into the separations of a l l adjacent p o s i t i v e ridge function peaks. From a pattern set of 150 elements, r e s u l t s are obtained which are shown i n F i g . 4.2.4. The abscissa of the histogram gives peak separation as a f r a c t i o n X of the ridge function period T. The ordinate gives the number of occurrences of two adjacent peaks having separation XT. Due to the f a i l i n g s of the algorithm implementation, the graph i s ' l i g h t ' for X < 0.05. High density i n peak separation values occur at X ~ 0.05 and at X = 0.30 with a marked dip at X - 0.11. This suggests that c l o s e l y spaced peaks, being generally unimportant to character recognition as a pair (Fig. 4.2.1), may be merged and considered as one. Such a step should decrease errors due to poorly separated peaks. I t should also reduce* the r e p e r t o i r e of codewords generated since minor d i s s i m i l a r i t i e s between d i f f e r e n t patterns a t t r i b u t e d to the same character class w i l l be overlooked. The next decision strategy.to be discussed w i l l use the technique of merging. 51 5 0 -n 4 0 -30 -to <D o c CD i -k-O O o 2 0 J D £ 2 10 i 0 1 — 0 . 0 5 T 0.I0T 0. I5T Peak Separation I I I 0.2 T 0.3T 0.4T F i g . 4.2.4 D i s t r i b u t i o n i n the Separations of Adjacent Peaks. 5 2 4.3 A'Sequential Decision Strategy A two stage sequence of decision procedures i s employed which, given an input pattern of unknown i d e n t i t y , f i r s t creates a subset of indices I = U ) 4.3.1 of probable candidates f o r the i d e n t i f i c a t i o n of the input. The s e l e c t i o n i s made from the n classes C 1. I t then performs a search of t h i s subset for a f i n a l i d e n t i f i c a t i o n . Such a cascaded decision l o g i c , which performs a progressively f i n e r p a r t i t i o n i n g of an environment, has the f a i l i n g that an error made at a low node of the decision tree i s generally irrecoverable [9]. For success, the i n i t i a l decisions must give, at most, a coarse p a r t i t i o n i n g , be independent of minor v a r i a t i o n s within character classes, and have a low p r o b a b i l i t y of error. The error rate w i l l be inversely proportional to the degree of p a r t i t i o n i n g achieved. Compared with a sing l e l e v e l d ecision process, the cascaded l o g i c stands to involve l e s s compu-t a t i o n since, as i t progresses, reference i s made to a diminishing sub-set of the n a v a i l a b l e classes. In t h i s implementation, the input to each node of the decision tree i s i d e n t i c a l , only the information extracted d i f f e r s . The primary decision l o g i c derives from the codeword strategy of section 4.1. A new threshold detection algorithm i s used for peak detection which employs peak merging and, from each input pattern, generates m codewords. These correspond to a range of values assumed by algorithm parameters. Although m passes are made ove^r each input ridge function, the algorithm i s less c o s t l y i n implementation than the curve following type. 53 Given a pattern set cons i s t i n g of n character c l a s s e s , each represented by one sample of each of f font s t y l e s , a maximum number of m«n«f unique stand to be generated. However, the r e s u l t s of section 4.2.2 in d i c a t e the number to be expected i s les s than 2n. Thus* the unique k_. generated from any one input pattern w i l l occur with frequencies f „ which are l i k e l y to d i f f e r among themselves. The processing of a number of samples from any one class C 1 w i l l r e s u l t i n a set •-•of k_., each occurring with a p r o b a b i l i t y p.. = avg(f..) . 4.3.2 If t h i s t r a i n i n g set of input patterns i s s u f f i c i e n t l y large and adequately describes the environment, erroneous associations of codewords with the C 1 w i l l occur with r e l a t i v e l y small p „ whereas, the k_. which are t y p i c a l of a pattern class w i l l be associated with i t by a large p . Those C 1 associated with patterns which are characterized by regions of low but non-zero curvature w i l l have a r e l a t i v e l y large number of associations of the l a t t e r type. Contour features of t h i s nature cause the broad, low magnitude ridge function peaks which are apparent i n F i g . 4.2.2. Due to the quantization of the pattern f i e l d , these extrema are i r r e g u l a r and usually contain a number of features which may or may not stand detected as v a l i d peaks, depending upon p r e v a i l i n g values of programme parameters. These low magnitude 'peaks' w i l l be detected i n only a f r a c t i o n of the m passes over the input pattern and thus the number of d i f f e r e n t but ' c h a r a c t e r i s t i c ' k_. generated from patterns of t h i s type w i l l be r e l a t i v e l y large. Further discussion on t h i s matter i s presented i n the following section. 54 For each input pattern selected from the C 1, over and above the t r a i n i n g set, a number of are. generated which form a subset of the codewords which were associated with the character class by the t r a i n i n g set. These k_. w i l l occur with frequencies f i j B - p i j * 4.3.3 By generating m codewords per input pattern, rather than one, the proba-b i l i t y of the eventual i s o l a t i o n of the k. which are c h a r a c t e r i s t i c of the J C 1 i s increased. For t h i s reason the method i s expected to achieve r e s u l t s which are superior to those of section 4.2.2. The d e t a i l s of the strategy are discussed i n the following section. The second stage of the sequential decision process involves a minimum distance l i n e a r c l a s s i f i e r [10, 11], For t h i s , an average reference vector T. . = (t. ) . . k = l,io 4.3.4 i s maintained for every a s s o c i a t i o n of a k_. with a C 1 for which the p.. 4 0. The components t, are derived from a number u of metric evalu-• i j k ations of the ridge function. These are measurements which are of the same nature as those which were proposed i n the previous section, with reference to F i g . 4.2.3. Derived from each unknown input pattern are s i m i l a r vectors X j = k = 1 , U 4.3.5 one of which corresponds to each k^ generated by the f i r s t l e v e l of the decision strategy. The t ^ and the x^ are themselves «vector quantities with as many components as there are ridge function peaks acknowledged by the corresponding k^. A number of distance vectors 55 D . = IT, - X.I i e l 4.3.6 J i j 3- J are computed which i s also equal to the number of k generated from the pattern. To obtain a r e l a t i v e measure assoc i a t i n g the input with each character class designated as an i d e n t i t y candidate by the subset I, the averaged vectors D. = avg.(D..) 4.3.7 i J J i are derived from which come the set of evaluations e. = W«D. 4.3.8 i I The weight vector W i s intended to weight or bias the r e s u l t s so as to r e f l e c t the r e l a t i v e merit of each measure as indicated by the experience of the set of t r a i n i n g patterns. The i d e n t i f i c a t i o n of the r input pattern i s made by assigning i t to the class C for which e = min(e.) 4.3.9 r i l This concludes a b r i e f introductory discussion of the sequential d e c i s i o n process upon which the r e s u l t s to be reported i n l a t e r sections of t h i s chapter are based. In the following sections both l e v e l s of the strategy are presented i n d e t a i l with r e s u l t s of each presented separately. 4.4 • The Threshold Detection Algorithm An algorithm i s implemented which locates the peaks of an input function by defi n i n g an amplitude threshold L and then detecting the a maximum over regions where t h i s value i s exceeded i n ^ magnitude by the ridge function. When two adjacent peaks are detected which are of l i k e sign and l i e within a distance T^ of one another, they are merged and considered one. Each pattern i s passed over m times with values of L 56 varying by approximately 12% within the range L < L < L 4.4.1 L = a ^ U where L and L are a r b i t r a r i l y chosen bounds and w i l l be defined at a J-j u l a t e r point i n t h i s discussion. Under s p e c i a l conditions which, again, w i l l be discussed i n a l a t e r point, the upper bound may be exceeded by L . a Figure 4.4.1 i l l u s t r a t e s the method as applied for m = 2. The peaks are numbered i n order of t h e i r detection. In each function, only those peaks are numbered which are detected for the current value of L . a In the upper i l l u s t r a t i o n , four p o s i t i v e and two negative peaks are detected. Among the former are two pairs of c l o s e l y spaced peaks which are merged so that the codeword 1010 r e s u l t s . For the l e s s e r value of L , demonstrated i n the lower i l l u s t r a t i o n , what appeared above as p a i r s of p o s i t i v e peaks are now detected as s i n g l e peaks. On the negative side, the converse has occurred and two pairs of immediately adjacent peaks are detected and merged. In addition, three newly encountered p o s i t i v e peaks are detected. Two of these are merged to give a f i n a l r e s u l t of 101101. If L were to assume an even lower value, the l i k e l i h o o d of a detecting spurious peaks due to noise increases. If a t h i r d k^ were generated for a value of L midway between those demonstrated, i t would 3. be the same as the f i r s t and thus, the s t a t i s t i c s f o r m = 3 would be k. f . . « 1010 ,666 101101 .333 57 F i g . 4.4.1 A p p l i c a t i o n of the Threshold Detection Algorithm with m = 2 58 Since ridge function peaks which correspond to sections of contour having low but non-zero curvature are usually detected at only the lowest values of L , s l i g h t v a r i a t i o n between d i f f e r e n t patterns associated with the same character class may cause the k^ to occur with markedly d i f f e r e n t f . . . To lessen the e f f e c t of i n t e r - f o n t v a r i a t i o n s , the amplitude I J ' threshold i s a r b i t r a r i l y chosen to be L = (1 - b.n').F A.4.2 a where 1=1 and b = 0.12 e = 1.20 n f = 0, ±1,2, ..., m-2 The choice of values for n' i s discussed below. The value F i s merely a constant mult i p l y i n g the R.M.S. value of the ridge function. The a b i l i t y to s e l e c t a successful merge threshold T i s J ° m demonstrated i n F i g . 4.4.2 which i l l u s t r a t e s the r e l a t i v e separations of the pairs of adjacent peaks of l i k e sign detected i n the ridge function of F i g . 4.4.1. A well defined gap i s evident between the spacings of the peak pai r s 8 - 6 and 1 1 - 1 . This agrees with the r e s u l t s shown i n F i g . 4.2.4. A value for T i s chosen such that m T < T < T n 4.4.3 a = m = 3 • The l i m i t s i n equations 4.4.1 and 4.4.3 and the parameters a, e, and n' of equation 4.4.2 are held constant over the en t i r e pattern set. Their s e l e c t i o n i s found to be c r i t i c a l for le s s than 5% of the input patterns 59 0.1 8T <t CM IO 2 & ~ tO I I I I I i I ro — N a> co — 2 Peak Pa i r F i g . 4.4.2 Adjacent Peak Separations from F i g . 4.4.1 encountered and are thus determined on an inspection basis rather than automatically. The merge threshold i s evaluated b y ' f i r s t computing the separa-t i o n of each p a i r of detected peaks which are adjacent to one another and of s i m i l a r sign. These are segregated i n t o two ordered l i s t s P k P 4.4.4 where the subscripts p and n denote reference to p o s i t i v e and negative peak pairs r e s p e c t i v e l y . For example, i f peak l o c a t i o n r e l a t i v e to the 60 o r i g i n i s r e p r e s e n t e d b y 1^ , t h e n t h e l i s t V^, d e r i v e d f r o m t h e e x a m p l e o f F i g . 4.4.1, f o r b o t h v a l u e s o f L d e m o n s t r a t e d , w o u l d b e •3. p p = { d 1 , d 2 , d 3 , d 4 , d 5 } w h e r e d l = £ 4 _ £ 3 d 3 = £ 1 0 - £ 9 d5 = ^3.-^10 d 2 = l 2 ll d4 = T - £ l l T h e f i r s t v a l u e s e n c o u n t e r e d i n t h e l i s t s V a n d V , r e s p e c t i v e l y , w h i c h p n s a t i s f y t h e c o n d i t i o n s d . > T • d . .. k , p k - l , p d , > T « d , , T 4.4.5 k , n k ' - l , n a r e f o u n d a n d t h e m e r g e t h r e s h o l d s a r e c h o s e n t o b e T m , p n p d k - l , p T = n • d . , 1 4.4.6 m , n n k - l , n w h e r e 1.80 < T 4 2.00 a n d 1.00 < n < 1.10 a r e c o n s t a n t s c h o s e n f o r t h e p a t t e r n s e t b y i n s p e c t i o n . M e r g e t h r e s h o l d s a r e t h u s c o m p u t e d f o r e a c h i n p u t p a t t e r n . T h e t w o v a l u e s a r e c o m p u t e d i n o r d e r t o a l l o w d i f f e r e n t h a n d l i n g o f p a i r s o f c l o s e l y s p a c e d n e g a t i v e a n d c l o s e l y s p a c e d p o s i t i v e r i d g e f u n c t i o n p e a k s . F r o m e v e r y i n p u t p a t t e r n , t h e p e a k d e t e c t i o n a l g o r i t h m g e n e r a t e s a k . f o r e a c h o f m v a l u e s o f L ( e q u a t i o n 4.4.2). T h e f i r s t c o r r e s p o n d s ' 3 a « t o n 1 =0 w h i c h g i v e s t h e n o m i n a l t h r e s h o l d v a l u e . I t i s c o m m o n t h a t p e a k s a r e d e t e c t e d i n t h e f i r s t p a s s o v e r t h e r i d g e f u n c t i o n w h o s e m a x i m a l i e j u s t a b o v e t h i s t h r e s h o l d . I n o t h e r p a t t e r n s a s s o c i a t e d 61 with the same character class as this input, the corresponding features may be of much less magnitude and l i e undetected except for very low values of L . Hence, i n an attempt to generate, for each pattern, as many of the k_. as possible which are c h a r a c t e r i s t i c of the C 1 with which the pattern i s associated, the smallest peak detected with the nominal threshold i s compared with the upper l i m i t and, i f found to be less i n magnitude, a second k^ i s computed for m = 2 and with = ^ n* The i n e q u a l i t y may hold only i n the case that a second r e l a t i v e l y low magnitude peak was detected i n the f i r s t pass. x I f the second peak has magnitude within approximately 8% of that of the f i r s t , the threshold L & may exceed i n order that, i n the second pass, neither peak i s detected. For example, i f , i n F i g . 4.4.1 the greater value of L i s el assumed to be the nominal value, the peak denoted at (2) w i l l be excluded i n the second codeword generated. However, since the peak at (4) i s of very s i m i l a r magnitude to the one at (2), even though i t exceeds L^, i t too w i l l be excluded. This consideration becomes important when sample patterns are encountered i n which the peaks at (9), (10), and (11) have magnitudes which exceed the nominal threshold. In such a case, i t i s possible that, only for m = 2, w i l l the c h a r a c t e r i s t i c codeword 1010 occur. I f no such low l e v e l peaks are detected i n the f i r s t pass, the i n i t i a l k_. i s assumed for m = 2. An a d d i t i o n a l m-2 passes are then made with 4 n' = 1,2, ..., m-2. Resulting from t h i s process i s the t r i p l e t R = (q, i , c) 4.4.8 where q i s a set of r pairs q = {k., f.} 4.4.9 • J . J The element i denotes the class C 1 from which the input pattern derives. This w i l l be known only i f the algorithm i s operating i n the t r a i n i n g  mode. Otherwise, the p a i r R' = {q, c} 4.4.10 r e s u l t s . The number of contours c i s given i n the simulator output and forms a part of the f i r s t l e v e l of the decision process. The machine may operate i n eit h e r of two modes: the t r a i n i n g mode or the i d e n t i f i c a t i o n mode. The t r a i n i n g mode i s designed to allow the machine to amass information describing i t s environment on the experience of a set of t r a i n i n g patterns. A f t e r t h i s period of adap-t a t i o n , the machine, operating i n the i d e n t i f i c a t i o n mode, accepts an input pattern of unknown i d e n t i t y and, on the basis of i t s t r a i n i n g experience, computes a ' l i k e l i h o o d ' or score s^ with which i t associates the input with each element of a subset of the C 1 contained i n the environment. This subset names only those character classes for which the patterns have the same number of contours as the input pattern and with which are associated, through the t r a i n i n g , at l e a s t one of the k. e R'. By means of a threshold L , which i s defined below, a number J s' of the la r g e s t s^ are selected which, i n turn, define a subset I (equation 4.3.1) of the 'most probable' candidates for the i d e n t i f i c a t i o n of the input. 63 In order to obtain c e r t a i n r e s u l t s from the t r a i n i n g procedure, the machine, while t r a i n i n g , alternates between the two modes of operation. F i r s t , a set of p r e t r a i n i n g patterns are processed i n the t r a i n i n g mode to allow the generation of a basic repertoire of codewords. This set consists of one sample from each C 1, a l l being of the same font s t y l e . Secondly, the remaining patterns of the t r a i n i n g set are accepted with t h e i r i d e n t i t y suppressed. The machine, f i r s t operating i n the i d e n t i f i c a t i o n mode, attempts to c l a s s i f y each input on the basis of the Sj, and then, a f t e r accepting the correct i d e n t i t y , treats the function as a t r a i n i n g pattern and, from the t r i p l e t R, updates the table MT = {k., P(k.), 1.} 4.4.11 J 3 3 i n which 1 , as given by equation 4.1.2, i s the l i s t of ordered pa i r s {p!., c.} defined f o r a l l i such that p!. ^ 0. In addition, a l i s t of J i i J 1 p r o b a b i l i t i e s PCC 1) i s maintained. The P(Kj) and P(C ) are the proba-b i l i t i e s of occurrence of the k. and C 1 r e s p e c t i v e l y . The p!. are 3 J computed from equation 4.1.3 with the p derived from equation 4.3.2. The operation of the f i r s t l e v e l of the sequential strategy i n the t r a i n i n g mode i s b r i e f l y outlined by the flow chart of F i g . 4.4.3. In the i d e n t i f i c a t i o n mode, the input to the machine of a pattern of unknown i d e n t i t y r e s u l t s i n the set R' of codewords, each with i t s frequency of occurrence. The table MT i s searched f o r each k. e R'. Since the ridge function i s independent of source pattern o r i e n t a t i o n , the search requires a r e g i s t r a t i o n procedure. For t h i s purpose a s h i f t r e g i s t e r i s used. Due to the binary nature of the k , the implementation i s p a r t i c u l a r l y s t r a i g h t forward. If the search i s Input t r a i n i n g pattern, ( i d e n t i f i c a t i o n i and number of contours c known) 64 Generate the t r i p l e t R. Generate m codewords from the ridge function. Compute the occurrence f.. of each unique codeword k. 3 ( eqns. 4.4.8 and 4.4.9) Update the p r o b a b i l i t i e s P ( k ) and P ( C 1 ) . Have a l l the k. e R been 3 processed? yes no Search the table MT fo r the next k. e R. 3 found not found Sea'rch the l i s t 1. 3 fo r an as s o c i a t i o n with the 1^ c l a s s . Start a new l i n e i n the table. Enter the codeword. found v — not found Update the p r o b a b i l i t y ( eqn. 4.1.3) Has the t r a i n i n g set been depleted?  I ^1 Create a new entry i n the table associated with the codeword. Enter the class'and contour information. Compute the p r o b a b i l i t y (eqn. 4.1.3) no yes STOP F i g . 4.4.3 Flow Chart for the Training Mode of Operation .of the F i r s t Level of the Sequential Strategy. 65 unsuccessful for any k_. , and the machine i s t r a i n i n g , the i s held i n abeyance pending the update of MT. Otherwise, i f no t r a i n i n g i s to be performed, i t i s discarded. The event of an unsuccessful search usually indicates that the t r a i n i n g set has been i n s u f f i c i e n t to adequately describe the environment. I f a l l of the r codewords of the set R' are contained i n MT, then the n»r array of subscores s.. = (f.p!.)P(C i)/P(k.) 4.4.12 J i J J 1 3 are computed for a l l i such that i e l . and the contour count c. e R' 3 i equals the value c, e 1^. F i n a l l y , the s ^ are summed to give the scores . s. = Z s.. i = l . n 4.4.13 1 3 3 1 where i assumes those values for which k. e R'. The value Y for which 3 s. = max(s.) i = l . n 4.4.14 Y x ' assigns the input pattern to the class C . The 'assurance' with which the i d e n t i f i c a t i o n i s made i s implied by the magnitude of s^ r e l a t i v e to that of the next l a r g e s t s^. As the f i r s t stage of the sequential decision process, the codeword strategy serves to impose a coarse p a r t i t i o n i n g on the set of n possible i d e n t i t y classes. This provides the subset I of candidates f o r the i d e n t i f i c a t i o n of each unknown input. The s i z e of the largest subset w i l l be determined by a parameter 6 which defines a threshold L = 6s 4.4.15 s y The subset I contains the elements i for which « s. > L 4.4.16 x = s The value of 6 i s chosen through experimental observation. The smaller 66 i t s magnitude, the larger the candidate subset and, thus, the higher i s the p r o b a b i l i t y that the subset w i l l contain the correct i d e n t i t y . However, the success of the second l e v e l of the decision process w i l l be seen to be inversely proportional to the s i z e of I. Thus, the d e f i -n i t i o n of 6 involves a trade-off between achieving low error rates at the two l e v e l s of the l o g i c . The r e s u l t s to be reported i n the next section are derived for 6 = 0.3, a value found to give best o v e r a l l operation. The operation of the f i r s t l e v e l of the sequential strategy i n the i d e n t i f i c a t i o n mode i s summarized i n F i g . 4.4.4. 4.5 Results of the F i r s t Level of the Sequential Logic The r e s u l t s given below are due to an environment c o n s i s t i n g of n = 26 character classes. However, the reporting of c e r t a i n f i n a l r e-su l t s w i l l be based on only n = 24 classes i n the l i g h t of the s i m i l a r i t y of the pair s N-Z and M-W. Whichever i s appropriate w i l l be made apparent. The algorithms and decision strategy which have been described above are implemented on an I.B.M. 360/67 d i g i t a l computer i n the Fortran language. This operating environment has been chosen for experimental convenience. The machine i s pretrained on a set of 26 patterns which r e s u l t i n a t o t a l of 23 unique k_.. An a d d i t i o n a l 208 input patterns are processed i n the t r a i n i n g mode and the f i n a l r e s u l t , the table MT, i s shown i n table 4.5.1. Using a value of m = 6, a t o t a l of 1404 k. are generated to form the table. Each numbered l i n e contains, i n order, a k., the J p r o b a b i l i t y P(k.), and the l i s t 1. defi n i n g associations of the k. with I n p u t u n k n o w n p a t t e r n , ( n u m b e r o f c o n t o u r s c k n o w n ) 67 G e n e r a t e t h e p a i r R ' G e n e r a t e m c o d e w o r d s a n d c o m p u t e t h e o c c u r r e n c e f o f e a c h u n i q u e c o d e w o r d k . . J H a v e a l l t h e k . e R ' b e e n 3 p r o c e s s e d ? y e s I n o If S e a r c h t h e t a b l e MT f o r t h e n e x t k . e R ' .1 o u n d H a v e t h e e n t r i e s i n t h e l i s t 1. b e e n e x h a u s t e d ? _ ^ i o t f o u n d I s t r a i n i n g t o b e p e r f o r m e d ? n o n o y e s F o r t h e n e x t e n t r y i n 1^  w h i c h h a s t h e c o r r e c t c o n t o u r d a t a , c o m p u t e t h e s u b s c o r e s . . . A d d t h i s v a l u e t o a p a r t i a l s c o r e f o r t h e c l a s s a s s o c i a t e d w i t h t h e e n t r y , ( e q n s . 4.4.12 a n d 4.4.13) T Y y e s R e t a i n c o d e w o r d f o r t r a i n i n g . F o r e a c h c h a r a c t e r c l a s s , t h e p a r t i a l s u m o f t h e s . . b e c o m e s t h e s c o r e s . . S e l e c t t h e l a r g e s t s . i j i i t o c o m p u t e t h e t h r e s h o l d L ^ . ( e q n . 4.4.15) T h e s . w h i c h a r e g r e a t e r t h a n o r e q u a l t o L i b n s d e f i n e t h e s u b s e t I o f c a n d i d a t e s f o r t h e i d e n t i t y o f t h e i n p u t . * S e c o n d S t a g e o f t h e S e q u e n t i a l S t r a t e g y . F i g . 4.4.4 F l o w C h a r t f o r t h e I d e n t i f i c a t i o n M o d e o f O p e r a t i o n o f t h e F i r s t L e v e l o f t h e S e q u e n t i a l S t r a t e g y . Table 4.5..1 The Table MT Derived from the Set of Tr a i n i n g Patterns Codeword Occurrence k. 3 1. 1101 .1800 L G .210 .032 2. 101101 .1250 N .304 3. 101 .1010 J .276 4. 11 .0978 I .418 5. 10101 .0780 T .445 6. 10110110 .0620 W .470 7. 101011 .0470 F .550 .8 1110101 .0424 E .867 9. 1101010 .0362 K .905 10. 1010 .0335 S .760 11. 10101010 .0335 X .865 12. 11101 .0294 B .304 13. 1 * .0219 0 1.000 14 1001 .0164 c .358 15. 100 .0143 Q 1.000 16. 10011 .0137 A .595 17. 1111 .0096 D .680 18. 111 .0089 D .745 Candidate and P r o b a b i l i t y V .210 A .168 Y .148 P .051 c .027 X .021 Q .011 z .302 H .250 K .061 X .036 p .199 B .100 U . .079 G .094 D .281 C .145 0 .117 U .041 R .308 S .069 F .052 Y .044 M .530 R .213 E .119 Q .085 V .015 F .133 X .095 G .124 Q .101 z .028 . W .120 K .020 P .289 U .251 c .108 J .024 G .320 U .242 J .046 S .041 U .146 c .154 G .158 0 .320 0 .155 B .046' J .043 U .035 S .025 Q .008 T .001 C .087 Q .019 S .012 G' .040 Q .036 J .009 S .014 L .020 ON CO . Table 4.5.1 (Continued) Codeword Occurrence Candidate k. 3 POO 3 c 1 19. 101001 .0068 G .745 R .155 20. * 101010 .0062 Y 1.000 21. 1111001001 .0055 E 1.000 22. 10011001 .0048 H 1.000 23. 1101110 .0048 H .633 Q .367 24. 1001011 .0041 F . 843 E .157 25. 110111 .0027 B .780 D .120 26. 101100 .0021 R .645 F .355 27. 10010 .0021 S .645 G .355 28. 1011001 .0014 G .530 N .470 29. 111100 .0014 U .540 C .460 30. 10 .0014 Q 1.000 31. 1010001 .0007 G 1.000 32. 11110000 .0007 C 1.000 33. 1010100 .0007 A 1.000 34. 111000 .0007 U 1.000 35. 1110111 .0007 B 1.000 36. 110010010 .0007 W 1.000 37 101000 .0007 G 1.000 id P r o b a b i l i t y ON VO 70 the various C 1 for which the p'. . i s non-zero. The k. are given i n order J i 3 of descending P(k\.). The number of associations with the k^ i s seen to decrease with that value. Erroneous k. and associations appear with 3 small P(k.) and p!. re s p e c t i v e l y . However, the converse i s not neces-s a r i l y true. In analyzing .the r e s u l t s of the codeword algorithm, i n t e r e s t . focuses on a number of s p e c i f i c points. These are 1) the rate at which new k_. (and, supposedly, new patterns) are encountered, 2) the success of the strategy as a stand alone decision process (for 6 = 1.00), 3) the maximum s i z e required of the subset I i n order to insure that the correct i d e n t i t y i s included, 4) the s i m i l a r i t i e s between the source patterns corresponding to the elements of I and 5) some explanation of the errors which may occur. Figure 4.5.1 shows two measures on the growth of the table MT during t r a i n i n g . These are the growth i n the number of associations between k. and C 1 with p i . ^  0 and i n the number of k. generated. The curves i n d i c a t e that some font s t y l e s contribute more to the growth of the table than do others. Although the rate of generation of the k^ becomes zero within the course of the t r a i n i n g , the rate at which new associations occur does not. This implies that the t r a i n i n g set i s i n -s u f f i c i e n t since both values should approach zero i f the environment i s adequately represented. By the end of the t r a i n i n g process, the average number of associations per k. i s 3.02 with extremes of one codeword with 3 an 1. having twelve entries and t h i r t e e n codewords with only one association. in T> i— O J « o o <i> 6 n 0 20 40 60 80 100 120 140 160 180 200 220 Number of Training Patterns a) Growth in the Number of Associations of Codewords with Pattern Classes for which p1. . ^  0. 40 30 -20 > 20 40 60 80 100 120 140 160 180 200 220 Number of Training Patterns b) Growth in the Number of Codewords Generated. Fig. 4.5.1 Growth in the Table MT with Training. 72 The success of the codeword strategy i s summarized, for n = 24 character classes, i n table 4.5.2 which gives r e s u l t s on the subsets I, according to s i z e , for both 6 = 1.00 and 6 = 0.30. The four columns of r e s u l t s i n the table give, i n order, the percentage of the t r a i n i n g set which r e s u l t s i n subsets of each s i z e , the percentage of the t o t a l number of subsets which contain the correct i d e n t i t y ( 6 = 0.30), the success of the score s_^  ( 6 = 1.00), also as a percentage of the t r a i n i n g set, and, f i n a l l y , the contribution to the o v e r a l l error rate due to the occurrence of subsets of each s i z e . Adding the r e s u l t s of the l a s t column shows the t o t a l error rate to be 28.2%. For example, over the whole environment, the table shows that 19.3% of the patterns resulted i n subsets of three i d e n t i t y candidates and 18.8% of the patterns resulted i n subsets of three elements of which one i s the correct i d e n t i t y for the input. S i m i l a r l y , 7.3% of the pattern set resulted i n subsets of three elements with subset scores which predict the correct r e s u l t . Subset Subset Occurrence of Success of Error i n Sub-Size Occurrence Subsets containing the Subset set Score Correct Identity Score Predic- P r e d i c t i o n ( 6 = 0.30) t i o n . ( 6 = 1.0) 1 54.5% 47.0% 47.0% 1.96% 2 20.8 19.3 15.0 3.98 3 19.3 18.8 7.3 8.81 4 3.9 3.9 1.5 8.74 5 1.5 1.5 1.0 4.73 Table 4.5.2 Success of the Codeword Strategy by Subset Size - Training Mode. « The r e s u l t s of the second and t h i r d columns, for 6 •=> 0.30 and 6 = 1.00 r e s p e c t i v e l y , add to show that 90.5% of the t r a i n i n g set resulted i n subsets of f i v e of less elements which contained the correct i d e n t i t y and 71.8% of the inputs were c o r r e c t l y i d e n t i f i e d by the subset score: s^. Inspection of the 9.5% of the cases i n which the subset did not contain the correct i d e n t i t y shows each to have occurred early i n the t r a i n i n g process and to be a t t r i b u t a b l e to the encounter of a new pattern. The r e s u l t s f o r 6 = 1.00 may be enhanced by the 5.3% of the patterns which gave r i s e to subsets i n which two C 1, including the correct i d e n t i t y , occur with equal s_^. Comparison with the r e s u l t s of section 4.2.2 indicates l i t t l e improvement. However, the very small pattern set on which the e a r l i e r r e s u l t s were based make them o p t i m i s t i c . In Table 4.5.3 the r e s u l t s are presented according to character c l a s s . ' For each C 1, i s shown the proportion of the number of samples from the class which resulted i n 1) correct i d e n t i f i c a t i o n through a subset of only one element, 2) a subset of more than one element, in c l u d i n g the correct identity^ or 3) a subset devoid of the correct i d e n t i t y . In the l a s t case, the r e s u l t achieved by the machine i s given. For example, the table shows that a l l samples of the character 'A' were c o r r e c t l y i d e n t i f i e d i n a subset of one element. On the other hand, no sample of the character ' C was so i d e n t i f i e d , whereas 60% of the samples have t h e i r i d e n t i t y contained i n a r e s u l t i n g subset of more than one element and the remaining 40% were i n c o r r e c t l y i d e n t i f i e d with 'U' being the most frequent candidate. At the foot of each column, the percentage of the t r a i n i n g set which f a l l s into each category i s given. These values f i n d correspondence with the r e s u l t s given i n table*4.5.2. The subsets c o n s i s t i n g of more than one element are shown i n d e t a i l i n table 4.5.4. According to source pattern character c l a s s , the '.Character Class 'Proportion of the Samples of each Class c o r r e c t l y i d e n t i f i e d i n Subsets of: Incorrect I d e n t i f i c a t i o n . One element. More than one element. A 1.000 -B •1.000 -C - .600 U .200 D .400 .600 E 1.000 -F .750 - T .125 G 6167 .167 C .333 H .750 .250 I - 1.000 J - .875 • c .125 K .778 .222 L - 1.000 M — .666 w .333 N - 1.000 0 .725 .125 D .150 P 1.000 -Q 1.000 -R .778 .111 Q .111 S .875 - c .125 T 1.000 -U .125 .500 c .375 V - 1.000 W .125 .875 X .750 .250 Y .572 .428 Z — .833 s .166 Column t o t a l s 47.0% 43.5% .333 9.5% of the t r a i n i n g set Table 4.5.3 Results of the Codeword Strategy by Character Class - T r a i n i n g Mode •Table 4 . 5 . 4 Identity Candidate Subsets with an Average s^ for each Element and the Occurrence of each Subset as a Proportion of the Number of Samples from each C 1 - Training Mode. Character Class Candidate and Average Score Occurrence c C .106 I .083 U .050 .300 - C .630 u .460 J .210 .100 C .730 u .530 .100 C .080 I .060 u .060 J .040 .100 .600 D D 1.820 . .0 .850 .600 G G .140 C .300 u .180 J .160 .167 H H .060 N .065 z .055 .250 I I .114 C .085 1.000 J J .100 C .043 u .063 G .043 .375 J .130 C .215 u .130 .250 J .060 u .050 G .030 . V .030 L .030 . .125 J .100 u .070 .125 .875 K K * 1.130 X .870 .222 L L .045 V .045 Y .030 .800 L .030 V .030 Y .040 U .020 .100 L .050 V .040 U .080 c .060 .100 1.000 M M .183 w .320 .666 N N .097 H .090 Z .063 .666 N .110 H .110 .222 N .050 H .050 Z .050 K .030 X .020 .111 1.000 Table 4.5.4 (Continued) Character Class 0 0 • R R U U u V V w w X X Y Y Y z z Candidate and Average Score .050 D .040 .180 Q .150 .345 C .480 .600 C .750 J .250 .044 L .054 Y .028 .320 M .183 .510 K 1.390 .020 L .060 V .050 .020 . L .020 V .020 .058 N •'.110 H .093 Occurrence .125 .125 .125 .375 .500 1.000 .875 .250 .268 .160  .428 .833 frequency of occurrence of each subset i s given. I t i s the sum of these ' values, for each c l a s s , which i s l i s t e d i n the second column of table 4.5.3. With each subset element i s an average computed over a l l occurrences of the subset. The large s t such value i n any subset gives, on the average, the r e s u l t to be expected i n the event of the occurrence of the subset. A number of confusion sets are apparent. The most frequent are: {C, U, J, G}, {C, I}, {H, (N,Z)}, {K, X}, {L, V, Y} The f i r s t two sets involve input patterns having low, broad peaks,which, for some values of the threshold L , are not detected. For other values a' of the threshold, noise may r e s u l t i n a m u l t i p l i c i t y of detected peaks. Table 4.5.1 shows various combinations of the elements of these subsets to be associated with a r e l a t i v e l y large number of k^. Among the remaining sets, the H - (N,Z) confusion r e s u l t s d i r e c t l y from poor separation among the two pairs of concave peaks of the character H while the K - X and (L,V) - Y confusions are a t t r i b u t e d to the loss of low magnitude peaks. Figure 4.5.2 makes apparent the cause of the confusion between patterns representing e i t h e r of the characters 'Z' or 'N' and many of those which correspond to the character 'H'. In the ridge function corresponding to the character 'X', there are two low magnitude peaks which are marked as (1) and (2). If both are detected, the r e s u l t i n g codeword leads d i r e c t l y to the correct i d e n t i f i c a t i o n of the pattern. Howevar, i f either one or both of these peaks are not detected, the pattern w i l l be i d e n t i f i e d as representative of the characters 'K' or 'N - Z' r e s p e c t i v e l y . . Fig. 4.5.2 Examples of Confusions Caused by Poor Feature S e p a r a b i l i t y and Loss of Low Magnitude Peaks. 79 The 28.2% error rate i n the r e s u l t s of table 4.5.2 i s due to a number of events. These are, together with the error due to each, 1) i n c o r r e c t merging and/or retention of spurious peaks due to noise (4.7%), 2) the input of patterns not previously encountered and which lead to a) the generation of new codewords (3.2%) and b) new associations with established k^ (5.7%) and 3) the l i m i t a t i o n s of the strategy (14.6%). The l a t t e r has been itemized i n the presentation of the foregoing r e s u l t s with the confusion sets discussed above responsible for the greater part of t h i s number. In nearly a l l cases, a marked s i m i l a r i t y e x i s t s between the elements of the subsets I i n both t h e i r associated source and input patterns. The t r a i n i n g curve for the machine i s given i n f?lg 4.5.3 The ordinate gives the percentage of the pattern set which r e s u l t s i n subsets which include the correct i d e n t i t y . These are generated with 6 = 0.3. The f i n a l value achieved by the curve agrees with with the t o t a l for the second column of r e s u l t s of table 4.5.2. This defines an upper bound on the success of the combined s t r a t e g i e s . The r e s u l t s imply that the codeword strategy employing the threshold detection algorithm i s superior to the method of sections 4.1 and 4.2. However, granted some modification to the simulator, i t i s probable that a compromise between the two peak detection algorithms may provide the most s a t i s f a c t o r y r e s u l t s . Either reduction i n pattern f i e l d g r i d spacing of the simulator, r e l a t i v e to pattern s i z e , or multiple processing of each source pattern should provide an enhancement of r e s u l t s by no l e s s than 5%. The confusion sets {C, I}, {H, (N,Z)}, {(L,V), Y}, and {K, X} would be l a r g e l y eliminated. 0 20 40 60 80 100 120 140 160 180 200 220 Number of Training Patterns F i g . 4.5.3 Training Curve Giving the Success of the Codeword Decision Strategy for 6 = 0.3 4.6 Second LeVel Strategy of the Sequential Logic A strategy has been implemented which imposes a p a r t i t i o n i n g on the environment by subdividing i t into subsets of a small number of elements. I t remains to define a decision process which w i l l s e l e c t an i d e n t i f i c a t i o n f o r each input from the r e s u l t i n g subset v/ith a low error rate. To date, no metric evaluation of the set of input patterns has been employed. With this i n mind, a study of the confusion sets defined i n the previous section leads to a number of conclusions. Some of these have already been discussed i n section 4.2 i n conjunction with F i g . 4.2.3. The observations made are 1) that r e l a t i v e peak l o c a t i o n may allow fhe 'symmetric' characters 'U' and 'V' to be distinguished from ' J ' and 'L' r e s p e c t i v e l y . 81 2) that peaks of a merged pair may exhibit a greater combined area than that of a peak corresponding to a s i n g l e corner. This may aid i n the separation of the 'L - Y' p a i r . 3) that ; i f the term non-peak i s defined to describe regions of the ridge function over which the magnitude i s at l e a s t 20% below the nominal threshold value, the areas under the curve i n the non-peak regions should be greater i n samples of the character 'C' than i n those of ' I ' . 4) that the i n c l i n a t i o n of the 'arms' of the characters 'V and 'Y' causes the angles subtended at t h e i r end-of-line features to d i f f e r appreciably from 90°. In contrast, however, samples of the l e t t e r 'L' generally exhibit near-90° corners at the ends-of l i n e s . The corresponding d i f -ferences i n ridge function peak amplitude may aid the separation of these elements. Similar observations hold f o r the p a i r K - X. Reference to f i g u r e 2.3.2(b) i l l u s t r a t e s an exception to t h i s hypothesis. The H - (N,Z) confusion may also be p a r t i a l l y resolved through a comparison of peak amplitudes. These features of the input pattern set serve to form a basis f o r the s e l e c t i o n of the components x^ of the vector X_. defined i n equation 4.3.5. These components w i l l be 1) and 2) the sums of p o s i t i v e and negative non-peak areas r e s p e c t i v e l y . In each such region , the area under the curve i s computed and added to the appropriate subtotal. 82 3) the R.M.S. value of the ridge function 4) the score for the candidate C 1 r e l a t i v e to the largest score s for a l l i e l . Y The remaining four components are themselves vector quantities with dimension equal to the number of peaks acknowledged i n the corres-ponding k_.. The measures on the ridge function which are represented by these vectors w i l l be 5) the r e l a t i v e l o c a t i o n of the peaks of the codeword 6) the area of each of the detected peaks 7) the area of each of the non-peak regions 8) the peak magnitude of the detected features. Corresponding to each k^ generated by the f i r s t l e v e l of the sequential strategy, the second l e v e l generates the vector X^ . c o n s i s t i n g of the components k = l,co, where, by a r b i t r a r y choice based on obser-vable properties of the ridge function, w = 8. Each component x^ i s i t s e l f a vector with components y_^ , i = 1,<(>, where <j> i s equal "to the number of ridge function extrema acknowledged' i n the corresponding k_.. When operating i n the t r a i n i n g mode, the machine computes and maintains a reference vector T.. for each as s o c i a t i o n defined by the table MT. Like those of X_., the to components of T_._^  are vectors of $ components. The reference T.. i s an average computed over a l l X. which r e s u l t from 3 i 3 patterns of the class C 1. The success of the decision process rests on i t s a b i l i t y to be i n s e n s i t i v e to pattern v a r i a t i o n s within the C 1 and yet to d i s t i n g u i s h between patterns on the basis of i n t e r - c l a s s v a r i a t i o n s . To overcome i n t e r - f o n t v a r i a t i o n s , a quantization of the measurements i s made. The degree of quantization i s chosen to be a compromise between 83 economy i n machine storage requirements and the optimal value for each measure as determined from the t r a i n i n g set. It i s such that each component \ = ( y i } k i = 1,* ;• k = 5,8 4.6.1 i s stored i n two 32-bit computer words provided cf> <. 10, a value which i s u n l i k e l y to be exceeded. This may be v e r i f i e d by reference to table 4.5.1 which shows the largest k_. to contain 9 b i t s . The actual quantization l e v e l for each measure i s chosen to achieve best d i s c r i m i -nation, on the basis of experimental evidence, between the C 1 within the bound of two orders of magnitude imposed on the range of the y_^ . The signs of the components of x^ and Xg are c a r r i e d by the corresponding b i t s of the k_. whereas, those of the non-peak areas of x^ are denoted by choosing a quantization l e v e l and a displacement such that the p a i r '55' denotes zero area while '00' and '99' correspond to large negative and p o s i t i v e values r e s p e c t i v e l y . Thus, associated with each entry i n the table MT, besides the two elements of the l i s t 1 , i s an co• (J> matrix T „ , each l o c a t i o n of which defines two words of computer memory. When the machine i s operating i n the i d e n t i f i c a t i o n mode, for each k_. e R' , an i s computed. A f t e r f i r s t generating the subset I of i d e n t i t y candidates, the distance vectors D ^ (equation 4.3.6) are evaluated for a l l i e I. By grouping these values according to the associated character c l a s s , the sets {D..}. are formed, one for each i e I. These sets are, i n general unequal i n s i z e f or the reason that the l i s t s 1. do not define associations with a l l of the C 1. It i s 3 . necessary to resolve, from these sets, a vector 84 D. = (d. ) . k = l,w ; i e I 4.6.2 I k I ' such that X d, . = 1 k = 1, co . . T k i i e l The technique used i s to average the of each set, by component, and assign the r e s u l t to D^. Other methods are a v a i l a b l e but have not been attempted. These include the s e l e c t i o n D. = min({D..}.) 4.6.3 or an average over some subset of the elements of each set. Once t h i s average has been computed, a weighted evaluation w 2 1/2 e. = ( Z (w. d. .rr' 4.6.4 l . i k k i k=l i s made f o r every i e l . The input pattern i s then i d e n t i f i e d as belonging to the class associated v/ith the smallest value among the e^. While the machine i s t r a i n i n g , the weight, vector W = {w. } 4.6.5 k i s computed from a t a l l y vector Z = {z. } 4.6.6 k which, for certain, inputs from the t r a i n i n g set, i s incremented by Az, = £«min. (d, . )/d k = l , u 4.6.7 where I i s the s i z e of the subset I generated by the codeword strategy and r denotes the correct i d e n t i t y of the input pattern. Thus, the t a l l y vector i s an always increasing function with each component incremented by an amount proportional to the s i z e of the subset I and to the r e l a t i v e c ontribution of the corresponding measure to the success of the decision 85 made. The weights are derived by normalizing the t a l l i e s : w = k = l.w 4.6.8 k W 2 1/2 i = l The t r a i n i n g procedure, by incrementing the components of the t a l l y vector by d i f f e r e n t amounts, depresses the w^  corresponding to un-successful x. and leaves those which tend to provide the better r e s u l t s 3 to determine the weighted evaluations e^. The t a l l i e s are modified during t r a i n i n g only i n the case that a subset I i s created which includes the correct i d e n t i t y of the input. Before the machine commences to t r a i n on the input pattern, an attempt to determine i t s i d e n t i t y i s made on the basis of the current set of weights. Following t h i s , and independent upon the outcome except for the proviso stated above, the t a l l i e s are updated and the weights recal c u l a t e d . The res u l t s reported i n the next section express the success of t h i s p r e - t r a i n i n g i d e n t i f i c a t i o n procedure. The maximum increment I i s made to the component z^ only when the th k component of the di f f e r e n c e vector i s less than or equal to the tVi k component of any other "D . In the case that d^ r i s greater than the smallest of the k 1^ components among the D_^ , the increment to the k*"*1 t a l l y i s pr o p o r t i o n a l l y less than t. The flow charts of Fi g s . 4.6.1 and 4.6.2 i l l u s t r a t e the operation of the second l e v e l strategy i n the i d e n t i f i c a t i o n and t r a i n i n g modes r e s p e c t i v e l y . Hence, a framework has been constructed to allow the s e l e c t i o n , from the subset I, of an i d e n t i t y for the input pattern with an error rate which i s expected to be less than that r e s u l t i n g from use of the Input unknown pattern. Generate the pair R ' . Generate a for each k vector X. j A £ R.' . V7ith the codeword strategy, generate the subset I . Have a l l the k. e R ' J been processed ? yes X For each i e l , average the vectors D.. and form the normalized vectors D. Compute the vector products W*D^  and normalize, componei by component, to form the evaluations e^. •Assign the input pattern to the c l a s s C associated with the smallest evaluation, (eqn. 4.3.9) Has the set of unknown inputs been depleted ? no Search the table MT for the next k. e R 1 found not found Have the entries i n the l i s t 1J been exhausted' no yes For the next entry i n 1. J which i s associated with an element of I, compute the distance vector D... (eqn. 4.3.6) STOP no yes F i g . 4.6.1 Flow Chart for the I d e n t i f i c a t i o n Mode of Operation of the Sequential Strategy. Input: known pattern. JL Using the sequential strategy, attempt to i d e n t i f y the input. Compare the p r e d i c t i o n with the true i d e n t i t y . i n c o r r e c t correct Has the subset I more than one element ? yes no From the set of normalized diff e r e n c e vectors D., 1 compare, component by component, the two which correspond to the correct and the predicted i d e n t i f i c a t i o n classes. For each component, compute the increments Az^ and add these to the t a l l y vector, (eqns. 4.6.2, 4.6.3 and 4.6.7) Recompute the weights W. (eqn. 4.6.8) Update the table MT and the p r o b a b i l i t i e s P(k.) and PCC 1). (ref. F i g . 4.4.3). Simultaneously, from the vectors X , update the average reference vectors T.. 3 J i Has the t r a i n i n g set been depleted?  n o yes 87 STOP F i g . 4.6.2 Flow Chart for the Training Mode of Operation of the Sequential Strategy. 88 subset score alone. Two p a r t i c u l a r questions a r i s e regarding the basis of the strategy and i t s implementation. These are 1) on the v a l i d i t y of each measurement.as well as i t s d e r i v a t i o n and the choice of quanti-zation l e v e l , and 2) on the v a l i d i t y of employing a l l of the measures of the vector i n processing each of the confusion sets. In regard to the l a t t e r , since each.of the x^ were selected on the grounds of some observed feature of perhaps one, or at most two, of the confusion sets, there i s no reason to believe that they w i l l a l l enjoy the same r e l a t i v e success i n p a r t i t i o n i n g each of the sets. In f a c t , the use of some of the measures i n c e r t a i n instances could cause a degradation of r e s u l t s . In view of t h i s , as well as the small number and cle a r d e f i n i t i o n of the confusion sets, a number of weight vectors are maintained. These corre-spond to the sets {L, V, Y>, {C, J , U}, {N, H, Z}, {M, W} and { a l l other characters). The set having the greatest representation i n the subset I determines the weight vector to be employed. Comment on the adequacy of t h i s s e l e c t i o n from the set of a l l confusion sets as well as on the v a l i d i t y of the adaptive, scheme i t s e l f w i l l be offered following the presentation of r e s u l t s i n the next section. 4.7 Results of the Second Level Strategy Due to i t s p o s i t i o n i n the l o g i c of the sequential decision strategy, the second l e v e l of the decision process has an upper bound set on i t s success by the success of the primary l e v e l . * To d i f f e r e n t i a t e between the two i n t h e i r e f f e c t on the f i n a l r e s u l t s of the sequential strategy, the second l e v e l , or metric decision process i s applied to the 89 i d e n t i t y candidate subsets given i n table 4.5.4. As were previous re-s u l t s , those reported i n t h i s section are derived from the machine while operating i n the t r a i n i n g mode. The success of both the sequential and the second l e v e l l o g i c i s summarized i n table 4.7.1 according to subset s i z e . The r e s u l t s are derived from an environment containing n = 24 character cla s s e s . With the exception of those of the f i n a l column, a l l r e s u l t s are given as percentages of the t o t a l number of patterns sampled. The f i r s t two columns of re s u l t s are taken from table 4.5.2 and are included for the sake of comparison. They show, re s p e c t i v e l y , the percentage of the t r a i n i n g set which r e s u l t s i n subsets of each s i z e and the percentage of the subsets of each s i z e which contain the correct i d e n t i t y . Occurrence of Success of Error i n the Subset Subset Subsets Con- the Combined Second Level Size Occurrence t a i n i n g the Strategy Strategy Correct Identity P r e d i c t i o n 1 54.5% 47.0% 47.0% 0.00% 2 20.8 ~ 19.3 16.4 2.81 3 19.3 18.8 8.9 9.90 4 3.9 3.9 2.9 4.86 5 1.5 1.5 1.0 6.24 Table 4.7.1 Success of Sequential Strategy by Subset Size - Training Mode The l a s t two columns contain, r e s p e c t i v e l y , the percentage of the t r a i n i n g set c o r r e c t l y i d e n t i f i e d by the combined str a t e g i e s and the contribution to the t o t a l error rate due to subsets of each s i z e . Summation over the entries of the fourth column gives an o v e r a l l recognition rate for the machine of 76.2%, a s l i g h t improvement over the 71.8% obtained by v i r t u e of the subset score alone. From the table i t i s found that 45.5% of the t r a i n i n g set resulted i n subsets of 90 more than one element and 73.6% of these included the correct i d e n t i t y . A successful f i n a l r e s u l t i s resolved by the second l e v e l l o g i c i n 87.2% of the l a t t e r cases whereas the success of the codeword strategy alone i s 74%. In general, the outcome due to the metric strategy agrees with the subset score p r e d i c t i o n although, for 5.3% of the inputs the s^ was i n c o r r e c t whereas the. f i n a l outcome was successful and, i n 3.8% of the cases, the converse i s true. In table 4.7.2 the r e s u l t s given above are itemized according the character c l a s s . The proportion of the number of samples of each class c o r r e c t l y i d e n t i f i e d i s given i n the f i r s t column of r e s u l t s . In addition, a d e t a i l e d account of a l l errors i s made. The learning curve f o r the sequential machine i s given i n figure 4.7.1. This follows from the r e s u l t s shown i n f i g u r e 4.5.3. These r e s u l t s may be interpreted only as very general i n d i c a t o r s of the operation and success of the machine were the t r a i n i n g set suf-f i c i e n t l y l a rge. A considerable portion of the errors encountered r e s u l t d i r e c t l y from inadequate t r a i n i n g . This i s expected i n view of the way i n which the r e s u l t s are obtained. Namely, the attempt to i d e n t i f y the i t * 1 input pattern based on the experience of only ( i - l ) t r a i n i n g patterns where i ranges from i = 26 to i = 234. Although the accumulation of unique k_. i s very rapid (figure 4.5.1(b)) and s t a r t s to l e v e l o f f a f t e r approximately 40 inputs from the t r a i n i n g set, both the number of associations with the k^ and the weight vectors take 5 considerably longer to s t a b i l i z e . The l a t t e r require from one h a l f to two t h i r d s of the t r a i n i n g set at hand whereas the former increases with constant rate to the end of the t r a i n i n g . ( F i g . 4.5.1(a)). The main point of i n t e r e s t i n 91 Character Class Proportion of Samples Incorrect I d e n t i f i c a t i o n c o r r e c t l y i d e n t i f i e d . • A 1.000 . B 1.000 C .500 U .300 I .100 (J,U) .100 D 1.000 E 1.000 F .750 E .125 T .125 G .166 C .166 (J,u,c) .333 (L,V,Y) .333 H .750 Z .250 I 1.000 • J .625 U .250 c .125 K 1.000 L .400 V .600 M 0.000 W 1.000 N .250 H .500 Z .250 0 .750 P 1.000 Q 1.000 R .899 Q .111 S .875 c .125 T 1.000 U .333 c .556 J .111 V .375 L .375 Y .250 w .750 M .250 x .778 K .222 Y .572 V .428 Z .286 N .286 H .286 S .142 Table 4.7.2 Results of the Sequential Strategy by Character Class - Training Mode. O i _ > - L— o o °- ° 60 I . . . 0 20 40 60 80 100 120 140 160 180 200 220 Number of Training Patterns F i g . 4.7.1 Training Curve Giving Success of the Combined Strategies the r e s u l t s above are the t r a i n i n g curves for the machine, as shown i n figures 4.5.1 and 4.7.1. I t i s suggested that a pattern set of no les s than 390 samples of 15 sans - s e r i f font s t y l e s , s i m i l a r to those exhibited i n the Appendix, would provide s u f f i c i e n t t r a i n i n g . The exceptionally poor r e s u l t s obtained from samples of the character 'G' are due to the r e l a t i v e l y large number of d i f f e r e n t patterns i n which i t i s manifest. Some examples are shown i n the Appendix. In addition, the p r o b a b i l i t y P(C X) of t h i s class i s r e l a t i v e l y low for the t r a i n i n g set used. Before compiling the above r e s u l t s , a test was performed on the second stage l o g i c using only a sin g l e weight vector. For comparison, the r e s u l t s from t h i s are given i n table 4.7.3.along with the f i n a l values f or the f i v e sets of weights defined i n the previous section. In the case of the s i n g l e weight vector, f i n a l recognition r e s u l t s showed l i t t l e improvement over those reported i n section 4.5 which were based on the subset score s. alone. The sin g l e set of weights s t a b i l i z e d to 93 A. Single Set of Weights Component x^ ., k = 1 2 3 4 5 6 7 8 Weight w. .29 .30 .30 .29 .38 .47 .35 .41 B. Mu l t i p l e Sets of Weights Component x^> k = 1 2 3 4 5 6 7 8 Weight w^  .30 .29 .35 .19 .33 .47 .40 .42 {L, V, Y} .37 .40 .29 .31 .39 .37 .32 .37 ( J , u, C} .36 .27 .30 .22 .38 .45 .38 .40 {N, H, Z} .32 .28 .31 .28 .38 .42 .40 .40 {M, W} .35 .32 .34 .29 .36 .37 .37 .40 { a l l others} Table 4. 7.3 Weights which Result from i the f u l l T r a ining Set t h e i r f i n a l values much more slowly than do the multiple weights. Table 4.7.3 shows that, among the l a t t e r , the maximum deviation (columnwise) of the w^  across the f i v e sets of weights ranges from a high of 27.6% of the l a r g e s t value for k = 2 to a low of 11.9% for k - 8. Re c a l l i n g that the measure involved non-peak area evaluation of the ridge function, i t i s not s u r p r i s i n g that t h i s measure finds r e l a t i v e importance i n the p a r t i t i o n i n g of the set {J, U, C}. The fact that each of the w^  do d i f f e r ^ according to the character set with which they are associated, by amounts greater than 10% supports t h e i r use rather than the sing l e set of weights. A further test of the sequential decision process i s reported i n the following section. Having trained the machine on the f u l l pattern set, and generated the contents of table 4.5.1, the set i s again processed, but with machine operating i n the i d e n t i f i c a t i o n mode. The error rate 94 due to i n s u f f i c i e n t t r a i n i n g and erroneous codeword generation i s la r g e l y eliminated. This allows a more meaningful appraisal of the decision processes than do the methods-described above. 4.8 Results on the f u l l y Trained Machine The f u l l y trained machine i s simulated by having i t process i t s t r a i n i n g set while operating i n the i d e n t i f i c a t i o n mode. Thus, every k. and every a s s o c i a t i o n between a k. and a C 1 with non-zero p!. which J 3 J i i s generated w i l l be found i n the table MT. Also, the p r o b a b i l i t i e s P(kj) and PCC 1) w i l l be t r u l y representative of the environment. The o v e r a l l r e s u l t s f o r the f u l l pattern set of 234 patterns from n = 24 classes are given i n table 4.8.1 according to the s i z e of the subset I which r e s u l t s from the f i r s t l e v e l of the strategy. The table i s of the same form as table 4.7.1. The columns, i n order, contain the frequency of occurrence of subsets of each s i z e , the percentage of input patterns which r e s u l t i n subsets containing the correct i d e n t i t y , the success of the codeword strategy i n i d e n t i f y i n g the input on the basis of the s. and, f i n a l l y , the o v e r a l l success of the combined s t r a t e g i e s . Occurrence of Success of Success of Subset Subset Subsets Con- the Subset the Combined Size Occurrence t a i n i n g the Score Pre- Strategies Correct Identity d i c t i o n 1 51.3% 50.5% 50.5% 50.5% 2 20.1 20.1 17.9 18.7 3 23.9 23.5 12.0 , 16.8 4 4.7 4.7 1.3 3.7 Table 4.8.1 Results by Subset Size of the Sequential Stratey on the f u l l y Trained Machine - I d e n t i f i c a t i o n Mode 95 The r e s u l t s show that 98.9% of the subsets I contained the correct i d e n t i t y and 81.7% of the pattern set was c o r r e c t l y i d e n t i f i e d on the basis of the s. alone. The recognition rate f o r the combined st r a t e g i e s i s 89.7%. Subsets of more than one element resulted from 48.7% of the input pattern set and, of those subsets, 99.2% include the correct i d e n t i t y . Further, of the subsets which contain the correct i d e n t i t y of the input pattern, 64.5% r e s u l t i n correct f i n a l i d e n t i t y of the input by v i r t u e of the score s^ alone. In comparison to the l a s t f i g u r e , the second l e v e l strategy gives a success rate of 80.3% on those subsets which contain the correct i d e n t i t y . The l a s t two figures are both l e s s than the corresponding values reported for the t r a i n i n g process. The increased o v e r a l l recognition rate derives l a r g e l y from the fact that 8.4% more of the subsets contained the correct i d e n t i t y i n the l a t t e r set of r e s u l t s than i n the former. For 9.8% of the input patterns the subset score predicts an inc o r r e c t r e s u l t whereas the f i n a l outcome proves successful while, for 2.2%, the converse i s true. Tables 4.8.2 and 4.8.3 contain an itemized account of the r e s u l t s of the f i r s t l e v e l of the strategy according to character c l a s s . Similar r e s u l t s for the o v e r a l l sequential strategy are given i n table 4.8.4. These are i d e n t i c a l i n form to the tables 4.5.3 and 4.5.4 r e s p e c t i v e l y . The observation has been made, i n conjunction with f i g u r e 4.5.1 that patterns of some type s t y l e s generate new associations and new at a greater rate than those of other s t y l e s . S i m i l a r l y , these patterns contribute heavily to the recognition error rate. To give some idea of the e f f e c t of font s t y l e v a r i a t i o n on the recognition r e s u l t s , table 4.8.5 Character Class Proportion of the Samples of each Class correctly identified in Subsets of: :One element. More than one element. A .857 .143 B 1.000 C .100 .900 D .500 .500 E 1.000 F .875 G .333 .500 H .778 .222 I - 1.000 J - 1.000 -K 1.000 L - 1.000 M - .834 N - 1.000 0 .625 .375 P .875 .125 Q .875 R 1.000 S .750 .250 T 1.000 U * .125 .875 V - 1.000 W .125 .875 X .857 .143 Y .572 ' .428 Z - 1.000 Column totals 51.3% 47.4% Incorrect Identification E .125 (LVY) .166 W .166 R .125 1.3% of the pattern set. Table 4.8.2 Results of the Codeword Strategy by Character Class - Identification Mode. Table 4.8.3 Identity Candidate Subsets with an Average s_^  f o r each Element and the Occurrence of each Character Class Subset as a Proportion of the Number of Samples from each C - I d e n t i f i c a t i o n Mode. Candidate and Average Score Occurrence A A .040 P .010 .143 C C .680 U .470 G .415 .200 C .105 U . .035 I .095 .200 C 4.770 u 4.180 .100 C .080 I .120 .100 C .270 u .170 G .160 J .090 .100 C .060 u .030 I .060 J .030 .100 C .030 u .040 J .070 .100 D D 2.200 0 .800 .500 G G .120 u .140 c .200 J .120 .167 G .400 • u .430 C .670 .167 G 1.140 S 2.740 .167 H H .070 N .085 Z .060 .222 I 1» .114 c .050 1.000 J J .080 u .030 .125 J .088 u .056 C .042 .625 J .110 u .190 C .290 G .170 .125 J .060 u .030 C .030 L .020 .125 L . L .040 V .033 Y .020 .889 L .040 V .020 U .050 C .030 •.'111 .900 .500 1.000 1.000 Table 4.8.3 (Continued) Character Class M M • N N N 0 0 P P s s u u u u u V V w w ' X *x Y Y z z z Candidate and Average Score .152 W .225 .070 H .057 Z .047 .180 1.320 D 2.300 .040 A .020 2.335 G .885 .365 C .250 .100 C .090 J .060 .315 C .360 J .145 .040 C .030 J .080 .034 L .040 Y .016 .250 M .170 .643 K .446 .023 L .040 V .030 .050 H .062 N .072 .060 H .050 N .060 Occurrence .834 .900 .105 .100 1.000 .375 .125 .250 .250 .250 .120. .250 L .'.030 .125 .875 1.000 .875 .143 .151 ,428 .833 .167 1.000 99 Character Class Proportion of Samples Incorrect I d e n t i f i c a t i o n c o r r e c t l y i d e n t i f i e d . A 1.000 B 1.000 C .727 J .091 D 1.000 E 1.000 F .889 E .111 G .427 C .430 H 1.000 I 1.000 J .889 U .111 K 1.000 L .455 V .545 M .167 W - .833 N .727 Z .273 0 1.000 P .889 A .111 Q 1.000 R 1.000 S 1.000 T 1.000 U .778 J .222 V .778 L .222 w 1.000 X .800 K .200 Y .750 L .250 Z .570 N .430 .091 U .091 « Table 4.8.4 Results of the Sequential Strategy, by Character Class, on the f u l l y Trained Machine - I d e n t i f i c a t i o n Mode. 100 provides the above r e s u l t s ordered according to font. Each type s t y l e i s i d e n t i f i e d by i t s Lettraset code. Some examples from the pattern set are shown i n the Appendix. Style No. Recognition Style No. Recognition rate rate 324 91.8% 171 92.3% 671 85.0 183 87.5 100 92.0 579* 75.0 109 96.0 709 96.3 656* 100.0 587 71.0 *sets containing l e s s than 18 elements Table 4.8.5 Results of the Sequential Strategy by Font Style By twice repeated t r a i n i n g on a set of 52 patterns of s t y l e s No. 324 and 671, the recognition rate over t h i s set increases to a maximum of 96%. Further t r a i n i n g on t h i s r e s t r i c t e d set does not enhance the r e s u l t s . The 4% minimum error rate involves patterns belonging to the confusion set {J, Y, C}. The decision'processes discussed i n t h i s chapter have found t h e i r roots l a r g e l y i n h e u r i s t i c s a r i s i n g from the scrutiny of i n d i v i d u a l patterns and groups of patterns which represent c e r t a i n characters of the environment. The use of precise measurements which are derived from the ridge function has been generally avoided. A l l measures which have been employed .were selected on the grounds of t h e i r apparent importance to human perception. A l i m i t e d set of experiments has also been performed on decision s t r a t e g i e s which are of & somewhat d i f f e r e n t nature. They d i f f e r from the methods which have been discussed i n that they r e l y on some very exact mathematical evaluation of the input pattern set. Without exception, these methods have resulted i n l i t t l e success. 101 Two of these s t r a t e g i e s w i l l be discussed b r i e f l y . One technique employs a weighted, l i n e a r minimum distance c l a s s i f i e r which operates i n a descriptor space defined by the f i r s t m c o e f f i c i e n t s of the l e a s t -square Fourier approximation to the ridge function. For each character class of the environment, an averaged reference vector i s computed on the basis of the t r a i n i n g set. To i d e n t i f y an unknown input pattern, i t i s f i r s t transformed into the descriptor space and then the distances between i t and each of the reference vectors are compared. For two reasons t h i s type of ridge function abstraction i s found unsuitable for e i t h e r the problem of pattern or character recognition. For one, the Fourier approximation, for a s a t i s f a c t o r y ' f i t ' to most samples of ridge function, contains high frequency components which, compared with the dominant components, have r e l a t i v e l y high magnitude. Thus to adequately describe the input pattern set, the s i z e of the descriptor space becomes unwieldy. Secondly, and most important, the Fourier co-e f f i c i e n t s are extremely s e n s i t i v e to small changes i n the ridge function and thus vary widely between patterns of the same class due to minor differences between the patterns themselves. The second strategy to be mentioned involves the c r o s s - c o r r e l a t i o n of input patterns. On the basis of the t r a i n i n g set, an average ridge function i s computed as a reference fo r each character c l a s s . I d e n t i f i c a t i o n i s made by c r o s s - c o r r e l a t i n g the unknown input pattern with each of the reference functions and s e l e c t i n g the class associated with that reference which y i e l d s the highest c o r r e l a t i o n f a c t o r . Since the ridge function i s independent of source pattern o r i e n t a t i o n , for r e g i s t r a t i o n purposes, the c r o s s - c o r r e l a t i o n must be made with the use of a s h i f t r e g i s t e r . Compared with other 102 decision s t r a t e g i e s , t h i s involves a r e l a t i v e l y large amount of compu-t a t i o n . Without some s p e c i a l preprocessing of the input pattern set, the method y i e l d s poor r e s u l t s due to v a r i a t i o n between patterns of the same cl a s s . These v a r i a t i o n s make d i f f i c u l t the r e g i s t r a t i o n of l i k e patterns for the evaluation of the reference functions during t r a i n i n g . I f that problem i s overcome, the reference functions may exhibit e i t h e r a m u l t i -p l i c i t y of peaks or extremely broad features where only one sharp feature should occur. Only i f some degree of quantization i s applied to the input pattern set do s a t i s f a c t o r y r e s u l t s occur. This has been demon-strated by the second l e v e l of the sequential decision strategy. « 103 5. CONCLUSIONS The experimental r e s u l t s reported i n t h i s thesis have served to demonstrate that a character recognition machine can be designed which, employing the ridge function as i t s sole pattern descriptor, i s capable of recognition rates which exceed 89%. A study of the r e s u l t s shows t h i s value to be a p e s s i m i s t i c lower bound set by experimental technique rather than l i m i t a t i o n s of the ridge function as a pattern d e s c r i p t o r . This a s s e r t i o n i s borne out by the human subject who, with the possible exception of the N-Z and M-W p a i r s , can i d e n t i f y the elements of the "input pattern set with an error rate which, depending upon experience, i s no more than 4%. Thus, the success to be expected of automatic recognition techniques should be no less than 95%. The discrepancy e x i s t i n g between t h i s a n t i c i p a t e d value and the r e s u l t s a c t u a l l y achieved i s l a r g e l y a t t r i b u t e d to the c h a r a c t e r i s t i c s of the operating system as w i l l be explained below. To achieve conclusive r e s u l t s i n evaluating the ridge function as a pattern descriptor, c e r t a i n refinements are necessary i n both the operating system and the pattern set which forms the environment. The most important modification to be made to the operating system involves the addition of a feedback l i n k from the decision unit to the simulator. T h i s feature w i l l allow the s e l e c t i o n of near-optimal values f o r such simulator parameters as the degree of quantization of the pattern f i e l d . To date, these parameters have remained f i x e d at values chosen p r i o r to the achievement of any experimental results*. They are found to be non-optimal and to r e s u l t d i r e c t l y i n a number of the confusion'sets encountered. Improved operating conditions are expected to eliminate these 104 sets which generally r e s u l t from a f a i l u r e to detect poorly defined ridge function features. They include the sets {V,Y}, {K,X}, {0,1} and {N,H} as well as those a t t r i b u t e d to the detection of spurious peaks due to noise. The e f f e c t of an increased sampling density from the pattern f i e l d on the remaining confusion sets cannot be s p e c i f i c a l l y predicted. These are the sets generally characterized by patterns with sections of contour which are smooth arcs having curvature of low magnitude. However, improved r e s u l t s are anticipated. T o t a l improvement i n r e s u l t s achieveable by modification of the simulator i s expected to exceed 4%. Two ways by which these r e s u l t s may be achieved are f i r s t , by a d i r e c t change i n the degree of quantization of the pattern f i e l d or secondly, by repeated processing of each source pattern at d i f f e r e n t orientations r e l a t i v e to the coordinates of the f i e l d . Assuming that the d e c i s i o n unit has an error detection c a p a b i l i t y , the use of time-varying feedback to the simulator w i l l allow the e f f e c t i v e quantization of the pattern f i e l d to be adjusted according to the properties of each set of source patterns encountered. This a b i l i t y i s expected to improve machine e f f i c i e n c y . Further improvement may be achieved by the use of a simulator expressly designed for the purpose of pattern or character recognition. This machine would best employ a contour t r a c i n g technique rather than a scan of the whole pattern f i e l d . It would also take f u l l advantage of the area operator model developed by Connor. t The learning curves for the machine have shown the pattern set used for the experiments to be of i n s u f f i c i e n t s i z e . In expanding the environment to include a greater number of font s t y l e s , the question 105 a r i s e s of l i m i t i n g t h e i r v a r i e t y . To date, the only r e s t r i c t i o n s placed on the elements of the pattern set have been that they exhibit a c e r t a i n 'thickness' and be devoid of s e r i f s or any other such embellishment. Assuming an a r b i t r a r i l y large pattern set, the recognition rate achieved i s expected to be inversely proportional to the v a r i e t y which ex i s t s between the more d i s s i m i l a r font s t y l e s . Thus, to achieve a set l e v e l of success, further r e s t r i c t i o n s on the choice of font s t y l e may become necessary. Further, regarding the choice of pattern set, whenever a character i s manifest i n a number of b a s i c a l l y d i f f e r e n t patterns, the environment.must include a number of samples of each type of pattern. The experimental environment has, i n part, f a i l e d to meet t h i s requirement. This may be seen with reference to the Appendix which shows, among the 9 patterns associated with the character 'G', only one (Style No. 579) devoid of the short h o r i z o n t a l bar. S i m i l a r l y , one and only one pattern associated with the character 'U' (again, of Style No. 579) ex h i b i t s a small 'tab'. I t cannot be expected that any such exception to the ru l e be c o r r e c t l y i d e n t i f i e d . In the event that the aforementioned changes to the simulator and pattern set are invoked, modifications to the second l e v e l of the sequential decision strategy w i l l become necessary. Since measures on the ridge function such as peak amplitude and r e l a t i v e peak l o c a t i o n become more precise descriptors of the contour as the»d.c. noise l e v e l decreases, t h e i r l e v e l of quantization w i l l require change. As w e l l , changes which may occur among the confusion sets w i l l l i k e l y d i c t a t e a r e d i s t r i b u t i o n of the multiple sets of weights. 106 F i n a l l y , i t i s r e i t e r a t e d that no account has been taken of pattern o r i e n t a t i o n r e l a t i v e to the pattern f i e l d . In considering the problem of pattern recognition, i t i s generally desireable that the pattern descriptor, l i k e the ridge function, i s quite independent of o r i e n t a t i o n . Conversely, f or character recognition, pattern o r i e n t a t i o n often serves to be a useful a t t r i b u t e of the pattern d e s c r i p t o r . In the case at hand, many of the confusion sets encountered could be r e a d i l y p a r t i t i o n e d were r e l i a b l e o r i e n t a t i o n information a v a i l a b l e . 107 APPENDIX The following pages contain sets of 26 examples each of nine of the font styles contained in the pattern set. They are identified according to their style number in the Letraset series. Style No. 579 108 S t y l e N o . 587 S t y l e No. 100 S t y l e No. 324 S t y l e No. 171 1 1 0 S t y l e No. 656, I l l Style No. 671 112 S t y l e No. 709 113 BIBLIOGRAPHY 1. Attneave, F., "Informational Aspects of Visual Perception", Psychol Rev., Vol. 61, pp. 183-193, 1954. 2. Bennett, J. R., "A Study of the Human Visual Evoked Potential", M.A.Sc. Thesis, University of British Columbia, Dec. 1968. 3. Connor, D. J., "A Special Purpose Computer to Simulate a Visual Receptor Network for Pattern Recognition Studies", M.A.Sc. Thesis, University of British Columbia, June 1965. 4. Connor, D. J., "Lateral Inhibition and the Area Operator in Visual Pattern Processing", Ph.D. Thesis, University of British Columbia, July, 1969. 5. Hartline, H. K., R a t l i f f , F., and Miller, W. H., "Spacial Summation of Inhibitory Influences in the Eye of Limulus, and the Mutual Interaction of Receptor Units", in Nervous Inhibition, ed. E. Florey, Pergamon Press, New York, pp. 241-284, 1961. 6. Ho, Y-C, "On Pattern Classification Algorithms, Introduction and Survey", Proceedings of the I.E.E.E., Vol. 56, No. 12, pp. 2101-2114, Dec. 1968. 7. Liu, C. N., and Shelton, G. L. Jr., "An Experimental Investigation of a Mixed-Font Print Recognition System:, I.E.E.E. Transactions  on Electronic Computers, Vol. EC-15, No. 6, pp. 916-925, Dec. 1966. 8. Masnakosa, V., "Practical Realization of a Pattarn Recognition on a Digital Computer", Engineering Cybernetics, No. 1, pp. 143-147, 1968. 114 9. Nadler, M., "An Analog-Digital Character Recognition System", ' I.E.E.E. Transactions on Electronic Computers, Vol. EC-12, No. 6, pp. 814-821, Dec. 1966. 10. Nagy, G. , "State of the Art in Pattern Recognition", Proceedings  of the I.E.E.E., Vol. 56, No. 5, May 1968. \ 11. .Nilsson, Nils J., "Learning Machines", McGraw-Hill, New York, 1965. I 

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.831.1-0052033/manifest

Comment

Related Items