UBC Theses and Dissertations


Separation and recognition of connected handprinted capital English characters Ting, Voon-Cheung Roger 1986

Full Text

SEPARATION AND RECOGNITION OF CONNECTED HANDPRINTED CAPITAL ENGLISH CHARACTERS

By VOON-CHEUNG ROGER TING
B.Sc., Worcester Polytechnic Institute, Worcester, Massachusetts, U.S.A., 1984

THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in THE FACULTY OF GRADUATE STUDIES, Department of Electrical Engineering

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
November 1986
© Voon-Cheung Roger Ting

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Electrical Engineering, The University of British Columbia, 1956 Main Mall, Vancouver, Canada V6T 1Y3
Date

ABSTRACT

The subject of machine recognition of connected characters is investigated. A generic single character recognizer (SCR) assumes there is only one character in the image. The goal of this project is to design a connected character segmentation algorithm (CCSA) without the above assumption. The newly designed CCSA will make use of a readily available SCR.

The input image (e.g. a word with touching letters) is first transformed (thinned) into its skeletal form. The CCSA will then extract the image features (nodes and branches) and store them in a hierarchical form. The hierarchy stems from the left-to-right rule of writing of the English language. The CCSA will first attempt to recognize the first letter. When this is done, the first letter is deleted and the algorithm repeats.

After extracting the image features, the CCSA starts to create a set of test images from the beginning of the word (i.e. the beginning of the description). Each test image contains one more feature than its predecessor. The number of test images in the set is constrained by a predetermined fixed width or a fixed total number of features. The SCR is then called to examine each test image. The recognizable test image(s) in the set are extracted. Let each recognizable test image be denoted by $C_1$. For each $C_1$, a string of letters $C_2, C_3, \ldots, C_L$ is formed. $C_2$ is the best recognized test image in a set of test images created after the deletion of $C_1$ from the beginning of the current word. $C_3$ through $C_L$ are created by the same method. All such strings are examined to determine which string contains the best recognized $C_1$.
Experimental results on test images with two characters yield a recognition rate of 72.66%. Examples with more than two characters are also shown. Furthermore, the experimental results suggest that topologically simple test images can be more difficult to recognize than those which are topologically more complex.

TABLE OF CONTENTS

Abstract
Table of contents
List of figures
List of tables
List of abbreviations
Acknowledgements
I. Introduction
   1. Preliminary discussion
   2. Making use of the left-to-right rule and a "black box"
   3. Outline of the experiment
   4. Data set
II. Preprocessing
   1. Introduction
   2. Basics on topology and discrete geometry
   3. Filling pinholes of size one pixel
   4. Thinning
   5. Identifying nodes and eliminating noisy branches
   6. Corner smoothing algorithm
III. Image feature extraction
   1. Introduction
   2. Fuzzy image subsets
   3. The problem with back-to-back nodes
   4. Image features
      4.1. Discussions
      4.2. Image features definitions
   5. Feature extraction algorithm
   6. Algorithm FE1 and related subroutines
      6.1. Algorithm reset back-to-back nodes
      6.2. Algorithm swap node
      6.3. Algorithm trace branches
IV. The segmentation process
   1. Introduction
   2. Choosing features to create test images
   3. SCR
   4. Updating the Image Feature Description Table (IFDT)
   5. Steps taken before and after the calling of the SCR
   6. Segmentation routine
      6.1. Algorithm find test images
      6.2. Algorithm find min
      6.3. Algorithm subt
V. Conclusions
   1. Introduction
   2. Experimental results
   3. Basic feature vs. basic character
   4. Conclusion and future prospects
Appendixes
   1. Thinning binary images
   2. Definitions for termination pixels
   3. Single character recognition algorithm
References

LIST OF FIGURES

I-3.1. Block diagram showing the general idea of the CCSA
I-4.1. Typical input images to the CCSA. The images represent the characters "EP" and "HS" respectively
II-1.1. A breakdown of the steps taken in preprocessing
II-2.1. The W operator [1]
II-2.2. The T operator
II-2.3. To show connectivity definitions
II-2.4. A typical combination showing 3 branches connected to each other
II-2.5. Freeman's directions defined using the mask of the W operator [1]
II-2.6. Freeman's directions defined using the mask of the T operator
II-2.7. Path AB's representations
II-3.1. Condition to fill single pixel pinholes
II-5.1. Examples of trimming noisy branches
II-6.1-4. A diagram for discussing corner smoothing
II-6.5. Masks for smoothing corners
III-1.1. A block diagram showing the feature extraction routine and its related subroutines
III-2.1. To illustrate the idea of degree of membership
III-3.1. Example of back-to-back nodes
III-5.1. Re-labelling back-to-back nodes
III-5.2a-b. Example on branch tracing order when the CCSA is at a node
III-5.3a-b. An example of an IFDT
III-5.2c. Graphical representation of IFDT
IV-1.1. A block diagram showing the segmentation routine and its related subroutines
IV-2.1. A set of test images chosen by find test image
IV-2.2. Two examples of BIBs
IV-3.1. Handprinted character recognition statistics for tuning the SCR
IV-6.0.1. The IMF's content after the first pass of the image and the IFDT in figure III-5.2 through SRI
IV-6.0.2. The IMF's content after the second pass of the image and the IFDT in figure III-5.2 through SRI
IV-6.0.3. The IMF's content after the third pass of the image and the IFDT in figure III-5.2 through SRI
IV-6.0.4. The IMF's content after the fourth pass of the image and the IFDT in figure III-5.2 through SRI
IV-6.0.5. The IMF's content after the fifth pass of the image and the IFDT in figure III-5.2 through SRI
V-1.1. An example of an image recognized by the CCSA
V-1.2. An example of an image recognized by the CCSA
V-1.3. An example of an image recognized by the CCSA
V-1.4. An example of an image not recognized by the CCSA
A1.1. Local neighbours of a multiple pixel P
A1.2. Local neighbours of a tentatively multiple pixel P
A1.3.1-3. Local neighbours of a removable tentatively multiple pixel P
A1.4. An example showing the status of an image during thinning
A2.1. First condition for a pixel P to be relabelled as 3
A2.2. Second condition for a pixel P to be relabelled as 3
A2.3. Conditions for an illegal J4 pixel P

LIST OF TABLES

IV-3.1. Images used in generating the plot of figure IV-3.1
V-2.1. Segmentation and recognition results
A3.1. A typical NDT
A3.2. Tables from [1] describing the old dictionary
A3.3. New dictionary's grouping
A3.4. Equivalence between chain coded digraph and ARG NDTs
A3.5. To show the additional deformation function

LIST OF ABBREVIATIONS

1. CCSA - Connected Characters Segmentation Algorithm
2. DBRS - Distance Based Reduction Scheme
3. IFDT - Image Feature Description Table
4. SCR - Single Character Recognizer

Acknowledgements

Three people made substantial contributions that made this research project possible. I would like to thank Dr. Rabab Ward for her exceptional guidance and assistance. I would also like to thank Mr. Ing Woo Wong, who provided me with all his research materials and with assistance in executing his simulation program. Finally, I would like to thank Dr. Michael P. Beddoes for his help, which provided me with the motivation.

CHAPTER I
INTRODUCTION

1. Preliminary discussion

Character recognition has been an active area of scientific research for many years. A less explored subject of character recognition is the separation of hand-printed connected characters. It is called connected character segmentation. The assumption of having only one character in the image is usually implied when researchers speak of character recognition. The distinction here is that no such assumption is made in this research project.

Connected character segmentation has been discussed only briefly, and only in a small number of research papers. It is considered to be a much more difficult problem than single character recognition [2]. The assumption of isolated characters, or "a character in a box," is clearly not acceptable for certain practical applications such as cursive-script recognition, automatic hand-written text transcription, automatic word processing, true office automation, etc.

A character recognizer can be considered as a specialized machine or as an algorithm on a general purpose machine (i.e. a computer). A generic approach in character recognition would be to first acquire a good description of the input image.
Then, by following a set of rules, the description of the input image is compared with a set of stored descriptions. By finding the best match, a character is recognized.

The current interest in this area of research is the recognition of Chinese and Japanese characters. Many published papers on character recognition have to do with the recognition of these characters. The skewed interest towards Chinese and Japanese characters is probably due to the complexity of the characters involved. Instinctively, one would expect that the more complicated the characters are, the more difficult it will be for a machine (or even a human) to recognize them.

Chinese, Japanese, Korean, etc. are languages which forbid touching characters. (Touching characters should not be confused with strokes touching each other. It is common in written Chinese for several strokes to merge into a single stroke.) A single character recognizer would be adequate for the identification of a legible image in these languages. However, in the English language, touching characters are both legible and acceptable. There are also the cases of cursive scripts and italics, which are widely used in our everyday communication.
In this research project, the case of hand-printed connected characters is considered. It is hoped that this examination may provide more useful knowledge on the subject of character segmentation in general.

Despite the difficulties in recognizing connected English characters, there are several research papers which are related to character segmentation. In reference [29], four earlier cursive script recognition algorithms are presented. The first algorithm places few constraints on handwriting. It was implemented on the M.I.T. Lincoln Laboratory's TX-2 computer and takes 15 seconds for each run. There are 10 000 words in its dictionary and each word is tested by the algorithm. No recognition rate is quoted, but the author stated that cases of misrecognition did occur.

The second algorithm presented is different from the other three because it can recognize individual characters in a word. However, it places many constraints on the writer. The individual character recognition rate and the character segmentation rate are quoted at 60% and 87% respectively.

The third algorithm identifies words using features of local maxima or minima in the vertical and horizontal directions.
When the dictionary and the input were from the same writer, the word recognition rate was 30%. With more constraints imposed upon the writer, the rate improved to nearly 60%.

The fourth algorithm employs an analysis-by-synthesis method to recognize words. Only 91 out of 100 tested samples were recognized, even though the machine had previously been trained on the same 100 samples. At least one of the designers in reference [29] has claimed that "reliable recognition of cursive scripts has not yet been achieved" (article published 1965).

Reference [14] also presents a cursive script algorithm. It traces out the "arcs, loops, and corners" as characteristic curves. A syntax-directed method then classifies a sequence of such characteristic curves and compares the sequence to the reference.

Reference [32] identifies isolated or connected numerals by checking the image's profile on the left and the profile on the right. However, it does not discuss the complete algorithm for the connected numerals.

Reference [30] uses features such as threshold crossings, local vertical maxima and minima, dots (as in "i" and "j") and crossbars (as in "t"), closures, stroke directions, apexes and cusps, etc. to describe the input word image.
Recognition rates from 64.1% to 96.8% were achieved using various sets of data for training and testing. However, the pool of data tested was too small to be conclusive, and the algorithm, which was not well documented, seems to be exceedingly complex.

There appears to be a lack of well documented connected character segmentation algorithms (CCSA). However, based on the above discussion, we can classify CCSAs into two different approaches. The first one, which was used by most of the algorithms discussed above, is the whole word approach. Usually, an extremely large dictionary is required but the algorithm is simple. This approach has the serious disadvantage of being unable to recognize words that are not in the dictionary. Reference [30] even claimed that it is "a potentially faster and more accurate recognition scheme."

The other approach is the character by character approach. It is investigated by only one researcher in reference [29]. This approach is regarded as more difficult than the former, but it is capable of recognizing words made of any combination of characters. Furthermore, the size of the dictionary is much smaller because the dictionary only needs to store individual characters and their variations.
Therefore, this approach is used here.

Languages rely on structures and relationships for the communication of ideas. For example, the characters "B," "O," and "Y" are meaningless until they are combined in a specific form such as "BOY." Often, these structures and relationships are not compatible with a machine's language. Therefore, one can either design a new machine specifically for character recognition based on the structures and the relationships, or use a general purpose machine (i.e. a computer) and design the algorithm to closely approximate these structures and relationships.

For separating connected characters, we would want a description of the word "BOY," for example, in the dictionary such as:

"The word 'BOY' consists of two touching loops (closures) on the left hand side of the image. To the right of these loops, we have a larger loop. Further to the right of the larger loop, we have three branches joining at a node."

A possible algorithm to identify the word "BOY" given the above one word "dictionary" would be:

1. Look for two touching loops on top of each other on the left hand side of the image. If not found, then goto *.
2. Look for a larger loop to the right of the two touching loops. If not found, then goto *.
3. Look for three branches joining at a node, to the right of the larger loop. If not found, then goto *, else stop and claim that the input was the word "BOY."
* 4. Claim that the input was not the word "BOY."

Here, we can say that the loops, nodes, and branches are structures (i.e. topological information), and that "touching," "on the left hand side of," and "on the right hand side of" are the relationships.

2. Making use of the left-to-right rule and a "black-box"

On considering hand-printed English, it should be obvious that there is a consistent direction in which characters are written and interpreted. In the English language we always observe the "left-to-right" rule. Furthermore, the top half of an English word usually carries more information for the reader than the bottom half of the word. One can cover the bottom half of a line and still read the line. However, if the top half is covered, one would find it harder to read the same line.

Although there is no direct link between this rule and the materials on fuzzy set theory in [4] and [35], careful analysis suggested that harnessing the writing rule of English should help the design of the segmentation process.

The most obvious approach would be to simply scan an image from left to right and look for a match with a window.
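As a rough illustration of the scan-and-match idea just mentioned, the following minimal sketch slides a window across the columns of a binary image and compares it against stored templates. It is not taken from the thesis: the function name `scan_and_match`, the image encoding (rows as strings, `1` for ink and `.` for background), the toy templates, and the use of exact matching are all assumptions made for illustration.

```python
from typing import Dict, List


def scan_and_match(image: List[str], templates: Dict[str, List[str]]) -> str:
    """Scan columns left to right; emit a label whenever a template fits exactly.

    Assumes every template has the same height as the image (hypothetical setup).
    """
    width = len(image[0])
    x = 0
    labels = []
    while x < width:
        for label, template in templates.items():
            w = len(template[0])
            fits = x + w <= width and all(
                row[x:x + w] == t_row for row, t_row in zip(image, template)
            )
            if fits:
                labels.append(label)
                x += w  # jump past the matched window
                break
        else:
            x += 1  # nothing matched here; move the window one column right
    return "".join(labels)


# Toy 3-pixel-tall templates and an image of "T" and "L" separated by one blank column.
TEMPLATES = {
    "T": ["111", ".1.", ".1."],
    "L": ["1..", "1..", "111"],
}
IMAGE = [
    "111.1..",
    ".1..1..",
    ".1..111",
]
result = scan_and_match(IMAGE, TEMPLATES)
```

With well separated, perfectly formed characters such a scan succeeds, but exact window matching has no way of using the structural relationship between neighbouring characters.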
This scan and match method is a reasonable approach. However, it lacks the sophistication of utilizing the highly structured relationship between the left hand side and the right hand side of a sequence of characters. It should become obvious in the later part of this thesis that the newly designed algorithm presented in this project is a more generalized approach than the row by row scan method.

If a "ready-made" single character recognizer is used as part of the CCSA, a separate scheme can be used to describe connected characters. This way, the description can be designed with the necessary adaptations and can focus on the structure and relationship issues that matter solely for segmentation. Furthermore, it will allow us to change the "module" for single character recognition if necessary.

References [1], [2], [3], and [20] were able to demonstrate competency in recognizing single characters. The other references showed more results of research in character recognition and related technology. After careful evaluation of the advantages and the disadvantages, it was concluded that the recognizer from Mr. I. H. Wong's M.A.Sc. thesis [1] was the most suitable candidate for the single character recognizer in this research project.
The "black-box" single character recognizer (or the SCR) will take binary images from the CCSA, decide which characters these images represent, and return the results to the algorithm. The related programs, the dictionary, and the data set from [1] are readily available on Mr. Wong's hard disk in our department.

3. An outline of the experiment

Images can be acquired through a television camera, graphics tablets, etc. The steps necessary to acquire the digital binary images will not be discussed here. Data can be acquired by applying optical character reading techniques similar to those found in Lunscher's M.A.Sc. thesis [42] or the region segmentation techniques found in [21]. Therefore, clean and adequately digitized input images are assumed for this project.

There are three major routines in the CCSA, which are described in chapters 2, 3, and 4. The first one is the preprocessing routine. It includes removing noisy branches, filling pinholes, smoothing corners, and thinning down the input images. Thinning is the most important of the four preprocessing steps. It is the reduction of the image to a skeleton that is 1 pixel "thick" or 1 pixel "wide."
It was shown in reference [1] that reducing the image to its skeletal form is necessary for easier data representation and manipulation.

The next step after preprocessing is the extraction of image features. Image features (i.e. descriptions of nodes and branches) are extracted from the skeletal image. A description table will store the image features. The strategy for extracting and storing the image features is to obtain a description of the nodes and branches according to a rule which approximates the English language's "left-to-right" rule.

After acquiring the image features, the CCSA will start the segmentation process. Features are chosen from the description table to form test images. The feature at the top of the description table will be chosen first. More test images will be formed and submitted to the SCR for recognition using a special scheme discussed below.

Testing every possible combination of features in the description will NOT be the method, since it is prohibitively expensive. Instead, a special scheme is employed. Each iteration of the scheme is an attempt by the CCSA to recognize the first character on the left. To recognize the first character, the scheme analyzes the results from the SCR for all test images created.
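One iteration of such a scheme might be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: features are represented by made-up names, the SCR is modelled as any callable returning a `(label, distance)` pair or `None`, and "best recognized" is taken to mean the smallest distance. The names `recognize_first_character`, `toy_scr`, and `TOY_DICTIONARY` are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-ins: a feature is just a name, and the SCR is any callable
# that returns (label, distance) for a test image, or None when it rejects it.
Feature = str
Scr = Callable[[List[Feature]], Optional[Tuple[str, float]]]


def recognize_first_character(
    description: List[Feature], scr: Scr, max_features: int = 6
) -> Optional[Tuple[str, int]]:
    """Return (label, number of features used) for the leftmost character, or None."""
    best: Optional[Tuple[str, float, int]] = None
    limit = min(max_features, len(description))
    for k in range(1, limit + 1):
        test_image = description[:k]  # one more feature than the previous test image
        result = scr(test_image)
        if result is None:
            continue  # the SCR could not recognize this test image
        label, distance = result
        if best is None or distance < best[1]:
            best = (label, distance, k)
    return None if best is None else (best[0], best[2])


# Toy dictionary standing in for the SCR (assumed values, for illustration only).
TOY_DICTIONARY = {
    ("stem",): ("I", 0.40),
    ("stem", "top_bar", "mid_bar"): ("F", 0.25),
    ("stem", "top_bar", "mid_bar", "low_bar"): ("E", 0.10),
}


def toy_scr(test_image: List[Feature]) -> Optional[Tuple[str, float]]:
    return TOY_DICTIONARY.get(tuple(test_image))


# Features of a made-up "EP" image, listed from left to right.
description = ["stem", "top_bar", "mid_bar", "low_bar", "stem", "loop"]
first = recognize_first_character(description, toy_scr)
```

Growing the test image one feature at a time keeps the number of SCR calls linear in the number of features considered, in contrast to trying every combination of features.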
After the first character is recognized, it will be logically removed from the description. The CCSA then treats the remaining part of the input image as another input image. A new description is derived from the old one, and the remaining section of the input image is ready for another iteration. The process continues until no image features are left in the description to form new test images. Figure I-3.1 is a block diagram showing the general idea of the CCSA.

4. Data set

Because of the investigative nature of this project, only part of Munson's multi-coder alphanumeric character images [2] has been chosen as a source of pre-input data. A set of 26 images from "A" through "Z" is chosen at random. The actual input images are created by sliding any two of these images together until they touch each other. Each of these new images is 24 pixels tall and of variable width, depending on the characters involved. Figure I-4.1 shows some typical input images to the CCSA.

The same set of 26 images is used to create a new dictionary using the same feature description format as described in reference [1]. The new dictionary was created to suit the experimental purpose of this project.
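The construction of the two-character input images, sliding two character images together until they touch, could be sketched as below. This is only an illustrative guess at the procedure: the function name `slide_together`, the row-string image encoding (`1` for ink, `.` for background), and the definition of "touch" as ink pixels becoming horizontally adjacent in at least one row are assumptions, and the toy images are 3 pixels tall rather than the 24 pixels used in the thesis.

```python
from typing import List

INK, BG = "1", "."


def slide_together(left: List[str], right: List[str]) -> List[str]:
    """Slide `right` towards `left` until ink pixels are adjacent in some row."""
    if len(left) != len(right):
        raise ValueError("both images must have the same height")
    wl, wr = len(left[0]), len(right[0])
    gaps = []
    for l_row, r_row in zip(left, right):
        if INK in l_row and INK in r_row:
            trailing = wl - 1 - l_row.rindex(INK)  # background after the left image's last ink pixel
            leading = r_row.index(INK)             # background before the right image's first ink pixel
            gaps.append(trailing + leading)
    if not gaps:
        raise ValueError("no row contains ink in both images, so they cannot touch")
    offset = wl - min(gaps)  # column of the right image's origin relative to the left image
    left_origin, right_origin = max(0, -offset), max(0, offset)
    width = max(left_origin + wl, right_origin + wr)
    merged = []
    for l_row, r_row in zip(left, right):
        canvas = [BG] * width
        for x, pixel in enumerate(l_row):
            if pixel == INK:
                canvas[left_origin + x] = INK
        for x, pixel in enumerate(r_row):
            if pixel == INK:
                canvas[right_origin + x] = INK
        merged.append("".join(canvas))
    return merged


# Two toy 3x3 images, each with one column of background on the facing side.
LEFT = ["11.", "1..", "11."]
RIGHT = [".11", ".1.", ".11"]
touching = slide_together(LEFT, RIGHT)
```

Because the shift equals the smallest combined gap over all rows, the two characters end up adjacent in at least one row without any ink pixels overlapping.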
The old dictionary from reference [1] was too large (231 characters in total, since there can be more than one topological shape for a character, e.g. "Z" and "Ƶ") and is not employed here. However, one should keep in mind that expanding the dictionary to suit one's needs is entirely possible.

All the test images contain two characters, except for some examples in the later part of this thesis. This investigation considered connected characters which are not connected by their end-points and which do not have their side-by-side branches touching after thinning. The number of images which fit these constraints is 336. These images represent approximately one-half of all the possible (26^2 = 676) combinations of the set of 26 characters. It is possible to modify the CCSA to include the others.

The CCSA was implemented in PASCAL/VS on the University of British Columbia Computing Centre mainframe computer, an Amdahl 5850 running the Michigan Terminal System (MTS) operating system.

    INPUT
      -> Preprocessing
           - thinning
           - fill pinholes
           - cut noisy branches
           - cut corners
      -> Feature extraction
           - create a hierarchy description with the leftmost node on top
      -> Segmentation process
           - form test images according to the hierarchy
           - analyze results from the SCR and determine which test image is correct
      -> OUTPUT

    The SCR takes a binary image as input and outputs recognition results back to the CCSA.

    where:
    INPUT  = Connected handprinted binary capital English characters.
    OUTPUT = Identities of the input characters, if recognized.

Fig. 1-3.1. Block diagram showing the general idea of the CCSA.

[Figure: two binary pixel images of touching character pairs]

Fig. 1-4.1. Typical input images to the CCSA. The images represent the characters "EP" and "HS" respectively.

CHAPTER II
PREPROCESSING

1. Introduction

Raw image data is often difficult to represent on a digital computer.
It is necessary to process the input image before acquiring the image features. Preprocessing can be regarded as the steps taken after the acquisition of image data from the camera and before the execution of the image feature extraction routine. References [1], [12], [18], [30], and [42] show different preprocessing methods, each fitted to its authors' specific needs. Reference [21] shows a good method to extract machine-printed characters or words from an entire page image. It uses a rectangular region growing and cancelling method to segment regions of an input image. By changing the rectangular window's height and width, lines, columns, words, or characters of an image can be segmented. This method should be useful in separating an image into sections if the input image contains clusters of connected characters.

Preprocessing in this project includes the filling of pinholes, the deletion of noisy branches (known as "pruning" in [1]), the smoothing of 4-connected corners, and the thinning of images. The thinning algorithm from [1] was proven to be useful in transforming the input image to its skeletal form without losing the general shape and important features of the image. Therefore, the thinning algorithm from reference [1] is used here.
The other three steps are relatively simple. They are required as supplements to the thinning algorithm. Figure II-1.1 is a block diagram showing the various steps involved in preprocessing a digitized binary image.

The following is a presentation of some basic concepts of discrete geometry and basic topology. These concepts are necessary for the understanding of this project.

2. Basics on topology and discrete geometry

Topology can be thought of as "elastic geometry." Images are topologically equivalent if one can be "stretched" into the other. For example, a loop and the character "O" are topologically equivalent. In character recognition, topologically equivalent images occur frequently. For example, the character "I" (with the crossbars) and the character "H" are topologically equivalent (simply rotate and stretch the "I").

Images are usually represented in a discrete form for ease of storage, analysis, and manipulation by a computer. It is logical to represent images on a square array (or grid) of picture elements, or pixels. There are other forms of digital representation. Golay discussed transformation algorithms based on a hexagonal grid representation of binary images in reference [41].
However, the square grid is the understood standard. Almost all the research in pattern recognition, image processing, image understanding, and other related areas uses this representation.

Characters can be represented using only black and white pixels. A "1" can represent a black pixel and a "0" can represent a white one. Given an image A made up of 1 and 0 pixels, a window can be used to analyze a section of A. This window is the basic operator. Two operators are used throughout this project. The first one is identical to the operator used by Wong in [1]:

    P4 P3 P2
    P5 P  P1
    P6 P7 P8

    where:
    P           = the pixel of interest to the algorithm
    P1,P3,P5,P7 = immediate or direct neighbours of P
    P2,P4,P6,P8 = indirect neighbours of P

Fig. II-2.1. The W operator [1].

For clarity, this operator is called the W operator (W for Wong). The following outlines the other operator. It is called the T operator (T for Ting):

    P1 P4 P6
    P2 P  P7
    P3 P5 P8

    where:
    P           = the pixel of interest to the algorithm
    P2,P4,P5,P7 = immediate or direct neighbours of P
    P1,P3,P6,P8 = indirect neighbours of P

Fig. II-2.2. The T operator.

The reasons for the difference between them will be discussed in Chapter III. The CCSA uses these operators during scanning by traversing an operator over the image. At any one time, only the pixel in the middle, i.e. pixel P, can be changed by the algorithm. The other pixels in the operator's window are only read by the CCSA.

Another important concept is connectivity. Consider a collection of pixels {S}. {S} is called 4-connected if the pixels of {S} are immediate (or direct) neighbours. {S} is called 8-connected if the pixels are indirect neighbours. For example:

    0 0 0 0 0 0 0 0
    0 0 0 1 1 0 0 0
    0 0 1 0 0 1 0 0
    0 0 1 0 0 1 0 0
    0 0 0 1 1 0 0 0
    0 0 0 0 0 0 0 0

    Using the 4-connected definition only:
    the above image has 4 objects, a background, and 1 hole.
    Using the 8-connected definition only:
    the above image has 1 object, a background, and no holes.

Fig. II-2.3. To show connectivity definitions. *

In this research project, the 8-connected definition is used whenever possible and the 4-connected definition is used only when needed. For example:

    0 0 0 0 0 0 1 0 0
    0 0 0 0 0 1 0 0 0
    0 0 0 0 0 1 0 0 0
    1 0 0 0 0 0 1 1 0
    0 1 1 0 0 1 0 0 1
    0 0 0 1 1 0 0 0 1

Fig. II-2.4. A typical combination showing 3 branches connected to each other.

In this project, the focus is on the connection of pixels. As long as pixels are next to each other, no specific algorithm is required to resolve the ambiguities in figure II-2.3. However, it is important to note that the hexagonal grid does not have the mentioned ambiguities.
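The object and hole counts quoted for figure II-2.3 can be checked with a small flood-fill count. The following is a minimal sketch, not part of the CCSA: the function name `count_components` is hypothetical, the image is assumed to be laid out as 6 rows of 8 pixels, and a "hole" is taken here to be any connected region of 0 pixels other than the background.

```python
from collections import deque

# Offsets (row, col) of the four direct neighbours and of all eight neighbours.
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(image, value, connectivity):
    """Count connected groups of pixels equal to `value` (breadth-first fill)."""
    offsets = N4 if connectivity == 4 else N8
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != value or seen[r][c]:
                continue
            count += 1  # found an unvisited pixel: a new component starts here
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] == value and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


FIG_II_2_3 = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

for k in (4, 8):
    objects = count_components(FIG_II_2_3, 1, k)
    holes = count_components(FIG_II_2_3, 0, k) - 1  # 0-regions minus the background
    print(f"{k}-connected: {objects} object(s), {holes} hole(s)")
```

Under these assumptions the 4-connected count gives 4 objects and 1 hole, and the 8-connected count gives 1 object and no holes, in agreement with the figure.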
* The two connectivity definitions above are from the lecture notes of the graduate course Computer Science 505 - Image Understanding I - Image Analysis.

Freeman's chain code [1] is an effective way of representing branches. A chain code is a sequence of Freeman's directions. Freeman's directions are defined by the masks of the W operator and the T operator. The W operator's mask forms the W directions:

    4 3 2
    5 P 1
    6 7 8

Fig. II-2.5. Freeman's directions defined using the mask of the W operator [1].

Similarly, the T operator's mask forms the T directions:

    1 4 6
    2 P 7
    3 5 8

Fig. II-2.6. Freeman's directions defined using the mask of the T operator.

For both of these formats, we can trace from pixel P to its neighbours and record the corresponding direction. A branch or a path in a digital binary image can be represented by a series of these directions. For example, consider a path from pixel A to pixel B:

[Figure: pixel path from A to B on the square grid]

    where:
    Path AB = { 7,8,1,1,8,8,8,2,1,3,4,3,2,1,1,8,1 } using the W directions [1].
    In T directions format, we have
    Path AB = { 5,8,7,7,8,8,8,6,7,4,1,4,6,7,7,8,7 }.

Hence, branches can be easily represented on a computer using an integer array. Recreating the branch requires only the location of the beginning pixel (i.e. the location of pixel A) and the Freeman's chain code of path AB.
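As a small illustration of the two direction formats, the sketch below translates the W chain code of path AB into T directions and rebuilds the pixel positions from a starting pixel. The names `W_OFFSETS`, `T_OFFSETS`, `w_to_t`, and `decode` are hypothetical, the offsets are read directly off the two operator masks, and coordinates are assumed to be (row, column) with rows increasing downwards.

```python
# (row, col) step for each W direction, read off the W operator mask:
#     4 3 2
#     5 P 1
#     6 7 8
W_OFFSETS = {1: (0, 1), 2: (-1, 1), 3: (-1, 0), 4: (-1, -1),
             5: (0, -1), 6: (1, -1), 7: (1, 0), 8: (1, 1)}

# (row, col) step for each T direction, read off the T operator mask:
#     1 4 6
#     2 P 7
#     3 5 8
T_OFFSETS = {1: (-1, -1), 2: (0, -1), 3: (1, -1), 4: (-1, 0),
             5: (1, 0), 6: (-1, 1), 7: (0, 1), 8: (1, 1)}

# A W direction and a T direction are equivalent when they take the same step.
_T_BY_OFFSET = {offset: d for d, offset in T_OFFSETS.items()}
W_TO_T = {d: _T_BY_OFFSET[offset] for d, offset in W_OFFSETS.items()}


def w_to_t(chain):
    """Translate a chain code from W directions to T directions."""
    return [W_TO_T[d] for d in chain]


def decode(start, chain, offsets=W_OFFSETS):
    """Rebuild the list of (row, col) pixels visited by a chain code."""
    row, col = start
    pixels = [(row, col)]
    for d in chain:
        dr, dc = offsets[d]
        row, col = row + dr, col + dc
        pixels.append((row, col))
    return pixels


PATH_AB_W = [7, 8, 1, 1, 8, 8, 8, 2, 1, 3, 4, 3, 2, 1, 1, 8, 1]
PATH_AB_T = [5, 8, 7, 7, 8, 8, 8, 6, 7, 4, 1, 4, 6, 7, 7, 8, 7]

print(w_to_t(PATH_AB_W) == PATH_AB_T)
print(decode((0, 0), PATH_AB_W) == decode((0, 0), PATH_AB_T, T_OFFSETS))
```

Placing pixel A at (0, 0) is arbitrary; any start location yields the same shape, shifted accordingly.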
Path BA, however, is the reverse of path AB. Here, a simple translation can be used to translate the forward representation into a reverse representation. If necessary, one can always start at point B.

Fig. II-2.7. Path AB's representations.

3. Filling pinholes of size one pixel

The first step in preprocessing is the filling of pinholes of size one pixel. A small algorithm is designed such that it changes a "0" pixel to a "1" pixel if all its direct neighbours are "1." The routine scans the entire image, row by row, looking for a condition to be satisfied. The condition is defined as follows:

    X A X
    A P A
    X A X

    where:
    A must be a 1 pixel
    P must be a 0 pixel
    X can be anything

Fig. II-3.1. Condition to fill single pixel pinholes.

Whenever this pattern is found, P is changed from a 0 to a 1. Notice that the pixels in A are already connected to each other (under the 8-connected definition). Adding an extra 1 pixel in P will not affect the image. The essence here is that the connectivity of the image is not altered.

Holes larger than one pixel are considered useful information here. No other method is designed to fill them. Further research on eliminating redundant holes larger than one pixel is required for the optimization of the process.
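The search-and-change step for single-pixel pinholes can be sketched as follows. This is an illustrative sketch, not the thesis code: the function name `fill_pinholes` is hypothetical, the image is assumed to be a list of rows holding only 0 and 1, and border pixels are skipped on the assumption that a pixel without four direct neighbours cannot match the pattern of Fig. II-3.1.

```python
def fill_pinholes(image):
    """Return a copy of `image` with every one-pixel pinhole set to 1.

    A pinhole is a 0 pixel whose four direct neighbours are all 1
    (the pattern of Fig. II-3.1); the diagonal neighbours are ignored.
    """
    rows, cols = len(image), len(image[0])
    filled = [row[:] for row in image]  # work on a copy; the input is only read
    for r in range(1, rows - 1):        # scan row by row, skipping the border
        for c in range(1, cols - 1):
            if image[r][c] != 0:
                continue
            if (image[r - 1][c] == 1 and image[r + 1][c] == 1
                    and image[r][c - 1] == 1 and image[r][c + 1] == 1):
                filled[r][c] = 1
    return filled


pinhole = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
two_pixel_hole = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
print(fill_pinholes(pinhole))
print(fill_pinholes(two_pixel_hole) == two_pixel_hole)
```

Two adjacent 0 pixels can never both match the pattern, since each would be a 0 among the other's direct neighbours, so scanning a copy and scanning in place give the same result.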
Actual experiments have shown that this simple search-and-change process is adequate.

4. Thinning

The thinning algorithm used here is the same one found in reference [1]. The thinning algorithm performed remarkably well, and image features were well retained. The thinning algorithm is outlined in Appendix 1.

Thinning is the second and the major step out of the four in preprocessing. A set of conditions forms the basis for the algorithm. When some of these conditions are satisfied, the algorithm "peels off" the outer layer (usually called the contour) of a character. The process continues until the conditions for "non-removable" pixels are satisfied for all the black pixels in the image. As mentioned in [1], the thinning process was an improvement over the original algorithm from [9]. Repeated use of the thinning algorithm in this research project showed that it is a reliable way to reduce the image to a skeletal form.

5. Identifying nodes and eliminating noisy branches

The next step in preprocessing is the identification of junction nodes and the trimming of noisy branches. Before trimming the noisy branches, the termination nodes of the image are identified and labelled directly on the image. There are three types of termination nodes [1]:

1. End points (EP).
2. Junction 3s (J3).
3. Junction 4s (J4).

Their definitions and rules of relabelling are outlined in Appendix 2. The material in Appendix 2 can also be found in Section 2.2 of Chapter 3 of Mr. Wong's thesis. The following is a presentation of the node identification algorithm.

    BEGIN node identification
    FOR each 1 pixel of a thinned binary image DO
        1. Relabel the pixel as an EP according to definition 1 of Appendix 2;
        2. Relabel the pixel as a J3 according to definition 2 of Appendix 2;
        3. Relabel the pixel as a J4 according to definition 3 of Appendix 2;
    END; node identification

The identified and labelled nodes remain in the image and serve as markers for the forthcoming trim noisy branches routine and the image feature extraction routine. Upon completion of the identification process, 1 pixels which are termination points of branches of the input characters are relabelled as:

- 2 if the pixel is an end-point (EP).
- 3 if the pixel is a termination for three branches (J3).
- 4 if the pixel is a termination for four branches (J4).

It is possible to have nodes as neighbours of each other. These nodes are called back-to-back nodes. More discussion on this subject is in Chapter III. After relabelling the node pixels, the image is ready for trimming.
The method for trimming noisy branches is a completely different algorithm from the pruning procedure found in [1]. In [1], the pruning was done after the feature extraction process, while in this project it is done before feature extraction. This way, the trimming algorithm is more straightforward and in many ways easier to debug than the one in [1]. Also, no updating of the image description after trimming is required.

The trimming algorithm sequentially searches for a 2 pixel. It locates the termination point of the branch by tracking it, from the 2 pixel, in a direction away from the 2 pixel and following the path of the branch. The tracking is accomplished by inspecting the neighbours of the current pixel and searching for an un-traced pixel among them. Tracking terminates if a pixel labelled as 2, 3, or 4 is encountered. The tracking process uses the W operator and the corresponding Freeman's directions. Each time the trimming subroutine reaches a termination pixel, it looks at the length of the path. If the length of the path does not exceed the maximum noisy-branch length, it backtracks along the path and assigns a 0 to each pixel in the path. The trimming algorithm is presented below.

    BEGIN Trim-branches
    1. FOR each 2 pixel in the image DO
       BEGIN
       1.0. Length = 1;
       1.1. Locate the initial direction by inspecting the neighbours of the current 2 pixel;
       1.2. REPEAT
            1.2.1. Follow the direction chosen to the next pixel;
            1.2.2. Next pixel is the current pixel;
            1.2.3. IF the current pixel is NOT a node THEN
                   BEGIN
                   - Length = Length + 1;
                   - Label the current pixel as 9 (i.e. traced);
                   - IF there is a 3 or 4 pixel in the neighbourhood of the current pixel, let the next pixel be that pixel and record the next pixel's direction;
                   - IF the next pixel's direction is not found THEN
                     BEGIN
                     - Let the next pixel's direction be the direction where there is a 1 pixel in the neighbours of the current pixel;
                     END;
                   END;
            1.2.4. Record the direction chosen;
            UNTIL a 3 or 4 or 2 pixel is the current pixel;
       1.3. IF Length <= max. noisy branch length THEN
            BEGIN
            1.3.1. Retrace the pre-recorded chain code from step 1.2 and set all the pixels in the path to 0;
            END;
       END;
    END; Trim-branches

6. Corner smoothing algorithm

During the separation of two connected characters or the removal of a branch from a node, it is necessary to retain the node pixel in the original image to maintain connectivity. For example, we can have the original image as:

[Figure: pixel diagram of a thinned image with branches meeting at a J3 node pixel]

Fig. II-6.1. A diagram for discussing corner smoothing.
The bold 1's represent the branch to be removed or separated from the vertical branch.

[Figure: pixel diagram of the image after the branch and the junction pixel are removed]

Fig. II-6.2. A diagram for discussing corner smoothing.

If the junction pixel is removed along with the branch, the connectivity of the vertical branch will be destroyed. Therefore, it is a must for any algorithm NOT to remove the junction pixel. However, there are cases where we have to remove the junction pixel to maintain the uniqueness of connectivity. For example, assume we want to remove the top vertical branch of the same image and keep the junction pixel:

[Figure: pixel diagram of the same image with the top vertical branch marked for removal]

Fig. II-6.3. A diagram for discussing corner smoothing.

After the separation, we would have:

[Figure: pixel diagram of the image after the top vertical branch is removed, with the former junction pixel in bold]

Fig. II-6.4. A diagram for discussing corner smoothing.

The bold "1" indicates the pixel of interest in the above figure. Here it should be removed to avoid ambiguities in connectivity (recall the discussion in Section 2 of this chapter). The remedy here is to always keep the node pixel and use a corner smoothing algorithm to remove the redundant pixel after the removal or the separation of branches.

Removing the corner pixel can be viewed as an algorithm to remove redundant 4-connected pixels from a branch. There are three distinctly different masks to be followed. They are:

    X A X      0 X A      0 X X
    0 P A      0 P A      0 P A
    0 0 X      0 X X      0 X A

    where A ≠ 0 and X is don't care.

Fig. II-6.5. Masks for smoothing corners.

Note that the connectivity is maintained even if P is set to 0. Here is the corner smoothing algorithm:

    BEGIN corner smoothing
    1. FOR every non-zero pixel in the image DO
       BEGIN
       1.1. Search for the masks in figure II-6.5 and those formed by rotating them through multiples of 90 degrees;
       1.2. IF 1.1 is satisfied, set the current pixel to zero;
       END;
    END; corner smoothing

This algorithm is applied to the image after thinning and after trim-branches. It will also be used later as a preprocessing step for the SCR. Note that creating test images from the image's description will also create the situations mentioned above.

Upon completion of all the preprocessing, the image is ready to have its features extracted.

    INPUT: Digitized binary image containing characters
      -> Fill pinholes
      -> Thinning [1]
      -> Corner smoothing
      -> Node identification & trim branches
      -> Corner smoothing
    OUTPUT: Thinned binary image containing identification of the junction type on each node pixel.

Fig. II-1.1. A breakdown of the steps taken in preprocessing.
[Figure: character images before thinning, after thinning, and after trimming]

The images on the left are from the original Munson's data, the two in the center are the thinned versions, and the images on the right are trimmed by the algorithm Trim-branches.

Fig. II-5.1. Examples of trimming noisy branches.

CHAPTER III
IMAGE FEATURE EXTRACTION

1. Introduction

Feature extraction is an important step in character recognition. It is the process where a digital image is transformed into a symbolic description. As far as the image description goes, references [1], [5], and [20] showed some excellent ways of describing an image. Wong uses Attributed Relational Graphs (ARG) to describe a character in [1], and similar methods are used in [5] to describe a scene. Reference [20] uses cellular features to describe handwritten Chinese-Japanese characters. The cellular features appear to be extremely efficient in dealing with the complexity of the Chinese and Japanese characters. It is a departure from the traditional format of features such as graphs, nodes, strokes, links, etc. Others use entities like image contours [12], arcs, loops, and corners [14], image profiles [18], etc.
However, references [1] and [5] use a format which can be easily formed into a graph, where mathematical concepts such as graph theory can be applied. There are some useful ideas on describing images in [8], [28], [33], [35], [38], [39], [40], and [41]. They range from raw and full primal sketches to minimum spanning trees.

This chapter presents the motivation and the design of the feature extraction routine of the CCSA. It is the second major routine in the CCSA. Figure III-1.1 is a block diagram showing the feature extraction routine and its related subroutines. First, some fundamental concepts on fuzzy image subsets are presented to set a basis for discussion.

2. Fuzzy image subsets

The following is a presentation of some of the material found in [4] on fuzzy image subsets. Although there is no direct use of the material in [4], it is the motivation and the justification for the newly designed CCSA's potential to solve the character segmentation problem using the "left-to-right" rule. It is necessary to present these ideas so the reader can better understand the inspiration behind the design.

A fuzzy subset of a set S is a mapping f from S into the range [0,1]. Given f, we can define a degree of membership.
If P is an element of S, then f(P) is called the degree of membership of P in f. Standard set theory is the special case where f maps into the two-element set {0,1}. Figure III-2.1 is a diagram illustrating the idea of degree of membership. For example, region 2 has a degree of membership of 0.75. Degree of membership is used to further define other entities.

The level sets of f are the sets:

$$f_k = \{\, P \in S \mid f(P) \ge k \,\}, \qquad 0 \le k \le 1.$$

Let P, Q be elements of S and let f be a fuzzy subset of S. The degree of connectedness of P and Q in relation to f is defined as:

$$C_f(P,Q) = \max_{\text{path}} \Big[ \min_{\text{point}} f(R) \Big]$$

where the max is taken over all paths from P to Q and the min is taken over all points R on the path. That is, each path traced from P to Q contributes the lowest degree of membership found along it, and the degree of connectedness is the largest of these per-path minima. Note that R is also an element of S.

P, Q are connected in f if:

$$C_f(P,Q) \ge \min[\, f(P), f(Q) \,]$$

If P and Q are nodes of a thinned character, similar calculations can be performed to find their degree of membership and degree of connectedness.

3. The problem with back-to-back nodes

In a thinned binary image of a character, it is necessary for clarity that only one pixel represents a junction point of branches.
However, as discussed in [1], there are cases where junction points become neighbours of each other. They are called back-to-back nodes. Figure III-3.1 shows an example of back-to-back nodes where a J3 is next to a J4.

Extraction of the branch information of a thinned image can start at a node. Using either the W operator or the T operator, the current pixel becomes the pixel P at the center of the operator's window. The current pixel is then marked on the image as traced. By inspecting the neighbours of the current pixel for an untraced pixel, the direction from the current pixel to the next pixel is extracted. A series of such directions, Freeman's chain code, becomes a description of a branch. Tracing terminates when there are pixels marked as 2, 3, or 4 (i.e. nodes) among the neighbouring pixels of the current one. Ambiguities in tracing and describing the image arise when these junction pixels are neighbours of each other.

A branch is defined by two junction points, one at each end. Since back-to-back nodes occur as neighbours of each other, we can simply choose one of them as a reference and call the others hidden nodes. We call the reference node a non-hidden node. During the tracing of branches, if we are at a non-hidden node, we can search for a hidden node first.
If we are at a hidden node, we can search for a simple 1 (black) pixel first. This way, one is guaranteed to leave the non-hidden node. The same idea, but in reverse, applies when we are arriving at a reference node. Therefore, a non-hidden node, a hidden node, and a 1 pixel occur in a certain hierarchy which depends on whether one is leaving the node or arriving at the node.

It is not viable to simply choose any untraced pixel when the algorithm is at a node. At a junction, certain branches are traced before the others. There are the directional "left-to-right" rule and the tracing hierarchy to follow. It should be possible to generalize the representation of a node which has any number of radiating branches given the above hierarchy scheme. However, the generalization may eventually require a complete re-design of the entire project. Fortunately, the 336 connected character images tested have all been successfully tracked by the feature extraction algorithm using the hierarchy scheme mentioned above. Further research on the generalization of the node representation is required if further experimentation finds the hierarchy scheme unacceptable.

4. Image features

4.1. Discussions

As discussed in Chapter I, a "word recognizer" can be thought of as a character segmentizer. It simply compares a word with the reference in the dictionary. By recognizing the word, the individual characters are also recognized. The dictionary can also be thought of as a list of fixed and implied relationships between characters. For this project, however, the emphasis is on the left-to-right directional relationship between features.

To achieve a complete description of all the relationships between all the features in an image, a substantial cost is required. The cost lies in acquiring and maintaining the description. Here, a node feature's format is devised such that only the branches radiating from a node can be accessed directly when the node is addressed. For branch features, only the names of the terminating nodes of a branch are included. It is therefore a tradeoff between the size of the feature description and its completeness. However, by following the features' descriptions, any node or branch can still be reached using this "neighbouring feature only" format.
The CCSA can be at node A and have immediate knowledge of A's radiating branches without having to search for them by scanning the entire description. In PASCAL terms, the topological description of node A is part of the record of node A. Other fields of the record contain the names and the order of the radiating branches.

Here, the names of nodes and branches are represented by integers. The scheme for numbering (i.e. naming) the nodes and branches is defined such that the higher a feature is in the hierarchy, the smaller the number (i.e. name) it has. Therefore, node 1 is the leftmost node in an image, while branch 1 is the branch determined by the T operator's tracing order when the operator is at node 1. More on this subject later.

Recall the "T" operator and the Freeman "T" directions mask:

    1 4 6
    2 P 7
    3 5 8

where pixel 1 is the highest and pixel 8 is the lowest in hierarchy (or degree of importance) for the pixel P. The "T" mask follows the hierarchy determined by the "left-to-right" rule and the "top-to-bottom" rule: columns are ranked from left to right, and pixels within a column from top to bottom. Using this mask makes the CCSA much easier to design than using a separate scheme to determine which pixel should be chosen first.
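The hierarchy imposed by the "T" mask can be illustrated with a short sketch. It assumes the direction numbering 1 4 6 / 2 P 7 / 3 5 8 used in this chapter; the names `T_OFFSETS` and `branch_directions` and the 3x3 test image are hypothetical and are not taken from the thesis implementation.

```python
# T-mask direction numbering, written as direction -> (row offset, column offset):
#   1 4 6
#   2 P 7
#   3 5 8
T_OFFSETS = {
    1: (-1, -1), 2: (0, -1), 3: (1, -1),
    4: (-1, 0),              5: (1, 0),
    6: (-1, 1),  7: (0, 1),  8: (1, 1),
}


def branch_directions(image, row, col):
    """Return the T directions of the black (1) neighbours of (row, col),
    ordered from the highest (direction 1) to the lowest (direction 8)."""
    found = []
    for d in range(1, 9):
        dr, dc = T_OFFSETS[d]
        r, c = row + dr, col + dc
        inside = 0 <= r < len(image) and 0 <= c < len(image[0])
        if inside and image[r][c] == 1:
            found.append(d)
    return found


# Hypothetical neighbourhood: a J3 node (value 3) with three 1 pixels around it.
image = [
    [1, 0, 1],
    [0, 3, 0],
    [0, 1, 0],
]
print(branch_directions(image, 1, 1))  # [1, 5, 6]
```

The first entry of the returned list is the branch with the highest hierarchy; the remaining entries follow in the order in which they would be recorded for the node.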
For example, if we want to extract the branch with the highest hierarchy radiating from node P, we simply follow the above mask, scanning from direction 1 to direction 8 until we first hit a 1 (black) pixel. All other branches found after the first one follow in the hierarchy of the branch feature description of pixel P.

4.2. Image feature definitions

In this project, the input image's description is stored in an image feature description table (IFDT). It has N nodes and M branches. The higher the order of a node (or branch) in the table, the more significant it is with respect to the first node. An IFDT D is defined as:

$D = \{\, J_N, B_M, N, M \,\}$

where
    $J_N$ = a set of N node features
    $B_M$ = a set of M branch features

$J_N$ and $B_M$ are further defined as:

$J_N = \{\, I_i, J_i, Nl_i, Ns_i, Nbh_i = \{ j \mid j = 1..4 \}, Nmk_i \,\}$

where
    $I_i$ = row coordinate of the i-th node pixel.
    $J_i$ = column coordinate of the i-th node pixel.
    $Nl_i$ = level of the i-th node.
    $Ns_i$ = symbol of the i-th node, where the symbol can be either EP, J3, or J4.
    $Nbh_i$ = a 4-element array of names of branches which use node i as their start of tracing (i.e. beginning) node.
    $Nmk_i$ = a marker for the i-th node,
with $i = \{ 1 .. N \}$ for the i-th ($N_i$) node feature.

$B_M = \{\, C_i, L_i, Bl_i, Bbn_i, Ben_i, Bbns_i, Bens_i, Btd_i \,\}$

where
    $C_i$ = a 1-dimensional array of Freeman's chain code, using the T directions, of the i-th branch.
    $L_i$ = level of the i-th branch.
    $Bl_i$ = length of the i-th branch.
    $Bbn_i$ = name of the beginning node of the i-th branch.
    $Ben_i$ = name of the ending node of the i-th branch.
    $Bbns_i$ = symbol of the beginning node of the i-th branch.
    $Bens_i$ = symbol of the ending node of the i-th branch.
    $Btd_i$ = Boolean marker for the i-th branch,
with $i = \{ 1 .. M \}$ for the i-th ($B_i$) branch feature.

Note: $Nmk_i$ is lumped with $Nbh_i$ as the 5th element of the array $Nbh_i$ in the actual implementation. $Nl_i$ and $L_i$ are the indices discussed in section 2 of this chapter. Incrementing and decrementing the level of a node and a branch will be discussed in the next section. Beginning node means start-of-tracing node and ending node means end-of-tracing node. If $B_i$ is path AB as seen in Chapter 2, the name of A will be in $Bbn_i$ and the name of B will be in $Ben_i$. The $Nbh_i$ array stores the names of the branches radiating from node $J_i$. They are stored in an order according to the "T" mask described above.

Given a thinned binary image, the description of every node and every branch can be stored in an IFDT. The N node features can be arranged in an order which follows the "left-to-right" and "top-to-bottom" rules. Branch features can also be arranged in this order.
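The table layout defined above can be sketched as a pair of record types. This is only an illustration under stated assumptions: the thesis describes PASCAL records, whereas the class and field names below (`NodeFeature`, `BranchFeature`, `IFDT`, `neighbours`) are hypothetical, the branch length is derived from the chain code rather than stored, and the sample values are made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NodeFeature:
    row: int                 # I_i, row coordinate of the node pixel
    col: int                 # J_i, column coordinate of the node pixel
    level: int               # Nl_i
    symbol: str              # Ns_i: "EP", "J3" or "J4"
    branches: List[int] = field(default_factory=list)  # Nbh_i, branch names in T-mask order
    marker: bool = False     # Nmk_i


@dataclass
class BranchFeature:
    chain_code: List[int]    # C_i, Freeman chain code in T directions
    level: int               # L_i
    begin_node: int          # Bbn_i
    end_node: int            # Ben_i
    begin_symbol: str        # Bbns_i
    end_symbol: str          # Bens_i
    traced: bool = False     # Btd_i

    @property
    def length(self) -> int:
        # Bl_i; assumption: one chain-code element per traced step
        return len(self.chain_code)


@dataclass
class IFDT:
    nodes: List[NodeFeature]        # J_N (names are 1-based positions)
    branches: List[BranchFeature]   # B_M (names are 1-based positions)

    def neighbours(self, node_name: int) -> List[int]:
        """Names of the nodes at the far end of each branch radiating from node_name."""
        out = []
        for branch_name in self.nodes[node_name - 1].branches:
            b = self.branches[branch_name - 1]
            out.append(b.end_node if b.begin_node == node_name else b.begin_node)
        return out


# Hypothetical two-node, one-branch table.
table = IFDT(
    nodes=[
        NodeFeature(row=22, col=9, level=1, symbol="EP", branches=[1]),
        NodeFeature(row=15, col=11, level=2, symbol="J3", branches=[1]),
    ],
    branches=[
        BranchFeature(chain_code=[4, 4, 4, 4, 4, 6, 6], level=1,
                      begin_node=1, end_node=2,
                      begin_symbol="EP", end_symbol="J3", traced=True),
    ],
)
print(table.neighbours(1))       # [2]
print(table.branches[0].length)  # 7
```

Because each node stores only the names of its radiating branches and each branch stores only the names of its two terminating nodes, a lookup such as `neighbours` needs no scan of the whole table, which is the point of the "neighbouring feature only" format.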
The next section is a brief description of the feature extraction algorithm designed to extract the features and to arrange them in the proper order.

5. Feature extraction algorithm

The algorithm first assumes that all the EP (end-points), J3 (junctions of 3 branches) and J4 (junctions of 4 branches) are marked on the image with values 2, 3, and 4 respectively. This part was completed by the node identification algorithm discussed in Chapter II. A top-to-bottom (within a column) and left-to-right (column by column) scanning scheme is employed to locate the node positions (node feature extraction). The order of all N nodes is determined by the order of encounter during scanning. During scanning, a node's coordinates (called the semantic vector in [1]) and its symbol or junction type (called the syntactic symbol in [1]) are extracted and recorded in the IFDT. A separate section checks for the no-node case (a single character "O," for example) and for blank inputs. The CCSA will either branch to the SCR for the no-node case or terminate if there are no black pixels.

The above is followed by the "elimination" (i.e. relabelling on the image to a 1 pixel) of back-to-back hidden nodes.
Here, only the leftmost, uppermost pixel in a cluster of neighbouring nodes is kept as a non-hidden node pixel on the image. Also, if a J4 is next to a J3, the J4 has priority to become the non-hidden node. See figure III-5.1. Note that the IFDT still has the location and the symbol of all the hidden nodes from the initial scan.

A routine is invoked to prevent the first node of the image description table from being a hidden node. It swaps the order of the hidden node with the next non-hidden node.

Two 1-dimensional arrays are used in the main body of the feature extraction routine to keep track of the names of the current level's nodes and of the next level's nodes. First, the first node of the IFDT is loaded into the current level's array. The algorithm then traces all branches which radiate from the first node. During tracing, it records the chain code of the branches and the names of the terminating nodes (branch feature extraction). The terminating nodes are stored in the next level's array according to their order. The first node is the beginning node for all these branches, and the ending nodes are those which are recorded in the next level's array.
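The bookkeeping with a current-level array and a next-level array amounts to a level-by-level traversal of the node graph. The following is a minimal sketch of that idea only: it assumes the node connections are already known as an adjacency table and ignores hidden nodes and pixel-level tracing, so it is not the thesis routine itself. The function name `assign_levels` and the sample adjacency are hypothetical.

```python
def assign_levels(adjacency, first_node=1):
    """Assign a level to every node reachable from first_node,
    using a current-level list and a next-level list."""
    levels = {first_node: 1}
    current_level_array = [first_node]
    level = 1
    while current_level_array:
        next_level_array = []
        for node in current_level_array:
            for neighbour in adjacency[node]:
                if neighbour not in levels:      # not reached by an earlier level
                    levels[neighbour] = level + 1
                    next_level_array.append(neighbour)
        # The next level's array becomes the current level's array.
        current_level_array = next_level_array
        level += 1
    return levels


# Hypothetical node adjacency: node name -> names of directly connected nodes.
adjacency = {1: [3], 2: [3], 3: [1, 2, 4], 4: [3, 5], 5: [4]}
print(assign_levels(adjacency))  # {1: 1, 3: 2, 2: 3, 4: 3, 5: 4}
```

The loop stops when the next-level list comes back empty, which plays the role of the condition that no node in the next level's array is valid for new tracing.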
After tracing all the branches of the first node, the next level's array is assigned to the current level's array for another similar iteration. Using this level index scheme, a graph can be drawn from the resultant IFDT. Also, the order of the nodes in the two level arrays is determined by the node order which was initially acquired in the IFDT. The branch order is determined by the order of their corresponding beginning nodes and by the order in which the branches are traced. The order in which branches are traced when this routine is at a node is determined simply by the "T" operator. For example:

    1 0 1
    0 3 0
    0 1 0

Fig. III-5.2a. Example of tracing order when the CCSA is at a node.

Here, we have three 1 pixels representing three branches. Recall the mask for the "T" operator and the "T" directions:

    1 4 6
    2 P 7
    3 5 8

The order in which these three branches are traced is shown below:

      9 0 1        9 0 1        9 0 9
      0 3 0        0 3 0        0 3 0
      0 1 0        0 9 0        0 9 0
    1st branch   2nd branch   3rd branch
      traced       traced       traced

where traced branch pixels are marked as "9."

Fig. III-5.2b. Example of tracing order when the CCSA is at a node.

An example of a typical IFDT and the corresponding image is shown in figure III-5.3. Figure III-5.3a is the thinned image and the node description part of its IFDT, and figure III-5.3b is the branch description part of the IFDT.
It should be obvious from figure III-5.3 that the smaller the number of the node or the branch, the higher the order the feature has. Figure III-5.3c is the graphical representation of the IFDT shown in figure III-5.3a-b. The feature extraction algorithm terminates when all the nodes in the next level array are invalid nodes for new tracing to start (invalid nodes are either endpoints or traced J3s and J4s). The feature extraction algorithm is called algorithm FE1 and it is detailed in the next section.

6. Algorithm FE1 and the related subroutines

The following is a listing of algorithm FE1. The related subroutines are listed in the following sections. Algorithm SCR represents the main body of Mr. Wong's algorithm and is detailed in Appendix 2 and Appendix 3.

Input: Preprocessed (i.e. thinned and labelled) binary image.
Output: Image feature description table (IFDT).

BEGIN FE1
    node_count = 0;
1.  FOR each column from top to bottom of the image DO
    BEGIN
1.1.    IF a pixel is a 2, 3, or 4 pixel THEN extract the node coordinates and assign the corresponding symbol to the node's description;
1.2.    node_count = node_count + 1;
    END;
2.  IF node_count = 0 THEN count the number of 1 pixels;
        IF there are no 1 pixels, THEN GOTO end of the algorithm
        ELSE CALL SCR algorithm and GOTO end of the algorithm;
3.  CALL reset back-to-back node and CALL swap node;
    current_level = 1;
4.  WITH the current IFDT DO
    BEGIN - current IFDT
4.1.    Load node 1 of the current IFDT into current_level_array and CALL trace branches;
        Record the traced branches in IFDT; sort traced branches in ascending order;
4.2.    Copy next_level_array into the current_level_array;
        end_of_IFDT = FALSE;
4.3.    WHILE NOT(end_of_IFDT) DO
        BEGIN - not end
            - current_level = current_level + 1;
            - assign current level to all nodes' level in the next_level_array;
            - current_level_array = next_level_array;
4.3.1.      WHILE using current_level_array's list of nodes DO
            BEGIN - this level
                - WITH current_level_array's current node's symbol and for each node in the current_level_array DO
                  i.  CALL trace branches and extract un-traced branch information from each node in current level array;
                      sort traced branches in ascending order;
                  ii. record the traced branches in the IFDT;
            END; - this level
4.3.2.      IF the number of EP nodes and the number of completed traced J3's and J4's are equal to the number of nodes in the next_level_array, THEN end_of_IFDT = TRUE;
        END; - not end
4.4.    Record node's radiating branch information by inspecting the extracted branches' information;
    END; - current IFDT
END FE1

The above is an outline of the feature extraction algorithm. The individual subroutines are outlined in the following sections.

6.1. Algorithm reset back-to-back node

The task of reset back-to-back node is to relabel the hidden nodes to 1 pixels. It takes a labelled thinned binary image as an input. The relabelled image is the output. The relabelling follows the definitions for hidden and non-hidden nodes discussed in section 3 of this chapter.

BEGIN reset back-to-back node
1.  FOR every column from left to right and from top to bottom in each column DO
1.1.    IF the current pixel is a J4, THEN change all its neighbouring J4 and J3 pixels to 1 pixels;
2.  FOR every column from left to right and from top to bottom in each column DO
2.1.    IF the current pixel is a J3, THEN change all its neighbouring J3 pixels to 1 pixels;
END reset back-to-back node;

6.2. Algorithm swap node

Swap node changes the order of the hierarchy of the first node if it is a hidden node. It takes the image and the corresponding IFDT as inputs and outputs a modified IFDT with a non-hidden node as the first node. Hidden nodes in the IFDT are identified by checking whether the representation of a node in the actual image is labelled as 2, 3, or 4. If not, the node is swapped with the next non-hidden node in the hierarchy.

BEGIN swap node
1.  K = 1
2.  IF node 1's pixel is a 1 pixel, i.e. hidden node, THEN
2.1.    Swap the first node with the (K+1)th node
2.2.    K = K + 1
2.3.    GOTO step 2
END swap node

6.3. Algorithm trace branches

The task of trace branches is to trace and extract node and branch information for a given node pixel of an image. A temporary array of branches named Temp is used for I/O of traced branches with algorithm FE1. Trace branches receives an empty array called Temp and the node feature description from FE1. Trace branches then traces and extracts the branch information of all the untraced branches of the given node and returns a completed array Temp to FE1.

BEGIN trace branches
1.  Check all neighbours of the input node and determine the number of branches to be traced. Call this number Times. Load all the first directions into array ID following the hierarchy indicated by the T operator.
    Stop = FALSE; K1 = 1;
2.  WHILE K1 is less than or equal to Times AND NOT(Stop) DO
    BEGIN - K1 <= Times
2.1.    Length = 1;
2.2.    Make sure the directions recorded by ID are not 9 (traced) pixels;
2.3.    Store the first Freeman's direction from the K1'th direction in the array ID into the first (Length'th) chain code direction array of the K1'th branch in Temp;
2.4.    REPEAT
2.4.1.      According to the Length'th chain code of the K1'th branch of Temp, advance to the next pixel according to the chain code direction. Also, forward direction (FD) = Length'th chain code and reverse direction (RD) = opposite direction of FD;
2.4.2.      IF the current pixel is not a node pixel THEN
            BEGIN - not a non-hidden node pixel
            a.  Length = Length + 1
                D1 = 0 - D1 is the direction for the next chain code;
            b.  IF current pixel is a 1 pixel THEN change it to a 9;
            c.  Pixj = FALSE; Found = FALSE; M_hn = FALSE;
            d.  IF a neighbouring pixel of the current pixel is a non-hidden node THEN Pixj = TRUE;
            e.  IF there is more than one 1 pixel in the neighbourhood of the current pixel THEN Found = TRUE;
            f.  IF Pixj = TRUE AND the previous pixel is not a non-hidden node THEN
                    IF the current pixel matches a node in the IFDT and it is not traced THEN store the pixel in the next_level_array as a node;
            g.  IF the current pixel is a hidden node THEN M_hn = TRUE;
            h.  IF NOT(M_hn = TRUE) AND (D1=0) THEN
                BEGIN - current pixel is not a hidden node and new direction is not found yet
                h.1.  WHILE (D1=0) Search for a non-hidden node in all the neighbouring directions except direction RD. If found, assign the direction of the non-hidden node pixel to D1;
                h.2.  IF (Length=2) AND (Found=TRUE) AND (D1=0) THEN
                          IF there is a 1 pixel in one of the neighbours of the current pixel and the 1 pixel's "next-door" neighbours are zeros, THEN assign the direction of the 1 pixel to D1;
                h.3.  IF (Found=TRUE) AND (D1=0) THEN search IFDT and check if the current node is a hidden node;
                h.4.  IF (D1=0) AND (Found=TRUE) THEN choose direction FD as D1 if the pixel in that direction is not zero
                      ELSE - IF FD is one of [2,4,5,7] of the T directions THEN choose one of the two side neighbours, the non-zero pixel next in the hierarchy (i.e. if FD is 2 then if direction 1 is not zero choose 1 else if direction 3 is not zero choose 3. Pixel 1 and pixel 3 are side neighbours of pixel 2.)
                      ELSE - IF FD is one of [1,3,6,8] of the T directions THEN choose one of the four pixels of the "L" shape corner according to the T hierarchy. For example, if FD was 8 then the choosing hierarchy for a new direction will be [3,5,6,7] with the 3 direction as the highest in the hierarchy;
                h.5.  IF (D1=0) THEN choose a 1 pixel in the neighbourhood except in the direction RD;
                h.6.  IF (D1=0) THEN look for a non-hidden node in the neighbourhood; IF found, assign the node's direction to D1;
                END - NOT(M_hn = TRUE)
                ELSE
                BEGIN - current pixel is a hidden node
                h.7.  IF the previous pixel is a non-hidden node AND the current pixel (i.e. the hidden node) has more than 1 untraced pixel THEN store the current hidden node in the next available slot in the current_level_array;
                h.8.  IF (D1=0) AND the previous pixel is not a non-hidden node THEN
                h.8.1.    Look for a non-hidden node in the neighbourhood of the current pixel and assign its direction to D1 if found;
                h.8.2.    IF there is more than one 1 pixel in the current node THEN store the current hidden node in the next available slot of the next level array;
                h.9.  IF (D1=0) AND the previous pixel is a non-hidden node THEN
                          IF the pixel in direction FD is a 1 pixel THEN choose direction FD as D1
                          ELSE choose FD's "next door" neighbours as D1 according to the "T" mask if they are not zero pixels. For example, IF FD was 1 THEN the new direction will be 2 ELSE the new direction will be 4;
                END - current pixel is a hidden node
            i.  Assign D1 to the K1'th Temp branch's Length'th element in the chain code array;
            END - current pixel is not a non-hidden node
            ELSE
            BEGIN - a termination pixel is reached
2.4.3.          Extract branch features: length, chain code, branch level = current level, and all the branch's node attributes; K1'th branch is marked traced;
2.4.4.          IF the traced branch is not a loop AND the ending node is not an EP THEN store the ending node of the branch into the next_level_array;
            END - a termination pixel is reached
        UNTIL the K1'th branch is marked traced;
2.5.    K1 = K1 + 1;
    END; - K1 <= Times
END; - trace branches

The forward direction FD and the reverse direction RD are used to locate the possible next pixel and the reverse pixel respectively. Direction FD gives the general direction in which to look for an untraced pixel. The direction FD is in part responsible for tracking back-to-back nodes successfully.
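The roles of FD and RD can be made concrete with a short sketch. It assumes the "T" numbering 1 4 6 / 2 P 7 / 3 5 8, under which opposite directions always sum to 9. The fallback rule reproduces the two examples given in step h.4 (FD = 2 gives [1, 3]; FD = 8 gives [3, 5, 6, 7]); extending it to the other six directions by symmetry is an assumption, and the function names are hypothetical.

```python
# T-mask direction numbering, written as direction -> (row offset, column offset):
#   1 4 6
#   2 P 7
#   3 5 8
T_OFFSETS = {
    1: (-1, -1), 2: (0, -1), 3: (1, -1),
    4: (-1, 0),              5: (1, 0),
    6: (-1, 1),  7: (0, 1),  8: (1, 1),
}


def reverse_direction(fd):
    """RD for a given FD: 1<->8, 2<->7, 3<->6, 4<->5."""
    return 9 - fd


def fallback_directions(fd):
    """Directions to try, in T hierarchy order, when the pixel in direction FD is zero.

    For a horizontal or vertical FD these are its two side neighbours; for a
    diagonal FD they are the four pixels of the "L"-shaped corner around it.
    Assumption: the rule is inferred from the FD = 2 and FD = 8 examples.
    """
    dr, dc = T_OFFSETS[fd]
    out = []
    for d in range(1, 9):
        if d == fd:
            continue
        r, c = T_OFFSETS[d]
        # Keep directions that share a non-zero row or column offset with FD.
        if (dr != 0 and r == dr) or (dc != 0 and c == dc):
            out.append(d)
    return out


print(reverse_direction(2))    # 7
print(fallback_directions(2))  # [1, 3]
print(fallback_directions(8))  # [3, 5, 6, 7]
```

None of the fallback directions ever coincides with RD, so a trace that follows them cannot step straight back onto the pixel it came from.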
The best approach, although an extremely expensive one, involves scanning the entire IFDT and checking the current pixel and its neighbours for back-to-back nodes for every non-zero pixel in the image. Improvement on FE1 is possible if, for example, three different cases (non-hidden nodes, hidden nodes, and 1 pixels) are employed instead of the two (i.e. M_hn and NOT(M_hn)).

[Fig. III-1.1. A block diagram showing the feature extraction routine and its related subroutines. Blocks shown: INPUT: thinned image; FE1; no-node case (SCR); no-black-pixel case (go to end of CCSA); reset back-to-back nodes; swap node; OUTPUT: IFDT of the input image. Figure note: For FE1, most of the computing occurs in subroutine trace branches. Refer to the text of the thesis for the description of the various subroutines.]

[Fig. III-2.1. To illustrate the idea of degree of membership. An input image is shown, with the entire image taken as the region S, together with a plot of the degree of membership (axis marked 0.5 and 1.0) for all 5 regions.]

[Fig. III-3.1. Example of back-to-back nodes. Pixel grid not recoverable.]

[Fig. III-5.1. Re-labelling the back-to-back nodes. Pixel grid not recoverable.]

[Fig. III-5.3a, thinned image. Pixel grid not recoverable.]
JOINT    I    J   n_level  SYM  ASSOCIATED BRANCHES
  1     22    9      1     Es   1
  2     12   11      3     J3   2 4
  3     15   11      2     J3   1 2 3
  4     19   18      3     J3   3 5 6
  5     14   28      4     J3   5 6 7
  6     14   30      5     J3   7 8 9
  7     13   38      7     Es   10
  8     15   40      6     J3   9 10 11
  9      7   43      6     Es   8
 10     14   45      7     J3   11 12 13
 11     11   47      8     J3   12 14 16
 12     12   52      9     Es   14
 13     19   55      8     J3   13 15 17
 14      6   56      9     J3   16 18 19
 15     22   56      9     Es   15
 16     12   58     10     J3   18 19 20
 17     15   58      9     J3   17 20 21
 18     20   63     10     Es   21

Fig. III-5.3a. A typical IFDT - node section.
Fig. III-5.3b. (cont.) A typical IFDT - branch section. (Listing not reproduced. For each branch BR[1] to BR[21] it gives the chain code BR_CODE, its length LEN, the branch level B_LEVEL, the beginning and ending nodes B_NODE and E_NODE with their symbols BN_SYM and EN_SYM, and the TRACED flag. This tree has 18 nodes, 21 branches.)

Fig. III-5.3c. Graphical representation of the IFDT. (Diagram not reproduced. It shows the relative position of the nodes by level, from Level 1, the highest in the hierarchy, to Level 10, the lowest.)

CHAPTER IV

THE SEGMENTATION PROCESS

1. Introduction

The design of the segmentation routine is based on the image feature description table (IFDT) discussed in chapter III. The segmentation process is the last stage of the CCSA. It is made up of the main body of the segmentation routine and four separate subroutines. The first subroutine chooses test images based on the features in a given IFDT. The second subroutine is the SCR by Mr. Wong [1]. The SCR attempts to recognize the test images chosen by the first subroutine.
The third subroutine updates image description tables given the best recognized character. The fourth subroutine acts as a buffer between the SCR and the first subroutine. It performs the necessary steps so that the output from the segmentation routine can be transferred to the SCR, and it inspects the results from the SCR before returning them to the segmentation routine. The segmentation routine continuously loops through these subroutines, not necessarily in the order mentioned above, until there are no features left to choose from in the IFDT. Figure IV-1.1 is a block diagram showing their relationships.

In chapter III, the IFDT indicates a particular order. It is possible to simply follow the order of the nodes and branches to form test images. However, the particular order is only a gross approximation of the "left-to-right" rule of English. As a result, following such an order may yield test images which do not represent the correct characters to be segmented. A rule is required to form test images in a way which closely follows the "left-to-right" rule. The rule is detailed in section 2 of this chapter.

Since each physical branch (i.e.
physical feature that can be a physical part of an image) is a potentially recognizable character, the maximum possible number of test images for any input image to the CCSA is large. This number is simply too large if we have a string of characters such as "MISUNDERSTANDING" or "THEORETICALLY." Therefore a special scheme is utilized to reduce the number of possible test images. The special scheme analyzes the total deformation distance of an array of test images in an attempt to reduce the number of possibilities (the deformation distance, produced by the SCR, is a measure of the goodness of matching between a reference character and an input image). It is called the Distance Based Reduction Scheme, or DBRS in short. The DBRS is discussed in a later section of this chapter.

2. Choosing features to create test images

Owing to the highly structured IFDT, less effort is required in designing the subroutine to create test images. The structures of the table and the relationships between the features are excellent cues for choosing the proper features for the creation of test images. The subroutine find_test_images takes an image description table as an input. It outputs a set of test images in a 1-dimensional array.
Each test image is defined by its physical binary image and several attributes and labels. These attributes and labels are included to describe the test image and to record its recognition results. The "smallest" test image will consist of only one branch. For example, the characters "C" and "L" are each made up of just one branch. The "largest" test image will be defined by various constraints such as the size and the number of branches (i.e. physical features). We can define a test image TI where:

TI = ( A, ND, BRH, ID, G1 )

where:
A = 2-dimensional array to store the image.
ND = 1-dimensional array to store the names of the nodes of the test image as they are chosen.
BRH = 1-dimensional array to store the names of the branches of the test image as they are chosen.
ID = recognition results of the test image determined by the SCR - a 4-tuple field consisting of a Boolean variable for recognition, the identity of the character, the minimum deformation distance, and the pre-classified group of the test image. Refer to Appendix 3 on pre-classification.
G1 = other attributes for segmentation.
It is a 4-tuple field with a scratch Boolean variable, a width variable for the physical width of the character, a position variable for the coordinate of the leftmost pixel, and a syntactic variable counting the total number of EPs, J3s, and J4s in the test image.

The strategy for choosing features to create test images such that they closely approximate the left-to-right rule of English is relatively simple if we have the description table. There are four rules that should be observed by the strategy:

1. Always start at the very top (i.e. first node) of the hierarchy.
2. Note which node in ND of TI is of the lowest order at all times. In other words, keep in mind which one is the rightmost node pixel. Call it Top.
3. If we are at a node, the choosing order will be determined by the T operator's mask.
4. If all branches of a current node are already chosen, look for any node higher in the hierarchy which has "un-chosen" branches. If none is found, then start a new node with the next node (lower in hierarchy) in the IFDT. Make this new current node the Top.

A scratch test image will contain all the current features chosen at any time during the execution of find_test_images. The first branch (i.e.
B_1) is the first feature physically drawn into the scratch image. Given that we know the Freeman chain code of a branch, we can easily draw branches on a blank image by following the chain code. As the scratch image is built up with more new branches (only branches can be physically drawn into the image; the nodes will be drawn automatically), it is assigned to the output array of test images each time a new branch is drawn into it. At the same time ND and BRH of TI are updated accordingly. Given ND and BRH of TI, we can easily compute the attributes of G1. Find_test_images is detailed later in this chapter. Figure IV-2.1 shows an example of one set of test images created by passing the IFDT of figure III-5.2 through find_test_images once. The number in the upper left hand corner of each image of figure IV-2.1 shows the order in which the image was created. Find_test_images terminates if the resultant test image is too wide, or if there are too many physical features, or if the end of the IFDT is reached.

3. SCR

Mr. Wong's SCR as seen in reference [1] is used here. The details are presented in Appendix 3. It takes a binary image of size 24 pixels by 24 pixels as input. It outputs the recognition results of the input image.
The recognition results include the identity of the character, if recognized, and a deformation distance which reflects the amount of deformation of the input character when compared to the best matched reference character. The smaller the deformation distance, the better the character is matched to the reference character. The SCR has been modified to put the recognition results into ID of TI and, more importantly, to meet the experimental nature of this project. It is modified such that a slight deformation will yield a relatively large deformation distance and that a larger deformation distance will usually yield the "Not Recognized" result.

There are four major changes made to the SCR. First, instead of using the original reference dictionary from [1], a new reference dictionary is created. The new dictionary is much smaller but it has exactly the same format as the original dictionary. It has 26 characters from A to Z. They are the same characters used in generating all the input images for the CCSA. Secondly, the constants for each kind of deformation (i.e. deformation in length, position, curvature, etc.) are adjusted such that a slight deformation in the image will yield a relatively larger deformation distance.
Thirdly, a statistical plot of the first character segmentation and recognition results of a sample of connected input images, listed in table IV-3.1, is used to determine the new thresholds for recognized/not recognized. Figure IV-3.1 shows the plot. Lastly, an extra deformation function is added to the SCR. It gives a deformation distance as a function of the length of a closed loop. This distance is added to the final deformation distance used to determine whether the character is recognized or not. This function is useful in distinguishing between a section of the character B and the character O, or vice versa. The modifications are presented in Appendix 3, except figure IV-3.1, which is discussed in the following paragraphs.

Image
AB  CR  KQ  MD  OT
AT  IB  KS  MY  PA
BS  IT  LL  NB  PP
BW  JS  LZ  NK  QT
CC  KL  MB  NL  RZ

Table IV-3.1. Images used in generating the plot of figure IV-3.1.

Table IV-3.1 is a list of connected images used in the third major change of the SCR. Each image has two connected characters. They are first preprocessed and feature extracted. Then the images are each processed once by the routine find_test_images. Each test image created by find_test_images is passed through the SCR.
The decision of whether a test image is recognized or not recognized is not determined automatically by the SCR's old decision thresholds. Instead, it is determined manually. After processing all the images, the results are plotted on figure IV-3.1. The top graph (top grid) of the figure is for the not recognized case. The bottom graph (bottom grid) is for the recognized case. The total number of critical points (i.e. number of EP + number of J3 + number of J4 in a test image = total number of nodes) is the horizontal axis. The corresponding deformation distance is the axis going into the page. The total number of test images, recognized or not recognized, is the vertical axis of the page. Vertical lines rising from the bottom grid (or descending from the top grid) represent the total number of test images which have the same number of critical points and the same deformation distance. The farther such a line on a grid extends towards the center of the page, the larger the total number of test images represented at that point of the grid. The diagonal straight lines seen on both graphs show the thresholds for recognized and not recognized used by Wong in [1]. The square markers are the thresholds chosen for this research project.
Note that it is topologically impossible to have an odd number for the total number of critical points [42]. Therefore, odd numbers of critical points are not marked on figure IV-3.1, and the case of no critical point (i.e. a loop) assumes the position of one critical point on the graphs. Also, the reader should note that the vertical axis on the "recognized graph" is half the scale of that on the "not recognized graph." That is, the vertical lines on the top grid should be twice as long as they appear if both graphs are plotted using the same scale.

The important feature here is the original decision threshold used by Mr. Wong. It is relatively large and would include most of the not recognized cases. The badly deformed images, which have too large a deformation, are lumped together and represented at the end of the deformation distance axis. The reader should also notice that a large group of "wrong" characters lies under Mr. Wong's threshold. Based on these observations, the graph suggests that the SCR can be easily confused if the old decision thresholds are used for this project. Ideally, we would like to have an SCR which yields a wider "gap" between the "right" and the "wrong" characters than the SCR which was used in this project.
Nevertheless, the new decision thresholds prove to be acceptable, as seen in the segmentation results in Chapter V. Although the actual performance will be degraded by these modifications, they should allow us to obtain an objective conclusion about connected character segmentation. In return, the effects of the CCSA are more "visible": the causes of the obtained segmentation results, whether they are correct or not, can be observed. A by-product of the changes is the reduction in the possible number of test images marked recognized. Subsequently, the cost for this research project is reduced. Note that if we move the recognized/not recognized threshold further down on the deformation distance axis (on figure IV-3.1), we are bound to get more recognized characters.

4. Updating the IFDT

This subroutine is responsible for logically removing sections of the image from the IFDT according to ND and BRH of TI. It takes a TI and the corresponding IFDT as inputs and outputs an updated IFDT. It uses a scratch description table as the new IFDT. First, it loads into the new table all the nodes from the old table which have at least one related branch excluded from the input TI.
Then, it chooses the first node of the new table as the starting node to begin the update (or retracing, by following the known Freeman chain code from the old table) of the new table's branch features. The updating process is in many ways similar to the feature extraction routine FE1 presented in Chapter III. Node features are updated after all branch features are updated. Keep in mind that some nodes in the old tree have only two branches left in the new tree. Also, the chain code of some branches may have to be reversed due to segmentation.

The updating subroutine begins at the top of the table. The first node of the new table is the starting node for updating. Since we know the names (i.e. order) of the Nbh (associated branches) of the first node in terms of the old table, we can determine which branch of the node is the first branch to be stored in the new table. Determining whether a branch is in the reverse direction or not can be accomplished by comparing the coordinates of the termination nodes in the old table with the terminating nodes' coordinates given in the new table. If the ending node of the currently traced branch, when it is reached at the end of tracing, has exactly two names in its Nbh array, it means that the node no longer exists after segmentation.
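The two bookkeeping operations described above, reversing a branch's chain code and continuing a branch through a node that has only two branches left, can be sketched as follows. This is a minimal Python sketch and not the thesis's implementation: it assumes the eight Freeman directions are numbered 1 to 8 consecutively around the circle, so that the opposite of a direction is four steps away, and it assumes a chain code is stored as a plain list of direction codes. The function names are hypothetical.

```python
def opposite(d):
    """Opposite of Freeman direction d.

    Assumes codes 1..8 run consecutively around the circle (an assumed
    convention; the thesis's own numbering may differ).
    """
    return (d - 1 + 4) % 8 + 1


def reverse_chain(code):
    """Chain code of the same branch when traced from its other end."""
    return [opposite(d) for d in reversed(code)]


def join_through_node(first, second):
    """Chain code of the single branch left when the node between two branches vanishes.

    `first` must end at the vanished node and `second` must start there;
    reverse either one with reverse_chain() beforehand if it does not.
    """
    return list(first) + list(second)
```

Under the assumed numbering, `reverse_chain([4, 4, 6])` gives `[2, 8, 8]`, and reversing twice returns the original code.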
When such a vanished node is reached, tracing continues from the current branch into the next branch. Here, the task of tracing is accomplished by simply copying the Freeman chain code from the old table. The tracing continues until a termination node is reached. Then the other excluded branches are traced. Upon completion of tracing all the branches of the first node, two arrays (current level array and next level array) are used to keep track of which nodes are to be traced next, just as presented for algorithm FE1 in chapter III. The sequence continues until there are no more traceable nodes in the new table. The subroutine for updating IFDTs is called subt and it is detailed in section 6 of this chapter.

5. Steps taken before and after the calling of the SCR

The fourth subroutine acts as an intermediate stage between the segmentation routine and the SCR. It takes a set of test images created by find_test_images as inputs. It outputs the recognition results for the entire set. The best recognized test image is also marked by this subroutine.

Extraneous branches between thinned characters exist in some test images. These extra branches are called Branches in Between (BIBs). These BIBs result from thinning.
For example, suppose two vertical branches meet each other in the middle section and there are two distinct concave "notches" above and below the middle section. After thinning, the middle section can sometimes be transformed into a horizontal line joining the two vertical branches. A BIB is usually a branch no longer than the thickness of the touching section before thinning. A good portion of them are straight and horizontal branches. A subroutine called exbr, similar to the trim_branches subroutine, is employed to eliminate some of these BIBs. Exbr makes use of the short, straight, and horizontal nature of these BIBs as cues for trimming them. Ideally, subroutine trim_branches can be used instead. Here, exbr is an algorithm which performs trimming in a horizontal left-to-right fashion. This is because the BIBs in a test image always begin on the left hand side of the image and have a path tracing towards the right hand side. Exbr is less costly to execute than the algorithm trim_branches because trim_branches scans all 8 Freeman directions while exbr only has to look at one horizontal direction. Since exbr is similar to trim_branches, it will not be detailed explicitly. Figure IV-2.2 shows some examples of images with BIBs.
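The cue used by exbr can be illustrated with a small predicate. The sketch below is in Python and rests on assumptions that are not from the thesis: the code value of the horizontal left-to-right direction (`east`) and the length limit (`max_len`, standing in for the thickness of the touching section) are placeholders, and the function name is hypothetical. It only flags a candidate branch; it does not reproduce the trimming itself.

```python
def is_bib_candidate(chain_code, max_len, east=1):
    """Return True if a branch looks like a Branch in Between (BIB).

    A candidate is short (at most `max_len` steps), straight, and
    horizontal: every step uses the single left-to-right code `east`.
    The value of `east` is an assumed convention, not taken from the thesis.
    """
    if not chain_code or len(chain_code) > max_len:
        return False
    return all(step == east for step in chain_code)
```

Checking a single direction code per step mirrors the remark that exbr only has to look at one horizontal direction, whereas a general trimming routine would examine all 8.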
After processing a test image with exbr, smooth corner is invoked again to eliminate any possible corners. Then each test image is put through the SCR, which returns a recognition result. Find_min will first check for any images with no nodes (i.e. loops) and mark them as having one critical point (i.e. node). It then normalizes the test images' deformation distances by dividing each deformation distance by the corresponding deformation threshold for the particular number of critical points (i.e. nodes) in the image. The best recognized test image is the one with the smallest resultant normalized deformation distance. If none of the test images is recognized by the SCR, the first test image is taken to be the best recognized one. This way, if a BIB is in the test image even after the processing by exbr, it will be eliminated eventually. Furthermore, the CCSA is guaranteed to terminate even when the input images are all invalid characters. Finally, the original deformation distances are restored for consistency purposes. This subroutine is called find_min and is detailed in the next section.

6. Segmentation routine

The segmentation algorithm (SRI) is detailed below.
It takes an image description table and the corresponding image as inputs and outputs segmentation results. It is a collection of the four major subroutines. The DBRS mentioned above is described in the algorithm SRI. If all characters are properly recognized and segmented, it gives the identities of the characters. It gives the identity wherever it can. If it cannot give the identity of any one of the characters involved, it simply terminates after all the features of the reference table have been chosen to form test images.

The 2-dimensional array used here is named IMF, where each element of the array is a test image. It is the array used by the DBRS. The following is a description of algorithm SRI.

Input: Image Feature Description Table
Output: Identities of characters of the input image

BEGIN SRI
1. WHILE NOT(End) DO
   BEGIN - not End
   1.1. CALL find_test_images to generate a set of test images from the current IFDT;
   1.2. CALL find_min to locate the recognizable images in the set of test images from 1.1;
   1.3. Store all the recognizable images in column one of array IMF. If none of the test images in the set is recognizable, choose the first one of the set and assign it to column 1 of array IMF. Assign the input IFDT to each row's own subtable, which is also an IFDT of the same format;
   1.4. FOR each row of IMF DO
        1.4.1. CALL subt to eliminate the first character from the subtable for the current row;
        1.4.2. CALL find_test_images and then CALL find_min to generate test images from the current row's table and to locate the most recognizable image from a set of recognizable test images. Put the most recognizable image into the next element of the current row of IMF;
        1.4.3. REPEAT from step 1.4.1. UNTIL end of iteration for the current row;
   1.5. the DBRS: Analyse the array IMF based on the output of the SCR - FOR each row of IMF DO sum all the deformation distances and note the row with the smallest sum. The row which has the smallest deformation distance sum should contain the truly correct first character;
   1.6. CALL subt to eliminate the truly correct first character from the IFDT;
   1.7. IF all features of the IFDT are chosen THEN End = TRUE;
   END; - not End
END; SRI.

Figure IV-6.0.1 through figure IV-6.0.5 are diagrams showing the content of array IMF. Only the physical images are shown for illustration purposes. Other attributes such as their NDs and BRHs are omitted in the diagrams. The status of IMF shown is before the execution of step 1.6 in algorithm SRI. The input image "ROGER" required five passes through the algorithm SRI, and each iteration requires a new IMF. Figure IV-6.0.3 is the only case for this image when IMF has more than one row.
The two rows are the result of two recognizable test images in the set of test images created by find_test_images. Note that every time find_test_images is called, a series of test images similar to the ones shown in figure IV-2.1 is created. In the case of figure IV-6.0.3, two images in the set were considered recognizable by the SCR. In all the other cases in figures IV-6.0.1 through IV-6.0.5, the set contained only one recognizable test image. Note that find_min picked only the good "G" for the row. The DBRS chose to keep the two "G"s as if they were both correctly recognized as the first character, and stored them in the first column of array IMF. The following sections give the details of subroutines find_test_images, find_min, and subt.

6.1. Algorithm find_test_images

This subroutine takes an image's IFDT as input and outputs a set of test images less the physical images (i.e. A of TI). A scratch test image called Try temporarily stores the chosen features. Every time a feature is chosen to be part of a test image, it is accumulated in the scratch test image Try. Each time a new feature is accumulated in Try, Try is also stored in the next available empty slot of the array FT. As a result, a series of test images is stored in FT. The current branch is specified by Leaf and Temp as its terminating nodes.
The lowest order node is called Top. Try and each element of FT have the format of TI as defined in the previous section. Find_test_images uses the scratch output variable FT to communicate with SRI. Keep in mind that the order of a node or a branch is determined by its name. The names are represented by integers. The convention here is that the smaller the number of the name, the higher the order of the feature in the hierarchy. Refer to figure III-5.3 for an example of the hierarchy. The following is a description of subroutine find_test_images.

BEGIN find_test_images
1. End_Try = FALSE; End_of_table = FALSE; Width = 0; Let Top be the first node; Start at the first node of the table (Leaf = 1) and let Top be the starting point; Let the current level be level 1; The first node of Try is Leaf; The number of images computed is IC = 1; Let the current branch be the first branch of the Nbh of the first node of the input IFDT;
2. WITH the input IFDT DO
   REPEAT - computation of a new image
   2.1. F1 = FALSE; New_try = FALSE; II = 1;
   2.2. REPEAT - try the next branch of the IFDT
        2.2.1. Locate the next slot for a new branch in the BRH of Try; Search IFDT and Try to locate the excluded branch: start the search at the II'th branch of the IFDT and keep incrementing II until a branch not in the BRH of Try is found; IF one of the two terminating nodes of the II'th branch is found to be less than Top THEN F1 = TRUE; Also, update Top if one of the nodes is greater than Top;
        UNTIL (F1 = TRUE) OR all the branches in the IFDT are checked; - try the next branch
   2.3. IF Temp is not assigned a new value THEN
        - Locate a branch of the highest possible order in the hierarchy which is not in Try's branch array but is connected to nodes of Try's node array; Also, update Top if necessary;
   2.4. IF Temp is not assigned a new value THEN
        BEGIN - choose the next branch
        - FOR every node of the input IFDT AND Temp does not have a new value DO search for the next node in the hierarchy of the table which is not yet in the node array of the test image Try; IF such a node is found THEN search for the next branch in the hierarchy of the table which is not yet in the branch array of the image Try; IF such a branch is found THEN let it be the current branch and update Leaf, Temp, and Top accordingly;
        - IF a new current branch is not found after searching all of the branches of the table THEN End_of_table = TRUE;
        END - choose the next branch
   2.5. IF (Width < maximum width) AND NOT(End_of_table) AND NOT(End_Try) AND a new value of Temp is found THEN
        BEGIN - 2.5.
        - Update all the attributes and contents of the test image Try;
        - Width = 0;
        - IF the first BRH of TI is not zero THEN
          BEGIN - width
          - Compute the Width of the prospective character of Try by tracing out the horizontal (column) coordinates of the first (left-most) branch and the last (right-most) branch in the branch order array of IFDT;
          END - width
        - IF (Width < maximum width) THEN
          BEGIN - assign
          - current level = Temp's node level;
          - IF the current level is greater than or equal to the maximum level THEN End_Try = TRUE;
          - Assign Try to the Image Table's IC'th element;
          - IF IC > maximum number of possible test images THEN End_Try = TRUE;
          - IC = IC + 1;
          END - assign
          ELSE End_Try = TRUE;
        - IF there are too many branches in the branch array of Try THEN End_Try and End_of_table are both TRUE;
        END - 2.5.
        ELSE End_Try = TRUE;
   UNTIL (End_Try is TRUE) - computation of a new image;
END find_test_images

6.2. Algorithm find_min

This subroutine takes a set of test images of prospective single characters and the corresponding IFDT as inputs. It creates the binary test images (using subroutine create image) and then calls the SCR to test them. The output contains all the recognition results from the SCR. Other related subroutines called by find_min are exbr (for trimming BIBs) and smooth corners.

BEGIN find_min
1. I = 1;
2. WHILE there are still test images to be tested DO
   BEGIN
   2.1. CALL create image;
   2.2. CALL exbr;
   2.3. CALL smooth corners;
   2.4. WITH the current image DO CALL SCR;
   2.5. GOTO the next test image;
   END;
3. FOR every test image DO IF there is no node in the image THEN mark the image as having one node;
4. Normalize all recognizable test images' deformation distances by dividing each deformation distance by the corresponding deformation threshold for the particular number of critical points in the image;
5. Find the test image which was recognizable and has the smallest deformation distance;
6. IF none of the test images in the set is recognizable THEN choose the first one in the set as the recognized image;
7. De-normalize the test images;
END; find_min

6.3. Algorithm subt

This subroutine takes a test image and its corresponding IFDT as inputs. Subt then logically deletes the test image from the IFDT. It also rewrites and updates the IFDT accordingly, and it detects and outputs the End_of_table condition. It uses two arrays called Current and Next as scratch space to keep track of the current level. These two arrays play exactly the same role as the Current_level_array and Next_level_array in the subroutine trace branches of Chapter III.
Subt's function is almost the same as that of trace branches, except that it does not actually trace a branch but copies the Freeman chain code of the branch from the input IFDT. However, the order of the features may be altered owing to the change in topology. A description of subt was presented in one of the previous sections. The following is a presentation of the subroutine.

BEGIN subt
1. Check for the end of table condition first by comparing the number of branches in BRH of the input TI with the number of branches in the input IFDT; IF they are equal THEN End_of_table = TRUE ELSE End_of_table = FALSE;
IF NOT(End_of_table) THEN
BEGIN - table updating
2. BEGIN - locate the nodes which should be erased first
   2.1. Mark any loops at the node features' Nmk as -8888;
   2.2. FOR all the nodes in BRH of the input TI DO
        2.2.1. IF all branches in Nbh of the current node are in BRH of the input TI THEN assign the current node to the eliminated node array;
   2.3. FOR every BRH of the input TI DO
        2.3.1. Use the first and the last directions of the chain code of the branch to check whether there are back-to-back nodes which should be included in the eliminated node array; IF all of such a node's Nbh branches are in the input TI THEN assign the back-to-back node to the eliminated node array;
   2.4. FOR every node which does not belong to the eliminated node array DO assign these nodes to the new table;
   2.5. FOR every node in the new table DO
        2.5.1. IF any one branch of the Nbh of the current node is in the input TI THEN change the symbol of the current node accordingly;
        2.5.2. IF any one branch of the Nbh of the current node is in the input TI THEN set the corresponding member of Nbh of the current node to zero;
3. Initialize the arrays Current and Next; Let the first element of Current be the first node of the new IFDT; IF the first node has no branch or two branches in its Nbh THEN CALL swap node to find a proper first node; Ett = FALSE; Mark all branches of the old table as NOT traced; current level = 1; Note that at this point, the Nbhs of the new table's nodes still hold the names of the branches in terms of the old table;
4. WHILE NOT(Ett) DO
   WITH the new table and the input TI DO
   BEGIN - Not Ett
   4.1. FOR every node in Current DO
        - current node's level = current level;
        4.1.1. FOR every non-zero branch in Nbh of the current node of Current DO
               - call the chosen branch of Nbh the current branch;
               - IF the branch in the old IFDT that corresponds to the current branch is not marked as traced THEN
                 BEGIN - locate a proper/current branch in the old table
                 - IF there is a back-to-back node at the current branch's first or last direction pixel which has un-traced branches THEN place the back-to-back node in Current;
                 - Update the current branch's level, beginning node, and the node's symbol;
                 - Let the start of tracing node be the current node;
                 WHILE the current branch is marked not traced DO
                 BEGIN - trace it
                 - IF the start of tracing node's coordinate matches the coordinate of the ending node of the same branch in the old table THEN
                   BEGIN - reverse tracing
                   - reverse the chain code of the branch in the old table and assign the reversed chain code to the current branch's chain code array;
                   - find the current branch's ending node by matching the coordinates of the node given in the old table with the coordinates of the new table's nodes;
                   END; - reverse tracing
                   ELSE
                   BEGIN - forward tracing
                   - copy the chain code of the branch in the old table and assign the code to the current branch's chain code array;
                   - find the current branch's ending node by matching the coordinates of the node given in the old table with the coordinates of the new table's nodes;
                   END; - forward tracing
                 - IF there are NOT exactly two elements in the Nbh of the ending node of the current branch OR it is marked as a loop (by -8888) THEN mark the corresponding branch in the old table as traced;
                   ELSE
                   BEGIN - continue tracing
                   - Mark the corresponding branch in the old table as traced;
                   - Mark the ending node of the current branch as being a double branch by +8888 at Nmk;
                   - Let the current branch be the first un-marked branch in the hierarchy of the Nbh of the ending node; Let the start of tracing node be the beginning node of the current branch;
                   - IF there is a back-to-back node at the current branch's first or last direction pixel which has un-traced branches THEN place the back-to-back node in Current;
                   END; - continue tracing
                 END; - trace it
                 - Mark the current branch as traced in the old table;
                 - store it in the next available empty slot of the new IFDT's branch description;
                 - IF the ending node is an end point THEN update its node level in the new table to one greater than the current level;
                 - IF there are still un-traced branches in the Nbh of the current branch's ending node THEN assign the ending node of the current branch to Next;
                 END; - locate a proper/current branch in the old table
   4.2. Increment current level by one;
   4.3. Sort the Next array in ascending order;
   4.4. Sort all current level branches in ascending order in the new IFDT;
   4.5. IF there are no nodes in Next THEN Ett = TRUE ELSE assign Next to Current;
   END; - Not Ett
5. BEGIN - other misc. information updating
   5.1. Special case - IF all the nodes in the new table are marked as having two branches THEN keep the first node of the table and empty the rest;
        ELSE BEGIN - two branches for one node
   5.2. Check all node features for nodes with two branches and mark them when found;
   5.3. FOR every node in the new table DO
        - IF the current node is marked as having two branches (i.e. +8888) THEN
          BEGIN - take away the marked nodes and pack the hierarchy in the new IFDT
          - copy the next node in the table into the current node;
          - reduce the total node count of the new IFDT by 1;
          - Subtract 1 from the node feature/name of all branches which use the two-branch nodes as a termination node;
          END; - take away the marked nodes and pack the hierarchy
   5.4. FOR every branch in the new table DO
        BEGIN - update Nbh of all nodes in the new table
        - IF the branch's termination node is one of the nodes in the new table THEN assign the name of the branch to that node's Nbh;
        END; - update Nbh of all nodes in the new table
   END; - other misc. information updating
OLD IFDT = NEW IFDT;
END; - table updating
END; subt

Inspection of the test images in figure IV-2.1 shows that the CCSA creates test images as if a person were writing them on paper. In the next chapter, the experimental results are discussed, and the segmentation and recognition results of some images are shown.
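The selection logic of find_min (steps 3 through 6 of section 6.2) and the DBRS step of SRI (step 1.5) can be sketched in a few lines of Python. This is a minimal sketch and not the thesis implementation: the `Candidate` record, the `thresholds` mapping from node count to deformation threshold, and the function signatures are all assumptions made for illustration. Because the sketch never overwrites the raw distances, the de-normalization of step 7 is not needed here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Candidate:
    """Hypothetical record for one test image after the SCR has seen it."""
    label: Optional[str]   # character returned by the SCR, None if not recognized
    distance: float        # raw deformation distance reported by the SCR
    nodes: int             # number of critical points (nodes) in the image


def find_min(images: List[Candidate], thresholds: Dict[int, float]) -> int:
    """Return the index of the best recognized test image."""
    best_index, best_score = 0, float("inf")    # index 0 is the step-6 fallback
    for i, img in enumerate(images):
        if img.label is None:                   # not recognized by the SCR
            continue
        nodes = max(img.nodes, 1)               # step 3: a loop counts as one node
        score = img.distance / thresholds[nodes]  # step 4: normalize by threshold
        if score < best_score:                  # step 5: keep the smallest score
            best_index, best_score = i, score
    return best_index


def dbrs(imf: List[List[Candidate]]) -> int:
    """Return the row of IMF with the smallest sum of deformation distances."""
    sums = [sum(img.distance for img in row) for row in imf]
    return sums.index(min(sums))
```

With assumed thresholds `{1: 10.0, 2: 20.0}`, a two-node "G" at distance 8.0 scores 0.4 and beats a one-node "C" at distance 5.0, which scores 0.5, even though the raw distance of the "C" is smaller. This is the reason for normalizing before comparing images with different numbers of nodes.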
Fig. IV-1.1. A block diagram showing the segmentation routine and its related subroutines. (Input: IFDT of the input image. SRI with DBRS calls find_test_images and find_min; find_min calls the SCR. Output: identities of recognizable characters.)

Fig. IV-2.1. A set of test images chosen by find_test_images.

Fig. IV-2.2. Two examples of BIBs.

Fig. IV-3.1. Handprinted character recognition statistics for tuning the SCR.

Fig. IV-6.0.1. The IMF's content after the first pass of the image and the IFDT in figure III-5.2 through SRI.

Fig. IV-6.0.2. The IMF's content after the second pass of the image and the IFDT in figure III-5.2 through SRI.
Fig. IV-6.0.3. The IMF's content after the third pass of the image and the IFDT in figure III-5.2 through SRI.

Fig. IV-6.0.4. The IMF's content after the fourth pass of the image and the IFDT in figure III-5.2 through SRI.

Fig. IV-6.0.5. The IMF's content after the fifth pass of the image and the IFDT in figure III-5.2 through SRI.

CHAPTER V

CONCLUSIONS

1. Introduction

This has been an experimental research project. It was an effort to acquire some knowledge about connected character segmentation. Many results are within expectations, but there are also surprises. Figure V-1.1, figure V-1.2, and figure V-1.3 are three images which were not included in the 336 images tested, but they are highly representative of the potential of the CCSA. They also show that the CCSA is capable of separating more than two touching characters. Figure V-1.4 shows a case which was not recognized.

2. Experimental results

The following is an outline of the results. Since there are 26 characters from which to form the images, there are 26 different groups. For example, group A consists of images such as "AA", "AB", "AC", and "AD". Group A contains 26 different images from "AA" through "AZ". Only 13 of the 26 images in group A fit the constraints set forth in Chapter I. The results are tabulated in table V-2.1. Of the 336 images tested, 244 images were correctly segmented and recognized. Additionally, 38 images achieved segmentation and recognition of the first character. Group A, group I, group Q, and group T have segmentation and recognition rates of over 90%. On the other hand, group U, group Z, and group C did not reach the 50% mark. The average segmentation and recognition rate is 72.66%. The first character segmentation rate is 83.93%. With no optimization and such tight decision boundaries, the result was excellent for such a newly designed algorithm.
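The quoted rates can be checked against the counts given above. The short Python sketch below uses only the three counts from the text; the helper name `percent` is introduced here for illustration. The first character rate of 83.93% corresponds to (244 + 38) / 336. The pooled full-recognition rate 244 / 336 comes to 72.62%, slightly below the quoted 72.66%; since the text calls that figure an average, it was presumably computed as the mean of the 26 per-group rates, but that reading is an assumption.

```python
TOTAL_IMAGES = 336     # images tested (from the text)
FULLY_CORRECT = 244    # correctly segmented and recognized
FIRST_ONLY = 38        # only the first character segmented and recognized


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


pooled_rate = percent(FULLY_CORRECT, TOTAL_IMAGES)
first_char_rate = percent(FULLY_CORRECT + FIRST_ONLY, TOTAL_IMAGES)
print(pooled_rate, first_char_rate)
```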
As a comparison, an informal experiment involving three human subjects was conducted. The subjects were shown the 25 images listed in table IV-3.1 plus an extra image "DO". The informal study showed that the image "LL" was considered not recognizable at least once, and that the image "CC" was difficult but recognizable by all three subjects. Cases of misrecognition, such as "AT" recognized as "AJ" and "MB" recognized as "M8", are also present. Despite the superiority of the human recognition system, there are still cases of misrecognition. The recognition rate here averages about 82%. The CCSA achieved a recognition rate of 73% on the same set.

3. Basic feature vs. basic character

A basic character can be defined as a character which is topologically equivalent to a feature. For convenience, these features are called basic features. For example, a branch (i.e. a feature) can be deformed into a character such as a C or an L. Therefore, characters such as O, C, and L can be regarded as basic characters. There are two features which can be regarded as basic features: branches and loops. For example, the character "R" can be considered to have three basic features (two branches and one loop).
It is extremely difficult to distinguish between basic characters and basic features, since every section (i.e. every basic feature) is a potentially recognizable character. The worst case is that the entire image is recognized as a collection of basic characters. Although we can use the curvature, the length, the entry and departure angles of the termination points, and various other methods to try to make the true distinction, limitations and tradeoffs will arise. For example, the additional "/" of the numeral "0" is a case where an additional feature helps to distinguish the numeral "0" from the capital English character "O". The tradeoff here is the extra "/" and the effort required to maintain two topologically different versions of the numeral zero.

Group C and group L are classic examples of basic character vs. basic feature. Inspection of some of the incorrect cases of group B has shown that the upper loop section was recognized as a highly deformed O. Similar cases were found in group C (L), group D (O), group L (V & C), group O (D), and group U (V). Note that the high segmentation rates of group O and group D can be accounted for by the invariance in the input image.

Group A, group I, group Q, and group T have relatively good segmentation results.
They appear to have the advantage of having more "cues" for recognition simply because there are more features in the image. If the format of the feature is generalized, we should be able to define deformation such that there is no dependence whatsoever on the topological complexity of the image.

Looking at the case for humans, the problem with simple characters persists. A good example is the image "LL", which was not recognizable. Other cases, such as "CR" recognized as "OR", further confirm the problem with simple characters. Difficulties with the images "LL", "CC", and "CR" also correlate with the problem of basic feature vs. basic character. In addition, the informal experiment with the three human subjects showed that it is difficult to obtain a 100% recognition rate on the test data even with the "best recognition system" available.

Ultimately, it appears that, given good description, segmentation, and recognition schemes, the simplicity of the topology of a character poses a greater challenge to the machine than the complexity of the character. This runs against our instinctive belief that the more complicated the image is, the more difficult it will be to recognize.
The discussion of basic feature vs basic character suggests that the character segmentation problem lies ultimately in the algorithm's ability to distinguish between the two. It has been accepted as a standard to quote a recognition rate at the end of one's research report. Surprisingly, these recognition rates do not really reflect the ability of an algorithm to distinguish between the two entities.

4. Conclusion and future prospects

The cases of branches touching at the endpoints, branches touching side by side, and extraneous branches in between (BIBs) characters have not been dealt with in this research project. However, if adjustments were made to the CCSA, these cases could be dealt with. The possible adjustments are discussed below. To deal with endpoints touching, we may want to add a subroutine in the algorithm to create test images with halved-length branches for any branch with an EP formed after segmentation. The strategy is not too complicated. After the original test images are created by find_test_image, any branch that has an EP and is on the right-hand side of a test image can be halved. Therefore, a new test image is formed for every EP branch.
The DBRS scheme can then sort out which image is the best recognized one by checking their deformation distances. Branches touching side-by-side can be solved by reconnecting points of segmentation. First, every effort is put into recognizing the first character. After identifying the first character, the algorithm will search for a shortest path connecting the segmentation points of the remaining part of the image. This path's information can be found in the corresponding IFDT. After reconnection, an additional row is created in array IMF for the new image. The same DBRS can be used to determine which row of array IMF contains the correct first character. Lastly, there is the case of BIBs. In this project, the branches in between have only been dealt with on a small scale. As seen in chapter 4, subroutine exbr is not the ideal method but has proven to be acceptable. Further research in trimming noisy branches for the test images to eliminate BIBs is required. This CCSA does not utilize any contextual information. However, combining the topological approach pursued here with the contextual rules of spelling words may be extremely helpful. Furthermore, a word dictionary can also be included for even better improvements.
A dictionary look-up scheme may be inefficient in many senses, but today's ever-advancing computer technology may make this approach viable. Utilizing any form of contextual information may be extremely helpful in dealing with basic features and basic characters. A combined recognition scheme which utilizes the best of both the contextual and syntactic approaches would be the best approach yet. Nevertheless, the experimental segmentation algorithm performed well. Harnessing the writing rules of the English language should be a great help towards segmentation. One could consider a simple horizontal left-to-right, column-by-column windowing of the binary image as being the easiest and most obvious approach. However, sophistication and generalization by forming a hierarchy of features is a much better method. Further research on this method in conjunction with basic feature vs basic character is likely to be the new state of the art in the separation of connected characters.

Table V-2.1. Segmentation and recognition results.

  GROUP     COMPLETE   1st-ONLY   NONE   # OF BIB   COMP. BIB
  A (13)       12          1        0        2          1
  B (22)       11         10        1       10          0
  C (13)        6          0        7        0          0
  D (14)       10          4        0        2          2
  E (14)       11          0        3        0          0
  F (15)       14          1        0        0          0
  G (16)       13          1        1        3          1
  H (9)         7          1        1        1          1
  I (11)       10          0        1        0          0
  J (10)        7          0        3        0          0
  K (13)        7          2        4        1          0
  L (15)       10          3        2        0          0
  M (17)       12          1        4        0          0
  N (15)       11          1        3        4          2
  O (9)         8          0        1        2          2
  P (16)       13          2        1        2          1
  Q (9)         9          0        0        1          1
  R (15)       11          3        1        2          0
  S (14)       10          0        4        4          1
  T (13)       13          0        0        0          0
  U (5)         1          1        3        2          0
  V (15)       12          1        2        1          1
  W (3)         2          1        0        1          0
  X (16)       12          2        2        0          0
  Y (11)        7          1        3        2          1
  Z (14)        5          2        7        9          1

WHERE:
Group = Input images which have that character as the first character. The number in brackets is the total number of images of that group which were within the constraints of this project defined in chapter 1.
Complete = The number of test images in the group where both characters in the image are correctly segmented and recognized.
1st-Only = The number of test images in the group where only the first character is correctly segmented and recognized.
None = The number of test images in the group where neither of the two characters in the image was correctly segmented or recognized.
# of BIB = The number of images in the group which have extra branches connecting the first and second characters.
COMP. BIB = The number of images in the group which are members of both the column Complete and the column # of BIB.
[Figure: thinned input image followed by the CCSA program output, which lists for each iteration the characters identified by the SCR with their minimum deformation distances (Min.Dist), the number of rows for the iteration, and the character accepted as correctly segmented/recognized.]

Fig. V-1.1. An example of an image recognized by the CCSA.

[Figure: thinned input image followed by the CCSA program output, in the same layout as Fig. V-1.1.]

Fig. V-1.2. An example of an image recognized by the CCSA.
[Figure: thinned input image followed by the CCSA program output, in the same layout as Fig. V-1.1.]

Fig. V-1.3. An example of an image recognized by the CCSA.

[Figure: thinned input image followed by the CCSA program output, in the same layout as Fig. V-1.1.]

Fig. V-1.4. An example of an image not recognized by the CCSA.

APPENDIX 1
THINNING BINARY IMAGES

The thinning algorithm will trim away redundant pixels but, at the same time, it will:
1. Preserve the image's connectivity.
2. Preserve the essential features of an image.
3. Preserve the original shape of the arcs, curves, and corners.
4. Preserve the structural characteristics of an image.

The following definitions originated from [1] and were employed in the thinning algorithm.

Definition 1: A contour pixel is a pixel that has at least ONE direct neighbour which is a 0. Contour pixels are labeled 2 as they are found.
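Definition 1 can be sketched in a few lines of Python. This is only an illustration and not the PASCAL/VS program used in this work: it assumes that "direct neighbours" means the four edge-adjacent pixels and that pixels outside the image count as 0. The function name label_contour_pixels is hypothetical.

```python
def label_contour_pixels(image):
    """Return a copy of a 0/1 image in which contour pixels are relabelled 2.

    Assumptions (not stated in the source): "direct neighbours" are the four
    edge-adjacent pixels, and anything outside the image counts as 0.
    """
    rows, cols = len(image), len(image[0])
    labelled = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < rows and 0 <= nc < cols)
                # Test against the ORIGINAL image so that labels written
                # earlier in this pass do not influence later pixels.
                if outside or image[nr][nc] == 0:
                    labelled[r][c] = 2
                    break
    return labelled


block = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
for row in label_contour_pixels(block):
    print(row)
```

For the 3 x 3 block above, only the centre pixel keeps the label 1; the eight pixels around it each touch a 0 and are relabelled 2.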
Definition 2: A contour pixel is multiple if it satisfies any ONE of the following conditions:
i. It has at most ONE non-zero neighbour.
ii. Its neighbourhood conforms to ONE of the masks shown in figure A1.1 or those formed by multiples of 90 degrees rotations.
Multiple pixels are labelled 3 as they are found and are NOT removable.

    A A A
    0 P 0
    B B B
  where at least one of A and at least one of B is non-zero.

    A A A
    A P 0
    A 0 B
  where at least one of A and B is non-zero.

    C A C
    A P A
    0 0 0
  where A is non-zero and at least ONE of C is zero.

Fig. A1.1. Local neighbours of a multiple pixel P.

Definition 3: A contour pixel is tentatively multiple if it satisfies any ONE of the following constraints:
i. It has no neighbours labeled 1.
ii. Its neighbourhood conforms to the mask shown in figure A1.2 or those formed by multiples of 90 degrees rotations.
Tentatively multiple pixels are labeled 4 as they are found.

    A A C
    0 P D
    B B C
  where at least ONE of A, B, and C must be non-zero and D>=2. If both C's are non-zero then A and B can be anything.

Fig. A1.2. Local neighbours of a tentatively multiple pixel P.

Definition 4: A tentatively multiple pixel is removable if it satisfies ONE of the following constraints:
i. Its local neighbours conform to the mask shown in figure A1.3.1.
ii. Its local neighbours conform to the mask shown in figure A1.3.2 and none of its neighbours were labeled by condition i.
iii. Its local neighbours conform to any ONE of the masks shown in figure A1.3.3 or those formed by multiples of 90 degrees rotations.
Removable tentatively multiple pixels are labeled 5 as they are found.

    X X X
    0 P A
    X X X
  where A=3 or A=4 and X=don't care.

Fig. A1.3.1. Local neighbours of a removable tentatively multiple pixel P.

    X 0 X
    X P X
    X A X
  where A=3 or A=4 and X=don't care.

Fig. A1.3.2. Local neighbours of a removable tentatively multiple pixel P.

    0 B 0      X 0 0      A 0 0
    B P 0      B P 0      B P 0
    0 0 0      A 0 0      X 0 0
  where B=3 or B=4, A>0 and is not labeled as removable by conditions i and ii. X is don't care.

Fig. A1.3.3. Local neighbours of a removable tentatively multiple pixel P.

The original thinning program was implemented on the VAX 11/750 computer (running in PDP 11 compatible mode) in our department, and with minor modifications the thinning program is implemented using PASCAL/VS on UBC's Amdahl 5850 computer. Here is a listing of the algorithm. This is the exact algorithm, namely Algorithm A1, listed on page 25 of [1]. It is called algorithm WA1 here.

BEGIN WA1
1. For each 1 pixel do
   - identification of contour pixels
   - label the pixel 2 if definition 1 is satisfied;
2. For each 2 pixel do
   - identification of multiple pixels
   - label the pixel 3 if definition 2.i. is satisfied
     else
   - label the pixel 3 if definition 2.ii. is satisfied;
3. For each 2 pixel do
   - identification of tentatively multiple pixels
   - label the pixel 4 if definition 3.i. is satisfied
     else
   - label the pixel 4 if definition 3.ii. is satisfied;
4. For each 4 pixel do
   - identification of removable tentatively multiple pixels
   - label the pixel 5 if definition 4.i. is satisfied;
4.1. For each remaining 4 pixel do
   - label the pixel 5 if definition 4.ii. is satisfied
     else
   - label the pixel 5 if definition 4.iii. is satisfied;
5. Set all 2 and 5 pixels to 0 and relabel all 3 and 4 pixels to 1;
6. Repeat steps 1 through 5 until all contour pixels are multiple or tentatively multiple;
END; WA1
[Figure: successive labelled pixel arrays of a sample image, one per pass through the thinning algorithm, with pixels labelled 1 to 5 as defined above.]

Fig. A1.4. An example showing the status of an image before executing step 5 each time it passes through algorithm WA1. The numbers in the upper left-hand corner show the sequence.

APPENDIX 2
NODE IDENTIFICATION

The node identifying subroutine will label nodes according to a series of rules and masks presented below. It is the exact material found in Section 2.2 of Chapter 3 of Mr. I. H. Wong's thesis. There are three types of nodes. Endpoints (EP) are the termination of a branch. Junction-threes (J3) are the termination of three branches at a pixel. Junction-fours (J4) are the termination of four branches at a pixel. Here are the definitions:

EP
1. If ONLY ONE 1 pixel is found among a pixel's neighbours and the remaining neighbours are 0's, the pixel is relabelled as 2 (for EP).

J3
2. A pixel is relabelled 3 (for J3) if it satisfies any ONE of the two constraints below:
   1. It has exactly three (3) non-zero neighbours and does not have a local neighbourhood matching any ONE of the two masks shown in figure A2.1 or those of multiples of 90 degrees rotation.
   2. It has exactly five neighbours and a mask identical to the one seen in figure A2.2 or those of multiples of 90 degrees rotation.

    A A B      A B B
    A P B      A P A
    A A A      A A A
  where A is non-zero and B=1.

Fig. A2.1. First condition for a pixel P to be relabelled as 3.

    A 0 A
    A P 0
    A 0 A
  where A is non-zero.

Fig. A2.2. Second condition for a pixel P to be relabelled as 3.

J4
3. A pixel is relabelled 4 (for J4) if it has exactly four neighbours and does not have any ONE of the masks shown in figure A2.3 or those obtained by multiples of 90 degrees rotations.

    A A A      B A B
    B P B      0 P 0
    B B B      C A C
  where A>0, ONE of B and ONE of C is non-zero.

Fig. A2.3. Conditions for an illegal J4 pixel P.

The node identifying subroutine simply scans an image row by row. Subroutine node identification was presented in Chapter 2. The first step of Algorithm A2 in Appendix 3 can be regarded as a node identifying subroutine.

APPENDIX 3
SINGLE CHARACTER RECOGNITION ALGORITHM

1. Introduction

There are three distinct steps in Mr.
Wong's single character recognizer (SCR). They are:
1. Thinning/preprocessing of the input image.
2. Image feature extraction, representation, and preclassification.
3. Image classification/recognition using attributed relational graph (ARG).

Step 1 was discussed in Appendix 1 and a small part of Step 2 was discussed in Appendix 2. Here, a brief outline of Step 2 and Step 3 is presented as a reference. The formal discussion and presentation is in reference [1]. In Chapter IV, there were four major changes mentioned. Only one of them was discussed. The other three are discussed here. They are the new reference dictionary, the new constant values, and the extra deformation function. The new thresholds for recognized/not recognized were discussed in Chapter IV and they were shown in figure IV-3.1. The SCR was proven to have an overall recognition rate of 91.46%. In some cases, like the character "F", the recognition rate was 96.88%, while characters "U" and "V" had recognition rates of 77.27% and 80.00% respectively.

2. Image extraction, representation, and preclassification.

After thinning, the image is ready for feature extraction.
The feature extraction process extracts and acquires the topological features and their relationships and stores the description in node data tables (NDT). The topological features are called "global feature" or "critical point" in [1]. These features are endpoints (EP), junction 3s (J3), junction 4s (J4), closures (C), and single points (SP). Their formal definitions are presented in Appendix 2. Step 2 uses the "W" operator and the corresponding Freeman's directions mask. The SCR identifies and labels the topological features of an image according to the definitions in Appendix 2, while the number of closure features (i.e. loops) is calculated by the following formula:

  C = 1/2 * (N(J3) + 2*N(J4) + 2*N(piece) - N(EP) - N(J3J4) - 2*N(J4J4))     (A3.1)

where
  N(EP)    = number of end points
  N(J3)    = number of junction 3s
  N(J4)    = number of junction 4s
  N(piece) = number of disjoint pieces
  N(J3J4)  = number of J3 and J4 back-to-back nodes
  N(J4J4)  = number of J4 and J4 back-to-back nodes

After identifying the nodes, the SCR traces the branches. The tracing method is completely different from the one employed by the CCSA. The SCR utilizes the W operator and the corresponding directions during tracing. No specific hierarchy is required to assist the SCR. However, it does follow a predetermined direction while it is tracing branches. A traced pixel is changed from a 1 to a 9.
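As a quick check of the closure formula given above, the following Python sketch evaluates it for a few characters. The function name closure_count and the feature counts used in the examples are illustrative assumptions (idealized thinned characters with no back-to-back junctions); they are not taken from the SCR's dictionary.

```python
def closure_count(n_ep, n_j3, n_j4, n_pieces, n_j3j4=0, n_j4j4=0):
    """Number of closures (loops) from the node counts of a thinned image.

    Direct transcription of the closure formula:
    C = 1/2 * (N(J3) + 2*N(J4) + 2*N(piece) - N(EP) - N(J3J4) - 2*N(J4J4))
    """
    return (n_j3 + 2 * n_j4 + 2 * n_pieces
            - n_ep - n_j3j4 - 2 * n_j4j4) / 2


# Idealized counts, assumed for illustration only:
print(closure_count(n_ep=0, n_j3=0, n_j4=0, n_pieces=1))  # "O": one loop
print(closure_count(n_ep=2, n_j3=2, n_j4=0, n_pieces=1))  # "A": one loop
print(closure_count(n_ep=4, n_j3=0, n_j4=1, n_pieces=1))  # "X": no loop
```

Without the two back-to-back correction terms, the expression is the usual cycle count of a graph (branches minus nodes plus pieces), written in terms of node degrees.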
The resulting chain code digital graphs (or digraphs) are stored in a NDT. A typical NDT is shown in table A3.1. An image is formed by linking NDTs using a linked list. The following is a direct excerpt from Mr. Wong's thesis describing the feature extraction algorithm. It was called Algorithm A2 in [1].

"1. Pick a convenient node and call this the root node. This node is taken to be the bottom right-most node by the Algorithm A2.

2. Starting with the root node as the present node, track and extract the chain code of any branch that is found starting from the easterly (Freeman's 1) direction and moving counter-clockwise. If a branch tracked leads to an endpoint or a node that has previously been visited (terminal node) then the branch is called a sublink. The chain code for this sublink is stored and its associated pointer set to point to the non-terminal node. Branch searching and tracking continues at this present node. If a branch tracked leads to a J3 or J4 that has not been previously visited (non-terminal node) then the branch is called a main link. The chain code for this main link is stored at the present node and the new non-terminal node found is taken as the new present node. The forward pointer and backward pointer of the new and previous present nodes are then set.
Traversal continues in this manner until the algorithm can go no deeper, that is all branches from a present node have been tracked. Then the algorithm backtracks to the 1st node that was used as the present node and the process continues. The traversal is complete when the process backtracks to the root node and all branches of the root node has been tracked."

Algorithm A2
Input: Thinned binary image of an alphanumeric character.
Output: Linked list graphical representation of input image with spurious noise branches removed. Root node of the graph is called base and any auxiliary graphs (subgraphs) present in the image have their root nodes called sub-base(1) to sub-base(pieces-1).

BEGIN { main }
1. For each '1' pixel Do
   check and label pixels that satisfy any one of the conditions 1 to 4 in section 2.2 (Appendix 2 in this thesis);
2. For each critical point found by step 1 Do
   (a). create an NDT;
   (b). Enter node attributes of the new NDT, i.e. node-no., junction type and coordinates.
3. No. of pieces := 0;
4. Select an NDT whose junction type is not an Sp and call it base;
5. Track and store graphical information of the image using the rules in section 3.2 (the above paragraphs on the description of tracking). Tracking includes:
   (a). Increment counters offset1 (N(J3J4)) and offset2 (N(J4J4)) as appropriate.
   (b). Relabel all branch pixels that has been tracked as 9.
   (c). Link the NDTs tracked according to depth first traversal.
6. pieces := pieces + 1;
7. If all nodes other than Sps have been visited then goto 8
   else
   Begin
   (a). pick an unvisited node other than an Sp and call it sub-base(pieces).
   (b). repeat steps 5 to 7 until all nodes that are not Sps have been visited.
   End.
8. calculate closure using equation 3.2.2; (or equation A3.1 here)
9. If there are any '1' pixels left in the image then
   Begin
   (a). create a new NDT with junction type 'closed' and call it sub-base(pieces).
   (b). track and store chain code of loop.
   (c). closure := closure + 1;
   (d). pieces := pieces + 1;
   (e). repeat step 9 until all '1' pixels in the image has been tracked.
   End.
10. For each NDT Do { pruning }
   (a). remove each sublink that terminates at an endpoint with length less than the specified minimum.
   (b). update NDT entries and pointers as appropriate.
END { main }.

After the image is processed by Algorithm A2, the topological and graphical information is extracted and stored. Based on the graphical engineering knowledge of [1], a list of features was chosen by Mr. Wong to form 25 different groups for preclassification purposes.
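The main link and sublink rule used during the tracking step of Algorithm A2 can be illustrated on an abstract graph. The Python sketch below is a simplification under stated assumptions: the image is already reduced to nodes and branches, branches are visited in the order listed rather than counter-clockwise from the easterly direction, and no chain codes or NDT pointers are stored. The function classify_links and the node and branch names are hypothetical.

```python
def classify_links(adjacency, node_types, root):
    """Depth-first traversal that labels each branch a main link or a sublink.

    adjacency maps a node to a list of (branch_id, neighbour) pairs; every
    branch appears in the lists of both of its end nodes.
    """
    visited = {root}
    tracked = set()          # branches already followed once
    main_links, sublinks = [], []

    def visit(present):
        for branch_id, neighbour in adjacency[present]:
            if branch_id in tracked:
                continue
            tracked.add(branch_id)
            if node_types[neighbour] == "EP" or neighbour in visited:
                # Terminal node: an endpoint or a node seen before.
                sublinks.append((branch_id, present, neighbour))
            else:
                # Unvisited junction: it becomes the new present node.
                visited.add(neighbour)
                main_links.append((branch_id, present, neighbour))
                visit(neighbour)

    visit(root)
    return main_links, sublinks


# Assumed graph of an idealized thinned "A": two endpoints, two J3 nodes.
node_types = {"ep_left": "EP", "ep_right": "EP",
              "j3_left": "J3", "j3_right": "J3"}
adjacency = {
    "ep_right": [("right_leg", "j3_right")],
    "ep_left": [("left_leg", "j3_left")],
    "j3_right": [("right_leg", "ep_right"), ("bar", "j3_left"),
                 ("apex", "j3_left")],
    "j3_left": [("left_leg", "ep_left"), ("bar", "j3_right"),
                ("apex", "j3_right")],
}
main_links, sublinks = classify_links(adjacency, node_types, "ep_right")
print(main_links)
print(sublinks)
```

In this example the traversal starts at the bottom right endpoint. The right leg and the bar become main links, while the left leg (ending at an endpoint) and the apex arc (returning to a visited node) become sublinks; the single sublink that returns to a visited node corresponds to the one closure of the character.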
Table A3.2 reproduces two tables from [1] (Table II on page 57 and Table IV on page 101 of [1]) that give the definition of each group. These definitions are based on the number of EPs, J3s, J4s, and closures in an image. Being in the same group is the same as being topologically equivalent. There are a total of 231 reference characters in the 25 different groups. The SCR preclassifies an image by keeping track of the number of features found during the execution of Algorithm A2 and matching the result against Table A3.2. Some groups contain only one character; a character preclassified into one of those groups is recognized immediately by the SCR.

The 231 reference characters of [1] make up a fairly large dictionary. Because of the experimental nature of this research project, a new dictionary is employed. This new dictionary also consists of 25 different groups given by the same definitions. However, the reference characters in the new dictionary are the characters which were used in creating the input images for the CCSA. The group definitions of the new dictionary are shown in Table A3.3. The SCR was tested with the 26 reference characters in order to verify its 100% recognition rate on the reference characters before using it for the CCSA.

3. Image classification/recognition using attributed relational graph (ARG)

3.1. Outline

The chain coded digraph NDTs are transformed into Attributed Relational Graph (ARG) NDTs. After the transformation, a heuristic matching algorithm is used to match the ARG of the input (unknown) character with the ARGs of the reference (dictionary) characters. The ARG concepts being used are closely related to Tsai and Fu [5]. The transformation from an NDT to an ARG is best illustrated by the following table:

Chain coded digraph NDT              ARG NDT
Junction type                   ->   Node syntactic symbol
Coordinate                      ->   Node semantic vector
Pointers & node no.             ->   Branch syntactic symbol (denotes linkage order)
Main-link & sublink chain codes ->   Branch semantic vector

Table A3.4. Equivalence between chain coded digraph and ARG NDTs. [1]

Rewriting the node and branch attributes means that the digraph NDTs will follow the generic scheme shown in the table above. The details of computing the ARG from the NDT are shown in the following section. The subsequent reference guided inexact matching algorithm is shown in section 3.3.

3.2. Attributed Relational Graph - its definition and the transformation of ARG from digraph of a thinned image.

The presentation here is focused on the format and the usage of ARG found in [1].
The formal presentation and discussion of ARG used in recognizing single characters can be found in [1]. An attributed relational graph is an effective way of representing a structure. The following outlines the usage of ARG in Mr. Wong's algorithm.

Let's start by looking at a typical ARG. An ARG consists of four major parts. An ARG named G has the form:

G = { N, B, n, b }

where
N is a finite set of nodes of size N;
B is a finite set of branches of size B; it is a subset of N x N, the set of ordered pairs of distinct elements of N, whose members are called branches;
n is the node interpreter, which contains all of the node labels;
b is the branch interpreter, which contains all of the branch labels.

Furthermore:

n is a set of size N where each element $n_i = (s_i, x_i)$ such that $s_i$ is the symbol of the node $n_i$, one of the 5 symbol types: endpoint (EP), junction of 3 branches (J3), junction of 4 branches (J4), closure (C), or single point (SP); and $x_i$ is the location of the node in the image, which is one of the 9 equally partitioned sectors of the image as shown:

aa  ba  ca
ab  bb  cb
ac  bc  cc

Here, note that the 5 node symbols are conveniently defined as being the 5 global features, so no specific transformation is required. However, the semantic vector (i.e. location) of a node has to be computed from the coordinates in its digraph NDT.

b is of size B where each element $b_j = (U_{lk}, y_j)$ where

- $U_{lk}$ is the symbol of branch $b_j$, with $l$ and $k$ representing the termination nodes, that is, $l, k \in N$. This symbol can be obtained from the pointers in the linked list.
- $y_j$ is the semantic vector of branch $b_j$, a four-tuple $(L_j, K_j, \theta_1, \theta_2)$. It can be easily computed from the chain codes of the main-links and the sublinks. We can define the transformations as follows.

$L_j$ = length attribute: short (Sh), medium (Me), or long (Lg). If T is the total number of pixels in the thinned image and m is the physical length of the branch, then

$$L_j = \begin{cases} \text{Sh} & \text{if } m \le T/20 \\ \text{Me} & \text{if } T/20 < m \le 3T/10 \\ \text{Lg} & \text{if } m > 3T/10 \end{cases}$$

$K_j$ = curvature attribute: one of $C_1$, $C_2$, $C_3$, $C_4$, $C_5$. Given a Freeman chain code in the W mask, $\{d_1, d_2, \ldots, d_m\}$, we can define a total incremental curvature $I$ as:

$$I = \begin{cases} \sum_{i=1}^{m-1} k_i & \text{if } m > 1 \\ 0 & \text{if } m = 1 \end{cases}$$

where $k_i$ is the incremental curvature (in degrees) in moving from $d_i$ to $d_{i+1}$, defined as:

$$k_i = \begin{cases} 45\,(d_{i+1} - d_i) & \text{if } -4 < (d_{i+1} - d_i) < 4 \\ 45\,(d_{i+1} - d_i - 8) & \text{if } (d_{i+1} - d_i) > 4 \\ 45\,(8 - |d_{i+1} - d_i|) & \text{if } (d_{i+1} - d_i) < -4 \end{cases}$$

then $K_j$ is defined as:

$$K_j = \begin{cases} C_1 & \text{if } -45^\circ \le I \le 45^\circ \\ C_2 & \text{if } 46^\circ \le I \le 90^\circ \text{ or } -90^\circ \le I \le -46^\circ \\ C_3 & \text{if } 91^\circ \le I \le 270^\circ \text{ or } -270^\circ \le I \le -91^\circ \\ C_4 & \text{if } I > 270^\circ \text{ or } I < -270^\circ \\ C_5 & \text{special} \end{cases}$$

where $C_5$ represents a special case which can be found in character 'B', 'M', or 'W'.

$\theta_1$ and $\theta_2$ are the entry and exit directions, each one of the eight directions of the W (Freeman) mask.
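The length and curvature attributes defined above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of [1]: the function names are hypothetical, a direction change of exactly +4 or -4 is not covered by the definition as printed (so the sketch raises an error rather than guess), and the special class C5 is not derived. The chain code in the usage lines is a hypothetical one chosen so that its increments match those listed in Example A3.1, and T = 60 is an arbitrary image size.

```python
from typing import Sequence


def length_attribute(m: int, total_pixels: int) -> str:
    """Classify a branch of physical length m in a thinned image of T pixels."""
    if m <= total_pixels / 20:
        return "Sh"
    if m <= 3 * total_pixels / 10:
        return "Me"
    return "Lg"


def incremental_curvature(d_cur: int, d_next: int) -> int:
    """Curvature k_i in degrees when moving from chain code d_i to d_(i+1)."""
    diff = d_next - d_cur
    if -4 < diff < 4:
        return 45 * diff
    if diff > 4:
        return 45 * (diff - 8)
    if diff < -4:
        return 45 * (8 - abs(diff))
    # Assumption: a change of exactly +/-4 is left undefined by the text.
    raise ValueError("direction change of +/-4 is not defined")


def total_curvature(chain: Sequence[int]) -> int:
    """Total incremental curvature I; zero when the chain has one code."""
    return sum(incremental_curvature(a, b) for a, b in zip(chain, chain[1:]))


def curvature_attribute(total_deg: int) -> str:
    """Map I to C1..C4 (the special class C5 is not handled here)."""
    magnitude = abs(total_deg)
    if magnitude <= 45:
        return "C1"
    if magnitude <= 90:
        return "C2"
    if magnitude <= 270:
        return "C3"
    return "C4"


if __name__ == "__main__":
    chain = [3, 2, 3, 2, 3, 2, 3, 3, 2, 1]  # hypothetical chain code
    total = total_curvature(chain)
    print(total, curvature_attribute(total), length_attribute(len(chain), 60))
```

Because I is always a multiple of 45 degrees, comparing its magnitude against 45, 90 and 270 is equivalent to the ranges in the definition.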
Given a Freeman chain code $\{d_1, d_2, \ldots, d_m\}$, we define the exit and entry directions as follows:

For m > 4:

$$\theta_1 = \begin{cases} \lceil (d_1 + d_2)/2 \rceil & \text{if } |d_1 - d_2| < 4 \\ (\lceil (d_1 + d_2)/2 \rceil + 4) \bmod 8 & \text{if } |d_1 - d_2| > 4 \end{cases}$$

$$\theta_2 = \begin{cases} \lceil (d_{m-1} + d_m)/2 \rceil & \text{if } |d_{m-1} - d_m| < 4 \\ (\lceil (d_{m-1} + d_m)/2 \rceil + 4) \bmod 8 & \text{if } |d_{m-1} - d_m| > 4 \end{cases}$$

For m < 4:

$$\theta_1 = d_1, \qquad \theta_2 = d_m$$

Example A3.1 (example 1 on page 83 of [1]) shows the transformation of the digraph representation of a thinned image into an ARG representation.

Given a reference ARG G and an input ARG G', the deformation distance of G' is defined as the cost $h_T$ to transform G' to G:

$$h_T = d(G', G) = w_1 \sum_i d(n'_i, n_i) + w_2 \sum_j d(b'_j, b_j \mid U'_{ml}, U_{nk})$$

where $d(n'_i, n_i)$ is the cost of obtaining $n_i$ from $n'_i$ by h; $d(b'_j, b_j \mid U'_{ml}, U_{nk})$ is the cost of obtaining branch label $b_j$ from $b'_j$ by h given that $U_{nk}$, the end node pair of $b_j$, is obtained from $U'_{ml}$, the end nodes of $b'_j$, by the same transformation; $w_1$ and $w_2$ are weights.

To compute the numerical cost of the deformation, the definitions of the individual deformation costs are presented below. As mentioned before, some of the constant values for deformation (i.e. the costs) are altered for this project. Nevertheless, the old values are also listed for comparison.

i. Node semantic vector deformation cost:

Node semantic vector $x_i$ is a 2-tuple field with the first element specifying the column sector and the second element specifying the row sector. Each field can be one of the three types in {a, b, c}. Refer to the definition of ARG. The old and new costs of deformation are:

DEFORMATION            OLD COST [1]   NEW COST
a to b & b to c        3              3
a to c                 7              13.1

ii. Branch semantic vector deformation cost:

Branch semantic vector $y_j$ is a 4-tuple field with $L_j$ as the length attribute, $K_j$ as the curvature attribute, and $\theta_1$ and $\theta_2$ as the entry and exit angle attributes respectively.

iia. Length attribute deformation cost:

Length attributes can be one of three types in {Sh, Me, Lg}. Refer to the definition of ARG. The old and new costs of deformation are:

DEFORMATION            OLD COST [1]   NEW COST
Sh to Me & Me to Lg    2              10
Sh to Lg               8              16.3

iib. Curvature attribute deformation cost:

Curvature attributes can be one of five types in {C1, C2, C3, C4, C5}. Refer to the definition of ARG. The old and new costs of deformation are:

DEFORMATION      OLD COST [1]   NEW COST
C1 to:  C2       2              2
        C3       4              8
        C4       16             16
        C5       16             32
C2 to:  C1       2              2
        C3       2              2
        C4       8              8
        C5       8              32
C3 to:  C1       4              8
        C2       2              2
        C4       2              2
        C5       8              32
C4 to:  C1       16             16
        C2       8              8
        C3       2              2
        C5       8              32
C5 to:  C1       16             32
        C2       8              32
        C3       8              32
        C4       8              32

iic. Entry and exit angle deformation cost:

The entry or exit angle deformation cost is defined by the following formula:

IF (abs(θ - θ') < 4) THEN dθ = abs(θ - θ') * 2
ELSE dθ = (8 - abs(θ - θ')) * 2

where dθ is the deformation cost. This deformation cost is not altered. Therefore, the above equation is also utilized in this research project.

A new deformation cost was added to the SCR. This deformation cost only affects images with loops and no node. Careful inspection of the actual PASCAL program of the SCR showed that the deformation cost for the length of a loop is computed using the cost discussed in section iia above. In view of the basic character vs basic feature problem (Chapter V), an additional deformation function is added. The additional deformation cost assists the CCSA and the SCR in distinguishing between the character "O" and a section of a character which forms a closed loop. It is a length deformation cost where:

DIFF is the difference, in number of pixels, between the physical length of a loop type input image and that of a reference image of the same type. Each range of DIFF has a corresponding function value.
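The curvature and angle costs of sections iib and iic can be sketched as two small lookup functions. This is an illustration only: the names `curvature_cost` and `angle_cost` and the table names are assumptions made here, the cost values are transcribed from the curvature table above, and the length cost is left out (the usage lines compare two branches of equal length, whose length cost is taken as zero). The usage lines repeat the branch comparison worked in Example A3.2, which is based on the old costs.

```python
from typing import Dict, Tuple

CostTable = Dict[Tuple[str, str], int]

# Curvature costs are symmetric, so each unordered pair is stored once.
OLD_CURVATURE_COST: CostTable = {
    ("C1", "C2"): 2, ("C1", "C3"): 4, ("C1", "C4"): 16, ("C1", "C5"): 16,
    ("C2", "C3"): 2, ("C2", "C4"): 8, ("C2", "C5"): 8,
    ("C3", "C4"): 2, ("C3", "C5"): 8,
    ("C4", "C5"): 8,
}

NEW_CURVATURE_COST: CostTable = {
    ("C1", "C2"): 2, ("C1", "C3"): 8, ("C1", "C4"): 16, ("C1", "C5"): 32,
    ("C2", "C3"): 2, ("C2", "C4"): 8, ("C2", "C5"): 32,
    ("C3", "C4"): 2, ("C3", "C5"): 32,
    ("C4", "C5"): 32,
}


def curvature_cost(k: str, k_ref: str, table: CostTable = OLD_CURVATURE_COST) -> int:
    """Cost of deforming curvature attribute k into k_ref."""
    if k == k_ref:
        return 0
    return table[tuple(sorted((k, k_ref)))]


def angle_cost(theta: int, theta_ref: int) -> int:
    """Entry or exit angle cost: 2 per 45-degree step, measured the short way round."""
    diff = abs(theta - theta_ref)
    return 2 * diff if diff < 4 else 2 * (8 - diff)


if __name__ == "__main__":
    # Branch (Me, C1, 3, 4) against branch (Me, C2, 2, 5); equal lengths cost 0.
    total = 0 + curvature_cost("C1", "C2") + angle_cost(3, 2) + angle_cost(4, 5)
    print(total)
```

Storing each unordered pair once relies on the tables being symmetric, which holds for every entry listed above.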
The deformation cost for each range of DIFF is listed in the following table:

Value of DIFF        Deformation cost
DIFF < 3             0.0
3 < DIFF < 5         1.0
5 < DIFF < 7         2.0
7 < DIFF < 8         4.0
8 < DIFF < 10        8.0
10 < DIFF < 14       16.0
14 < DIFF < 18       32.0
18 < DIFF < 28       64.0
DIFF > 28            100.0

Table A3.5. The additional deformation function.

The cost from the new function is added to the final deformation distance of the input image. The computation of the deformation distance is presented in the next section.

An example of calculating deformation costs is shown in Example A3.2 (example 2 on pages 84-85 of [1]). The example illustrates the basic operations in the calculation of deformation costs based on the OLD COST given above.

3.3. Reference guided inexact ARG matching algorithm

Instead of using an exhaustive search scheme, a heuristic ARG matching scheme is utilized. The scheme is guided by the ARG of the reference character. The matching is achieved by the following steps (from page 87 of [1]):

"1. Obtain a best match between the nodes of the input ARG and reference ARG. Matching will proceed if and only if a match can be found for every node in the reference and input ARG.

2. Perform a depth first traversal of the reference ARG. For each step of this traversal find an equivalent step in the input ARG by checking the feasibility of making such a step between pair(s) of nodes in the input image that has been matched to the node pair in the reference ARG that is involved in the step. For each feasible step found in the input ARG, calculate the deformation distance between the branch involved and the reference branch traversed. The feasible equivalent branch yielding the minimum distance is taken as the best matched branch."

The reference guided matching of ARGs uses a distance matrix (DM) to calculate the deformation distance efficiently. Given a pair of ARGs, each with N nodes, a DM of size NxN can be formed with the columns representing the unknown nodes and the rows representing the reference nodes:

                      unknown nodes (columns)
reference nodes       DM, size N by N
(rows)

The deformation between the jth unknown node and the ith reference node is calculated and stored in the jth element of the ith row of DM, that is DM(i,j).

The reference guided ARG matching algorithm is presented below as Algorithm A3, which is a direct excerpt from [1] (pages 90-92).

Algorithm A3 - Reference guided inexact matching.

Input: Linked list of NDTs representing the directed reference ARG and a linked list of NDTs representing the ARG of input image.
Output: Total deformation distance between the Reference ARG and input ARG.

Let 'I's represent the node numbers of the reference ARG and 'J's represent the node numbers of the input (unknown) ARG each having N nodes.

1. Initialize DM of size NxN to zero;
   d(Input ARG, Ref. ARG) := 0;
2. { steps 2 to 4 find the best match between the nodes of the input and reference ARGs }
   For ref. node I := 1 to N Do
     For unknown node J := 1 to N Do
     Begin
       If node J syntactic symbol = node I syntactic symbol then
         calculate d(nJ,nI);
       If d(nJ,nI) = 0 then DM(I,J) := 1
       else if d(nJ,nI) > threshold1 then DM(I,J) := 0
       else DM(I,J) := d(nJ,nI)
     End;
3. For I := 1 to N Do
   Begin
     keep the minimum non-zero entry of each row and set the rest to zero;
     If a row with all zero entries is detected then goto step 10.
   End;
4. For J := 1 to N Do
     If column J has no non-zero entry then
     Begin
       For I := 1 to N Do
       Begin
         If node J syntactic symbol = node I syntactic symbol then
           calculate d(nJ,nI);
         If d(nJ,nI) = 0 then DM(I,J) := 1
         else if d(nJ,nI) > threshold1 then DM(I,J) := 0
         else DM(I,J) := d(nJ,nI)
       End;
       For J := 1 to N Do
       Begin
         keep the minimum non-zero entry of each column and set the rest to zero;
         If a column with all zero entries is detected then goto step 10
       End;
     End;
5. Ref. node I := root node of directed Ref. ARG;
6. { feasibility check and branch matching }
   For J := 1 to N Do
   Begin
     If DM(I,J) > 0 then
     Repeat
       Let I' be an end node associated with an outgoing branch from node I;
       reset bflag; db := threshold2;
       Begin
         For J' := 1 to N Do
           If DM(I',J') > 0 then
           Begin
             If a branch exists between nodes JJ' of the unknown ARG { feasibility satisfied } then
               if (d(branch JJ', branch II') < threshold2 and d(branch JJ', branch II') < db) then
               Begin
                 db := d(branch JJ', branch II');
                 set bflag
               End;
           End;
         If bflag set then DM(I,J) := DM(I,J) + db
         else
         Begin
           DM(I,J) := 0 { node pair I & J is not a feasible match so is eliminated };
           goto nn
         End;
       End;
     Until all outgoing branches of Ref. node I have been tested and matched;
     nn:
   end;
7. check row I and keep minimum non-zero entry of the row and set the rest to zero;
   If all zeros in the row is detected then goto step 10;
8. I := next node of the depth first traversal of the Ref. ARG;
9. Repeat steps 5 to 8 until traversal of the Ref. ARG is completed; goto 11;
10. No match between the two ARGs is possible; goto 12;
11. For I := 1 to N Do
      For J := 1 to N Do
        d(Input ARG, Ref. ARG) := d(Input ARG, Ref. ARG) + DM(I,J);
12. End.

where threshold1 and threshold2 are the node and branch deformation thresholds, respectively, above which we consider the deformation to be unreasonable, so the match is not considered a possible match.
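Steps 1 to 3 of Algorithm A3 (filling the distance matrix and keeping one candidate per reference node) can be sketched as below. This is a simplified illustration and not the excerpted algorithm itself: the function names, the `(symbol, sector)` tuple layout and the threshold value are assumptions made here, the sector costs are the old values used in Example A3.2, ties are resolved by taking the first minimum (the source does not say), and steps 4 to 11 (column repair, branch feasibility and the final sum) are omitted.

```python
from typing import List, Optional, Tuple

Node = Tuple[str, str]  # (syntactic symbol, two-letter sector such as "ba")

# Old sector deformation costs: a<->b = b<->c = 3, a<->c = 7.
SECTOR_COST = {("a", "b"): 3, ("b", "c"): 3, ("a", "c"): 7}


def sector_cost(p: str, q: str) -> int:
    """Cost of moving between two sector letters of the same field."""
    if p == q:
        return 0
    return SECTOR_COST[tuple(sorted((p, q)))]


def node_cost(n: Node, m: Node) -> int:
    """Node semantic cost: column-sector cost plus row-sector cost."""
    return sector_cost(n[1][0], m[1][0]) + sector_cost(n[1][1], m[1][1])


def initial_node_match(ref: List[Node], unknown: List[Node],
                       threshold1: int) -> Optional[List[List[int]]]:
    """Build DM (rows = reference nodes, columns = unknown nodes).

    Returns None when some reference node has no candidate at all.
    """
    size = len(ref)
    dm = [[0] * size for _ in range(size)]
    for i, r in enumerate(ref):
        for j, u in enumerate(unknown):
            if r[0] != u[0]:
                continue  # different syntactic symbols never match
            d = node_cost(r, u)
            if d == 0:
                dm[i][j] = 1  # a perfect match is stored as 1, not 0
            elif d <= threshold1:
                dm[i][j] = d
    for row in dm:
        nonzero = [v for v in row if v > 0]
        if not nonzero:
            return None
        keep = row.index(min(nonzero))  # first minimum wins (assumption)
        for j in range(size):
            if j != keep:
                row[j] = 0
    return dm


if __name__ == "__main__":
    reference = [("J3", "ba"), ("Ep", "bc"), ("Ep", "cc")]
    unknown = [("J3", "ba"), ("Ep", "bc"), ("Ep", "ac")]
    for row in initial_node_match(reference, unknown, threshold1=10):
        print(row)
```

In this hypothetical input two reference nodes end up claiming the same unknown node, which leaves one column empty; that is the situation step 4 of Algorithm A3 exists to repair.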
After the character is recognized, the results are returned to the CCSA through ID of TI. The results include the identity of the character, the deformation distance, the preclassified group number, and the Boolean variable of recognition. These four elements correspond to the 4-tuple field ID of TI. Refer to Chapter IV for the definition of a test image TI.

CONTENTS            ABV.             INITIAL VALUE
Node number         Node-no.         0
Junc. typ.          JT               [illegible]
Coordinate          (I,J)            (0,0)
Forward pointer     fwdptr           Nil
Backward pointer    bkptr            Nil
Integer status      order, lk, Ml    0,0,0
Main link code      Main-link        0..0
sublink1 code       sub-link1        0..0
sublink2 code       sub-link2        0..0
sublink3 code       sub-link3        0..0
sublink1 ptr.       subptr1          Nil
sublink2 ptr.       subptr2          Nil
sublink3 ptr.       subptr3          Nil

Remarks:
1. Node number is the order the nodes are found.
2. Order is the order the nodes are visited during the extraction process. Therefore once visited a node will have an order of > 0.
3. Lk and Ml are used to indicate the number of sublinks and main link attached to a node.
4. Subptrs are used to indicate the terminal nodes of sublinks; this information is required by the pruning procedures.

Table A3.1. A typical NDT. [1]

[Table A3.2: two tables reproduced from [1]; the table bodies are not legible in the source scan. The first table lists, for each of the 25 groups (I to XXV), the global features EP, J3, J4, SP and C, A%, the characters in the group, and H_G. The second table lists, for each group, the characters, TC and TS.]

Remarks:
1. Ep, J3, J4, Sp and C are the number of endpoints, J3s, J4s, single points and closures found in the image.
2. A% is the percentage distribution of input images over the different groups.
3. H_G is the total entropy of an image from the group.
TC = Total number of different characters in the group.
TS = Total number of reference characters for the group.
Number of topologies observed for each character is given in brackets beside each character.

Table A3.2. Tables from [1] describing the old dictionary.

GROUP   # OF FEATURES        CHARACTERS
        EP   J3   J4   C
1       0    0    0    1     D,O
2       0    2    0    2     B
3       1    1    0    1     -
4       1    1    1    2     -
5       1    1    2    2     -
6       1    2    1    2     -
7       1    3    0    2     -
8       2    0    0    0     C,L,S,V
9       2    0    1    1     -
10      2    0    2    1     -
11      2    1    1    1     -
12      2    2    0    1     P,Q,R
13      2    4    0    2     -
14      3    1    0    0     E,F,G,J,M,N,T,U,W,Y,Z
15      3    1    1    1     -
16      3    3    0    1     A
17      4    0    0    0     -
18      4    0    1    0     -
19      4    0    2    0     -
20      4    1    1    0     -
21      4    2    0    0     H,I,K,X
22      4    2    1    1     -
23      4    4    0    2     -
24      5    1    1    0     -
25      5    3    0    0     -

Table A3.3. New dictionary's grouping.

Example 1: Transformation of the digraph representation of 'R' into an ARG representation.

[Figure panels: Input Image; Digraph Representation; ARG Representation.]

Nodes:            Branches:
1 = (J3,ba)       b1 = (U12,(Me,C1,5,6))
2 = (Ep,bc)       b2 = (U13,(Me,C1,7,7))
3 = (J3,bb)       b3 = (U31,(Lg,C3,1,5))
4 = (J3,bb)       b4 = (U43,(Sh,C1,2,2))
5 = (Ep,cc)       b5 = (U45,(Me,C1,8,8))
6 = (Ep,ac)       b6 = (U64,(Me,C2,3,2))

Sample calculations.
n6 = Node 6 = (s6, x6) = (Ep,ac), where the syntactic symbol is Ep and coordinate (5,23) maps into sector ac.
b6 = (U64,(L6,K6,θ1,θ2)) with
L6 = Me since T/20 < m ≤ 3T/10,
K6 = C2 since I = (-1+1-1+1-1+1+0-1-1) x 45° = -90°,
θ1 = ⌈(3+2)/2⌉ = 3 and θ2 = ⌈(2+1)/2⌉ = 2.

Example A3.1. Digraph to ARG transformation [1].

Example 2: This example illustrates how the distance between the two ARGs shown below can be calculated. For the moment we assume that the feasibility of matching the nodes and branches of the two ARGs has been checked and that they are matched according to their labels (node numbers & branch subscripts).

[Figure: the two ARGs, A with nodes 1 to 6 and branches b1 to b6, and B with nodes 1' to 6' and branches b'1 to b'6. The node and branch lists are not legible in the source scan.]

Given that the transformation h is such that the branch and node semantic vector components have the following deformation distances:
For nodes: d(a,b) = 3, d(b,c) = 3, d(a,c) = 7.
For branches:
1. Length attributes: d(Sh,Me) = 2, d(Me,Lg) = 2, d(Sh,Lg) = 8.
2. Curvature attributes: d(C1,C2) = 2, d(C1,C3) = 4, d(C1,C4) = 16, d(C1,C5) = 16, d(C2,C3) = 2, d(C2,C4) = 8, d(C2,C5) = 8, d(C3,C4) = 2, d(C3,C5) = 8, d(C4,C5) = 8.
3. θ1 and θ2, the exit and entry directions, have a distance of 2 for every 45° of deformation, e.g. d(1,2) = 2, d(1,3) = 4, d(1,4) = 6, d(1,5) = 8, d(1,6) = 6, d(1,7) = 4, d(1,8) = 2, d(2,3) = 2, d(2,4) = 4 etc.
w1 = 1 and w2 = 1.

Distance calculations:
d(B,A) = Σ (node deformation distance) + Σ (branch deformation distance) = (0+0+0+0+3+7) + (8+4+0+2+4+6) = 34.
e.g. d(node 6', node 6) = d(a,c) + d(c,c) = 7 + 0 = 7.
d(b'6, b6) = d(Me,Me) + d(C2,C1) + d(θ1',θ1) + d(θ2',θ2). ...... cont.

Example A3.2.
Computation of deformation distance [1]

Note that node 6 is matched to node 6' and node 4 to node 4', so U64 and U4'6' describe a pair of matched branches b6 and b6' with one branch tracked in the reverse direction with respect to their end nodes, as indicated by their syntactic symbols. In order to match these two branches we have to use equation 4.3.7 to correct θ1 and θ2 of one of the branches before matching. Correcting θ1 and θ2 of b6', we rewrite its branch primitive as b'6 = (U6'4',(Me,C2,2,5)). Therefore the branch deformation distance is calculated as:
d(Me,Me) = 0.
d(C1,C2) = 2.
d(3,2) = 2.
d(4,5) = 2.
Therefore d(b6,b'6) = 0+2+2+2 = 6.

Example A3.2. (cont.) Computation of deformation distance [1].

References

[1] I. H. Wong, "Design of a real time high speed recognizer for unconstrained handprinted alphanumeric character," Master of Applied Science Thesis, Department of Electrical Engineering, The University of British Columbia, February 1985.
[2] J. H. Munson, "Experiments in the recognition of hand-printed text: Part I - Character recognition," Proc. Fall Joint Computer Conference, Vol. 33, pp. 1125-1138, December 1968.
[3] J. H. Munson, "Experiments in the recognition of hand-printed text: Part II - Context analysis," Proc. Fall Joint Computer Conference, Vol. 33, pp. 1139-1149, December 1968.
[4] A. Rosenfeld, "The fuzzy geometry of image subsets," Pattern Recognition Letters, Vol. 2, No. 5, pp. 311-317, September 1984.
[5] W. H. Tsai and K. S. Fu, "Error-correcting isomorphisms of attributed relational graphs for pattern analysis," IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-9, No. 12, pp. 757-768, December 1979.
[6] W. H. Tsai and K. S. Fu, "A pattern deformational model and Bayes error-correcting recognition system," IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-9, No. 12, pp. 745-756, December 1979.
[7] N. J. Naccache and R. Shinghal, "SPTA: A proposed algorithm for thinning binary patterns," IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-14, No. 3, pp. 409-418, May/June 1984.
[8] M. A. Eshera and K. S. Fu, "A graph distance measure for image analysis," IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-14, No. 3, pp. 398-408, May/June 1984.
[9] T. Pavlidis, "An asynchronous thinning algorithm," Computer Graphics and Image Processing 20, pp. 133-157, 1982.
[10] A. Rosenfeld and L. S. Davis, "A note on thinning," IEEE Transactions on Systems, Man, and Cybernetics, pp. 226-228, March 1976.
[11] B. Blesser, et al., "A theoretical approach for character recognition based on phenomenological attributes," Int. Journal of Man-Machine Studies 6, pp. 701-714, 1974.
[12] J. Leroux, et al., "Segments detection in binary pictures for the representation and the syntactic recognition of hand written characters," 6th Int. Conf. on Pattern Recognition Proceedings, pp. 692-695, 1982.
[13] A. Rosenfeld, "Figure extraction," Automatic Interpretation and Classification of Images, pp. 137-153.
[14] K. Badie and M. Shimura, "Machine recognition of Roman cursive scripts," 6th Int. Conf. on Pattern Recognition Proceedings, Vol. 1, pp. 28-30, 1982.
[15] P. S. P. Wang, "A new character recognition scheme with lower ambiguity and higher recognizability," 6th Int. Conf. on Pattern Recognition Proceedings, Vol. 1, pp. 37-39, 1982.
[16] M. Shridhar and A. Badreldin, "High accuracy character recognition algorithm using Fourier and topological descriptors," Pattern Recognition, Vol. 17, No. 5, pp. 515-524, 1984.
[17] M. Richetin and F. Vernadat, "Efficient regular grammatical inference for pattern recognition," Pattern Recognition, Vol. 17, No. 2, pp. 245-250, 1984.
[18] M. Shridhar and A. Badreldin, "A tree classification algorithm for handwritten character recognition," 7th Int. Conf. on Pattern Recognition Proceedings, Vol. 1, pp. 615-618, 1984.
[19] T. Kasvand and N. Otsu, "Segmentation of thinned binary scenes with good connectivity algorithms," 7th Int. Conf. on Pattern Recognition Proceedings, Vol. 1, pp. 297-300, 1984.
[20] R. I. Oka, "Handwritten Chinese-Japanese character recognition by using cellular feature," 6th Int. Conf. on Pattern Recognition Proceedings, Vol. 2, pp. 783-785, 1982.
[21] N. Okamoto, O. Nakamura, T. Minami, "Character segmentation for mixed mode communication," Information Processing 83, pp. 681-685, 1983.
[22] E. G. Wallingford, Jr., "A visual pattern recognition computer program based on neurophysiological data," Behavioral Science, Vol. 17, pp. 241-248, 1972.
[23] M. Suk and T. H. Cho, "Segmentation of images using minimum spanning trees," Proceedings of SPIE - the International Society for Optical Engineering, V. 397, pp. 180-185, April 19-22, 1983.
[24] J. Mantas and C. Daskalakis, "Parallel labelling and structural analysis in handwritten character recognition," Proceedings of MELECON '83 - Mediterranean Electrotechnical Conference, Athens, Greece, pp. A.8.03/1-2, May 24-26, 1983.
[25] W. J. M. Kickert and H. Koppelaar, "Application of Fuzzy Set Theory to Syntactic Pattern Recognition of Handwritten Capitals," IEEE Transactions on Systems, Man, and Cybernetics, pp. 148-151, February 1976.
[26] R. L. P. Chang and T. Pavlidis, "Fuzzy decision tree algorithms," IEEE Transactions on Systems, Man, and Cybernetics, pp. 28-35, Vol. SMC-7, No. 1, January 1977.
[27] R. Shinghal, "An experimental investigation of four text recognition algorithms," IEEE Transactions on Systems, Man, and Cybernetics, pp. 573-577, Vol. SMC-12, No. 4, July/August 1982.
[28] T. Kanade, "Survey - Region Segmentation: Signal vs Semantics," Tutorial: Context-directed Pattern Recognition and Machine Intelligence Techniques for Information Processing, pp. 518-533, IEEE Computer Society Press, New York, 1982.
[29] N. Lindgren, "Machine recognition of human language - Part III Cursive script recognition," IEEE Spectrum, pp. 104-116, May 1965.
[30] M. K. Brown and S. Ganapathy, "Cursive script recognition," Proceedings of the Int. Conf. on Cybernetics and Society 1980, pp. 47-51.
[31] C. H. Chen, "A comparison of image segmentation techniques," Proceedings of the Int. Conf. on Cybernetics and Society 1981, pp. 364-368.
[32] M. Shridhar and A. Badreldin, "Recognition of isolated and connected handwritten numerals," Proceedings of the 1984 IEEE Int. Conf. on Systems, Man, and Cybernetics, pp. 142-146.
[33] C. C. Kwan, L. Pang, and C. Y.
Suen, "A c o m p a r i t i v e s t u d y o f some r e c o g n i t i o n a l g o r i t h m s i n c h a r a c t e r r e c o g n i t i o n , " P r o c e e d i n g s o f t h e I n t . C o n f . on C y b e r n e t i c s and S o c i e t y 1979, pp. 530 - 535. [34] K. H i r o t a and W. P e d r y c z , " C h a r a c t e r i z a t i o n o f f u z z y c l u s t e r i n g a l g o r i t h m s i n t e r m s o f e n t r o p y o f p r o b a b i l i s t i c s e t s , " P a t t e r n R e c o g n i t i o n L e t t e r , V o l 2, No. 4, pp. 213 - 216, June 1984. [35] A. K a n d e l , " F u z z y t e c h n i q u e s i n p a t t e r n r e c o g n i t i o n , " W i l e y , New York, 1982. [36] A. R o s e n f e l d , "A C h a r a c t e r i z a t i o n o f P a r a l l e l T h i n n i n g A l g o r i t h m s , " I n f o r m a t i o n and C o n t r o l 29, pp. 286 - 291, 1975. [37] A. Rosenfeld,"A Converse t o the J o r d a n Curve Theorem f o r D i g i t a l C u r v e s , " I n f o r m a t i o n and C o n t r o l 29, pp. 292 -293, 1975. [38] F. H a r r y , "Graph T h e o r y , " pp. 1 - 23, A d d i s o n - W e s 1 e y , Reading, M a s s a c h u s e t t s , 1972. [39] D. Marr, " V i s i o n , " Freeman, New Y o r k , 1982. [40] D. B a l l a r d and C. Br o w n , "Computer V i s i o n , " P r e n t i c e H a l l , Englewood C l i f f s , New J e r s e y , 1982. [41] M. G o l a y , "Hexagonal P a r a l l e l P a t t e r n T r a n s f o r m a t i o n s , " IEEE T r a n s a c t i o n s o f C o m p u t e r s , V o l C-18, No. 8, pp. 733 - 740, August 1969. [42] W. H. H. J . L u n s c h e r , "A d i g i t a l p r e p r o c e s s o r f o r o p t i c a l c h a r a c t e r r e c o g n i t i o n , " M a s t e r o f A p p l i e d S c i e n c e T h e s i s , D e p a r t m e n t o f E l e c t r i c a l E n g i n e e r i n g , The U n i v e r s i t y o f B r i t i s h C o l u m b i a , 1983. 132 
