UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Reasoning about spatial relationships in the primal sketch Majka, Marc Stephen 1982

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
831-UBC_1982_A6_7 M35.pdf [ 9.17MB ]
Metadata
JSON: 831-1.0051819.json
JSON-LD: 831-1.0051819-ld.json
RDF/XML (Pretty): 831-1.0051819-rdf.xml
RDF/JSON: 831-1.0051819-rdf.json
Turtle: 831-1.0051819-turtle.txt
N-Triples: 831-1.0051819-rdf-ntriples.txt
Original Record: 831-1.0051819-source.json
Full Text
831-1.0051819-fulltext.txt
Citation
831-1.0051819.ris

Full Text

REASONING ABOUT SPATIAL RELATIONSHIPS IN THE PRIMAL SKETCH by MARC STEVEN MAJKA B.Sc. The U n i v e r s i t y of New Brunswick, 1980 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES DEPARTMENT OF COMPUTER SCIENCE We accept t h i s t h e s i s as conforming to the r e q u i r e d standard THE UNIVERSITY OF BRITISH COLUMBIA September 1982 © Marc Steven Majka, 1982 In presenting t h i s thesis i n p a r t i a l f u l f i l m e n t of the requirements for an advanced degree at the University of B r i t i s h Columbia, I agree that the Library s h a l l make i t f r e e l y available for reference and study. I further agree that permission for extensive copying of t h i s thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. I t i s understood that copying or publication of t h i s thesis for f i n a n c i a l gain s h a l l not be allowed without my written permission. Department of do«*f>ole<r O e . ' e / i c e . The University of B r i t i s h Columbia 1956 Main Mall Vancouver, Canada V6T 1Y3 Date Oct 18. i i ABSTRACT The p r i m a l sketch d e s c r i b e s the magnitude and s p a t i a l arrangement of abrupt b r i g h t n e s s changes i n an image. Many of these changes a r i s e due to s u r f a c e boundaries i n the u n d e r l y i n g scene. S p a t i a l r e l a t i o n s h i p s present i n the pr i m a l sketch are used to reason about o b j e c t s of known s p a t i a l s t r u c t u r e . Groupings of image f e a t u r e s are p r e d i c t e d from scene knowledge. Image f e a t u r e s that s a t i s f y p r e d i c t e d r e l a t i o n s h i p s are found i n the pri m a l sketch. These image f e a t u r e s and t h e i r r e l a t i o n s h i p s are used to form scene d e s c r i p t i o n s . The i n t e r p r e t a t i o n of the symbolic tokens i n these d e s c r i p t i o n s i s based on the known c o m p o s i t i o n a l and s t r u c t u r a l p r o p e r t i e s of o b j e c t s i n the scene domain. Roads and other c u r v e l i n e a r f e a t u r e s i n a e r i a l imagery are d e s c r i b e d i n terms of t h e i r s t r u c t u r a l and co m p o s i t i o n a l p r o p e r t i e s . T h i s domain i s used as a t e s t case f o r a s p a t i a l reasoning program. The program d e t e r -mines s e c t i o n s of l o g g i n g roads and t h e i r i n t e r s e c t i o n s . ACKNOWLEDGEMENTS Many thanks to my s u p e r v i s o r , Bob Woodham, f o r a l l h i s h e l p with t h i s t h e s i s . Thanks a l s o to Jim L i t t l e f o r many d i s c u s s i o n s and br a i n s t o r m i n g s e s s i o n s , and to Jay Glicksman f o r ideas and h e l p with software. T e c h n i c a l a s s i s t a n c e i n d i g i t i z i n g the image data was pro v i d e d by Block B r o t h e r s i n Vancouver. i v TABLE OF CONTENTS I ABSTRACT i i II ACKNOWLEDGEMENTS i i i III TABLE OF CONTENTS i v IV TABLE OF FIGURES v i 1 . INTRODUCTION 1 1.1. S c e n a r i o : l o g g i n g roads 2 1.2. Chapter summary 10 2. FRAMEWORK 13 2.1. C l a s s i f i c a t i o n techniques 14 2.2. Region growing 16 2.3. Template matching 17 2.4. L i n e f i n d i n g 19 2.5. The p r i m a l sketch 22 2.6. Map-guided i n t e r p r e t a t i o n 24 3. REASONING ABOUT IMAGE FEATURES 25 3.1. The raw p r i m a l s k e t c h 25 3.2. S t r u c t u r i n g scene domain knowledge 30 3.3. Applying scene domain knowledge 35 4. IMPLEMENTATION 41 4.1. Computing the p r i m a l sketch 41 4.1.1. Zero c r o s s i n g boundaries 41 4.1.2. Zero c r o s s i n g segments 45 4.1.3. Bar segments 47 4.1.4. Regions 51 4.2. S t r i p s and j u n c t i o n s 52 4.2.1. S t r i p s 53 4.2.2. J u n c t i o n s 55 V 5. EXPERIMENTAL RESULTS 59 5.1. D i s c u s s i o n 59 5.2. Graphic output 62 6. CONCLUSIONS 69 6.1. Features and s t r u c t u r e 69 6.2. The pri m a l sketch i s a p p r o p r i a t e 72 6.3. Research i s s u e s 73 7. BIBLIOGRAPHY 7 5 v i TABLE OF FIGURES 1 . A e r i a l image of a l o g g i n g area 3 2. P o t e n t i a l road s e c t i o n s i n f i g u r e 1 5 3. Zero c r o s s i n g s i n V 2G*I f o r f i g u r e 1 7 4. Bar segments from f i g u r e 1 8 5. S t r i p s from f i g u r e 1 9 6. J u n c t i o n s from f i g u r e 1 11 7. F i g u r e 1 convolved with a bar operator 20 8. A e r i a l image of a l o g g i n g area 28 9. Regions from f i g u r e 8 29 10. Bar segments from f i g u r e 8 31 11. C l a s s i f i c a t i o n based on p i x e l i n t e n s i t y 34 12. R a d i a l l y arranged Zero C r o s s i n g Segments 36 13. S t r a i g h t - l i n e approximations to a curve 48 14. The o r i g i n a l image f o r t r i a l 1 63 15. Roads i n f i g u r e 14 64 16. The o r i g i n a l image f o r t r i a l 2 65 17. Roads i n f i g u r e 16 66 18. The o r i g i n a l image f o r t r i a l 3 67 19. Roads i n f i g u r e 18 68 CHAPTER 1 INTRODUCTION T h i s t h e s i s d e s c r i b e s the a p p l i c a t i o n of scene and image domain knowledge to the pri m a l sketch of an image [Marr & H i l d r e t h 80], [ H i l d r e t h 80], An implementation i s presented to demonstrate t h i s approach using a e r i a l im-agery of l o g g i n g o p e r a t i o n s to i d e n t i f y l o g g i n g roads i n the scene. The p r i m a l sketch d e s c r i b e s the magnitude and s p a t i a l arrangement of abrupt b r i g h t n e s s changes i n an image. These changes a r i s e due to changes i n the s t r u c t u r e of the u n d e r l y i n g scene. The pri m a l sketch i s a s u i t a b l e l e v e l of r e p r e s e n t a t i o n f o r reasoning about scene f e a t u r e s u s i n g knowledge of t h e i r s t r u c t u r a l p r o p e r t i e s . T h i s knowledge i s used to p r e d i c t the e f f e c t s of those f e a t u r e s on the pr i m a l sketch. Reasoning from these p r e d i c t i o n s f a c i l i -t a t e s the i d e n t i f i c a t i o n and d e s c r i p t i o n of important groupings of image f e a t u r e s . These d e s c r i p t i o n s represent elements s a t i s f y i n g the s t r u c t u r a l c o n s t r a i n t s on the ap-pearance of scene f e a t u r e s . Such "scene elements" form the 1 2 b a s i s f o r reasoning about other p r o p e r t i e s of o b j e c t s . 1 . 1 . S c e n a r i o : Logging roads The problem of l o c a t i n g l o g g i n g roads and other c u r -v e l i n e a r f e a t u r e s i n a e r i a l images p r o v i d e s a good t e s t domain f o r reasoning about s t r u c t u r a l r e l a t i o n s h i p s i n the p r i m a l sketch. The appearance of these f e a t u r e s i s l a r g e l y d e s c r i b e d i n terms of composition and s u r f a c e s t r u c t u r e . T h i s knowledge i s used to p r e d i c t p r i m a l sketch f e a t u r e s i n d i c a t i n g the presence of scene elements which are i n t e r -preted.as segments of roads. F i g u r e 1 i s . a d i g i t i z e d p o r t i o n of an a e r i a l photo-graph of a l o g g i n g area i n southeastern B r i t i s h Columbia. Roads appear as l i g h t - toned c u r v i l i n e a r f e a t u r e s . N o t i c e the g r a v e l r i v e r bed near the c e n t r e of the image i s s i m i -l a r i n appearance. The important d e f i n i n g c h a r a c t e r i s t i c s f o r roads i n t h i s type of scene a r e : - c o n s i s t e n t width of approximately 10 meters - g r a v e l s u r f a c e s - continuous p o i n t to p o i n t connections - a c l e a r e d r i g h t of way Figure 1 A logging area. Roads appear as l i g h t - toned c u r v i l -inear f e a t u r e s . 4 From these c h a r a c t e r i s t i c s , i t i s p r e d i c t e d that roads w i l l produce long p a i r s of zero c r o s s i n g s i n the p r i m a l s k e t c h . These c u r v i l i n e a r p a i r s are r e f e r r e d to as s t r i p s . Knowing the image s c a l e i n f i g u r e 1 to be e q u i v a l e n t to approximately 2.5 meters per p i x e l , the width of s t r i p s w i l l be i n the range from 3 to 7 p i x e l s . S t r i p width i s l e s s c o n s i s t e n t than the u n d e r l y i n g road. Zero c r o s s i n g s forming s t r i p s i d e s may occur at the road's egde or at the edge of the road's r i g h t of way. The s t r i p i t s e l f must be of a tone a p p r o p r i a t e to g r a v e l s u r f a c e s . . Road segments found in t h i s image by the computer im-plementation are shown in F i g u r e 2. The g r a v e l bed of the c e n t r a l r i v e r has c o m p o s i t i o n a l and s t r u c t u r a l charac-t e r i s t i c s s i m i l a r to a road. I t i s t h e r e f o r e i n c l u d e d in the d e s c r i p t i o n of scene elements which e x h i b i t r o a d - l i k e p r o p e r t i e s . Zero c r o s s i n g s i n the L a p l a c i a n of the image, smoothed with a Gaussian f i l t e r , are used to d e s c r i b e b r i g h t n e s s changes in the image. T h i s o p e r a t i o n i s denoted by V 2G*I. I denotes the image, G the Gaussian, * convolu-t i o n , and V 2 the L a p l a c i a n (a second d e r i v a t i v e o p e r a t o r ) . F i g u r e 3 shows the zero c r o s s i n g s i n V 2G*I f o r t h i s image F i g u r e 2 Road s e c t i o n s are shown in b l a c k . They are d e r i v e d from the primal sketch by reasoning about the compo-s i t i o n a l and s t r u c t u r a l p r o p e r t i e s of roads. 6 u s i n g a s i n g l e s p a t i a l frequency channel. The width of t h i s channel was chosen to g i v e accurate demarcation of road edges. The width of roads i n the scene p l a c e s an upper l i m i t on the s i z e of the V 2G operator used. I f a l a r g e operator i s used, the zero c r o s s i n g s corresponding to road edges w i l l be d i s p l a c e d from the l o c a t i o n of the a c t u a l edges [ H i l d r e t h 80 footnote 9, p 116]. A p r i m i t i v e token i n the p r i m a l sketch i s the bar. Bars found i n f i g u r e 1 are shown in f i g u r e 4. Knowing that roads have p a r a l l e l s i d e s , bar tokens are used as cues to t h e i r presence. Bars with width, c o n t r a s t and tone a p p r o p r i a t e f o r roads are s e l e c t e d as cues. Width, tone, c o n t i n u i t y and branching p r o p e r t i e s of roads are used to d i r e c t a search of the p r i m a l sketch for extensions to the bar which are c o n s i s t e n t with the i n t e r p r e t a t i o n of the o b j e c t marked by the bar as a road. The r e s u l t of the search i s a d e s c r i p t i o n of s t r i p s . The c e n t r e l i n e s of s t r i p s are shown in black in f i g u r e 5. The c e n t r e l i n e s f o l l o w many of the roads and the c e n t r a l r i v e r bed i n the scene. Gaps occur where the p r i m a l sketch d i d not provide s u f f i c i e n t cues to i n i t i a t e a new s t r i p , where c l e a r ex-t e n s i o n s c o u l d not be found f o r e x i s t i n g s t r i p s , and where F i g u r e 3 Zero c r o s s i n g s in V 2 G * I for the image I of f i g u r e 1. F i g u r e 5 Centre l i n e s of s t r i p s are shown in black o v e r l a y . 10 j u n c t i o n s w e r e d e t e c t e d a n d t h e s e a r c h w a s s t o p p e d . I n t e r s e c t i o n s a r e d e s c r i b e d b y j u n c t i o n t o k e n s . S o m e j u n c t i o n s a r e d i s c o v e r e d w h e n a s t r i p e x t e n s i o n e n c o u n t e r s a n e x i s t i n g s t r i p . O t h e r s a r e f o u n d b y e x p l i c i t l y s e a r c h -i n g f o r c l u s t e r s o f s t r i p e n d p o i n t s . F i g u r e 6 s h o w s t h e l o c a t i o n s o f j u n c t i o n s i n t h e e x a m p l e i m a g e . 1.2. C h a p t e r S u m m a r y C h a p t e r 2 e x a m i n e s o t h e r w o r k r e l a t i n g t o r o a d i n -t e r p r e t a t i o n . S e v e r a l r o a d f i n d i n g o r r o a d t r a c k i n g s c h e m e s a r e r e v i e w e d , i n c l u d i n g p i x e l c l a s s i f i c a t i o n , t e m -p l a t e m a t c h i n g , a n d m a p - g u i d e d t e c h n i q u e s . T h e p r i m a l s k e t c h i s a l s o i n t r o d u c e d a n d d e s c r i b e d . C h a p t e r 3 d e v e l o p s t h e f u l l p r i m a l s k e t c h , a n d d i s c u s s e s how i t i s u s e d t o r e a s o n a b o u t s p a t i a l r e l a t i o n -s h i p s . A c o r r e s p o n d e n c e i s d e v e l o p e d b e t w e e n l e v e l s o f r e p r e s e n t a t i o n a n d t y p e s o f k n o w l e d g e . S t r u c t u r a l p r o p e r -t i e s o f r o a d s a r e m a t c h e d t o p r i m a l s k e t c h f e a t u r e s . C h a p t e r 4 d e s c r i b e s t h e a l g o r i t h m s a n d d a t a s t r u c -t u r e s u s e d i n a c o m p u t e r i m p l e m e n t a t i o n o f t h e r e a s o n i n g s y s t e m . T h e f i r s t s e c t i o n d e s c r i b e s t h e c o m p u t a t i o n o f t h e F i g u r e 6 The l o c a t i o n of j u n c t i o n are shown at the geometric centr e of the endpoints of the component s t r i p s . 12 p r i m a l sketch i t s e l f . The second s e c t i o n d e s c r i b e s the search r o u t i n e s which are used f i n d s t r i p s and j u n c t i o n s . Chapter 5 demonstrates the r e s u l t s obtained by the implementation. Chapter 6 summarizes the r e s u l t s of the work presen t -ed, and suggests r e l a t e d r e s e a r c h i s s u e s . CHAPTER 2 FRAMEWORK The i n t e r p r e t a t i o n of c u r v i l i n e a r f e a t u r e s i n remotely sensed imagery i s not a new problem to computa-t i o n a l v i s i o n . Approaches s i m i l a r to those which have been used to i d e n t i f y roads and r i v e r s have a l s o been used to i n t e r p r e t p r i n t e d c i r c u i t boards, p r i n t e d or hand w r i t t e n c h a r a c t e r s , and spark and bubble chamber t r a c k s . E x i s t i n g techniques are . u s e f u l i n the simpler of these image domains. For example, the use of f e a t u r e d e t e c t o r s together with a p a r s i n g s t r a t e g y can s u c c e s s f u l l y be used to read p r i n t e d c h a r a c t e r s . The complexity of images i s g r e a t e r i n remotely sensing. The v a r i a b i l i t y of the appearance of o b j e c t s i n the scene v i o l a t e s the inherent assumptions of e x i s t i n g techniques. Understanding images i n c o m p licated domains r e q u i r e s more f l e x i b l e i n t e r p r e t a -t i o n systems. Three important aspects of c u r r e n t approaches to c u r -v i l i n e a r f e a t u r e i n t e r p r e t a t i o n problem are examined: the type of knowledge used, the image r e p r e s e n t a t i o n , and the 1 3 1 4 way knowledge i s used. 2.1. C l a s s i f i c a t i o n techniques P o s s i b l y the e a s i e s t way to i n t e r p r e t an image i s to assume a d i r e c t mapping from image i n t e n s i t y to scene c h a r a c t e r i s t i c . But, image i r r a d i a n c e i s a f u n c t i o n of scene i r r a d i a n c e , s u r f a c e shape, imaging geometry, t r a n s m i s s i o n c h a r a c t e r i s t i c s of the o p t i c a l medium, and the r e f l e c t a n c e c h a r a c t e r i s t i e s . o f the obj e c t m a t e r i a l s . For a formal d i s c u s s i o n of the image formation process, see [Horn 75], [Horn 77], [Horn & Sjoberg 78], [Nicodemus, Richmond, H i s a , Ginsberg & Limperis 77] and [Woodham 81]. If f a c t o r s other than s u r f a c e r e f l e c t a n c e are h e l d con-s t a n t , image i n t e n s i t y can be r e l a t e d to the r e f l e c t a n c e c h a r a c t e r i s t i c s of the scene s u r f a c e s . Problems occur i f the r e f l e c t a n c e of d i f f e r e n t scene elements i s i d e n t i c a l i n the wavelengths sampled. T h i s technique i s l i m i t e d i n i t s a b i l i t y to make use of s t r u c t u r a l r e l a t i o n s h i p s . An image of a teddy bear might w e l l be c l a s s i f i e d as to the i d e n t i t y of the d i f f e r e n t types of c l o t h or fur making up the bear, but the bear i t s e l f c o u l d not be d i f f e r e n t i a t e d from other animals made out of s i m i l a r m a t e r i a l s . 15 A c l a s s i f i c a t i o n approach i s a p p l i e d to i n d i v i d u a l p i x e l s , or to small neighbourhoods of p i x e l s . In the l a t t e r case, image t e x t u r e can sometimes be r e l a t e d to s u r f a c e p r o p e r t i e s of the scene. If the scene domain i s not s u f f i c i e n t l y c o n t r o l l e d to minimize the f a c t o r s mentioned above, a mapping from image i n t e n s i t y to scene c h a r a c t e r i s t i c s cannot be d e f i n e d . Current r e s e a r c h i n c l a s s i f i c a t i o n techniques i s concerned with ways of d e a l i n g with ambiguous elements, or with e l e -ments which seem "out of p l a c e " with respect to t h e i r im-mediate neighbourhood. There are too many examples of t h i s approach i n the l i t e r a t u r e to mention e x p l i c i t l y . A r e -view of c l a s s i f i c a t i o n techniques i s given by Cormack [Cormack 71]. A survey of c l a s s i f i c a t i o n as a p p l i e d to r e -motely sensed imagery i s c o n t a i n e d i n [Swain & Davis 78]. The c l a s s i f i c a t i o n approach d i r e c t l y r e l a t e s image elements to c l a s s e s i n the scene domain. Implementations vary, but the technique used can be viewed as e q u i v a l e n t to a t a b l e look-up. 16 2.2. Region growing Region growing, i n i t s most ge n e r a l form, groups i n -d i v i d u a l p i x e l s i n t o connected homogeneous r e g i o n s . A measure of the s i m i l a r i t y of two p i x e l s , two r e g i o n s , or of a p i x e l to an e x i s t i n g region i s used to decide i f the two elements w i l l be merged. Commonly, the s i m i l a r i t y measure i s based on image i n t e n s i t y . The u n d e r l y i n g as-sumption i s that s i m i l a r neighbouring image elements a r i s e from the same object i n the scene. Region based schemes may i n c o r p o r a t e some knowledge of shape or c o n t i n u i t y . T h i s knowledge i s embodied as h e u r i s t i c s i n the region growing p r o c e s s . A d i f f i c u l t y encountered i n r e g i o n grow-ing process i s that a t e r m i n a t i o n c r i t e r i o n i s necessary. T h i s c r i t e r i o n may be d i f f i c u l t to d e r i v e from knowledge of the domain. T h i s approach i s demonstrated by Bajcsy and T a v a k o l i , [Bajcsy & T a v a k o l i 76]. T h e i r " v i s u a l model" of a road s p e c i f i e s s p e c t r a l p r o p e r t i e s , shape, and c o n t i n u i t y . S p e c t r a l knowledge i s used to s e l e c t candidate road p i x -e l s . These p i x e l s are grouped i n t o narrow s t r i p s , with s p e c i f i c a t i o n s f o r width and c u r v a t u r e , a c c o r d i n g to knowledge of the appearance of roads. F i n a l l y , i s o l a t e d 1 7 road c a n d i d a t e s are j o i n e d together by a "road grower". 2.3. Template matching The l o c a t i o n s of f e a t u r e s of s p e c i f i c shape may be determined by the degree to which the image matches a "mask" of the f e a t u r e of i n t e r e s t . The mask i s a p r o t o t y p -i c a l image of t h i s f e a t u r e . One method of matching the mask and the image i s d e s c r i b e d by the Swartz i n e q u a l i t y . In the d i s c r e t e case, i t p r e d i c t s that a p o i n t i n an image produced by the normalized c o n v o l u t i o n of the mask with the o r i g i n a l image w i l l have a maximum value of one i f the mask i s an exact m u l t i p l e of the image i n the area of that p o i n t . The value decreases as the match gets l e s s exact. S e v e r a l problems a r i s e . F i r s t , the technique only responds to exact matches between the image and the mask. The r e s u l t v a r i e s with the s c a l e and o r i e n t a t i o n of the f e a t u r e under the mask. Thus a l a r g e number of masks would be necessary to d e t e c t a f e a t u r e at an a r b i t r a r y s c a l e and 2-dimensional o r i e n t a t i o n . Second, i n n a t u r a l imagery, i t i s u n l i k e l y that an exact match would ever be found. At best, the r e s u l t s must be s u b j e c t e d to a t h r e s h o l d i n g o p e r a t i o n to s e l e c t " s t r o n g " c a n d i d a t e s f o r the l o c a t i o n 18 of a f e a t u r e . T h i r d , many 2-D images can be produced by many 3-D o b j e c t s i n the world. Objects do not have unique images, and n e i t h e r do images correspond to unique ob-j e c t s . The template matching approach works w e l l when a l l the assumptions of the u n d e r l y i n g theory are s a t i s f i e d . In a domain where the f a c t o r s that produce an image can be modeled i n t h i s mathematical framework, p r e c i s e s o l u t i o n s e x i s t f o r f e a t u r e r e c o g n i t i o n problems. An i n t e n s i t y mask i s a h i g h l y compiled form of in f o r m a t i o n about the shape, o r i e n t a t i o n , and s u r f a c e c h a r a c t e r i s t i c s of the scene f e a t u r e . T h i s technique has been a p p l i e d to the problem of f i n d i n g roads i n a e r i a l and s a t e l l i t e imagery [Vanderbrug 75a], [Quam 78], [ F i s c h l e r , Tenenbaum & Wolf 81]. F i s c h l e r et a l r e p o r t on a s p e c i f i c road opera-t o r , due to R. 0. Duda, and demonstrate some r e s u l t s of i t s a p p l i c a t i o n . They d i s c u s s the use of "edge o p e r a t o r s " as p o s s i b l e road d e t e c t o r s , and present a system to i n -t e g r a t e the r e s u l t s o b t ained from s e v e r a l such f i l t e r s . A "road" template i n gen e r a l i s a bar mask, which matches a bar of a given width, with a l i g h t center and dark out-19 s i d e , extended s p a t i a l l y i n the d i r e c t i o n of the bar. F i g u r e 7 shows the r e s u l t of a p p l y i n g such a mask to an image. Another approach to template matching i s the use of very g e n e r a l templates to determine image f e a t u r e s . Marr [Marr 74b] suggested the use of a bar shaped f i l t e r to h e l p determine bar tokens i n the p r i m a l s k e t c h . He l a t e r dropped the use of a bar f i l t e r . The f i r s t d i f f i c u l t y i s that the r e s u l t of the c o n v o l u t i o n of an image and a mask i s a new image, and not an image d e s c r i p t i o n . The second d i f f i c u l t y i s that the f i l t e r c o n t a i n s too much compiled knowledge and i s t h e r e f o r e < i n f l e x i b l e i n i t s a p p l i c a t i o n . Bar tokens are now found i n a more general manner, by sym-b o l i c reasoning about the c o m p a t i b i l i t y of c l o s e l y spaced zero c r o s s i n g segments to form a bar. 2.4. L i n e f i n d i n g One view of image understanding i s that i t i s a staged process, i n i t i a l l y producing image d e s c r i p t i o n s i n -dependent of the scene domain. Region f i n d i n g , ( i . e . the grouping of s i m i l a r p i x e l s ) , and l i n e f i n d i n g , ( i . e . the d i s c o v e r y of d i s c o n t i n u i t i e s i n p i x e l i n t e n s i t i e s ) , are 20 F i g u r e 7 The r e s u l t of a p p l y i n g a "bar op e r a t o r " to the image in f i g u r e 1. Four mask o r i e n t a t i o n s were used. The maximum value of the four c o n v o l u t i o n s i s used i n t h i s image. 21 the dual approaches taken. A l a r g e number of papers d e s c r i b e r e s e a r c h i n t h i s a r e a . L i n e s may denote boun-d a r i e s between o b j e c t s i n the scene. A d e s c r i p t i o n of ob-j e c t boundaries i s the b a s i s f o r reasoning using very high l e v e l s of knowledge of the scene domain. Work i n t h i s area began with [Roberts 65], Waltz presents a c l a s s i c develop-ment of t h i s approach [Waltz 72], [Waltz 75], L i n e and region d e s c r i p t i o n s of an image forms the b a s i s f o r s e v e r a l c u r v i l i n e a r f e a t u r e reasoning programs. In the i n t e r e s t of b r e v i t y , a number of r e p r e s e n t a t i v e r e f e r e n c e s a r e : [Bajcsy & T a v a k o l i 73] [Vanderbrug 75b] [Nevatia & Babu 79] [Hwang, Lee & H a l l 79] [ T a v a k o l i & Rosenfeld 80] [Nevatia & P r i c e . 8 2 ] [ T a v a k o l i & Rosenfeld 82] Among the road i n t e r p r e t a t i o n programs, the e s s e n t i a l c h a r a c t e r i s t i c i s the use of p a i r s of l i n e elements as cues to the l o c a t i o n s or roads. The knowledge used i s con-s i s t e n t l y that of the shape and composition of roads and other c u r v i l i n e a r f e a t u r e s . 22 A n u m b e r o f l i n e f i n d i n g m e t h o d s h a v e b e e n d e v e l o p e d . A s u m m a r y i s g i v e n i n [ D a v i s 7 5 ] . M a r r a n d H i l d r e t h g i v e a p r e c i s e c h a r a c t e r i z a t i o n o f l i n e s a s z e r o c r o s s i n g s i n t h e L a p l a c i a n o f i m a g e i n t e n s i t y . M o t i v a t e d b y h u m a n n e u r o p h y -s i o l o g y , t h e y p r o v i d e a m e a n s o f d e s c r i b i n g e d g e s i n t e r m s o f t h e i r s p a t i a l f r e q u e n c y c o m p o n e n t s . T h e p r i m a l s k e t c h i m a g e d e s c r i p t i o n i s b a s e d o n t h i s c h a r a c t e r i z a t i o n o f e d g e s . 2 . 5 . T h e r a w p r i m a l s k e t c h T h e p r i m a l s k e t c h d e s c r i p t i o n o f a n i m a g e i s t h e f i r s t d e s c r i p t i v e l e v e l o f a t h e o r y o f h u m a n v i s i o n d e v e l o p e d b y D a v i d M a r r . D e t a i l s o f t h e f o r m o f t h e p r i m a l s k e t c h c h a n g e d a n u m b e r o f t i m e s a s t h e t h e o r y e v o l v e d . T h e r e d o e s n o t a c t u a l l y e x i s t a s i n g l e l e v e l o f i m a g e d e s c r i p t i o n c a l l e d t h e " p r i m a l s k e t c h " . A t t h e l o w e s t l e v e l i s t h e " r a w p r i m a l s k e t c h " , w i t h h i g h e r l e v e l i m a g e d e s c r i p t i o n s b u i l t u p o n i t . T h e r a w p r i m a l s k e t c h i s e x -a m i n e d h e r e . T h e r a w p r i m a l s k e t c h d e s c r i b e d i n t h i s t h e s i s i s f r o m - [ H i l d r e t h 8 0 ] . I t i s a l s o d e s c r i b e d i n [ M a r r 7 9 ] , [ M a r r & H i l d r e t h 8 0 ] , a n d [ M a r r 8 2 ] . T h e , r a w p r i m a l s k e t c h 23 i s based on zero c r o s s i n g s of the L a p l a c i a n of s e v e r a l d i f f e r e n t smoothed v e r s i o n s of an image. The degree of smoothing determines the s p a t i a l frequency channel of edge s e n s i t i v i t y . There are three d e s c r i p t i v e tokens: Edges, Bars, and Blobs. Edges are are formed from the r e s u l t of segmenting zero c r o s s i n g contours i n t o approximately s t r a i g h t s e c t i o n s . Bars are p a i r e d p a r a l l e l edges. Blobs are regions with area s m a l l e r than a t h r e s h o l d determined by the s i z e of the s p a t i a l frequency channels. The zero c r o s s i n g i n f o r m a t i o n from s e v e r a l s p a t i a l frequency channels i s combined in the raw p r i m a l sketch. Zero c r o s s i n g s c o i n c i d e n t i n d i f f e r e n t s p a t i a l frequency channels are assumed to a r i s e from the same scene f e a t u r e . An edge w i l l thus have a s s o c i a t e d with i t a measure of the range of i t s s p a t i a l frequency components. In Marr's view, the raw p r i m a l sketch i s the f i r s t l e v e l of d e s c r i p t i o n i n any image a n a l y s i s system. Other l e v e l s of image d e s c r i p t i o n , up to the 2 1/2-D sketch, are computed from the raw p r i m a l sketch. 24 2.6. Map-guided i n t e r p r e t a t i o n An a l t e r n a t i v e to the bottom-up approaches d i s c u s s e d i n the preceding s e c t i o n s i s the road-reasoning system presented i n [Tenenbaum, Barrow, B o l l e s , F i s c h l e r & Wolf 79], [ F i s c h l e r , Agin, Barrow, B o l l e s , Quam, Tenenbaum & Wolf 79], and [Glicksman 82]. T h i s approach i n t e g r a t e s many types and sources of knowledge, i n c l u d i n g high and low r e s o l u t i o n images, e l e v a t i o n models, symbolic map i n f o r m a t i o n , and a data base of i n f o r m a t i o n about s p e c i f i c f e a t u r e s i n the p h y s i -c a l areas being examined by the system. D i f f e r e n t p a r t s of the system can f u n c t i o n independently, where techniques l i k e those above are used, or together, where the knowledge sources are used by s e v e r a l programs co-o p e r a t i n g t o maintain a data-base of i n f o r m a t i o n about these g e o g r a p h i c a l areas. The important r e s e a r c h problems are those of c o - o r d i n a t i n g the system's knowledge, and choosing the a p p r o p r i a t e knowledge sources f o r given q u e r i e s to the program. CHAPTER 3 REASONING ABOUT IMAGE FEATURES T h i s chapter presents a v e r s i o n of the raw p r i m a l sketch and a method of reasoning about the raw p r i m a l s k e t c h . The problem of f i n d i n g roads in a e r i a l photographs i s used to i l l u s t r a t e the approach. 3.1. The raw primal sketch 'A v e r s i o n of Marr's raw p r i m a l sketch forms the b a s i s f o r t h i s work. The raw p r i m a l sketch presented i n H i l d r e t h ' s t h e s i s [ H i l d r e t h 80] i s used as the model, but a few changes have been made. There are three tokens i n the raw p r i m a l sketch: Region, Zero C r o s s i n g Segment (ZCS), and Bar Segment. These correspond to H i l d r e t h ' s BLOB, EDGE, and BAR. Only one a d d i t i o n has been made to the i n f o r m a t i o n contained i n H i l d r e t h ' s raw p r i m a l s k e t c h . Region tokens are used to d e s c r i b e areas of u n l i m i t e d s i z e . A BLOB has an upper l i m i t on area. Other changes have been made to make i n f o r m a t i o n more e x p l i c i t i n the image d e s c r i p t i o n . For i n s t a n c e , the e l o n g a t i o n of a r e -gion i s e x p l i c i t in a region token. T h i s value i s computed 25 26 from the area and perimeter l e n g t h of the r e g i o n . Elonga-t i o n i s not e x p l i c i t i n H i l d r e t h ' s raw p r i m a l sketch, but c o u l d be computed from v a l u e s which are e x p l i c i t i n the r e p r e s e n t a t i o n . These tokens are d e r i v e d from a p r i m i t i v e token, the Zero C r o s s i n g Boundary (ZCB), which r e p r e s e n t s a s i n g l e z e r o - c r o s s i n g contour i n V 2G*I for a s i n g l e Gaussian f i l t e r . Only one s p a t i a l frequency channel i s used to form the raw p r i m a l sketch. T h i s i s the most s i g n i f i c a n t departure from H i l d r e t h ' s f o r m u l a t i o n . ZCS tokens, u n l i k e EDGEs, do not have a Width measure. A d e s c r i p t i o n found from a s i n g l e s p a t i a l frequency channel was deemed ade-quate to address the q u e s t i o n s posed i n t h i s t h e s i s . Im-age d e s c r i p t i o n s are s i m p l i f i e d by t h i s change, and the q u e s t i o n of determining corresponding z e r o - c r o s s i n g s i n s e v e r a l s p a t i a l frequency channels need not be addressed. A V 2G operator with <r = 2.2 was used. The width of o b j e c t s such as roads and power l i n e s i s used to determine the s i z e of the Gaussian f i l t e r . The width of the c e n t r a l p a r t of the mask, which i s 2*<y, should be approximately the same as the width of these o b j e c t s i n the s c a l e of the image. T h i s width g i v e s a c c u r a t e demarcation of the edges 27 of these f e a t u r e s . A s m a l l e r operator w i l l r e p o r t more high frequency d e t a i l . A l a r g e r operator w i l l cause the zero c r o s s i n g s to d r i f t outwards from the a c t u a l f e a t u r e edges. The f i r s t token to be d e r i v e d i s the Zero C r o s s i n g Boundary (ZCB). A ZCB represents a set of connected p i x -e l s l y i n g on a zero c r o s s i n g contour i n the L a p l a c i a n of the f i l t e r e d image. Each ZCB i n c l u d e s l e n g t h and boundary c o n t r a s t measures. Regions are d e r i v e d from the ZCBs. A r e g i o n i s a set of connected p i x e l s enclosed by (one or more) ZCBs. Measures of i n t e n s i t y , area, perimeter l e n g t h , and e l o n g a t i o n are made f o r r e g i o n s . F i g u r e 9 shows a graphic r e p r e s e n t a t i o n of the ZCBs and regions computed f o r the a e r i a l photograph in f i g u r e 8. The ZCBs are shown as black l i n e s . The i n t e n s i t y in each region i s propor-t i o n a l to the region's e l o n g a t i o n measure. P o i n t s on zero c r o s s i n g boundaries are grouped i n t o short s t r a i g h t c a l l e d Zero C r o s s i n g Segments (ZCS). T h i s grouping uses the g e n e r a l i z e d l i n e d e s c r i p t i o n a l g o r i t h m d e s c r i b e d i n [ B a l l a r d 79]. A ZCS i n c l u d e s the segment's endpoints, l e n g t h , o r i e n t a t i o n , and c o n t r a s t . F i n a l l y , p a i r s of p a r a l l e l ZCSs are found, and c a l l e d bar segments. 28 A d i g i t i z e d a e r i a l image of a l o g g i n g a r e a . 29 The boundaries and regions from F i g u r e 8. Re-gions are shown s c a l e d by t h e i r e l o n g a t i o n . 30 A bar segment i n c l u d e s the two component ZCSs, o r i e n t a -t i o n , width, l e n g t h , c o n t r a s t , and inner image i n t e n s i t y . F i g u r e 10 shows the Bar Segments found i n the a e r i a l pho-tograph i n F i g u r e 8. 3.2. S t r u c t u r i n g scene domain knowledge Image understanding must r e l a t e scene knowledge to image d e s c r i p t i o n s . Scene knowledge must be org a n i z e d to f a c i l i t a t e i t s use. D i f f e r e n t types of knowledge are r e l e v a n t to d i f f e r e n t l e v e l s of image d e s c r i p t i o n . Only scene knowledge that d i r e c t l y i n f l u e n c e s a l e v e l of d e s c r i p t i o n need be c o n s i d e r e d at that l e v e l . Knowing that c a r s have four wheels i s u s e l e s s i f a p i x e l l e v e l im-age d e s c r i p t i o n i s being c o n s i d e r e d . The f a c t that the t i r e s are made out of black rubber may, however, be q u i t e u s e f u l . The d e s c r i p t i o n s used by an image understanding sys-tem i n d i c a t e the type of knowledge that i s u s e f u l . As an example, an examination of l o g g i n g roads produces the f o l -lowing items of scene knowledge: Figure 10 The Bar Segments of F igure 8. 32 Logging Roads 1. The width i s c o n s i s t e n t i n a given range. 2. The s u r f a c e m a t e r i a l of the road i s g r a v e l . 3. Curvature i s u s u a l l y low, except i n mountainous areas, where switchbacks may have hi g h c u r v a t u r e . 4. Grades are c o n s t r a i n e d to be l e s s than 12%. 5. The road s u r f a c e i s smooth. ' 6. Roads i n t e r s e c t i n a r b i t r a r y l o c a t i o n s . The angles formed at the i n t e r s e c t i o n s are not c o n s t r a i n e d . 7. Roads are continuous. 8. In p e r s p e c t i v e a e r i a l photographs, r e l i e f displacement may obscure s e c t i o n s of the road. 9. The road s u r f a c e may be shadowed by f o r e s t cover or topography. 10. Roads c r o s s r i v e r s u s ing b r i d g e s . 11. Roads do not run through l a k e s . 12. Logging roads allow access to cu t - o v e r s and c u t t i n g areas. 13. There i s a c l e a r e d right-of-way. 14. Loading areas are common adjacent to the road, e s p e c i a l l y near c u t t i n g areas and i n t e r s e c t i o n s and at t h e i r terminus. 15. Long s t r a i g h t s e c t i o n s are uncommon i n mountainous t e r r a i n . 16. At l e a s t one end of the road w i l l leave the image frame i n a n . a e r i a l photograph. Each p i x e l i n an image combines i n f o r m a t i o n about the su r f a c e p r o p e r t i e s of the ground cover i n the area sensed by that p i x e l with e f f e c t s due to the geometry of image p r o j e c t i o n and scene i l l u m i n a t i o n . R e f l e c t a n c e , depth, c o l o u r , and i n some cases o p t i c a l flow are measures that may be d e r i v e d f o r each p i x e l . Texture measures may be 3 3 derived for groups of p i x e l s . At t h i s l e v e l of representa-tio n , knowledge of surface material and imaging geometry is relevant. For instance, image irradiance can be used to estimate surface reflectance. Surface reflectance can be used to c l a s s i f y the p i x e l as being on a road surface, i f measured reflectance i s consistent with the predicted value. Figure 11 shows a simple c l a s s i f i c a t i o n based on th i s reasoning for the image in Figure 8 . Histogram analysis determined three classes, which were interpreted as Gravel, New growth or Recent Cutover, and Forest. The raw primal sketch describes the s p a t i a l organiza-tion of brightness changes in an image. To the extent that such changes correspond to physical changes in the scene, i t i s an appropriate l e v e l of representation in which to formulate reasoning about structural knowledge. Reasoning at the l e v e l of the raw primal sketch i s used to describe the arrangement of scene elements. For example, The zero crossing segments in Figure 12 are r a d i a l l y structured, with their endpoints forming two concentric c i r c l e s . The raw primal sketch description of these zero crossing seg-ments includes endpoint locations. With th i s information e x p l i c i t l y represented, i t i s easy to recognize the 34 F i g u r e 1 1 A simple c l a s s i f i c a t i o n based on p i x e l i n t e n s i t y . The c l a s s e s are Gravel (white), Recent Cutover or New growth ( g r a y ) , and F o r e s t ( b l a c k ) . 35 c i r c u l a r arrangement. Without a d e s c r i p t i o n of the s p a t i a l arrangement of the zero c r o s s i n g segments, t h i s p a t t e r n would not be so e a s i l y r e c o g n i z e d . The s i d e s of an i d e a l road form p a i r s of p a r a l l e l Zero C r o s s i n g Segments spaced a c c o r d i n g to the width of road. A bar segment i s an ex-c e l l e n t cue to the e x i s t e n c e of a road. E x p e c t a t i o n s about c u r v a t u r e and c o n t r a s t are e a s i l y t e s t e d at the l e v -e l of the raw p r i m a l s k e t c h . C o n t e x t u a l r e l a t i o n s i n the scene domain may be ex-pressed i n many ways, i n c l u d i n g f u n c t i o n a l r e l a t i o n s between o b j e c t s , f u n c t i o n a l d e f i n i t i o n of o b j e c t s , c l a s -s i f i c a t i o n c a t a l o g u e s , and c o m p o s i t i o n a l s p e c i f i c a t i o n s . A d i f f e r e n t l e v e l of r e p r e s e n t a t i o n i s necessary to use t h i s type of knowledge. At t h i s l e v e l the tokens must be at l e a s t p a r t i a l l y i n t e r p r e t e d as scene o b j e c t s . To use the f u n c t i o n a l r e l a t i o n s h i p s between roads, r i v e r s and b r i d g e s , a d e s c r i p t i o n must c o n t a i n tokens r e p r e s e n t i n g p o t e n t i a l segments of roads and r i v e r s , and s i t e s of pos-s i b l e b r i d g e s . 36 F i g u r e 12 A group of r a d i a l l y arranged Zero C r o s s i n g Segments. The raw primal sketch i s an a p p r o p r i a t e l e v e l of r e p r e s e n t a t i o n f o r reasoning about such simple s t r u c t u r a l arrangements. . 37 3.3. A p p l y i n g scene domain knowledge The task of f i n d i n g roads i n an a e r i a l image i s used to t e s t the n o t i o n t h a t s t r u c t u r a l reasoning can be based on the raw p r i m a l s k e t c h . C u r v e l i n e a r f e a t u r e s are l a r g e -l y d e s c r i b e d by t h e i r s t r u c t u r a l p r o p e r t i e s and s u r f a c e m a t e r i a l . Although the i n t e r p r e t a t i o n of l o g g i n g roads u l -t i m a t e l y r e l i e s on other c o n t e x t u a l i n f o r m a t i o n , t h e i r primary c h a r a c t e r i s t i c s are shape and s u r f a c e appearance. Mapping from a e r i a l photographs i s s i m i l a r i n f l a v o u r . The p h o t o - i n t e r p r e t e r s f i r s t sketches the l o c a t i o n of roads and r i v e r systems. Cues pr o v i d e d by that i n f o r m a t i o n h e l p i n t e r p r e t other f e a t u r e s . T h i s approach i s f o l l o w e d here. A program f o l l o w i n g t h i s approach w i l l produce s t a t e -ments about p a r t s of l o g g i n g roads and a s s o c i a t e d f e a t u r e s . Ambiguous p a r t s of the road system need to be subsequently examined i n a more g l o b a l c o n t e x t . Information about l o g g i n g roads leads to p r e d i c t i o n s about the raw p r i m a l sketch d e s c r i p t i o n of image f e a t u r e s that correspond to p a r t s of roads. S p e c i f i c a l l y , image f e a t u r e s due to l o g g i n g roads are s t r i p s formed from two l i n e s of zero c r o s s i n g s . P r e d i c t i o n s about s t r i p s a r e : 38 1 - S t r i p s have width a p p r o p r i a t e f o r the width of a road. 2 - S t r i p s have tone a p p r o p r i a t e f o r the p o s s i b l e s u r f a c e cover of a road. 3 - S t r i p s have c o n t r a s t a p p r o p r i a t e f o r the d i f f e r -ence in the road s u r f a c e cover and the ground cover of areas adjacent to roads. 4 - S t r i p s are long and connected to form a network. ( i . e . At most one end of a s t r i p i s unconnected) 5 - S t r i p s form h i g h l y elongated r e g i o n s . 6 - S t r i p s have low average c u r v a t u r e . A s t r i p token i s d e f i n e d to embody p a i r s of l i n e s of zero c r o s s i n g segments. Bar segments suggest the p o s s i -b i l i t y of a s t r i p . A s i n g l e bar segment forms the i n i t i a l S t r i p . The s t r i p i s extended in both d i r e c t i o n s u n t i l a stopping c o n d i t i o n i s met. There are s e v e r a l stopping con-d i t i o n s : 39 Boundary - The s t r i p stops at the image boundary. J u n c t i o n - Another s t r i p i s encountered. Fork - The endpoints of the s t r i p become separated by another zero c r o s s i n g . Dead End - The s i d e s meet. Fanout - The width becomes i n c o n s i s t e n t . Fadeout - The image i n t e n s i t y along the s t r i p becomes i n c o n s i s t e n t . T h i s i s a s s o c i a t e d with a "Wandering" z e r o - c r o s s i n g contour. The complete s t r i p d e s c r i p t i o n i n c l u d e s : - Component ZCS l i s t s f o r both s i d e s of the s t r i p . - A l i s t of p o i n t s on the c e n t e r l i n e . - Termination c o n d i t i o n s f o r each end of the s t r i p . - Length - Width - Average cur v a t u r e - Image i n t e n s i t y - C o n t r a s t - Sense : The shape of the i n t e n s i t y p r o f i l e . The sense of a s t r i p d e s c r i b e s the r e l a t i v e i n t e n s i t i e s of the c e n t r e of the s t r i p and the surrounding areas. The p o s s i b l e d e s c r i p t i o n s a r e : 1 - L i g h t c e n t e r , dark background. 2 - Background l i g h t on one s i d e , dark on the other. 3 - Dark c e n t e r , l i g h t background. 40 Measures of c o n t r a s t , c u r v a t u r e , width and inner im-age i n t e n s i t y are used to prune the symbolic s t r i p d e s c r i p t i o n s . The remaining s t r i p s i n d i c a t e p o s s i b l e f r a g -ments of roads. Other scene f e a t u r e s may p e r s i s t . S e c t i o n s of g r a v e l r i v e r banks, for example, would not yet be e l i m -i n a t e d . A new token, the j u n c t i o n , i s d e f i n e d to embody the o b s e r v a t i o n that road s e c t i o n s form networks. J u n c t i o n s are found in two ways. S t r i p s determine j u n c t i o n as one t e r m i n a t i o n c o n d i t i o n . T h i s occurs whenever s t r i p expan-s i o n f i n d s that a ZCS contiguous with the c u r r e n t s t r i p forms p a r t of another s t r i p . J u n c t i o n s are a l s o found i n -dependently by examining c l o s e groupings of s t r i p end-p o i n t s . A search i s used to f i n d p o i n t s which are w i t h i n a small f i x e d r a d i u s of at l e a s t two s t r i p endpoints. A set of the s t r i p endpoints i n range of each found p o i n t forms the i n i t i a l token. Sets are merged to form maximal groupings. Each maximal grouping determines a j u n c t i o n . S t r i p s and j u n c t i o n s together form a network, with j u n c t i o n s as nodes, and s t r i p s as l i n k s between nodes. T h i s r e p r e s e n t a t i o n i s the f i n a l product. The network i s not yet "Roads" and "Road I n t e r s e c t i o n s " . C o n t i n u i t y con-41 s t r a i n t s has not yet been a p p l i e d . Neither has an a n a l y s i s been made of the r e l a t i o n s h i p s between s t r i p s , j u n c t i o n s , and other image f e a t u r e s . T h i s , however, completes the d e s c r i p t i o n of the scene's road system as produced. CHAPTER 4 IMPLEMENTATION Programs to produce the p r i m a l sketch, and to reason about image f e a t u r e s have been implemented i n FRANZLISP under VAX/UNIX. T h i s chapter d e s c r i b e s d e t a i l s of the implementation, i n c l u d i n g a l g o r i t h m s and data s t r u c t u r e s . Image data was obtained from a e r i a l photographs of an area i n southeastern B r i t i s h Columbia. The s i t e f l i g h t l i n e s and photo numbers used a r e : BC 79085 #195 - F i g u r e 1. BC 79099 #078 - F i g u r e 8. BC 79099 #078 - F i g u r e 18. The nominal s c a l e of the photographs i s 1:10,000. The pho-tographs were d i g i t i z e d at a scanning d e n s i t y of 50 l i n e s per i n c h . 256 x 256 a r r a y s are used throughout. The ground r e s o l u t i o n i n the d i g i t i z e d photographs i s 3-5 meters per p i x e l . A road at t h i s r e s o l u t i o n i s 2-4 p i x e l s i n width, although c l e a r e d right-of-ways u s u a l l y add 1 or 2 p i x e l s to the width apparent i n an image. T h i s r e s o l u t i o n i s low f o r r e l i a b l e d e t e c t i o n of roads. In 42 43 order to a c c u r a t e l y respond to the b r i g h t n e s s d i s c o n t i n u i -t i e s produced by the separate s i d e s of the road, the V 2G operator must have a c e n t r a l width of 4-5 p i x e l s , twice the width of a road. A d i g i t a l f i l t e r approximating an operator t h i s narrow i s n o i s y . A d d i t i o n a l l y , the p r e c i -s i o n with which the road edges are l o c a t e d i s 1/2 p i x e l . Zero c r o s s i n g s produced at road edges may be s i g n i f i c a n t l y d i s p l a c e d at narrow road s e c t i o n s . When an image with t h i s r e s o l u t i o n i s d i s p l a y e d on an output device or as a photograph, roads are p e r c e i v e d without d i f f i c u l t y , s i n c e roads i n the r e t i n a l image of .the d i s p l a y cover many sens-ing elements on the r e t i n a . The images are convolved with a V 2G operator and a bi n a r y image i s produced i n which p i x e l s on zero c r o s s i n g s are s e t . Zero c r o s s i n g contours i n t h i s image are 4-connected. 4.1. Computing The p r i m a l sketch The p r i m a l sketch d e s c r i p t i o n i s d e r i v e d from t h i s zero c r o s s i n g map. The d e s c r i p t i o n i s done i n phases: 1. Chains of connected zero c r o s s i n g p o i n t s are ex-t r a c t e d from the zero c r o s s i n g p o i n t map. These 44 are c a l l e d zero c r o s s i n g boundaries (ZCB). 2. Each ZCB i s segmented i n t o s h o r t , approximately s t r a i g h t s e c t i o n s . These are c a l l e d zero c r o s s i n g segments (ZCS). 3. An i c o n i c r e p r e s e n t a t i o n of the zero c r o s s i n g segments i s made by l a b e l i n g each p i x e l on a zero c r o s s i n g with a r e f e r e n c e to the ZCS c o n t a i n i n g i t . P a i r i n g s of n e a r - p a r a l l e l segments are'found in t h i s r e p r e s e n t a t i o n . These are c a l l e d bar seg-ments. 4. Connected sets of p o i n t s between zero c r o s s i n g boundaries are found i n the zero c r o s s i n g p o i n t map. These are c a l l e d r e g i o n s . The computation of zero c r o s s i n g correspondences i n m u l t i p l e s p a t i a l frequency channels was avoided. T h i s r e s u l t s i n a r e d u c t i o n of the computational time r e q u i r e d to compute the primal s k e t c h . Since the width of the V 2G operator c o u l d be tuned to the f e a t u r e s of i n t e r e s t , a s i n g l e channel was deemed s u f f i c i e n t . . 4.1.1. Zero C r o s s i n g Boundaries Zero c r o s s i n g boundaries d e s c r i b e the zero c r o s s i n g contours i n V 2G. The r e p r e s e n t a t i o n f o r zero c r o s s i n g boundaries i s : 45 Zero c r o s s i n g boundary ( < r e f e r e n c e number> ( <point 1> <point 2> <point 3> <point n> ) <Gaussian a2> <number of p o i n t s n> ( < c o n t r a s t mean> < c o n t r a s t v a r i a n c e > ) ) The r e f e r e n c e number i s an i n t e g e r used t o i d e n t i f y and a c c e s s the ZCB. The p o i n t s a r e (x y) p a i r s , g i v i n g the l o c a t i o n of the z e r o c r o s s i n g p o i n t s i n image c o o r d i n a t e s . The s i z e of the G a u s s i a n f i l t e r used i n the V 2G o p e r a t o r i s s p e c i f i e d by i t s v a r i a n c e <s2. The number of p o i n t s i s s i m p l y the l e n g t h of the p o i n t l i s t . The c o n t r a s t measures g i v e the mean and v a r i a n c e of c o n t r a s t f o r the boundary p o i n t s . A s i m p l e c o n t r a s t measure f o r ZCB's i s d e t e r m i n e d as the image i n t e n s i t y range i n the 5 x 5 neighbourhood of each p o i n t . The p o i n t l i s t s f o r each ZCB a r e found i n a scan of the z e r o c r o s s i n g p o i n t map. Whenever a s e t p i x e l i s en c o u n t e r e d a l i n e f o l l o w i n g r o u t i n e i s i n v o k e d . The p o s i -t i o n s of a l l s e t p i x e l s a re r e c o r d e d and then s e t o f f i n 46 the image. The r o u t i n e stops when the l i n e s t o p s . The scan then resumes f o r the next set p i x e l . The r e s u l t i s a l i s t of a l l l i n e s i n the image. A ZCB i s c o n s t r u c t e d f o r each contour. 4.1.2. Zero C r o s s i n g Segments Zero c r o s s i n g segments (ZCSs) are d e r i v e d from zero c r o s s i n g boundaries. The contour represented by the ZCB i s segmented i n t o short s e c t i o n s which may be approximated by s t r a i g h t l i n e segments. The r e p r e s e n t a t i o n f o r zero c r o s s -ing segments i s : Zero C r o s s i n g Segment ( <reference number> ( <end p t . 1> <end p t . 2> ) ( <point 1> <point 2> <point 3> <point n> ) <Gaussian <r2> <orientation> <segmentation t h r e s h o l d ©> <length> <number of p o i n t s n> ( <contrast mean> <contrast variance> ) ) 47 The end p o i n t s are the end p o i n t s of the s t r a i g h t l i n e segment approximation to the contour. The p o i n t l i s t con-t a i n s the contour p o i n t s between the two end p o i n t s of the segment. The p o i n t l i s t i s used to produce a map of ZCSs. P o i n t s on the ZCS are l a b e l e d with the ZCS r e f e r e n c e number. Note that <point 1> = <end p t . 1> and <point n> = <end p t . 2>. Contrast measures f o r ZCS are found by measuring the image i n t e n s i t y d i f f e r e n c e s a c r o s s the seg-ment. The d i f f e r e n c e i s measured at each p o i n t on the con-tou r , at a d i s t a n c e of 2 p i x e l s i n each d i r e c t i o n perpen-d i c u l a r to the o r i e n t a t i o n of the segment. The d i s t a n c e should be equal to the Gaussian v a r i a n c e a2. Each ZCB i s s p l i t i n t o segments by f i t t i n g s t r a i g h t l i n e segments to the boundary. The segmentation a l g o r i t h m i s d e s c r i b e d i n [ B a l l a r d 79]. The i n i t i a l approximation to a contour i s the l i n e between the f i r s t and l a s t p o i n t i n the ZCB p o i n t l i s t . The a l g o r i t h m r e c u r s i v e l y r e f i n e s the approximation by s p l i t t i n g the contour at the p o i n t on the contour f u r t h e s t from the c u r r e n t approximation. The c u r r e n t approximation i s then r e p l a c e d by two l i n e s o r i -g i n a t i n g at the o l d endpoints and meeting at the s p l i t t i n g p o i n t . No s p l i t i s made i f the d i s t a n c e between the l i n e 48 and the curve i s l e s s than the segmentation t h r e s h o l d ©. The segmentation t h r e s h o l d i s the parameter given to the segmentation procedure, along with the ZCB. F i g u r e 14 shows s u c c e s s i v e approximations to a curve based on t h i s segmentation a l g o r i t h m . The choice of © decides how t i g h t -l y the contour i s approximated. A-small v a l u e , say ©^1.0, makes the segmentation s u b j e c t to t e s s e l a t i o n n o i s e i n the image. A value of 0=2.0 was found to be f r e e of t e s s e l a -t i o n noise e r r o r s , and not to lo s e much i n the r e s o l u t i o n of shape. Connecting l i n e s i n the zero c r o s s i n g contour map used by the boundary f o l l o w i n g a l g o r i t h m cause some boun-d a r i e s to end before they c l o s e . Segmentation of these boundaries i n t o zero c r o s s i n g segments causes the two end-p o i n t s of the l i n e to be grouped i n t o a zero c r o s s i n g seg-ment c o n t a i n i n g two separated p o i n t s . An a l t e r n a t i v e i s to f i n d the zero c r o s s i n g boundaries by f o l l o w i n g r e g i o n per-imeters. T h i s would ensure that a l l boundaries : c l o s e , and e l i m i n a t e the a r t i f a c t i n zero c r o s s i n g segments. Since the d i s c o n n e c t e d segments are easy to d e t e c t and d i s r e g a r d i n f u r t h e r computations, t h i s change has not been made. 49 F i g u r e 14 Su c c e s s i v e s t r a i g h t - l i n e approximations to a curve. The curve i s r e c u r s i v e l y segmented at p o i n t f u r t h e s t from the c u r r e n t approximation. The segmentation t h r e s h o l d 9 i s the maximum d i s t a n c e between the curve and i t s approximation. 50 4.1.3. Bar Segments Bar segments are formed from p a r a l l e l p a i r s of zero c r o s s i n g segments. An i c o n i c r e p r e s e n t a t i o n of ZCSs i s searched to f i n d the p a i r i n g s . A l i n e p e r p e n d i c u l a r to the o r i e n t a t i o n of the ZCS i s c o n s t r u c t e d at each of i t s end-p o i n t s . These l i n e s are checked f o r i n t e r s e c t i o n s with other segments w i t h i n an a l l o w a b l e d i s t a n c e of the end-p o i n t . T h i s d i s t a n c e i s the maximum width p o s s i b l e f o r a road. Segments found are p a i r e d with the o r i g i n a l ZCS i f two c o n d i t i o n s are met. 1. The d i f f e r e n c e between the o r i e n t a t i o n s of the two ZCSs i s s m a l l . 2. The " t r a p e z o i d a l d i s t o r t i o n " of the p a i r i s s m a l l . The t r a p e z o i d a l d i s t o r t i o n measure i s the r a t i o of the sum of the d i s t a n c e s between the two p a i r s of endpoints of the bar segment to twice the width of the bar segment. A bar segment that encloses a r e c t a n g u l a r area has a t r a p e z o i d a l d i s t o r t i o n equal to 1.0. I f t h i s area i s d i s t o r t e d i n t o a t r a p e z o i d , the measure i n c r e a s e s . T h i s measure of a bar segment's shape i s used to exclude h i g h l y skewed p a i r s of 51 segments. Both of these t h r e s h o l d s were chosen without exact c r i t e r i a . The valu e s used were 10 degrees f o r the maximum d i f f e r e n c e i n o r i e n t a t i o n , and 5 f o r t r a p e z o i d a l d i s t o r t i o n . Experiments with smaller v a l u e s caused f e a t u r e s that appeared " b a r - l i k e " to be ignored by the system. The r e p r e s e n t a t i o n f o r bar segments i s : Bar Segment ( <reference number> ( <ZCS-1 re f e r e n c e number> <ZCS-2 re f e r e n c e number> ) <average inner image i n t e n s i t y > ( <contrast mean> <contrast variance> ) <Gaussian <r2> <orientation> <length> <width> <d i f f e r e n c e of ZCS o r i e n t a t i o n s > < t r a p e z o i d a l d i s t o r t i o n > <sense> ( <centre p t . 1> <centre p t . 2> ) <ZCS segmentation t h r e s h o l d e> < t r a p e z o i d a l d i s t o r t i o n threshold> ) The inner image i n t e n s i t y i s the average f o r the area en-c l o s e d by the bar segment. The sense i s a d e s c r i p t i o n of the c o n t r a s t p r o f i l e . The sense i s an i n t e g e r 0-3, which i s i n t e r p r e t e d : 52 0 - Unknown or no c o n t r a s t . 1 - L i g h t bar i n a dark background. 2 - Dark on one s i d e , l i g h t on the ot h e r . 3 - Dark bar i n l i g h t background. The c e n t r e p o i n t s are the midpoints of the l i n e s connect-ing the ZCS endpoint p a i r s . 4.1.4. Regions Regions are connected areas enclosed by ZCB's. They are found i n the zero c r o s s i n g p o i n t map. A l a b e l i n g a l g o -rithm s i m i l a r to the l i n e e x t r a c t i o n a l g o r i t h m used f o r ZCB's i s used. In t h i s case, unset p i x e l s are d i s c o v e r e d by an image scan. When one i s found, a region f i l l i n g rou-t i n e i s invoked. The unset p i x e l i s re s e t to a new l a b e l , and the c o o r d i n a t e s of the p o i n t are pl a c e d i n a queue. The l a b e l i n g a l g o r i t h m proceeds by removing a p o i n t from the f r o n t of the queue, f i n d i n g any unl a b e l e d 4-connected neighbouring p o i n t s , and g i v i n g them the same l a b e l . The new p o i n t s ' c o o r d i n a t e s are added to the queue. Once the queue i s empty, another region i s s t a r t e d at the next un-l a b e l e d p i x e l i n the image scan. T h i s produces a l a b e l e d region map. Region i n f o r m a t i o n i s c o l l e c t e d i n a second scan of the image. The r e p r e s e n t a t i o n f o r regions i s : 53 Region ( -^reference number> <average image i n t e n s i t y > <area> <perimeter> <elongat ion> <Gaussian a2> ) The e l o n g a t i o n i s c a l c u l a t e d as: E = P 2 / 4irA Where P i s the reg i o n ' s perimeter l e n g t h , and A i s i t ' s area. A c i r c l e has the minimum e l o n g a t i o n E=1.0. Devia-t i o n s from c i r c u l a r i t y i n c r e a s e the e l o n g a t i o n measure. In p r a c t i c e , t e s s e l a t i o n noise e f f e c t s the measure some-what . 4.2. S t r i p s and j u n c t i o n s S t r i p s are b u i l t by extending the s i d e s of a bar seg-ment. The extensions to the bar segment are zero c r o s s i n g segments. A ZCS map, marking the p o i n t s of a l l segments using t h e i r r e f e r e n c e number as a l a b e l , i s used f o r t h i s task. J u n c t i o n s d e s c r i b e groupings of s t r i p endpoints. The s t r i p growing program r e p o r t s the l o c a t i o n s of some j u n c t i o n s , and range sear c h i n g i s used to f i n d o t h e r s . 54 4.2.1. S t r i p s S t r i p growing uses a f i l t e r e d l i s t of bar segments. The f i l t e r excludes those with unacceptable c o n t r a s t range, width, or inner image i n t e n s i t y . A l i s t of zero c r o s s i n g segments that are i n c o r p o r a t -ed i n t o s t r i p s i s maintained. The l i s t i s used to d e t e c t j u n c t i o n s , and to a v o i d examining those bar segments ab-sorbed by p r e v i o u s l y c o n s t r u c t e d s t r i p s . The s t r i p i s ex-panded f i r s t i n one d i r e c t i o n , then the other. If one side of the s t r i p i s s i g n i f i c a n t l y longer, only the short side i s a candidate for e x t e n s i o n . A s i d e i s extended by s e a r c h i n g the ZCS map f o r a ZCS connected to that s i d e at the c u r r e n t endpoint. There are s e v e r a l p o s s i b l e t e r m i n a t i o n c o n d i t i o n s : (0) Unknown - No e x t e n s i o n can be found. (1) Boundary (2) J u n c t i o n ( 3 ) Fork (4) Fanout (5) Fadeout (6) Dead end - The s t r i p has extended to the image boundary. - The new ZCS has been used i n another s t r i p . - A l i n e between the endpoints i n t e r s e c t s a ZCS. - The width of the s t r i p becomes i n c o n s i s t e n t . - The image i n t e n s i t y becomes i n c o n s i s t e n t . - The s i d e s meet. The r e p r e s e n t a t i o n f o r s t r i p s i s : 56 S t r i p ( <reference number> ( <side 1> <side 2> ) <centreline> ( <termination code - Side 1> <termination code - Side 2> ) <image i n t e n s i t y > ( <contrast mean> <contrast variance> ) <Gaussian e2> ( <curvature mean> <curvature variance> ) <length> ( <width mean> <width variance> ) <sense> <segmentation t h r e s h o l d (ZCS)> <fanout parameter> <fadeout parameter> <runout parameter> ) A s i d e i s the l i s t of ZCS ref e r e n c e numbers. The c e n t r e -l i n e i s the l i s t of midpoints of the l i n e .j°i.n:Ln9 the end-p o i n t s of the s t r i p a f t e r each e x t e n s i o n . Constant c u r v a t u r e i s assumed between the ce n t r e p o i n t s of the s t r i p . Maximum cu r v a t u r e between these p o i n t s i s r e l a t e d to the d i s t a n c e between the p o i n t s and to the segmentation t h r e s h o l d . T h i s assumption i s too st r o n g . As a r e s u l t , c u r v a t u r e i s not used i n r e l a t i n g s t r i p s to roads. A more s o p h i s t i c a t e d c u r v a t u r e c a l c u l a -t i o n i s r e q u i r e d . 57 4.2.2. J u n c t i o n s J u n c t i o n s d e s c r i b e an a s s o c i a t i o n of two or more s t r i p s . They are found i n two ways. F i r s t , p o s i t i o n s of " j u n c t i o n " t e r m i n a t i o n s i n s t r i p s are used to form j u n c t i o n s . The p o s i t i o n of the meeting p o i n t of the two s t r i p s i s i n c l u d e d i n the " j u n c t i o n " t e r -mination code. A s t r i p map, s i m i l a r to the ZCS map, i s searched i n the neighbourhood of that p o i n t to determine the component s t r i p s of the j u n c t i o n . A l i s t of these s t r i p ' s r e f e r e n c e numbers and t h e i r c l o s e s t endpoints are put i n t o a new array at the same c o o r d i n a t e s as the junc-t i o n p o i n t . T h i s array i s c a l l e d the j u n c t i o n component map. Second, c l u s t e r s of s t r i p endpoints are used to form j u n c t i o n s . P o i n t s w i t h i n a f i x e d r a d i u s of a s t r i p end-p o i n t are marked i n the j u n c t i o n component map. A p o i n t i n the map i s marked by adding the s t r i p r e f e r e n c e number and c l o s e s t endpoint l o c a t i o n to the l i s t of such p a i r s a l -ready s t o r e d at that p o i n t . A new l i s t i s s t a r t e d at the j u n c t i o n component map l o c a t i o n i f the entry there was p r e v i o u s l y empty. The r e s u l t of t h i s process i s that any 58 p o i n t i n the a r r a y w i l l r e c o r d the p o s s i b l e components of a j u n c t i o n . Many d u p l i c a t e s may e x i s t , and many p o i n t s w i l l r e c o r d o n l y s u b s e t s of the f i n a l j u n c t i o n components. The a r r a y i s scanned t o throw out those e n t r i e s r e -p o r t i n g a s i n g l e component. The r e m a i n i n g e n t r i e s a r e p l a c e d i n a l i s t . D u p l i c a t e s and component s e t s t h a t a r e s u b s e t s of o t h e r components a r e removed from the l i s t . The r e m a i n i n g e n t r i e s a r e used t o form j u n c t i o n s . The r e p r e s e n t a t i o n used f o r j u n c t i o n s i s : J u n c t i o n ( < r e f e r e n c e number> < l o c a t ion> ( <component 1> <component 2> <component 3> <component n> ) <number of components> <radius> ) The l o c a t i o n i s the g e o m e t r i c c e n t r e of the component end-p o i n t s . A component i s a p a i r : ( < s t r i p r e f e r e n c e number> < s t r i p e n d p o i n t c o o r d i n a t e > ) 59 There are s e v e r a l problems encountered i n the compu-t a t i o n of j u n c t i o n s . The two sources of j u n c t i o n tokens are not s u f f i c i e n t to f i n d some types of j u n c t i o n s . Since only groups of endpoints are c o n s i d e r e d i n the range s e a r c h i n g a l g o r i t h m , no "T" j u n c t i o n s are found. One pos-s i b l e s o l u t i o n i s t o i n c l u d e s t r i p c e n t r e l i n e s i n the range search. T h i s would i n c r e a s e range s e a r c h i n g computa-t i o n a l time only s l i g h t l y i f the a l g o r i t h m d i d not search away from the c e n t r e l i n e s , j u s t checking f o r the p r o x i m i t y of an endpoint to a c e n t r e l i n e . A second problem occurs where many s t r i p s meet i n a small a r e a . t h i s r e s u l t s in s e v e r a l j u n c t i o n s being r e -p o r t e d . In some cases these j u n c t i o n s are separate i n t e r -s e c t i o n s i n the scene, but o f t e n they are not. The d i f f i -c u l t y i s that the c e n t r e of the j u n c t i o n i s the geometric c e n t r e of the s t r i p endpoints. An a l g o r i t h m that e v a l u a t e d the image i n the area of the j u n c t i o n to decide a "best" l o c a t i o n c o u l d provide enough i n f o r m a t i o n to group the i n -t e r s e c t i o n j u n c t i o n s . T h i s approach, however, steps out-s i d e the formalism being f o l l o w e d . A second p o s s i b i l i t y i s the use of zero c r o s s i n g i n f o r m a t i o n from other s p a t i a l frequency channels to h e l p i n grouping j u n c t i o n s . CHAPTER 5 EXPERIMENTAL RESULTS The system was t e s t e d on three d i f f e r e n t images. The output i n each case i s a d e s c r i p t i o n of the s t r i p s and j u n c t i o n s which correspond to s e c t i o n s of lo g g i n g roads and t h e i r i n t e r s e c t i o n s . The c e n t r e l i n e s of these s t r i p s , and the connections made by j u n c t i o n s are g r a p h i c a l l y d i s p l a y e d i n s e c t i o n 2. S e c t i o n 1 d i s c u s s e s the r e s u l t s . 5.1. D i s c u s s i o n The system was able to c o r r e c t l y i n t e r p r e t most of the l o g g i n g roads i n the images used. E r r o r s made i n c l u d e m i s - i n t e r p r e t i n g some f e a t u r e s as roads, and f a i l i n g to i d e n t i f y some roads. Most of the e r r o r s made by the system are of omis-s i o n . A common reason f o r a road not being found i s that i t i s too narrow. The main roads i n the images are found, but secondary t r a c k s with widths of one or two p i x e l s are not found. These roads are s i n g l e - lane, and have no c l e a r e d r i g h t of way. The narrow width of these roads 60 61 causes the zero c r o s s i n g s produced by the V 2G operator to be d i s p l a c e d outward from the a c t u a l road edges. The i n t e n s i t y measure of s t r i p s corresponding to these roads i n c l u d e s measurements of pa r t of the f o r e s t areas border-ing the roads. T h i s makes the i n t e n s i t y too low to i n t e r -p r e t the f e a t u r e as a l o g g i n g road. S e v e r a l roads i n the lower r i g h t corner of the area shown i n f i g u r e 15 are not found because of t h i s problem. Another common d i f f i c u l t y i s the d e t e c t i o n of road s e c t i o n s which pass through cutover areas. Landings and s k i d roads which j o i n a main road as i t passes through the c u t t i n g area causes poor d e f i n i t i o n of the s i d e s of the road. T h i s p r e c l u d e s the p o s s i b i l i t y of strong bar seg-ments along those road s e c t i o n s . Without these cues, the road s e c t i o n i s missed by the s t r i p d e t e c t o r . F i g u r e s 18 and 19 present a good example of t h i s d i f f i c u l t y . The road s e c t i o n i n the lower l e f t p a r t of the c e n t r a l cutover area has not been d e t e c t e d . The o p p o s i t e problem sometimes occurs when a str o n g edge i s a l i g n e d with a weak edge, and the p a i r forms a bar. Although the c o n t r a s t step on one s i d e of the bar w i l l be s m a l l , i t may be enough f o r the bar to become a 62 road cue. The r i g h t edge of the c u t t i n g area i n f i g u r e s 14 and 15 i s an example. The c o n t r a s t step on the l e f t s i d e of t h i s s t r i p i s l a r g e enough to i n c l u d e as a road. T h i s seems unavoidable. The v a r i e t y of areas that roads pass through produces a wide c o n t r a s t range. Some non-road f e a t u r e s are mistakenly i n t e r p r e t e d as roads. S i m i l a r i t y i n the appearance of these f e a t u r e s and the appearance of roads causes the system to i d e n t i f y them as roads. Dry stream beds and g r a v e l r i v e r banks are long, s t r i p - l i k e , and l i g h t toned. I f t h e i r width i s the same as the width of a road, they are e a s i l y mistaken as roads. A good example i s the c e n t r a l r i v e r bed i n f i g u r e s 14 and 15. A power l i n e running h o r i z o n t a l l y i n the c e n t r e of f i g u r e s 16 and 17 i s a l s o i n t e r p r e t e d as a road. The l o c a t i o n s of i n t e r s e c t i o n s are not p r e c i s e l y determined by the system. The c e n t r e of an i n t e r s e c t i o n i s the geometric c e n t r e of the endpoints of the s t r i p cen-t r e l i n e s which meet th e r e . I n t e r s e c t i o n s are shown i n the graphic output below by j o i n i n g the c e n t r e p o i n t of the i n t e r s e c t i o n d i r e c t l y to the endpoints of the component road s e c t i o n s . T h i s causes some of the marked road c e n t r e -l i n e s to s t r a y e n t i r e l y o u t s i d e the a c t u a l roads. T h i s can 63 be seen i n the upper r i g h t area of f i g u r e 15. Some i n t e r s e c t i o n s are not found at a l l . I f the s t r i p s corresponding to the roads which meet at an i n t e r -s e c t i o n do not touch, so that a j u n c t i o n t e r m i n a t i o n i s r e p o r t e d , or i f they do not stop w i t h i n a small d i s t a n c e of one another, so that a c l u s t e r of endpoints i s d e t e c t e d , then no i n t e r s e c t i o n i s found. In f i g u r e 15, a l a r g e l o a d i n g area at the bottom of the cut-over area causes the d e f i n i t i o n of the edges of the three roads which meet there to be poor. S t r i p s corresponding to the roads have not been extended f a r enough i n t o the l o a d i n g area f o r the s t r i p s to meet or to c l u s t e r . The same con-d i t i o n occurs i n the upper l e f t corner of f i g u r e 19. 5.2. Graphic output Roads found i n three t r i a l s of the system are presented i n the f i g u r e s below. 64 F i g u r e 15 The c e n t r e l i n e s of roads are shown i n b l a c k . 66 gure 16 The o r i g i n a l image from the second t r i a l of the sys-tem. F i g u r e 17 The c e n t r e l i n e s of roads are shown in b l a c k . 68 69 CHAPTER 6 CONCLUSIONS The r e s u l t s show a way i n which the p r i m a l sketch d e s c r i p t i o n of an image may be used to reason about the s t r u c t u r a l aspects of c u r v i l i n e a r f e a t u r e s . Three ques-t i o n s are d i s c u s s e d i n t h i s c h a p t e r : What c l a s s of f e a t u r e s are w e l l s u i t e d to t h i s approach? Is the primal sketch the a p p r o p r i a t e l e v e l of r e p r e s e n t a t i o n f o r t h i s type of reasoning? What resea r c h i s s u e s are r e l e v a n t to the use of s t r u c t u r a l reasoning? 6.1. Features and s t r u c t u r e Roads, power l i n e s and r a i l w a y s are l a r g e l y d e s c r i b e d i n terms of t h e i r composition and s t r u c t u r a l r e l a t i o n s h i p s with other f e a t u r e s . The computer implementation d e s c r i b e d above uses t h i s l e v e l of r e p r e s e n t a t i o n to f i n d roads i n a e r i a l images. 70 71 The present approach c o u l d be a p p l i e d w i t h i n t h i s domain to other s t r i p - l i k e f e a t u r e s such as r a i l w a y l i n e s and power l i n e s . The e r r o r s made by the system would remain the same: the scene elements i d e n t i f i e d by the sys-tem are those whose v i s u a l p r o p e r t i e s s a t i s f y the con-s t r a i n t s imposed by the knowledge used. I t i s premature to l a b e l the tokens produced by the system as "road s e c t i o n " or "power l i n e " u n l e s s the i n t e r r e l a t i o n s h i p s of scene elements have been examined. Tokens produced at t h i s l e v e l are i n h e r e n t l y ambiguous. Programs s i m i l a r to t h i s one c o u l d be employed i n a number of image understanding problem domains. A few pos-s i b i l i t i e s a re: F i n d i n g b u i l d i n g s i n a e r i a l or t e r r e s t r i a l images of urban areas. L i k e roads, b u i l d i n g s are d e s c r i b e d l a r g e l y by t h e i r s t r u c t u r a l p r o p e r t i e s . T a v a k o l i and Rosenfeld [ T a v a k o l i & Rosenfeld 82] use simple d e s c r i p t i o n s of b u i l d i n g s and roads based on the spa-t i a l arrangement of edge segments, although not from a p r i m a l sketch. P a i r s of edges with o p p o s i t e con-t r a s t are cues f o r roads. Four r e c t a n g u l a r edges with a dark i n s i d e are cues f o r b u i l d i n g s . More e x p l i c i t 72 models would be p o s s i b l e i n h i g h - r e s o l u t i o n imagery. Understanding indoor house or o f f i c e scenes. S t r u c -t u r a l models of f u r n i t u r e , windows, or doorways c o u l d be developed and corresponding tokens d e r i v e d from the p r i m a l sketch. L i k e the example above, t h i s would r e q u i r e matching d e t a i l e d s t r u c t u r a l models to a p r i -mal sketch of a h i g h - r e s o l u t i o n image of t h i s type of scene. Matching a e r i a l or s a t e l l i t e images to maps. T h i s would i n v o l v e matching known f e a t u r e s to elements d e r i v e d from the p r i m a l sketch. L i t t l e [ L i t t l e 80] Shows how l i n e elements found i n LANDSAT images may be matched to f e a t u r e s d e r i v e d from a d i g i t a l e l e v a -t i o n model. The matching i s used f o r automatic r e c -t i f i c a t i o n of the image to the model. A s i m i l a r matching might be attempted between scene elements d e r i v e d from the p r i m a l sketch and known map f e a t u r e s . Shape d e s c r i p t i o n s p r o v i d e a v i s i o n system with powerful cues f o r o b j e c t r e c o g n i t i o n . Reasoning from the pr i m a l sketch can produce many of those cues. To accom-73 p l i s h t h i s task, the e f f e c t s of the s t r u c t u r a l p r o p e r t i e s of scene f e a t u r e s must be p r e d i c t a b l e i n the language of the p r i m a l sketch. Bottom-up b u i l d i n g of shape tokens i s a l s o p o s s i b l e from the p r i m a l sketch. 6.2. The p r i m a l sketch i s a p p r o p r i a t e The p r i m a l sketch i s a s u i t a b l e l e v e l of r e p r e s e n t a -t i o n f o r reasoning about the s t r u c t u r a l p r o p e r t i e s of f e a t u r e s . T h i s means that computational problems faced by a reasoning system are f a c i l i t a t e d by the r e p r e s e n t a t i o n , and that the r e p r e s e n t a t i o n i s d e s c r i p t i v e l y adequate for s t r u c t u r a l d e s c r i p t i o n s of the elements of i n t e r e s t . The f i r s t of these c o n d i t i o n s i s demonstrated by the r e s u l t s of the computer implementation d e s c r i b e d above. S t i l l more work must be done to decide i f the p r i m a l sketch i s d e s c r i p t i v e l y adequate. There i s some evidence that the p r i m a l sketch c o n t a i n s a l l the i n f o r m a t i o n i n the image. If t h i s i s the case, one can e a s i l y argue that the p r i m a l sketch i s adequate. To be a p p r o p r i a t e , there must be a reason to choose to c a r r y out t h i s type of reasoning process using the p r i -mal sketch. There are two p e r s p e c t i v e s on t h i s q u e s t i o n . 74 If the goal of the re s e a r c h i s to sol v e problems i n v i -s i o n , then the method i s a p p r o p r i a t e simply because i t can be s u c c e s s f u l l y a p p l i e d . From the p e r s p e c t i v e of Marr's theory of Human v i s i o n , reasoning from model knowledge or the formation of s t r u c t u r a l elements i s done at the l e v e l of the f u l l 2 1/2-D sketch. T h i s makes the approach taken here s l i g h t l y premature. The r e s u l t s achieved i n d i c a t e the type of reasoning process that must be c a r r i e d out at the l e v e l of the f u l l 2 1/2-D sketch. 6 . 3 . Research i s s u e s The problem of forming s t r u c t u r a l d e s c r i p t i o n s from image r e p r e s e n t a t i o n s , e i t h e r bottom-up or top-down, presents the f o l l o w i n g r e s e a r c h i s s u e s : 1. A language f o r s t r u c t u r a l d e s c r i p t i o n should be developed. T h i s language would provide a uniform means of exp r e s s i n g knowledge about the s t r u c t u r e or shape of o b j e c t s , as w e l l as a r e p r e s e n t a t i o n f o r elements d e r i v e d from the p r i m a l or 2 1/2-D sk e t c h . 2. A p r o c e d u r a l formalism f o r reasoning about image f e a t u r e s using knowledge of shape or s t r u c t u r e should be developed as a t o o l to f a c i l i t a t e experiments i n 75 t h i s type of reasoning. Most e x i s t i n g v i s i o n systems have independently developed a s t r a t e g y f o r r e p r e s e n t i n g and using s t r u c t u r a l r e l a t i o n s h i p s . These u s u a l l y are w e l l s u i t e d to the f e a t u r e domain of the system, but are d i f f i c u l t to apply to other domains. A more gen e r a l approach i s r e q u i r e d . 3. The problem of reasoning about the c o n t e x t u a l r e l a -t i o n s h i p s between scene o b j e c t s should continue to be an important r e s e a r c h i s s u e . The a b i l i t y to reason about the many r e l a t i o n s h i p s amongst o b j e c t s and p a r t s of o b j e c t s i s c r u c i a l i f image understanding i s ever to be achieved. BIBLIOGRAPHY [Ambler & Popplestone 75] Ambler, A. P. and Popplestone, R. J . , " I n f e r r i n g the P o s i t i o n s of Bodies from S p e c i f i e d S p a t i a l R e l a t i o n s h i p s " , A r t i f i c i a l I n t e l l i g e n c e , V o l 6, PP157-174, 1975. [Bajcsy 73] Bajcsy, R., "Computer i d e n t i f i c a t i o n of v i s u a l s u r f a c e s " Computer Graphics and Image P r o c e s s i n g - 2, pp1 1.8-130, October 1973. [Bajcsy & T a v a k o l i 73] Bajcsy, R. and T a v a k o l i , M. , "A Computer Recogni-t i o n of B r i d g e s , I s l a n d s , R i v e r s and Lakes from S a t e l l i t e P i c t u r e s " , Machine P r o c e s s i n g of Re-motely Sensed Data, Purdue, 1973. [Bajcsy & T a v a k o l i 76] Bajcsy, R. and T a v a k o l i , M., "Computer Recogni-t i o n of Roads from S a t e l l i t e P i c t u r e s " , Second  I n t e r n a t i o n a l J o i n t Conference on P a t t e r n Recog-n i t i o n , Copenhagen, 1976. [Bajcsy & Rosenthal 77] Bajcsy, R. and Rosenthal, D., "What One Can See From D i f f e r e n t A l t i t u d e s - A H e i r a r c h i c a l C o n t r o l S t r u c t u r e i n Computer V i s i o n " , P a t t e r n Recogni-t i o n and Image P r o c e s s i n g , 1977. [ B a l l a r d 79] B a l l a r d , D. H., " S t r i p Trees : A H e i r a r c h i c a l R e p r e s e n t a t i o n f o r Map Fe a t u r e s " , P a t t e r n Recog-n i t i o n and Image P r o c e s s i n g , pp278-285, August 1979. [ B r i c e & Fennema 70] B r i c e , C. R. and Fennema, C. L., "Scene A n a l y s i s Using Regions", A r t i f i c i a l I n t e l l i g e n c e 1, 3, pp205-226, 1970. [ C a t a n z a r i t i & Mackworth 78] C a t a n z a r i t i , E. and Mackworth, A. K., " F o r e s t s and Pyramids: Using Image H e i r a r c h i e s to Under-stand LANDSAT Images", U n i v e r s i t y of B r i t i s h Columbia Computer Science T e c h n i c a l Report 78-2, March 1978. [Conway 76] Conway, S., Logging P r a c t i c e s : P r i n c i p l e s of  Timber H a r v e s t i n g Systems, M i l l e r Freeman P u b l i -c a t i o n s , 1976. [Cormack 71 ] Cormack, R. M., "A Review of C l a s s i f i c a t i o n " , The  J o u r n a l of the Royal S t a t i s t i c a l S o c i e t y 134, 3, pp321-367, 1971. [Davis 75] Davis, L. S., "A Survey of Edge D e t e c t i o n Tech-niques", Computer Graphics and Image P r o c e s s i n g 4, pp248-270, September 1975. [Duda & Hart 73] Duda, R. and Hart, P., P a t t e r n C l a s s i f i c a t i o n and  Scene A n a l y s i s , J . Wiley & Sons, New York, 1973. [ F i s c h l e r , Agin, Barrow, B o l l e s , Quam, Tenenbaum & Wolf 79] F i s c h l e r , M. A., Agin, G. J . , Barrow, H. G., B o l l e s , R. C , Quam, L. H., Tenenbaum, J . M., and Wolf, H. C , "The SRI Image Understanding Pro-gram", Proceedings Image Understanding Workshop, A p r i l 1979. [ F i s c h l e r , Tenenbaum & Wolf 81] F i s c h l e r , M. A., Tenenbaum, J . M., and Wolf, H. C , "Detection of Roads and L i n e a r S t r u c t u r e s i n Low-Resolution A e r i a l Imagery Using a M u l t i s o u r c e Knowledge I n t e g r a t i o n Technique", Computer Graph-i c s and Image P r o c e s s i n g 15, pp20l-223, 1981. [Glicksman 82] Glicksman, A. J . , "A Scemata-Based System f o r U t i l i z i n g Cooperating Knowledge Sources in Com-puter V i s i o n " , Proceedings of the Fourth B i e n n i a l  Conference, CSCSI / SCEIO, pp33-40, May 1982. [Havens 78] Havens, W. S., "A Pr o c e d u r a l Model of Re c o g n i t i o n for Machine P e r c e p t i o n " , Proceedings of the  Second N a t i o n a l Conference, CSCSI / SCEIO, pp. 254-262, 1978. [ H i l d r e t h 80] H i l d r e t h , E. C , "Implementation of a Theory of Edge D e t e c t i o n " , MIT AI-TR-579, MIT AI Lab., A p r i l 1980. [Horn 73] Horn, B. K. P., "On L i g h t n e s s " , MIT AI Memo 295, October, 1973. [Horn 75] Horn, B. K. P., Ob t a i n i n g Shape from Shading In-formation, i n The Psycology of Computer V i s i o n , P. H. Winston (ed.), pp93-113, McGraw-Hill, New York, 1972. [Horn 77] Horn, B. K. P., "Understanding Image I n t e n s i -t i e s " , A r t i f i c i a l I n t e l l i g e n c e 8, 2, pp2 0 l - 2 3 l , A p r i l 1977. [Horn & Sjoberg 78] Horn, B. K. P. and Sjoberg, R. W., " C a l c u l a t i n g the R e f l e c t a n c e Map", MIT AI Memo 498, October 1 978. [Hwang, Lee & H a l l 79] Hwang, J . J . , Lee, C. C , and H a l l , E. L., "Seg-mentation of S o l i d Objects Using G l o b a l and L o c a l Edge C o i n c i d e n c e s " , P a t t e r n R e c o g n i t i o n and Image  Pr o c e s s i n g 1979, pp114-121. [Keser 76] Keser, N. I n t e r p r e t a t i o n of Landforms from A e r i a l  Photographs, B r i t i s h Columbia F o r e s t S e r v i c e , V i c t o r i a , B.C., 1976. [ L i t t l e 80] L i t t l e , J . J . , "Automatic R e c t i f i c a t i o n of LANDSAT Images Using Features Derived from D i g i -t a l T e r r a i n Models", TR 80-10, U n i v e r s i t y of B r i t i s h Columbia, Sept. 1980. [Mackworth 76] Mackworth, A. K., "Model Driven I n t e r p r e t a t i o n i n I n t e l l i g e n t V i s i o n Systems", Percept ion, V o l 5, pp349-370, 1976. [Mackworth 77] Mackworth, A. K., "On Reading Sketch Maps", Proceedings of the F i f t h I n t e r n a t i o n a l J o i n t  Conference on A r t i f i c i a l I n t e l l i g e n c e , 1977. [Mackworth & Havens 81] Mackworth, A. K. and Havens, W. S., " S t r u c t u r i n g Domain Knowledge f o r V i s u a l P e r c e p t i o n " , Proceed-ings of The Seventh I n t e r n a t i o n a l J o i n t Confer-ence on A r t i f i c i a l I n t e l l i g e n c e , Vancouver, 1981. [Marr 74a] Marr, D., "An Essay on the Primate R e t i n a " , MIT AI Memo #296, 1974. [Marr 74b] Marr, D., "The Rec o g n i t i o n of Sharp, C l o s e l y Spaced Edges", MIT AI Memo #326, 1974. [Marr 76] Marr, D., " A r t i f i c i a l I n t e l l i g e n c e - A Personal View", MIT AI Memo #355, 1976. [Marr & Poggio 76] Marr, D. and Poggio, T., "From Understanding Com-pu t a t i o n to Understanding Neural C i r c u i t r y " , MIT AI Memo #357, 1976. [Marr 79] Marr, D., " V i s u a l Information P r o c e s s i n g " , Proceedings of the S i x t h I n t e r n a t i o n a l J o i n t  Conference on A r t i f i c i a l I n t e l l i g e n c e , Tokyo, pp1108-1126, 1979. [Marr & H i l d r e t h 80] Marr, D. and H i l d r e t h , E., "Theory of Edge Detec-t i o n " , Proceedings of the Royal S o c i e t y of Lon-don, B 207, ppl87-217, 1980. [Marr 82] Marr, D., V i s i o n , W. H. Freeman & co. San Fran-c i s c o , 1 982. [Nevatia & Babu 79] N e v a t i a , R. and Babu, K. R., "Linear. Feature Ex-t r a c t i o n and D e s c r i p t i o n " , Proceedings of the  S i x t h I n t e r n a t i o n a l J o i n t Conference on A r t i f i -c i a l I n t e l l i g e n c e , Tokyo, pp639-641, 1979. [Nevatia & P r i c e 82] N e v a t i a , R. and P r i c e , K. E., " L o c a t i n g S t r u c -t u r e s i n A e r i a l Images", IEEE P a t t e r n A n a l y s i s  and Machine I n t e l l i g e n c e , V o l . PAMI-4, No. 5, pp476-484 r September 1982. [ N e v a t i a , P r i c e & V i l n r o t t e r 79] N e v a t i a , R., P r i c e , K. E., and V i l n r o t t e r , F., " D e s c r i b i n g N a t u r a l Textures", Proceedings Image  Understanding Workshop, A p r i l 1979. [Nicodemus, Richmond, H i s a , Ginsberg, & Limperis 77] Nicodemus, F. E., Richmond, J . C , H i s a , J . J . , Ginsberg, I. W., and L i m p e r i s , T., "Geometrical C o n s i d e r a t i o n s and Nomenclature for R e f l e c t a n c e " , NBS Monograph 160, U n i t e d S t a t e s N a t i o n a l Bureau of Standards, U. S. Department of Commerce, Wash-ingt o n , D.C, 1977. [ P r a t t 78] P r a t t , W. K., D i g i t a l Image P r o c e s s i n g , J . Wiley & Sons, Toronto, 1978. [Quam 78] Quam, L. H., "Road T r a c k i n g and Anomaly Detec-t i o n " , Image Understanding Workshop, pp5l-55, May 1978. [Roberts 65] Roberts, L. G., "Machine P e r c e p t i o n of Three D i -mensional S o l i d s " , i n O p t i c a l and E l e c t r o - o p t i c a l Information P r o c e s s i n g , J.T. T i p p e t t , D. Ber-kowitz, L. Clapp, C. Koester, & A. Vanderburgh ( e d s . ) , MIT Press, Cambridge, Mass. ppl59-l97, 1965. [Russel 79] R u s s e l , D. M., "Where Do I Look Now? : Modeling and I n f e r r i n g Object L o c a t i o n s by C o n s t r a i n t s " , P a t t e r n Recognition and Image Pr o c e s s i n g 1979, PP175-181. [ S t a r r & Mackworth 78] Starr,. D. W. and Mackworth, A. K., " E x p l o i t i n g S p e c t r a l , S p a t i a l , and Semantic C o n s t r a i n t s i n the Segmentation of LANDSAT Images", U n i v e r s i t y of B r i t i s h Columbia Computer Science T e c h n i c a l Report 78r1, Febuary 1978. [Swain & Davis 78] Swain, P. H. and Davis, S. M. ( e d s . ) , Remote Sensing: The Q u a n t i t a t i v e Approach, McGraw-Hill., New York, 1978. [Tamura, Mori & Yanawaki 78] Tamura, H., Mori, S., and Yanawaki, T., " T e x t u r a l Features Corresponding to V i s u a l P e r c e p t i o n " , IEEE T r a n s a c t i o n s SMC-8, pp460-473, June 1978. [ T a v a k o l i & Rosenfeld 80] T a v a k o l i , M. and Rosenfeld, A., "Towards the R e c o g n i t i o n of B u i l d i n g s and Roads on A e r i a l Pho-tographs", U n i v e r s i t y of Maryland TR913, 1980. [ T a v a k o l i & Rosenfeld 82] T a v a k o l i , M. and Rosenfeld, A., " B u i l d i n g and Road E x t r a c t i o n from A e r i a l Photographs", IEEE  T r a n s a c t i o n s SMC-12, January/Febuary 1982. [Tenenbaum, Barrow, B o l l e s , F i s c h l e r & Wolf 79] Tenenbaum, J . M. , Barrow, H. G., B o l l e s , R. C , F i s c h l e r , M. A., and Wolf, H. C , "Map Guided In-t e r p r e t a t i o n of Remotely-Sensed Imagery", P a t t e r n  R e c o g n i t i o n and Image P r o c e s s i n g 1979, pp6l0-619. [Vanderbrug 75a] Vanderbrug, G. J . , "Line D e t e c t i o n i n S a t e l l i t e Imagery", U n i v e r s i t y of Maryland T e c h n i c a l Report TR371, 1975. [Vanderbrug 75b] Vanderbrug, G. J . , "Experiments i n I t e r a t i v e Enhancement of L i n e a r Features", U n i v e r s i t y of Maryland TR425, 1975. [Waltz 72] Waltz, D. L., "Generating Semantic D e s c r i p t i o n s from Drawings of Scenes with Shadows", T e c h n i c a l - n o t e TR-271, AI Lab, MIT, November 1972. [Waltz 75] Waltz, D. L., "Understanding L i n e Drawings of Scenes with Shadows", i n The Psycology of Comput-er V i s i o n , P. H. Winston (ed.), ppl9-91. McGraw-H i l l , New York, 1975. [Weszka & Dyer 76] Weszka, J . S. Dyer, C. R., and Rosenfeld, A., "A Computer Study of Texture Measures f o r T e r r a i n C l a s s i f i c a t i o n " , IEEE T r a n s a c t i o n s SMC 6, pp269-285, A p r i l 1976. [Woodham 80] Woodham, R. J . , "Using D i g i t a l T e r r a i n Data to Model Image Formation i n Remote Sensing", Proceedings of the S o c i e t y of P h o t o - O p t i c a l Instrumentation Engineers, V o l . 238, pp361-369, 1980. [Woodham 81] Woodham, R. J . , "Analyzing Images of Curved Sur-f a c e s " , A r t i f i c i a l I n t e l l i g e n c e 17, pp117-141, 1981. 

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0051819/manifest

Comment

Related Items