Open Collections

UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Representation and control in program that understands line sketches of houses Mulder, Jan A. 1979

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
831-UBC_1979_A6_7 M84.pdf [ 5.69MB ]
Metadata
JSON: 831-1.0051802.json
JSON-LD: 831-1.0051802-ld.json
RDF/XML (Pretty): 831-1.0051802-rdf.xml
RDF/JSON: 831-1.0051802-rdf.json
Turtle: 831-1.0051802-turtle.txt
N-Triples: 831-1.0051802-rdf-ntriples.txt
Original Record: 831-1.0051802-source.json
Full Text
831-1.0051802-fulltext.txt
Citation
831-1.0051802.ris

Full Text

REPRESENTATION AND CONTROL IN a PROGRAM THAT UNDERSTANDS LINE SKETCHES OF HOUSES fay JAN A. MULDER B.A. (Psychology), U n i v e r s i t y of L e i d e n , The Netherlands, 1974 A THESIS SOBBITTED IN PARTIAL FULFILLBENT OF THE HEQUIREMENTS FOB THE DEGREE OF HASTER OF SCIENCE i n THE FACULTY OF GRADUATE STUDIES {DEPARTMENT OF COMPUTER SCIENCE) He accept t h i s t h e s i s as conforming to the r e q u i r e d standard. THE UNIVERSITY OF BRITISH COLUMBIA February, 1979 (c) Jan A. Mulder, 1979 I n p r e s e n t i n g t h i s t h e s i s i n p a r t i a l f u l f i l m e n t o f t h e r e q u i r e m e n t s f o r an a d v a n c e d d e g r e e a t t h e U n i v e r s i t y o f B r i t i s h C o l u m b i a , I a g r e e t h a t t h e L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e a n d s t u d y . I f u r t h e r a g r e e t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y p u r p o s e s may be g r a n t e d by t h e Head o f my D e p a r t m e n t o r by h i s r e p r e s e n t a t i v e s . I t i s u n d e r s t o o d t h a t c o p y i n g o r p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l n o t be a l l o w e d w i t h o u t my w r i t t e n p e r m i s s i o n . D e p a r t m e n t o f —Computer Science— The U n i v e r s i t y o f B r i t i s h C o l u m b i a 2 0 7 5 W e s b r o o k P l a c e V a n c o u v e r , C a n a d a V6T 1W5 n a t P 21-2-79 i i ABSTRACT In t h i s t h e s i s , a program, HOUSE, i s d e s c r i b e d t h a t can i n t e r p r e t l i n e s k etches of houses and other p o l y h e d r a l o b j e c t s . The program i s part of the SEE p r o j e c t , a sketch understanding p r o j e c t a t the U n i v e r s i t y of B r i t i s h Columbia. In t h i s r e s p e c t HOUSE i s a g e n e r a l i z a t i o n of HAPSEE, a program t h a t can i n t e r p r e t l i n e sketches of geographic maps. The most important g o a l of HOUSE i s to t e s t the adequacy and e f f i c i e n c y of the program's c o n t r o l s t r u c t u r e which i s based on a h e l i c a l metaphor f o r the p e r c e p t u a l p r o c e s s . Such a h e l i c a l metaphor i s based on a s t r a t i f i e d i n t e r p r e t a t i o n process with a bottom-up, p a s s - o r i e n t e d c o n t r o l s t r u c t u r e . HOUSE tak e s ( l i k e MAPSEE) as i n p u t a p l o t program by means of which the sketch can be d i s p l a y e d on a g r a p h i c a l d i s p l a y d e v i c e . The program subsequently d e s c r i b e s the sketch at d i f f e r e n t l e v e l s of r e p r e s e n t a t i o n . These l e v e l s can be c a t e g o r i z e d i n two domains: a p i c t u r e domain c o n s i s t i n g of two-dimensional r e p r e s e n t a t i o n s of the sketch and a scene domain c o n s i s t i n g of t h r e e - d i m e n s i o n a l r e p r e s e n t a t i o n s . P r o c e s s i n g i n the p i c t u r e domain i s dominated by a segmentation process t h a t r e s u l t s i n t h r e e d i f f e r e n t types of r e p r e s e n t a t i o n s . In c o n t r a s t with MAPSEE, HOUSE a l s o maintains d i f f e r e n t l e v e l s of r e p r e s e n t a t i o n i n the scene domain. Each of these l e v e l s r e p r e s e n t s the sketch at a d i f f e r e n t degree of a b s t r a c t i o n . The c y c l e of p e r c e p t i o n with i t s f o u r s t a g e s s e r v e s as a metaphor f o r d e s c r i b i n g the i n t e r p r e t a t i o n process at each of these l e v e l s . Cues are f i r s t formed by the segmentation process (cue d i s c o v e r y ) ; these cues (the v e r t i c e s ) suggest p o s s i b l e i n t e r p r e t a t i o n s f o r the p r i m i t i v e s (the edges) at the lowest l e v e l i n the i n t e r p r e t a t i o n domain (model i n v o c a t i o n ) ; these i n t e r p r e t a t i o n s are taken by a network c o n s i s t e n c y a l g o r i t h m t h a t t e s t s the g l o b a l c o n s i s t e n c y among these i n t e r p r e t a t i o n s (model t e s t i n g and e l a b o r a t i o n ) . I n t e r p r e t a t i o n s t h a t are p a r t of a g l o b a l l y c o n s i s t e n t d e s c r i p t i o n of the s k e t c h then s e r v e as cues f o r the next higher l e v e l where the c y c l e i s r e p e a t e d . T h i s process c o n t i n u e s u n t i l the sketch i s d e s c r i b e d a t the h i g h e s t p o s s i b l e l e v e l of a b s t r a c t i o n . S e v e r a l examples were run with the program. Apart from a number of d e s i r a b l e f e a t u r e s , these t e s t s showed two important weaknesses o f the h e l i c a l metaphor: i t s i n a b i l i t y to account f o r incomplete l i n e sketches, and i t s i n a b i l i t y t o impose top-down c o n s t r a i n t s . These weaknesses l e d t o the f o r m u l a t i o n of a more powerful metaphor: the m u l t i - h e l i x . Among oth e r t h i n g s , t h i s m u l t i - h e l i x allows m u l t i p l e access to the d i f f e r e n t l e v e l s of r e p r e s e n t a t i o n i n both a bottom-up and top-down d i r e c t i o n . A comparison of the m u l t i - h e l i x metaphor with s e v e r a l other p e r c e p t u a l p r o c e s s i n g aetaphors argues f o r the former*s s u p e r i o r a b i l i t y t o account f o r c e r t a i n c h a r a c t e r i s t i c s of the p e r c e p t u a l process. Consequently, the m u l t i - h e l i x can be seen as another step towards a theory of machine p e r c e p t i o n . i i i TABLE OF CONTENTS 1: I n t r o d u c t i o n and re a d i n g guide ............................ 1 2: Three preambles to HODSE 8 2-1 Preamble 1: C h a r a c t e r i s t i c s of the p e r c e p t u a l process . 8 2.2 Preamble 2: The SEE p r o j e c t and i t s g o a l s ............ 11 2.3 Preamble 3: MAPSEE and the c y c l e of p e r c e p t i o n ....... 29 2.4 Summary ..............................................34 3: Goals and overview of BOOSE .............................. 36 3.2 Overview of HOUSE-..--..----.-------- - .......... 39 4s D e t a i l e d d e s c r i p t i o n of the program ...................... 44 4.1 Input 44 4.2 P r o c e s s i n g i n the p i c t u r e domain ..................... 44 4.2-1 The Sketch l e v e l 45 4.2.2 The L i n e / r e g i o n l e v e l ...........................45 4.2.3 The Vertex l e v e l ............................... 53 4.2.3.1 The vertex f o r m a t i o n procedures ....... 53 4.2.3.2 Besegmentation of the c h a i n s .......... 57 4.2.3.3 C o n s t r a i n i n g the p r i m i t i v e s ........... 61 4.2.3.4 Besegmentation of the r e g i o n s ......... 62 4.2.3.5 Shapes o f r e g i o n s ..................... 64 4-2.3.6 Begion r e l a t i o n s h i p s .................. 64 4.3 P r o c e s s i n g i n the scene domain ....................... 65 4.3.1 The Edge l e v e l 68 4.3.2 The O r i e n t a t i o n l e v e l .......................... 72 4.3.3 The Surface naming l e v e l ....................... 78 4.3.4 The Object l e v e l ............................... 84 4.4 output - - - - - 87 5: Experiments and r e s u l t s ,8.9 6 z D i s c u s s i o n . . . . . . . . . . . . . . . . . . . a . . . . . . . . . . . . . . . . . . . . . . ' - . — . • 95 6.1 Favourable f e a t u r e s of HOUSE .......................... 95 6.2 Problems i n HOUSE ................................... 101 6.3 Bhat HODSE cannot do ................................ 105 7: h M u l t i - h e l i x as a metaphor f o r the p e r c e p t u a l p r o c e s s . , . 1 1 9 7.1 The metaphor ..........,....-......-..--..-----.-.-.-110 7.2 Some examples ...-.,.....,-..-..-..-.,,,-,.,,-»-.».., 112 7.2.1 M u l t i p l e a c c e s s i b i l i t y of l e v e l s with. top-down imposing of i n t e r p r e t a t i o n s .......... 112 7.2.2 I n t e r p r e t a t i o n - d r i v e n segmentation ............115 8: Towards a theory of machine p e r c e p t i o n .,...,,,---....--.119 i v LIST OF TABLES Table 1: C u e / i n t e r p r e t a t i o n T a b l e f o r the Edge l e v e l ........ 69 Tab l e 2: C u e / i n t e r p r e t a t i o n T a b l e f o r the O r i e n t a t i o n l e v e l - 73 Table 3: C u e / i n t e r p r e t a t i o n Table f o r the Surface naming L. . 79 Table 4: C u e / i n t e r p r e t a t i o n T a b l e f o r the O b j e c t l e v e l ...... 87 Table 5: Experimental r e s u l t s ............................... 92 V LIST OF FIGUSES F i g u r e 1 z An incomplete cube ................................ 3 F i g u r e 2: The c y c l e c f p e r c e p t i o n .......................... 10 F i g u r e 3: A house and i t s p a r t s ............................ 15 F i g u r e 4: Sketch o f an i s l a n d 28 Fi g u r e 5: A house drawn i n t h r e e degrees of accuracy ....... 39 F i g u r e 6: The HOUSE c o n t r o l s t r u c t u r e ...................... 43 F i g u r e 7: The l i n e segmentation process .................... 46 F i g u r e 8: Begicn segmentation o f the house example ......... 49 F i g u r e 9: The v e r t i c e s used i n HOUSE ....................... 54 F i g u r e 10: V e r t i c e s found i n the house example .............. 58 F i g u r e 11: The r e g i o n c o n s i s t e n c y t e s t ...................... 63 F i g u r e 12: Representation o f a h y p o t h e t i c a l s k e t c h c o n s i s t i n g of f o u r d i f f e r e n t p r i m i t i v e s with f o u r d i f f e r e n t p o s s i b l e d e s c r i p t i o n s ............................ 67 F i g u r e 13: One of the d e s c r i p t i o n s of the house example at the Edge l e v e l ........................... 70 F i g u r e 14: a t the o r i e n t a t i o n l e v e l . . . . . . . . . . . . . . . . . 75 F i g u r e 15: a t the Surface naming l e v e l ................. 81 F i g u r e 16: a t the Object l e v e l ......................... 86 F i g u r e 17: The in p u t sketches of the t e s t examples .......... 90 F i g u r e 18: The open cube ................................... 106 F i g u r e 19: The t r i c k y i s l a n d ...............................106 F i g u r e 20: A house with an edge missing .....................109 F i g u r e 21: Open house ...................................... 109 F i g u r e 22: D i f f e r e n t metaphors f o r the p e r c e p t u a l process . . 1 1 9 ACKNOWLEDGEENTS I am g r a t e f u l to Alan Hackaorth f o r h i s ad v i c e and s u p e r v i s i o n o f the r e s e a r c h r e p o r t e d In t h i s t h e s i s . I am a l s o g r a t e f u l t o Roger Browse, Handy Goebel, B i l l Havens, and Ri c h a r d Rosenberg f o r d i s c u s s i o n s and comments on e a r l i e r v e r s i o n s of t h i s t h e s i s . T h i s research has been supported by a s c h o l a r s h i p from the C u l t u r a l Exchange S e c t i o n of the Canada C o u n c i l . I n t r o d u c t i o n 3 CHAPTER J : INTRODUCTION AND READING GUIDE The development of computers, d u r i n g the l a s t two decades, has made p o s s i b l e the design of computer programs t h a t show i n t e l l i g e n t behavior. A r t i f i c i a l I n t e l l i g e n c e i s the d i s c i p l i n e concerned with the design of these program models. E a r l y r e s e a r c h i n t h i s f i e l d r e s u l t e d i n the development of a l a r g e number of computer models f o r d i f f e r e n t kinds of i n t e l l i g e n t behavior (e.g., Samuel, 1963, Evans, 1963, Newell and Simon, 1963, and Q u i l l i a n , 1969). These r e s u l t s c r e a t e d a gre a t optimism about the f u t u r e progress of the r e s e a r c h i n t h i s f i e l d . D i s c i p l i n e s t r a d i t i o n a l l y concerned with human i n f o r m a t i o n p r o c e s s i n g a l s o became i n t e r e s t e d i n A r t i f i c i a l I n t e l l i g e n c e r e s e a r c h . Experimental p s y c h o l o g i s t s , f o r example, had always been c o n s t r a i n e d i n t h e i r r e s e a r c h because of a b a s i c l i m i t a t i o n of t h e a v a i l a b l e r e s e a r c h t o o l s . C o g n i t i v e processes c o u l d not be measured d i r e c t l y . T h e i r c h a r a c t e r i s t i c s had t o be d e r i v e d by means of chronometric, r e c a l l , and r e c o g n i t i o n s t u d i e s . On the other hand, s t u d i e s o f the i n t e r i o r of the b r a i n have taught us much about the s t r u c t u r e of our sensory systems (e.g., the work by Hubel and H i e s e l (1962, 1963, 1965 and 1968) on the v i s u a l d e t e c t i o n system of the c a t and monkey), but l i t t l e i s known as yet about the s t r u c t u r e u n d e r l y i n g d i f f e r e n t types of highe r mental processes such as image understanding, language understanding and problem s o l v i n g . Given t h i s s t a t e of r e s e a r c h i n human i n f o r m a t i o n I n t r o d u c t i o n 2 p r o c e s s i n g the p o s s i b i l i t y of s i m u l a t i n g c o g n i t i v e processes on the computer was welcomed as an o p p o r t u n i t y to study these processes d i r e c t l y . Computer s t u d i e s became an i n t e g r a t e d p a r t i n the r e s e a r c h methods of many p s y c h o l o g i s t s , l i n g u i s t s and people i n e d u c a t i o n . In t u r n , a number of A r t i f i c i a l I n t e l l i g e n c e people became i n v o l v e d i n p s y c h o l o g i c a l experimentation. A new, i n t e r d i s c i p l i n a r y f i e l d of r e s e a r c h emerged: C o g n i t i v e Science (Bobrow and C o l l i n s , 1975). The great optimism t h a t surrounded A r t i f i c i a l I n t e l l i g e n c e i n i t s f i r s t years of e x i s t e n c e has been tempered, however. The dream o f machines performing i n t e l l i g e n t t a s k s appears very d i f f i c u l t t o r e a l i z e . Imagine the simple task of p i c k i n g up a cap from the Table and t a k i n g i t t o the k i t c h e n , o r , even s i m p l e r , the t a s k of r e c o g n i z i n g a number of c h i l d r e n ' s b l o c k s s c a t t e r e d on a T a b l e . For an a d u l t , t h i s kind of task seems very s i m p l e , because we c a r r y a l l the d i f f e r e n t types of knowledge r e q u i r e d to perform the task with us (e.g., knowledge of o b j e c t s , t h e i r p h y s i c a l appearance, t h e i r a p p l i c a t i o n s , knowledge of l i g h t i n g , support, o c c l u s i o n e t c . ) . w r i t i n g a program t h a t can perform a t a s k l i k e t h i s i s a very complex assignment. The s u b f i e l d of A r t i f i c i a l I n t e l l i g e n c e concerned with the design of programs f o r i n t e r p r e t a t i o n of t h r e e - d i m e n s i o n a l s i t u a t i o n s i s u s u a l l y r e f e r r e d to as scene a n a l y s i s . These programs take a two-dimensional l i n e drawing ( u s u a l l y r e f e r r e d to as the p i c t u r e ) or the d i g i t i z e d output of a t e l e v i s i o n camera as i n p u t and they attempt t o i d e n t i f y d i f f e r e n t a s p e c t s o f the I n t r o d u c t i o n 3 t h r e e - d i m e n s i o n a l s i t u a t i o n ( u s u a l l y r e f e r r e d to as the scene) t h a t i s r e p r e s e n t e d . In the s i t u a t i o n of the Table and the b l o c k s , the program might attempt to i d e n t i f y such t h i n g s as: the type of b l o c k s on the Table (cube, wedge or L-beam) and t h e i r arrangement (the wedge i s on top of the cube). F a i l u r e to i d e n t i f y many scenes c o r r e c t l y has taught us much about the nature o f p e r c e p t i o n (Clowes, 1973). F i g u r e 1 i l l u s t r a t e s such a problematic scene. As a r e s u l t o f l i g h t i n g , one l i n e i n the p i c t u r e i s missing. F i g u r e 1: An incomplete cube In order t o recognize the p i c t u r e as a cube i n the scene domain the missing edge has t o be h a l l u c i n a t e d . E a r l y scene a n a l y s i s I n t r o d u c t i o n 4 programs were unable to deal with i n f o r m a t i o n i n the scene domain when such i n f o r m a t i o n was absent i n the p i c t u r e domain. Clowes notes t h a t , i n order to overcome such problems, one needs at l e a s t an i m p l i c i t knowledge of p e r s p e c t i v e , l i g h t i n g , and shadow forma t i o n , as we l l as an e x p e c t a t i o n about the s o r t s of o b j e c t s ae may be d e a l i n g with. As one r e s u l t of t h i s experience A r t i f i c i a l I n t e l l i g e n c e r e s e a r c h now fo c u s s e s on two important i s s u e s : r e p r e s e n t a t i o n , the q u e s t i o n of how to c r e a t e s t r u c t u r a l d e s c r i p t i o n s that capture the meaningful o r g a n i z a t i o n of the s i t u a t i o n one wants to d e a l with, and c o n t r o l s the g u e s t i o n how to c h a r a c t e r i z e the process which c o n s t r u c t s and u t i l i z e s such s t r u c t u r a l d e s c r i p t i o n s . Besearchers i n both human and machine p e r c e p t i o n have experimented with d i f f e r e n t types o f r e p r e s e n t a t i o n and c o n t r o l s t r u c t u r e s . T h i s work l e d t o common e x p l a n a t o r y d e v i c e s f o r knowledge r e p r e s e n t a t i o n (for example, the concept of schemata) and t o common c h a r a c t e r i z a t i o n s of the p e r c e p t u a l p r o c e s s i n both f i e l d s . The d i s t i n c t i o n between d a t a - d r i v e n and model-driven processes i s an example of such a common c h a r a c t e r i z a t i o n . Because of the complexity of the problems i n v o l v e d , most of t h i s r e s e a r c h takes place i n the form of long term p r o j e c t s centered around s p e c i f i c i d e a s f o r r e p r e s e n t a t i o n and c o n t r o l a p p l i e d to d i f f e r e n t problem domains. The INR p r o j e c t a t the U n i v e r s i t y of C a l i f o r n i a (Norman and Bumelhart, 1975), the HEABSAY p r o j e c t at Carnegy Mellon U n i v e r s i t y ( L e s s e r and Erman, I n t r o d u c t i o n 5 1977 and 1978), and the SEE p r o j e c t a t the U n i v e r s i t y of B r i t i s h Columbia (Mackworth, 1977a, Mulder and Mackworth, 1978a) are examples of such an approach. In the Chapters 3 and 4 of t h i s t h e s i s , a computer program, HOUSE, i s d e s c r i b e d , t h a t can i n t e r p r e t l i n e sketches of t h r e e - d i m e n s i o n a l o b j e c t s l i k e houses, cubes and wedges. The work on t h i s program has been motivated by the assumption t h a t the p e r c e p t u a l process can be d e s c r i b e d by means of f i v e c h a r a c t e r i s t i c s : 1) p e r c e p t i o n i s a bi-modal process. The process c o n s i s t s of two components: a d a t a - d r i v e n (bottom-up) component and a model-driven (top-down) component. 2) p e r c e p t i o n i s a c y c l i c process, c o n s i s t i n g of f o u r s t a g e s : cue d i s c o v e r y , model i n v o c a t i o n , model t e s t i n g , and model e l a b o r a t i o n . 3 ) p e r c e p t i o n i s r e c u r s i v e ; the c y c l e i s gone through at d i f f e r e n t l e v e l s o f r e p r e s e n t a t i o n of a p i c t u r e . 4) p e r c e p t i o n i s a redundant pr o c e s s ; the p e r c e p t u a l system has more than one way of r e a c h i n g a c e r t a i n c o n c l u s i o n about the i d e n t i t y of an o b j e c t . These c h a r a c t e r i s t i c s are d e s c r i b e d i n more d e t a i l i n Chapter 2-HOUSE i s p a r t of the SEE p r o j e c t f o r sketch understanding , i n i t i a t e d by Mackworth i n 1975 a t the U n i v e r s i t y of B r i t i s h Columbia. T h i s p r o j e c t was s e t up to i n v e s t i g a t e a number of c u r r e n t i s s u e s of r e p r e s e n t a t i o n and c o n t r o l i n machine I n t r o d u c t i o n 6 perception research, as r e f l e c t e d i n the domain of l i n e sketches. These issues are embodied in the s i x goals of the project: 1) . to develop methods of exploiting the semantics of images designed for communication as t y p i f i e d by sketches. 2) to explore possible solutions to the chicken and egg problem i n perception: sensible segmentation reguires i n t e r p r e t a t i o n and vice versa. 3) to broaden the scope of v i s i o n programs by applying the lessons learned i n the blocks world to ether domains. 4) to provide an experimental vehicle for studying control structures required to implement schema-based theories of perception. 5) to make av a i l a b l e useful interpretation programs f o r some r e s t r i c t e d but important classes of sketches. 6) to explore the re l a t i o n s h i p between natural and conventional representations. These issues, problems and t h e i r roots i n previous machine vi s i o n research are also discussed i n more d e t a i l i n Chapter 2. H O U S E i s an offshoot of M & P S E E , a program designed to inter p r e t sketches of geographic maps (Mackworth, 1977a). M & P S E E was the f i r s t implemented solution of some of the issues raised i n the project goals. M A P S E E i s also described i n more d e t a i l i n Chapter 2. The description of the perceptual p r i n c i p l e s , the S E E I n t r o d u c t i o n 7 p r o j e c t g o a l s and the MAPSEE program serve as preambles to the d e s c r i p t i o n of the HOUSE program. An overview of t h i s program and i t s u n d e r l y i n g metaphor i s given i n Chapter 3. The d e t a i l e d d e s c r i p t i o n of the program can be found i n Chapter 4. T h i s l a t t e r Chapter can be skipped by re a d e r s not i n t e r e s t e d i n implementation d e t a i l s , however. The t e s t examples, run with the program and the r e s u l t o f these t e s t s a r e d i s c u s s e d i n Chapter 5. In Chapter 6 a performance e v a l u a t i o n of HOUSE i s given. A d i s c u s s i o n of f a v o u r a b l e , unfavourable and missing f e a t u r e s i n HOUSE l e a d s to the p r o p o s i t i o n of a more powerful metaphor f o r the d e s c r i p t i o n of the p e r c e p t u a l process. T h i s metaphor i s d i s c u s s e d i n Chapter 7. A number o f su g g e s t i o n s as to how t h i s metaphor could be implemented i s g i v e n i n s e c t i o n 2 of t h i s Chapter. T h i s s e c t i o n can be skipped a g a i n by read e r s not i n t e r e s t e d i n implementation d e t a i l s . The explanatory power of the metaphor proposed i n Chapter 7 i s compared with other metaphors f o r d e s c r i p t i o n of the p e r c e p t u a l process i n Chapter 8. The re a d e r i n t e r e s t e d i n a d e s c r i p t i o n of HOUSE i n a more compressed form i s r e f e r r e d to Sulder and Hackworth (1978a). Preambles to HOUSE 8 CHAPTER 2: THREE PREAMBLES TO HOUSE 2.1 Preamble 1s Charact e r i s t i c s of the perceptual process - PERCEPTION IS A BI-HODAL PROCESS. The Cognitive Science l i t e r a t u r e of the l a s t few years (e.g., Bobrow and Norman, 1975, Palmer, 1975, Norman and Bobrow, 1976) frequently makes a d i s t i n c t i o n between two types of processing: data-driven processes and model-driven processes. The d i s t i n c t i o n i s based i n part on i n t u i t i o n and i t i s based i n part on the di f f e r e n t c h a r a c t e r i s t i c s of the two types of processes. An i n t u i t i v e account can be provided by considering the perceptual processes that occur when me enter the l i v i n g room of a house (see also Minsky, 1975, p. 221). Previous experience with t h i s type of s i t u a t i o n has equipped us with a set of expectations about the type of objects we might expect i n the l i v i n g room: Tables, chairs, couches, etc. The exact course of the process w i l l be context dependent. However, i f we enter the l i v i n g room of our own house processing w i l l be d i f f e r e n t from the s i t u a t i o n i n which we enter the l i v i n g room of a house unknown to us. In the former s i t u a t i o n processing w i l l be very much driven by a detailed expectation of the s i t u a t i o n , that i s , we know exactly which objects we w i l l f i n d i n the room and i n which arrangement. In a l i v i n g room unknown to us these expectations w i l l be much more general, that i s , our Preambles t o HOUSE 9 e x p e c t a t i o n s w i l l be l i m i t e d t o the broad c a t e g o r y of o b j e c t s we u s u a l l y f i n d i n a l i v i n g room, we have to search f o r these o b j e c t s . T h i s i s the model-driven p a r t of the process: the search process i s d r i v e n by our knowledge (expectations) of t h i s type of s i t u a t i o n . How, imagine t h a t the people we are v i s i t i n g have a washing machine i n s t a l l e d i n the l i v i n g room, fie c e r t a i n l y d i d not expect t h i s eguipment i n the l i v i n g room. Hence, our r e c o g n i t i o n of i t c o u l d not p o s s i b l y be d r i v e n by t h e e x p e c t a t i o n of t h a t s p e c i f i c o b j e c t . In the l a t t e r case the i n t e r p r e t a t i o n process must have been d r i v e n l a r g e l y by our sensory d a t a . T h i s type of p r o c e s s i n g i s c a l l e d data d r i v e n , or bottom-up. The two types of p r o c e s s i n g have very d i f f e r e n t c h a r a c t e r i s t i c s . For example, model-driven or top-down processes are g o a l o r i e n t e d ; our e x p e c t a t i o n s w i l l guide an e x p l i c i t s e a r c h f o r a c h a i r , a couch or a T a b l e . Bottom-up pro c e s s e s , on the other hand, are a s s o c i a t i v e ; the p r i m i t i v e form d e s c r i p t i o n s provided by the sensory d a t a are h i g h l y ambiguous. For example, a r e c t a n g u l a r p a r a l l e l o p i p e d form we are f o c u s s i n g on can be the l e g of a c h a i r or the l e g of a T a b l e . The eye has to focus on the space above t h i s l e g before we can be sure about the i d e n t i t y of the o b j e c t . However, the d e c i s i o n to f o c u s the eye on the space above the l e g must have been guided by an hypothesis about the p o s s i b l e i d e n t i t y of the o b j e c t . For t h i s reason p e r c e p t i o n can be seen as a bi-modal form of p r o c e s s i n g : an i n t e r p r e t a t i o n of an image i s c o n s t r u c t e d as a r e s u l t of an i n t e r a c t i o n between top-down and bottom-up Preambles to HOUSE 10 processes. - PEBCEPTION IS ft CYCLIC PBOCESS. In a survey of scene analysis l i t e r a t u r e Mackworth (1977c) noted that the control structure of these programs can be characterized as a cycle of four processes: cue discovery, model invocation, model testing and model elaboration. Mackworth (1977a) has also shown that t h i s cycle can be closed, that i s , model elaboration leads to the discovery of new cues which, in turn, invoke new models etc. Figure 2 shows t h i s "cycle of perception". model elaboration cue discovery model t e s t i n g invocation model Figure 2: The cycle of perception Preambles to HOUSE 11 - PERCEPTION IS ft BECUBSIVE PROCESS. P e r c e p t i o n i s a l s o r e c u r s i v e : i n the c y c l e of p e r c e p t i o n , cues at one l e v e l of a n a l y s i s not only serve as a way to access models a t a higher l e v e l , they may a l s o be the model i n t e r p r e t a t i o n s r e s u l t i n g from a lower stage i n the p e r c e p t u a l p r o c e s s . As a r e s u l t of t h i s , the c y c l e can re p e a t i t s e l f at l e v e l s o f i n c r e a s i n g a b s t r a c t i o n of the o r i g i n a l s k e t c h . T h i s c h a r a c t e r i s t i c i s p a r t i c u l a r l y u s e f u l f o r the i n t e r p r e t a t i o n of images whose semantics can be represented a t many l e v e l s (e.g. houses). We w i l l see i n Chapter 3 and 4 t h a t HOUSE embodies t h i s c h a r a c t e r i s t i c . - PEBCEPTION IS A BEDUNMNT PBOCESS. U s u a l l y , a p i c t u r e c o n t a i n s many more f e a t u r e s than necessary i n order t o i d e n t i f y i t . F i g u r e 1, f o r example, was e a s i l y r e c o g n i z e d as a cube, although one edge i s missing. Huch more de g r a d a t i o n i s necessary before the p i c t u r e f a i l s to be r e c o g n i z e d . 2.2 Preamble 2: The SEE p r o j e c t and i t s goals Sketches are r e f l e c t i o n s of our knowledge of the r e a l world. T h i s knowledge can be d e p i c t e d i n a n a t u r a l »ay f that Preambles to HOUSE 12 i s , the sketches d e p i c t the world almost as a c c u r a t e l y as a photograph taken with a camera does, but u s u a l l y they are a s i m p l i f i e d r e p r e s e n t a t i o n of i t . Sketches a l s o r e f l e c t our c o n v e n t i o n a l knowledge of the world, that i s , they d e p i c t f a c t s about the world t h a t we know to be t r u e , but which would never appear i n such a way on a photograph. Sketches a r e a means of i n t e r p e r s o n a l communication. The knowledge of the r e a l world must t h e r e f o r e be r e f l e c t e d i n them i n such a redundant way t h a t another person can understand what these sketches d e p i c t . T h i s property makes sketches a very a t t r a c t i v e study o b j e c t . The r e a l world i s o f t e n too complex t o d e a l with d i r e c t l y , but one can s t i l l expect the p r i n c i p l e s t h a t underly the p e r c e p t u a l process i n g e n e r a l t o be represented i n these s i m p l i f i e d r e p r e s e n t a t i o n s . In order t o i n v e s t i g a t e c u r r e n t i s s u e s i n r e p r e s e n t a t i o n and c o n t r o l s t r u c t u r e of v i s i o n systems the SEE p r o j e c t has s i x g o a l s . 1) TO DEVELOP METHODS OP EXPLOITING THE SEMANTICS OF IMAGES DESIGNED FOB COMMUNICATION AS TYPIFIED BY SKETCHES. Although the a b i l i t y of humans to recognize simple l i n e drawings i s w e l l known, the process that l e a d s t o such a r e c o g n i t i o n i s not a t a l l obvious. The r e p r e s e n t a t i o n of a l l d i f f e r e n t t y p e s of pr o c e s s i n g known to take place i n humans i s a task f a r too complex f o r A r t i f i c i a l I n t e l l i g e n c e i n i t s present s t a t e of r e s e a r c h . I n s t e a d , A r t i f i c i a l I n t e l l i g e n c e Preambles t o HOUSE 13 c o n c e n t r a t e s on adequacy and e f f i c i e n c y . f o r example, even i f the p e r c e p t u a l process i s known to have a bottom-up and a top-down component, we might c o n c e n t r a t e on the bottom-up component, j u s t to i n v e s t i g a t e the circumstances under which t h i s component i s adequate as a r e p r e s e n t a t i o n of the p e r c e p t u a l process and to i n v e s t i g a t e the type of c o n t r o l s t r u c t u r e t h a t a l l o w s us to d e a l e f f i c i e n t l y with the type of s i t u a t i o n s encountered i n the r e c o g n i t i o n t a s k . The e f f i c i e n c y o f the system i s u s u a l l y measured with r e s p e c t to the CPU time and memory r e q u i r e d by the system t o perform the r e c o g n i t i o n task-In i t s e f f o r t s to model i n t e l l i g e n t processes A r t i f i c i a l I n t e l l i g e n c e has to s o l v e two kinds of problems: problems of r e p r e s e n t a t i o n and problems of c o n t r o l . In s k e t c h understanding, the f i r s t problem d e a l s with the q u e s t i o n of what i n f o r m a t i o n needs t o be represented i n the machine i n order t o make p o s s i b l e the r e c o g n i t i o n o f c e r t a i n c l a s s e s of sketches and the g u e s t i o n how t o s t r u c t u r e t h i s i n f o r m a t i o n - The other problem addresses the process t h a t can make i n f e r e n c e s from the i n f o r m a t i o n r e p r e s e n t e d and the process that maps the sensory data onto the i n f o r m a t i o n r e p r e s e n t e d . The problems of r e p r e s e n t a t i o n and c o n t r o l are f r e q u e n t l y embodied i n p r o c e d u r a l / d e c l a r a t i v e t r a d e o f f s (e.g., S i n o g r a d , 1975). The p r o c e d u r a l view assumes that a l l "knowing" i s e q u i v a l e n t t o "knowing how" while the d e c l a r a t i v e view emphasizes "knowing" as "knowing t h a t " . The d e c l a r a t i v e approach p r o v i d e s economy of r e p r e s e n t a t i o n (each type of i n f o r m a t i o n i s uniguely represented) and human u n d e r s t a n d a b i l i t y Preambles to HOUSE 14 and communicability. The c o n t r o l s t r u c t u r e c o n s i s t s o f g e n e r a l procedures using s p e c i a l problem-dependent data. The p r o c e d u r a l approach r e l i e s on s p e c i f i c procedures f o r s p e c i f i c problems. I f a c e r t a i n type of i n f o r m a t i o n can be a p p l i e d i n more than one way, i t i s l i k e l y t o be represented i n more than one procedure. However, an adequate e x p l o i t a t i o n o f the semantics o f images f r e q u e n t l y r e q u i r e s a mixture o f the two approaches: one p a r t of the i n f o r m a t i o n represented d e c l a r a t i v e l y , and another p a r t p r o c e d u r a l l y . He w i l l see, t h a t t h i s i s the case i n HOUSE. 2) TO EXPLORE POSSIBLE SOLUTIONS TO THE CHICKEN AND EGG PROBLEM IN PERCEPTION. A human observer can, with l i t t l e d i f f i c u l t y , r e c o g n i z e the s k e t c h i n F i g u r e 3a as a house. almost simultaneously he a l s o r e c o g n i z e s the door, the walls and the windows.. However, i t i s d i f f i c u l t t o determine what was f i r s t r e c o g n i z e d : the door, the w a l l , the windows or the complete house? The problem i s t h a t i n t h i s c o n v e n t i o n a l sketch one cannot rec o g n i z e a p a r t (e.g., the window) bef o r e one knows the i d e n t i t y of the whole (the house). F i g u r e 3b, c and d show the windows and the door i n i s o l a t i o n . From t h i s i t can be seen t h a t t h e i r r e c o g n i t i o n i s more d i f f i c u l t when o b j e c t s are shown i n i s o l a t i o n as compared t o the s i t u a t i o n where these o b j e c t s appear i n the c o n t e x t of a house. P r e a m b l e s t o HOUSE 15 F i g u r e 3: A h o u s e a n d i t s p a r t s F i g u r e 3e s h o w s t h a t t h e p r e s e n c e o f t h e w i n d o w s a n d t h e d o o r a l s o s u p p o r t t h e r e c o g n i t i o n o f t h e h o u s e ; w i t h o u t i t s m e a n i n g f u l p a r t s t h e h o u s e l o o k s l i k e a n o r d i n a r y p o l y h e d r a l o b j e c t . Preambles to HODSE 16 This problem has been referred to by Palmer (1975) as the parsing paradox. Hackworth (1975) c a l l s i t the chicken and egg problem i n perception. The same problem i s met when tr y i n g to analyze the d i g i t i z e d output of a t e l e v i s i o n camera. Such an output consists of an array of pixels of varying i n t e n s i t y . Line finding procedures use the s p a t i a l d i f f e r e n t i a t i o n of p i x e l i n t e n s i t i e s to f i n d a l l the points i n the picture where substantial i n t e n s i t y changes occur. Interconnected sets of such points are presumed to represent l i n e s i n the picture. However, t h i s type of context free procedure has proven to be unreliable i n the sense that i t frequently comes up with non-existent l i n e s as a r e s u l t of shadowing e f f e c t s or non-homogeneous r e f l e c t i v i t y of a surface. It i s hard to decide which of the high i n t e n s i t y p i x e l s are part of object boundaries and which are not. Bottom-up methods which exhaustively check a l l p i x e l combinations succumb to a combinatorial explosion since there are t y p i c a l l y 100,000 pi x e l s i n a picture. Reliance on purely top-down methods r e s u l t s i n an exhaustive search through a l l the possible models and t h e i r combinations. Palmer (1975) suggested that a simultaneous application of both bottom-up and top-down methods could provide an e f f i c i e n t solution of t h i s problem. Such a multi-processing approach, however, encounters very complex problems i n representation and control (e.g.. Havens, 1978). Mackworth (1977a) has shown on the other hand that a conservative segmentation of the image followed by a consistency test between the models suggested during t h i s segmentation can also lead to a correct Preambles to HOUSE 17 i n t e r p r e t a t i o n of the image i n an e f f i c i e n t manner. T h i s approach i s a l s o taken i n HOUSE. 3) TO BROADEN THE SCOPE OP VISION PROGRAMS BY APPLYING THE LESSONS LEARNED IN THE BLOCKS WORLD TO OTHER DOMAINS. Indeed, the re s e a r c h i n the b l o c k s world has taught us many l e s s o n s . Roberts (1965) was the f i r s t person t o w r i t e a program t h a t c o u l d i n t e r p r e t l i n e drawings of th r e e - d i m e n s i o n a l o b j e c t s . Roberts's program c o n s i s t s of two p a r t s : a program t h a t reduces a gray l e v e l p i c t u r e t o a l i n e drawing, and a program that i n t e r p r e t s the l i n e drawing. The program can r e c o g n i z e t h r e e d i f f e r e n t types of polyhedra: cubes, wedges and prisms. These models are s p e c i f i e d by t h e i r t h r e e - d i m e n s i o n a l homogeneous c o o r d i n a t e s . The process t h a t matches the p i c t u r e d e s c r i p t i o n with the model d e s c r i p t i o n s can be c h a r a c t e r i z e d as a seguence of mathematical t r a n s f o r m a t i o n s with two important a s p e c t s . F i r s t of a l l , Roberts n o t i c e d t h a t over a wide range of p o s s i b l e viewpoints the topology of the models i n q u e s t i o n does not change. Thus, r a t h e r than using a procedure t h a t e x h a u s t i v e l y matches a l l the p o s s i b l e viewpoints o f the models a g a i n s t the p i c t u r e , only a few of them had to be t r i e d . A second important c o n t r i b u t i o n by Roberts i s that h i s program was the f i r s t to show the f e a s i b i l i t y o f c h a r a c t e r i z i n g the p e r c e p t u a l process as a c y c l e of f o u r processes: cue d i s c o v e r y , model i n v o c a t i o n , model t e s t i n g and model e l a b o r a t i o n . The program goes through the c y c l e s e v e r a l times i n d i f f e r e n t p a r t s of the p i c t u r e Preambles to HOUSE 18 (Mackworth, 1978). Guzman (1969) took a d i f f e r e n t approach- Starting from a s p e c i f i c a t i o n of the picture l i n e s and vertic e s his program groups the regions into "nuclei of bodies", where each body represents an object. The process of linking regions together i s driven by a l i s t of vertex arms specifying the arms of vertex types which can l i n k regions. Thus, the information about what constitutes v a l i d three-dimensional objects i s represented i n one domain only, the picture domain. Although a few three-dimensional situations were handled c o r r e c t l y , the many f a i l u r e s (e.g. the i n a b i l i t y of the program to handle objects with holes) i l l u s t r a t e d the necessity to represent the knowledge about three-dimensional objects in more than one domain, as Clowes (1971) noted. Clowes represents the information about three-dimensional objects i n two domains: a picture domain and a scene domain. For example, i n the picture domain one speaks of l i n e s , regions and vertices whereas i n the scene domain these primitives take the form of edges, surfaces and corners respectively. Clowes also noted that not a l l vertex configurations i n the picture domain make sense i n the scene domain: only certain configurations are possible. Clowes (and Huffman, 1971) investigated the possible configurations and interpretations of four vertex types: L, fork, Arrow and Tee. f o r each edge they allowed four d i f f e r e n t types of in t e r p r e t a t i o n : convex, concave and the two senses of occlusion. Using the l i s t of possible interpretations of the d i f f e r e n t vertex types as re f l e c t e d on Preambles to HOUSE 19 the arms and using the r u l e t h a t the i n t e r p r e t a t i o n of each edge has t o be the same at both ends, Clowes* program f i n d s the p o s s i b l e r e l a t i o n s between the s u r f a c e s i n such a way t h a t they make sense over the whole scene. T h i s c l a s s i f i c a t i o n schema was extended by S a l t z (1972). Waltz noted t h a t edges i n a t h r e e - d i m e n s i o n a l l i n e drawing can a l s o r e p r e s e n t a two-dimensional r e l a t i o n s h i p between the s u r f a c e s ( c r a c k ) . I f there i s more than one o b j e c t i n the scene, these o b j e c t s always meet at c r a c k , concave, or occlude boundaries. These types of boundaries can t h e r e f o r e s e r v e as cues t o s e p a r a t e the o b j e c t s i n the scene. A t h i r d a d d i t i o n was motivated by the f a c t t h at p i c t u r e l i n e s can a l s o r e p r e s e n t shadow l i n e s . As a r e s u l t , the number of p o s s i b l e l i n e l a b e l s i n c r e a s e d from 4 to 53, i n c l u d i n g a number of d i s t i n c t l e v e l s f o r e x p r e s s i n g the i l l u m i n a t i o n s t a t u s of the two s u r f a c e s appearing at the edge. In order to l i m i t the search time through the enormous amount of i n t e r p r e t a t i o n combinations Waltz designed an a l g o r i t h m more e f f i c i e n t than e a r l i e r a l g o r i t h m s designed to d e a l with t h i s type of scenes (e.g., Huffman, 1971, Clowes, 1971). Before attempting a depth f i r s t s e a rch (as Huffman did) or breadth f i r s t s earch Waltz a p p l i e s a j u n c t i o n f i l t e r i n g procedure, whereby the j u n c t i o n s ( v e r t i c e s ) are v i s i t e d i n some o r d e r . For each of i t s arms, the j u n c t i o n i n t e r p r e t a t i o n l i s t must pr o v i d e an i n t e r p r e t a t i o n that matches at l e a s t one of the i n t e r p r e t a t i o n s allowed f o r t h a t arm by the i n t e r p r e t a t i o n l i s t o f the j u n c t i o n a t the other end of the arm. J u n c t i o n i n t e r p r e t a t i o n s that do not match are d e l e t e d . I f such Preambles t o HOUSE 20 a d e l e t i o n takes p l a c e then a l l j u n c t i o n s whose i n t e r p r e t a t i o n s sere c o n s t r a i n e d by the d e l e t e d j u n c t i o n i n t e r p r e t a t i o n s are r e v i s i t e d ; t h e i r i n t e r p r e t a t i o n s are f i l t e r e d t o accommodate the new s i t u a t i o n . The advantage of t h i s procedure i s t h a t a s i n g l e pass through the j u n c t i o n s i s s u f f i c i e n t . Although there i s no guarantee that there w i l l be onl y one p o s s i b l e i n t e r p r e t a t i o n l e f t a t each j u n c t i o n , t h i s procedure guarantees the removal of a l l l o c a l l y i n c o n s i s t e n t i n t e r p r e t a t i o n s . S i m i l a r types of al g o r i t h m s have been proposed by Montanari (1974), Gaschnig (1974), Barrow and Tenenbaum (1976), Freuder (1976a), fiosenfeld. Hummel and Zucker (1976) and Macksorth (1977a). Some of these a l g o r i t h m s are e x t e n s i v e l y d e s c r i b e d and eval u a t e d i n Hackworth (1977b) where, they a r e c a l l e d network c o n s i s t e n c y a l g o r i t h m s . A completely d i f f e r e n t approach f o r the i n t e r p r e t a t i o n of l i n e drawings r e p r e s e n t i n g t h r e e - d i m e n s i o n a l o b j e c t s has been o f f e r e d by Hackworth (1973). Mackworth uses a r e p r e s e n t a t i o n based on t h r e e - d i m e n s i o n a l geometry, the g r a d i e n t space, as a b a s i s f o r h i s program. I n the g r a d i e n t space a s u r f a c e i s represented as a p o i n t . An edge i s rep r e s e n t e d by a v e c t o r whose d i r e c t i o n corresponds t o t h a t of the edge i n the p i c t u r e r e p r e s e n t a t i o n , and whose len g t h e q u a l s the tangent of the angle the scene edge makes with the p i c t u r e plane (the plane o f p r o j e c t i o n ) . Mackworth 1s program e x p l o i t s these p r o p e r t i e s and makes i n f e r e n c e s about s u r f a c e and edge o r i e n t a t i o n s . The g r a d i e n t space n o t a t i o n p r o v i d e s a more powerful t o o l f o r i n t e r p r e t i n g p o l y h e d r a l o b j e c t s than the Huffman/Clowes n o t a t i o n does. S e v e r a l examples t h a t are i n c o r r e c t l y handled by Clowes's Preambles to HOUSE 21 program are c o r r e c t l y handled by Mackworth's program {Mackworth, 1977c). T h i s program and the p r e v i o u s programs are reviewed i n d e t a i l i n Mackworth (1977c).. Apart from Roberts* program, a l l the p r e v i o u s programs worked from i n p u t as a l i n e diagram. However, t h e r e i s another c l a s s of programs t h a t use gray l e v e l p i c t u r e s as i n p u t . F a l k ' s INTERPRET (F a l k , 1972) i s a program t h a t i s a b l e to r e c o g n i z e any combination of a s e t of nine prototypes {cubes, rhomboids, r e c t a n g u l a r p a r a l l e l o p i p e d s , wedges, and L-beams) s c a t t e r e d on a t a b l e . His program has two p a r t s : a l i n e f i n d i n g system, designed by Hueckel (1971) and a program t h a t i n t e r p r e t s the l i n e drawing provided by the l i n e f i n d i n g system. The i n t e r p r e t a t i o n program c o n s i s t s of f i v e s t a g e s : SEGMENT, SUPPORT, COMPLETE, RECOGNIZE and VERIFY. SEGMENT provi d e s a s e p a r a t i o n of the p i c t u r e i n t o bodies. F a l k d i s t i n g u i s h e s between two types of v e r t i c e s : " b a d - v e r t i c e s " and " g o o d - v e r t i c e s " . G o o d - v e r t i c e s were taken as an i n d i c a t i o n t h a t a l l i t s edges belonged t o the same body while b a d - v e r t i c e s are assumed t o be on the boundary between two o b j e c t s . For each body found, SUPPORT determines the s e t of bodies t h a t could support i t . I f the p i c t u r e i s incomplete, COMPLETE provi d e s a s e t of r o u t i n e s t o complete the l i n e drawing. RECOGNIZE a c t u a l l y r e c o g n i z e s the o b j e c t . RECOGNIZE uses a "bottom-up" approach i n the sense t h a t i t f i r s t attempts to r e c o g n i z e the o b j e c t s on the T a b l e , then the o b j e c t s they support and so on. The r e c o g n i t i o n process can w e l l be d e s c r i b e d by the c y c l e of Preambles to HOUSE 22 p e r c e p t i o n . Hueckel*s program provides cue d i s c o v e r y . C e r t a i n v e r t i c e s are being used as cues f o r c e r t a i n prototypes (model i n v o c a t i o n ) . VEBIFY attempts to match the most l i k e l y prototype a g a i n s t i t s p i c t u r e r e p r e s e n t a t i o n (model t e s t i n g ) . Once v e r i f i c a t i o n has taken p l a c e , the t h r e e - d i m e n s i o n a l knowledge of the models i s being used to get a t h r e e - d i m e n s i o n a l g r i p on the scene (model e l a b o r a t i o n ) . Hackworth (1977c) shows that there are s e v e r a l problems with F a l k ' s program. F i r s t of a l l , the same c r i t i c i s m s t h a t apply to Guzman's program apply to F a l k * s program. F a l k ( s program uses r e p r e s e n t a t i o n s i n the p i c t u r e domain (the v e r t i c e s ) to make d e c i s i o n s i n t h e scene domain ( s e p a r a t i o n of o b j e c t s ) . , Another problem with F a l k ' s program i s t h a t i t has only one way of a r r i v i n g at a th r e e - d i m e n s i o n a l i n t e r p r e t a t i o n , t h a t i s , by r e c o g n i z i n g the o b j e c t . I f the program f a i l s t o recog n i z e or i n c o r r e c t l y r e c o g n i z e s an o b j e c t i t can never get a t h r e e - d i m e n s i o n a l g r i p on the scene. Hackworth (1977c) has argued t h a t t h i s i s another m a n i f e s t a t i o n of the c h i c k e n and egg problem i n p e r c e p t i o n . F a l k ' s program manifests t h i s problem even i n more than one way. Falk r e p o r t e d t h a t the output o f the pr e p r o c e s s o r s i n the edge f i n d i n g system had t o be c o r r e c t e d sometimes. T h i s problem can be r e l a t e d to the u n r e l i a b i l i t y of the l i n e f i n d i n g procedures- Indeed, the success of c o n t e x t - f r e e edge d e t e c t i o n systems can be doubted (Hackworth, 1975). F r e q u e n t l y the i n f o r m a t i o n provided by a gray l e v e l p i c t u r e i s not needed i n such d e t a i l . K e l l y (1971), f o r example, showed t h a t r e d u c t i o n of the o r i g i n a l 256x256 p i c t u r e t o a 32x32 Preambles to HOUSE 23 v e r s i o n a l l o w s the i n t e r p r e t a t i o n process to he done i n a much more e f f i c i e n t way. S h i r a i (1975) a l s o worked from a reduced gray l e v e l p i c t u r e . S h i r a i showed t h a t the l i n e s of a p i c t u r e r e p r e s e n t i n g a t h r e e - d i m e n s i o n a l o b j e c t can be found by a s e t of c o n t e x t dependent l i n e f i n d i n g procedures. Assuming t h a t the outer-boundary of the o b j e c t can be found these procedures suggest l o c a t i o n s and d i r e c t i o n s i n the outer-boundary from which l i n e s might flow i n t o the i n t e r i o r of the o b j e c t . Thus, S h i r a i has provided an hy p o t h e s i s guided s e a r c h procedure f o r f i n d i n g l i n e s i n the reduced p i c t u r e . Zucker (1978) has proposed a knowledge based system, t h a t , among other t h i n g s , can a s s e r t the presence (or absence) of l i n e segments from a gray l e v e l p i c t u r e . The c o n n e c t i o n between each p o i n t i n the image and i t s e i g h t neighbouring p o i n t s i s rep r e s e n t e d by a p o s s i b l e o r i e n t a t i o n f o r a l i n e together with a p r o b a b i l i t y f a c t o r . The g l o b a l c o n s i s t e n c y between these l i n e o r i e n t a t i o n s i s t e s t e d by means o f a r e l a x a t i o n a l g o r i t h m t h a t uses the knowledge provided by a model f o r the good c o n t i n u a t i o n of l i n e s . 4) TO PROVIDE &N EXPERIMENTAL VESICLE FOR STUDYING CONTROL STRUCTURES REQUIRED TO IMPLEMENT SCHEMA BASED THEORIES OF PERCEPTION. The term "schema 1 1 i s u s u a l l y t r a c e d back t o B a r t l e t t (1932) and t o P i a g e t (e.g., P i a g e t , 1971). Rumelhart and Ortony Preambles to HOUSE 2 4 (1977), however, noted that the term has much deeper r o o t s i n h i s t o r y . Over the l a s t few years schemata have served as a convergent n o t i o n f o r knowledge r e p r e s e n t a t i o n r e s e a r c h i n d i s c i p l i n e s l i k e Psychology (e.g. Bobrow and Norman, 1975, N e i s s e r , 1976, Pylyshyn, 1976), l i n g u i s t i c s (e.g. F i l l m o r e , 1968) and A r t i f i c i a l I n t e l l i g e n c e (e.g. Hinsky, 1975, K u i p e r s , 1975, Freuder, 1976b, Bobrow and Hinograd, 1977, Bumelhart and Ortony, 1977 and Havens, 1978). At p r e s e n t , there i s a wide d i v e r s i t y of d e f i n i t i o n s and c h a r a c t e r i z a t i o n s of schemata. Bobrow and Norman (1975), f o r example, d e f i n e schemata as : " a c t i v e p r o c e s s i n g elements t h a t can be a c t i v a t e d from higher l e v e l purposes and e x p e c t a t i o n s , or from i n p u t data t h a t must be accounted f o r " . One c h a r a c t e r i s t i c of schemata i s that they can r e p r e s e n t both d e c l a r a t i v e and p r o c e d u r a l knowledge. From a d e c l a r a t i v e p e r s p e c t i v e a schema can be embedded i n d i f f e r e n t types of s t r u c t u r a l networks l i k e i n s t a n c e ( i s - a ) h i e r a r c h i e s and composition (part-of) h i e r a r c h i e s . A p r o c e d u r a l r e p r e s e n t a t i o n on the other hand i m p l i e s t h a t a schema can take over c o n t r o l i t s e l f . Havens* p r o c e d u r a l model of r e c o g n i t i o n embodies such an approach (Havens, 1978). Havens* view of p e r c e p t i o n comes c l o s e to the one taken i n t h i s t h e s i s . P e r c e p t i o n i s c o n s i d e r e d as a n o n - d e t e r m i n i s t i c , r e c u r s i v e process d r i v e n by hypotheses and guided by o b s e r v a t i o n . Knowledge i s represented i n the form of schemata. A schema i s d e f i n e d as "a modular r e p r e s e n t a t i o n of e v e r y t h i n g known about some concept, o b j e c t , event or s i t u a t i o n " . Two d i f f e r e n t types of knowledge are d i s t i n g u i s h e d : PreamM.es to HOUSE 25 f a c t u a l knowledge and h e u r i s t i c knowledge. F a c t u a l knowledge can be represented both i n d e c l a r a t i v e or p r o c e d u r a l form. H e u r i s t i c knowledge i s represented i n p r o c e d u r a l form o n l y . H e u r i s t i c knowledge (also c a l l e d methods) guides the s e a r c h process f o r the schema*s concept. Complex concepts are i represented v i a composition i n v a r i o u s types of schemata h i e r a r c h i e s . Havens' r e c o g n i t i o n model c o n s i s t s of three s t a g e s : e x p e c t a t i o n , matching and completion. The s e t o f e x p e c t a t i o n s of a schema c o n s i s t s c f a n o n - d e t e r m i n i s t i c d e s c r i p t i o n of a l l p o s s i b l e f i n a l i n s t a n t i a t i o n s of a schema. For example, a schema f o r a r e c t a n g l e w i l l expect to f i n d f o u r r i g h t angles on the boundary of the r e g i o n i t i s proposed f o r . Each one of these r i g h t angles can s e r v e as a cue f o r t h i s schema. The schema's method w i l l then s t a r t to f o l l o w one of the arms of t h i s angle with the i n t e n t i o n t o f i n d another r i g h t a n g l e . I f the i n f o r m a t i o n expected by the schema matches the i n f o r m a t i o n found i n the p i c t u r e , then the search w i l l c o n t i n u e f o r the remaining two r i g h t a n g l e s . Such a process can be seen as an i t e r a t i v e r e c o g n i t i o n c y c l e . Each schema d i r e c t s o b s e r v a t i o n s (from e i t h e r low l e v e l sensory data or other schemata) i n an attempt to s a t i s f y i t s u n f u l f i l l e d e x p e c t a t i o n s . F a i l u r e w i l l r e s u l t i n suspension., In case of s u c c e s s , however, the schema w i l l seek completion, t h a t i s , i t w i l l a c t as a cue f o r another schema f u r t h e r up i n the h i e r a r c h y . Havens* system may be viewed as c o n s i s t i n g of a network of schemata which simultaneously c o n t r o l the p e r c e p t u a l process i n Preambles to HOUSE 26 d i f f e r e n t p a r t s of the concept space- The system was programmed i n MAYA, a LISP d i a l e c t with m u l t i - p r o c e s s i n g f a c i l i t i e s (Havens, 1978)- E x p l o r a t i o n of a network .of schemata, which s i m u l t a n e o u s l y c o n t r o l d i f f e r e n t p a r t s of the p e r c e p t u a l process i s one of the tren d s i n c u r r e n t machine v i s i o n r e s e a r c h . A more e l a b o r a t e i n t r o d u c t i o n t o t h e problems r e s u l t i n g from schema based t h e o r i e s i s g i v e n by Havens (1978). 5) TO MAKE AVAILABLE USEFUL INTERPRETATION PROGRAMS FOR RESTRICTED BUT IMPORTANT CLASSES OF SKETCHES. Geographic maps are one important s u b c l a s s o f the ske t c h domain. I f we have t o gi v e a t r a v e l l e r the d i r e c t i o n s to another p a r t of town, a sketch of the geographic s i t u a t i o n w i l l h e l p him t o reach h i s d e s t i n a t i o n ; geographic maps are a l s o important i n a completely d i f f e r e n t r e s e a r c h a r e a : Remote Sensing, the study o f p i c t u r e s of the e a r t h taken from s a t e l l i t e s . I n t e r p r e t a t i o n of these p i c t u r e s can be a s s i s t e d by A r t i f i c i a l I n t e l l i g e n c e t e c h n i q u e s employing d o m a i n - s p e c i f i c knowledge (e.g. S t a r r and Mackworth, 1976). By matching p i c t u r e s and sketches, i t i s p o s s i b l e t o d i s t i n g u i s h between l a n d , sea, r i v e r s , o r , i f we want a f i n e r d i s t i n c t i o n , between d i f f e r e n t types of s o i l , wood, e t c . T h i s was one of the reasons why the c l a s s of sketches o f geographic maps was chosen as a f i r s t study area i n the SEE p r o j e c t . MAPSEE (Mackworth, 1977a), a program t h a t can i n t e r p r e t sketches of geographic maps was the r e s u l t of t h i s study. Preambles t o HOUSE 27 Another important c l a s s o f sketches i s the domain of house sketches, commonly used i n a r c h i t e c t u r a l d e s i g n . A program, t h a t can check the c o r r e c t n e s s of d e s i g n s , o r , a program t h a t can make i n t e l l i g e n t designs on v e r b a l i n s t r u c t i o n s o f the a r c h i t e c t i s another demonstration of a u s e f u l a p p l i c a t i o n i n the sketch understanding domain. HOUSE, the program d e s c r i b e d i n Chapter 3 and 4 of t h i s t h e s i s , i s meant as a f i r s t s tep towards such a program. 6 ) TO EXPLGBE THE RELATIONSHIP BETBEEN NATURAL AND CONVENTIONAL SEPSESENTATIONS. F i g u r e 4 shows an example of a l i n e sketch of a geographic map, as i t can be handled by HAPSEE. In p a r t , the sk e t c h c o u l d have been e x t r a c t e d from a photograph of t h i s " i s l a n d " taken from high a l t i t u d e : the c o a s t l i n e , r i v e r and road would show l i k e t h i s on the p i c t u r e . The towns ( c l u s t e r o f points) can be re c o g n i z e d because i t has a t e x t u r e d i f f e r e n t from i t s surrounding environment. However, the bridge and mountains would never show l i k e t h i s on a photograph. These o b j e c t s are shown i n part as a r e s u l t of what we know from these o b j e c t s (e.g., a bridge i s a c o n s t r u c t i o n s u p p o r t i n g a road when c r o s s i n g a r i v e r ) and i n part as we expect them t o show on a photograph taken from above (a b r i d g e o c c l u d e s the r i v e r ) . The Preambles to HOUSE 28 F i g u r e Hz Sketch of an i s l a n d same ho l d s f o r the mountains. A mountain Mould not appear l i k e t h i s on a photograph. Bather, we have d e p i c t e d what we know from a mountain (an " o b j e c t " e l e v a t i n g i t s e l f above the earth) -Thus, sketches r e f l e c t a mixture of what we can see to be t r u e ( n a t u r a l r e p r e s e n t a t i o n s ) and what we know to be t r u e ( c o n v e n t i o n a l r e p r e s e n t a t i o n s ) i n the world, an experiment done by Mulder and Hackworth (1978b) shows t h a t both types of Preambles to HODSE 29 knowledge are not onl y r e f l e c t e d i n the drawing of o b j e c t s but a l s o i n t h e i r p e r c e p t i o n . E r r o r s made while c o n s t r u c t i n g cubes and e r r o r s made while e s t i m a t i n g the s l a n t s of cube s u r f a c e s c o u l d be t r a c e d back to the i n f l u e n c e of th r e e d i f f e r e n t cube schemata; one cube represented n a t u r a l knowledge, the other two represented a mixture of n a t u r a l and c o n v e n t i o n a l knowledge. 2.3 Preamble 3: MAPSEE and the c y c l e of p e r c e p t i o n . S i n c e t h e r e p r e s e n t a t i o n and c o n t r o l s t r u c t u r e o f the MAPSEE program served as model f o r the r e p r e s e n t a t i o n and c o n t r o l s t r u c t u r e of HOOSE, MAPSEE w i l l be d e s c r i b e d i n some d e t a i l . MAPSEE i s a program t h a t i n t e r p r e t s sketches of geographic maps, l i k e the one i n f i g u r e 4. One of the l e s s o n s l e a r n e d from the blocks world was the need f o r m u l t i p l e r e p r e s e n t a t i o n s and m u l t i p l e l e v e l s o f d e t a i l . MAPSEE uses r e p r e s e n t a t i o n s i n two d i f f e r e n t domains: the p i c t u r e domain and the scene domain. A segmentation program takes the p l o t t i n g program t h a t forms the inp u t to MAPSEE and transforms the p l o t t i n g commands i n t o d i f f e r e n t p o i n t , l i n e , r e g i o n and vertex r e p r e s e n t a t i o n s . The p o i n t s i n the p i c t u r e are represented i n two d i f f e r e n t ways: c h a i n s r e p r e s e n t the s e t s of i n t e r c o n n e c t e d p o i n t s and an a r r a y r e p r e s e n t a t i o n (32x32) i s maintained i n which each c e l l i n the a r r a y r e p r e s e n t s the l i s t of p o i n t s i n that p a r t of the p i c t u r e . Preambles t o HODSE 30 The l a t t e r r e p r e s e n t a t i o n i s very u s e f u l f o r answering q u e s t i o n s l i k e : "what am I near?" A l i n e h i e r a r c h y i s formed i n each of the c h a i n s ; the c o a r s e s t l i n e r e p r e s e n t a t i o n of the c h a i n i s the connecti o n between the end p o i n t s of the c h a i n , the f i n e s t r e p r e s e n t a t i o n i s the con n e c t i o n between two c o n s e c u t i v e p o i n t s i n t h e c h a i n . Region segmentation takes place by s u b d i v i d i n g the p i c t u r e i n t o square patches; a patch i s s u b d i v i d e d i n t o f o u r sub-patches i f i t i s not empty. Each of the v e r t i c e s (acute angle, obtuse angle, t e e , mul t i and l i n k ) has i t s own formation procedure. Acute and obtuse angles are found by means of a sea r c h i n s i d e each of the c h a i n s . During t h i s s e arch the l i n e h i e r a r c h i e s are t r a c e d j u s t up t o the l e v e l of d e t a i l r e q u i r e d . A l i n k i s formed i f and only i f the gap between both end p o i n t s of a c h a i n i s very s m a l l . Tee*s and Multi»s are found by checking the gaps between the end p o i n t s of each ch a i n and the nearest (other) c h a i n . The d i f f e r e n t r e p r e s e n t a t i o n s a t the p i c t u r e l e v e l serve d i f f e r e n t f u n c t i o n s : c h a i n s and r e g i o n s become the p r i m i t i v e s f o r the i n t e r p r e t a t i o n p r o c e s s ; v e r t i c e s serve as cues i s t h i s process. An important n o t i o n behind the segmentation process i s i t s c o n s e r v a t i v e b i a s . For example, a Tee w i l l be formed i f and only i f the gap between ch a i n end p o i n t and the n e a r e s t c h a i n i s very s m a l l ; r e g i o n segmentation s t o p s at a r e l a t i v e l y l a r g e patch s i z e i n order to prevent "leakage" through the gaps i n the v e r t i c e s . The idea behind c o n s e r v a t i v e segmentation i s , t h a t , although genuine cues may be missed t h i s way, segmentation w i l l not supply f a l s e cues which make the image u n i n t e r p r e T a b l e at Preambles t o HOOSE 31 the scene l e v e l . The models which provide an i n t e r p r e t a t i o n of the p r i m i t i v e s a t the scene l e v e l a r e represented i n the form of c u e / i n t e r p r e t a t i o n f a b l e s . For each cue found d u r i n g the segmentation process such a Table p r o v i d e s the s e t of p o s s i b l e i n t e r p r e t a t i o n s f o r each of the p r i m i t i v e s c o n s t r a i n e d by t h a t cue. Thus, a r e g i o n can be i n t e r p r e t e d as l a n d , l a k e or sea whereas a c h a i n can be i n t e r p r e t e d as c o a s t , shore, road, b r i d g e , mountainside or r i v e r . However, each of the cues a l l o w s these i n t e r p r e t a t i o n s i n c e r t a i n combinations o n l y . For example, the arms of the H u l t i i n F i g u r e 4 can r e p r e s e n t r i v e r s o n l y , whereas a l l the r e g i o n s c o n s t r a i n e d by the H u l t i must be i n t e r p r e t e d as l a n d . For a map t o make three-di m e n s i o n a l sense, the i n t e r p r e t a t i o n s of the p r i m i t i v e s have to be c o n s i s t e n t i n two d i f f e r e n t ways: the i n t e r p r e t a t i o n has to match the i n t e r n a l d e s c r i p t i o n of the p r i m i t i v e and i n t e r p r e t a t i o n s of neighbouring p r i m i t i v e s have to be c o n s i s t e n t with each other. I n t e r n a l c o n s i s t e n c y has to e x i s t i n the case of a r i v e r , t o mention an example. HAPSEE makes a d i s t i n c t i o n between two types of r i v e r , depending i f the c h a i n i s drawn i n the stream d i r e c t i o n of the r i v e r or the o p p o s i t e ( r i v e r * ) . Since a c h a i n cannot be both r i v e r and r i v e r * a f i l t e r procedure checks f o r the stream d i r e c t i o n and f i l t e r s out one of the i n t e r p r e t a t i o n s a c c o r d i n g l y . E x t e r n a l c o n s i s t e n c y means t h a t a simultaneous c o n s i s t e n c y has to e x i s t between the i n t e r p r e t a t i o n s of a l l the d i f f e r e n t Preambles to HOUSE 32 p r i m i t i v e s i n one or more d i f f e r e n t ways- The c o n s t r a i n t s a t i s f a c t i o n techniques developed by Huffman, Clowes and Waltz designed t o d e a l with t h i s type of c o n s i s t e n c y requirements have been g e n e r a l i z e d to a network c o n s i s t e n c y a l g o r i t h m by Hackworth (1977c)- Whereas the Waltz j u n c t i o n f i l t e r i n g procedure c o u l d d e a l with b i n a r y c o n s i s t e n c y problems o n l y , network c o n s i s t e n c y can deal with both b i n a r y and n-ary c o n s t r a i n t s a t i s f a c t i o n problems (as they occur i n MAPSEE). Each p r i m i t i v e i s p a i r e d with each of the cues t h a t c o n s t r a i n t h a t p r i m i t i v e . T h i s l i s t ( a c t u a l l y the queue) of v a r i a b l e / r e l a t i o n p a i r s forms the i n p u t f o r the n-ary C o n s t r a i n t s a t i s f a c t i o n a l g o r i t m (HC from now on). NC i s a two-step a l g o r i t h m . I n the f i r s t s tep the f i r s t p a i r (X,B) from the queue i s taken. NC checks f o r each of the p o s s i b l e i n t e r p r e t a t i o n s a of v a r i a b l e X, i f the o t h e r v a r i a b l e s a l s o c o n s t r a i n e d by B have a t l e a s t one i n t e r p r e t a t i o n t h a t i s d i r e c t l y c o n s t r a i n e d by B. I f such an i n t e r p r e t a t i o n can be found then the f i r s t s t e p i s repeated with the next p a i r (X,B) from the queue as arqument. However, i f such a c h a i n cannot be found then a i s d e l e t e d from X and the next s t e p i n the a l g o r i t h m i s taken: the queue i s r e p l a c e d by the union o f the queue and the s e t of v a r i a b l e / r e l a t i o n p a i r s obtained from a l l r e l a t i o n s other than fi t h a t c o n s t r a i n X- A f t e r t h i s replacement, step one i s repeated with the next p a i r (X,B) from the queue as argument. In i t s MAPSEE implementation t h i s a l g o r i t h m i s so powerful t h a t , once the queue i s empty, most of the p r i m i t i v e s are l e f t with o n l y one p o s s i b l e i n t e r p r e t a t i o n . However, i f more than one p r i m i t i v e i s l e f t with m u l t i p l e Preambles to HOUSE 33 i n t e r p r e t a t i o n s , then the i n t e r p r e t a t i o n domain of one of the p r i m i t i v e s i s s p l i t i n t o h a l f and NC i s repeated r e c u r s i v e l y on each of the two subproblems. The i n t e r p r e t a t i o n s l e f t on the p r i m i t i v e s are used to motivate a resegmentation of the p i c t u r e . For example, i f two or more d i f f e r e n t r e g i o n s are i n t e r p r e t e d as l a n d , a resegmentation procedure w i l l attempt to merge these r e g i o n s by means of a f u r t h e r refinement of the segmentation. HAPSEE shows t h a t the c y c l i c c o n t r o l s t r u c t u r e (Figure 2) t h a t a p p l i e s t o many blocks world programs (Mackworth, 1978a) i s a l s o a p p l i c a b l e t o domains o u t s i d e the b l o c k s world. MAPSEE's c o n t r o l s t r u c t u r e a l s o embodies the c y c l e : the program e n t e r s the c y c l e i n the cue d i s c o v e r y stage (segmentation); t h i s stage i s f o l l o w e d by model i n v o c a t i o n ( i n s t a l l a t i o n o f the models by means o f the c u e / i n t e r p r e t a t i o n T a b l e s ) ; model t e s t i n g ( i n t e r n a l c o n s i s t e n c y ) i s next; model e l a b o r a t i o n , f i n a l l y , i s represented i n the network c o n s i s t e n c y ( e x t e r n a l c o n s i s t e n c y ) t e s t s . A f e a t u r e r a r e l y found i n the b l o c k s world programs i s t h a t MAPSEE c l o s e s the c y c l e ; model e l a b o r a t i o n i s f o l l o w e d by cue d i s c o v e r y (resegmentation), but, most important of a l l , MAPSEE shows t h a t r e l y i n g on a c o r r e c t c o n s e r v a t i v e segmentation and r e l y i n g on the redundant c h a r a c t e r of the p e r c e p t u a l p r o c e s s , such a c o n s e r v a t i v e segmentation f o l l o w e d by i n t e r p r e t a t i o n f o l l o w e d by resegmentation o f f e r s a way of a v o i d i n g the c h i c k e n and egg problem i n p e r c e p t i o n . Preambles to HOUSE 34 2-4 Summary Research i n human v i s i o n and machine v i s i o n has l e d to the f o r m u l a t i o n of a s e t of p r i n c i p l e s t h a t underly the p e r c e p t u a l process- Research i n machine v i s i o n has a l s o l e d to the r e a l i z a t i o n t h a t we need systems with m u l t i p l e l e v e l s of r e p r e s e n t a t i o n i n m u l t i p l e l e v e l s of d e t a i l . I n order t o make p o s s i b l e an e f f i c i e n t communication between these l e v e l s , both r e p r e s e n t a t i o n and c o n t r o l s t r u c t u r e s must be modular i n c h a r a c t e r . Adhering t o these requirements we can d i s t i n g u i s h between two c u r r e n t r e s e a r c h t r e n d s . One t r e n d i s to provide a uniform format f o r the d e s c r i p t i o n of models. The assumption i s made tha t these models are organized i n l e v e l s . A g e n e r a l network c o n s i s t e n c y a l g o r i t h m p r o v i d e s a c o n s i s t e n c y t e s t between these models a t each l e v e l . The modular c h a r a c t e r of such a system i s embodied i n the l e v e l of r e p r e s e n t a t i o n . MAPSEE i s a demonstration of such a re s e a r c h t r e n d . Another trend i s to embody the modularity i n the models themselves. T h i s i s the schema approach mentioned i n s e c t i o n 2.2. Havens 1 program (Havens, 1978) i s an example o f t h i s approach. The schema i n Havens*s system i s the u n i t f o r both r e p r e s e n t a t i o n and c o n t r o l . Schemata form h i e r a r c h i c a l networks r e p r e s e n t i n g knowledge about c o n c e p t s , o b j e c t s or s i t u a t i o n s . Each schema has a l s o procedures a t i t s d i s p o s a l by means of which i t can communicate i t s knowledge with o t h e r schemata i n the network. HOUSE w i l l be seen t o f o l l o w the f i r s t r e s e a r c h Preambles t o HODSE 35 t r e n d . Goals and overview of BOOSE 36 CHAPTEB 3: GOALS AMD OVEBVIEg OF HOUSE . 3. 1 Goals of HOUSE Humans are able to d e s c r i b e t h e sketch i n F i g u r e 5a i n many d i f f e r e n t ways. In the p i c t u r e domain, f o r example, we can d e s c r i b e the sketch as a s e t of i n t e r c o n n e c t e d p o i n t s , as a s e t of l i n e s and r e g i o n s , or ( i f we f o c u s a t the r e g i o n forms) as a c o n f i g u r a t i o n of f o u r p a r a l l e l o g r a m s , a pentagon and an octagon. An even g r e a t e r v a r i e t y of p o s s i b l e d e s c r i p t i o n s e x i s t s i n the scene domain. F i r s t of a l l , l i n e s and r e g i o n s become s u r f a c e s . Edges can be d e s c r i b e d as convex, concave, o c c l u d e , occlude-ccncave or c r a c k ; edges can a l s o be d e s c r i b e d by means of t h e i r o r i e n t a t i o n : h o r i z o n t a l , v e r t i c a l or s l a n t e d . S u r f a c e s can be d e s c r i b e d by means o f t h e i r o r i e n t a t i o n , but, most important of a l l i n t h i s house domain, s u r f a c e s c a r r y meaningful names: door, window, w a l l , r o o f e t c . F i n a l l y , we can d e s c r i b e the s k e t c h by means o f one word: a house. Pr e v i o u s r e s e a r c h i n machine p e r c e p t i o n has shown the need f o r m u l t i p l e l e v e l s of r e p r e s e n t a t i o n and m u l t i p l e l e v e l s of d e t a i l i n every p e r c e p t u a l system. MAPSEE had d i f f e r e n t l e v e l s of r e p r e s e n t a t i o n i n the p i c t u r e domain. The scene domain, on the other hand, was represented a t one l e v e l o n l y . F i g u r e 5a, however, has more than one p o s s i b l e d e s c r i p t i o n i n the scene domain. A system with the a b i l i t y to provide d i f f e r e n t scene d e s c r i p t i o n s t h e r e f o r e needs d i f f e r e n t l e v e l s of r e p r e s e n t a t i o n i n the scene domain as w e l l . In MAPSEE, the c u e / i n t e r p r e t a t i o n Goals and overview of HOUSE 37 Tables served as gates from the p i c t u r e domain to the scene domain- S i t h m u l t i p l e l e v e l s of r e p r e s e n t a t i o n i n the scene domain, we need such Tables f o r each of those l e v e l s i n order to be a b l e t o access them. The models represented i n those Tables a t d i f f e r e n t l e v e l s a r e , a t l e a s t i n p a r t , h i e r a r c h i c a l l y r e l a t e d . For example, a w a l l and a v e r t i c a l s u r f a c e are both embedded i n an i n s t a n c e h i e r a r c h y whereas a house and a wa l l are embedded i n a composition h i e r a r c h y . Allowing models a t one l e v e l to serve as cues f o r the next higher l e v e l emphasizes the r e c u r s i v e c h a r a c t e r of the p e r c e p t u a l process and enables us to speak of a model r e p r e s e n t a t i o n i n the form of a cue/model h i e r a r c h y . , Such a cue/model h i e r a r c h y provides a uniform way f o r r e p r e s e n t i n g the models needed f o r d e s c r i p t i o n of the image a t d i f f e r e n t l e v e l s of a b s t r a c t i o n . Implementation of a cue/model h i e r a r c h y was one of the goals of HOUSE-The a b i l i t y to d e s c r i b e the image a t d i f f e r e n t l e v e l s of a b s t r a c t i o n s h o u l d i n c l u d e the a b i l i t y t o make i n f e r e n c e s from these d e s c r i p t i o n s . For example, two s u r f a c e s connected by a crack edge must have the same o r i e n t a t i o n ; a l l edges i n a h o r i z o n t a l s u r f a c e must be h o r i z o n t a l ; a s l a n t e d s u r f a c e cannot c o n t a i n v e r t i c a l edges. The a b i l i t y t o make such i n f e r e n c e s was another g o a l o f HOUSE. Human r e c o g n i t i o n of the house i s not a f f e c t e d i f the end p o i n t s i n the sketch do not c o i n c i d e (Figure 5b). Even a f r e e hand sketch as d e p i c t e d i n F i g u r e 5c i s s t i l l e a s i l y r e c o g n i z a b l e . A c e r t a i n amount of robustness should t h e r e f o r e a l s o be presen t i n HOUSE. Goals and overview c f HOUSE 38 The a b i l i t y t o d e s c r i b e an image a t d i f f e r e n t l e v e l s of a b s t r a c t i o n , i n f e r e n c i n g c a p a b i l i t y and robustness a re a l l as p e c t s o f human competence. T h i s does not imply, however, t h a t HOUSE embodies a l l of human v i s u a l competence. , As was p o i n t e d out i n s e c t i o n 2.2, the r e p r e s e n t a t i o n of a l l d i f f e r e n t types of p r o c e s s i n g known to take p l a c e i n humans i s a task f a r too complex f o r a r t i f i c i a l I n t e l l i g e n c e i n i t s pr e s e n t s t a t e of r e s e a r c h . I n s t e a d , A r t i f i c i a l I n t e l l i g e n c e c o n c e n t r a t e s on s u f f i c i e n c y and e f f i c i e n c y . The former i s s u e a l l o w s us to con c e n t r a t e on c e r t a i n a s p e c t s o f the p e r c e p t u a l process and i n v e s t i g a t e the circumstances under which these a s p e c t s can account f o r the p e r c e p t u a l task as a whole. The HOUSE program, f o r example, i s an experiment with a bottom-up p a s s - o r i e n t e d c o n t r o l s t r u c t u r e . Such a c o n t r o l s t r u c t u r e i s mainly concerned with the bottom-up component of the p e r c e p t u a l p r o c e s s . I n v e s t i g a t i o n of the s u f f i c i e n c y and e f f i c i e n c y of such a c o n t r o l s t r u c t u r e i s another g o a l o f the HOUSE program. Apart from t h i s , the e f f i c i e n c y i s s u e i s a l s o r e f l e c t e d i n the s t r i v i n g f o r modularity i n both r e p r e s e n t a t i o n and c o n t r o l of the HOUSE program. To summarize t h i s s e c t i o n , the implementation of HOUSE was motivated by f i v e d i f f e r e n t g o a l s : 1) to have a uniform r e p r e s e n t a t i o n f o r the models needed f o r d e s c r i p t i o n of the image at d i f f e r e n t l e v e l s of a b s t r a c t i o n , 2) t o have a program with a c e r t a i n i n f e r e n c i n g c a p a b i l i t y , 3) t o have a program with a c e r t a i n amount of robustness, 4) to have modularity i n r e p r e s e n t a t i o n and c o n t r o l . Goals and overview o f HOUSE 39 5) t o t e s t the adequacy and e f f i c i e n c y o f a hottora-up, pass o r i e n t e d c o n t r o l s t r u c t u r e . F i g u r e 5: a house drawn i n t h r e e degrees of accuracy 3.2 Overview o f the program HOUSE t a k e s , as i n p u t , a p l o t program which draws sketches l i k e the one i n F i g u r e 5a as i n p u t . The program maintains seven Goals and overview of HOUSE 40 d i f f e r e n t l e v e l s of r e p r e s e n t a t i o n of the o r i g i n a l s k e t c h : 1) Sketch l e v e l : the s k e t c h i s rep r e s e n t e d as an i n t e r c o n n e c t e d s e t o f p o i n t s , 2) L i n e / r e g i o n l e v e l : s t r a i g h t l i n e r e p r e s e n t a t i o n and r e g i o n r e p r e s e n t a t i o n . 3) Vertex l e v e l : l i n e s a re i n t e r r e l a t e d by v e r t i c e s , the r e g i o n boundaries and shapes are computed. 4) Edge l e v e l : l i n e s are i n t e r p r e t e d as edges, r e l a t i n g the s u r f a c e s connected by the edge. 5) O r i e n t a t i o n l e v e l : the t h r e e - d i m e n s i o n a l o r i e n t a t i o n s of both s u r f a c e s and edges a r e repr e s e n t e d . 6) Surface naming l e v e l : the s u r f a c e s c a r r y meaningful names. 7) Object l e v e l : the image i s re p r e s e n t e d as an o b j e c t . The f i r s t t h ree l e v e l s are a l l i n the p i c t u r e domain whereas the l a t t e r f o u r are r e p r e s e n t a t i o n s i n the scene domain. D i f f e r e n t l e v e l s of r e p r e s e n t a t i o n i n the scene domain i s what d i s t i n g u i s h e s HOUSE from MAPSEE. Although MAPSEE has the same types of r e p r e s e n t a t i o n i n the p i c t u r e domain, the scene domain i s represented a t one l e v e l o nly. Hereby, the c u e / i n t e r p r e t a t i o n Tables serve as the gates from the p i c t u r e domain to the scene domain. In HOUSE, th e r e are separate c u e / i n t e r p r e t a t i o n Tables f o r each of the f o u r scene l e v e l s . These T a b l e s are shown i n Tabl e 1 to 4. The Ta b l e s are l i n k e d by the f a c t t h a t models a t one l e v e l serve as cues f o r the next l e v e l . S ince the models a t the d i f f e r e n t l e v e l s are h i e r a r c h i c a l l y r e l a t e d to some exten t , we can speak of a model Goals and overview of HOUSE 41 r e p r e s e n t a t i o n i n the form of a cue/model h i e r a r c h y . The o v e r a l l g o a l of the pro c e s s i n HOUSE i s to re p r e s e n t the sketch a t a l l the d i f f e r e n t l e v e l s s t a r t i n g with the ske t c h l e v e l up to the hig h e s t l e v e l i t can reach. The r e p r e s e n t a t i o n s i n the p i c t u r e domain ( p o i n t s , l i n e s , r e g i o n s and v e r t i c e s ) are c r e a t e d by means of a s t r a i g h t f o r w a r d segmentation process (He w i l l see i n the d e t a i l e d d e s c r i p t i o n i n Chapter 4, th a t t h i s i s not q u i t e t r u e but f o r the moment i t w i l l do). T h i s segmentation process i s c o n s e r v a t i v e l y biased i n the same way as i t i s i n MAPSEE (see s e c t i o n 2.3). In the scene domain the c y c l e o f p e r c e p t i o n ( F i g u r e 2) ser v e s as a model f o r the c o n t r o l s t r u c t u r e . The c y c l e can be found a t each o f the f o u r scene l e v e l s . The c y c l e i s entered i n the cue d i s c o v e r y stage, f o l l o w e d by model i n v o c a t i o n , model t e s t i n g and model e l a b o r a t i o n . The g o a l of the process a t each l e v e l of r e p r e s e n t a t i o n i s : 1) t o f i n d a l l the unique (c o n s i s t e n t ) d e s c r i p t i o n s f o r the sk e t c h at t h a t l e v e l , and 2) to f i n d the cues t h a t allow a " b o o t s t r a p " i n t o the next hig h e r l e v e l of r e p r e s e n t a t i o n . The d i f f e r e n t types of processes t h a t occur i n the f o u r stages o f the c y c l e are l a r g e l y the same, as i n MAPSEE. These processes were d e s c r i b e d i n s e c t i o n 2.3. HOUSE does not c l o s e the c y c l e , however. Model e l a b o r a t i o n i s fo l l o w e d by cue d i s c o v e r y a t the next h i g h e r l e v e l . As a r e s u l t , HOUSE*s c o n t r o l s t r u c t u r e takes the form of a h e l i x . T h i s h e l i c a l metaphor i s i l l u s t r a t e d i n Fig u r e 6. A t e s t o f the s u f f i c i e n c y and e f f i c i e n c y o f the bottom-up p a s s - o r i e n t e d c o n t r o l s t r u c t u r e of HOUSE i s an important g o a l of the program. Goals and overview of BOUSE 42 Success and f a i l u r e s i n describing l i n e sketches i n the polyhedral domain w i l l also form an indicati o n f o r the strengths and weaknesses of the helix as a metaphor for the description of the perceptual process. In Chapter 4 a detailed description of the HOUSE program w i l l be given. Commonalities between the HOUSE program and the MAPSEE program w i l l frequently be pointed out. The most important reason for t h i s i s that a great deal of MAPSEE code i s used i n HOUSE. The f a c t that i t was possible to use the same procedures i n two dif f e r e n t perceptual domains shows the generality of the approach. Goals and overview of HOUSE Program d e s c r i p t i o n 44 CHAPTER 4: DESCRIPTION OF THE PROGRAM 4.1 i n p u t The i n p u t to HOOSE i s given i n the form of a p l o t program-Each p l o t c o n s i s t s of sequences of p l o t t e r commands (Draw-to (X,SJ and Move-to (X,Y) from the c u r r e n t p o s i t i o n ) . Each sequence of Draw-to commands i s c a l l e d a c h a i n . Some of the examples run with the program were hand coded (the house and the wedge); two other examples (an I-beam and a s k e t c h map) were a c t u a l l y c o n s t r u c t e d on a screen by means of a Lisp-based i n t e r a c t i v e g r a p h i c s language c a l l e d IGLISP (Jones, 1978). 4.2 P r o c e s s i n g i n the p i c t u r e domain Four d i f f e r e n t types of r e p r e s e n t a t i o n s are maintained a t d i f f e r e n t l e v e l s of d e t a i l i n the p i c t u r e domain: point, l i n e , r e g i o n and vertex r e p r e s e n t a t i o n s . Both the l i n e and r e g i o n r e p r e s e n t a t i o n s can be d e r i v e d from the p o i n t r e p r e s e n t a t i o n ; the v e r t e x r e p r e s e n t a t i o n s can be d e r i v e d from the l i n e r e p r e s e n t a t i o n . For t h i s reason, we can d i s t i n g u i s h between thr e e l e v e l s of r e p r e s e n t a t i o n i n the p i c t u r e domain: the Sketch l e v e l , the L i n e / r e g i o n l e v e l and the Vertex l e v e l . From the f o u r l e v e l s of r e p r e s e n t a t i o n , the p o i n t and l i n e r e p r e s e n t a t i o n can be d e r i v e d d i r e c t l y from the i n p u t . For the Program d e s c r i p t i o n 45 forma t i o n of r e g i o n s and v e r t i c e s , on the other hand, some i n t e r p r e t a t i o n i s necessary. The goal of the process i n the p i c t u r e domain i s to c r e a t e d i f f e r e n t types o f r e p r e s e n t a t i o n s . Some of these r e p r e s e n t a t i o n s w i l l f u n c t i o n as p r i m i t i v e s , whereas o t h e r s w i l l serve as cues i n the i n t e r p r e t a t i o n p r o c e s s . 4.2.1 The Sketch l e v e l Two d i f f e r e n t types of r e p r e s e n t a t i o n s are maintained a t the Sketch l e v e l . F i r s t of a l l , the sketch i s represented as a s e t o f i n t e r c o n n e c t e d p o i n t s (chains) as was given i n the i n p u t . I f the d i s t a n c e between two c o n s e c u t i v e p o i n t s i n a cha i n i s l a r g e r than 1/32 of the diameter o f the t o t a l p i c t u r e then t h i s gap i s f i l l e d up with an a d d i t i o n a l s et of p o i n t s such t h a t the mutaal d i s t a n c e between c o n s e c u t i v e p o i n t s i s no l a r g e r than 1/32 of the p i c t u r e diameter. The reason f o r t h i s i s r e l a t e d to the nature of the r e g i o n formation process and w i l l be e x p l a i n e d d u r i n g the d e s c r i p t i o n of t h a t process. The p o i n t s are a l s o represented i n the form of a 32 by 32 a r r a y . Each c e l l i n t h i s a r r a y c o n t a i n s the l i s t of p o i n t s i n t h a t a r e a . During the vertex f o r m a t i o n process we w i l l see t h a t such a r e p r e s e n t a t i o n i s very u s e f u l f o r answering g u e s t i o n s l i k e : "What am I near ?" 4.2.2 The L i n e / r e g i o n l e v e l In each of the c h a i n s a l i n e h i e r a r c h y i s b u i l t . The l i n e Program d e s c r i p t i o n f o r m a t i o n procedure i s i l l u s t r a t e d i n f i g u r e 7. 46 1 c F i g u r e 7: The l i n e segmentation process The c h a i n i s drawn from a t o b. I n i t i a l l y , the end^-points of the c h a i n are connected ( l i n e 1). T h i s i s the c o a r s e s t l i n e r e p r e s e n t a t i o n of the c h a i n . & search procedure r e t u r n s the p o i n t i n the c h a i n between the two l i n e end-points with the l a r g e s t d i s t a n c e to t h i s l i n e (point c ) . A f i n e r r e p r e s e n t a t i o n of the c h a i n i s now c r e a t e d by the l i n e c onnection ac ( l i n e 2) and be ( l i n e 3 ). The same procedure subsequently r e t u r n s the p o i n t s with the l a r g e s t d i s t a n c e to l i n e 2 and l i n e 3 ( p o i n t d and p o i n t e r e s p e c t i v e l y ) . Thus, a f i n e r r e p r e s e n t a t i o n of the chain can now be formed by the combination l i n e 4, 5, 6 and 7. The process o f refinement w i l l c o ntinue u n t i l t h e r e are no f r e e p o i n t s l e f t i n the c h a i n . Program description 47 The representation of a chain in the form of a l i n e hierarchy o f f e r s many advantages- One such advantage i s that the segmentation process i s very robust, although some of the l i n e s in Figure 5c, to mention an example, are s l i g h t l y curved, there w i l l be a straight l i n e representation f o r t h i s l i n e in the hierarchy. As a r e s u l t , i t i s possible to approximate a free hand sketch l i k e Figure 5c by a configuration of straight l i n e s l i n e s at any desired l e v e l of d e t a i l - a l i n e representation i n hierarchy form also allows us to search for vertices i n a very e f f i c i e n t way- This w i l l be shown i n the next section. The l i n e formation procedure was copied unchanged from MAPSEE. Tie region formation process s t a r t s with a guery as to whether the picture as a whole i s empty. I f the answer i s negative, then the picture i s subdivided into four square subpatches and the question i s repeated f o r each of the subpatches. I f a patch i s empty, then no further subdivision i s made. However, t h i s subdivision of patches does not continue ad  infinitum. The conservative bias i n the segmentation process has been mentioned already during the description of MAPSEE i n section 2.3. Begion refinement stops when a patch area of 1/3024 of the t o t a l picture area i s reached. This conservative bias has several consequences. One of these consequences has been already mentioned i n section 2.3; that i s , the picture can be a free hand sketch and one wants to prevent a leakage through gaps between l i n e s that were intended to i n t e r s e c t . The region Program d e s c r i p t i o n 48 f o r m a t i o n process uses the a r r a y p o i n t n o t a t i o n d e s c r i b e d a t the Sketch l e v e l . T h i s i s a l s o the reason why, i n the p o i n t formation process, the d i s t a n c e between two c o n s e c u t i v e p o i n t s i n a c h a i n should be s m a l l e r than the diameter of the s m a l l e s t patch s i z e allowed f o r the r e g i o n s : a leakage between two c o n s e c u t i v e p o i n t s i n s i d e a c h a i n would be p o s s i b l e otherwise. Another consequence o f the c o n s e r v a t i v e segmentation b i a s i s t h a t an intended r e g i o n i n the p i c t u r e whose area i s s m a l l e r than the s m a l l e s t allowed patch s i z e w i l l be missed during the r e g i o n f o r m a t i o n process. I t s h o u l d be noted t h a t because of the c o n s e r v a t i v e b i a s , the r e g i o n segmentation r e s u l t s cannot be d e r i v e d from the p o i n t n o t a t i o n i n a s t r a i g h t f o r w a r d way. In f a c t , the d e c i s i o n to s t o p the process once a c e r t a i n patch s i z e i s reached embodies the c h i c k e n and egg problem i n p e r c e p t i o n : segmentation r e q u i r e s i n t e r p r e t a t i o n and v i c e versa. F i g u r e 8 shows the r e s u l t s of the segmentation i n r e g i o n s of a house. The d e c i s i o n not t o s u b d i v i d e empty patches a l l o w s an e f f i c i e n t r e p r e s e n t a t i o n of the r e g i o n patches. A l s o n o t e , t h a t no r e g i o n i s formed i n the door-handle because i t s patch s i z e i s too s m a l l . Such r e g i o n s w i l l be accounted f o r d u r i n g the v e r t e x f o r m a t i o n process. Program d e s c r i p t i o n Figure 8: Region segmentation of the house example Program d e s c r i p t i o n 4.2-3 The Vertex l e v e l 5 3 4.2.3.1 The vertex formation procedures One of the goals of the process a t the p i c t u r e l e v e l i s to f i n d the cues that w i l l allow b o o t s t r a p p i n g i n t o the f i r s t of the scene l e v e l s : the Edge l e v e l . In both HOUSE and MAPSEE the v e r t i c e s serve as cues f o r b o o t s t r a p p i n g from the p i c t u r e i n t o the scene domain. The type of v e r t i c e s t h a t serve as cues are d i f f e r e n t i n HOUSE and SAPSEE, however, and so are t h e i r formation procedures. MAPSEE uses f i v e d i f f e r e n t vertex types as cues: Free-end, Acute-L, Obtuse-I, l i n k . Tee and M u l t i . HOUSE, on the other hand, does not d i s t i n g u i s h between an Acute-L and Obtuse-L and does not use a M u l t i . A d d i t i o n a l cues are provided i n HOUSE i n the form of an Arrow and a Fork. The HOUSE v e r t i c e s are shown i n F i g u r e 9. Vertex f o r m a t i o n s t a r t s with the formation of Free-ends, f o l l o w e d by the formation of two-arm j u n c t i o n s (Link and L-vertex) and three-arm j u n c t i o n s (Tee, Arrow, Fork and T e e ) . The f o r m a t i o n procedure f o r Free-ends was copied from MAPSEE. The end p o i n t s of each c h a i n are taken as Free-ends- For each Free-end i t s neighbour i s computed, t h a t i s , the point n e a r e s t to the Free-end which i s not pa r t of the same c h a i n . T h i s search procedure uses the a r r a y r e p r e s e n t a t i o n of p o i n t s . Program d e s c r i p t i o n Region-a arm2 rmz Region-b LINK Region-a ion-c Region stem arm2 TEE •arml Region-a Region-b ARROW Region-small — arml Region-large L-VERTEX arml Region-c Region-b stem FORK Figure 9: The v e r t i c e s used i n HOUSE For a l i n e t o be accepted as an arm of a j u n c t i o n i t has to s a t i s f y the f o l l o w i n g c o n d i t i o n s : 1) the l e n g t h o f t h e l i n e has to exceed a c e r t a i n predetermined minimus l e n g t h . 2) the sp r e a d i n g of p o i n t s around the l i n e has to be below a c e r t a i n predetermined value. Program d e s c r i p t i o n 5 5 The L - f o r m a t i o n procedure i s p a r t l y c o p i e d from MAPSEE and was r e w r i t t e n i n part- In MAPSEE the assumption was made t h a t L - v e r t i c e s c o u l d only occur i n s i d e a c h a i n . HOUSE a l s o a l l o w s a "between-chain" L-vertex by merging the Free-ends of two d i f f e r e n t c h a i n s whose end p o i n t s are s p a t i a l l y c l o s e . An " i n s i d e - c h a i n * ' L-vertex i s found by means of a s e a r c h down the l i n e h i e r a r c h y of each c h a i n . Once, a t a c e r t a i n l e v e l o f d e t a i l , two a d j a c e n t l i n e s s a t i s f y the junction-arm c o n d i t i o n s and the angle they i n c l u d e i s l e s s than 180 degrees, then the search can s t o p . For example, i n Figure 7 the search w i l l s t o p a t the t h i r d l e v e l ( l i n e 4, 5 , 6 and 7) c r e a t i n g an L-vertex at p o i n t s b, c, d and e. A "between-chain" L-vertex can be formed i f two Free-ends are very c l o s e and i f t h e r e i s a l i n e o r i g i n a t i n g i n each Free-end t h a t s a t i s f i e s the junction-arm c o n d i t i o n s . I f the angle i n c l u d e d by the arms i s c l o s e t o 180 degrees, a Link w i l l be formed, an L-vertex otherwise. Note the e f f i c i e n c y i n the search procedures of both types of L - v e r t i c e s . The l i n e h i e r a r c h y of each c h a i n w i l l be t r a c e d j u s t down to the l e v e l of d e t a i l r e q u i r e d . I t should a l s o be noted that a t t h i s p o i n t HOUSE has a d i f f e r e n t n o t i o n of a L i n k , compared to MAPSEE. In c o n t r a s t t o the l i n k formation procedure i n HOUSE, MAPSEE w i l l form a Link i f and only i f there i s a s m a l l gap between the Free-ends of the same c h a i n . Once the two-arm j u n c t i o n s have been formed, the three-arm j u n c t i o n formation procedures become a c t i v e . A three-arm j u n c t i o n w i l l be formed under the f o l l o w i n g c o n d i t i o n s : Program d e s c r i p t i o n 56 1) an L - v e r t e x (or Link} and a Free-end are very c l o s e or the gap between a Free-end and t h e n e a r e s t p o i n t of another c h a i n i s very s m a l l . 2) the junction-arm c o n d i t i o n s are s a t i s f i e d . The type of three-arm j u n c t i o n t h a t w i l l be c r e a t e d h e r e a f t e r , depends on the a n g l e s i n c l u d e d by the arms: 1) i f two arms i n c l u d e an angle c l o s e to 180 degrees, then a Tee w i l l be formed. 2) i f the three-arm j u n c t i o n i n c l u d e s two acute a n g l e s , then an arrow w i l l be formed. 3) i f n e i t h e r 1 nor 2 i s the case, then a Fork w i l l be formed. as i s the case i n the r e g i o n f o r m a t i o n process, the vertex formation process i s c o n s e r v a t i v e l y b i a s e d . For example, the gap between Free-ends has t o be very s m a l l ; arm l e n g t h should be r e l a t i v e l y l a r g e and spreading of p o i n t s around an arm r e l a t i v e l y s m a l l . Once again, the s t r a t e g y i s t h a t i t i s b e t t e r t o miss genuine cues than t o form f a l s e cues. In HOUSE, the values of the parameters s p e c i f y i n g minimum arm l e n g t h , spreading of p o i n t s and gap s i z e a r e predetermined. I f we look a t segmentation as being a process of s t r a i g h t f o r w a r d computation from one type of r e p r e s e n t a t i o n i n t o another where a l l p r o p e r t i e s of one r e p r e s e n t a t i o n can be d e r i v e d from the r e p r e s e n t a t i o n from which i t was computed, then the vertex formation i s not j u s t segmentation. The d e c i s i o n t o c l o s e or not to c l o s e gaps, motivated by the p r e f i x e d values o f parameters means t h a t vertex formation i s a l s o the r e s u l t of i n t e r p r e t a t i o n . F i g u r e 10 shows the v e r t i c e s as they were Program d e s c r i p t i o n 57 formed d u r i n g segmentation of a sketch r e p r e s e n t i n g a house. 4.2-3.2 Besegmentation of the cha i n s In MAPSEE, the assumption has been made t h a t the c h a i n s as they have been drawn can serve as p r i m i t i v e s d u r i n g the i n t e r p r e t a t i o n process. T h i s assumption puts c o n s t r a i n t s on the way the s k e t c h can be drawn. HOUSE, on the c o n t r a r y , does not have such c o n s t r a i n t s . Apart from the r e g i o n s , the sub-chains serve as p r i m i t i v e s i n the i n t e r p r e t a t i o n process. A sub-chain i s a part of a cha i n and r e p r e s e n t s the s e t of p o i n t s i n between two c o n s e c u t i v e v e r t i c e s i n a c h a i n . A l i n e h i e r a r c h y s i m i l a r t o the one c r e a t e d i n each c h a i n i s b u i l t i n each o f the subchains. P r e c a u t i o n s are taken t h a t the l i n e s t r u c t u r e o f the ch a i n w i l l not be d u p l i c a t e d . I f two p o i n t s i n the sub^-chain a r e found to be connected a l r e a d y then the l i n e f ormation process s t o p s . A v e r t e x f o r m a t i o n p r o c e s s , f o l l o w e d by a c h a i n resegmentation i s a mere g e n e r a l procedure than the one o f f e r e d by MAPSEE which does no cha i n resegmentation. The HOUSE procedure guarantees t h a t , r e g a r d l e s s of how the ske t c h was drawn, i t w i l l always r e s u l t i n the same segmentation. MAPSEE does not have t h i s guarantee. An "unexpected' 1 way of drawing the s k e t c h may r e s u l t i n an i n c o r r e c t vertex formation or may r e s u l t i n cha i n s with d i f f e r e n t i n t e r p r e t a t i o n s i n d i f f e r e n t p a r t s , a s i t u a t i o n MAPSEE cannot cope with. 58 Figure 10: The v e r t i c e s found i n the house example Figure 10 continued Program d e s c r i p t i o n 4-2.3-3 C o n s t r a i n i n g the p r i m i t i v e s 61 For the i n t e r p r e t a t i o n process at the d i f f e r e n t l e v e l s i n the scene domain i t i s c r i t i c a l t o know which cues c o n s t r a i n which p r i m i t i v e s - Each ver t e x c o n s t r a i n s p o s s i b l e i n t e r p r e t a t i o n s of the sub-chains t h a t c o n s t i t u t e i t s arms. Both v e r t i c e s and r e g i o n s were computed independently, however. As a r e s u l t , we do not know which v e r t i c e s c o n s t r a i n the i n t e r p r e t a t i o n s of which r e g i o n s . T h i s gap i n i n f o r m a t i o n i s f i l l e d by means of a search out along the b i s e c t o r of each p a i r of vertex arms- T h i s search i s a l s o c o n s e r v a t i v e l y biased i n the sense t h a t no search w i l l be done at a d i s t a n c e f u r t h e r than the s i z e of the s m a l l e s t patch (formed i n the r e g i o n segmentation o process) from each v e r t e x . I f no r e g i o n can be found then a region-ghost i s c r e a t e d . Such a region-ghost stands f o r the r e g i o n which has t h a t r e l a t i o n s h i p to the vertex but cannot yet be i d e n t i f i e d . T h i s r e g i o n may or may not a l r e a d y e x i s t . On the one hand, i t i s p o s s i b l e t h a t the r e g i o n e x i s t s , but the d i s t a n c e between vertex and nearest patch of the r e g i o n was more than one patch s i z e . On the other hand, the r e g i o n may not e x i s t because the r e g i o n area was s m a l l e r than the s m a l l e s t patch s i z e allowed. I n the l a t t e r c a s e , there w i l l be a set of regi o n - g h o s t s p o i n t i n g i n t o the area from the d i f f e r e n t v e r t i c e s surrounding the area. T h i s procedure was copied from MAPSEE. Program d e s c r i p t i o n 4-2-3-4 Resegmentation of the r e g i o n s 62 At t h i s stage of p r o c e s s i n g we have d i f f e r e n t types of i n t e r r e l a t e d r e p r e s e n t a t i o n s i n the p i c t u r e - A p a r t of these r e p r e s e n t a t i o n s has been c r e a t e d by independent procedures. For example, l i n e s and v e r t i c e s have been c r e a t e d by procedures independent from the procedures which c r e a t e d the r e g i o n s . However, these d i f f e r e n t r e p r e s e n t a t i o n s c o n s t r a i n each other i n s e v e r a l r e s p e c t s . one such c o n s t r a i n t i s t h a t a l i n e c l o s u r e ( i . e . Region boundary) cannot surround more than one r e g i o n . As a r e s u l t cf the c o n s e r v a t i v e n e s s of the r e g i o n f o r m a t i o n procedure i t i s p o s s i b l e t h a t more than one r e g i o n i s i n s i d e one r e g i o n boundary. Region 5 i n F i g u r e 8 i s an example of t h i s . As a consequence of the shape of the boundary, two independent r e g i o n s are formed i n t h a t area (the p i c t u r e shows the r e g i o n s a f t e r resegmentation). For t h i s reason the c o n s i s t e n c y i s t e s t e d between the values of the r e g i o n p o i n t e r s of p a i r s of v e r t i c e s c o n s t r a i n i n g the same sub-chain. F i g u r e 11 i l l u s t r a t e s the s i t u a t i o n . In F i g u r e 11, sub-chain1 i s c o n s t r a i n e d by Arrow9 and F o r k l . The f o l l o w i n g r u l e s are a p p l i e d to the p o i n t e r p a i r s (a,b) and ( c , d ) : 1) i f both elements of the p a i r are p o i n t i n g t o region i or region-ghost i , then do nothing 2) i f one element p o i n t s to r e g i o n i and the other element p o i n t s t o r e g i o n j and j > i , then r e p l a c e r e g i o n j by r e g i o n i ; i f j < i then r e p l a c e r e g i o n i by r e g i o n j ; merge the two r e g i o n s . Program d e s c r i p t i o n 63 3) i f one element p o i n t s to r e g i o n i and the other element p o i n t s t o region-ghost j , then r e p l a c e the region-ghost by the r e g i o n . 4) i f one element p o i n t s to r e g i o n - g h o s t i and the o t h e r to region-ghost j and j > i , then r e p l a c e region-ghost j by r e g i o n - g h o s t i ; i n case j < i , then r e p l a c e region-ghost i by region-ghost J . subchain-I ARROW-I F i g u r e 11: The r e g i o n c o n s i s t e n c y t e s t During t h i s t e s t , a l l the sub-chains a r e subsequently v i s i t e d . I f replacements have been made during t h i s v i s i t , then the t e s t i s repeated u n t i l there has been a pass through the sub-chains without any replacement. This process i s another example of the redundancy of the p e r c e p t u a l process. From a l l the searches iade from d i f f e r e n t Program d e s c r i p t i o n 64 v e r t i c e s on the boundary o f a r e g i o n i n t o t h a t r e g i o n (the process i s d e s c r i b e d i n s e c t i o n 4-3.3), only one needs t o succeed. The f a i l u r e to f i n d the r e g i o n from the other v e r t i c e s w i l l be a u t o m a t i c a l l y compensated f o r i n the r e g i o n resegmentation p r o c e s s . Region resegmentation does not occur i n MAPSEE a t the Vertex l e v e l . In MAPSEE, the r e g i o n resegmentation was motivated by the i n t e r p r e t a t i o n process. A r e g i o n merge was attempted between r e g i o n s with the same i n t e r p r e t a t i o n . 4.2.3.5 Shapes c f r e g i o n s In order to compute the shapes o f the r e g i o n s , one has to know what the boundaries are. Since we know a l l the v e r t i c e s t h a t c o n s t r a i n each r e g i o n , one only has to check which sub^chains i n t e r c o n n e c t these v e r t i c e s . The shape of each boundary i s computed i n a very crude way o n l y . HOUSE can d i s t i n g u i s h between a t r i a n g l e , p a r a l l e l o g r a m , t r a p e z o i d , q u a d r i l a t e r a l and a polygon. Region shapes were not computed i n MAPSEE. 4.2.3-6 Region r e l a t i o n s h i p s R e l a t i o n s h i p s between r e g i o n s and r e g i o n - g h o s t s are a l s o computed i n a very crude way. HOUSE d i s t i n g u i s h e s between f o u r d i f f e r e n t types: CONNECT (two r e g i o n s share one l i n e ) , COMMON (two r e g i o n s share more than one l i n e , but not the whole P r o g r a m d e s c r i p t i o n 65 b o u n d a r y ) , I N S I D E ( t w o r e g i o n s s n a r e a w h o l e b o u n d a r y o f o n e o f t h e m ) a n d SDBROUNDS ( t h e i n v e r s e o f i n s i d e ) . 4 . 3 P r o c e s s i n g i n t h e s c e n e d o m a i n T h e t h r e e i m p o r t a n t c o n c e p t s i n t h e i n t e r p r e t a t i o n p r o c e s s a r e : p r i m i t i v e s , c u e s a n d m o d e l s . P r i m i t i v e s a r e t h e e n t i t i e s i n t e r m s o f w h i c h o n e s e e k s t o r e p r e s e n t t h e s k e t c h a t a c e r t a i n l e v e l o f a b s t r a c t i o n . T h e p r i m i t i v e s u s e d f o r i n t e r p r e t a t i o n d i f f e r f r o m l e v e l t o l e v e l . M o d e l s s e r v e a s i n t e r p r e t a t i o n s o f p r i m i t i v e s . U s u a l l y , t h e y a r e h i g h e r o r d e r r e p r e s e n t a t i o n s o f t h e p r i m i t i v e s . T h e c u e s s e r v e a s m e d i a t o r s b e t w e e n m o d e l s a n d p r i m i t i v e s . E a c h c u e c o n s t r a i n s o n e o r m o r e p r i m i t i v e s a n d s u g g e s t s o n e o r m o r e p o s s i b l e i n t e r p r e t a t i o n s ( m o d e l s ) f o r t h e m . T h e c y c l e o f p e r c e p t i o n s e r v e s a s a m e t a p h o r f o r t h e d e s c r i p t i o n o f t h e p r o c e s s a t e a c h o f t h e s c e n e l e v e l s . T h e c y c l e i s e n t e r e d i n c u e d i s c o v e r y s t a g e , a t t h e E d g e l e v e l t h e v e r t i c e s f o u n d i n t h e p i c t u r e s e g m e n t a t i o n p r o c e s s s e r v e a s c u e s . A t t h e h i g h e r l e v e l s , o n t h e o t h e r h a n d , e a c h i n t e r p r e t a t i o n o f a p r i m i t i v e , t h a t i s p a r t o f a c o n s i s t e n t d e s c r i p t i o n o f t h e s k e t c h a t t h a t l e v e l , s e r v e s a s a c u e f o r t h e n e x t h i g h e r l e v e l . I n p r i n c i p l e , h o w e v e r , e a c h o f t h e m o d e l s a t o n e l e v e l c a n s e r v e a s c u e s a t e a c h o f t h e h i g h e r l e v e l s . M o d e l i n v o c a t i o n i s a s t a n d a r d p r o c e d u r e t h a t i n s t a l l s t h e i n t e r p r e t a t i o n s a l l o w e d b y e a c h o f t h e c u e s i n t h e Program d e s c r i p t i o n 66 i n t e r p r e t a t i o n domain of each of the p r i m i t i v e s c o n s t r a i n e d by these cues. These i n t e r p r e t a t i o n s are provided i n the form of c u e / i n t e r p r e t a t i o n T a b l e s s i m i l a r to the ones used i n MAPSEE as d e s c r i b e d i n s e c t i o n 2.3. Although the form of t h e s e T a b l e s i s the same at every l e v e l , the content of each of these Tables d i f f e r s from l e v e l to l e v e l . Model t e s t i n g can appear i n two d i f f e r e n t ways; on the one hand, model t e s t i n g can take p l a c e i n the form of a f i l t e r i n g procedure which prevents c e r t a i n cue i n t e r p r e t a t i o n s from being i n s t a l l e d under c e r t a i n circumstances. In MAPSEE such a f i l t e r i n g procedure was a p p l i e d t o the i n s t a l l a t i o n of r i v e r (or r i v e r * ) as an i n t e r p r e t a t i o n (see s e c t i o n 2-3). In HOUSE such a f i l t e r i n g procedure has been a p p l i e d a t both the Edge l e v e l and the Object l e v e l . At the O r i e n t a t i o n l e v e l , on the other hand, model t e s t i n g takes p l a c e i n the form of a s e t of c o n s i s t e n c y t e s t s between s u r f a c e i n t e r p r e t a t i o n s and the i n t e r p r e t a t i o n s of t h e i r r e s p e c t i v e boundaries. These model t e s t s embody the i n f e r e n c i n g c a p a b i l i t y of the system. A d i f f e r e n c e between the f i l t e r i n g procedures and c o n s i s t e n c y t e s t s i s t h a t the f i l t e r i n g procedures are o p e r a t i o n a l during model i n v o c a t i o n r a t h e r than a f t e r model i n v o c a t i o n , which i s the case f o r c o n s i s t e n c y t e s t s . Model e l a b o r a t i o n i s provided by the network c o n s i s t e n c y a l g o r i t h m e x p l a i n e d i n s e c t i o n 2-3. In MAPSEE, however, only s t e p 1 and 2 of the a l g o r i t h m were implemented. In HOUSE, the implementation of the a l g o r i t h m has been completed; t h a t i s , i f a f t e r e x e c u t i o n of step 1 and 2, there are two or more p r i m i t i v e s whose i n t e r p r e t a t i o n domains c o n t a i n more than one Program d e s c r i p t i o n 67 i n t e r p r e t a t i o n , then the domain of one of these p r i m i t i v e s i s s p l i t i n h a l f and st e p 1 and 2 are repeated. T h i s process co n t i n u e s u n t i l each of the p r i m i t i v e s has a unique i n t e r p r e t a t i o n . T h i s v e r s i o n of HC r e t u r n s a l l the p o s s i b l e ways of d e s c r i b i n q the sketch a t the l e v e l i n v o l v e d . Each d e s c r i p t i o n of the ske t c h i s c o n t a i n e d i n a l i s t . The elements i n such a l i s t are the p r i m i t i v e s p a i r e d with one s p e c i f i c i n t e r p r e t a t i o n . The s t r u c t u r e of the s e t of d e s c r i p t i o n s of the sketch i s very e f f i c i e n t . D i f f e r e n t d e s c r i p t i o n s share s t r u c t u r e s where p o s s i b l e as F i q u r e 12 shows. p r i m i t i v e D = description I = i n t e r p r e t a t i o n F i g u r e 12: Repre s e n t a t i o n of a h y p o t h e t i c a l sketch c o n s i s t i n q of fo u r d i f f e r e n t p r i m i t i v e s with f o u r d i f f e r e n t p o s s i b l e d e s c r i p t i o n s Program d e s c r i p t i o n 68 Ho resegmentation i s done i n the scene domain as a r e s u l t of i n t e r p r e t a t i o n (as was the case i n HAPSEE). At each of the l e v e l s i n the scene domain, th e i n t e r p r e t a t i o n s t h a t are p a r t of a c o n s i s t e n t d e s c r i p t i o n of the sketch a t t h a t l e v e l serve as cues f o r b o o t s t r a p p i n g i n t o the next h i g h e r l e v e l . , 4.3.1 The Edge l e v e l The Edge l e v e l i s the f i r s t l e v e l i n the scene domain. At t h i s l e v e l the s k e t c h i s represented by a s e t of edge i n t e r p r e t a t i o n s o f the Waltz type (Waltz, 1972). These i n t e r p r e t a t i o n s a r e : convex, concave, occlude/concave ( i . e . concave and separable) and crack (shadows and i l l u m i n a t i o n c h a r a c t e r i s t i c s of the s u r f a c e s are not r e p r e s e n t e d i n HOUSE). Bote t h a t occlude and occlude/concave a r e d i r e c t i o n a l i n t e r p r e t a t i o n s . An edge i s occlude or occlude/concave i f the s u r f a c e on the r i g h t of the d i r e c t i o n i n which the edge i s drawn occl u d e s the s u r f a c e on the l e f t s i d e . I f t h i s i s not the case then the i n t e r p r e t a t i o n w i l l be o c c l u d e * or occlude/concave*. The v e r t i c e s which r e s u l t from the p i c t u r e segmentation process serve as cues f o r these i n t e r p r e t a t i o n s . F o r t h i s reason the p i c t u r e segmentation process can be seen as the cue d i s c o v e r y stage f o r the Edge l e v e l . The l i n e s are the p r i m i t i v e s f o r the i n t e r p r e t a t i o n p rocess. The i n t e r p r e t a t i o n of each l i n e i s c o n s t r a i n e d by the cues at both s i d e s o f each edge. Table 1 shows these c o n s t r a i n t s . Program d e s c r i p t i o n 69 cue type domain (s) chn Free-end 1 convex/concave/occl - c o n e { * ) , o c c l u d e ( * ) , c r a c k I chn 1 1 chn 2 L-vertex | occlude (*) ) concave/occl-conc (*) / I | occluder*)/convex I concave/occlude {*) / j o c c l u d e (*) I o c c l - c o n c ( * ) / c o n v e x ] | crack j crack Link I convex | [ concave J ! o c c l u d e (*} | | o c c l - c o n c {*) J I crack I convex concave occlude{*) o c c l - c o n c ( * ) crack | stem-chn | chn1 | chn 2 Arrow | concave | | convex . ] I convex j I convex J convex concave occlude{*) o c c l - c o n c {*) | convex j concave | occlude (*) I o c c l - c o n c ( * ) Pork | convex | I concave ] | concave j | concave | | occlude (*) j I occlude {*) | convex concave o c c l - c o n c (*) occlude (*) concave occlude <*) J convex | concave } o c c l - c o n c ( * ) | occlude (*) 1 occlude(*) | concave Tee I convex/concave/ J |occlude{*)/crack 1 | / o c c l - c o n c (*) 1 I c r a c k 1 I c r a c k | I crack J occlude{*) crack o c c l - c o n c {*) concave J occlude <*) i crack j o c c l - c o n c ( * ) | concave T a b l e 1: C u e / i n t e r p r e t a t i o n Table f o r the Edge l e v e l CHM* unenntewioiij. courts r Program d e s c r i p t i o n 72 Model i n v o c a t i o n and model e l a b o r a t i o n are as d e s c r i b e d i n s e c t i o n 4.3. A f i l t e r i n g procedure t h a t s t a n d s f o r model t e s t i n g makes sure t h a t from the d i r e c t i o n a l i n t e r p r e t a t i o n s only one type s i l l be i n s t a l l e d . At the Edge l e v e l the assumption i s made t h a t no more than t h r e e s u r f a c e s can meet i n a v e r t e x . F i g u r e 13 shows one of the p o s s i b l e d e s c r i p t i o n s of the house a t the Edge l e v e l . 4.3.2 The O r i e n t a t i o n l e v e l L i n e s , r e g i o n s and r e g i o n - g h o s t s are the p r i m i t i v e s f o r the i n t e r p r e t a t i o n process a t t h i s l e v e l . The model i n t e r p r e t a t i o n s p o s s i b l e a r e : h o r i z o n t a l , v e r t i c a l or s l a n t e d . T h i s r e p r e s e n t a t i o n i s much cruder than, f o r example, the g r a d i e n t r e p r e s e n t a t i o n f o r s u r f a c e s and edges used by Mackworth (1973). The models from the Edge l e v e l s serve as cues a t the O r i e n t a t i o n l e v e l . I f , f o r example, a l i n e was i n t e r p r e t e d as occlude/concave a t the Edge l e v e l , then the occlude/concave i n t e r p r e t a t i o n s e r v e s as a cue a t the O r i e n t a t i o n l e v e l . The model i n t e r p r e t a t i o n s allowed by t h i s cue w i l l be v e r t i c a l or s l a n t e d f o r the o c c l u d i n g s u r f a c e and h o r i z o n t a l f o r the s u r f a c e being o c c l u d e d . The edge c o n s t r a i n e d by t h i s cue must be a h o r i z o n t a l edge. The c u e / i n t e r p r e t a t i o n T a b l e f o r the O r i e n t a t i o n l e v e l i s shown i n Table 2. Program d e s c r i p t i o n 73 cue type domain (s) I chn 1 r e g i o a - a r e g i o n - b occlude(*) . 1 h/sl/v-edge h / s l / v - s u r f ace h / s l / v - s u r f a c e o c c l - c o n c (*) | h-edge I s l / v - s u r f a c e h - s u rface concave | h-edge | h-edge | v-edge | s l / v - s u r f a c e I h - s u r f a c e | v - s u r f a c e h - s u r f a c e s l / v - s u r f a c e v - s u r f a c e convex I v-edge | h-edge | h-edge | sl-edge I s l - e d g e 1 v - s u r f a c e I s l / v - s u r f a c e I h / s l - s u r f a c e I s l / v - s u r f a c e I s l - s u r f a c e v - s u r f a c e h / s l - s u r f a c e 1 s l / v - s u r f a c e s l - s u r f a c e s l / v - s u r f a c e crack | h/sl/v-edge | h/sl-edge | h-edge 1 v - s u r f a c e | s l - s u r f a c e | h — s u r f a c e v - s u r f a c e I s l - s u r f a c e I h - s u r f a c e Table 2: C u e / i n t e r p r e t a t i o n Table f o r the O r i e n t a t i o n l e v e l fiodel t e s t i n g i s a very important stage a t the O r i e n t a t i o n l e v e l . I n t e r p r e t a t i o n s of s u r f a c e s have t o be ( i n t e r n a l l y ) c o n s i s t e n t with i n t e r p r e t a t i o n s of t h e i r boundaries. The f o l l o w i n g ( i n t e r n a l ) c o n s i s t e n c y r u l e s are a p p l i e d i n HOUSE: 1) v e r t i c a l l i n e s t h a t are part of a v e r t i c a l s u r f a c e must be v e r t i c a l edges 2) the same v e r t i c a l l i n e s cannot be v e r t i c a l edges when the s u r f a c e i s s l a n t e d 3) a l l the boundary edges of a h o r i z o n t a l s u r f a c e must be h o r i z o n t a l 4 ) i f , on the other hand, two non p a r a l l e l edges i n a s u r f a c e are h o r i z o n t a l , then the su r f a c e must be h o r i z o n t a l Program d e s c r i p t i o n 74 5) p a r a l l e l edges i n the same s u r f a c e must have the same o r i e n t a t i o n . In the c y c l e cf p e r c e p t i o n model t e s t i n g ( i n t e r n a l c o n s i s t e n c y ) precedes model e l a b o r a t i o n ( e x t e r n a l c o n s i s t e n c y ) . The nature of the NC a l g o r i t h m , however, r e q u i r e s the procedures t e s t i n g the two types of c o n s i s t e n c y t o be implemented i n t e r a c t i v e l y . NC s t a r t s with a s e t of p o s s i b l e i n t e r p r e t a t i o n s i n the domain of each p r i m i t i v e as they a r e proposed by the cues. By a p p l y i n g the c o n s t r a i n t s that d i f f e r e n t cues impose on each p r i m i t i v e , a unique i n t e r p r e t a t i o n f o r each of the p r i m i t i v e s w i l l g r a d u a l l y emerge. I n t e r n a l c o n s i s t e n c y , on the other hand, cannot be t e s t e d u n t i l a unique i n t e r p r e t a t i o n i s proposed f o r one of the p r i m i t i v e s . as a r e s u l t of t h i s one has to wait with an i n t e r n a l c o n s i s t e n c y t e s t u n t i l NC s t a r t s to r e t u r n unique i n t e r p r e t a t i o n s f o r the s u r f a c e s and edges. I f one wants to maintain a complete s e p a r a t i o n between the two types of c o n s i s t e n c y t e s t s then i t would be p o s s i b l e to t e s t e x t e r n a l c o n s i s t e n c y f i r s t and i n t e r n a l c o n s i s t e n c y next. For reasons o f e f f i c i e n c y , however, the procedures were implemented i n t e r a c t i v e l y . Each time step 2 i n NC i s completed the i n t e r n a l c o n s i s t e n c y package i s a p p l i e d to those p r i m i t i v e s whose i n t e r p r e t a t i o n domain c o n t a i n s a unique i n t e r p r e t a t i o n . I f such an i n t e r p r e t a t i o n appears to be i n t e r n a l l y i n c o n s i s t e n t then p r o c e s s i n g can be i n t e r r u p t e d a t t h i s s t a g e , r a t h e r then Program d e s c r i p t i o n 78 completing the e x t e r n a l c o n s i s t e n c y t e s t f i r s t and f i n d i n g out t h a t the proposed i n t e r p r e t a t i o n s are i n t e r n a l l y i n c o n s i s t e n t a f t e r w a r d s . The i n t e r n a l c o n s i s t e n c y t e s t s can be seen as a form of p r o c e d u r a l attachment (Winograd, 1975). at the O r i e n t a t i o n l e v e l , the assumption i s made, t h a t the outer r e g i o n i s a h o r i z o n t a l support s u r f a c e . F i g u r e 14 shows one of the p o s s i b l e d e s c r i p t i o n s o f a house at t h i s l e v e l . 4.3.3 The Surface naming l e v e l Surfaces are not d e s c r i b e d completely by t h e i r o r i e n t a t i o n ; they a l s o c a r r y meaningful names. In the p o l y h e d r a l domains, such meaningful names are u s u a l l y l i m i t e d t o e x p r e s s i n g a s u r f a c e as b e i n g e i t h e r a s i d e - f a c e o r a t o p - f a c e . I n the house domain, however, we have a r i c h e r vocabulary: a s u r f a c e can be ground, ground* (a s u r f a c e c o p l a n a r with the ground), r o o f , w a l l , window, door or door-handle. Both types of e x p r e s s i o n s ( s i d e - f a c e s and walls) are p o s s i b l e a t the Surface naming l e v e l as Table 3 shows. Program d e s c r i p t i o n 79 cue type o c c l - c o n c (*) chn h-edge domain(s) r e g i o n - a w a l l / d o o r / r o o f / s i d e - f a c e / t o p-f ace region-b ground concave h-edge h-edge v-edge v-edge convex h/sl-edge h/sl-edge h/sl-edge h/sl-edge v-edge v-edge w a l l / d o o r / r o o f / s i d e - f a c e / t o p - f a c e ground/ground* w a l l s i d e - f a c e w a l l / r o o f r o o f s i d e - f a c e / t o p - f ace t o p - f a c e w a l l s i d e - f a c e ground/ground (*) w a l l / d o o r / r o o f / side*-f a c e / t o p - f ace wall s i d e - f a c e roof w a l l / r o o f t o p - f a c e s i d e - f a c e / t o p - f a c e w a l l s i d e - f a c e c r a c k h/sl/v-edge h/sl/v-edge h/sl/v-edge h/sl-edge h/sl-edge h-edge wall/door door/door-handle /window side-rface roof t o p - f a c e ground/ground(*) i n s i d e surrounds common regxon-a window wall/door w a l l / s i d e - f a c e / r o o f / t o p - f a c e ground wall/door-handle door door/door-h an d l e /window wall/door s i d e - f a c e roof t o p - f a c e ground/ground (•*) region - b wall/door window ground w a l l / s i d e - f a c e / r o o f / t o p - f a c e door wall/door-handle h - s u r f a c e r e g i o n ground/ground{*)/roof/top-face s l - s u r f a c e v - s u r f a c e r o o f / t o p - f a c e wall/door/window/door-handle/side-face Table 3 : C u e / i n t e r p r e t a t i o n Table f o r the Sur f a c e naming l e v e l Program d e s c r i p t i o n 80 Another phenomenon, that manifests i t s e l f at t h i s l e v e l , i s th a t the model i n t e r p r e t a t i o n s o f the previous O r i e n t a t i o n l e v e l are not the only p o s s i b l e cues a t t h i s l e v e l . On the c o n t r a r y , models from a l l p r e v i o u s l e v e l s are allowed to serve as cues. An example of such a low l e v e l model s e r v i n g as a cue a t the Surfa c e naming l e v e l a re the INSIDE, S0BRO0NDS, and COMMON reg i o n r e l a t i o n s h i p s as cues. Because such cues occur i n houses onl y , they exclude s i d e - f a c e s and t o p - f a c e s as p o s s i b l e model i n t e r p r e t a t i o n s . As a r e s u l t , t h i s type of cue s e v e r e l y c o n s t r a i n s the number of p o s s i b l e i n t e r p r e t a t i o n s a t t h i s l e v e l , because i t p r o h i b i t s any d e s c r i p t i o n s of the ske t c h i n the form of s i d e - or t o p - f a c e . On the other hand, the absence o f such cues does not p r o h i b i t a d e s c r i p t i o n of the sketch i n terms of w a l l s and r o o f s . The same type of i n t e r n a l c o n s i s t e n c y r u l e s t h a t a p p l i e d a t the O r i e n t a t i o n l e v e l apply a t t h i s l e v e l . However, s i n c e the c o n s t r a i n t s imposed by the r u l e s were s a t i s f i e d a l r e a d y a t the pre v i o u s l e v e l , t h e r e i s no need to t e s t them again. As a r e s u l t , no model t e s t i n g takes p l a c e a t the Surface naming l e v e l . I t i s the NC t e s t t h a t p r o v i d e s the s e t of c o n s i s t e n t i n t e r p r e t a t i o n s a t t h i s l e v e l . At the Surface naming l e v e l , the assumption i s made th a t the outer r e g i o n i s the ground. Another hypothesis i n t r o d u c e d a t t h i s l e v e l i s the o b j e c t support h y p o t h e s i s , t h a t i s , o b j e c t s must be supported by other o b j e c t s or the ground ( i n HOUSE an o b j e c t must be supported by the ground). MSton m c * r«TRTj0» j . ikoor MSIOU m c «Mcmio i i s« »tnn J r L net tan i»it(\f«fHTiowT, aoon J Figure 15 continued r ~ i L HCCIOI UnCBPMTKTJMJi OBPH-M»0L£ J Program d e s c r i p t i o n 84 As a r e s u l t of t h i s , only those i n t e r p r e t a t i o n s whose outer boundary co n t a i n e d an occlude/concave (*) or c r a c k i n t e r p r e t a t i o n a t the Edge l e v e l are c o n s i d e r e d f o r p r o c e s s i n g a t the Surface naming l e v e l . T h i s s e v e r e l y c o n s t r a i n s the number of p o s s i b l e i n t e r p r e t a t i o n s a t t h i s l e v e l . As we w i l l see i n Chapter 5 where the r e s u l t s are d e s c r i b e d , t h e r e i s a sharp decrease i n the number of i n t e r p r e t a t i o n s a t the S u r f a c e naming l e v e l . L i n e s , r e g i o n s and region-ghosts a r e the p r i m i t i v e s a t the Surface-naming l e v e l . F i g u r e 15 shows one of the p o s s i b l e d e s c r i p t i o n s of a s k e t c h of a house a t t h i s l e v e l . 4.3.4 The O b j e c t l e v e l The o b j e c t l e v e l i s the h i g h e s t l e v e l of a b s t r a c t i o n of the o r i g i n a l sketch. At t h i s l e v e l , the sketch i s d e s c r i b e d by means o f one s i n g l e word: a cube, wedge or house. The cues f o r t h i s l e v e l a r e a c o l l e c t i o n ( o r, r a t h e r , an amalgamation) of models obtained a t p r e v i o u s l e v e l s of p r o c e s s i n g . For an o b j e c t t o be r e c o g n i z e d a t the Object l e v e l c e r t a i n f e a t u r e s must be there i n c e r t a i n r e l a t i o n s h i p s . For example, a wedge s h o u l d , f i r s t of a l l , c o n s i s t of t h r e e s u r f a c e s ; a l l t h r e e s u r f a c e s should r e l a t e to each other i n a convex way; two of these r e l a t i o n s h i p s should i n v o l v e a s u r f a c e i n the form of a t r i a n g l e and a s u r f a c e i n the form of a p a r a l l e l o g r a m , the convex edge a l s o being a h o r i z o n t a l edge. F i n a l l y , the t h i r d r e l a t i o n s h i p should i n v o l v e two s u r f a c e s i n the form of a p a r a l l e l o g r a m , the convex edge t h i s time being a v e r t i c a l edge. These Program d e s c r i p t i o n 85 r e l a t i o n s h i p s change i f we are d e a l i n g with an A-framed wedge, as Table 4 shows. For a house, on the other hand, the only requirement i s a roof i n a convex r e l a t i o n s h i p with a w a l l t h a t c o n t a i n s e i t h e r a door, or a window. The s k e t c h as a whole s e r v e s as p r i m i t i v e a t the o b j e c t l e v e l . S i n c e HOUSE assumes t h a t there i s only one o b j e c t i n the scene, t h e r e i s no need to complete the c y c l e a t the Obje c t l e v e l . A f t e r cue d i s c o v e r y and model i n v o c a t i o n an i n t e r n a l c o n s i s t e n c y t e s t i s s u f f i c i e n t to re c o g n i z e the o b j e c t . I n t e r n a l c o n s i s t e n c y i s t e s t e d by means of a f i l t e r i n g procedure analogous to the one used i n MAPSEE, t h a t i s , c e r t a i n i n t e r p r e t a t i o n s w i l l be allowed by c e r t a i n cues i f c e r t a i n c o n d i t i o n s are s a t i s f i e d . In the case o f the house, f o r example, the inside/wall/window/* cue can be e f f e c t i v e only i f the convex/wall/roof/h-edge cue i s present as w e l l (Table 4) . F i g u r e 16 shows the only p o s s i b l e d e s c r i p t i o n o f the house at the Object l e v e l . Program d e s c r i p t i o n Figure 16: Description of the house example at the Object l e v e l Program d e s c r i p t i o n 87 cue type 1 domain J o b j e c t inside-wall-window-* common-wall-door-* c o n v e x- wa 11- r o o f - h- e d g e c o n v e x - w a l l - r o o f - s l - e d g e | house ] house I house I house c o n v e x - s i d e - f a c e - t o p - f ace-fa-edge con v e x - p a r a l l e l eg ram-parallelogram-h-edge con v e x - s i d e - f ace-side-rf ace-v-edge convex-parallelcgam-parallelogram-v-edge c o n v e x - p a r a l l e l o g r a m - t r i a n g l e - h - e d g e c o n v e x - s i d e - f a c e - t o p - f a c e - s l - e d g e con vex-pa ralie1egram-1 r i a n g l e - s l - e d ge convex-top-face-top-face-h-edge ) cube/wedge | cube/sedge j cube/wedge J cube/wedge | wedge | wedge J wedge | wedge Table 4: C u e / i n t e r p r e t a t i o n Table f o r the o b j e c t l e v e l 4.4 Output At each of the scene l e v e l s a l i s t of d i f f e r e n t p o s s i b l e d e s c r i p t i o n s o f the s k e t c h i s r e t u r n e d by HC, as was d e s c r i b e d i n s e c t i o n 4.3- Each s i n g l e d e s c r i p t i o n of the s k e t c h i s embodied i n a l i s t c o n t a i n i n g the p r i m i t i v e / i n t e r p r e t a t i o n p a i r s t h a t c o n s t i t u t e the d e s c r i p t i o n - D i f f e r e n t d e s c r i p t i o n s a t d i f f e r e n t l e v e l s of r e p r e s e n t a t i o n are i n t e r c o n n e c t e d ; each d e s c r i p t i o n i s l i n k e d with the d e s c r i p t i o n <s) a t the next h i g h e r l e v e l t h a t were spawned by t h i s d e s c r i p t i o n . In t u r n , the h igher l e v e l d e s c r i p t i o n s are a l s o l i n k e d with the d e s c r i p t i o n t h a t spawned them. A f t e r completion of the i n t e r p r e t a t i o n p rocess, HOUSE r e t u r n s a g i g a n t i c i n t e r p r e t a t i o n s t r u c t u r e r e p r e s e n t i n g a l l the p o s s i b l e d e s c r i p t i o n s a t a l l l e v e l s . Program d e s c r i p t i o n 88 A g r a p h i c s program has been w r i t t e n that can show each of the d e s c r i p t i o n s on the screen of a CRT d i s p l a y . T h i s program takes the p r i m i t i v e i n t e r p r e t a t i o n s found i n each d e s c r i p t i o n one by one and shows the p r i m i t i v e s with t h e i r i n t e r p r e t a t i o n on the CRT. P i g u r e 8, 10 and the F i g u r e s 13 to 16 are a l l p l o t t e d output from t h i s program. Experiments and r e s u l t s CHAPTER 5z IXPEIIHEHTS AND 8ES0LTS 89 Four d i f f e r e n t examples have a c t u a l l y been run: a wedge, a house, an L-beam and a s k e t c h map. F i g u r e 17a, b, c, and d show the drawings as they were provided by the p l o t programs t h a t served as i n p u t to the program. The wedge and the house were run through a l l the l e v e l s of r e p r e s e n t a t i o n ; the L-beam was run up to the O r i e n t a t i o n l e v e l whereas the sketch map was run up to the Vertex l e v e l o nly. The segmentation of the house example r e s u l t e d i n the formation of 28 sub-chains, 4 L - v e r t i c e s , 3 Arrows, 1 Fork, 4 Tees, 7 r e g i o n s and 1 r e g i o n - g h o s t . In the wedge example 8 sub-rchains, 2 L - v e r t i c e s , 3 Arrows, 1 Fork and 4 r e g i o n s were found. The L-feeam segmentation r e s u l t e d i n the f o r m a t i o n of 14 sub-chains, 5 L - v e r t i c e s , 1 Fork, 4 Arrows, 1 Tee and 5 r e g i o n s . The s k e t c h map segmentation r e s u l t e d i n the f o r m a t i o n of 34 sub-chains, 28 L - v e r t i c e s , 2 Tees and 6 Free-ends. Region and vertex segmentation i s shown f o r the house example i n F i g u r e 8 and F i g u r e 10. Table 5 shows the number of d e s c r i p t i o n s c r e a t e d a t each of the f o u r scene l e v e l s by the t h r e e p o l y h e d r a . 91 Experiments and r e s u l t s 92 Object Edge 1- O r i e n t . L. Surf.nam. 1. Obj. L. CPO time wedge j 5 | 30 | 8 f 2 | 137s. L-beam \ 7 J 35 | HA 1 NA J NA bouse I 6 | 12 | 1 | 1 ] 236s. note: NA = not a p p l i c a b l e ; CPO times are acc u r a t e to w i t h i n 10 seconds. Table 5: Experimental r e s u l t s A few comments should be made at t h i s p o i n t . The house and the wedge should be seen as g e n e r a l r e p r e s e n t a t i v e examples of the domain; the L-beam was chosen because of a s p e c i a l s e t of f e a t u r e s i n t h i s p a r t i c u l a r example. The L-beam i n F i g u r e 17c has an uncommon i n t e r p r e t a t i o n (a Tee and an L - v e r t e x which are not part of the boundary of the object) a t the Edge l e v e l ; another uncommon f e a t u r e of t h i s example i s the presence o f a r e g i o n whose outer boundary i s d i f f e r e n t at the p i c t u r e l e v e l and the O r i e n t a t i o n l e v e l . Once t h i s r e g i o n i s i n t e r p r e t e d as a s u r f a c e , one of the boundary l i n e s i s no lo n g e r p a r t of the s u r f a c e . A sketch map was chosen as an example, i n order t o show the g e n e r a l i t y of the segmentation p a r t o f the program. The house example has been run under s e v e r a l c o n s t r a i n t s . F i r s t of a l l , at the Edge l e v e l the d i s t i n c t i o n between concave s e p a r a b l e ( i . e . occlude/concave) and concave non se p a r a b l e ( i . e . concave) was not made. Both types were seen as concave. Another c o n s t r a i n t has been put on the i n t e r p r e t a t i o n of r e g i o n boundaries s i t u a t e d i n s i d e other r e g i o n s . These boundaries c o u l d only be i n t e r p r e t e d as c r a c k . Both c o n s t r a i n t s have been Experiments and r e s u l t s 93 a p p l i e d f o r reasons of economy. E s p e c i a l l y the l a s t c o n s t r a i n t was necessary i n order t o prevent a c o m b i n a t o r i a l e x p l o s i o n of i n t e r p r e t a t i o n s a t the Edge l e v e l . The i n t e r p r e t a t i o n of boundaries of r e g i o n s i n s i d e other r e g i o n s i s not c o n s t r a i n e d by the i n t e r p r e t a t i o n of other edges i n the o b j e c t . Boundary edges c o n s t r a i n e d o n l y by L - v e r t i c e s can c a r r y many d i f f e r e n t i n t e r p r e t a t i o n s , as can be seen i n Table 1. The f a c t t h a t t h e r e are not one but two cases of a r e g i o n i n s i d e another r e g i o n i n the house example makes the s i t u a t i o n even worse. The d e c i s i o n t o allow the c r a c k i n t e r p r e t a t i o n only i n t h i s case i s s e n s i t i v e t o c r i t i c i s m , however. One might as w e l l argue t h a t the edges surrounding a window sh o u l d be occlude r a t h e r than c r a c k . The c h o i c e of cr a c k has a l s o been made f o r e f f i c i e n c y reasons, & crack r e l a t i o n s h i p a l l o w s one s p e c i f i c o r i e n t a t i o n only f o r the s u r f a c e s connected by t h i s edge whereas an o c c l u d e r e l a t i o n s h i p does not put any c o n s t r a i n t on the s u r f a c e being occluded {see Ta b l e 2). T h i s would have been the source of another c o m b i n a t o r i a l i n t e r p r e t a t i o n e x p l o s i o n . The d i s c u s s i o n of these problems w i l l be continued i n the next Chapter. The g e n e r a l p a t t e r n i n a l l three polyhedra examples i s t h a t the number of i n t e r p r e t a t i o n s s h a r p l y i n c r e a s e s towards t h e O r i e n t a t i o n l e v e l and decreases again beyond t h a t l e v e l . T h i s decrease i s caused mainly because of the i n t r o d u c t i o n of the support h y p o t h e s i s a t the Surface naming l e v e l . T h i s h y p o t h e s i s r e q u i r e s the presence of a t l e a s t one occlude/concave or concave edge i n the boundary of the o b j e c t . Bote t h a t some of the Experiments and r e s u l t s 94 d e s c r i p t i o n s a t one l e v e l can be s i m i l a r . T h i s i s another m a n i f e s t a t i o n of the redundancy i n the p e r c e p t u a l p r o c e s s : d i f f e r e n t d e s c r i p t i o n s a t one l e v e l can le a d to one and the same d e s c r i p t i o n a t the next higher l e v e l - As an example, the sedge demonstrates t h i s phenomenon a t the Object l e v e l . . The wedge has two d e s c r i p t i o n s a t the Object l e v e l ; both are a wedge. These two d e s c r i p t i o n s are a consequence of the f a c t t h a t i t i s p o s s i b l e to see F i g u r e 17a from two d i f f e r e n t viewpoints; one viewpoint shows the wedge as a c o n f i g u r a t i o n of two v e r t i c a l s u r f a c e s (the parallelograms) and a h o r i z o n t a l s u r f a c e (the t r i a n g l e ) , the other ( l e s s obvious) one shows the o b j e c t as a c o n f i g u r a t i o n of two s l a n t e d s u r f a c e s (the para l l e l o g r a m s ) and a v e r t i c a l s u r f a c e (the t r i a n g l e ) . Both viewpoints f i n a l l y l e a d to one c o n c l u s i o n : the o b j e c t i s a wedge. The wedge has been run without any of the c o n s t r a i n t s , mentioned above. The L-beam was a l s o run under the concave c o n s t r a i n t . D i s c u s s i o n 95 CHAPTER 6z DISCOSSIPS T h i s Chapter e v a l u a t e s the program and experiments d e s c r i b e d i n Chapter 1 and 5. A d i s c u s s i o n of f a v o u r a b l e f e a t u r e s of HOUSE i s f o l l o w e d by a d i s c u s s i o n of the problems met with the examples run with the program. These problems, to g e t h e r with the d i s c u s s i o n of a few examples on which HOOSE f a i l s , l e a d to a r e v i s i o n of the h e l i c a l metaphor i n t o a m u l t i - h e l i x metaphor f o r p e r c e p t i o n , proposed i n Chapter 7. 6.1 Favourable f e a t u r e s o f HOOSE - BOOSE HAS A UNIFORM SAY OF REPRESENTING THE MODELS NEEDED FOR DESCRIPTION OF THE IMAGE AT DIFFERENT LEVELS OF ABSTRACTION. A l l the models of the house domain are organized i n c u e / i n t e r p r e t a t i o n T ables a t d i f f e r e n t l e v e l s of d e s c r i p t i o n . These c u e / i n t e r p r e t a t i o n Tables are the l a y e r s o f a cue/model h i e r a r c h y . As mentioned b e f o r e , the c u e / i n t e r p r e t a t i o n Tables have the same format at a l l the d i f f e r e n t l e v e l s of r e p r e s e n t a t i o n and only d i f f e r as t o t h e i r c o n t e n t . - HOUSE HAS A UNIFORM CONTROL STRUCTURE THAT GUIDES THE EXPLOITATION OF THE IMAGE. At each of the (scene) l e v e l s of p r o c e s s i n g , the g o a l s of the process were; Discussion 96 1) to f i n d the c o n s i s t e n t d e s c r i p t i o n s cf the s k e t c h a t t h a t l e v e l , and 2) to f i n d the cues t h a t allow b o o t s t r a p p i n g i n t o the next higher l e v e l . The uniform g o a l s of the process together with a uniform format f o r r e p r e s e n t i n g the models a t each l e v e l allow a uniform c o n t r o l s t r u c t u r e behind t h i s p r o c e s s . The c y c l e of p e r c e p t i o n , s e r v i n g as a metaphor f o r t h i s c o n t r o l s t r u c t u r e has been mentioned f r e q u e n t l y enough by now. The model t e s t i n g stage i n the c y c l e r e p r e s e n t s HOUSE 'S i n f e r e n c i n g c a p a b i l i t y , which was one o f the goals of the program. - HOUSE CAN REPRESENT THE IMAGE AT MANY DIFFERENT LEVELS OF ABSTRACTION AND DETAIL IN BOTH THE PICTURE AND THE SCENE DOMAIN. The need to d e s c r i b e an image a t d i f f e r e n t l e v e l s o f a b s t r a c t i o n and l e v e l s of d e t a i l was one of the l e s s e n s l e a r n e d from p r e v i o u s work i n the b l o c k s world. Guzman (1969) had only one domain o f r e p r e s e n t a t i o n , the p i c t u r e domain; Clowes (1971) i n t r o d u c e d the d i s t i n c t i o n between p i c t u r e and scene domain; on top of t h i s d i s t i n c t i o n Hackworth (1977a) d i s t i n g u i s h e s d i f f e r e n t l e v e l s of r e p r e s e n t a t i o n i n the p i c t u r e domain and HOUSE (as the most c u r r e n t development i n t h i s l i n e ) has d i f f e r e n t l e v e l s of r e p r e s e n t a t i o n i n both the p i c t u r e and the scene domain. M u l t i p l e l e v e l s of r e p r e s e n t a t i o n i s a c h a r a c t e r i s t i c a l s o found i n other areas of A r t i f i c i a l I n t e l l i g e n c e . The HEARSAY I I D i s c u s s i o n 97 system, f o r example, ( l e s s e r and Erman, 1977, 1978) i s a p e r c e p t u a l system f o r domains with l a r g e amounts of d i v e r s e , i n e x a c t and incomplete knowledge. Speech understanding i s one of the areas t o which the system has been a p p l i e d . One of the implementations of the system uses f i v e d i f f e r e n t l e v e l s of r e p r e s e n t a t i o n : a parameter, segment, word, word-seguence and phrase l e v e l . P r i m i t i v e elements (comparable with the models i n HOUSE) are a s s o c i a t e d with each l e v e l . These p r i m i t i v e s are pa r t o f hypotheses: i n t e r p r e t a t i o n s of p o r t i o n s of the spoken utterance a t a p a r t i c u l a r l e v e l . These hypotheses have s t r u c t u r a l r e l a t i o n s h i p s with other hypotheses at the same and at other l e v e l s o f r e p r e s e n t a t i o n . Hypotheses a r e c r e a t e d by knowledge sources, which c r e a t e a uniform s t r u c t u r e f o r a hy p o t h e s i s i r r e s p e c t i v e o f the l e v e l of r e p r e s e n t a t i o n i n v o l v e d . Such a knowledge source i s a production (e.g. Newell and Simon, 1972, and Newell, 1973) c o n s i s t i n g of a p r e c o n d i t i o n and an a c t i o n . Knowledge sources are comparable with the c u e / i n t e r p r e t a t i o n Tables i n HOUSE; they can i n t e r c o n n e c t any a r b i t r a r y l e v e l i n the system. A S opposed to the cues i n HOUSE, however, knowledge sources can a l s o be a c t i v e w i t h i n a l e v e l and they can i n t e r c o n n e c t l e v e l s i n a top-down way. - HOUSE IS HODOLAB, GENERAL AND EFFICIENT. The modularity of HOUSE i s a d i r e c t consequence of both the uniform r e p r e s e n t a t i o n and c o n t r o l s t r u c t u r e at each o f the scene l e v e l s . Such modularity i m p l i e s g e n e r a l i t y . As l o n g as D i s c u s s i o n 98 d e s c r i p t i o n s of any type of p e r c e p t u a l domain (be i t geographic maps, houses or faces) can be expressed i n the r e p r e s e n t a t i o n a l format used i n HOUSE (and MAPSEE), such a domain can be handled by HOUSE-The e f f i c i e n c y of HOUSE r e s u l t s from i t s p a s s - o r i e n t e d c h a r a c t e r ; p r o c e s s i n g a t one l e v e l imposes c o n s t r a i n t s on the p r o c e s s i n g at the next higher l e v e l . Hot a l l the combinations of i n t e r p r e t a t i o n s proposed a t one l e v e l a r e c o n s i s t e n t with each other. B u l i n g out such combinations a t one l e v e l w i l l prevent them from being t e s t e d a t the next h i g h e r l e v e l . A merge o f , f o r example, the Edge and O r i e n t a t i o n l e v e l would l e a d to the same end r e s u l t s . However, the t o t a l number of combined i n t e r p r e t a t i o n s to be t e s t e d by the network c o n s i s t e n c y a l g o r i t h m would be much l a r g e r i n the merged s i t u a t i o n . - THE SEGMENTATION PROGRAM IS QUITE GENERAL. I r r e s p e c t i v e how a s k e t c h i s drawn, i n HOUSE i t w i l l always r e s u l t i n the same segmentation. For example, a Tee can be formed from an L and a Free-end, a Free-end c l o s e t o another c h a i n , or from t h r e e Free-ends i f t h e i r r e s p e c t i v e l i n e s have the proper o r i e n t a t i o n . HOUSE can segment any type of s k e t c h , be i t a p o l y h e d r a l o b j e c t , a geographic map or a f a c e . - THE INTEBPBE'TATION STRUCTURE RETURNED BY HOUSE AFFORDS A BASIS FOR INVESTIGATING VARIOUS LEARNING STRUCTURES. D i s c u s s i o n 99 A c h a r a c t e r i s t i c i n human i n f o r m a t i o n p r o c e s s i n g which i s not r e f l e c t e d i n HOOSE i s th a t repeated c o n f r o n t a t i o n with the same s i t u a t i o n tends to i n f l u e n c e the speed o f the r e c o g n i t i o n process. Bepeated succes or f a i l u r e to recognize an o b j e c t has r e p e r c u s s i o n s f o r the proc e s s . In case of success the process w i l l be r e o r g a n i z e d i n such a way t h a t the same r e s u l t w i l l be obtained f a s t e r the next time. In case of f a i l u r e r e o r g a n i z a t i o n of the process w i l l r e s u l t i n a d i f f e r e n t answer. HOOSE does not have such a feedback mechanism. HOOSE does p r o v i d e , however, an i n t e r p r e t a t i o n s t r u c t u r e on which such a feedback can be imposed. Perhaps an a l g o r i t h m c o u l d be designed, which, a f t e r s u c c e s s , t r a c e s the d e s c r i p t i o n ( s ) a t each l e v e l t h a t l e d to the c o r r e c t answer. Such a t r a c i n g procedure c o u l d r e s u l t i n a r e o r g a n i z a t i o n of the cue/model h i e r a r c h y . For example, low l e v e l cues can be reor g a n i z e d t o suggest high l e v e l models d i r e c t l y . another way would be to a s s i g n s t r e n g t h v a l u e s to cue i n t e r p r e t a t i o n s . Cue i n t e r p r e t a t i o n s with a h i g h s t r e n g t h value should be processed f a s t e r than other i n t e r p r e t a t i o n s with a low s t r e n g t h v a l u e . The r e s u l t of t h i s would be th a t i n t e r p r e t a t i o n s which are part o f d e s c r i p t i o n s t h a t l e a d to the c o r r e c t answer are i n s t a l l e d before the u n s u c c e s s f u l ones and can be processed f a s t e r , as a consequence of t h i s approach, the pr o c e s s i n g of ( p r e v i o u s l y ) s u c c e s s f u l d e s c r i p t i o n s w i l l occur ahead of u n s u c c e s s f u l ones. T h i s would i n t r o d u c e a m u l t i - p r o c e s s i n g i d e a s t i l l m a i n t a i n i n g the bottom-up approach D i s c u s s i o n 100 of the system, which i s important f o r the e f f i c i e n c y of the system as was p r e v i o u s l y e x p l a i n e d . I t s hould be noted that such an approach comes very c l o s e to Anderson's i d e a of a c t i v a t i o n s preading through networks (Anderson, 1976). Anderson assumes t h a t models can be represented as nodes i n a p r o p o s i t i o n a l network. The l i n k s between the nodes embody the r e l a t i o n s h i p between these models. A l l these l i n k s have s t r e n g t h v a l u e s a s s i g n e d t o them. Anderson has proposed a p r o b a b i l i s t i c a c t i v a t i o n model, which p r e d i c t s the average time needed t o support or r e j e c t c e r t a i n s t a t e s of the network. The t r a v e l speed of a s i g n a l through the network i s i n f l u e n c e d by t h e number o f l i n k s l e a v i n g each node and the s t r e n g t h value of the l i n k s . For example, the time necessary to v e r i f y t h e statement "The h i p p i e was i n the park" depends on both the s t r e n g t h value of the l i n k between the h i p p i e node and the park node and the number of l i n k s l e a v i n g both nodes. The s t r o n g e r the l i n k the more l i k e l y i t i s t h a t the answer can be given w i t h i n a c e r t a i n time l i m i t . S i m i l a r approaches have been taken by Becker (1973), Kulikowski (1974), and Heiss (1974). - CONSERVATIVE SEGMENTATION FOLLOWED £¥ INTERPRETATION CAN WORK. HOOSE ( l i k e MAPSEE) prov i d e s evidence f o r the f e a s i b i l i t y of a c o n s e r v a t i v e segmentation f o l l o w e d by i n t e r p r e t a t i o n . I t should be noted, however, that t h i s s t r a t e g y f i n e s s e s , r a t h e r than s o l v e s , the chicken and egg problem i n p e r c e p t i o n . D i s c u s s i o n 101 6.2 Problems i n HOUSE - INEFFICIENCIES IN THE PBOGBAM Although HOUSE 'S modularity buys some form of e f f i c i e n c y , t h ere are a number of i n e f f i c i e n c i e s i n the present implementation. Both the i n t e r n a l and e x t e r n a l c o n s i s t e n c y t e s t s c o u l d be done i n a more e f f i c i e n t way than i s the case at p r e s e n t . The most important i n e f f i c i e n c y i s i n the a p p l i c a t i o n of the network c o n s i s t e n c y a l g o r i t h m . C o n s t r a i n t s a t i s f a c t i o n a l g o r i t h m s apply i n any s i t u a t i o n where the i n t e r p r e t a t i o n of p r i m i t i v e s i s c o n s t r a i n e d by cues. I f these cues are c o n s t r a i n i n g enough, then NC guarantees a f a s t c o n v e r s i o n towards a s o l u t i o n . T h i s was the case i n MAPSEE where the c o n s t r a i n t s imposed on the i n t e r p r e t a t i o n of s k e t c h e s l i k e F i g u r e 4 q u i c k l y l e d towards a unique s o l u t i o n . The only ambiguity was formed by the o u t e r r e g i o n which c o u l d be i n t e r p r e t e d as both l a k e or sea. Step 1 and 2 of the a l g o r i t h m were s u f f i c i e n t f o r t h i s purpose. P o l y h e d r a l scenes, however, are f r e q u e n t l y much more ambiguous. The wedge, f o r example, has f i v e p o s s i b l e d e s c r i p t i o n s at the Edge l e v e l (see Table 5 ) . For t h i s reason repeated execution of step 3 ( s p l i t t i n g the i n t e r p r e t a t i o n domains i n h a l f and r e p e a t step 1 and 2} was necessary. The present implementation of NC h o l d s t h a t every time s t e p 3 i s made, the complete queue i s r e i n s t a l l e d ; t h a t i s , a l l the p r i m i t i v e s are p a i r e d with a l l the i n t e r p r e t a t i o n s s t i l l p resent D i s c u s s i o n 102 i n the domains. The time necessary to pass through the a l g o r i t h m i n c r e a s e s s h a r p l y with both the number of times t h a t s t e p 3 has to he executed and with the l e n g t h of the gueue. The f i r s t aspect depends cn the power of the c o n s t r a i n t s . The more c o n s t r a i n i n g t h e cues are on the i n t e r p r e t a t i o n process, the l e s s f r e q u e n t i t i s necessary to execute step 3. The c o n s t r a i n i n g i n f l u e n c e i s s p e c i f i e d i n each s i t u a t i o n and cannot be manipulated, but i t i s p o s s i b l e t o do something about the l e n g t h of the queue. As mentioned above, i n HC's implementation i n HOUSE, a l l the p r i m i t i v e s are r e i n s t a l l e d on the queue each time step 3 i s executed. In c e r t a i n s i t u a t i o n s , however, t h i s i s not necessary. During r e p e t i t i v e e x e c u t i o n of step 3, more and more unique s o l u t i o n s s t a r t to appear f o r the p r i m i t i v e s . He know t h a t i n such a s t a t e of the i n t e r p r e t a t i o n process t h a t each o f the i n t e r p r e t a t i o n s of p r i m i t i v e s a d j a c e n t to a p r i m i t i v e with a unique i n t e r p r e t a t i o n i s c o n s i s t e n t with that i n t e r p r e t a t i o n . For t h i s reason, t h e r e i s no l o n g e r any need to r e i n s t a l l a p r i m i t i v e with a unique i n t e r p r e t a t i o n on the queue du r i n g subsequent e x e c u t i o n s o f s t e p 3, nor i s i t necessary to r e i n s t a l l such a p r i m i t i v e as a r e s u l t of the e x e c u t i o n of step 2. Implementation of t h i s i d e a w i l l r e s u l t i n an i n c r e a s i n g l y s h o r t e r queue as the execution of the a l q o r i t h m proceeds. T h i s w i l l speed up the process e s p e c i a l l y i n scenes with many p r i m i t i v e s l i k e the house. S i n c e NC t a k e s the m a j o r i t y of the p r o c e s s i n g time a t each of the scene l e v e l s a g a i n i n e f f i c i e n c y of t h i s a l g o r i t h m i s very important. D i s c u s s i o n 103 another gain i n speed can be made by making the i n t e r n a l c o n s i s t e n c y t e s t more e f f i c i e n t . In the present implementation the i n t e r n a l c o n s i s t e n c y package i s a p p l i e d to each o f the re g i o n s and re g i o n - g h o s t s with a unique i n t e r p r e t a t i o n a f t e r e x e c u t i o n of NC»s step 2. A s m a l l gain i n e f f i c i e n c y can be made by a p p l y i n g the package t o those r e g i o n s and regi o n - g h o s t s o n l y , which d i d not have a unigue i n t e r p r e t a t i o n d u r i n g the pr e v i o u s pass. An exception should be made f o r the r u l e s p e c i f y i n g t h a t p a r a l l e l l i n e s i n a s u r f a c e should have the same i n t e r p r e t a t i o n ( s ) . T h i s r u l e should always be a p p l i e d t o a l l r e g i o n s and r e g i o n - g h o s t s . I t should be p o i n t e d out here, t h a t the i n e f f i c i e n c i e s i n the program mentioned above are implementation problems. The problems are not the r e s u l t o f the h e l i c a l metaphor (see F i g u r e 6) which served as a s t r a t e g y behind the implementation. He w i l l see t h a t t h i s i s d i f f e r e n t f o r the i s s u e t h a t i s d i s c u s s e d next. - THE CHICKEN AND EGG ffiOBLEH I n d e s c r i b i n g the r e s u l t s of the i n t e r p r e t a t i o n of a s k e t c h of a house (Chapter 5) i t was mentioned t h a t s e v e r a l c o n s t r a i n t s had t o be put on the i n t e r p r e t a t i o n process i n or d e r to prevent a g i g a n t i c e x p l o s i o n o f d e s c r i p t i o n s at the O r i e n t a t i o n l e v e l . Of these c o n s t r a i n t s , the e x c l u s i o n of the occlude/concave i n t e r p r e t a t i o n at the Edge l e v e l i s the l e a s t s e r i o u s one s i n c e concave and occlude/concave both express a concave r e l a t i o n s h i p . D i s c u s s i o n 104 Much more s e r i o u s i s the c o n s t r a i n t of a crack i n t e r p r e t a t i o n on the boundaries of r e g i o n s i n s i d e other r e g i o n s . The n e c e s s i t y t o impose such a c o n s t r a i n t p o i n t s a t a weakness i n the feotttom-up approach of the system. This c o n s t r a i n t r e f l e c t s the knowledge that a r e g i o n i n s i d e another r e g i o n has the same o r i e n t a t i o n as i t s surrounding r e g i o n . However, t h i s type of knowledge.is t y p i c a l f o r the S u r f a c e naming l e v e l , not f o r the Edge l e v e l . Using the crack c o n s t r a i n t t h e r e f o r e i m p l i e s t h a t we a n t i c i p a t e d S u r f a c e naming l e v e l knowledge i n order to c o n s t r a i n the number of p o s s i b l e d e s c r i p t i o n s a t the Edge l e v e l . Here we have another demonstration o f the c h i c k e n and egg problem. On the one hand, we are not allowed to bootstrap i n t o the o r i e n t a t i o n l e v e l before c o n s i s t e n c y i s reached at the Edge l e v e l . On the other hand, we need the models from the Surface naming l e v e l i n order t o c o n s t r a i n the number of p o s s i b l e d e s c r i p t i o n s a t the Edge l e v e l . T h i s problem addresses the gu e s t i o n o f e f f i c i e n c y and s u f f i c i e n c y which was one of the asp e c t s o f the HOUSE c o n t r o l s t r u c t u r e we wanted t o i n v e s t i g a t e . In the HOUSE example, a p a s s - o r i e n t e d bottom-up c o n t r o l s t i l l appears to be s u f f i c i e n t but i s e f f i c i e n t i f and only i f the c o n s t r a i n t s are organized i n a bottom-up f a s h i o n and not, as i n the example above, i n a top-down way. The bottom-up, p a s s - o r i e n t e d c o n t r o l s t r u c t u r e of the program i s i n h e r e n t i n the metaphor used. I t i s the h e l i c a l metaphor, which i s i n d i r e c t l y r e s p o n s i b l e f o r the i n t e r p r e t a t i o n e x p l o s i o n at the O r i e n t a t i o n l e v e l . T h i s weakness i n the h e l i c a l metaphor w i l l be accounted f o r i n the pro p o s a l of a new D i s c u s s i o n metaphor which i s d i s c u s s e d i n Chapter 7, 105 6 . 3 What HOUSE cannot do Many of the p r i n c i p l e s behind human i n f o r m a t i o n p r o c e s s i n g have become c l e a r e r by l o o k i n g a t s i t u a t i o n s i n which the system performs i n a p p r o p r i a t e l y , r a t h e r than l o o k i n g a t s i t u a t i o n s i n which the system performs w e l l . I n the case of HOUSE we w i l l look at a number o f cases i n which the program f a i l s but humans succeed, - EVEN A CONSEBVMMVB SEGMENTATION CAN BE HBONG. Fi g u r e 18 and 19 are two examples i n which t h i s i s the case. In HOUSE, the d e c i s i o n t o c l o s e or not to c l o s e a gap between two l i n e end-points i s made by means of parameters whose va l u e s a re p r e s p e c i f i e d . As a r e s u l t of these ( c o n s e r v a t i v e ) v a l u e s , HOUSE would c l o s e none of the gaps i n F i g u r e 18. Such a segmentation f a i l u r e shows the need f o r a co n t e x t dependent assignment of the values of these parameters. However, an al g o r i t h m t h a t does such a dynamic assignment has to f a c e the c h i c k e n and egg problem: we onl y know which values to a s s i g n to the parameters once we know t h a t the sketch d e p i c t s a cube. D i s c u s s i o n 106 F i g u r e 18: The open cube F i g u r e 19: The t r i c k y i s l a n d D i s c u s s i o n 107 F i g u r e 19 i s an example f o r the geographic maps world. A c o n s e r v a t i v e segmentation w i l l l e a d i n t h i s example t o the formation of a Tee and an L-vertex i n the c i r c l e d areas. As a r e s u l t of t h i s , the p i c t u r e cannot be i n t e r p r e t e d because the r i v e r cannot be r i v e r and r i v e r * at the same time. I f the L-vertex i s not formed, then t h e p i c t u r e would be i n t e r p r e T a b l e . However, the lower mountains would be i n t e r p r e t e d as r i v e r (or r i v e r * ) . In i t s present c o n s t i t u t i o n HOUSE (and MAPSEE as well) has no way o f escap i n g t h i s c o n c l u s i o n . He have no way to f i n d the i d e n t i t y o f t h e lower mountains unless we segment, b ut, i f we do so, we w i l l i n f e r the wrong c o n c l u s i o n . The problem must look f a m i l i a r by now; i t i s another m a n i f e s t a t i o n of the c h i c k e n and egg problem. How do we (humans) know th a t the lower mountains are intended as such? F i r s t of a l l , i f the s k e t c h i s presented to us, we know t h a t the sketch d e p i c t s a geographic map and we know what type of models we can f i n d i n there (an uninformed observer u s u a l l y does not r e c o g n i z e F i g u r e 19). At the model l e v e l we have i n f o r m a t i o n about the shape of the mountains (two l i n e s approximately symmetrical around the v e r t i c a l upwardly c o n v e r t i n g i n t o an L - v e r t e x ) , and the course o f r i v e r s (a r i v e r flows from the mountains i n t o a l a k e or sea, p o s s i b l y i n t e r r u p t e d by a b r i d g e ) . The h e l i c a l metaphor f o r c e s a p r o c e s s i n g format i n which the i n t e r p r e t a t i o n process i s d r i v e n by segmentation. Humans a l s o have the p o s s i b i l i t y to use a model-driven i n t e r p r e t a t i o n process guided by segmentation. The new metaphor proposed i n D i s c u s s i o n 108 Chapter 7 s i l l be seen to provide f a c i l i t i e s f o r such a model-driven i n t e r p r e t a t i o n process. - MISSING EDGES F i g u r e 20 i s another example i n which. HOUSE f a i l s . The v e r t i c a l edge s e p a r a t i n g the two w a l l s i s missing ( f o r example, as a r e s u l t of l i g h t i n g ) . Although a c o n s e r v a t i v e segmentation w i l l be c o r r e c t with r e s p e c t to the l i n e s t h a t show on the p i c t u r e , the i n t e r p r e t a t i o n process w i l l f a i l a t the Edge l e v e l . An L -vertex does not allow both arms t o be i n t e r p r e t e d as convex or occlude/concave. The p a s s - o r i e n t e d c h a r a c t e r of the process w i l l p r o h i b i t any p r o c e s s i n g a t higher l e v e l s . In s p i t e of the f a i l u r e t o reach c o n s i s t e n c y a t the Edge l e v e l , the p i c t u r e s t i l l makes sense a t the O r i e n t a t i o n l e v e l and Surface naming l e v e l . No d i s t i n c t i o n i s made a t the O r i e n t a t i o n l e v e l between v e r t i c a l s u r f a c e s with d i f f e r e n t o r i e n t a t i o n with r e s p e c t to the plane of p r o j e c t i o n . As a r e s u l t , t h e r e g i o n s I I I and IV would be i n t e r p r e t e d as v e r t i c a l , whereas the r e g i o n s I and I I are h o r i z o n t a l and s l a n t e d r e s p e c t i v e l y . From here p r o c e s s i n g c o u l d continue i n the normal f a s h i o n i n t o the S u r f a c e naming l e v e l . What we need here, i s a way to access the O r i e n t a t i o n l e v e l from the Vertex l e v e l d i r e c t l y . Apart from t h i s , the g o a l of the p r o c e s s i n the scene domain i s t o reach c o n s i s t e n c y a t each of the l e v e l s i n the scene domain. / Since i t i s not p o s s i b l e t o reach c o n s i s t e n c y a t D i s c u s s i o n 109 the Edge l e v e l i n a bottom-up way, the system should have the a b i l i t y t o a c c e s s t h i s l e v e l from higher l e v e l s as v e i l . Such a top-down access can be used to impose i n t e r p r e t a t i o n s at the Edge l e v e l , r a t h e r than to suggest them, as i s the case i n the bottom-up mode by means of the c u e / i n t e r p r e t a t i o n T ables. The h e l i c a l metaphor, however, does not provide such a f a c i l i t y . The problem i n F i g u r e 21 i s s i m i l a r to the problem i n F i g u r e 18, but t h i s time i t i n v o l v e s a house. F i g u r e 20: House with edge missing F i g u r e 21: Open house The multi-helix CHAPTBB 7: A flUITI-flELIX AS A BETAPHOB FOB TJJE PE&CEPTuAL  PBOCESS 7.1 The metaphor The problems demonstrated i n the Figures 18 to 21 include both problems i n segmentation and in t e r p r e t a t i o n . On the segmentation side of the process, we need to have some form of segmentation i n order to be able to access the interpretation domain. On the other hand, we want to avoid taking any premature segmentation decisions (e.g. closing gaps between l i n e end-points), because such decisions could be i n er r o r . In the interpretation domain, there i s the need for multiple a c c e s s i b i l i t y of a l l the l e v e l s i n both a bottom-up and top-down way. In bottom-up mode cues suggest possible interpretations for the l e v e l under consideration, whereas i n top-down mode the higher l e v e l models impose interpretations on the l e v e l i n consideration. Such a form of processing should be seen as a multi-helix, rather than as a single helix (see Figure 22d). Different cues can s t a r t d i f f e r e n t helices simultaneously. The h e l i c e s no longer represent a purely bottom-up process, however. Having completed the cycle at a certa i n l e v e l , there are e s s e n t i a l l y two p o s s i b i l i t i e s ; the helix can continue eit h e r bottom-up or top-down. The type of processing w i l l be d i f f e r e n t , however, depending on the mode, as explained above. Some implementation suggestions f o r such processing are given in section 7.2.1. The m u l t i - h e l i x M u l t i - p r o c e s s i n g can a l s o be proposed as a s o l u t i o n f o r the problems i n the segmentation domain. B a s i c a l l y , m u l t i - p r o c e s s i n g i s done to a v o i d unfavourable conseguences of a wrong segmentation, which i s an i n c o r r e c t i n t e r p r e t a t i o n , or f a i l u r e to i n t e r p r e t . The s t a r t i n g p o i n t f o r m u l t i - p r o c e s s i n g can be the computation o f d i f f e r e n t segmentation p o s s i b i l i t i e s . In the segmentation process we compute the p r i m i t i v e s and the cues i n the p i c t u r e . The problem with the h e l i x was t h a t a commitment was made to a c e r t a i n segmentation which was then t r i e d out i n the i n t e r p r e t a t i o n domain. a wrong segmentation a u t o m a t i c a l l y l e d t o i n t e r p r e t a t i o n f a i l u r e . The c u r r e n t p r o p o s a l i n c l u d e s a segmentation on a non-committal b a s i s . The b a s i c i d e a i s t o suggest p o s s i b l e p r i m i t i v e s and cues i n one pa r t of t h e p i c t u r e and to ask o u r s e l v e s the q u e s t i o n : "Suppose these cues and p r i m i t i v e s a r e c o r r e c t , how should we segment the r e s t of the p i c t u r e i f we want the p i c t u r e as a whole t o make sense i n the i n t e r p r e t a t i o n domain". F a i l u r e t o i n t e r p r e t the p i c t u r e as a whole i s an i n d i c a t i o n t h a t we chose the wrong segmentation. Thus, we t r y an a l t e r n a t i v e segmentation next. The i d e a behind such an i n t e r p r e t a t i o n - d r i v e n segmentation i s f u r t h e r e x p l a i n e d i n s e c t i o n 7.2.2 with a few examples. A s i m i l a r p r o p o s a l was made by Yakimovsky and Feldman (1973). However, t h e i r segmentation process was guided by Bayesian s t a t i s t i c s . Such an approach i s not i n c l u d e d i n the c u r r e n t p r o p o s a l . The m u l t i - h e l i x 7.2 Some examples 7.2.1 M u l t i p l e a c c e s s i b i l i t y of l e v e l s with top-down i m p o s i t i o n of i n t e r p r e t a t i o n s In F i g u r e 20, we can show how m u l t i p l e a c c e s s i b i l i t y of l e v e l s i n the i n t e r p r e t a t i o n domain co u l d work. The v e r t i c e s at the V e r t e x - l e v e l are used t o invoke the models a t the Edge l e v e l . Apart from t h i s , we can use r e g i o n r e l a t i o n s h i p s at the Vertex l e v e l to invoke models of the Sur f a c e naming l e v e l d i r e c t l y . INSIDE (BEGION V,BEGION I I I ) a t the Vertex l e v e l i m p l i e s w a l l or door as models f o r r e g i o n I I I and window f o r r e g i o n V (see Table 3 ) . One o f the model i n t e r p r e t a t i o n s o f the cue COMMON(BEGION III,BEGION IV) i s w a l l f o r r e g i o n I I I and door f o r r e g i o n IV. A t h i r d r e l a t i o n s h i p not found i n BOOSE at present c o u l d be ABOVE (BEGION II,BEGION III,SUB-CHN 10) and ABOVE {BEGION II,BEGION III,SOB-CHN 11). Begion I I would be a roof or t o p - f a c e and r e g i o n I I I a w a l l or s i d e - f a c e i n t h i s s i t u a t i o n . , Sub-chains 2, 3, 10, and 11 would be s l a n t e d or h o r i z o n t a l . An NC t e s t a t t h i s l e v e l would r e s u l t i n a unigue i n t e r p r e t a t i o n f o r each of the r e g i o n s (not f o r most of the s u b - c h a i n s ) . An i n t e r n a l c o n s i s t e n c y t e s t would f u r t h e r reduce the number of p o s s i b l e i n t e r p r e t a t i o n s . H a l l s , doors and windows a r e v e r t i c a l s u r f a c e s . As a r e s u l t , a l l v e r t i c a l l i n e s i n w a l l s , doors and windows have t o be v e r t i c a l . For f u r t h e r c o n s t r a i n i n g of edge i n t e r p r e t a t i o n s we need some knowledge not The m u l t i - h e l i x 113 represented i n HOOSE i n i t s present implementation- Hindows (and d o o r s ) , when t h e i r apparent shape i s a p a r a l l e l o g r a m , u s u a l l y r e p r e s e n t a r e c t a n g l e as t h e i r r e a l shape, As a r e s u l t , a l l non v e r t i c a l l i n e s i n such a shape should be h o r i z o n t a l edges. The r u l e t h a t a l l p a r a l l e l edges i n the same s u r f a c e must have the same o r i e n t a t i o n w i l l then r e s u l t i n a h o r i z o n t a l i n t e r p r e t a t i o n f o r the sub-chains 6,7,8,9,10,12,13 and 14. Thus, we have obtained a complete d e s c r i p t i o n of the sketch a t the Surface naming l e v e l . a f t e r t h i s we have the p o s s i b i l i t y to bootstrap both i n t o the next higher l e v e l or next lower l e v e l . One of the goals c f the process was to p r o v i d e c o n s i s t e n t d e s c r i p t i o n s of the p i c t u r e at a l l l e v e l s of a b s t r a c t i o n , so we w i l l do both. B o o t s t r a p p i n g up i n the h i e r a r c h y f o l l o w s the r o u t i n e o f the present implementation. In order t o impose i n t e r p r e t a t i o n s on lower l e v e l s we w i l l use the cue/model h i e r a r c h y i n the r e v e r s e d i r e c t i o n . a w a l l , door, and window can be a v e r t i c a l s u r f a c e only a t the o r i e n t a t i o n l e v e l , whereas a roof can be both a s l a n t e d or a h o r i z o n t a l s u r f a c e . Because of the edge i n t e r p r e t a t i o n s (which remain the same a t the O r i e n t a t i o n l e v e l ) the i n t e r n a l c o n s i s t e n c y t e s t w i l l d e l e t e the l a t t e r p o s s i b i l i t y f o r the r o o f . An NC t e s t w i l l show t h a t t h e other i n t e r p r e t a t i o n s are c o n s i s t e n t . As a r e s u l t of the top-down approach we are now ab l e to impose i n t e r p r e t a t i o n s on the sub-chains at the Edge l e v e l . Cues o f the type 0CCL0DE-C0NCAVE(HALI,GROUND,tf-EDGE) not only serve as bottom-up cues f o r the Object l e v e l now., They can a l s o be used i n the re v e r s e way. H a l l and ground separated by a The m u l t i - h e l i x 114 h-edge i m p l i e s occiude-concave as the only p o s s i b l e model at the Edge l e v e l . In the same way sub-chains 10 and 11 can onl y be convex, because t h i s i s the only i n t e r p r e t a t i o n allowed between a w a l l and a r o o f . an important aspect of these top-down imposed i n t e r p r e t a t i o n s i s that they allow us to t r a c e the edge l e v e l cues (the v e r t i c e s ) which cause i n c o n s i s t e n c y . In t h i s case two L - v e r t i c e s do not allow the convex and occiude-concave i n t e r p r e t a t i o n s . a s u c c e s s f u l i n t e r p r e t a t i o n of the sketch at the Edge l e v e l i n a top-down way combined with f a i l u r e t o do so i n the bottom-up d i r e c t i o n not only suggests resegmentation, i t a l s o i n d i c a t e s where t o resegment. a bottom-up a n a l y s i s o n l y shows f a i l u r e t o come to a c o n s i s t e n t d e s c r i p t i o n but does not l e a d us to the cause of t r o u b l e . T h i s approach i m p l i e s a simultaneous p r o c e s s i n g at d i f f e r e n t l e v e l s and c o n s t r a i n i n g between l e v e l s i n both bottom-up and top-down mode. Top-down c o n s t r a i n i n g added t o bottom-up c o n s t r a i n i n g w i l l a l s o a s s i s t i n suppressing the bulge of i n t e r m e d i a t e d e s c r i p t i o n s t h a t are the r e s u l t o f the bottom-up approach i n both the house and the wedge. For example, the i n t e r p r e t a t i o n o f r e g i o n V as window and re g i o n I I I as w a l l d i r e c t l y i m p l i e s a crack r e l a t i o n s h i p f o r a l l the boundary edges of the window a t the Edge l e v e l . Imposing t h i s c o n s t r a i n t on the Edge l e v e l , l i m i t s the number of p o s s i b l e i n t e r p r e t a t i o n s . I n the present implementation of HOOSE t h i s crack c o n s t r a i n t was an i n t e r n a l c o n s i s t e n c y c o n s t r a i n t a p p l i e d a t the Edge l e v e l i t s e l f . The m u l t i - h e l i x 115 A p p l i c a t i o n of t h i s type of c o n s t r a i n t as an i n t e r n a l c o n s t r a i n t was i n a p p r o p r i a t e because the knowledge f o r the a p p l i c a t i o n of such a c o n s t r a i n t i s at the Surf a c e naming l e v e l and not at the Edge l e v e l . Such an example demonstrates how a combination of a top-down and bottom-up approach can be e f f i c i e n t . 7.2.2 I n t e r p r e t a t i o n - d r i v e n segmentation In F i g u r e 21 there i s an example to e x p l a i n how an i n t e r p r e t a t i o n - d r i v e n segmentation c o u l d work. Apart from the two e n c i r c l e d gaps a l l the gaps between l i n e s are well c l o s e d i n t h i s f i g u r e . Although we cannot be sure t h a t a l l the w e l l c l o s e d gaps were r e a l l y intended t o be c l o s e d , we can assume t h i s f o r the moment with the o p t i o n t h a t we can s t a r t a l l over again with d i f f e r e n t c l o s u r e assumptions i f i t turns out that the p i c t u r e can never make sense i n the i n t e r p r e t a t i o n domain under the present c l o s u r e assumptions. The good c l o s u r e s allow f o r the computation of s e v e r a l p r i m i t i v e s and cues. F o r example, r e g i o n VI has an INSIDE r e l a t i o n with e i t h e r r e g i o n IV or the r e g i o n combination I-IV. He can t r y out both p o s s i b i l i t i e s . In Table 3 we can see that INSIDE i m p l i e s a window f o r r e g i o n VI and a w a l l or door f o r r e g i o n IV or the r e g i o n combination I-IV. However, at the Surface naming l e v e l r e g i o n I-IV, as out e r r e g i o n , has t o be ground. Thus, a c l o s u r e o f the gap so th a t r e g i o n s I and IV are separated i s the only p o s s i b l e segmentation. S i m i l a r l y , r e g i o n V has a COMMON r e l a t i o n s h i p with e i t h e r r e g i o n I I I o r the r e g i o n combination The m u l t i - h e l i x 116 I I - I I I . Since n e i t h e r r e g i o n I I o r I I I i s an outer r e g i o n ground i s an im p o s s i b l e i n t e r p r e t a t i o n , which means t h a t the re g i o n s I I and I I I can e i t h e r be w a l l or door-handle (see Table 3). The c e n t r a l f o r k i n the p i c t u r e now a s s i s t s a f u r t h e r l i m i t a t i o n of p o s s i b l e i n t e r p r e t a t i o n s . A door-handle and a w a l l , or a door and a w a l l , can be connected by a crack edge o n l y . The c e n t r a l f o r k does not a l l o w such an i n t e r p r e t a t i o n f o r i t s edges (see Table 1). T h i s l e a v e s us with w a l l as i n t e r p r e t a t i o n f o r both the r e g i o n combination I I - I I I and f o r r e g i o n I ? , a d d i t i o n a l l y , w a l l s have t o be connected by a convex edge, and t h i s edge has t o be v e r t i c a l . I n a w a l l a n o n - v e r t i c a l l i n e can never d e p i c t a v e r t i c a l edge. As a r e s u l t , edge 1 cannot be v e r t i c a l which i n t u r n i m p l i e s that r e g i o n I I cannot be i n t e r p r e t e d as w a l l . Thus, Figure 21 can only be c o n s i s t e n t i f the r e g i o n s I I and I I I are s e p a r a t e r e g i o n s . I n t e r p r e t a t i o n - d r i v e n segmentation a l s o p r o v i d e s a s o l u t i o n f o r the problems i n F i g u r e 19. In the d i s c u s s i o n of t h i s F i g u r e i t was mentioned a l r e a d y , that humans can apply i n f o r m a t i o n from the i n t e r p r e t a t i o n domain (e.g., t h e shape of the mountains and the course of r i v e r s ) . T h i s i n f o r m a t i o n can be used t o a d j u s t the segmentation, i f necessary. I f we are i n t e r p r e t i n g a segmentation p o s s i b i l i t y i n which a c h a i n with two Free-ends i s i n t e r p r e t e d as a r i v e r , then the r i v e r model can impose a c l o s u r e a t one end of the c h a i n . T h i s can be done, because we know t h a t one end of a r i v e r always has to be Tee or a h i g h e r order j u n c t i o n . A s i m i l a r reasoning process can be used f o r the The m u l t i - h e l i x 1.17 mountain. S e v e r a l requirements have t o be s a t i s f i e d b e f o r e a c h a i n can be i n t e r p r e t e d as a mountainside. One such requirement i s t h a t such a c h a i n has to form an I r - c o n f i g u r a t i o n with another c h a i n such t h a t both c h a i n s are approximately symmetrical with r e s p e c t t o the v e r t i c a l a x i s . The e n c i r c l e d Tee c o u l d be formed i f and only i f a l l three T-arms co u l d be i n t e r p r e t e d as mountainside which i s not the case. Here are a few examples how a proposed segmentation i n one part of the p i c t u r e can suggest models as i n t e r p r e t a t i o n . However, once these models are suggested, they impose r e s t r i c t i o n s on the p o s s i b l e segmentations i n another p a r t of the p i c t u r e . However, one problem remains. which i n i t i a l segmentation do we s t a r t from? The c h o i c e of such an i n i t i a l segmentation i s l e s s obvious than i t was F i g u r e 21. He might t r y out s e v e r a l i m p o s s i b l e segmentations before we t r y out one which c l o s e s n e i t h e r o f the e n c i r c l e d v e r t i c e s . One way to s o l v e the problems i n Figure 18 i s to assume t h a t r e g i o n shapes can serve as models f o r d r i v i n g the segmentation. I n HODSE ,s prese n t implementation the r e g i o n shape can onl y be computed a f t e r the v e r t i c e s have been formed. However, i t i s a l s o p o s s i b l e t o have two p a i r s of p a r a l l e l l i n e s serve as a cue f o r a p a r a l l e l o g r a m , when each l i n e i n each p a i r a f t e r e x t e n s i o n i n both d i r e c t i o n s i n t e r s e c t s with each of the l i n e s of the other p a i r . The th r e e proposed p a r a l l e l o g r a m s proposed f o r F i g u r e 18 would r e g u i r e c l o s u r e of a l l the gaps. T h i s segmentation r e s u l t s i n a c o n s i s t e n t i n t e r p r e t a t i o n o f the f i g u r e . The m u l t i - h e l i x 118 I t should i e noted that the approach proposed i n the m u l t i - h e l i x metaphor i s very s i m i l a r to the approach taken i n HEABSAY I I (Lesser and Erman, 1977, 1978). HEASSAY I I knowledge sources resemble HOUSE ' S cue/model h i e r a r c h y , which now a l s o works i n a top-down way. Knowledge sou r c e s , a c t i v e only w i t h i n one l e v e l , resemble HOUSE ' S c o n s i s t e n c y t e s t s . HEABSAY*s-blackboard r e p r e s e n t i n g l e v e l s of d e s c r i p t i o n i s s i m i l a r to HOUSE ' S l e v e l s of r e p r e s e n t a t i o n . Towards a theory of machine p e r c e p t i o n 119 CHAPTER 8: TOWARDS A TJEORI OF MACHINE PERCEPTION S e v e r a l metaphors f o r d e s c r i b i n g the p e r c e p t u a l process have been d i s c u s s e d by now. F i r s t of a l l , the c y c l e of p e r c e p t i o n has been discu s s e d (Figure 22a). The c y c l e with i t s f o u r stages (cue d i s c o v e r y , model i n v o c a t i o n , model t e s t i n g and model e l a b o r a t i o n ) i s able t o d e s c r i b e the flow of c o n t r o l i n most v i s i o n programs (Mackworth, 1978). Apart from that the c y c l e a l s o embodies the c h i c k e n and egg problem i n p e r c e p t i o n : segmentation means i n t e r p r e t a t i o n and v i c e versa. Hereby, cue di s c o v e r y can stand f o r segmentation whereas model t e s t i n g and e l a b o r a t i o n can r e p r e s e n t the i n t e r p r e t a t i o n process. C o n s e r v a t i v e segmentation f o l l o w e d by i n t e r p r e t a t i o n allows us t o avoid the c h i c k e n and egg problem i n a number of s i t u a t i o n s , as we have seen. model elaboration cue discovery model t e s t i n g model invocation F i g u r e 22a: The c y c l e of per c e p t i o n r i e n t a t i o n Level Edge Level Vertex ^ Level Line/region - Level Sketch ^ Level Figure 22b: The h e l i x towards a theory of machine p e r c e p t i o n 121 The l e s s o n s learned from p r e v i o u s r e s e a r c h i n Machine V i s i o n r e s u l t e d i n a l e v e l s of r e p r e s e n t a t i o n approach f o r HOUSE. HOUSE* s c o n t r o l s t r u c t u r e can he d e s c r i b e d by the h e l i c a l v e r s i o n of the c y c l e ( F i g u r e 22b). Another v e r s i o n of the c y c l e o f p e r c e p t i o n metaphor r e s u l t s from Havens* work (Havens, 1978). F i r s t of a l l , an i t e r a t i v e r e c o g n i t i o n c y c l e i s formed between the e x p e c t a t i o n and matching stage i n h i s r e c o g n i t i o n model. Each schema seeks to s a t i s f y i t s e x p e c t a t i o n s by s e a r c h i n g f o r the type o f i n f o r m a t i o n s p e c i f i e d i n the e x p e c t a t i o n s . Depending on the evidence found the schema w i l l e i t h e r suspend i t s e l f or continue to seek c o n f i r m a t i o n o f i t s e x p e c t a t i o n s on b a s i s of the evidence found. T h i s process c o n t i n u e s u n t i l a l l the e x p e c t a t i o n s are s a t i s f i e d . Havens* r e c o g n i t i o n model i s a l s o c y c l i c i n another sense. Once a l l the e x p e c t a t i o n s o f a schema are s a t i s f i e d then the schema w i l l seek co m p l e t i o n , t h a t i s , i t w i l l a c t as a cue f o r other schemata f u r t h e r up i n the h i e r a r c h y . Havens approach of the p e r c e p t u a l process can t h e r e f o r e be seen as a b i - c y c l e metaphor f o r p e r c e p t i o n (Figure 22c). Towards a theory of machine p e r c e p t i o n 122 ons F i g u r e 22c: The b i - c y c l e metaphor The advantage o f the b i - c y c l e metaphor i s , t h a t i t makes an e x p l i c i t d i s t i n c t i o n between top-down and bottcm-up processes; the completion c y c l e r e p r e s e n t s the b o t t o i - u p component of the p e r c e p t u a l process, whereas the o b s e r v a t i o n c y c l e r e p r e s e n t s the top-down component. Representing the two component processes by d i f f e r e n t c y c l e s expresses the f a c t t h a t they have d i f f e r e n t c h a r a c t e r i s t i c s , as was e x p l a i n e d i n s e c t i o n 2.1; the two types of processes can be d i s t i n g u i s h e d as to whether or not they are Towards a theory of machine p e r c e p t i o n 123 g o a l - o r i e n t e d . The c y c l e of p e r c e p t i o n does not provide such a d i s t i n c t i o n . The m u l t i - h e l i x (Figure 22d) has i n h e r i t e d the p r o p e r t i e s of noth metaphors. The c y c l e i s represented i n the l e v e l o f r e p r e s e n t a t i o n , whereas the top-down and bottom-up processes are re p r e s e n t e d by the d i r e c t i o n o f i n v o c a t i o n ; an upward flow of i n f o r m a t i o n means a bottom-up p r o c e s s , an i n f o r m a t i o n flow i n the o p p o s i t e d i r e c t i o n means a top-down process. However, the m u l t i - h e l i x metaphor expresses more than t h a t . L i k e the s i n g l e - h e l i x , the m u l t i - h e l i x shows the presence of d i f f e r e n t l e v e l s of r e p r e s e n t a t i o n and p r o c e s s i n g , and l a s t but not l e a s t , the m u l t i - h e l i x expresses the m u l t i p l e a c c e s s i b i l i t y of the d i f f e r e n t l e v e l s of r e p r e s e n t a t i o n i n both a bottom-up and top-down d i r e c t i o n . Object Level 124 Surface naming Level Vertex ^ Level Line/region - Level Sketch . Level Figure 22d: The m u l t i - h e l i x Towards a theory of machine p e r c e p t i o n 125 There a l s o remain problems, however. The metaphor does c e r t a i n l y not provide a s o l u t i o n f o r a l l problems i n the p e r c e p t u a l process. Although the metaphor i n d i c a t e s , how the i n t e r m e d i a t e bulge of d e c r i p t i o n s as they occur i n HOOSE might be suppressed, to mention an example, the process by means of which s u p p r e s s i o n should occur, i s not w e l l d e f i n e d . In the house example, both the Edge l e v e l and the Surface naming l e v e l can be accessed d i r e c t l y from the Vertex l e v e l (Figure 22d). However, p r o c e s s i n g at the S u r f a c e naming l e v e l has t o be completed b e f o r e i t can impose i n t e r p r e t a t i o n s on the Edge l e v e l . Meanwhile the i n t e r p r e t a t i o n s a t the Edge l e v e l can explode. Thus, we are faced with a t i m i n g problem. He can o n l y e f f i c i e n t l y suppress the i n t e r p r e t a t i o n e x p l o s i o n a t the Edge l e v e l , i f top-down pr o c e s s i n g towards t h a t l e v e l comes bef o r e or s i m u l t a n e o u s l y with the bottom-up p r o c e s s i n g towards that l e v e l . The m u l t i - h e l i x metaphor can r e p r e s e n t some o f the p e r c e p t u a l p r i n c i p l e s i n a b e t t e r way, than the present implementation does. F i r s t of a l l , t h e r e i s a b e t t e r balance between top-down and bottom-up p r o c e s s i n g i n the m u l t i - h e l i x . HOOSE has a s t r o n g bottom-up b i a s . Such a b i a s i s h e l p f u l t o i n v e s t i g a t e the s t r e n g t h ( f o r example, c e r t a i n forms of e f f i c i e n c y ) and weaknesses ( f o r example, the i n a b i l i t y to d e a l with incomplete drawings) of such an approach. However, a system that wants t o compete with the human p e r c e p t u a l p o t e n t i a l needs top-down processing not only w i t h i n a l e v e l (as manifested i n HO0SE*s present implementation) but a l s o i n top-down d i r e c t i o n * Towards a theory of machine perception 126 The new proposal also emphasizes the importance of redundancy- In the perceptual process the a b i l i t y to recognize a degraded picture depends very much on t h i s property. Regardless how degraded the picture i s , as long as the picture provides enough cues, recognition can succeed. Although the multi-helix expresses more fa c t s about the perceptual process than the other metaphors discussed, i t cannot be seen as a theory of machine perception. If we want to present a theory of machine perception, then we need i d e n t i f i c a t i o n and precise formulation of a l l the perceptual p r i n c i p l e s that constitute the perceptual process. As we have seen i n section 2.1, there i s only an i n t u i t i v e notion of what the perceptual p r i n c i p l e s are, and t h e i r formulation i s not very precise. Nevertheless, machine v i s i o n research gradually progresses towards a theory of machine perception. The design of programs, based on metaphors that represent our i n t u i t i v e notions of the perceptual p r i n c i p l e s with increasing accuracy, w i l l a s s i s t a formulation of these p r i n c i p l e s with increasing p r e c i s i o n . The multi-helix metaphor provides another step i n that d i r e c t i o n , but much remains to be done. B i b l i o g r a p h y 127 BIBLIOGRAPHY: Anderson J . (1976), Language, Memory and Thought, Lawrence Erlbaum A s s o c i a t e s , H i l l s d a l e , New J e r s e y . Barrow H.G. and Tenenbaum J.M. (1976), "MSYS a System f o r Seasoning about Scenes", Techn. Note 121, A.I. Center, Stanf o r d Hes. I n s t . , Menlo Park, CA. B a r t l e t t F.C. (1932), Remembering, Cambridge Univ. P r e s s , London, r e p r i n t e d i n 1967. Becker J . (1973), "A Model f o r the Encoding of Experimental I n f o r m a t i o n " , i n Schank B. and Colby K. (e d s . ) , Computer  Models of Thought and Language, Freeman, San F r a n c i s c o , pp. 397-434. Bobrow D.G. and C o l l i n s A. (1975), Be pre s e n t a t i on and Understanding, academic Press, New York. Bobrow D.G. and Norman D.A. (1975), "Some P r i n c i p l e s o f Memory Schemata", i n Bobrow D.G. and C o l l i n s A., Re p r e s e n t a t i o n and Understanding. Academic Press, pp. 131-150. Bobrow D.G. and Hinograd T. (1977), "An Overview of KBL, a Knowledge Be p r e s e n t a t i o n Language", C o g n i t i v e S c i e n c e . J , 3-46. Clowes M.B. (1971), "On Seeing Things". A r t i f i c i a l I n t e l l i g e n c e . 2, 1, 79-112. Clowes M.B. (1973), " A r t i f i c i a l I n t e l l i g e n c e as Psychology", La b o r a t o r y of Experimental Psychology, Univ. of Sussex, Palmer, Nr., B r i g h t o n . Evans T.G. (1963), "A H e u r i s t i c Program t o Solve Geometric Analogy Problems", Ph.D. t h e s i s , i n Minsky M. (ed.), Semantic  Information P r o c e s s i n g , M.I-T. P r e s s , Cambridge, Mass., pp. 271-353. Fa l k G. (1972), " I n t e r p r e t a t i o n of L i n e Data as a Three-dimensional Scene", A r t i f i c i a l I n t e l l i g e n c e . 3, 2, 101-14 4. F i l l m o r e C. (1968), "The Case f o r Case", i n Bach E. and H a r r i s R.I. (eds.\ . O n i v e r s a l s i n L i n g u i s t i c Theory. H o l t . B i n e h a r t and B i n s t o n , New York, pp. 1-90. Freuder E.C. (1976a), " S y n t h e s i z i n g C o n s t r a i n t E x p r e s s i o n s " , A.I. Memo 370, M.I.T., Cambridge, Mass. Freuder E.C. (1976b), "A Computer System f o r V i s u a l R e c o g n i t i o n Using A c t i v e Knowledge", Ph.D. t h e s i s , AI-T8-345, M.I.T., B i b l i o g r a p h y 128 A.I- L a b o r a t o r y , Cambridge, Mass. Gaschaig J . (1974), "A C o n s t r a i n t S a t i s f a c t i o n Method f o r Inf e r e n c e Making", Proc. 12th Ann. A l l e r t o n Conf. on C i r c u i t theory,, 0. of I l l i n o i s , Urbana Champaign, 111., pp. 866-874. Guzman A. (1969), "Decomposition of a V i s u a l Scene i n t o Three-dimensional Bodies", i n G r a s s e l l i A. (ed.). Automatic  I n t e r p r e t a t i o n and C l a s s i f i c a t i o n of Images, Academic P r e s s , Sew York, pp. 243-276. Havens U.S. (1978), "A Pro c e d u r a l Model of r e c o g n i t i o n f o r Machine P e r c e p t i o n " , Ph.D. T h e s i s , TH-78-3, Department of Computer S c i e n c e , Univ. of B r i t i s h Columbia, march 1978. Hubel D.H. and Wiesel T.N. (1962), " P e r c e p t i v e P i e l d s , B i n o c u l a r I n t e r a c t i o n and F u n c t i o n a l A r c h i t e c t u r e i n the Cat*s V i s u a l C o r t e x " , J o u r n a l o f Physiology (London). 160. 106-154. , Hubel D.H. and S i e s e l T.N. (1963), "Shape and Arrangements of Columns i n the Cat*s S t r i a t e C o r t e x " , J o u r n a l of Physiology  (London). 165, 559-568. Hubel D.H. and S i e s e l T.N. (1965), "Beceptive F i e l d s and F u n c t i o n a l A r c h i t e c t u r e i n two N o n s t r i a t e V i s u a l Areas (18 and 19) of the C a t " , J o u r n a l of Neurophysiology. 28, 229-289. Hubel D.H. and H i e s e l T.N. (1968), "Beceptive F i e l d s and F u n c t i o n a l A r c h i t e c t u r e of Monkey S t r i a t e C o r t e x " , J o u r n a l of Physiology (London). 195. 215-243. Hueckel M. (1971), "An Operator which Locates Edges i n D i g i t i z e d P i c t u r e s " , J o u r n a l o f the Ass, of Comp. Hach.. 18, 1, 113-125. Huffman D.A. (1971), "Impossible Objects as Nonsense Sentences", i n H e l t z e r B. and H i c h i e D. ( e d s . ) . Machine I n t e l l i g e n c e . v o l . VI, Edinburgh Univ. Press, pp. 295-323. Jones B. (1978), " I n t e g r a t e d G r a p h i c s i n L i s p " , Deptmt. of Comp. Sc i e n c e , Univ. of B r i t i s h Columbia, T e c h n i c a l Note TH-21. K e l l y M.D. (1971), "Edge D e t e c t i o n i n P i c t u r e s by Computer Using P l a n n i n g " , i n Me l t z e r 8. and Mi c h i e D. (e d s . ) . Machine I n t e l l i g e n c e , v o l . VI, Edinburgh Univ. P r e s s , pp. 397-410. Kuipers B-J. (1975), "A Frame f o r Frames", i n Bobrow D.G. and C o l l i n s A. (e d s . ) . B e p r e s e n t a t i o n and Understanding, Academic Press, New York, pp.,151-184. K u l i k o w s k i C. (1974), "A System f o r Computer-Based Medical C o n s u l t a t i o n " , Proc. of the N a t i o n a l Computer Conference, Chicago, 111., Bay 1974. B i b l i o g r a p h y 129 Lesse r V.B. and Erman L.D. (1977), "A B e t r o s p e c t i v e View of the HEARSAY I I A r c h i t e c t u r e " , Proc. of IJCAI-77. H.I.I., Cambridge, Mass., pp. 790-800. L e s s e r V.B. and Erman L-Q. (1978), "HEABSAY I I : T u t o r i a l I n t r o d u c t i o n and B e t r o s p e c t i v e View", Deptmt. of Comp. Sci e n c e , Carnegy Mellon Univ., Techn- Report CMU-CS-78-117. Hackworth A.K. (1973), " I n t e r p r e t i n g P i c t u r e s of P o l y h e d r a l Scenes", A r t i f i c i a l I n t e l l i g e n c e . 4, 2, 121-137. Hackworth A.K. (1975), "How t o See a Simple World", Deptmt. of Computer S c i e n c e , Univ. of B r i t i s h Columbia, TB-75-4; a l s o p u b l i s h e d i n E l c o c k E.W. and H i c h i e D . (eds.), Machine  I n t e l l i g e n c e , v o l . V I I I , John Wiley, 1977. Hackworth A.K. (1977a), "On r e a d i n g Sketch Maps", Proc. of IJCAI-77. M.I.T., Cambridge, Mass., pp. 598-606. Hackworth A.K.,(1977b), "Consistency i n Networks of B e l a t i o n s " , A r t i f i c i a l I n t e l l i g e n c e . 8, 1, 99-118. Hackworth A.K. (1977c), "Model-driven I n t e r p r e t a t i o n i n I n t e l l i g e n t V i s i o n Systems", P e r c e p t i o n . 5, 349-370. Hackworth A.K. (1978), " V i s i o n Research S t r a t e g y : Black Magic, Metaphors, Mechanisms, Miniworlds and Maps", i n Riseman E. and Hansen A. ( e d s . ) . Computer V i s i o n Systems. Academic Press, New York, pp. 53-61. Minsky M. (1975), "A Framework f o r Representing Knowledge", i n Winston P.H. (e d . ) . The Psychology of Computer V i s i o n . McGraw-Hill, New York, p p l 211-277. Montanari U. (1974), "Networks of C o n s t r a i n t s : Fundamental P r o p e r t i e s and A p p l i c a t i o n s to P i c t u r e P r o c e s s i n g " , Information S c i e n c e s . 7. 95-132. Mulder J.A. and Hackworth A.K. (1978a), "Using M u l t i - l e v e l Semantics to Understand Sketches of Houses and other P o l y h e d r a l O b j e c t s " , Proc. of the second Hat. Conf. o f the  Can. Soc. f o r Computational S t u d i e s of I n t e l l i g e n c e , Univ. of Toronto, Toronto, O n t a r i o , pp. 244-253. Mulder J.A. and Mackwcrth A.K. (1978b), "Schemata R i v a l r y i n the P e r c e p t i o n of V i s u a l S l a n t " , ( i n p r e p a r a t i o n ) . N e i s s e r U. (1976), C o g n i t i o n and R e a l i t y . Freeman and Comp., San F r a n c i s c o . Newell A. (1973), " P r o d u c t i o n Systems: Model of C o n t r o l S t r u c t u r e s " , i n Chase W.G. (ed.) , V i s u a l I n f o r m a t i o n  P r o c e s s i n g . Academic Press, New York. B i b l i o g r a p h y 130 Newell A• and Simon H.A. (1963), "GPS: a Program t h a t Simulates Human Thought", i n Feigenbaum E-A. and Feldman J . (eds . ) , Computers and Thought, McGraw-Hill, New York, pp. 279-296. Newell A. and Simon H.A. (1972), Human Problem S o l v i n g . Englewood C l i f f s , New J e r s e y , P r e n t i c e B a l l . Norman D.A. and Bobrow D.G. (1976), "On the Bole of A c t i v e memory Processes i n P e r c e p t i o n and C o g n i t i o n " , i n C o f e r C.N . (ed.), The S t r u c t u r e of Human Memory. Freeman and Comp., San F r a n c i s c o , pp. 111-132. Norman D.A. and Bumelhart D.E. (1975), E x p l o r a t i o n s i n C o g n i t i o n . Freeman and Comp., San F r a n s i s c o . Palmer S.E..(1975), " V i s u a l P e r c e p t i o n and World Knowledge: Notes on a Model of Se n s o r y - C o g n i t i v e I n t e r a c t i o n " , i n Norman D.A. And Bumelhart D.£., E x p l o r a t i o n s i n C o g n i t i o n . Freeman and Comp., San F r a n c i s c o , pp. 279-307. Piaget J . J . (1971), Biology and Knowledge, Oniv. of Chicago Press, Chicago. Pylyshyn Z. (1976), "Imagery and A r t i f i c i a l I n t e l l i g e n c e " , i n Savage W. (ed.), Minnesota S t u d i e s i n the Philosophy of  Scienc e , v o l . IX, Oniv. of Minnesota Press, Minneapolis. Q u i l l i a n M.B. (1969), "Semantic Memory", i n Mimsky M. (ed.). Semantic Information P r o c e s s i n g , M.I.T. P r e s s , Cambridge, Mass., pp. 227-270. Boberts L-G. (1965), "Machine P e r c e p t i o n of Three-dimensional O b j e c t s " , i n T i p p e t J.T. e t A l . (eds.), O p t i c a l and E l e c t r o - o p t i c a l Information P r o c e s s i n g , M.I.T. P r e s s , Cambridge, Mass., pp.,159-197. Bosenfeld A., Hummel B.A. and Zucker S.W. (1976), "Scene L a b e l i n g by B e l a x a t i o n O p e r a t i o n s " , IEEE Trans, on Systems. Man and C y b e r n e t i c s . SMC-6. 420-433. Bumelhart D.E. and Ortony A. (1977), "The Bepresentation of Knowledge i n Memory", i n S p i r o B.J. and Montague W.E. ( e d s . ) . S c h o o l i n g and the A c q u i s i t i o n of Knowledge. Lawrence Erlbaum A s s o c i a t e s , H i l l s d a l e , New J e r s e y . Samuel A.L. (1963), "Some S t u d i e s i n Machine L e a r n i n g Using the Game of Checkers", i n Feigenbaum E.A. and Feldman F- ( e d s . ) . Computers and Thought. McGraw-Hill, New York, pp. 71-108. S h i r a i Y. (1975), " A n a l y s i n g I n t e n s i t y Arrays Osing Knowledge about Scenes", i n Winston P.H. (ed.), the Psychology of Computer V i s i o n , McGraw-Hill, New York, pp. 93-114. B i b l i o g r a p h y 131 S t a r r D.W. and Hackworth A.K- (1976), " I n t e r p r e t a t i o n D i r e c t e d Segmentation of EBTS Images", Deptmt- of Comp.,Science, Univ. of B r i t i s h Columbia, TB-76-3-Waltz D.L. (1972), "Generating Semantic D e s c r i p t i o n s from Drawings of Scenes with Shadows", Ph.D. T h e s i s , AI-TB-271, H.I.T., Cambridge, Mass. Weiss S- (1974), "A System f o r Hodel-Based Computer-Aided Diagnosis and Therapy", Computer Science Deptmt-, Butgers U n i v e r s i t y , Hew Brunswick, NJ-, T e c h n i c a l Beport, CBM-TB-27. Winograd T. (1975), "Frame B e p r e s e n t a t i o n s and the P r o c e d u r a l - D e c l a r a t i v e C o n t r o v e r s y " , i n Bobrow D.G- and C o l l i n s A. (eds-), B epresentation and Understanding, Academic Press, New York, pp. 185-210. Yakimovsky Y. and Feldman J . A. (1973), "A Semantics-Based D e c i s i o n Theory Begion A n a l y z e r " , Proceedings of IJCAI-73. pp. 580-588. Zucker S.W. (1978), "Low-level V i s i o n , C o n s i s t e n c y and Continuous B e l a x a t i o n " , Proc. of the Second Nat. Conf. of the Can. Soc. f o r Computational S t u d i e s of I n t e l l i g e n c e . Univ. Of Toronto, Toronto, O n t a r i o , pp. 107-116. 

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0051802/manifest

Comment

Related Items