Knowledge-based visual interpretation using declarative schemata Browse, Roger Alexander 1982

KNOWLEDGE-BASED VISUAL INTERPRETATION USING DECLARATIVE SCHEMATA

by

ROGER A. BROWSE

B.Sc. McGill University, 1972
M.Sc. The University of British Columbia, 1977

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE STUDIES
DEPARTMENT OF COMPUTER SCIENCE

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
November 1982
© Roger A. Browse, 1982

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Computer Science
The University of British Columbia
1956 Main Mall
Vancouver, Canada V6T 1Y3

Date: [illegible]

Abstract

One of the main objectives of computer vision systems is to produce structural descriptions of the scenes depicted in images. Knowledge of the class of objects being imaged can facilitate this objective by providing models to guide interpretation, and by furnishing a basis for the structural descriptions. This document describes research into techniques for the representation and use of knowledge of object classes, carried out within the context of a computational vision system which interprets line drawings of human-like body forms.
A declarative schemata format has been devised which represents structures of image features which constitute depictions of body parts. The system encodes relations between these image constructions and an underlying three dimensional model of the human body. Using the component hierarchy as a structural basis, two layers of representation are developed. One references the fine resolution features, and the other references the coarse resolution. These layers are connected with links representative of the specialization/generalization hierarchy. The problem domain description is declarative, and makes no commitment to the nature of the subsequent interpretation processes. As a means of testing the adequacy of the representation, portions have been converted into a PROLOG formulation and used to "prove" body parts in a data base of assertions about image properties.

The interpretation phase relies on a cue/model approach, using an extensive cue table which is automatically generated from the problem domain description. The primary mechanisms for control of interpretation possibilities are fashioned after network consistency methods. The operation of these mechanisms is localized and separated between operations at the feature level and at the model level.

The body drawing interpretation system is consistent with aspects of human visual perception.
The system is capable of intelligent selection of processing locations on the basis of the progress of interpretation. A dual resolution retina is moved about the image collecting fine level features in a small foveal area and coarse level features in a wider peripheral area. Separate interpretations are developed locally on the basis of the two different resolution levels, and the relation between these two interpretations is analyzed by the system to determine locations of potentially useful information.

Table of Contents

Abstract
List of Figures
1 Introduction
2 Framework and Approach
  2.1 Use of Model Knowledge in Vision
    2.1.1 The Image Feature Access Approach
    2.1.2 The Volume Access Approach
    2.1.3 Discussion
  2.2 Line Drawings in Computer Vision Research
    2.2.1 Line as a Symbolic Level
    2.2.2 Some History
    2.2.3 Generalizing from the Blocks-World
      2.2.3.1 Compressing Constraints to a Single Level
      2.2.3.2 Different Types of Features
      2.2.3.3 Availability of Components
      2.2.3.4 Structure Within Labels
    2.2.4 Beyond the Blocks-World
  2.3 Levels of Resolution
    2.3.1 Resolution Levels in Natural Vision
    2.3.2 Resolution Pyramids or Cones
    2.3.3 Resolution Levels in Edge Detection
    2.3.4 Knowledge Interaction with Multiple Resolution Levels
  2.4 Location Selection in Vision
    2.4.1 Saccadic Eye Movements
    2.4.2 Non-Arbitrary Fixation Location and Duration
    2.4.3 Piecing Together Fixations
    2.4.4 Models for Saccadic Control
3 Research Overview
  3.1 A Model for Perception
  3.2 Declarative Schemata
4 A Computer Vision Implementation
  4.1 Problem Domain and Image Generation
  4.2 Knowledge Representation
    4.2.1 Adequacy of Representation
  4.3 Preparation for Interpretation
  4.4 Feature-Based Operations
  4.5 Model-Based Operations
    4.5.1 Locally Legal Interpretations Issue
    4.5.2 Applying Network Consistency
    4.5.3 Incremental Consistency
    4.5.4 Representing Relation Instances
  4.6 Selecting Processing Locations
5 Working Examples
  5.1 A Single Fixation
  5.2 Example with Multiple Fixations
6 Related Issues
  6.1 Grouping and Feature Integration
  6.2 Picture Grammars
  6.3 View Based Representations
7 Conclusions
References
Reference Notes
Appendix A Angles at Connections
Appendix B Body Form Knowledge
Appendix C PROLOG Body Definitions
Appendix D Interpretation Examples

List of Figures

2.1.1. The Multiple Access Model of the interactions of object knowledge in visual perception
2.1.2. The Volume Access Model derived from the multiple access model
2.1.3. The Image Feature Access Model derived from the multiple access model
2.2.1. Two labelled vertices from the blocks-world
2.2.2. (a) A mountain symbol (b) A sketch map
2.3.1. Typical local-global stimuli: (a) incompatible (b) compatible
3.1.1. Different levels in the image hierarchy cuing at different levels in the specialization hierarchy
4.1.1. Some examples of body form drawings
4.1.2. Complete collection of body part depictions
4.1.3. Image features and their attributes
4.1.4. Line drawing at (a) 1024x1024 initial line drawing, (b) 128x128 averaged image
4.1.5. Line drawing at (a) 32x32 averaged image (b) the axis of each detected blob
4.2.1. Component description of a view of a hand
4.2.2. Image to scene mapping description for a view of the hand
4.2.3. Body form in starting position
4.2.4. Body part orientation relative to its rest position described as a triple (6x,6y,6z)
4.2.5. A single depiction of an upper-leg used to represent three different orientations
4.2.6. Left-arm schema description
4.2.7. The component hierarchy for the fine layer of the body form representation
4.2.8. The component hierarchy for the coarse layer of the body form knowledge
4.2.9. The specialization/generalization hierarchy for the body form representation
4.3.1. A simple set labelling structure
4.3.2. A partial example of the set labelling data structure for the curvature of lines
4.4.1. A typical fixation of an image. The area in 128x128 resolution indicates the periphery, and the 1024x1024 area is the fovea. The rest of the image is shown in 32x32 resolution
4.4.2. Initial situation, showing three lines connected by image hierarchy to a blob feature. The model possibilities are shown beneath the line features
4.4.3. After the application of grouping consistency
4.4.4. The final situation after the inter-level consistency has been applied
4.5.1. Locally legal, but globally illegal structures in body form problem domain
4.5.2. A network constructed from a schema description
4.5.3. A network constructed from a schema description with an entry made
4.5.4. A network constructed from a schema description after several entries
4.5.5. A network constructed from a schema description after several entries
4.5.6. Incremental Consistency Algorithm
4.5.7. Example of specifications for the evaluation of attributes for a relation
5.1.1. The body form line drawing to be used as the example
5.1.2. Area of available fine layer features in the single fixation at point (350,325)
5.1.3. Area of available coarse layer features in the single fixation at point (350,325)
5.2.1. The first fixation (at location 449 192). The small squares indicate periphery, and the large squares show unprocessed areas
5.2.2. The second fixation at 162 448. The previously processed area is also shown
5.2.3. The third fixation at 160 228. The previously processed areas are also shown
5.2.4. The fourth fixation at 270 832
5.2.5. The fifth fixation at 96 672
5.2.6. The sixth fixation at 448 832
6.1.1. Two displays of the type used in Feature Integration Theory experiments, (a) conjunction target R, (b) feature target R
6.1.2. Analysis of display configurations shown in terms of model possibilities
6.1.3. Group processing digit detection display
6.1.4. Features available at two resolutions
6.1.5. Low resolution objects detected and model possibilities assigned to high resolution features which are roughly located
6.1.6. Features assigned to objects detected at a finer level of resolution, for one of the low level objects
D.1 The body form in rest position
D.2 A complete parse tree for a body form
D.3 A half body at large scale

Acknowledgement

I wish to thank Professor Alan Mackworth for his advice, support, encouragement, and inspiration. I am truly fortunate to have had my thesis supervised by such a helpful and understanding person.

I also wish to thank Professors Anne Treisman and Daniel Kahneman for guiding my involvement in psychology, and for helping me to understand what it means to be dedicated to the ideals of science.

I am grateful to Professors Richard Rosenberg, Ray Reiter, and Bob Woodham for their help and instruction. For the many useful and enjoyable discussions I thank Jan Mulder, Randy Goebel, Hilary Schmidt, Jay Glicksman, Bill Havens, Bill Prinzmetal, Jim Little, and Marc Majka.

Finally, I wish to thank Deborah Brown for all her patience and love.
This work was supported by scholarships from Izaak Walton Killam Memorial Scholarships and the Natural Sciences and Engineering Research Council of Canada.

1. Introduction

Throughout human history there has been continual effort to develop tools and machines which can improve the efficiency and effectiveness of work. With the appearance of the digital computer, emphasis has shifted from a concern for the enhancement of physical capabilities to a concern for the development of machines which can accomplish tasks which otherwise require human mental activity. Artificial Intelligence is one discipline within the science and technology which has emerged to meet this challenge.

One of the basic goals of Artificial Intelligence is to develop a computational understanding of the powerful processes performed by the human mind through addressing tasks which are known to involve these processes, such as natural language understanding, computer vision, problem solving and game playing. Should a computational basis for these processes be understood to the extent that computers could be programmed to accomplish similar functions, profound modifications would be necessary in our concepts of intelligence and of existence.

Over the first twenty years of Artificial Intelligence a number of areas of study have arisen as technologically useful byproducts of this research, such as knowledge representation, image analysis, expert systems, and robotics.
Still it is unclear whether any progress has been made towards unravelling the mysteries of the computational basis of human mental operations.

The research described in this document is concerned with the identification and pursuit of two principles which appear as important directions towards the accomplishment of this basic goal of Artificial Intelligence. These principles are followed in the context of a computational vision system which interprets line drawings of human-like body forms.

The first principle is a commitment to declarative structures, and to the separation of knowledge about objects and situations from the processes which employ the knowledge. If the computer is to remain an effective tool in research aimed at exposing computational mechanisms required for human information processing tasks, then it is essential that perspicuous definitions of the underlying knowledge structures be made available. Declarative structures are ideal for the explicit definition of both the task being undertaken, and the knowledge being employed, largely because of the close ties between declarative structures and the well known formal mechanisms of logic and grammatical representation. A clear separation of knowledge and process provides the potential for verification and transferral of methods to other problem domains.

This idea has an intuitive appeal.
In everyday human activity it appears that different processes access the same knowledge structures: the same knowledge of objects seems to be employed in visual understanding, in forming mental images, in drawing, and in haptic manipulation. Furthermore, given the enormous array of objects understood in visual tasks, it is only reasonable that the complex operations of interpretation are not bound up separately with each object type, but rather reside as a unitary system which may operate with selected object knowledge.

This commitment to the separation of knowledge and process in vision has led to the development of a declarative schemata format for encoding knowledge of the problem domain of line drawings of human-like body forms. This declarative structure makes significant extensions to the earlier work in the use of grammatical representations for visual knowledge, and as well provides links to other popular approaches to line drawing interpretation.

The second principle being pursued in this research centers on the importance of considering the characteristics of human operations in the design of Artificial Intelligence systems. In the computational study of visual processing capabilities it is important to recognize that while some aspects of human vision may be little more than artifacts of the biological implementation, other aspects may reflect fundamental properties of the underlying processes.
At the physiological level, for example, it has long been known that a stabilized image on the retina will quickly fade and disappear because of the properties of the retinal receptors, but it would not seem useful to build this characteristic into computer based imaging systems. On the other hand, properties of the spatial organization of receptors on the retina correspond directly to the characteristics of one of the most successful computational edge detection operators (Marr and Hildreth, 1980).

Consideration of the characteristics of human vision may also be productive in other respects. The body drawing interpretation system has been structured to incorporate several aspects of human vision, and within reasonable limits, an attempt has been made to provide connections and analogies between the system's operation and results obtained through Cognitive Psychology experimentation.

Chapter Two provides a framework and background to the approach which has been taken. Particular care has been taken to describe the interpretation labelling approach to computer vision, which places an importance on a variety of features and their role in suggesting models for objects depicted in the image. This approach is described in the context of research in line drawing interpretation.
Chapter Two also furnishes a computational perspective on the use of multiple resolution representations, and on the related topic of the selectional processes of vision. These areas form the basis of the system's consideration for human visual processing.

The third chapter is an overview of the research, giving a model for perception, and a description of the mechanism for encoding visual knowledge, called declarative schemata. This overview is presented without reference to the specific problem domain.

Most of the important ideas behind the computer implementation require examples for their description. Chapter Four presents the operations of the system, going through each stage in detail.

Chapter Five is an abbreviated demonstration of the working system. An example was chosen for the resulting clear demonstration of the processes described in Chapter Four. An appendix provides other examples.

During the design and implementation of the body form interpretation system, several issues came to light which relate to previous research, both in Computational Vision, and in Cognitive Psychology. Chapter Six discusses these issues.

Chapter Seven concludes with a summary and suggestions for future research.

2. Framework and Approach

The research presented in this thesis is related to a variety of established avenues of investigation into the nature of visual processes. This chapter singles out four aspects which require discussion in order to develop a groundwork of ideas and terminology.
The first topic is concerned with the point at which knowledge of specific objects might enter into the process of visual interpretation, with emphasis on the potential role of two dimensional image features in cuing such knowledge. The second presentation describes research centered on line drawings, exploring some of the limitations of the early problem domains, and following the progression of results to the development of "schemata-based" interpretation methods. The third issue is the use of multiple resolution levels in visual interpretation, which includes proposals for interpretation-based interactions among levels. The fourth topic is the selectional processes involved in visual interpretation. Research in saccadic eye movements is explored, with an attempt to uncover some of the computational bases of selection in human vision.

There are several other areas of investigation which relate to the current research. Appropriate discussion of these topics is deferred until after the elaboration of the implemented computer vision system provided in chapter four.

2.1. Use of Model Knowledge in Vision

Computational vision may be distinguished from its ancestral disciplines by its concern for the variation in appearance of objects when imaged.
The two major contributors to this variation are: (1) the many possible viewing conditions, including the diversities of lighting, and (2) the possible variations in the objects themselves, including their deformation and arrangement. The techniques of correlation matching and of feature-vector classification have been discarded in favour of the development of methods which incorporate explicit models of these factors influencing the image.

Early computer vision research may be broadly categorized as an attempt to match image feature information against the features predicted by models of objects and thereby develop representations of imaged scenes. Later research has centered on the use of image features in the construction of more complete context-free representations to be later matched against models of specific objects. During the transition, the idea of accessing models early in the visual process on the basis of image features has fallen somewhat into disfavour.

This section begins by examining these two approaches, with emphasis on underlying perspectives on the process of perception. A model of visual perception is then presented which has provision for both approaches. Finally, a presentation is made of some arguments against the currently more popular view that elaborate representations of scenes are constructed before knowledge of specific objects is involved.

2.1.1. The Image Feature Access Approach

One of the earliest computer vision systems was developed by Roberts (1965) to interpret photographic images of simple polyhedral objects. This research established three important approaches to computer vision.

(1) The problem domain of polyhedral objects known as the blocks-world became a focus of much research which followed. The domain provides possibilities for variations of view, configuration, and shape of objects which may be simply modeled geometrically.

(2) The system operated in two stages. The first step was to develop a line drawing from the digitized image by grouping intensity discontinuities. The second step matched geometric models of known objects against the line drawings. The notion of an intermediate line drawing stage in computer vision has been popular ever since.

(3) The matching phase exploited the fact that there are topological invariances in the projections of the polyhedral object models over simple transformations and changes of viewpoint. Thus object models could be suggested by the detection of image features. For example, a parallelogram suggests either a wedge or cube. This is the basis of the image feature access approach to computer vision: that simple image features invoke the examination of more complex object models which may then be verified or rejected. In various forms, this approach has received a great deal of attention in computer vision research.
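
The cue-then-verify idea in item (3) can be illustrated with a small sketch. This is not Roberts' matching procedure: the table contents, the function names `invoke_models` and `verify`, and the "at least two distinct cues" test are all hypothetical choices made for illustration. Only the parallelogram entry (wedge or cube) is taken from the text above.

```python
from collections import defaultdict

# Hypothetical cue table: each simple image feature names the object models
# it can suggest. Only the parallelogram entry comes from the text; the
# triangle entry is an assumption added so that verification has work to do.
CUE_TABLE = {
    "parallelogram": {"wedge", "cube"},
    "triangle": {"wedge"},
}


def invoke_models(detected_features):
    """Map each candidate model to the set of features that suggested it."""
    candidates = defaultdict(set)
    for feature in detected_features:
        for model in CUE_TABLE.get(feature, ()):
            candidates[model].add(feature)
    return dict(candidates)


def verify(candidates, required_cues=2):
    """Toy verification: keep models suggested by enough distinct features."""
    return sorted(
        model for model, cues in candidates.items() if len(cues) >= required_cues
    )


candidates = invoke_models(["parallelogram", "triangle"])
print(sorted(candidates))  # every model suggested by at least one cue
print(verify(candidates))  # models that survive the toy verification step
```

A real system would replace the counting test in `verify` with a geometric match of the model's projection against the line drawing; the sketch only shows the direction of control, from cheap image features to a short list of models that are then examined more closely.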
The research which followed Roberts' work focused on the development of structural descriptions of blocks-world scenes on the basis of features extracted from line drawing images (Guzman, 1968; Clowes, 1971; Huffman, 1971). These systems used lines and vertices as image features, and exploited the relation between these features and scene properties. Soon after, effective methods were developed for understanding blocks-world line drawings. This research explored the use of local consistency methods (Waltz, 1972) and gradient space (Mackworth, 1973) in computational vision.

Section 2.2 describes a number of line drawing interpretation systems which use the image feature access approach. It includes a discussion of some of the issues which have advanced the research to other problem domains, and introduced more elaborate techniques.

One important aspect of image feature access systems is that knowledge of objects is introduced early to guide the interpretation process. This poses a significant problem: if models are to guide interpretation, how can the system employ the correct models before the scene has been interpreted? This has been referred to as the "parsing paradox" (Palmer, 1975) and as the "chicken-and-egg problem" (Mackworth, 1978).

One solution to this paradox is to consider perception as a cyclic process rather than linearly staged.
Mackworth (1975; 1978) has proposed such a cycle of perception consisting of four steps: cue discovery, model invocation, model verification, and model elaboration. The idea is that the cycle may start either with or without hypotheses of models, and gradually, as the cycles are completed, develop more refined correspondences to existing models, and thereby accomplish recognition.

A similar model for perception has been presented by Neisser (1976). This model is centered on representations of anticipations about visual information called "schemata". These schemata are modified by accumulated inputs and in turn direct exploration of the visual field for further input relevant to the schema's objectives. The cycle may be initiated either by stimuli or by anticipations.

This idea that model knowledge becomes more and more specifically useful as interpretation progresses is inherent in most knowledge-based vision systems (Mackworth and Havens, 1981; Hinton, 1981; Brooks, 1981; Browse, 1982).

Another type of solution, proposed by Palmer (1975) and by Havens (1976), is to develop knowledge structures and techniques which enable simultaneous hypothesis-driven and data-driven searches.

2.1.2. The Volume Access Approach

A different approach to computer vision has emerged on the basis of the work of Horn (1975) and Marr (1976).
Horn introduced the use of a mathematical formula relating physical characteristics of a scene (such as surface orientation and light source position) to the array of light intensities which results. This work has inspired attempts to recover knowledge of such physical characteristics on the basis of an intensity array by making additional assumptions about properties of the objects (Woodham, 1978; 1981; Witkin, 1981; Stevens, 1981).

Marr argued for modularity in the construction of vision systems with distinct intervening representations. This has also had widespread acceptance within computational vision research (see Brady, 1982). One of the basic premises of the work of Marr is that a large and complex computation (such as vision) must be split up into small, nearly independent specialized sub-processes (Marr, 1976; Marr and Nishihara, 1976). The major justification for this view is that such an organization is necessary to evolve a complex system; that otherwise an evolutionary change to improve one aspect would degrade another. Modularity requires sub-processes which interact minimally with one another, and implies strong representational structures through which the modules may transfer information.

The most obvious mode of operation for such a linear-stage system is a strict bottom-up processing paradigm which defers the involvement of specific model knowledge until a complete three dimensional context-free representation has been developed (see also Nishihara, 1981).
This volume access approach is consistent with the recovery of physical characteristics of the scene through the use of the image formation equation (Horn, 1975; Barrow and Tenenbaum, 1978). Taken together, the result is an approach which relegates the use of specific model knowledge to a point after the development of elaborate context-free scene representations.

2.1.3. Discussion

Figure 2.1.1 is a schematic drawing of a simplified model for visual perception which reconciles some of the differences between "image feature-access" and "volume-access" approaches to computer vision. In this model, a series of processes transforms an image through intermediate representations. The early stages are image feature based, the later are volume and surface based. At each stage in this progression, some general knowledge and assumptions are necessary, depending on the type of transformation. These are depicted on the right hand side in figure 2.1.1. On the left hand side are shown the structures encoding knowledge of the specific objects and object categories. This knowledge has access to every level in the progression, and may influence any of the transformations.

[Diagram labels: image; image feature representations; surface and volume representations; identified models. Right hand side: principles of image continuity and grouping; assumptions about surface continuity and image formation knowledge. Left hand side: knowledge of specific objects and object classes.]

Figure 2.1.1.
The Multiple Access Model of the use of object knowledge in visual perception.

The operation of this model involves the principle of least effort: visual processes will form a correspondence to known specific objects as quickly as possible, on the basis of any representation which can provide support. This means that in the case of line drawings and impoverished image situations, two dimensional cues extracted at the image feature level will be used to invoke models which will fill in the details at the level of surfaces and volumes. Under optimal viewing conditions, for unfamiliar objects, the most expedient route might be through a context-free surface-and-volume representation.

During the course of perception, interactions will take place with the specific model knowledge. Features extracted early may cue these models which then make preparations for the development of representations at higher levels. Thus the model implements a cycle of perception similar to that of Mackworth (1975) and Neisser (1976), while retaining linearity and modularity of representation.

It is easy to see that this "multiple-access" model can reduce to the "volume-access" model (see Barrow and Tenenbaum, 1981) by removal of all connections between image feature representations and specific model knowledge. Figure 2.1.2 depicts this model.
The research relating to the "volume access" model is centered on the examination of the role of the knowledge relating to each level, and so it is a reasonable step to not consider these lower connections.

[Diagram labels: image; image feature representations; surface and volume representations; identified models; principles of image continuity and grouping; assumptions about surface continuity and image formation knowledge; knowledge of specific objects and object classes.]

Figure 2.1.2. The Volume Access Model derived from the multiple access model.

The "multiple access" model will convert to the "image feature access" model by delaying the development of volume-based representations until after objects are identified (see figure 2.1.3). This step is required in the examination of images which are impoverished so as to not contain enough information to enable development of volumetric representations without the use of object knowledge. In this case, the description of the scene in terms of surfaces and volumes is viewed as a part of the correspondence to the object models.

[Diagram labels: image; image feature representations; identified models; surface and volume representation; knowledge of specific objects and object classes; principles of image continuity and grouping.]

Figure 2.1.3. The Image Feature Access Model derived from the multiple access model.
Proponents of the "volume-access" model have cited examples of the perception of unfamiliar objects such as microphotographs of pollen (Barrow and Tenenbaum, 1978). The argument is that since few of us have specific models for such objects, and since we do seem to "understand" the images in terms of the surfaces and volumes, such representations play a vital role in human perception. Of course, this type of demonstration only shows that human vision is capable of developing a model-free three dimensional representation, not that it must. Further, it is possible to argue that such representations are formed with the aid of models of analogous objects. At any rate, the demonstrations do not preclude the involvement of specific object models, invoked from the level of image features, influencing the development of three dimensional structure during the course of normal perception.

Another argument for the "volume-access" model is centered on the research of Warrington and Taylor (1973; 1975). This research has shown that some patients who have suffered parietal lesions are able to understand the shapes of objects even though they cannot name them or explain their use. From this Marr (1982) has concluded that shapes may be determined, even in difficult cases, without the intervention of specific models.
There are two points of caution in formulating conclusions on the basis of this type of research: (1) no two lesions are the same, and it is difficult to generalize from the characteristics of the condition; (2) it is possible that the patients are impaired in their ability to report the object, even though the visual model structured around the object is still being used in visual perception. There are such cases, for example, in which the patient is unable to report seeing a "telephone", but uses terms such as "dial" in its description (Schmidt, note 1).

The structure of the human eye casts serious doubt on the possibility of inferring three dimensional scene properties on the basis of a single retinal image. Beyond the small foveal center, visual acuity rapidly diminishes toward the periphery[1]. This effect is a result of the organization of the retinal receptors, and as well a result of the scattering of light by the lens and cornea (Haber, 1978). Thus in only a small portion of the visual field is there available the type of high detail input necessary to discern surface orientation context-free.

[1] There is a variety of measures for visual acuity. Riggs (1965) describes a typical result: at 10 minutes of a degree off the fovea, acuity is reduced by 25%, and at one degree, by 60%. This reduction follows the pattern of decreasing density of cones in the retina.
This means that a number of fixations, consuming about a third of a second each, would be required before most objects subtending extensive visual angles could be identified. Yet, there is conclusive evidence that the progress of interpretation based on specific models of objects influences the selection of fixation locations (Mackworth and Morandi, 1967; Antes, 1974; Parker, 1978). Loftus and Mackworth (1978) have demonstrated that even the first saccade is highly dependent on the results of interpretive processing. In a related experiment, Friedman (1979) has shown that subjects do not fixate as long on objects which are more consistent with the entire scene, and as a result have less detailed recollection than for objects which are unexpected within the context. These results argue for the early use of knowledge, not only of individual objects, but also of entire scenarios.

Many other Cognitive Psychology studies support this view that preliminary interpretations, based on global and coarse image properties, are utilized in the extraction of fine details (Weisstein and Harris, 1974; Palmer, 1975; Biederman, 1981).

The interference effects discovered by Bruner and Potter (1964) are also interesting evidence of the human visual system's willingness to form an early hypothesis about the nature of the scene. Subjects were asked to identify the content of scenes depicted in slide presentations. The slides were shown initially out of focus, but gradually becoming more clear.
At a specific point this focussing process was stopped, and the subjects were asked to identify the scene. The length of time the subject was exposed to the defocussed image was varied, and it was found that the more exposure to the defocussed image, the less likely the subject was to identify the scene correctly. The accepted interpretation of this surprising result is based on the idea that while viewing the defocussed image, a number of tentative, conflicting hypotheses are developed about the scene. These hypotheses then interfere with the formation of an understanding of the more clearly focussed image. This indicates that models and scenarios are invoked to assist in the development of interpretations when only very impoverished image information is available.

It is difficult to explain the findings of Gilchrist (1977; 1980) in terms of a strict linear stage process. His results show that the perception of brightness is interrelated with perceived spatial arrangement and orientation of surfaces. This is not so much an argument against building surface representations on the basis of more primitive aspects of the image, such as intensity, but rather it is an argument in favour of the inclusion of a mechanism which enables different representational levels to influence one another.

It is not surprising that many cognitive psychology results are consistent with the idea of accessing specific models on the basis of two dimensional features.
Most experimental situations involve two dimensional presentations, often in line form, with specific response selections required of the subjects. However, the established psychological validity of this view of perception paves the way for research in the computational structures of vision which may benefit from the large collection of clues inherent in the experimental results.

It is an unfortunate fact that there is no unequivocal definition of the task of vision. Yet computer programs must have clearly defined inputs and outputs. There is little controversy over the nature of the input to vision, but the output specifications of each computer vision system constitute a commitment to the objectives of vision. The "image feature access" approach implies that vision is the formation of correspondences between images and known objects and situations, and as such the representations are concerned with what is necessary to compute. The "volume access" approach implies that vision is the development of objective context-free representations of the depicted scene, and as such is concerned with what is possible to compute from the image.

The result of differing task definitions is different simplifying assumptions.
The "volume access" approach avoids the "chicken and egg" problem with a simplified overall control structure in the retention of a modular and linear stage view of perception through elimination of early involvement of specific objects and scenarios. The "image feature access" approach often utilizes the simplification of a clean line drawing input in order to facilitate the formalization of the (usually cyclic) interactions within a limited realm of models of known objects.

Neither simplification results in a "general vision" system. The specific set of lighting and surface conditions that must be obtained for context-free volume and surface representations is no more likely to occur in a scene than some specific object. The "multiple access" model makes the nature of these assumptions clear within the context of a more realistic view of perception which includes both the linear stages in the development of representations and the cyclic interaction with specific models.

The arguments presented in this section have been biased towards the "image feature access" simplifications, partly because the research outlined in this thesis follows that tradition, and partly because the approach has recently been in disfavour.
The arguments should not be taken as attempts to demonstrate the correctness of one approach, but rather as an effort to further the search for a means of combining approaches towards a coherent model for perception.

One final analogy is irresistible. The study of natural language experienced a great influx of ideas with the introduction of phrase structure and transformational grammars (Chomsky, 1957; 1965), which produced widespread and diligent computational study within Artificial Intelligence. The conclusions were that structure based on context-free general categories offered a useful dimension in language analysis, but that the real key to understanding language use requires the study of semantic and pragmatic knowledge of concepts and scenarios.

For computational vision, the stringency of the constraining assumptions necessary to operate without specific knowledge, and the evidence based on cognitive psychology studies of vision point to the requirement for the use of detailed knowledge of objects and organizations of scenarios.

2.2. Line Drawings in Computer Vision Research

The particular class of objects around which this research is centered is line drawings of human-like body forms. This section is concerned with line drawings in general, and the computational vision research which has been aimed at their interpretation.

2.2.1.
Line as a Symbolic Level

Computer Vision and Natural Language Understanding are two areas of Artificial Intelligence which can be viewed as attempting to attain a computational understanding of some aspect of human intelligence by studying perception. Natural Language Understanding has one advantage in that there exists a clear symbolic level, the level of words[2], which may be assumed in order to study the involvement of human intelligence and experience in language use[3].

There is certainly no clear counterpart in computational vision research. This is perhaps because there does not exist an appropriate level, but on the other hand, the level of line drawings augurs well as a candidate. As objects are represented in line form, the aspects which are less important to be aware of in a scene, such as light source location, surface texture, and shadow are discarded just as a representation in words discards intonation and inflection.

[2] It may be argued that morphemes are a better choice.
[3] The understanding of language influences the perceived input to a lower level than that of words, but it is accepted that little context-free processing takes place past this level.

The line holds a place of particular esteem in human activities. Line drawings appeared in caves around 10,000 B.C., progressed into hieroglyphic communication, and finally formed the characters of writing. Many visual communication devices, such as maps, text book illustrations, flow-charts, and circuit diagrams are largely line-based.
This tendency toward the use of line may be related to the human vision system's well known sensitivity to intensity boundaries. The mental structures which encode and operate on visual information may themselves be tuned to line-like structures (see Marr, 1976).

Computational Vision research based on line drawings has several potential benefits:

(1) Inasmuch as line drawings are involved in human communication, it is of both practical and theoretical interest to study their interpretation (see Mackworth, 1977b).

(2) Even if line drawings are not adequate intermediate representations for the human vision system, it may be expected that studies which develop methods for the application of model knowledge in the interpretation of line images will probably provide insight into the methods required to process on the basis of some more refined, and perhaps more realistic intermediate representation.

2.2.2. Some History

Following on the work of Roberts (1965), Guzman's (1968) program used heuristic and symbolic methods in an attempt to interpret line drawings of blocks-world images. A classification of image vertices was devised, and regions bounded by arms of the vertices were studied for the possibility of their belonging to the same object. This information was used to group regions which composed individual blocks. Although many organizations of blocks were interpreted correctly, there were many that the system could not handle (see Winston, 1972).
Clowes (1971) and Huffman (1971) recognized that a variety of edge types (convex, concave, occluding) in a blocks-world scene are all depicted as lines in the image, and that the classification of vertices provided by Guzman reflected a variety of corners and abutments of blocks in the scene. This distinction between the image and scene domains was carried further in the realization that only certain line interpretations (as edges) were possible for the lines composing each vertex type. Thus each line was assigned a set of interpretation possibilities (or labels), and the vertices could be used to enable a search for the appropriate label for each line.

Waltz (1972) expanded on this theme by considering more labellings, including those for cracks and shadows. Vertices were viewed as nodes of a network, with each node having an associated set of label possibilities. Using the uniform constraining relation that straight lines must have consistent interpretation over their extent, a filtering operation removed impossible labels towards a much reduced, and often unique interpretation.

This approach will be referred to as the interpretation labelling approach. There are two fundamental ingredients:

(1) Some local image elements (such as lines) are assigned lists of labels, indicative of the roles that the elements might play in the structure of the scene.
(2) Relations among elements are identified which serve to constrain the label lists, and techniques are devised to propagate these constraints.

The constraint propagation technique described by Waltz (1972) has been generalized to network consistency algorithms by Mackworth (1977c), who also argues for their general usefulness in tasks such as computer vision. The idea behind network consistency is that a constraint satisfaction problem is specified as a network, whose nodes are variables with associated domains of possible discrete values. The relations required between variables are represented as directed arcs of the network. In order for a network to be arc consistent, all variable values must be locally possible: for example, for the relation Pij(x,y), for each "x" in the domain of values at node i, there must exist a "y" in the domain of values at node j such that Pij(x,y) is true. This does not guarantee the existence or uniqueness of a complete solution, but an arc consistent network may be searched for solutions with an expected reduction in thrashing behavior (see Mackworth, 1977c).

In developing an arc consistent network, the arcs are examined one at a time, and revised by deleting domain values which are not locally possible. After an initial pass through the arcs, only those arcs that lead into a revised node must be reconsidered in a relaxation process.

2.2.3.
Generalizing from the Blocks-World

There are some specific aspects of the blocks-world which make its interpretation labelling formulation particularly simple. To extend the use of these concepts to other problem domains requires significant alterations of the techniques. The following outlines four such aspects of the Clowes/Huffman/Waltz blocks-world solution, and serves as preparation for a subsequent examination of some schemata-based systems, whose objectives include addressing the more general problems of applying knowledge of objects in more complex domains.

2.2.3.1. Compressing Constraints to a Single Level

In the blocks-world solution outlined above, constraining relations from different types of information are represented in the same form: as vertices with legal interpretation possibilities. Consider, for example, figure 2.2.1a. This shows a cube, viewed in such a way that its upper protruding corner appears as a "T" vertex. One valid labelling for the vertex is therefore as two occluding edges (indicated as arrows) and a central convex edge (indicated by a plus sign). This is entered into the pool of legal configurations for a "T" vertex, and reflects a property of an individual block in isolation. Figure 2.2.1b shows a similar vertex formed by two adjacent blocks. Thus another legal configuration for the "T" vertex is established[4], but this time on the basis of the way blocks interact.

[Drawings: panels (a) and (b).]

Figure 2.2.1.
Two labelled vertices from the blocks-world.

[4] The "c" label indicates a crack.

In general, constraints based on different aspects of the structure of a problem domain must be expressed separately. Consider a problem domain such as that of geographic sketch maps (Mackworth, 1977b), as shown in figure 2.2.2. The formulation of the idea that two lines must meet in a specific way to become a mountain symbol can be accomplished at the line level. To specify the requirements for mountain symbols combining to make a mountain range one requires more complex objects and their attributes. Another level still is necessary to indicate how a river combines with a mountain range to form a river system.

Figure 2.2.2. (a) A mountain symbol (b) A sketch map.

As a consequence of the compression to a single level, it is possible to use a uniform vertex-finding method to locate all relations among lines. In the more general case the relations among more complex objects must follow the discovery of the primitive relations among lines.

2.2.3.2. Different Types of Features

It is particular to the blocks-world that the lines which act as basic features have no structure or attributes which suggest interpretation[5].

In natural images, there is a rich assortment of information available. Marr (1976) makes the point that it is important to represent a variety of feature types, and to specify their attributes.
This attitude is reflected in the nature of the "primal sketch", which encodes several different edge types with attribute values for such aspects as length, width, and orientation. This same view is inherent in psychological studies aimed at the identification of "feature dimensions" along which feature values may vary (Garner, 1974).

Lines and vertices of the blocks-world are the only image elements of concern. Even in terms of line drawing images, it will generally be the case that individual lines may be assigned attribute values. For example, curvature, orientation, and length may be important aspects of these features in some other domain. In the blocks-world curved edges do not exist, and neither orientation nor length has any constraining force on the roles that the line may play in its representation of the scene.

[5] There is an exception in Waltz' system which considers an attribute of shading edges (which side is darker).

In the more realistic situation of many different types of features, each with its own attributes, the issue of selection becomes important. What features are necessary to interpretation, and which attributes provide the strongest constraints? Certainly in human vision, attentional mechanisms operate towards resolving these problems.
2.2.3.3. Availability of Components

In the blocks-world, relations among lines are uniformly available: if two lines are found to connect, it is a simple matter to check for other connecting lines and thereby complete the relation. In the general case, not all components of a relation may be available, either because the missing component is itself a more complex object, or because the relation cannot specify the means of obtaining it from the image. Even incomplete knowledge of a relation, however, might be enough to serve as a constraint upon the interpretation of features entering into the relation.

2.2.3.4. Structure Within Labels

In the blocks-world, edge labels are assigned to lines as interpretation possibilities. The labels exhibit a structural organization, though most blocks-world interpretation systems do not exploit it (see Mackworth 1977a). Generalizations over groups of possible labels can be either filtered or retained as a group through the consideration of a single relation, rather than considering the relation over each element. This inherent organization of possible labellings is more apparent in problem domains such as sketch maps (see Mackworth and Havens, 1981).

2.2.4. Beyond the Blocks-World

The previous sub-section has reviewed four issues in the use of model knowledge in the interpretation of line drawings, which are not inherent in the blocks-world problem domain.
These issues are not unique to line drawing domains, but the assumption that clean line drawing input is available makes them addressable in the context of other problem domains.

Computer vision systems have been implemented to examine the more subtle aspects of applying specific model knowledge to visual processing. Together they are often termed "schemata-based" systems because they embody some ideas behind the variety of psychological models of cognition which go by the same name (Bartlett, 1932; Piaget, 1967; Neisser, 1976). There are three main ingredients of a schemata-based vision system:

(1) object-centered knowledge.
(2) use of the natural structure of the domain.
(3) recursive cuing mechanism.

The following brief review of such research will be aimed at the explanation of these concepts.

Mackworth (1977b) extended the basic idea of interpretation labelling to a system to interpret geographic sketch maps. One important innovation was that the features were not uniform: both line chains and regions acted as features. The interpretation possibilities assigned to these features were common objects of the problem domain. For example, a line chain could have any of the interpretations {road, river, mountain, bridge, shore}. The line chain is then said to act as a cue for any of these interpretations. The system accomplished interpretation through a two-step segmentation and network consistency cycle.
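The network consistency step mentioned above can be illustrated with a small label-filtering sketch. It is not Mackworth's implementation: the cue names `chain1` and `chain2`, the function `filter_labels`, and the compatibility table are hypothetical, chosen only to show how a label with no compatible partner on a related cue is discarded.

```python
# Hypothetical sketch of label filtering by network (arc) consistency.
# The cues, the relation between them, and the compatibility table are
# illustrative assumptions, not taken from the sketch-map system itself.

def filter_labels(domains, constraints):
    """Remove labels that have no compatible label on a related cue.

    domains:     dict mapping cue name -> set of candidate labels
    constraints: dict mapping (cue_a, cue_b) -> set of allowed
                 (label_a, label_b) pairs; each pair of cues is assumed
                 to appear only once
    """
    # Make every constraint usable in both directions.
    arcs = {}
    for (a, b), allowed in constraints.items():
        arcs[(a, b)] = set(allowed)
        arcs[(b, a)] = {(y, x) for (x, y) in allowed}

    domains = {cue: set(labels) for cue, labels in domains.items()}
    changed = True
    while changed:  # repeat until no label is removed in a full pass
        changed = False
        for (a, b), allowed in arcs.items():
            supported = {x for x in domains[a]
                         if any((x, y) in allowed for y in domains[b])}
            if supported != domains[a]:
                domains[a] = supported
                changed = True
    return domains


line_labels = {"road", "river", "mountain", "bridge", "shore"}
domains = {
    "chain1": set(line_labels),
    "chain2": {"river"},  # assume chain2 was already narrowed to a river
}
# Assumed rule: a chain crossing a river is either a bridge or a river.
constraints = {
    ("chain1", "chain2"): {("bridge", "river"), ("river", "river")},
}
result = filter_labels(domains, constraints)
print(sorted(result["chain1"]))
```

Under these assumed constraints the five candidate interpretations of `chain1` are reduced to two, while `chain2` keeps its single label.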
The movement towards using common object types as the basis for encoding knowledge about the problem domain was carried even further in the recognition model devised by Havens (1978). He devised a programming language, "Maya", in order to represent the knowledge necessary to accomplish model-based vision. The resulting procedural schemata held together everything known about individual objects in a way similar to the "frames" proposed by Minsky (1975).

The structural framework for encoding object knowledge is the natural structure of the objects themselves: the component and specialization hierarchies. In Havens' model, the component hierarchy defines a recursive cuing mechanism. This means that just as a basic image element may cue an intermediate structure, the confirmation of that intermediate structure acts as a cue for some more complex structure. In the sketch-map domain, this means that the "line-chain" has as its possible labels {road, river, ..}, and that "river" has as its possible label "river-system" which in turn cues "geosystem".

This system, MAPSEE2 (Mackworth and Havens, 1981), also provides a means of grouping labels according to the specialization hierarchy of the problem domain. For example, the relations between regions on either side of a "shoreline" may be evaluated with respect to the labels "landmass" and "waterbody". Only later on is it necessary to specialize these regions to "island" or "mainland" and "lake" or "ocean".
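The recursive cuing described for the component hierarchy can be pictured as repeated lookup in a cue table. The following is a minimal sketch under stated assumptions: the table format and the name `cue_chains` are hypothetical, only the labels named in the text are included (the elided labels are left out), and the confirmation step that a real system would perform before passing a cue upward is omitted.

```python
# Hypothetical cue table for the sketch-map domain. The entries follow the
# labels named in the text; the dictionary format is an assumption made
# purely for illustration.
CUES = {
    "line-chain": ["road", "river"],
    "river": ["river-system"],
    "river-system": ["geosystem"],
}


def cue_chains(element, cues=CUES):
    """Return every path of schemata reachable from element by cuing."""
    successors = cues.get(element, [])
    if not successors:
        return [[element]]  # nothing further is cued: the chain ends here
    chains = []
    for nxt in successors:
        for tail in cue_chains(nxt, cues):
            chains.append([element] + tail)
    return chains


chains = cue_chains("line-chain")
for chain in chains:
    print(" -> ".join(chain))
```

Each printed chain corresponds to one way in which a basic image element could lead, level by level, to a more complex structure.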
The use of the component hierarchy in computer vision is quite straightforward. It has been used in numerous models of perception, providing a clear indication of its benefit. The specialization hierarchy poses more difficult problems. This hierarchy may be structured on the basis of distinctions such as functional similarity, visual similarity, or criterial property. It is not clear which criteria are suitable for encoding visual knowledge. Further problems are found in trying to establish the role of specific entities, which may be viewed as the leaf nodes of the specialization hierarchy (see Mulder, note 2).

All schemata-based systems described thus far are procedural in nature. The procedures encode both the requirements for objects and the actions to be taken to obtain instances of them. This procedural approach is productive in experiments aimed at discovery of the basic principles of how knowledge should be structured for vision because it is easy to modify and test small segments when they are represented as procedures.

One step in the development of schemata-based systems is to move towards a more declarative knowledge base; that is, to separate the knowledge of the objects from the knowledge of the processes that effect interpretation. Such a development would have a number of advantages, which are described in section 3.2.

2.3. Levels of Resolution

In the development of an image, a planar projection of reflected light from objects and surfaces is always represented in discrete terms. A number of individual picture elements cover the area of the image. These elements could be the light-sensitive silver halide crystals used in photographic material, the array of responses of the retinal cells of the eye, or the coordinates of imposed grids in digitization processes. In each case there is always a resolution associated with an image: the number of picture elements per unit area.

There is a variety of evidence in favour of approaching vision as a process which operates over several different, but related levels of resolution. Neurophysiology, Psychology, and Computer Science all contribute towards this approach. Naturally there is some disagreement, particularly in terms of the level of processing at which information from different resolution levels interacts. For some, multiple resolution is a tool in the discovery of context-free image features such as edges. Others believe that the structure and organization of object knowledge is related to the availability of several levels of detail. This section reviews and contrasts some of these ideas.

2.3.1. Resolution Levels in Natural Vision

Measurements of cell responses in the early portion of the primate visual system have demonstrated selective sensitivity to a variety of retinal field sizes.
The orientation-independent center-surround fields encountered at the ganglion and geniculate cells, and the more specifically sensitive simple and complex cells located a few synapses away in the primary visual cortex, are both examples of receptors which exhibit a variety of field sizes (Hubel and Wiesel, 1979).

As retinal eccentricity increases, average field size systematically increases. This effect is attributable to the varying density of retinal and ganglion cells and the variation in convergence of signals between them. This relates to, but does not completely explain, the change in visual acuity with eccentricity (Westheimer, 1982). At a single point on the retina, there is an overlap of receptive fields of different sizes.

The spatial extent of the retinal center-surround fields determines the types of edges which may be detected. For example, a wide receptive field will not respond to closely spaced lines, and small receptive fields will not respond to gradually changing intensities. The different field sizes may be viewed as encoding intensity discontinuity information based on different resolution levels because of the associated variation in the number of retinal receptors.

Psychophysical experimentation has developed an analogy between spatial frequency analysis and the variation in receptive field size.
A large receptive field size corresponds to a low spatial frequency channel in the sense that, in either case, sensitivity is greatest for gradual intensity changes.

Experiments have been performed which rely on this analogy. Subjects who observe sinusoidal gratings for a few minutes exhibit an elevated contrast threshold to subsequent test gratings of similar spatial frequency (and otherwise identical), but show no such effect for test gratings of dissimilar spatial frequency (Pantle and Sekuler, 1968; Blakemore and Campbell, 1969). This type of result is explained in terms of the selective desensitization of frequency-specific channels at each retinal location in the human visual system. Wilson and Bergen (1979) have proposed four channels, each with a center-surround profile described by a difference of two Gaussian distributions. Others suggest as many as seven channels (see Watson, 1982).

The spatial frequency analogy has also been useful in identifying two types of cell responses: sustained and transient. Generally, low spatial frequency channels are transient and have been proposed as specialized for detection of temporal and global aspects of a scene, whereas the sustained high frequency channels are believed to be involved in form and pattern perception (Breitmeyer and Ganz, 1976).

Another important result has been obtained through research in summation at threshold for spatial frequency channels.
Stimuli composed of sinusoidal gratings of several different frequencies are only slightly more detectable than the most detectable of the composing gratings. This result is independent of the relative phase of the gratings (see Graham, 1981). The small enhancement of detectability is attributed to a probability summation model of detection: each channel has an independent probability of detecting the pattern, and hence the potential detection by several channels increases the overall probability of detection. Given the small size of the enhancement, this model is preferred over one which enables the combination of information from different resolutions at an early stage in the vision system.

There is also a line of Cognitive Psychology research which is concerned with different levels of resolution. The issue centers around the order of processing at the different levels. The traditional constructivist view of perception proposes the development of holistic properties on the basis of the results from fine resolution processing (Neisser, 1967). The opposing view is that high-order forms are processed initially, followed by the finer details (see Kahneman, 1973). Kinchla (1974) established what was to become one of the main paradigms in the investigation of this issue: subjects are shown a display consisting of a large letter, which is made up of many instances of a smaller letter (see figure 2.3.1).
By varying the task between reporting the identity of the small or large letters, and by varying the compatibility between the letters at the two levels, researchers were able to address the questions of local-global interaction and ordering.

Figure 2.3.1. Typical local-global stimuli: (a) incompatible (b) compatible.

Navon (1977) showed that in attending to the large letters, the small can be effectively ignored, but that the presence of the large letter always influences the reaction time to identify the small, and thus established the concept of global precedence in perception. Others have demonstrated that such factors as absolute size, relative density, and quality of the letters will influence the results (Kinchla and Wolfe, 1979; Martin, 1979; Hoffman, 1980).

Miller (1981) altered the task somewhat to require subjects to detect specific letters, whether they appear at the local or global level. Strong facilitation was found in the compatible condition. This result requires a model of perception in which information from both levels of resolution feeds into a single decision process which integrates the results. Miller suggests that the integration may be based on attentional shifts between levels, with initial emphasis on the global level because of the guidance it is thought to afford in normal perception.
This idea is consistent with other areas of psychological research which will be discussed in section 6.1.

2.3.2. Resolution Pyramids or Cones

Computational vision research has also been concerned with information at different resolutions. Kelly (1971) introduced the idea in a system to detect the outline of a human head in an image containing background contours. A second image was developed consisting of one pixel for every 8x8 area of the original digitized image. Thus this extra image was much smaller, and did not have as much detail. Edge segments in the small image were compared to the coarse requirements of an image of a head, and then this information was used to guide search among edges in the original image to construct the detailed outline of the head.

This idea was extended to the notion of a recognition cone, or image pyramid (Uhr, 1972; Hanson and Riseman, 1975; Tanimoto and Pavlidis, 1975), which represents an image as several interrelated layers constructed at different resolutions. The base level is the regular digitized image, and the upper layers are successively smaller images, with pixel values derived by some averaging operation on four (or more) pixels at the level beneath. A number of processing schemes have been devised to use these structures to aid in the detection of image features.
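The layered structure just described can be made concrete with a short sketch. It assumes the simplest averaging operation, the mean of non-overlapping 2x2 blocks; the cited systems may use other reductions, and the function names `reduce_level` and `build_pyramid` are hypothetical.

```python
# A minimal image-pyramid sketch, assuming plain 2x2 block averaging.
# Images are lists of rows of numbers; no external library is needed.

def reduce_level(image):
    """Halve each dimension by averaging non-overlapping 2x2 blocks.

    A trailing odd row or column is dropped.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def build_pyramid(image):
    """Return [base, level 1, ...], stopping when a side is shorter than 2."""
    levels = [image]
    while len(levels[-1]) >= 2 and len(levels[-1][0]) >= 2:
        levels.append(reduce_level(levels[-1]))
    return levels


base = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 2, 2],
    [4, 4, 2, 2],
]
pyramid = build_pyramid(base)
for level in pyramid:
    print(level)
```

Each level above the base holds one quarter as many pixels as the level beneath it, so a search at a coarse level examines far fewer elements than a search of the original image.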
The basic idea behind the use of these pyramids is that indications of the existence of a feature may be found in a simple search of a smaller, coarser resolution version of the picture, which can then be used to direct the extraction of the feature from the finer levels (see Tanimoto, 1980). This idea has been generalized to systems which permit specification of parallel algorithms which operate with transferal of information in both directions in the image hierarchy, as well as laterally within a level (Hanson and Riseman, 1978; 1980). Levine (1980) describes a computer vision system which integrates information from separate pyramids used to encode a variety of image features.

2.3.3. Resolution Levels in Edge Detection

A similar idea is found in the work of Marr (1982; Marr and Hildreth, 1980). An image, smoothed with a variety of Gaussian filters, is convolved with the Laplacian operator. The zero-crossings of these convolutions are representative of the intensity changes in the image within different spatial frequency channels, dependent on the value of the space constant of the Gaussian distribution. The response characteristics of these operators are similar to the difference of Gaussian operator proposed by Wilson and Bergen (1979). Oriented zero-crossing segments are detected, and represent candidate edges.
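The zero-crossing computation can be sketched in one dimension. This is a simplified analogue rather than the two-dimensional operator of Marr and Hildreth: a discrete second difference stands in for the Laplacian, the signal is a hypothetical step edge, and the space constants 1, 2 and 3 are arbitrary choices for illustration.

```python
import math

# One-dimensional analogue of Gaussian smoothing followed by a Laplacian.
# All names and parameter values here are illustrative assumptions.

def gaussian_kernel(sigma):
    """Normalized Gaussian weights truncated at three space constants."""
    radius = int(math.ceil(3 * sigma))
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Convolve signal with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def second_difference(signal):
    """Discrete second derivative, same length as the input (ends are 0)."""
    n = len(signal)
    out = [0.0] * n
    for i in range(1, n - 1):
        out[i] = signal[i - 1] - 2.0 * signal[i] + signal[i + 1]
    return out


def zero_crossings(values, eps=1e-9):
    """Indices i where values[i] and values[i + 1] have opposite signs."""
    return [i for i in range(len(values) - 1)
            if (values[i] > eps and values[i + 1] < -eps)
            or (values[i] < -eps and values[i + 1] > eps)]


# A step edge between samples 19 and 20.
signal = [0.0] * 20 + [10.0] * 20
sigmas = (1.0, 2.0, 3.0)  # one space constant per channel (assumed values)
crossings = {s: zero_crossings(second_difference(smooth(signal, s)))
             for s in sigmas}
print(crossings)
```

For this isolated step, every channel reports a crossing at the same location, which is the kind of agreement across scales that suggests a single underlying physical edge.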
The results in different channels are then combined to produce a single representation of the image as the raw primal sketch, consisting of symbolic descriptions of segments, providing location and a number of other properties (see Marr, 1976). The process of combining the results from the different channels relies on the idea that zero-crossings at the same location at different scales are probably a result of the same underlying physical phenomenon. So whenever the segments obtained at two or more (contiguous) channels agree in both position and orientation, an edge is hypothesized. Subsequent operations group these edge tokens according to several similarity measures in order to obtain tokens for larger scale areas of continuity and boundary.

We must question the use of a single location-based representation for tokens consisting of a variety of properties. In particular, we must question the early combination of information from several channels. It is quite a reasonable alternative to retain separate representations for each receptive field size, interconnected through convergence of location as in the case of image pyramids, the difference being that instead of containing averaged image intensities, the pyramid would encode the more elaborate structure of zero-crossing segments.
There is a variety of reasons why this approach is more useful and more realistic:

(1) The area of vision investigated by Wilson and Bergen (1979) included only 4 degrees of eccentricity, which constitutes less than one percent of the visual field. Even within this area, the spatial extent of each visual channel doubles toward the periphery. In such a system of varying receptive field size, the outcome of combining results would be different at each eccentricity for the same stimuli. This would confound the task of detecting variation associated with changes in surface orientation.

(2) An important task during changes in fixation location is to form a correspondence between what is already known and the newly available information. With each change in location, there is a switching of the foveal and peripheral resolutions. If low resolution channel results for the fovea are maintained and elaborated separately, and not collapsed to a token at the finest possible level, then structures will be available to facilitate the establishment of correspondence.

(3) Basic to the idea of the primal sketch are representational tokens which tie together a number of properties (such as orientation and size) in a single retinotopic array.
Recent experiments indicate that such combinations of properties are not available in parallel over the visual field, but must rather be constructed through the sequential application of focal attention (Treisman and Gelade, 1980)[6]. Also, the perceptual grouping necessary to form boundaries is more difficult when based on conjunctions of properties rather than single properties (Treisman, 1982). These results favour the maintenance of a number of retinotopic arrays which can be used as the subject of grouping operations and may be accessed as necessary to consider the coincidence of features at particular locations[7]. It does not seem reasonable to take the step of consolidating several aspects of the available information into a single array and then apply grouping operations which must sort through the tokens in search of similarity.

(4) The threshold summation results previously described argue against the early combination of outputs from several channels, at least in terms of enhancing detection.

(5) The research using image pyramids has established that the computational advantage to using a variety of resolutions lies in the idea that coarse elements need not be precisely located, and so can be maintained in smaller arrays. The representation of zero-crossing segments could gain this processing advantage through separation in terms of receptive field size.

[6] A more complete discussion of this experimentation is found in section 6.1.

[7] Zeki (1978) has demonstrated that for the Rhesus monkey, projections from the primary visual cortex to prestriate areas are divided into retinotopic areas of separate features.

2.3.4. Knowledge Interaction with Multiple Resolution Levels

Most of the computer vision research described thus far has a common goal in the use of multiple resolution levels: to more accurately and efficiently extract features from an image. There have also been a number of studies which attempt an interpretation-based interaction between levels of resolution, using knowledge of the class of objects which comprise the problem domain.

The original work by Kelly (1971) falls into this category. The coarse level features are analyzed in the context of what was expected for the outline of the head, thereby ignoring the other prominent edges produced by the background. Catanzariti and Mackworth (1978) applied a similar idea to the task of classifying regions of ground cover type from satellite images. A pyramid structure is developed from the image, and information from maximum-likelihood classifiers is passed across the levels of the pyramid.

Rosenthal and Bajcsy (1978; Bajcsy and Rosenthal, 1980) have extended the interaction between world knowledge and image hierarchy in an inquiry-driven computer vision system. The natural hierarchical relations of the problem domain are explicitly encoded.
These include the part-of relation, the class inclusion relation, and a size ordering relation. In investigating a query for a specific object, the system first devises a series of contexts from the objects found towards the root node in the part-of hierarchy. A search for these context objects is made at resolutions determined by the size relations. Since the part-of relation implies that the part lies within the spatial extent of the whole, each successful context search reduces the candidate search area at the finer resolution levels.

A model for perception has been presented by Palmer (1975; 1977) which is very similar in its theoretical position. The model proposes a structural hierarchy based on the whole-part relation, forming a network. Each node expresses its component structure as part-of links upon which further relational requirements may be imposed. Each level expresses holistic properties in terms of features at different resolutions, with the resolution becoming lower for concepts towards the root node.

2.4. Location Selection in Vision

As described in the previous section, the human perceptual system has the characteristic that receptive field size increases towards the periphery, resulting in a graded acuity. With a fixed number of receptors, this configuration provides both a wide field of view, and the capability for high resolution extraction of detail.
The saccadic eye movements which accompany visual perception are the actions which enable selective high acuity vision throughout the field of view.

At one point in the evolution of human vision, it is possible that the sole purpose of saccadic eye movements was to produce this enhancement of acuity. In fact, there remains a reflex action to fixate upon moving objects detected in the extreme periphery, even though we cannot be aware of their movement (Gregory, 1966). However, it seems a reasonable hypothesis that the structure of human intelligence has developed to be attuned to the sequences of high resolution input obtained through eye movements. It is also reasonable that the structures which aid in the understanding of scenes and objects contain the knowledge necessary to guide the process of selection to areas which will provide useful information.

There are a number of identifiable aspects to the selection processes that take place during human vision. Saccadic eye movements are among the most obvious and accessible to analysis. Even when the eyes are not moving[8], other selection operations are in effect. There is the spatial allocation of an attentional mechanism which enables or enhances the extraction of visual information (see Posner, 1978; Treisman and Gelade, 1980). This spatial attention may be moved much more rapidly than the eyes, and has the property of being variable in its extent (Eriksen and Hoffman, 1972).
Other, perhaps related, attentional mechanisms provide selective activation of memory structures which attune visual processes to the reception of particular image properties (Laberge, 1976; Shiffrin and Schneider, 1977), and still other mechanisms are thought to be involved in the selective preparation of responses (Kahneman, 1973).

The purpose of this section is to single out saccadic eye movements as representative of the selectional actions of perception. The basic characteristics of saccades will be reviewed with the objective of emphasizing the non-arbitrary nature of the selection of fixation locations. The steps involved in selecting and moving to fixation locations will be outlined, with the objective of exposing the computational requirements. Finally, some theories and computer simulations of saccadic eye movements are discussed.

[8] When fixated, the eyes undergo a number of small shifts, drifts, and tremors.

2.4.1. Saccadic Eye Movements

During normal viewing of a picture, humans move their foveal vision over angles up to 15 degrees about 3 times per second (see Yarbus, 1967; Gould, 1976). Of this viewing time, about 90% is spent in fixation (Yarbus, 1967). The actual eye movement is caused by the application of the full force of the eye muscle, where the duration of the application determines the distance covered (Alpern, 1972). The saccade is "ballistic", in that it cannot be corrected once initiated[9] (Westheimer, 1954).
A minimal amount of information is picked up during the saccade itself (Latour, 1962). These two facts indicate that during a fixation, the visual system must be both extracting visual information, and preparing for the next movement.

[9] This is not the case with other forms of eye movements such as convergences.

The following is a scenario of the steps which might be required, starting from the point of the eyes coming to rest at a location:
(1) The first problem is to determine if the saccade was effective in placing the fovea at the desired location. Such errors are likely detected within the ocular muscular system, and may result in a small, corrective saccade (Yarbus, 1967:134).
(2) If one accepts the notion that some internal model of the visual field is being maintained, then an updating of that representation must be accomplished to establish the continuity of perception.
(3) The new periphery must be analyzed, and results compared with previous interpretation results. There is evidence that the periphery is processed from the outside in (Lowe, 1975), and there are suggestions that it is done before the fovea is analyzed (Parker, 1978).
(4) Foveal feature information is extracted and used in the enhancement of the ongoing scene interpretation.
(5) The next location must be selected for fixation, and the exact muscle "program" must be developed.
It has been shown that the more precise the saccade must be, the longer the latency to the eye movement, and so presumably the longer it takes to compute the parameters of the movement (Leushina, 1965).

Research into the nature and determinants of eye movements raises two interesting issues from the point of view of the development of a computational understanding of vision:
(1) What affects the location and duration of fixation?
(2) How is integration across fixations accomplished?

The two main sources of information about these questions are research in reading and picture viewing. The two tasks are quite different and results may not always be generalized from one to the other (Rayner, 1978:641).

The great volume of research literature pertaining to eye movements prohibits the inclusion of a comprehensive review in this document. The interested reader could pursue the excellent review provided by Rayner (1978), or the series of three books by Senders, Fischer, and Monty (1978; Monty and Senders, 1976; Fischer, Monty and Senders, 1981).

2.4.2. Non-Arbitrary Fixation Location and Duration

The subjective experience of saccadic eye movements is somewhat deceptive. We may believe that we are tracing a smooth path along a line while in fact our eyes execute a series of irregular shifts. We may not be aware of the exact locations upon which we fixate, only the objects which we detect. We may be aware of the gross influences on our fixation, such as sudden movement, but we are generally unacquainted with the subtle factors.
One common misconception of the role of eye movements in reading is that the ocular-motor system executes rhythmic or random movements across the line of text. This notion has been dispelled by research such as that of Just and Carpenter (1978), who showed that the semantic connections between sentences are a good predictor of the amount of time the agents of the sentences are fixated. Certain types of grammatical structure produce more frequent fixations (Klein and Kurkowski, 1974).

Buswell (1935) and Yarbus (1967) noted that when viewing pictures, subjects are more likely to fixate certain areas of the image. Mackworth and Morandi (1967) devised empirical methods to determine that the areas of pictures which subjects consider to be more informative are more likely to be fixated, both early and late in the viewing. Loftus and Mackworth (1978) have shown that objects which are unexpected in a scene are more likely to be fixated early, demonstrating the influence of cognition and expectation on eye movements, and showing the usefulness of the interpretation which takes place in the periphery.

Gould and Schaffer (1965) report the influences of task specifications on the duration and selection of eye movements during visual search (see also Gould, 1976). Loftus (1972) describes the relation between memory and fixation choices. Objects which were remembered in a scene were fixated by the third fixation 95% of the time.
We must accept the intricate influences of image properties, visual task, expectations, and progress of understanding in the determination of processing locations.

2.4.3. Piecing Together Fixations

The studies of fixation determinants are quite descriptive, and do not generally suggest mechanisms. Research into the possible ways that information is pieced together from several fixations often includes processing models.

Parks (1965) moved a slit in a piece of cardboard over an image of a pattern and noticed that even though only a single element of the pattern could be seen at a time, an understanding of the whole pattern emerged. This led him to conclude that individual glimpses can be assembled into a complete perception. Hochberg (1968) extended this experiment to include line drawings and introduced the idea of a "schematic map" which is used to synthesize successive glimpses, along with eye movement information. Hochberg (1978; Hochberg and Brooks, 1978) emphasizes the importance of underlying expectations in the development of a coherent structure of results from many fixations. Arguments in favour of this approach rather than translation of visual field on the basis of feedback from the eye movement system are made on the basis of the ease of understanding film clips which shift perspective and scale without predictability.

Reading studies indicate that effects of integration across saccades can be simulated without actual eye movements.
Using an "on-line" eye movement monitoring and display generation mechanism, researchers are able to take advantage of the "ballistic" property of saccades: because fixation locations can be determined before the eye comes to rest, the displays may be altered during eye movements. Rayner (1975) showed that naming words on which fixation falls is easier when the actual word appeared in the previous parafovea[10] (rather than a similar string of letters). Rayner, McConkie and Ehrlich (1978) demonstrated that the same effect can be obtained when the subject maintains a fixation and the displays are modified exactly as if a change in fixation were taking place. McConkie and Rayner have suggested that parafoveal and peripheral material are stored as an "integrative visual buffer", which is used as the basis for the incorporation of information from subsequent fixations.

2.4.4. Models for Saccadic Control

Noton and Stark (1971a; 1971b; 1971c) proposed a representation for knowledge about objects which consisted of rings of alternating features, and motor traces to permit moving the eyes to the next feature. Eye movements were considered to be the following of "scan paths", as provided by the representation. These repetitive sequences of saccades were shown to develop early in the viewing of an image, and to recur in subsequent perceptual tasks with the same image.
As a theory of eye movement control this notion of "scan-paths" has two weaknesses:
(1) No central role is provided for the use of peripheral vision.
(2) The representation is strictly view-oriented. It does not consider that if the locations to which fixation is to be drawn rotate with an object, then the motor traces necessary to effect the saccades will have to change.

[10] The area just outside of the foveal center is often referred to as the parafovea.

Farley (1976) presents the description of a computer implementation of an eye movement system. The general form of the model is derived from Noton and Stark's ideas, but it includes a hierarchical organization of the objects being viewed. The strategies for effecting eye movements consist essentially of breadth-first and depth-first search of the space defined by the object models, and the lines given in the model are followed to look for expected vertices.

The basic directive for changing processing location in Farley's system is suggested by the expected directions of the corners of the objects. This is similar to the concept employed by Shirai (1975) in his knowledge-based line finding program. Shirai's knowledge of scene domain corners was encoded as corresponding image domain vertex information.

Didday and Arbib (1975) also report an eye movement computer implementation which is based upon the Noton and Stark model. They conclude that eye movements are based on properties of the image (features) and not on motor traces.
This suggestion requires a more complete representation of the scene models than is provided by Noton and Stark. In addition, peripheral vision would be required to form hypotheses about the location of features which need more careful examination. This is the basis of a model proposed by Walker-Smith, Gale, and Findlay (1977), using studies of eye movement paths over images of faces as supportive evidence.

Parker (1978) also argues for the importance of peripheral vision in the control of eye movement behaviour. Parker's model is based on Neisser's (1976) perception cycle: expectations about the type of information that will be provided for an object are encoded as sequences of features to be fixated. The "exploration" phase of the cycle involves the detection of these sequences.

The conclusion drawn on the basis of psychological experimentation that several diverse influences act towards the determination of fixation location is consistent with the tendency for computer implementations to emphasize one isolated factor. Roy and Sutro (1982) describe a system which selects a sequence of fixation locations in an image on the basis of the expected amount of edge. A rough measurement is made at each location, and the processor follows an ordered list of the expected amount of edge at each location[11]. Funt (1976) developed a system to analyze the stability and structure of a group of imaged objects.
The operations included the movement of a graded resolution retina across the image in response to the requirements for problem relevant information such as the locations of points of contact between objects.

[11] The paper also includes suggestions as to how the Noton and Stark model might be extended into three dimensions, following on some of the work of Marr and Nishihara (1976) and Oshima and Shirai (1981).

Pylyshyn, Elcock, Marmor and Sander (1978a; 1978b) implemented a perceptual-motor system which includes, as one component, the application of an area of high resolution availability across drawings of geometric figures. Of particular interest in this operation is the idea that just because objects or features fall within the fovea does not mean that they are automatically fully processed. An attentional mechanism must be applied, and features collected to enable a matching with nodes of a memory network. This idea is very similar to that expressed by Kahneman and Treisman (1982), in their "object file" model for visual attention.

From a computational point of view, the basic requirements of a system which can intelligently select processing locations are:
(1) The ability to exploit the result of more extensive, lower resolution peripheral analysis.
(2) The capability to direct processing to areas on the basis of expectations or conflict within an ongoing interpretation process.
(3) The recognition of areas of image detail which are intrinsically more likely to provide important information.

Hochberg and Brooks (1978:312) have proposed that a dual system could best accomplish the direction of eye movements:

"... a fast component which brings the eye to those peripherally visible regions that promise to be informative or to act as landmarks, and a more sustained component that directs the eye to obtain more detailed information about the main features that have already been located."

3. Research Overview

The research presented in this thesis has two major objectives.
(1) To develop formal representation methods for the definition of simplified problem domains, and to devise generalized operations which can utilize these representations to effect interpretation of images representative of the problem domain.
(2) To implement these strategies within the framework of a consistent and realistic model of visual perception.

This chapter provides an outline of the methods developed without reference to the computer implementation or the specific problem domain used. As a result, the outline is sketchy and incomplete. It should only be viewed as providing an overall structure for the detailed accounts with reference to the implementation found in chapter four.

3.1. A Model for Perception

Within this model of perception, component hierarchy information is made explicit in a knowledge structure, with non-decomposed elements represented in terms of image features available at the finest level of resolution. These image constructions are prototypical views of objects which are flexible enough to cover a wide range of actual viewing angles. Other representations of object knowledge might coexist with this view-based structure, but the capabilities of this structure are considerable, particularly in the understanding of convention-based line drawings (see Section 4.2).

The representation of objects directly in terms of image features provides the possibility for an interpretation labelling approach in which features are assigned lists of object models that use the features in their descriptions. This cuing structure is extended to the more complex objects and thereby develops a recursive cuing mechanism, which encodes potential relations among objects and their depiction in images.

The result is structures which provide the capability for both top-down or bottom-up analysis (or both), without making a commitment to any particular strategy. The resultant descriptive structures might equally well be used in the generation of drawings of the objects.

This structure accounts for features at a fine resolution level.
In a line drawing domain, these features would be the lines themselves, and their properties of length and curvature. In addition, other, less detailed structural descriptions of objects are maintained based on the types of features available at a coarse resolution level (such as blobs). In some cases there are relations between concepts of the fine resolution structure and of the coarse resolution structure. These relations coincide with a specialization hierarchy and thereby form a natural part of the concepts of objects.

This adds a new dimension to the cuing structure. Not only does the component hierarchy give the structure for a recursive cuing mechanism, but the specialization hierarchy provides a direct relation to the hierarchical relations within the image, as shown in figure 3.1.1.

Figure 3.1.1. Different levels in the image hierarchy cuing at different levels in the specialization hierarchy.

Central to the operation of this model of perception is the idea that fine detail features are available in only a small portion of the image at a time, and that this area of availability coincides with a larger area of availability of coarse level features. Within a single such fixation, features are collected from their appropriate areas. Each feature has associated with it a list of the possible object models to which it may belong.
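The association between features and the models they may cue can be illustrated with a small Python sketch. It is not taken from the implementation described in this thesis. The model names ("finger", "hand", "arm") and the function names are hypothetical, and the sketch assumes only that each model lists the types of its components, so that inverting those lists yields a cue table which can be followed upward through the component hierarchy.

```python
from collections import defaultdict

# Hypothetical descriptions: each model lists the component types it uses.
DESCRIPTIONS = {
    "finger": ["line", "line"],
    "hand": ["finger", "finger", "line"],
    "arm": ["hand", "line", "line"],
}

def build_cue_index(descriptions):
    """Invert model -> components into component -> set of models it may cue."""
    cues = defaultdict(set)
    for model, components in descriptions.items():
        for component in components:
            cues[component].add(model)
    return dict(cues)

def cue_chain(index, start):
    """Collect every model reachable by following cues upward from `start`."""
    reached, frontier = set(), [start]
    while frontier:
        current = frontier.pop()
        for model in index.get(current, ()):
            if model not in reached:
                reached.add(model)
                frontier.append(model)
    return reached

index = build_cue_index(DESCRIPTIONS)
print(sorted(index["line"]))
print(sorted(cue_chain(index, "finger")))
```

In this toy table a line may cue any of the three models directly, while an asserted "finger" cues "hand", which in turn cues "arm". This mirrors, in a very reduced form, the recursive cuing described above.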
The first interpretation processes act towards the reduction of these model possibilities by the formation of groupings of features interrelated by the image hierarchy structure. These groupings allow the reduction of model possibilities through the enforcement of the requirements for consistent interpretations within groups (see section 4.4). These reduction operations are assumed to be parallel within groupings, and rely on set intersection as their primary operator[12].

The remaining possibilities must be examined in more detail with consideration of the specific relations required among features to verify models. Whenever these requirements are similar across a class of objects, model descriptions are compressed into generalized forms, thereby creating a criterion for a second type of specialization hierarchy.

Any object which is found to be adequately supported in the image is asserted, and then can act as a cue for the more complex structures of which it may be a component.

The results which are possible on the basis of a single fixation location may be quite limited, and so other areas of the image must be processed. Intelligent selection of fixation locations will expedite the interpretation.
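The use of set intersection to reduce model possibilities within a grouping can be sketched as follows. This is a minimal Python illustration under a strong simplifying assumption: a "consistent interpretation" is taken to mean that every feature in a group must share at least one candidate model. The actual consistency requirements are those given in section 4.4. The feature identifiers and model names are hypothetical.

```python
def reduce_group(candidates):
    """Keep, for every feature in a group, only the models common to all members.

    `candidates` maps a feature identifier to the set of model names it may
    belong to. Assumption: consistency means sharing a model across the group.
    """
    if not candidates:
        return {}
    common = set.intersection(*candidates.values())
    return {feature: set(common) for feature in candidates}

# Hypothetical grouping: two lines overlapping one blob.
group = {
    "line-1": {"arm", "leg"},
    "line-2": {"leg", "torso"},
    "blob-1": {"arm", "leg"},
}
print(reduce_group(group))
```

Because each grouping is reduced independently of the others, the groupings could in principle be processed in parallel, as the text assumes.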
This selection relies on the correspondence between foveal-based detailed results, and the results obtained in the coarse level periphery as follows:
(1) The coarse level results act as a framework for the integration of the successive high detail results.
(2) The locations to process are selected so as to maximize the propagation of detailed interpretation into the periphery.

[12] See (Fahlman, 1979) for a discussion of the use of set intersection as a unit operator in parallel systems.

In addition, there is provision for consideration of both the structure of the image and the task at hand in the selection of new processing locations.

After a number of fixations[13], the entire image is understood in terms of the coarse level models, and an understanding on the basis of the fine level models, obtained locally at the fixation centers, has been adequately propagated to the coarse level interpretation such that a fine level understanding of the entire scene is possible without actually having scrutinized each location with the fovea.

[13] The number depends largely on the settings for the radii of the fovea and periphery.

3.2. Declarative Schemata

The characteristics and advantages of a schemata-based approach in the application of model knowledge in computer vision have been outlined in section 2.2. There are a number of potential advantages to the development of mechanisms which can encode such knowledge in a declarative way.
(1) A declarative description of the problem domain, without reference to the means of interpretation, provides an explicit statement of the system's capabilities and requirements.
(2) Declarative domain representations may be used in conjunction with simplified control structures in order to verify the model knowledge before subjecting it to the typically more complex control required to use the model knowledge in vision. The recent interest in logic-based programming systems such as PROLOG has provided adequate tools for the accomplishment of this testing.
(3) Separation of problem domain knowledge from the interpretation methods permits simplified expansion or modification of the models, or even transfer to another domain which can be represented within the syntax of the declarative schemata. This separation also facilitates experimentation with a variety of interpretation control methods.
(4) Procedures may be developed to analyze declarative schemata towards the end of automatic generation of a cuing structure. Explicit statements about relationships among domain elements and their depiction in images mean that the knowledge may be inverted to obtain a cuing structure.
(5) In laying bare the structure of the problem domain, provision is made for the analysis of objects in terms of the relative importance of their attributes.
Of the many attribute values which may be developed for an object, some are criterial to its playing a part in the support of a more complex structure, while some may be relatively unimportant. Thus the interpretation processes may be tuned to first deal with those features which are important to recognition. The structural correspondences among representations based on features at different resolution levels may also be made explicit and available to analysis.

A declarative schemata system has been developed which is, in the strictest sense, a grammar of the problem domain and its depiction in the image. The terminal symbols of the grammatical description are the primitive elements of the image. The productions of the grammar will be referred to here as descriptions. This term is more appropriate because of their truly descriptive nature, and in order to avoid the connotation of a "production system" (Newell, 1973) which would be inappropriate because the descriptions involve no provision for interpretive action.

Phrase structure grammars are normally used in the representation of classes of objects which are essentially one-dimensional[14]. The implicit concatenation of vocabulary symbols in a production or sentential form is representative of adjacency in the input. For a class of two-dimensional image representations, or for a class of three dimensional scene objects, the notion of adjacency is more complex, and must be made explicit.
A system of assigning attributes to non-terminals has also been incorporated as a means of specifying the semantics of the domain. Values for the attributes are passed on and developed through a mechanism reminiscent of "attribute grammars" described by Knuth (1968) and Marcotty, Ledgard and Bochmann (1976).

A simple example of a description for an isosceles triangle will serve as a good demonstration of the way these extensions have been introduced. Each description has the underlying form of a phrase structure grammar production

    X --> A B C

where X is being defined in terms of the more basic elements A, B, and C.

    triangle --> {line line line}

[14] There are techniques which can reduce two-dimensional image elements into one dimension, such as tracing around the perimeter and recording the changes in direction in a list, which is then treated as input (see Ledley, 1964).

In order to expedite the assignment of relations among the basic elements, each is given a label by which it may be referred. This label also establishes its uniqueness within the description.

    triangle --> (($1 line) ($2 line) ($3 line))

Relations indicate the elements over which they apply:

    triangle --> (($1 line) ($2 line) ($3 line))
                 (($4 connect ($1 $2))
                  ($5 connect ($2 $3))
                  ($6 connect ($3 $1))
                  ($7 equal-length ($1 $2)))

Of course, this does not specify an isosceles triangle in two respects: the lines may not be straight, and they may overlap.
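The observation that the description so far is under-constrained can be checked with a small sketch. The Python below is an illustration only, not the representation or matcher used in this thesis. The dictionary layout, the `satisfies` function, and the way `connect` and `equal-length` are evaluated (shared end-points, and equal straight-line distance between end-points) are all assumptions made for the example.

```python
import math

# The labelled description so far: three lines, three connections,
# and one equal-length relation. No curvature or angle constraints yet.
TRIANGLE = {
    "elements": {"$1": "line", "$2": "line", "$3": "line"},
    "relations": [
        ("connect", "$1", "$2"),
        ("connect", "$2", "$3"),
        ("connect", "$3", "$1"),
        ("equal-length", "$1", "$2"),
    ],
}

def length(line):
    """Distance between the two end-points (assumed measure of length)."""
    (x1, y1), (x2, y2) = line["ends"]
    return math.hypot(x2 - x1, y2 - y1)

def connect(a, b):
    """Assumed meaning of 'connect': the two lines share an end-point."""
    return bool(set(a["ends"]) & set(b["ends"]))

def equal_length(a, b, tol=1e-6):
    return abs(length(a) - length(b)) <= tol

RELATIONS = {"connect": connect, "equal-length": equal_length}

def satisfies(description, binding):
    """Check a binding of labels ('$1', ...) to image features."""
    for label, kind in description["elements"].items():
        if binding[label]["type"] != kind:
            return False
    return all(RELATIONS[name](binding[a], binding[b])
               for name, a, b in description["relations"])

# One side is curved (curve 30), yet the description still accepts it.
binding = {
    "$1": {"type": "line", "ends": ((0, 0), (3, 4)), "curve": 30},
    "$2": {"type": "line", "ends": ((3, 4), (6, 0)), "curve": 0},
    "$3": {"type": "line", "ends": ((6, 0), (0, 0)), "curve": 0},
}
print(satisfies(TRIANGLE, binding))
```

The binding is accepted even though one of its sides is curved, because nothing in the description refers to the `curve` attribute. Constraints on the attributes of the elements and relations are needed to exclude such cases.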
The use of the attributes of the image features, as well as attributes for the relations, can provide for the specification of these constraints.

    triangle --> (($1 line (curve 0))
                  ($2 line (curve 0))
                  ($3 line (curve 0)))
                 (($4 connect ($1 $2) (angle (1 179)))
                  ($5 connect ($2 $3) (angle (1 89)))
                  ($6 connect ($3 $1) (angle (1 89)))
                  ($7 equal-length ($1 $2)))

Some higher level description may use this "triangle" as one of its basic elements, requiring conditions to be placed upon applicability through reference to its attributes, so it will be part of this description to specify what attributes are available and how they might be developed out of the attributes of the elements and relations composing the "triangle". This specification is easily added:

    triangle --> (($1 line (curve 0))
                  ($2 line (curve 0))
                  ($3 line (curve 0)))
                 (($4 connect ($1 $2) (angle (1 179)))
                  ($5 connect ($2 $3) (angle (1 89)))
                  ($6 connect ($3 $1) (angle (1 89)))
                  ($7 equal-length ($1 $2)))
                 ((base-length <- (length $3))
                  (orientation <- (slope $3))
                  (height <- (times (arctan (angle $5))
                                    (divide (length $3) 2))))

With very few further modifications, this form is capable of encoding the entire test problem domain, without requiring that the specifications of relations or attribute generation methods become much more complicated.

4. A Computer Vision Implementation

This chapter provides a detailed account of the computational vision system designed to implement and experiment with the ideas given in the previous chapter.
With the help of examples, the ideas are elaborated considerably, and several discussions of related research issues are included.

Below is shown an overview diagram of the computer implementation. Processes are enclosed in squares while data structures are not enclosed. Each structure or process has beside it a number which indicates the number of the section of this chapter which deals with it. Examples of the system in operation are provided in chapter five, and as well in Appendix D.

[Overview diagram of the computer implementation. Labels recoverable from the diagram: image generation and pre-processing; body image data; feature collection at a single location; current fixation location; image features; feature based operations; features with associated model role information; fixation location selection; model based operations; networks of partially complete models; instances of models (interpretation results); body knowledge representation; knowledge preparation; body data ready for use in vision.]

4.1. Problem Domain and Image Generation

The class of images interpreted by the system is line drawings of human-like body forms. The drawings are derived from those used by Eshkol and Wachmann (1958) to illustrate their dance notation, and as would be expected, they are very expressive of human body positions. Each body drawing is represented by 16 or 18 closed-line image constructions, depending on the perspective view. Some examples are provided in figure 4.1.1.

Figure 4.1.1. Some examples of body form drawings.
The drawings do not depict either foreshortening or occlusion, as the processes necessary to deal with such aspects of a scene require more detailed information about surface orientation and range, which are not easily deduced from the simplified image forms. Furthermore, these issues are not central to the goals of the research.

There is a very large number of drawings which fall within the problem domain. Requiring 45 degrees to distinguish between angular positions of body parts, ignoring the fact that several image constructions may depict a single view of a body part, and ignoring overall orientation and scale, there are still, by conservative estimate, about 100 million different drawings in the class.

The images are constructed through the use of a menu-driven program[15], which permits positioning, scaling and rotating of body part depictions selected from an inventory of image representations as shown in figure 4.1.2. The preliminary result is a list of straight line segment end-points on a 1024x1024 grid.

[15] This portion of the system runs on a PDP-11/34 using a VT-11 graphics generator, and a VR17 display tube. The interaction is accomplished with a light-pen.

Figure 4.1.2. Complete collection of body part depictions.

Next, a series of programs operates on this list of end-points in order to produce a data-base of features which will be made available to subsequent interpretation systems.
The features to be returned are line segments and blobs with attributes assigned as shown in figure 4.1.3.

    Feature type    Attribute
    ------------    ------------------------
    line            end-points
                    curvature[16]
    blob            center of gravity
                    end-points of long axis
                    end-points of short axis

Figure 4.1.3. Image features and their attributes.

[16] The angle formed at the intersection of the tangents at the end-points.

The Psychology literature supports the notion of curvature as a feature involved in human perception (Riggs, 1973).

In addition to these features, some image hierarchy information is computed. For each line segment, a list is returned of the blobs with which it overlaps in space, along with a measure of the amount of overlap.

This information is extracted by a series of FORTRAN programs which performs the following steps:

(1) Trace connected chains of straight lines looking for points of departure in curvature, and mark the segment boundaries, measuring the segment's curvature.

(2) Develop a 128x128 representation of the image, with each pixel encoding the length of line segments found in an 8x8 window in the 1024x1024 image. Thresholding the value of "on" pixels, blobs are determined by expanding outward from the unfilled centers.

(3) Determine the blob attributes. The axes are computed by averaging the orientations of the pixels nearest and most distant from the center of gravity.

(4) Calculate the segment and blob overlap for the image hierarchy information.

Figures 4.1.4 and 4.1.5 show an example line drawing at different stages of processing.
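Step (2) of the list above can be illustrated with a short sketch. It is written in Python rather than the FORTRAN of the original programs, and it is only one plausible reading of the step: the function names, the sampling approach, and the `samples_per_unit` parameter are assumptions made for illustration. Each segment is sampled at small regular intervals, and every sample adds its share of the segment length to the 8x8 window that contains it.

```python
import math


def coarse_length_map(segments, grid=1024, window=8, samples_per_unit=4):
    """Approximate the line length falling in each window x window cell.

    segments: iterable of (x1, y1, x2, y2) end-points on the grid.
    Returns a (grid // window) square list of lists of accumulated lengths.
    The sampling scheme is an assumption, not the thesis's own method.
    """
    size = grid // window                      # 1024 // 8 gives 128
    cells = [[0.0] * size for _ in range(size)]
    for x1, y1, x2, y2 in segments:
        length = math.hypot(x2 - x1, y2 - y1)
        if length == 0:
            continue
        n = max(1, int(length * samples_per_unit))
        step = length / n                      # length credited per sample
        for i in range(n):
            t = (i + 0.5) / n                  # midpoint of the i-th piece
            x = x1 + t * (x2 - x1)
            y = y1 + t * (y2 - y1)
            col = min(size - 1, int(x // window))
            row = min(size - 1, int(y // window))
            cells[row][col] += step
    return cells


def threshold(cells, minimum):
    """Mark as "on" every cell holding at least `minimum` units of line."""
    return [[value >= minimum for value in row] for row in cells]


cells = coarse_length_map([(0, 0, 16, 0)])
on = threshold(cells, 4.0)
```

The growth of blobs outward from unfilled centers is not sketched here, since the text does not give enough detail to reconstruct it.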
Figure 4.1.4. Line drawing at (a) 1024x1024 initial line drawing, (b) 128x128 averaged image.

Figure 4.1.5. Line drawing at (a) 32x32 averaged image (b) the axes of each detected blob.

It must be understood that the results of this initial processing are intended only as a base of features, representative of the type of information which might be available to a vision system. For this reason, the inner workings of the programs described here have not been elaborated. Chapter 5 provides a complete working example which includes a description of the information made available from the image as basic features.

4.2. Knowledge Representation

Body form knowledge is represented in a declarative schemata[17] system, consisting of three different types of descriptions:

(1) those which develop image constructions from the basic features.

(2) those which map between image constructions and basic scene objects.

(3) those which develop complex scene objects.

The following characterizes each type in turn with the help of examples from the body form knowledge.

Each of the valid structures in the line drawings is indicative of a particular perspective of a body part. The task of the image construction descriptions is to indicate how these views may be composed of image features. The descriptions are intended as prototypes, or ideal view representations, but as will be seen later in section 4.3, there is actually a wide variation which is acceptable, determined by the setting of global parameters.
Consider the example of "line-hand-1" shown in figure 4.2.1. This view is the starting, or base view of the left hand (see figure 4.2.3). As indicated, this is a component description which depicts the hand view in terms of three lines, one of which is curved.

[17] The basic concepts behind the declarative schemata system were first described in Browse (1980).

    (line-hand-1
      (component
        (($1 line (curve 53))
         ($2 line (curve 0))
         ($3 line (curve 0)))
        (($4 connect ($1 $2) (angle 134) (ratio 108))
         ($5 connect ($2 $3) (angle 90) (ratio 225))
         ($6 connect ($3 $1) (angle 92) (ratio 41)))
        ((size <- (length1 $1))
         (a2d <- (slope (location $5) (location $6)))
         (proximal-end <- (midpoint $3))
         (location <- (middle (location $4) (midpoint $3)))
         (distal-end <- (location $4)))))

Figure 4.2.1. Component description of a view of a hand.

Connections are always such that the angle between lines is a deflection to the right of magnitude between 0 and 180 degrees, thereby eliminating some ambiguity. The angle at the connection is a local angle, based on the end-point to end-point angle, and the curvature of the lines (see Appendix A). The prototypical ratio of the lengths of the lines is also provided as a constraint on the attributes of the connections[18].

[18] Though not shown in this example, a connection may also take on a "ctype" attribute, which serves to point out the infrequent occurrences of concave line connections. Examples of this construct may be seen in the description of "line-head-4" in Appendix B.
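As a rough illustration of how such a prototype might be compared with measured image data, the following Python sketch encodes the numbers of figure 4.2.1 as a dictionary and tests a candidate against them. It assumes that the three lines have already been bound to the labels $1 to $3. The 8 degree curvature tolerance is the value quoted in section 4.3; the angle and ratio tolerances, the dictionary layout, and the function names are hypothetical and stand in for the global parameters mentioned above.

```python
# Prototype values copied from the "line-hand-1" description (figure 4.2.1).
LINE_HAND_1 = {
    "lines": {"$1": 53, "$2": 0, "$3": 0},          # prototype curvatures
    "connections": {
        "$4": {"between": ("$1", "$2"), "angle": 134, "ratio": 108},
        "$5": {"between": ("$2", "$3"), "angle": 90, "ratio": 225},
        "$6": {"between": ("$3", "$1"), "angle": 92, "ratio": 41},
    },
}

# Curvature spread as quoted in section 4.3; angle and ratio spreads are
# hypothetical placeholders for the system's global parameters.
TOLERANCE = {"curve": 8, "angle": 15, "ratio": 30}


def within(measured, prototype, kind):
    """True when a measured value lies inside the allowed spread."""
    return abs(measured - prototype) <= TOLERANCE[kind]


def fits_prototype(description, curves, connections):
    """Check measured curvatures and connection attributes against a prototype.

    curves: dict label -> measured curvature.
    connections: dict label -> {"angle": ..., "ratio": ...}.
    """
    for label, prototype in description["lines"].items():
        if not within(curves[label], prototype, "curve"):
            return False
    for label, spec in description["connections"].items():
        measured = connections[label]
        if not within(measured["angle"], spec["angle"], "angle"):
            return False
        if not within(measured["ratio"], spec["ratio"], "ratio"):
            return False
    return True


measured_connections = {
    "$4": {"angle": 130, "ratio": 100},
    "$5": {"angle": 95, "ratio": 230},
    "$6": {"angle": 90, "ratio": 45},
}
good = fits_prototype(LINE_HAND_1, {"$1": 50, "$2": 2, "$3": 0}, measured_connections)
bad = fits_prototype(LINE_HAND_1, {"$1": 70, "$2": 2, "$3": 0}, measured_connections)
```

The second candidate fails only because its curved line departs from the prototype curvature of 53 by more than the allowed spread.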
In this example, as with all line construction descriptions, the attribute "a2d" indicates the two-dimensional orientation of the view as determined by the orientation of one of its composing lines. The attributes "size" and "proximal-end" are also important in later uses of this description.

Figure 4.2.2 gives an example of a description which maps an image construction into a basic scene element. There is only one element in the description, that is "line-hand-1", and there are no required relations. The description simply transfers attribute values from the image domain into the scene domain. There is a similar description for each of the topologically different views of the hand.

    (hand
      (image
        (($1 line-hand-1))
        nil
        ((posture <- open)
         (location <- (location $1))
         (proximal-end <- (proximal-end $1))
         (distal-end <- (distal-end $1))
         (size <- (size $1)))
        ((side <- left)
         (a3d <- (list 0 0 (neg (a2d $1)))))
        ((side <- right)
         (a3d <- (list 0 180 (a2d $1))))))

Figure 4.2.2. Image to scene mapping description for a view of the hand.

In the attribute portion of the description, there is first of all a list of attributes which are to be passed directly. There are also additional sets of attributes which are referred to as elaborations. Each elaboration consists of a cluster of attribute alternatives to be assigned to the scene element.
In this example, if the hand turns out to be a right hand, the two-dimensional orientation of the "line-hand-1" object will influence the three-dimensional orientation ("a3d") of the hand in a different way than if it is the left hand.

The convention used to denote the three-dimensional orientation of a body part indicates the amount of orientation variation there is from the body part's starting position. Figure 4.2.3 shows the body in the starting position.

Figure 4.2.3. Body form in starting position.

The orientation is given as a triple indicating the left-hand rotation in the range [0,pi) about each of the body-centered cartesian axes (see figure 4.2.4).

Figure 4.2.4. Body part orientation relative to its rest position described as a triple (θx, θy, θz).

Providing that the three component angles are always considered in the same order, each orientation triple is unique in its representation of the orientation of the body part.

There are instances for which the three-dimensional orientation will be the same regardless of whether the scene object turns out to be right or left, and there are instances for which several alternative three-dimensional orientations are possible for each side. These representations are sensitive to the depiction possibilities allowed for an image structure. Consider figure 4.2.5.
If this image construction is allowed to depict an upper leg in such a way that the curved bulge may be either the back of a leg or the outside of the leg, then each of these orientation elaborations must be included in the mapping to the scene domain. If the depiction were extended so that the bulge could represent the triceps, then an additional elaboration would be necessary.

Figure 4.2.5. A single depiction of an upper-leg used to represent three different orientations.

The philosophy behind the use of this type of representation is based on a concept of objects as collections of attributes. Each of the potential mappings from the image constructions to the scene objects differs only in the way it develops some of the attributes. For the example shown in figure 4.2.5, the attribute values of "size" and "distal-end" will be the same for each of the six possible mappings (three per side). Only "a3d" and "side" vary across the different meanings of the view. It does not seem reasonable to specify each mapping separately, but rather, because of the similarity, it is best to develop a generic mapping from which several specialized versions, called "elaborations", may be obtained.

The third description type composes more complex body parts out of the basic ones.
The structure of the description is almost identical to the image construction descriptions except that the components are now other body parts instead of lines, and the "connect" relation has been replaced by a "near" relation. The "near" relation also specifies two elements for which it must hold, and specifies point attributes of those elements, which must be within a proximity of one another as determined according to the overall size of the parts involved.

For any such "near" relation, the orientation of the more distal part relative to the proximal may easily be computed. This orientation triple is broken down into three separate attributes of the relation: "angle-x", "angle-y", and "angle-z". Ranges in which these values must fall are provided in the relation specification, and indicate the range of motion capabilities at the joints of the body[19].

[19] The rest (or starting) position chosen is identical to that used by Eshkol and Wachmann (1958) in their dance notation. It is also identical to the rest position used by the American Academy of Orthopedic Surgeons in their handbook "Joint Motion: Method of Measuring and Recording" (1965). So the angles of joint movement provided in that handbook could be inserted directly into the body model.

Figure 4.2.6 provides, as an example, the "left-arm" description[20].
    (left-arm
      (component
        (($1 hand (side left))
         ($2 lower-arm (side left))
         ($3 upper-arm (side left)))
        (($4 near ($1 $2 proximal-end distal-end)
             (angle-x (-30 20))
             (angle-y (0 0))
             (angle-z (-90 90))
             (ratio 43))
         ($5 near ($2 $3 proximal-end distal-end)
             (angle-x (-150 150))
             (angle-y (-90 90))
             (angle-z (-150 150))
             (ratio 79)))
        ((a3d <- (a3d $3))
         (proximal-end <- (proximal-end $3))
         (size <- (times 2.1 (size $3)))
         (location <- (location $5))
         (elbow-location <- (distal-end $3))
         (elbow-posture <- (diff (caddr (a3d $3)) (caddr (a3d $2)))))))

Figure 4.2.6. Left-arm schema description.

[20] The hand was described as a single object with a side attribute of either left or right. The arms and legs, however, have separate descriptions for their sides. This was an arbitrary and intentional decision made so that investigation could be made into the use of both modes. The final model was left with one of each.

Other descriptions, of course, develop even more complex body parts, such as "lower-body" in terms of "leg" and "hips", and finally the distinguished symbol of the grammar "body" is defined. The entire body knowledge grammar is provided in Appendix B.

The complete grammar for the body knowledge is made up of two layers[21]. The examples used in this section are all from the fine layer, which gives a detailed account of the body on the basis of line features. The coarse layer provides a rough account of the body on the basis of blob features. Each uses descriptions of the same form, and really is a separate grammar in its own right.
Most non-terminals do, however, have counterparts in the other layer, specified explicitly through the use of generalization/specialization hierarchy linkage. For example, an object "limb" in the coarse layer grammar has links to both "arm" and "leg" in the fine layer, while "extremity" links to both "hand" and "foot".

Within each layer of the grammar, then, is specified a component hierarchy of body parts. Across layers is specified the generalization/specialization hierarchy. Figures 4.2.7 to 4.2.9 show the complete structure involved in these hierarchies.

[21] The term "layer" is chosen here rather than "level" to avoid any confusion with the notion of a two-level grammar (van Wijngaarden, et al 1969), which is an entirely different concept.

Figure 4.2.7. The component hierarchy for the fine layer of the body form representation.

Figure 4.2.8. The component hierarchy for the coarse layer of the body form knowledge.

Figure 4.2.9. The specialization/generalization hierarchy for the body form representation.

4.2.1. Adequacy of Representation

As a means of testing the adequacy of the body form knowledge, a portion has been translated into the programming system PROLOG[22], and used to "prove" body parts in a data base of assertions about lines. Each object in the PROLOG system is identified by a "tag" which is made up of its name followed by an integer (for example, "line7"). Attributes are then axioms asserted which involve the tag.
For example, the assertion of "line1" is:

    point(1,line1,432,876)
    point(2,line1,383,950)
    length(line1,88)
    curve(line1,14)

The processing first detects all connections in the image and asserts them. Next, the existence of a complex scene object is entered as a goal:

    <- leg(*tag,*side)

Through a straightforward process, it was possible to devise theorems for the body parts, based on the grammatical descriptions[23]. All that was necessary to support the use of these theorems was to encode the relations which are specified in the descriptions (for example, "near", "connect").

[22] This portion of the system was implemented on an Amdahl V8, running MTS operating system.

[23] Appendix C contains printouts of some of the PROLOG theorems for the body parts. They may be compared to the grammatical descriptions found in Appendix B.

This PROLOG system was useful in two respects:

(1) It was determined that the body form representation is adequate to permit interpretation.

(2) The system could be used as a means of debugging the knowledge representation, without the complication of interpretation processes being involved.

One of the basic goals of this research is to experiment in methods of controlling and directing interpretation of images using the declarative structures which define the problem domain. The following sections will outline these procedures.
Because of the difficulty of implementing local consistency methods and because of its inherent commitment to backtrack search, PROLOG was not used as the language of implementation.

4.3. Preparation for Interpretation

The idea behind the "recursive cuing mechanism" of schemata-based vision systems is that each object (or perhaps relation) in the problem domain may act both as a feature and as a model. This way of viewing the structure of a domain seems applicable to the body knowledge representation because there are explicit ties among the elements throughout the descriptions. A closer examination, however, reveals a problem: The fringe nodes of the component hierarchy, the features such as a "line", cue a large number of models - in fact every image construction in the fine layer. Similarly for connections between lines.

The solution to this problem is found through the use of a technique for incorporating the attribute structure of objects into the mechanism for maintaining model possibilities. We shall refer to this technique as set labelling.

The idea behind "set labelling" can be easily expressed in the context of a simplified problem domain. Consider the domain of vehicles (as used in Havens, 1978). Assume four vehicles: sports car, bicycle, cart, and truck, each of which has a description based on its components. Each will have "wheel" as a component, but each will express different attribute values which must hold for the "wheel":

    truck --> . . . .
              ($1 wheel (type solid) (width thick))
    bicycle --> . . . .
              ($1 wheel (type spoked) (width thin))

Set labelling provides a cuing structure which is contingent on attribute values of features. A simple structure is set up, as shown in figure 4.3.1.

    wheel
      type   solid    (truck cart)
             spoked   (bicycle sports-car)
      width  thick    (truck sports-car)
             thin     (bicycle cart)
      other attributes

Figure 4.3.1. A simple set labelling structure.

When an instance of a wheel is detected, it acts as a cue for any of the four vehicles, but as attribute values of the wheel are obtained, the set of possible models becomes automatically constrained to the set corresponding to the attribute value. If both attributes become available, simple set intersection of the models for each attribute value will further constrain the possibilities. Thus partial information, as might be available in the view of the wheel from the side (type only), will be useful, and additional information will always be well used. The order of appearance of the attributes is of no consequence, as it would be if the encoding were either procedural or in the form of a discrimination tree.

A preliminary analysis of the grammatical representation of the problem domain can easily produce such set labelling structures for any feature's attribute which can be appropriately quantized.

For the body knowledge grammar, lines, blobs, and connections are treated in this way.
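The vehicle example can be made concrete with a small Python sketch. The thesis implementation was written in Franz Lisp, so the code below is an illustration only, and all function and variable names are assumptions. The wheel requirements for "truck" and "bicycle" are those given in the text; the values for "cart" and "sports-car" are inferred from the sets shown in figure 4.3.1.

```python
# Wheel attribute values required by each vehicle description.
# truck and bicycle come from the text; cart and sports-car are read
# off the sets in figure 4.3.1.
WHEEL_REQUIREMENTS = {
    "truck": {"type": "solid", "width": "thick"},
    "bicycle": {"type": "spoked", "width": "thin"},
    "cart": {"type": "solid", "width": "thin"},
    "sports-car": {"type": "spoked", "width": "thick"},
}


def build_set_labels(requirements):
    """Invert the descriptions: (attribute, value) -> set of cued models."""
    labels = {}
    for model, attributes in requirements.items():
        for name, value in attributes.items():
            labels.setdefault((name, value), set()).add(model)
    return labels


def possible_models(labels, all_models, observed):
    """Intersect the model sets for whichever attribute values are known."""
    candidates = set(all_models)
    for name, value in observed.items():
        candidates &= labels.get((name, value), set())
    return candidates


labels = build_set_labels(WHEEL_REQUIREMENTS)
models = WHEEL_REQUIREMENTS.keys()

nothing_known = possible_models(labels, models, {})
type_only = possible_models(labels, models, {"type": "solid"})
both = possible_models(labels, models, {"type": "solid", "width": "thin"})
both_reversed = possible_models(labels, models, {"width": "thin", "type": "solid"})
```

With no attribute known, all four vehicles remain possible; knowing only that the wheel is solid leaves the truck and the cart; adding a thin width leaves only the cart. Because the narrowing is a set intersection, the order in which the attributes arrive does not change the result.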
Consider what happens in the case of lines: The description for "line-hand-5" includes

    ($1 line (curve 17))

This means that any line with curvature of 17 can act as a cue for the "$1" component of the model "line-hand-5". Since these image construction descriptions are intended as prototypes, we would also expect lines with similar curvature values to cue this role in the model. Thus the range over which the attribute values may vary, and still fulfill the description, is expanded out to an interval by an arbitrary extent. In the case of line curvature, the range is expanded 8 degrees from the prototype, so any line with curvature between 9 and 25 degrees will cue the model.

Each use of "line" in a description can be similarly analyzed, until a large list of curvature ranges with cued models has been produced. Over this list, overlapping portions of adjacent ranges are merged, until a list of mutually exclusive ranges with their attached set of cued models has resulted. Some adjacent ranges may differ only slightly in the list of models they cue, so another merging operation collapses across such ranges.

The result is a partitioning of the range of values that the attribute "curvature" can take on, together with a list of possible models, that is, a list of all description-label pairs which specify a line with the attribute within the range. A partial example is shown in figure 4.3.2.
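The expansion and merging just described can be sketched in Python as follows. This is an illustrative reconstruction rather than the thesis code: the function names are assumptions, ranges are treated as half-open at their upper end so that they are mutually exclusive, and the final step that collapses adjacent ranges whose model lists differ only slightly is left out. The prototype curvatures for "line-hand-5" and "line-hand-1" are those quoted in the text and in figure 4.2.1; the entry named "hypothetical-model" is invented to show an overlap.

```python
def partition(uses, spread=8):
    """Turn prototype values into mutually exclusive ranges of cued roles.

    uses: list of (prototype_value, role) pairs, one per use of the feature.
    Returns a list of (low, high, roles) with roles a frozenset.
    """
    intervals = [(value - spread, value + spread, role) for value, role in uses]
    cuts = sorted({bound for low, high, _ in intervals for bound in (low, high)})
    pieces = []
    for low, high in zip(cuts, cuts[1:]):
        roles = frozenset(r for a, b, r in intervals if a <= low and high <= b)
        if not roles:
            continue                      # no description covers this gap
        if pieces and pieces[-1][1] == low and pieces[-1][2] == roles:
            pieces[-1] = (pieces[-1][0], high, roles)   # merge identical neighbours
        else:
            pieces.append((low, high, roles))
    return pieces


def cued_roles(pieces, value):
    """Look up the roles cued by a measured attribute value."""
    for low, high, roles in pieces:
        if low <= value < high:
            return roles
    return frozenset()


uses = [
    (17, ("line-hand-5", "$1")),
    (53, ("line-hand-1", "$1")),
    (0, ("line-hand-1", "$2")),
    (0, ("line-hand-1", "$3")),
    (25, ("hypothetical-model", "$1")),   # invented entry, not from the grammar
]
pieces = partition(uses)
```

A line whose measured curvature is 20 falls in the range shared by "line-hand-5" and the invented model, so both roles are cued; a curvature of 12 cues only "line-hand-5".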
    (line curve
      (<range>
        (line-lower-leg-1 component $3)
        (line-lower-leg-2 component $2)
        (line-hand-3 component $4)
        (line-hand-3 component $2)
        (line-hand-4 component $5)
        (line-hand-4 component $3)
        (line-trunk-3 component $2)
        (line-trunk-4 component $3)
        (line-head-1 component $4))
      (<range>
        (line-head-1 component $4)
        (line-head-1 component $3)
        (line-lower-leg-1 component $3)
        (line-lower-leg-2 component $2)
        (line-hand-3 component $2)
        (line-hand-4 component $5)
        (line-trunk-3 component $2)
        (line-trunk-4 component $3)
        (line-head-2 component $3)
        (line-head-2 component $2)))

[The bounds of the two curvature ranges are not legible in the source; only the fragments 9, 19, 8 and 23 survive.]

Figure 4.3.2. A partial example of the set labelling data structure for the curvature of lines.

With this process we have accomplished both the generalization from the prototypical descriptions and the development of the set labelling structure. Whenever an instance of a line is found in the image, and its curvature is known, it will store the value of that attribute as a pointer into the set labelling structure, because for the purpose of the interpretation it need not be known more precisely than the range into which it falls.

The same procedure is carried out for the "ratio" attribute of "blobs", and for the "angle" and "ratio" attributes of "connections". In the case of "connections", the set labelling is used to its full advantage because it is often the case that the angle between two lines can be determined (because it is a local property) but that the ratio of the connecting lines is not known.

A model possibility list is developed for every element used in a description, including terminals and non-terminals.
In a sense this is a complete inversion of the grammar. The resulting structure can be thought of as a cue table (Mackworth, 1977a) for the entire body knowledge. It is important to realize that it is the declarative and uniform structure of the knowledge grammar which permits the automatic development of these useful structures.[24]

[24] This part of the implementation, as well as all the following parts, was accomplished in Franz Lisp, with the UNIX operating system on a VAX-11/780.

4.4. Feature-Based Operations

Before the steps in the use of the body grammar in image interpretation can be explained we must first consider the availability of features. Features are made available in limited areas of the image, defined as concentric circles about a central fixation location. An inner circle defines the foveal area, an area in which line information is available, and a larger circle is the peripheral area, the area of available blob data. The center point of these circles, called the fixation center, can be moved to any location in the image. These circles, whose radii are arbitrarily set, represent the availability of information at different resolution levels because of the structure of the human retina. Figure 4.4.1 shows an image with a typical fixation.

Figure 4.4.1. A typical fixation of an image. The area in 128x128 resolution indicates the periphery, and the 1024x1024 area is the fovea. The rest of the image is shown in 32x32 resolution.
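The effect of a fixation on feature availability can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the thesis implementation: the function name, the data layout, and the radii are hypothetical, a line is taken to be available when at least one of its end-points lies in the foveal circle, and a blob is taken to be available when its center of gravity lies in the peripheral circle.

```python
import math


def visible_features(fixation, lines, blobs, foveal_radius, peripheral_radius):
    """Select the features available at one fixation.

    lines: dict name -> (end_point_a, end_point_b), each an (x, y) pair.
    blobs: dict name -> center of gravity as an (x, y) pair.
    Returns (dict line name -> list of detected end-points, list of blob names).
    """
    def inside(point, radius):
        dx = point[0] - fixation[0]
        dy = point[1] - fixation[1]
        return math.hypot(dx, dy) <= radius

    seen_lines = {}
    for name, end_points in lines.items():
        detected = [p for p in end_points if inside(p, foveal_radius)]
        if detected:                       # at least part of the line is foveal
            seen_lines[name] = detected
    seen_blobs = [name for name, centre in blobs.items()
                  if inside(centre, peripheral_radius)]
    return seen_lines, seen_blobs


lines = {
    "line-1": ((450, 500), (560, 500)),    # both end-points in the fovea
    "line-2": ((590, 500), (700, 500)),    # one end-point outside the fovea
    "line-3": ((900, 900), (950, 950)),    # entirely outside
}
blobs = {"blob-1": (600, 600), "blob-2": (900, 900)}
seen_lines, seen_blobs = visible_features((500, 500), lines, blobs, 100, 300)
```

Here "line-2" is reported with a single detected end-point, which corresponds to a feature that is available but whose attributes are only partly determined.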
Features within the fixation may not be complete in the sense that some of their attributes may not be determined. For example, the curvature of a line may be found, but one of the endpoints might lie outside of the fovea and so not be detected. Features will, however, always specify lists of model possibilities - lists of roles that they may play in object descriptions.

One approach to interpretation is to begin model invocation. This operation involves the examination of the descriptions specified in the model possibilities, followed by attempts to establish the existence of the remaining required features, and then testing the required relations among them. This approach can provide a dynamic determination of whether processing should proceed top-down or bottom-up (see Havens, 1978). As well, it can provide a means of iterative refinement of segmentation and interpretation (see Mackworth, 1978).

Model invocation can be costly because of the search involved and the possibility of redundant operations. In order to validate a model in terms of the image, the relations of the description must be verified with appropriate bindings of features. Model invocation methods are eventually used to accomplish these tests, and will be described in the context of the body grammar in the following section. The purpose of this section is to examine the idea that some of the model possibilities (or roles) might be eliminated before the complete model invocation procedures are used.
As we have seen in the review of the work of Waltz (1972) and Mackworth (1977b), the operations of network consistency may be used to reduce sets of possible bindings whenever an appropriate constraining relation can be identified. As a step towards discovering such relations, we note that the coarse layer features can be expected to be larger than the fine layer features. Furthermore, several fine layer features can be expected within the same image extent as a single coarse layer feature.

For the body drawing problem domain, each blob feature coincides with a number of line features. An image hierarchy retains this information, as described in section 4.1. In some cases, a line feature may be related to several blob features, particularly if the body parts in the original drawing are close together, but in many cases line features will only have this image hierarchy connection with a single blob feature.

In such cases of confidence about the coincidence of features from different layers, it seems a reasonable assumption that a group of line features which are all related to the same blob will also exhibit the specific line-based relations which would be necessary for their confirmation of support for some more complex object. In a natural vision situation there would be many more sources of the formation of groups of fine layer features.
Motion, colour, and texture could all provide the criteria for the development of rough groupings within which we might expect continuity of interpretation for the fine layer features.

This idea forms the basis of the first feature-based consistency relation:

(1) Grouping Consistency: Fine layer features which are image hierarchy related to the same coarse layer feature must have compatible interpretations.

Consider a simple example as shown in figure 4.4.2.

    line-10
    {line-hand-1 $1} {line-upper-arm-2 $1} {line-upper-arm-2 $2} {line-foot-3 $1}
    {line-hand-1 $2} {line-neck-2 $3} {line-lower-leg-1 $1} {line-foot-3 $2}
    {line-hand-1 $3} {line-neck-2 $1} {line-lower-leg-1 $3} {line-foot-3 $3}
    {line-lower-leg-1 $2}

Figure 4.4.2. Initial situation, showing three lines connected by image hierarchy to a blob feature. The model possibilities are shown beneath the line features.

The three lines are known to be within the same image area as "blob-17", and so they are image hierarchy related to that blob. Each of the lines has a number of possible model roles as indicated in the lists beneath them. After the application of grouping consistency, some possibilities are eliminated, as indicated in figure 4.4.3.

    {line-hand-1 $1} {line-foot-3 $1}
    {line-hand-1 $2} {line-lower-leg-1 $1} {line-foot-3 $2}
    {line-hand-1 $3} {line-lower-leg-1 $3} {line-foot-3 $3}
    {line-lower-leg-1 $2}

Figure 4.4.3. After the application of grouping consistency.
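Grouping consistency can be sketched as a set intersection over generic model names. The following Python fragment is a minimal illustration rather than the original Franz Lisp code. The function name is hypothetical, and the assignment of role lists to "line-8", "line-9" and "line-10" is invented for the example, since the figure's column layout is not reproduced here.

```python
def grouping_consistency(possibilities):
    """Keep only roles whose generic model is proposed by every line.

    `possibilities` maps each line feature (all image hierarchy related
    to the same blob) to a set of (model, role) pairs.
    """
    model_sets = [{model for model, _ in roles} for roles in possibilities.values()]
    shared = set.intersection(*model_sets) if model_sets else set()
    return {
        line: {(model, role) for model, role in roles if model in shared}
        for line, roles in possibilities.items()
    }

# Hypothetical role lists for three lines related to one blob.
lines = {
    "line-8": {("line-hand-1", "$1"), ("line-upper-arm-2", "$1"),
               ("line-foot-3", "$1"), ("line-lower-leg-1", "$1"),
               ("line-neck-2", "$3")},
    "line-9": {("line-hand-1", "$2"), ("line-upper-arm-2", "$2"),
               ("line-foot-3", "$2"), ("line-lower-leg-1", "$3")},
    "line-10": {("line-hand-1", "$3"), ("line-neck-2", "$1"),
                ("line-foot-3", "$3"), ("line-lower-leg-1", "$2")},
}
reduced = grouping_consistency(lines)
surviving_models = {model for roles in reduced.values() for model, _ in roles}
```

Only the model names are intersected; the role labels ("$1", "$2", ...) are carried along untouched, so a line keeps every role it proposed in a surviving model.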
Note that the consistency has been based only on the generic model types, not on the roles in those models. Thus all role possibilities for the model "line-neck-2" were eliminated, because "line-9" did not support any role in that model. In some situations, not all of the lines which compose an image construction will be available in the image hierarchy information, so this method does not require that all of the roles be present at this stage.

Figure 4.4.3 also shows the model possibilities for the blob feature. The specialization hierarchy encodes relations between models based on fine layer features and models based on coarse layer features. This information is used in the second feature-based consistency relation.

(2) Inter-Level Consistency: Fine layer features which are image hierarchy related to the same coarse layer feature must have interpretations compatible with the specializations of the coarse layer feature's interpretations.

In the example, of the possible interpretations for "blob-17", only "extremity" has a counterpart among the remaining line interpretations (as either "line-hand-1" or "line-foot-3"), and so the others are eliminated. Similarly, the "line-lower-leg-1" possibility has no counterpart among the blob's interpretations, and so it is eliminated. The result is shown in figure 4.4.4.
    line-10
    {line-hand-1 $1} {line-foot-3 $1}
    {line-hand-1 $2} {line-foot-3 $2}
    {line-hand-1 $3} {line-foot-3 $3}

Figure 4.4.4. The final situation after the inter-level consistency has been applied.

Once these two consistency requirements are met, the number of remaining model possibilities is significantly reduced. One further constraining relation is available on the basis of the junctions between lines. Recall from section 4.3 that model possibilities are also assigned to the points of connection between lines. These roles are based on the attributes (1) the angle at the junction, and (2) the ratio of the lengths of the lines forming the junction. These junctions must have interpretations which are compatible with the interpretations of the lines which meet at the junction.

Again, this consistency requirement is imposed only for the generic models. For example, if one of the model possibilities for a junction is "{line-hand-1 $4}", and one of the possibilities for a line involved in the junction is "{line-hand-1 $3}", then they will be considered compatible. A closer examination might reveal that the required connection "$4" is not intended to involve the line bound as "$3". This more detailed examination based on the contents of the schemata descriptions is reserved for a point after the feature-based operations are complete. It is the intention that these feature-based operations remain simple enough that set intersection is adequate for their implementation.
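The junction requirement can be sketched with the same set-intersection idea. The Python fragment below is an illustrative sketch with hypothetical names. It assumes that only the junction's possibilities are filtered against the lines, which is one reading of the requirement and not necessarily the original implementation.

```python
def junction_consistency(junction_roles, line_roles_list):
    """Keep junction roles whose generic model is proposed by every meeting line.

    Compatibility is judged on model names only; whether connection "$4"
    really involves the line bound as "$3" is deliberately not examined.
    """
    models_per_line = [{model for model, _ in roles} for roles in line_roles_list]
    allowed = set.intersection(*models_per_line) if models_per_line else set()
    return {(model, role) for model, role in junction_roles if model in allowed}

# Hypothetical possibilities for one junction and the two lines meeting there.
junction = {("line-hand-1", "$4"), ("line-neck-2", "$5")}
meeting_lines = [
    {("line-hand-1", "$3"), ("line-foot-3", "$1")},
    {("line-hand-1", "$1")},
]
kept = junction_consistency(junction, meeting_lines)
```

Here "{line-hand-1 $4}" survives because both lines propose some role in "line-hand-1", while the "line-neck-2" possibility is dropped.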
The consistency requirement across junctions has the appropriate format for the application of network consistency methods. Full arc consistency, as reviewed in section 2.2, requires complete relaxation, with several iterations. Each of the feature-based consistency relations is only applied in a single pass over the features, or junctions. The condition of consistency is not required by any of the subsequent processes, and the single pass makes a significant reduction in the number of model possibilities.

4.5. Model-Based Operations

To review to this point, features such as lines, blobs and connections have been extracted within a fixation area of the image. On the basis of the properties of these features, lists of possible interpretations have been assigned. Each possible interpretation is really a role that the feature may play in one of the declarative schemata descriptions of more complex objects. The feature-based operations have made a major reduction in these lists of roles.

The purpose of the processes described in this section is to confirm the precise conditions as laid out by the schemata, and thereby assert the existence of more complex structures. These more complex structures will, in turn, be assigned model possibilities in still more complex objects.
In section 4.2 it was noted that the set of declarative schemata descriptions which encodes the body knowledge may be viewed as a grammar of the problem domain. The model-based operations may be seen as an attempt to parse the image, and develop a parse tree result. The leaf nodes of the parse tree are the collected features, the middle nodes are the simple body parts, the higher nodes are the more complex body parts, and the root node represents the entire body form.

Each node in this developing parse tree will be called a description instantiation, meaning that whenever one of the schemata descriptions is verified, one such node is generated. Each node will be composed of three types of information: (1) its generic name (the schema of which it is an instance), (2) the arcs pointing to the nodes beneath it which act as its supporting evidence, and (3) attribute value pairs for the attributes which are specified in the schema.

Due to the non-uniform availability of features over the retina, it will often be the case that schemata instances will be partly developed, but not complete. For example, the schema description of "line-foot-1" might be satisfied by two lines and a junction, but the third line might not be available, either because it falls outside of fixation, or because it has not yet been considered.
The approach which has been taken is to record these partial instantiations for any given schema as a network whose nodes are the possible bindings for the required objects and whose arcs are the relations required among objects. This section will present an algorithm which can be used to extract any newly completed instance of the schema which might result from the addition of a new binding possibility into the network.

The system's interpretation method is entirely bottom-up. This choice of strategy is not reflected at all in the body model representation, but is local to the control programs. Other planned versions of the system will be able to implement a variety of types of top-down control.

This section will outline and discuss the model-based operations in the context of a single fixation. The problem of intelligently selecting these locations, and of combining information across fixations, will be deferred to the following section. This partition is natural because the size of the foveal and peripheral radii of feature availability is arbitrarily chosen, and it is possible to set these values to cover the entire image and thereby extract all features in a single fixation - as if the image subtended a very small visual angle.

The task is one of parsing to as high a level as can be supported by the available features. Each of the two layers of the grammar is complete, and can be used independently, so only the fine layer will be discussed.
This layer is more complex because of the "elaboration structure" in the attribute specification portions of the descriptions.

There are a number of issues relating to this phase of interpretation. In this section we shall concentrate on two particular issues and show how they motivate the mechanism developed for interpretation. The first issue, termed the "locally legal interpretations issue", is a result of the uncertainty of the order in which elements should be considered, combined with the commitment to initiate the development of the understanding of the scene before having extracted features over the entire image. The second, the issue of "representing relation instances", is a result of the local uncertainty of the underlying three dimensional structure of a self deforming scene object such as the human body.

4.5.1. Locally Legal Interpretations Issue

Each prototypical schemata description of image constructions in the grammar is unique. However, once the generalization over attribute specifications takes place (as shown in section 4.3), a collection of lines in the image may satisfy the criteria for a number of descriptions. This is particularly true for constructions intended to be at different scales. For example "line-hand-1" and "line-hips-2" are similar.
As a result, locally legal interpretations will be found which turn out to be incorrect in a larger context, so it is important to not make too great a commitment to a completed description, by, for example, allowing it to control the parse, searching for its other required elements.

This problem is also encountered at a higher level (toward the root node) in the parse of a body form image. The problem is more vividly illustrated at this level. Suppose that the image shown in figure 4.5.1 is to be interpreted. It is apparent that the parts labelled "1" and "2" belong to the same leg (crossed in front of the body) and that parts "3" and "4" belong to the other. It might be the case that the first leg to be recognized in the image is the one made up of parts "2" and "3", which is completely legal in a local sense.

Figure 4.5.1. Locally legal, but globally illegal structures in body form problem domain.

One solution to the problem requires a mechanism whereby a single feature may support a number of hypothesized models, not only different models, but also several versions of the same model. The particular solution used here involves a variation of network consistency methods (Mackworth, 1977b), so first we shall examine the difficulties in applying those methods directly, a line of reasoning which will reintroduce the "locally legal interpretations" issue within a stricter formalism.

4.5.2. Applying Network Consistency

A network may be established for each schemata description.
The purpose of the network will be to retain a record of the features, found to any point, which express a possibility of supporting the description. Each object label in the description will be represented by a node, and the required relations between the objects will be the edges. The nodes will be viewed as variables, with possible bindings from the set of features which have specified the corresponding label in that description. For example, consider the schema description for the image construction "line-hand-1" [25]:

    (line-hand-1 nil
      (component
        (($1 line (curve 53))
         ($2 line (curve 0))
         ($3 line (curve 0)))
        (($4 connect ($1 $2) (angle 134) (ratio 108))
         ($5 connect ($2 $3) (angle 90) (ratio 225))
         ($6 connect ($3 $1) (angle 92) (ratio 41)))))

We may form the network as shown in figure 4.5.2, for which the three required objects are nodes, and the relations are arcs.

[25] The attribute development portion has been deleted.

Figure 4.5.2. A network constructed from a schema description.

Within the fixation, roles of the features are considered, one at a time. Each role results in an entry to the network for the schema specified in the role. For example, if "line-1" is found to have the role "{line-hand-1 component $1}", then the domain for the "$1" node will be updated to reflect the possibility as shown in figure 4.5.3.

Figure 4.5.3. A network constructed from a schema description with an entry made.

As more roles, and other lines are considered, several entries will be made to the network.
With each entry, the required relations among elements are examined, and if established, they are entered as part of the extension of the arcs. A more advanced state of development is shown in figure 4.5.4.

    $1: {line-1, line-4}
    $2: {line-1, line-2, line-5}
    $3: {line-6, line-7}
    $4 (between $1 and $2): {(line-1, line-2) (line-4, line-2)}
    $5 (between $2 and $3): {(line-2, line-7)}
    $6 (between $3 and $1): {(line-6, line-4) (line-6, line-1)}

Figure 4.5.4. A network constructed from a schema description after several entries.

This is the classical format for the application of network consistency methods towards the reduction of the sets of possible bindings, and ultimately to determine instances of the description. Due to the design goals of the system, there are reasons why these methods may not be applied directly.

The central issue is a difference in approach. Network consistency methods rely on the availability of all information: the complete sets of possible variable bindings, and all relations among them. The spirit of this system is to reach some understanding after a minimum amount of feature extraction, in an incomplete knowledge situation, and in particular, to avoid the assumption of availability of relations among features in parallel over an image, an availability which has been demonstrated contrary to the operations of human vision (Treisman and Gelade, 1980).
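The network described in this subsection can be sketched as a small data structure. The Python below is a hedged illustration, not the Franz Lisp original: the class name `SchemaNetwork` and its methods are hypothetical. Only the object labels and the connect relations of "line-hand-1" are taken from the schema, and the relation tests themselves are assumed to have been carried out elsewhere.

```python
class SchemaNetwork:
    """Partial instantiations of one schema: node domains plus arc extensions."""

    def __init__(self, objects, relations):
        self.relations = dict(relations)                    # relation label -> (node, node)
        self.domains = {label: set() for label in objects}  # node label -> candidate features
        self.extensions = {label: set() for label in relations}

    def enter_feature(self, label, feature):
        """Record that `feature` has specified the role `label` in this schema."""
        self.domains[label].add(feature)

    def enter_relation(self, relation_label, first, second):
        """Record an established relation instance between two entered features."""
        self.extensions[relation_label].add((first, second))

# Object and relation labels of "line-hand-1" (attribute values omitted).
hand = SchemaNetwork(
    objects=["$1", "$2", "$3"],
    relations={"$4": ("$1", "$2"), "$5": ("$2", "$3"), "$6": ("$3", "$1")},
)
hand.enter_feature("$1", "line-1")  # the single entry of figure 4.5.3
after_first_entry = {label: set(domain) for label, domain in hand.domains.items()}

# Further entries leading towards the state of figure 4.5.4.
hand.enter_feature("$1", "line-4")
hand.enter_feature("$2", "line-2")
hand.enter_relation("$4", "line-1", "line-2")
hand.enter_relation("$4", "line-4", "line-2")
```

Keeping the extension on the arc, rather than recomputing the relation on demand, matches the description of relations being "entered as part of the extension of the arcs".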
For the example in figure 4.5.4, arc consistency would empty the domain for node "$3" on the first pass, which would propagate to empty all the domains. The next fixation, or even the next feature in the same fixation, might provide "line-3" as a possibility, with relations as shown in figure 4.5.5.

    $1: {line-1, line-4}
    $2: {line-1, line-2, line-5}
    $3: {line-6, line-7, line-3}
    $4 (between $1 and $2): {(line-1, line-2) (line-4, line-2)}
    $5 (between $2 and $3): {(line-2, line-7) (line-2, line-3)}
    $6 (between $3 and $1): {(line-6, line-4) (line-6, line-1) (line-3, line-1)}

Figure 4.5.5. A network constructed from a schema description after several entries.

The conclusion is that the networks will have to be examined as each new piece of information becomes available. One straightforward way to do this is to apply arc consistency over the network each time a new variable is entered. This will, unfortunately, result in the rediscovery of solutions returned at previous points in the interpretation.

One possible remedy would be to remove variable bindings once they take part in some solution over the network. This would introduce the unfavourable condition of not being able to deal with the "locally legal interpretations" issue as described earlier: removing a binding possibility excludes its involvement in other solutions. A better solution is to restrict the application of the consistency methods.
A temporary arc consistency is applied starting at the node for which a new entry is made. The set of domain variables for that node is confined to the single entry. The result is a list of all potential solutions which have not been previously returned. These solutions may, however, include domain variables which have been used in previous solutions for the same schema.

4.5.3. Incremental Consistency

The following is a formalization of this variation of arc consistency, which will be called incremental consistency. The formulation follows closely after that of the AC-2 algorithm for arc consistency as provided by Mackworth (1977b).

For each node i of the network, assume Fi to be the set of all features which express the potential to fulfill the schema role represented by the node i. We would like to know, at all times, the value of:

    (D1, ... , Dn)

where Di is a subset of Fi such that the elements of D1, ... , Dn are arc consistent. Define the neighbourhood of a node in the network:

    Qi = { j | Pij is required }

where Pij represents a required relation between nodes i and j. In the ongoing example from figure 4.5.5,

    Qi = {$4 $6}

For each x in Di define for each j in Qi

    Rijx = { y | Pij(x,y) }

This means that each relation is described as an extension, distributed over the elements which enter into the relation. For the example,

    R($3,$1,line-6) = {line-4, line-1}

Whenever a feature x specifies a role in i of the description, and should therefore be added to the network, we establish Rijx for each j in Qi.
Then we apply NEW(x,i), which returns the subset of all arc consistent bindings which have not been previously returned.

The newly entered feature will be called the originating value. The algorithm first sets up a temporary domain for the node at which the originating value is entered. This domain Di consists of that single value. The algorithm propagates outward from this node. As the propagation proceeds from node i to node j, if node j has not yet been visited then the working subset of variables Dj for that node is set to those values which meet the Pij and Pji relations with the values in Di. If the node has been visited before, Dj will be intersected with the set of those which meet the relations. If the node j is updated in either way, it is put onto the list (REM) of nodes from which the propagation must yet take place.

    procedure NEW(x,i)
      Di <- {x}
      Dk <- ∅ for all k ≠ i
      REM <- {i}
      while REM not empty do
      begin
        select and delete any i from REM
        for each j in Qi do
        begin
          Xj <- ∪ Rija over all a in Di
          if Dj = ∅ then
          begin
            Dj <- Xj
            REM <- REM ∪ {j}
          end
          else if Dj not a subset of Xj then
          begin
            Dj <- Xj ∩ Dj
            REM <- REM ∪ {j}
          end
          if Dj = ∅ then return nil
        end
      end
      return {D1, ... ,Dn}
    end NEW

Figure 4.5.6. Incremental Consistency Algorithm.

Note that the procedure will terminate and return "nil" if any of the originating value's required relations is not met for at least one value. Processing will only continue to the second iteration in the event that the originating value has each of its required relations fulfilled.
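A runnable rendering of the procedure in figure 4.5.6 may help in following the propagation. The Python below is a sketch under stated assumptions, not the original Franz Lisp code: the helper `index_relations` and the function name `new` are hypothetical, unvisited nodes are marked with `None` rather than the empty set, and the data is the network of figure 4.5.5.

```python
def index_relations(arcs, extensions):
    """Build neighbourhoods Qi and the distributed extensions Rijx."""
    neighbours, related = {}, {}
    for label, (a, b) in arcs.items():
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
        for x, y in extensions[label]:
            related.setdefault((a, b), {}).setdefault(x, set()).add(y)
            related.setdefault((b, a), {}).setdefault(y, set()).add(x)
    return neighbours, related

def new(x, i, neighbours, related):
    """Incremental consistency from originating value x entered at node i."""
    domains = {k: None for k in neighbours}  # None marks an unvisited node
    domains[i] = {x}
    remaining = [i]
    while remaining:
        current = remaining.pop()
        for j in neighbours[current]:
            candidates = set()
            for a in domains[current]:
                candidates |= related.get((current, j), {}).get(a, set())
            if domains[j] is None:
                domains[j] = candidates
                remaining.append(j)
            elif not domains[j] <= candidates:
                domains[j] &= candidates
                if j not in remaining:
                    remaining.append(j)
            if not domains[j]:
                return None  # a required relation cannot be met
    return domains

# Arcs of "line-hand-1" and the extensions shown in figure 4.5.5.
arcs = {"$4": ("$1", "$2"), "$5": ("$2", "$3"), "$6": ("$3", "$1")}
extensions = {
    "$4": {("line-1", "line-2"), ("line-4", "line-2")},
    "$5": {("line-2", "line-7"), ("line-2", "line-3")},
    "$6": {("line-6", "line-4"), ("line-6", "line-1"), ("line-3", "line-1")},
}
neighbours, related = index_relations(arcs, extensions)
solution = new("line-3", "$3", neighbours, related)
rejected = new("line-7", "$3", neighbours, related)
```

Entering "line-3" at "$3" leaves a single candidate at every node, whereas "line-7" is rejected because it has no "$6" relation with any candidate for "$1". The sketch assumes a connected network; a node unreachable from the originating node would keep the `None` marker.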
Once the originating value has each of its required relations fulfilled, it is likely that there will be a new solution over the network, and the instantiation sets are being reduced. If a single value results in each of the Di, then it is certain to be a solution, and if some Di results with more than one entry, a search is required to find the actual solution.

This constitutes the first phase of the model-based operation: simply run through the list of model possibilities for each feature, and enter the possibilities into the network for the appropriate schema description. Then run the incremental consistency algorithm, and all newly formed sets of binding candidates will be returned. As we shall see, there are important steps that must be taken upon finding such satisfied descriptions, but the basic idea is that the object supported by the description will itself be introduced into a network for the description of a higher level object, and so forth, until the process can no longer develop more complex objects.

4.5.4. Representing Relation Instances

We wish to avoid examining the conditions for the existence of a relation more than once for each pair of objects or features. If, for example, a "connect" relationship is found between two lines during consideration of their involvement in the description "line-hand-2", then we would like to retain information about their connection for examination in the event that these same two lines become candidates in some other schema description. For this reason, relation
For t h i s reason, r e l a t i o n 1 16 i n s t a n c e s have i d e n t i t i e s of t h e i r own, and c a r r y a t t r i b u t e values i n e x a c t l y the same way t h a t o b j e c t s do. Each o b j e c t has a s s o c i a t e d with i t a l i s t of r e l a t i o n i n s t a n c e s i n which i t i s known to take p a r t . In some cases attempts to e s t a b l i s h the r e l a t i o n i n s t a n c e s are made "on demand" during the course of the i n t e r p r e t a t i o n , and i n other cases, such as f o r l i n e connec-t i o n s , r e l a t i o n i n s t a n c e s are c o l l e c t e d w i t h i n the fovea e x h a u s t i v e l y with each f i x a t i o n , as i f they were themselves f e a t u r e s . As seen in s e c t i o n 2.2, each d e s c r i p t i o n of o b j e c t s has a s s o c i a t e d methods of developing a t t r i b u t e s . S i m i l a r l y , there are s p e c i f i c methods, of the same form, for the development of a t t r i b u t e s of r e l a t i o n s . F i g u r e 4.5.7 shows the methods f o r the development of a t t r i b u t e s f o r the "near" r e l a t i o n . The b i n d i n g l a b e l s "$1" and "$2" i n d i c a t e the two (scene) o b j e c t s which have been judged to be "near". (near ((ok <- (same ( s i d e $1) ( s i d e $2))) ( r a t i o <- (times 100. ( q u o t i e n t ( s i z e $1) ( s i z e $2)))) (angle-x <- ( d i f f (car (a3d $1)) (car (a3d $2)))) (angle-y <- ( d i f f (cadr (a3d $1)) (cadr (a3d $2)))) (angle-z <- ( d i f f (caddr (a3d $1)) (caddr (a3d $ 2 ) ) ) ) ) ) F i g u r e 4.5.7. Example of s p e c i f i c a t i o n s for the e v a l u a t i o n of a t t r i b u t e s f o r a r e l a t i o n . 
We have seen in section 4.2 that objects may have special attribute structures called "elaborations" which contain a number of alternative sets of values which could not be definitely determined at the time of completing its description. In this case, the attribute values for the relation will also be stored as elaborations, representing the possible combinations of the elaborations of the two objects entering into the relation. This gives the appearance of a combinatorial explosion in the number of elaborations, but the bulk of the elaborations never contribute to any more complex structure. This is because there are constraints on the attributes of the relation which are required by the descriptions that specify the more complex objects. For example, if a certain "hand" with two possible orientations is found to be "near" a lower arm with four possible orientations, then each of the attributes "angle-x", "angle-y", and "angle-z" will have eight possible orientations. But the description for the object "arm" specifies a tight range of possible values for these attributes, and hence any instance of "arm" will only retain a few possible elaborations for that relation.

4.6. Selecting Processing Locations

The previous sections have demonstrated how a parse tree might be developed out of available image features, such that the body structure and its attributes are represented.
Within a single fixation, however, there may not be enough information to develop a root node, "body". The nature of the knowledge representation, together with the status of the interpretation, provides an ideal means of intelligent selection of processing locations such that the entire interpretation can be effectively accomplished.

Before addressing this issue, we must consider what is meant by the term "interpretation". Interpretation might require that every line in the image be used in support of a complete parse tree for an object at the fine layer, with every attribute of each body part computed. In this case, the issue of selecting processing locations is not important. Since every location must be fixated foveally, a raster scan would be appropriate. On the other hand, interpretation might require only a complete body to be determined on the basis of coarse layer features, but the degree of uncertainty associated with the representation makes this alternative unattractive.

There is a compromise position which can be motivated by the phenomenology of vision. During the normal viewing of an object such as a bookcase, only a small portion of the field of view is available in fine detail, and so one might be absolutely sure that a few of the books were actually books. The books which are not seen foveally will likely conform to some coarser representation for books.
The coexistence of these two representations is adequate to permit the subjective experience of having seen all of the books in detail. It is quite unlikely that one would fixate on each book in a bookcase unless searching.

This idea can be expressed computationally within the body drawing interpretation system. For any coarse layer instance of an object, define a correspondence to be a fine layer object instance which meets the following criteria:

(1) The instances are related by specialization-generalization links.

(2) Attribute values of the two instances are similar, particularly the "size" attribute.

(3) The two instances have roughly the same location.

(4) The image construction which forms the basis of the fine layer object instance does not support any other (and different) object instances.

Of greater interest are the coarse layer object instances which do not have correspondences. We cannot be sure of the validity of these interpretations, yet there are still investigations that can be made. Suppose, for example, that the coarse layer object has a component description, and that one of its components has a correspondence. In this case, we say that the composed object instance has an inferred correspondence. We could expand this definition to include objects which have a component with an inferred correspondence also.
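The four criteria and the notion of an inferred correspondence can be sketched as a pair of predicates. The Python below is a hedged illustration, not the system's code: the `Instance` record, the `links` set of (fine type, coarse type) pairs, and the two tolerances are all assumptions introduced for the example, and only the "size" attribute is compared for criterion (2). The sample instances borrow their locations and sizes from the trace in chapter 5 purely as data; whether the system pairs these particular nodes is not stated here.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    kind: str
    location: tuple
    size: float
    support: str = ""                      # image construction (fine layer only)
    components: list = field(default_factory=list)


def is_correspondence(coarse, fine, links, fine_instances,
                      size_tol=0.25, loc_tol=30.0):
    # (1) the two types are joined by a specialization-generalization link
    if (fine.kind, coarse.kind) not in links:
        return False
    # (2) similar attribute values; only "size" is compared in this sketch
    if abs(coarse.size - fine.size) > size_tol * max(coarse.size, fine.size):
        return False
    # (3) roughly the same location
    if math.dist(coarse.location, fine.location) > loc_tol:
        return False
    # (4) the supporting image construction supports no different object
    return not any(other.support == fine.support and other.kind != fine.kind
                   for other in fine_instances)


def has_correspondence(coarse, links, fine_instances):
    return any(is_correspondence(coarse, f, links, fine_instances)
               for f in fine_instances)


def has_inferred_correspondence(coarse, links, fine_instances):
    # Expanded definition: a component has a correspondence, or itself
    # has an inferred correspondence.
    return any(has_correspondence(c, links, fine_instances)
               or has_inferred_correspondence(c, links, fine_instances)
               for c in coarse.components)


links = {("lower-leg", "lower-limb")}      # hypothetical link table
fine = [Instance("lower-leg-1", "lower-leg", (154, 370), 157,
                 support="line-lower-leg-1-1")]
lower_limb = Instance("lower-limb-4", "lower-limb", (156, 390), 128.87)
limb = Instance("limb-2", "limb", (138, 458), 296.41, components=[lower_limb])

print(has_correspondence(lower_limb, links, fine))
print(has_inferred_correspondence(limb, links, fine))
```

Under these assumed tolerances the coarse lower limb has a correspondence, so the limb composed from it has an inferred one even though no fine layer limb has been built.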
With these concepts we can define the default objectives of interpretation to be the development of an instance of the body based on the coarse layer grammar, which has an inferred correspondence in the fine layer objects. Other, more specific demands of a task could produce the requirement for fine layer information about some specific body part and thereby extend the objectives, but in the absence of such requests the default objectives are adequate to confirm the existence of a body.

It follows that processing may be directed to areas of the image which can permit the most rapid arrival at the objectives. This can be formulated as

(1) Foveal processing requirement: The locations of coarse layer objects are of interest for fixation if they compose some more complex object which has no correspondence at the fine layer.

This requirement pinpoints locations which have the greatest opportunity to propagate the certainty associated with foveal fixation out to the peripheral objects through correspondence. The body drawing interpretation system uses this rule as a means of selecting fixation locations. The next chapter includes examples of the operations of the selection process.

There may not always be the appropriate configuration for the application of the foveal requirement, so another rule must be formulated.
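Read literally, the foveal processing requirement is a filter over the coarse layer parse structure. The Python sketch below is one possible reading under stated assumptions: the dictionary layout, the function name `foveal_candidates`, and the `has_correspondence` predicate are hypothetical, and the choice of which limb already has a correspondence is invented for the example. The locations are those of the coarse nodes in the chapter 5 trace.

```python
def foveal_candidates(objects, has_correspondence):
    """Locations of the components of composed objects lacking a correspondence."""
    locations = []
    for name, obj in objects.items():
        if obj["components"] and not has_correspondence(name):
            for part in obj["components"]:
                loc = objects[part]["location"]
                if loc not in locations:
                    locations.append(loc)
    return locations


coarse = {
    "limb-1": {"location": (400, 279),
               "components": ["extremity-3", "lower-limb-1", "upper-limb-1"]},
    "extremity-3": {"location": (385, 70), "components": []},
    "lower-limb-1": {"location": (414, 192), "components": []},
    "upper-limb-1": {"location": (385, 398), "components": []},
    "limb-2": {"location": (138, 458),
               "components": ["extremity-4", "lower-limb-4", "upper-limb-2"]},
    "extremity-4": {"location": (175, 262), "components": []},
    "lower-limb-4": {"location": (156, 390), "components": []},
    "upper-limb-2": {"location": (248, 484), "components": []},
}
confirmed = {"limb-2"}                     # hypothetical: limb-2 is already confirmed
targets = foveal_candidates(coarse, lambda name: name in confirmed)
print(targets)
```

Only the parts of the unconfirmed limb are proposed for fixation. The rule as stated does not rank the candidates, so the sketch returns them in the order encountered.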
(2) Peripheral processing requirement: In the absence of foveal requirements, fixation locations should be selected to expand the area which is interpreted at the coarse layer.

In the body drawing system, this rule is implemented by making available a 32x32 grid over the image which indicates the amount of detail [26] in each grid square. Depending on the size of the peripheral radius, locations are selected which contain high detail such that the peripheral area will merge with the area already processed. This maximizes the chance of developing a foveal requirement, and at the same time works towards a complete coarse layer interpretation.

[26] This is simply a measure of the amount of line in the grid square.

5. Working Examples

This chapter demonstrates the operations of the computer implementation with the help of two examples. Outputs from computer runs are included, and are all prefixed with a vertical line to distinguish them from the annotations. The first example shows the processing taking place at a single fixation, with quite a wide field of view both peripherally and foveally. This example will be used to demonstrate the feature collection, feature-based model reduction, and the model invocation phases of interpretation. The second example shows a series of six fixation locations being selected by the system, and demonstrates the results at each step. This sequence is sufficient to demonstrate the location selection criteria and the integration across fixations.

5.1.
A Single Fixation

The body form drawing that will be used in these examples is shown in figure 5.1.1. For this example, the radius of peripheral vision has been set at 375 units across the entire image area of 1024x1024, while the fovea was chosen as 325. This large fovea is used in order that the interpretation processes can be shown to develop to the point of recognizing complex structures, without requiring refixation. Figures 5.1.2 and 5.1.3 indicate the areas that are included in the example.

Figure 5.1.1. The body form line drawing to be used as the example.

Figure 5.1.2. Area of available fine layer features in the single fixation at point (350,325).

Figure 5.1.3. Area of available coarse layer features in the single fixation at point (350,325).

The system is instructed to fixate at location (350,325). Next a list is produced of all the features which were collected within the fixation. Their properties are included in the list where known, and are otherwise nil. The number of possible roles that the feature may take in the interpretation is listed at the far right hand side under the heading "models".

| -> (seel 350 325)
| after feature collection at 350 325 (325/375)
| node      point1     point2     curve     models
| line-30   512 541    503 536    -8 1      82
| line-29   524 517    512 542    9 18      12
| line-28   535 571    524 517    36 44     12
| line-27   512 570    535 571    -8 1      82
| line-26   503 536    512 570    9 18      12
| line-25   524 579    nil nil    -8 1      82
| line-24   524 579    nil nil    54 61     10
| line-23   199 251    129 225    -8 1      82
| line-22   184 293    199 251    -8 1      82
| line-21   129 225    184 293    -8 1      82
| line-20   160 438    171 299    19 23     14
| line-19   117 447    160 438    -8 1      82
| line-18   171 299    117 447    -8 1      82
| line-17   121 456    309 514    36 44     12
| line-16   339 456    309 514    -8 1      82
| line-15   121 456    339 456    -8 1      82
| line-14   409 54     333 54     -8 1      82
| line-13   409 99     409 54     -8 1      82
| line-12   333 54     409 99     -8 1      82
| line-11   434 236    396 101    19 23     14
| line-10   396 259    434 236    -8 1      82
| line-9    396 101    396 259    -8 1      82
| line-8    405 272    353 490    46 53     12
| line-7    405 490    353 490    -8 1      82
| line-6    405 272    405 490    -8 1      82
| line-5    349 512    376 614    -8 1      82
| line-4    403 503    349 512    -8 1      82
| line-3    376 614    403 503    45 45     16
| line-2    nil nil    379 623    93 104    6
| line-1    nil nil    379 623    9 18      12
|
| node      center     longaxis            shortaxis           ratio      models
| blob-12   525 551    525 576 528 520     512 552 536 552     213 237    4
| blob-11   517 651    nil nil 528 592     504 648 528 648     nil nil    0
| blob-10   414 192    405 127 400 260     432 188 400 196     363 512    2
| blob-9    383 557    382 611 400 512     368 560 408 560     238 297    3
| blob-8    385 398    401 298 379 496     408 408 360 392     363 512    2
| blob-7    385 70     350 59 416 96       392 56 384 88       213 237    4
| blob-6    nil nil    391 653 nil nil     nil nil nil nil     nil nil    0
| blob-5    nil nil    288 672 nil nil     nil nil nil nil     nil nil    0
| blob-4    175 262    152 245 192 288     184 248 168 272     203 212    3
| blob-3    199 659    nil nil 256 664     200 648 nil nil     nil nil    0
| blob-2    248 484    152 469 340 464     252 464 244 512     363 512    2
| blob-1    156 390    171 328 124 448     176 396 144 388     363 512    2

Next the system goes through the steps to reduce the models using the feature-based operations as described in section 4.4. The three steps shown below correspond to the three consistency relations which are exploited. The step marked "2-level" indicates the interlevel consistency and the "C-filter" step is the consistency at junctions of lines. The printout shows the reduction in terms of the total number of models for the line features at each step in the process. Note that the image hierarchy information is not adequate for "line-16" to enter into the groupings of features, and hence its model possibilities are not reduced.

| before grouping 1690 models at line level
| after grouping 824 models at line level
| after 2-level 435 models at line level
| after C-filter 170 models at line level
| after feature-based model reduction
| node      point1     point2     curve     models
| line-30   512 541    503 536    -8 1      2
| line-29   524 517    512 542    9 18      2
| line-28   535 571    524 517    36 44     1
| line-27   512 570    535 571    -8 1      2
| line-26   503 536    512 570    9 18      2
| line-25   524 579    nil nil    -8 1      16
| line-24   524 579    nil nil    54 61     8
| line-23   199 251    129 225    -8 1      3
| line-22   184 293    199 251    -8 1      3
| line-21   129 225    184 293    -8 1      3
| line-20   160 438    171 299    19 23     1
| line-19   117 447    160 438    -8 1      2
| line-18   171 299    117 447    -8 1      2
| line-17   121 456    309 514    36 44     12
| line-16   339 456    309 514    -8 1      82
| line-15   121 456    339 456    -8 1      2
| line-14   409 54     333 54     -8 1      3
| line-13   409 99     409 54     -8 1      3
| line-12   333 54     409 99     -8 1      3
| line-11   434 236    396 101    19 23     1
| line-10   396 259    434 236    -8 1      2
| line-9    396 101    396 259    -8 1      2
| line-8    405 272    353 490    46 53     1
| line-7    405 490    353 490    -8 1      2
| line-6    405 272    405 490    -8 1      2
| line-5    349 512    376 614    -8 1      2
| line-4    403 503    349 512    -8 1      2
| line-3    376 614    403 503    45 45     1
| line-2    nil nil    379 623    93 104    2
| line-1    nil nil    379 623    9 18      1
|
| node      center     longaxis            shortaxis           ratio      mod
| blob-12   525 551    525 576 528 520     512 552 536 552     213 237
| blob-11   517 651    nil nil 528 592     504 648 528 648     nil nil
| blob-10   414 192    405 127 400 260     432 188 400 196     363 512
| blob-9    383 557    382 611 400 512     368 560 408 560     238 297
| blob-8    385 398    401 298 379 496     408 408 360 392     363 512
| blob-7    385 70     350 59 416 96       392 56 384 88       213 237
| blob-6    nil nil    391 653 nil nil     nil nil nil nil     nil nil
| blob-5    nil nil    288 672 nil nil     nil nil nil nil     nil nil
| blob-4    175 262    152 245 192 288     184 248 168 272     203 212
| blob-3    199 659    nil nil 256 664     200 648 nil nil     nil nil
| blob-2    248 484    152 469 340 464     252 464 244 512     363 512
| blob-1    156 390    171 328 124 448     176 396 144 388     363 512

The system now goes through all the known coarse layer features and attempts the model-based operations as described in section 4.5. Not all of the blob features have enough known about their attributes to suggest possible models. First, the feature "blob-12" is considered as a possible binding for the "$1" role in the "image1" description of "extremity".
The full description is as follows:

    (extremity nil
      (image1
        (($1 blob (ratio (200 300))))
        nil
        ((ends <- (list (pt11 $1) (pt21 $1)))
         (location <- (cofg $1))
         (size <- (lengthb $1))
         (delta <- (times .25 (lengthb $1)))
        ))

There is only one required object for this description, and as is shown below, the requirement on its "ratio" property is met, so the description's requirements are fulfilled, and a new node "extremity-1" is constructed, and its property values generated as specified in the description out of the properties of the line feature.

| attempting solutions for blob-12 as (extremity image1 $1)
| master-role list for blob-12 in (extremity image1 $1)
| (nil $1 (nil blob-12 (nil)))
| attempt to elaborate extremity from
| (($1 . blob-12))
| verifying ($1 blob-12) with ((ratio (200 300)))
| found= (213 237)
| node:extremity-1
|   type          extremity
|   description   image1
|   bindings      (($1 blob-12))
|   ends          ((525 . 576) (528 . 520))
|   location      (525 . 551)
|   size          56.08
|   delta         14.02

Similarly, possible model roles are considered for the other blob features, resulting in the generation of the following nodes:

| node:lower-limb-1
|   type          lower-limb
|   description   image
|   bindings      (($1 blob-10))
|   ends          ((405 . 127) (400 . 260))
|   location      (414 . 192)
|   size          133.09
|   delta         26.61
| node:central-body-1
|   type          central-body
|   description   image1
|   bindings      (($1 blob-9))
|   ends          ((382 . 611) (400 . 512))
|   location      (383 . 557)
|   size          100.62
|   delta         35.21
| node:extremity-2
|   type          extremity
|   description   image1
|   bindings      (($1 blob-9))
|   ends          ((382 . 611) (400 . 512))
|   location      (383 . 557)
|   size          100.62
|   delta         25.15
| node:upper-limb-1
|   description   image
|   bindings      (($1 blob-8))
|   ends          ((401 . 298) (379 . 496))
|   location      (385 . 398)
|   size          199.21
|   delta         39.84

Up to this point, an attempt has been made to enter each of the newly established nodes into a model role for some more complex construction, such as "limb". All of these attempts so far have failed to inspire any close examination because the relations required to satisfy such models have not been established. The node "lower-limb-1" does, however, meet a relation with another node, and so the following description for "limb" is considered.

    (limb nil
      (component
        (($1 extremity) ($2 lower-limb) ($3 upper-limb))
        (($4 b-connect ($1 $2 nil nil) (ratio (25 60)))
         ($5 b-connect ($2 $3 nil nil) (ratio (60 80))))
        ((proximal-end <- (free2 $5))
         (distal-end <- (free1 $4))
         (location <- (location $5))
         (delta <- (delta $3))
         (size <- (times 2.3 (size $2)))))
      (specialization
        (($1 right-arm) ($2 right-leg) ($3 left-arm) ($4 left-leg))
        nil nil)
      ))

In the examination of the possible fulfilled models, it is noted that the node being considered, "upper-limb-1", has all of the relations which are required of it in the description, and so the Incremental Consistency algorithm described in section 4.5 attempts to return a list of potential bindings, and fails because no appropriate "extremity" has been encountered yet.
| **** relation b-connect-1 established between
| lower-limb-1 upper-limb-1
| attempting solutions for upper-limb-1 as (limb component $3)
| rem= ($2)
| dj= (lower-limb-1)
| xj= (lower-limb-1)
| dlist= (nil $3 (upper-limb-1) $2 (lower-limb-1))
| rem= ($1)
| dj= nil
| xj= nil
| dlist= (nil $3 (upper-limb-1) $2 (lower-limb-1) $1 nil)
| solutions not found

As more blob features are considered, more nodes are generated. When the next "lower-limb" is encountered, another attempt is made to establish a node for "limb". The Incremental Consistency algorithm actually returns a binding list as a potential solution, but a subsequent examination discovers that the upper and lower parts of the potential limb are supported by the same image construction ("blob-8") and so the node is not generated.

| node:lower-limb-2
|   type          lower-limb
|   description   image
|   bindings      (($1 blob-8))
|   ends          ((401 . 298) (379 . 496))
|   location      (385 . 398)
|   size          199.21
|   delta         39.84
| **** relation b-connect-2 established between
| extremity-2 lower-limb-2
| **** relation b-connect-3 established between
| lower-limb-2 upper-limb-1
| attempting solutions for lower-limb-2 as (limb component $2)
| rem= ($1)
| dj= (lower-limb-2)
| xj= (lower-limb-1 lower-limb-2)
| dlist= (nil $2 (lower-limb-2) $1 (extremity-2) $3 (upper-limb-1))
| rem= nil
| dj= (lower-limb-2)
| xj= (lower-limb-2)
| dlist= (nil $2 (lower-limb-2) $1 (extremity-2) $3 (upper-limb-1))
| solutions returned
| ((($2 . lower-limb-2) ($1 . extremity-2) ($3 . upper-limb-1)))
| master-role list for lower-limb-2 in (limb component $2)
| (nil $1 (nil extremity-1 (nil)
|              extremity-2 (nil $4 (lower-limb-2)))
|      $2 (nil lower-limb-1 (nil $5 (upper-limb-1))
|              lower-limb-2 (nil $4 (extremity-2) $5 (upper-limb-1)))
|      $3 (nil upper-limb-1 (nil $5 (lower-limb-2 lower-limb-1))))
| attempt to elaborate limb from
| (($2 . lower-limb-2) ($1 . extremity-2) ($3 . upper-limb-1))
| not-unique

The system proceeds to consider the remaining blob features, and generates more nodes:

| node:central-body-2
|   type          central-body
|   description   image2
|   bindings      (($1 blob-7))
|   ends          ((385 . 70))
|   location      (385 . 70)
|   size          75.66
|   delta         37.83
| node:central-body-3
|   type          central-body
|   description   image1
|   bindings      (($1 blob-7))
|   ends          ((350 . 59) (416 . 96))
|   location      (385 . 70)
|   size          75.66
|   delta         26.48
| node:lower-limb-3
|   type          lower-limb
|   description   image
|   bindings      (($1 blob-7))
|   ends          ((350 . 59) (416 . 96))
|   location      (385 . 70)
|   size          75.66
|   delta         15.13
| node:extremity-3
|   type          extremity
|   description   image1
|   bindings      (($1 blob-7))
|   ends          ((350 . 59) (416 . 96))
|   location      (385 . 70)
|   size          75.66
|   delta         18.91
| **** relation b-connect-4 established between
| extremity-3 lower-limb-1
| **** relation b-connect-5 established between
| extremity-3 lower-limb-3

As new relations are found for the node "extremity-3", another attempt is made to find a solution from among the possible bindings for the components of "limb". This time all the requirements are met.

| attempt to elaborate limb from
| (($1 . extremity-3) ($2 . lower-limb-1) ($3 . upper-limb-1))
| verifying ($1 extremity-3) with nil
| verifying ($2 lower-limb-1) with nil
| verifying ($3 upper-limb-1) with nil
| **** relation b-connect-6 established between
| extremity-3 lower-limb-1
| verifying ($4 b-connect-6) with ((ratio (25 60)))
| found= 56
| **** relation b-connect-7 established between
| lower-limb-1 upper-limb-1
| verifying ($5 b-connect-7) with ((ratio (60 80)))
| found= 66
| node:limb-1
|   type          limb
|   description   component
|   bindings      ($5 b-connect-7) ($4 b-connect-6)
|                 ($1 extremity-3) ($2 lower-limb-1)
|                 ($3 upper-limb-1)
|   proximal-end  (379 . 496)
|   distal-end    (350 . 59)
|   location      (400 . 279)
|   delta         39.84
|   size          306.11
| **** relation b-connect-8 established between
| limb-1 central-body-1

The current status of the interpretation is summarized in the partial parse tree shown below:

| limb-1
|   extremity-3 (blob-7)
|   lower-limb-1 (blob-10)
|   upper-limb-1 (blob-8)

The process continues, finding more basic coarse level body parts until a second limb is detected.

| node:central-body-4
|   type          central-body
|   description   image2
|   bindings      (($1 blob-4))
|   ends          ((175 . 262))
|   location      (175 . 262)
|   size          58.72
|   delta         29.36
| node:central-body-5
|   type          central-body
|   description   image1
|   bindings      (($1 blob-4))
|   ends          ((152 . 245) (192 . 288))
|   location      (175 . 262)
|   size          58.72
|   delta         20.55
| node:extremity-4
|   type          extremity
|   description   image1
|   bindings      (($1 blob-4))
|   ends          ((152 . 245) (192 . 288))
|   location      (175 . 262)
|   size          58.72
|   delta         14.68
| node:upper-limb-2
|   type          upper-limb
|   description   image
|   bindings      (($1 blob-2))
|   ends          ((152 . 469) (340 . 464))
|   location      (248 . 484)
|   size          188.06
|   delta         37.61
| node:lower-limb-4
|   type          lower-limb
|   description   image
|   bindings      (($1 blob-1))
|   ends          ((171 . 328) (124 . 448))
|   location      (156 . 390)
|   size          128.87
|   delta         25.77
| node:limb-2
|   type          limb
|   description   component
|   bindings      ($5 b-connect-16) ($4 b-connect-15)
|                 ($2 lower-limb-4) ($1 extremity-4)
|                 ($3 upper-limb-2)
|   proximal-end  (340 . 464)
|   distal-end    (152 . 245)
|   location      (138 . 458)
|   delta         37.61
|   size          296.41

Now that a second limb has been established, the description for the "body-half" is satisfied. The actual schema description is provided below:

    (body-half nil
      (component
        (($1 limb) ($2 limb) ($3 central-body))
        (($4 b-connect ($1 $3 proximal-end nil) (ratio (150 450)))
         ($5 b-connect ($2 $3 proximal-end nil) (ratio (150 450))))
        ((head-end <- (midpoint (proximal-end $1) (proximal-end $2)))
         (center-end <- (free2 $4))
         (location <- (location $3))
         (delta <- (delta $3))
         (size <- (plus (size $1) (size $3)))))
      (specialization
        (($1 upper-body) ($2 lower-body))
        nil nil)
      ))

| node:body-half-1
|   type          body-half
|   description   component
|   bindings      ($5 b-connect-19) ($4 b-connect-18)
|                 ($2 limb-2) ($3 central-body-1) ($1 limb-1)
|   head-end      (359 . 480)
|   center-end    (382 . 611)
|   location      (383 . 557)
|   delta         35.21
|   size          406.73
| after coarse models invoked

The "body-half" was the largest structure which could be supported in the context of the limited diameter of peripheral features. To this point, each of the model possibilities for the blob features has been entered into the interpretation process, and so now the fovea is processed.
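The two attempts to build a "limb" during the coarse layer pass can be summarized in a short sketch. The Python below is an illustration under assumptions, not the Incremental Consistency algorithm itself: the function name `elaborate_limb`, the `support` field, and the order of the checks are hypothetical. The ratio ranges and the factor 2.3 come from the "limb" description, and the node values come from the trace.

```python
# Required "ratio" ranges for the two b-connect relations of "limb".
LIMB_RATIO_RANGES = {"$4": (25, 60), "$5": (60, 80)}


def elaborate_limb(extremity, lower, upper, ratios):
    """Try to build a limb node from three bound components."""
    parts = (extremity, lower, upper)
    # Reject bindings in which two components rest on the same image construction.
    supports = [part["support"] for part in parts]
    if len(set(supports)) < len(supports):
        return "not-unique"
    # Verify the "ratio" requirement on each connecting relation.
    for label, (low, high) in LIMB_RATIO_RANGES.items():
        if not low <= ratios[label] <= high:
            return None
    return {"type": "limb",
            "delta": upper["delta"],                  # (delta <- (delta $3))
            "size": round(2.3 * lower["size"], 2)}    # (size <- (times 2.3 (size $2)))


extremity_2 = {"support": "blob-9", "size": 100.62, "delta": 25.15}
extremity_3 = {"support": "blob-7", "size": 75.66, "delta": 18.91}
lower_limb_1 = {"support": "blob-10", "size": 133.09, "delta": 26.61}
lower_limb_2 = {"support": "blob-8", "size": 199.21, "delta": 39.84}
upper_limb_1 = {"support": "blob-8", "size": 199.21, "delta": 39.84}

# First attempt: lower and upper parts both rest on blob-8.
first = elaborate_limb(extremity_2, lower_limb_2, upper_limb_1, ratios={})
# Second attempt: distinct supports, relation ratios 56 and 66 as in the trace.
second = elaborate_limb(extremity_3, lower_limb_1, upper_limb_1,
                        ratios={"$4": 56, "$5": 66})
print(first, second)
```

The first call is rejected as "not-unique", matching the trace, and the second yields the size 306.11 and delta 39.84 recorded for "limb-1".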
The parse tree for the body-half is shown below:

| body-half-1
|   limb-1
|     extremity-3 (blob-7)
|     lower-limb-1 (blob-10)
|     upper-limb-1 (blob-8)
|   limb-2
|     extremity-4 (blob-4)
|     lower-limb-4 (blob-1)
|     upper-limb-2 (blob-2)
|   central-body-1 (blob-9)

The operations for the fovea are identical to those for the periphery, using the same routines. At some point a node is generated for "line-foot-1".

| node:line-foot-1-1
|   type          line-foot-1
|   description   component
|   bindings      ($6 connect-7) ($5 connect-8)
|                 ($4 connect-9) ($1 line-21)
|                 ($2 line-22) ($3 line-23)
|   a2d           20
|   proximal-end  (184 . 293)
|   location      (160 . 248)
|   size          87

Working in a strict bottom-up fashion, the system recognizes that the scene structure "foot" can be supported by the image construction "line-foot-1". At this point the system cannot know whether it will be a right or left foot, and so both possibilities are retained as the elaborations of the node for the foot.

| attempt to elaborate foot from
| (($1 . line-foot-1-1))
| verifying ($1 line-foot-1-1) with nil
| (($1 line-foot-1-1))
| node:foot-1
|   type          foot
|   description   image1
|   bindings      (($1 line-foot-1-1))
|   proximal-end  (184 . 293)
|   size          87
|   location      (160 . 248)
|   a3d           (-20 90 0)
|   extra         (E00007 E00008)
| elaboration:E00007
|   side          right
| elaboration:E00008
|   side          left

The "line-foot-1" construction is shown below. If it is the left foot, then the outside is facing the viewer, and if it is the right, then the inside faces the viewer.
In either case, the three dimensional rotation from the rest position is the same, so the attribute "a3d" (three-dimensional orientation) does not appear in the elaboration, but rather in the main node. An image construction for "line-lower-leg-1" is developed next, which in turn prompts the generation of a node for "lower-leg".

| node:line-lower-leg-1-1
|   type          line-lower-leg-1
|   description   component
|   bindings      ($6 connect-10) ($5 connect-11)
|                 ($4 connect-12) ($1 line-18)
|                 ($2 line-19) ($3 line-20)
|   size          157
|   a2d           20
|   proximal-end  (117 . 447)
|   location      (154 . 370)
|   distal-end    (171 . 299)
| attempt to elaborate lower-leg from
| (($1 . line-lower-leg-1-1))
| verifying ($1 line-lower-leg-1-1) with nil
| (($1 line-lower-leg-1-1))
| node:lower-leg-1
|   type          lower-leg
|   description   image1
|   bindings      (($1 line-lower-leg-1-1))
|   size          157
|   location      (154 . 370)
|   proximal-end  (117 . 447)
|   distal-end    (171 . 299)
|   extra         (E00009 E00010 E00011 E00012)
| elaboration:E00009
|   side          left
|   a3d           (0 0 -20)
| elaboration:E00010
|   side          left
|   a3d           (-20 90 0)
| elaboration:E00011
|   side          right
|   a3d           (-20 90 0)
| elaboration:E00012
|   side          right
|   a3d           (0 180 20)

Again the sides are kept in the elaborations, but in this case, different three-dimensional orientations are also possible. The image construction is shown below:

The bulged side may either be the outside or the back of the lower leg, producing different orientations relative to the rest position.
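The difference between the "foot-1" and "lower-leg-1" nodes is which attributes all alternatives agree on. The Python sketch below shows one way to arrive at that layout. It is an assumption about the resulting structure only, not a description of how the system builds its elaborations, and the name `factor_alternatives` is hypothetical.

```python
def factor_alternatives(alternatives):
    """Split alternative attribute sets into shared attributes and elaborations."""
    first = alternatives[0]
    # An attribute is shared when every alternative gives it the same value.
    shared = {key: value for key, value in first.items()
              if all(alt[key] == value for alt in alternatives)}
    # What remains of each alternative becomes one elaboration.
    elaborations = [{k: v for k, v in alt.items() if k not in shared}
                    for alt in alternatives]
    return shared, elaborations


foot = [
    {"side": "right", "a3d": (-20, 90, 0)},
    {"side": "left", "a3d": (-20, 90, 0)},
]
lower_leg = [
    {"side": "left", "a3d": (0, 0, -20)},
    {"side": "left", "a3d": (-20, 90, 0)},
    {"side": "right", "a3d": (-20, 90, 0)},
    {"side": "right", "a3d": (0, 180, 20)},
]
foot_shared, foot_elabs = factor_alternatives(foot)
leg_shared, leg_elabs = factor_alternatives(lower_leg)
print(foot_shared, foot_elabs)
print(leg_shared, leg_elabs)
```

For the foot, "a3d" is shared and only "side" varies, as in "foot-1". For the lower leg nothing is shared, so both "side" and "a3d" stay in the four elaborations.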
Similarly, "upper-leg-1" is eventually generated:

| node:upper-leg-1
|   type          upper-leg
|   description   image4
|   bindings      (($1 line-upper-leg-4-1))
|   size          218
|   location      (222 . 470)
|   proximal-end  (339 . 456)
|   distal-end    (121 . 456)
|   extra         (E00017 E00018 E00019 E00020)
| elaboration:E00017
|   side          left
|   a3d           (0 180 -90)
| elaboration:E00018
|   side          left
|   a3d           (90 90 0)
| elaboration:E00019
|   side          right
|   a3d           (0 0 90)
| elaboration:E00020
|   side          right
|   a3d           (90 90 0)

Connections are made between the body parts in the form of "near" relations. As discussed in section 4.5, these relations take on attribute value pairs in much the same way as do the interpretation nodes for objects. The method for the development of attribute values for the "near" relation is shown below:

    ((ok <- (same (side $1) (side $2)))
     (ratio <- (getratiox (size $1) (size $2)))
     (angle-x <- (diff (car (a3d $1)) (car (a3d $2))))
     (angle-y <- (diff (cadr (a3d $1)) (cadr (a3d $2))))
     (angle-z <- (diff (caddr (a3d $1)) (caddr (a3d $2))))))

The result of establishing that the upper and lower leg are "near" is shown in the following printout. Because the body parts have several values for their orientations ("a3d"), the attribute values of the "near-2" relation have elaborations (prefaced with the letter "X") to store the possibilities.

| near-2 (type near
|         location (119 . 451)
|         args (lower-leg-1 upper-leg-1 (117 . 447) (121 . 456))
|         ratio 72
|         extra (X00021 X00022 X00023 X00024 ...))
| X00021 (xargs (($2 . E00017) ($1 . E00009))
|         angle-z 70
|         angle-y -180
|         angle-x 0
|         ok left)
| X00022 (xargs (($2 . E00017) ($1 . E00010))
|         angle-z 90
|         angle-y -90
|         angle-x -20
|         ok left)
| X00023 (xargs (($2 . E00018) ($1 . E00009))
|         angle-z -20
|         angle-y -90
|         angle-x -90
|         ok left)
| X00024 (xargs (($2 . E00018) ($1 . E00010))
|         angle-z 0
|         angle-y 0
|         angle-x -110
|         ok left)
| X00027 (xargs (($2 . E00020) ($1 . E00011))
|         angle-z 0
|         angle-y 0
|         angle-x -110
|         ok right)

The system attempts to establish a "leg" on the basis of these body parts. During the process of testing the schema for "left-leg", the requirements for some of the component objects are found not to hold, and so some of the possible elaborations for the body parts are eliminated from further consideration in this context.

| attempt to elaborate left-leg from
| (($3 . upper-leg-1) ($2 . lower-leg-1) ($1 . foot-1))
| verifying ($1 foot-1 E00007 E00008) with ((side left))
| found= nil
| req= (side left)
| trying elaboration: E00007
| found=right
| elaboration deleted
| trying elaboration: E00008
| found=left

Similarly, the required properties of the "near" relations are examined, and candidate elaborations for the relation nodes are eliminated.

| verifying ($5 near-2 X00021 X00022 X00023 X00024 ...) with
| ((angle-x (-145 10)) (angle-y (0 0)) (angle-z (0 0)) (ratio 72))
| req= (angle-y (0 0))
| trying elaboration: X00021
| found=-180
| elaboration deleted
| trying elaboration: X00022
| found=-90
| elaboration deleted
| trying elaboration: X00023
| found=-90
| elaboration deleted
| trying elaboration: X00024
| found=0

The result is that not many of the elaboration possibilities remain valid in the context of a "left-leg".
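The deletions in the trace above amount to filtering each node's elaborations against the requirements of the schema being tested. The following Python sketch reproduces that filtering for the values shown. The names `satisfies` and `prune` and the dictionary layout are assumptions, only the elaborations that appear in the printout are included, and the "ratio" requirement is left out because "ratio" is stored on the main "near-2" node rather than in its elaborations.

```python
def satisfies(value, requirement):
    """A requirement is either a (low, high) range or a single required value."""
    if isinstance(requirement, tuple):
        low, high = requirement
        return low <= value <= high
    return value == requirement


def prune(elaborations, requirements):
    """Keep only the elaborations that meet every requirement."""
    kept = {}
    for name, attrs in elaborations.items():
        if all(satisfies(attrs[key], req) for key, req in requirements.items()):
            kept[name] = attrs
    return kept


foot_1 = {"E00007": {"side": "right"}, "E00008": {"side": "left"}}
near_2 = {
    "X00021": {"angle-x": 0, "angle-y": -180, "angle-z": 70},
    "X00022": {"angle-x": -20, "angle-y": -90, "angle-z": 90},
    "X00023": {"angle-x": -90, "angle-y": -90, "angle-z": -20},
    "X00024": {"angle-x": -110, "angle-y": 0, "angle-z": 0},
    "X00027": {"angle-x": -110, "angle-y": 0, "angle-z": 0},
}

feet = prune(foot_1, {"side": "left"})
angles = prune(near_2, {"angle-x": (-145, 10),
                        "angle-y": (0, 0),
                        "angle-z": (0, 0)})
print(list(feet), list(angles))
```

Only E00008 survives for the foot. Of the listed "near-2" elaborations, X00024 and X00027 survive the angle requirements, since the angle ranges alone do not distinguish left from right.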
The bindings show all of the local possibilities, which are then examined for compatibility. The result is a single possible value for the orientation of the leg. The constraints on the allowable angles at the connections of the components of the leg have pruned the elaborations.

node:left-leg-1
    type            left-leg
    description     component
    bindings        ($5 near-2 X00024 X00027)
                    ($4 near-1 X00014 X00015)
                    ($3 upper-leg-1 E00017 E00018)
                    ($2 lower-leg-1 E00009 E00010)
                    ($1 foot-1 E00008)
    proximal-end    (339 . 456)
    size            392.5
    knee-location   (121 . 456)
    location        (119 . 451)
    foot-base       (-20 0)
    extra           (E00029)
elaboration:E00029
    xargs           ($5 . X00024) ($4 . X00014) ($3 . E00018) ($2 . E00010) ($1 . E00008)
    foot-posture    0
    knee-posture    110
    a3d             (90 90 0)

left-leg-1
    foot-2
        line-foot-1-1 (line-21 line-22 line-23)
    lower-leg-1
        line-lower-leg-1-1 (line-18 line-19 line-20)
    upper-leg-1
        line-upper-leg-4-1 (line-15 line-17 line-16)

This same group of body parts is then also developed into a "right-leg" node. Both possibilities are valid, but with different values for the elaborations of the component parts.

node:right-leg-1
    type            right-leg
    description     component
    bindings        ($5 near-2 X00024 X00027)
                    ($4 near-1 X00014 X00015)
                    ($3 upper-leg-1 E00019 E00020)
                    ($2 lower-leg-1 E00011 E00012)
                    ($1 foot-1 E00007)
    proximal-end    (339 . 456)
    size            392.5
    knee-location   (121 . 456)
    location        (119 . 451)
    foot-base       (-20 90)
    extra           (E00030)
elaboration:E00030
    xargs           ($5 . X00027) ($4 . X00015) ($3 . E00020) ($2 . E00011) ($1 .
E00007)
    foot-posture    0
    knee-posture    110
    a3d             (90 90 0)

This process continues until the other leg in the image is established, along with the "hips", and then a node is created for the entire "lower-body". In fact, two such "lower-body" nodes are supported in the image, with different interpretations for the sides of the legs.

node:lower-body-1
    type            lower-body
    description     component
    bindings        ($5 near-8)
                    ($4 near-5)
                    ($3 hips-1)
                    ($1 right-leg-1 E00030)
                    ($2 left-leg-2 E00053)
    top             (376 . 614)
    a3d             (10 90 0)
    location        (369 . 535)
    size            506.5

node:lower-body-2
    type            lower-body
    description     component
    bindings        ($5 near-7)
                    ($4 near-6)
                    ($3 hips-1)
                    ($1 right-leg-2 E00054)
                    ($2 left-leg-1 E00029)
    top             (376 . 614)
    a3d             (10 90 0)
    location        (369 . 535)
    size            509.0

At this point, the interpretation has gone as far as it can within the limited diameter of available features as defined by the foveal and peripheral radii.
The complete parse tree to this point is:

lower-body-2
    right-leg-2
        foot-4
            line-foot-1-2 (line-12 line-13 line-14)
        lower-leg-2
            line-lower-leg-1-2 (line-9 line-10 line-11)
        upper-leg-2
            line-upper-leg-6-1 (line-8 line-7 line-6)
    left-leg-1
        foot-2
            line-foot-1-1 (line-21 line-22 line-23)
        lower-leg-1
            line-lower-leg-1-1 (line-18 line-19 line-20)
        upper-leg-1
            line-upper-leg-4-1 (line-15 line-17 line-16)
    hips-1
        line-hips-2-1 (line-3 line-4 line-5)
lower-body-1
    right-leg-1
        foot-2
            line-foot-1-1 (line-21 line-22 line-23)
        lower-leg-1
            line-lower-leg-1-1 (line-18 line-19 line-20)
        upper-leg-1
            line-upper-leg-4-1 (line-15 line-17 line-16)
    left-leg-2
        foot-4
            line-foot-1-2 (line-12 line-13 line-14)
        lower-leg-2
            line-lower-leg-1-2 (line-9 line-10 line-11)
        upper-leg-2
            line-upper-leg-6-1 (line-8 line-7 line-6)
    hips-1
        line-hips-2-1 (line-3 line-4 line-5)

At the line level, the interpretation is correct for 50% of the hypothesised constructions. Those which were incorrect could not take part in some larger structure and so do not appear in the parse tree.

In order to provide some idea of the amount of time taken by the interpretation processes, this same example was processed with peripheral and foveal diameters which covered the entire image. The CPU seconds taken on a VAX-11/780 are shown beside each step.
    fixation (feature collection)                 16 seconds
    feature based model possibility reduction    125 seconds
    blob based interpretation                     42 seconds
    line based interpretation                     82 seconds
    total                                        265 seconds

The entire system is written in Franz Lisp, including the mathematical operations, and it runs interpretively.

5.2. Example with Multiple Fixations

This second example differs from the previous one in that the diameters of available features are smaller, and several fixation locations are selected by the system in order to accomplish the interpretation through the propagation of the fine layer results into the periphery. The system attempts to select locations which will facilitate this propagation.

The foveal radius has been reduced to only 125 units, and the periphery reduced to 250. The location of the first fixation is arbitrarily chosen to be (449 192). Figure 5.2.1 shows the areas processed. After the usual collection of features, and the feature based operations, a number of coarse body parts are supported, as shown below.
-> (see 449 192)
after feature collection at 449 192 (125/250)

    node     point1      point2      curve models
    line-7   409  99     nil nil     -8  1 82
    line-6   nil nil     409  99     -8  1 82
    line-5   434 236     396 101     19 23 14
    line-4   396 259     434 236     -8  1 82
    line-3   396 101     396 259     -8  1 82
    line-2   405 272     nil nil     46 53 12
    line-1   405 272     nil nil     -8  1 82

    node     center      longaxis            shortaxis           ratio models
    blob-3   414 192     405 127 400 260     432 188 400 196     363 512 2
    blob-2   385 398     401 298 nil nil     408 408 360 392     nil nil 0
    blob-1   385  70     350  59 416  96     392  56 384  88     213 237 4

before grouping 436 models at line level
after grouping  242 models at line level
after 2-level   146 models at line level
after C-filter   89 models at line level

Figure 5.2.1. The first fixation (at location 449 192). The small squares indicate periphery, and the large squares show unprocessed areas.

node:lower-limb-1
    type            lower-limb
    description     image
    bindings        (($1 blob-3))
    ends            ((405 . 127) (400 . 260))
    location        (414 . 192)
    size            133.09
    delta           26.61

node:central-body-1
    type            central-body
    description     image2
    bindings        (($1 blob-1))
    ends            ((385 . 70))
    location        (385 . 70)
    size            75.66
    delta           37.83

node:central-body-2
    type            central-body
    description     image1
    bindings        (($1 blob-1))
    ends            ((350 . 59) (416 . 96))
    location        (385 . 70)
    size            75.66
    delta           26.48

node:lower-limb-2
    type            lower-limb
    description     image
    bindings        (($1 blob-1))
    ends            ((350 . 59) (416 . 96))
    location        (385 . 70)
    size            75.66
    delta           15.13

node:extremity-1
    type            extremity
    description     image1
    bindings        (($1 blob-1))
    ends            ((350 . 59) (416 . 96))
    location        (385 . 70)
    size            75.66
    delta           18.91

At the line level, there is one hypothesized object.
node:line-lower-leg-1-1
    type            line-lower-leg-1
    description     component
    bindings        ($6 connect-2)
                    ($5 connect-3)
                    ($4 connect-4)
                    ($1 line-3)
                    ($2 line-4)
                    ($3 line-5)
    size            158
    a2d             0
    proximal-end    (396 . 259)
    location        (405 . 174)
    distal-end      (396 . 101)

node:lower-leg-1
    type            lower-leg
    description     image1
    bindings        (($1 line-lower-leg-1-1))
    size            158
    location        (405 . 174)
    proximal-end    (396 . 259)
    distal-end      (396 . 101)
    extra           (E00005 E00006 E00007 E00008)
elaboration:E00005
    side            left
    a3d             (0 0 0)
elaboration:E00006
    side            left
    a3d             (0 90 0)
elaboration:E00007
    side            right
    a3d             (0 90 0)
elaboration:E00008
    side            right
    a3d             (0 180 0)

To this point, there is no construction known at the coarse level which has components and yet does not have a corresponding model at the line level. Thus it is not possible to use the foveal requirement for selecting the processing location (as described in section 4.6). As a result, the peripheral requirement is used. An area of the image which has a lot of detail, as measured by the density of lines, is selected such that peripheral processing of the new location will merge with the existing periphery.

Figure 5.2.2. The second fixation at 162 448. The previously processed area is also shown.

located: lower-limb-2 nil at ((385 . 70) (350 . 59) (416 . 96))
located: lower-limb-1 as (lower-leg)
next location selected as 162 448 peripheral

Processing at the second location (shown in figure 5.2.2) results in a number of models at the coarse level, the most interesting of which is the "limb-1" construction.
after feature collection at 162 448 (125/250)
before grouping 361 models at line level
after grouping  167 models at line level
after 2-level   117 models at line level
after C-filter  117 models at line level

node:limb-1
    type            limb
    description     component
    bindings        ($5 b-connect-7)
                    ($4 b-connect-6)
                    ($2 lower-limb-4)
                    ($1 extremity-2)
                    ($3 upper-limb-2)
    proximal-end    (340 . 464)
    distal-end      (152 . 245)
    location        (138 . 458)
    delta           37.61
    size            296.41

No line level object is developed which can correspond to any of the components of this limb, and so the basic criterion for the foveal requirement is met. The location of one of the components is chosen as the next fixation, as shown in figure 5.2.3.

located: limb-1 uninstantiated
next location selected as 160 288 foveal

Figure 5.2.3. The third fixation at 160 288. The previously processed areas are also shown.

The result of processing at this new location is that "foot-1" is detected. This new line level object is recognized as corresponding to the "extremity-2" involved in "limb-1", so an inferred correspondence is made between "limb-1" and "leg". It is now no longer necessary to process the remaining portions of this leg with the high resolution fovea, because the more detailed interpretation has been propagated to the entire object.
after feature collection at 160 288 (125/250)

node:line-foot-1-1
    type            line-foot-1
    description     component
    bindings        ($6 connect-9)
                    ($5 connect-10)
                    ($4 connect-11)
                    ($1 line-13)
                    ($2 line-14)
                    ($3 line-15)
    a2d             20
    proximal-end    (184 . 293)
    location        (160 . 248)
    size            87

node:foot-1
    type            foot
    description     image1
    bindings        (($1 line-foot-1-1))
    proximal-end    (184 . 293)
    size            87
    location        (160 . 248)
    a3d             (-20 90 0)
    extra           (E00011 E00012)
elaboration:E00011
    side            right
elaboration:E00012
    side            left

located: limb-1 as leg

Figure 5.2.4. The fourth fixation at 270 832.

The next location, shown in figure 5.2.4, is chosen on the basis of the peripheral requirement because no new coarse level objects have been detected. At the new location, another limb is detected, which gives rise to another foveal requirement and, in turn, to the inferred correspondence of that limb.

next location selected as 270 832 peripheral
after feature collection at 270 832 (125/250)

node:limb-2
    type            limb
    description     component
    bindings        ($5 b-connect-11)
                    ($4 b-connect-10)
                    ($3 upper-limb-3)
                    ($2 lower-limb-3)
                    ($1 extremity-3)
    proximal-end    (336 . 804)
    distal-end      (112 . 665)
    location        (272 . 668)
    delta           28.0912797857271
    size            254.2618532143585

located: limb-2 uninstantiated
located: limb-1 as leg
next location selected as 96 672 foveal
after feature collection at 96 672 (125/250)

Figure 5.2.5.
The fifth fixation at 96 672.

node:line-hand-2-1
    type            line-hand-2
    description     component
    bindings        ($6 connect-26)
                    ($5 connect-27)
                    ($4 connect-25)
                    ($3 line-33)
                    ($2 line-34)
                    ($1 line-35)
    size            58
    a2d             -136
    proximal-end    (118 . 651)
    location        (95 . 666)
    distal-end      (72 . 681)

node:hand-1
    type            hand
    description     image2
    bindings        (($1 line-hand-2-1))
    posture         open
    location        (95 . 666)
    proximal-end    (118 . 651)
    distal-end      (72 . 681)
    size            58
    extra           (E00023 E00024)
elaboration:E00023
    side            right
    a3d             (0 0 136)
elaboration:E00024
    side            left
    a3d             (0 180 -1

located: limb-2 as arm
located: limb-1 as leg
next location selected as 448 832 peripheral

Figure 5.2.6. The sixth fixation at 448 832.

Of course, the preset sizes of the foveal and peripheral diameters influence the locations selected. The same image with different settings will result in different selections. Two of the four examples in Appendix D demonstrate this feature. Suggestions for extensions to these basic selection requirements are provided in the concluding chapter 7.

6. Related Issues

6.1. Grouping and Feature Integration

Often human visual tasks may be couched in terms of the recognition of specific models expressed as the composition of more basic features. There are two steps involved in this recognition:

(1) The identification of the necessary features.

(2) The localization in space of the particular combinations of features comprising the models, and the examination of the way features interact to determine the validity of the models.
There is compelling evidence that the operations at these two steps are quite different, even though responses indicative of recognition may be based on either step. The Feature Integration Theory of attention (Treisman and Gelade, 1980) proposes that individual image features are detected rapidly and in parallel, but, in order that an object be identified as consisting of two or more separate features, locations must be processed serially with foveal attention. If focal attention is prevented, illusory perceptions will be formed through combining features incorrectly (Treisman and Schmidt, 1981).

There is an expense which accompanies the application of foveal attention. Treisman, Sykes, and Gelade (1977) have demonstrated this expense in the context of experiments requiring subjects to detect a target in a display of distractor objects. The amount of time required to detect a target made up of a conjunction of features increases linearly with the number of distractors, but the detection of targets which are identifiable on the basis of features alone depends little on display size.

Consider an example taken from experiment IV of Treisman and Gelade (1980). In separate blocks, subjects were required to detect the letter "R" in a field of "P"s and "Q"s or in a field of "P"s and "B"s.
The prediction was found to be correct: detection time increased linearly with display size for the "PQ" distractors, and less for the "PB" distractors, even though it took longer for subjects to detect "R"s in a field of "B"s than in a field of "Q"s alone (see Figure 6.1.1).

    (a)  P Q Q P Q Q P P Q P Q R P Q P P Q P P Q P Q Q Q P
    (b)  P B B P P B P P B P P P B R B P B B P B B P B B P

Figure 6.1.1. Two displays of the type used in Feature Integration Theory experiments: (a) conjunction target R, (b) feature target R.

Figure 6.1.2 shows an analysis of these sets of letters in terms of the model possibilities associated with each feature of the letters in the two different contexts.[27] The differences in reaction time can be explained as operations taking place on groupings of different spatial extent. In the displays with the "PB" distractors, the occurrence of the target may be determined within a grouping consisting of the entire display, simply by the presence of all the features necessary to make up the "R". If this criterion is met, the location of the single critical feature (the one with the single model possibility) can be used as a location to form a smaller grouping based on the letter alone to confirm the target's presence. In the case of the distractors "PQ", all the features that make up an "R" are always present, and there is no critical feature, so the tighter grouping must be applied to each letter in the display sequentially.
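The role of the critical feature can be illustrated with a small sketch. The Python fragment below is only an illustration: the feature inventory assigned to each letter is hypothetical (as arbitrary as the one used for the figure), and the function names are not part of the system described here. For a given target and set of distractors, it computes the model possibility set of every feature and reports those features of the target whose only possible model is the target itself.

```python
# Hypothetical feature inventory for the four letters (an assumption made
# purely for illustration; any similar decomposition would do).
FEATURES = {
    "P": {"vertical", "loop"},
    "B": {"vertical", "loop", "lower_loop"},
    "R": {"vertical", "loop", "diagonal"},
    "Q": {"circle", "diagonal"},
}

def model_possibilities(letters):
    """Map each feature to the set of letters (models) that contain it."""
    possibilities = {}
    for letter in letters:
        for feature in FEATURES[letter]:
            possibilities.setdefault(feature, set()).add(letter)
    return possibilities

def critical_features(target, distractors):
    """Features of the target whose only model possibility is the target."""
    possibilities = model_possibilities({target, *distractors})
    return sorted(f for f in FEATURES[target] if possibilities[f] == {target})

def target_features_covered(target, distractors):
    """True when the distractors alone supply every feature of the target."""
    present = set().union(*(FEATURES[d] for d in distractors))
    return FEATURES[target] <= present

pb_critical = critical_features("R", "PB")
pq_critical = critical_features("R", "PQ")
pq_covered = target_features_covered("R", "PQ")
print(pb_critical, pq_critical, pq_covered)
```

With this inventory the "PB" context yields one critical feature for "R" (the diagonal), so a single check over the whole display suffices, whereas the "PQ" context yields none and the distractors alone supply every feature of "R", which corresponds to the case requiring sequential examination of each letter.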
[Figure: each letter feature labelled with its model possibility set; the sets recoverable from the figure include {P,R,B}, {R}, {P,R} and {Q,R}.]

Figure 6.1.2. Analysis of display configurations shown in terms of model possibilities.

[27] A rather arbitrary set of features has been chosen for the letters in order to pursue the example.

These operations are similar in nature to those proposed as "grouping consistency" for the body drawing interpretation system. Each involves the formation of groups and the subsequent use of the presence of features, and their model possibilities, towards interpretation. For the feature integration experiments, recognition either succeeds or fails at each grouping operation. The method proposed for the body drawing system considers many more possible models, and uses the grouping operation to move closer to interpretation by eliminating local model possibilities of the features.

Kahneman and Henik (1977) have formulated a "group-processing" model of the application of attention which is also similar to the application of grouping consistency. Their model proposes a pre-attentive grouping operation which selects large scale objects for subsequent analysis. The experiments which demonstrate the validity of this model employ displays such as that of figure 6.1.3.

Figure 6.1.3. Group processing digit detection display.

One of the two displays such as shown in figure 6.1.3 is presented briefly and the task is to detect a specified target digit.
The results show that groups are processed separately, but that processing is almost uniform within groups.

[Figure: (a) features labelled with model possibility sets such as {1,7,4}, {4,8}, {2,4} and {1,4}; (b) the larger objects identified at the coarse level.]

Figure 6.1.4. Features available at two resolutions.

Assume that high resolution feature information is available, and that for each such feature, a set of model possibilities is established (as shown in figure 6.1.4a). Also assume the availability of coarse level information which gives the identification of larger objects (figure 6.1.4b). Consider the following interpretation of these results: In the first stage, the global objects are detected, as are the high resolution features specifying their model possibility sets. These sets can only be assigned, however, to the established objects, as depicted in figure 6.1.5.

Figure 6.1.5. Low resolution objects detected and model possibilities assigned to high resolution features, which are roughly located.

At this point, there are obviously too many features associated with the object for it to be a single digit, so a subsequent breakdown of objects takes place. Because this second phase requires a higher resolution, it can only take place over a smaller area, so one of the two main objects is selected for more detailed examination (see figure 6.1.6).

Figure 6.1.6.
Features assigned to objects detected at a finer level of resolution, for one of the low level objects.

A second examination of the possibility sets reveals that the required elements are available for only one digit in each of the defined positions, and hence their identities can be established in parallel, without serial application of attention to each of the specific locations.

Feature integration theory proposes that object identification may take place in parallel, based on features alone, or serially based on conjunctions of features when necessary. The group processing results indicate intermediate steps at which features are assigned to objects detected at low resolution, and once this assignment is complete, some model possibilities may be discarded by using grouping consistency. The location of objects to which these features are attached will become more refined if necessary, to the point of either allowing object identification through confirmation of the presence of the required elements alone, or if necessary by considering the relations among features.

The identity of features may be determined over a wide visual field, but without specific location. Location may become more specific through attachment to low resolution image elements, but only over a more restricted visual field. Finally, the actual location may be determined to permit feature integration.
This final locating action operates over a small area of the visual field, and therefore requires serial application if more than one location is to be searched.[28]

[28] The relation between these aspects of Cognitive Psychology and the feature based interpretation methods was first expressed in Browse (1981).

6.2. Picture Grammars

The declarative schemata representation used for the body form problem domain has roots in some early computer vision research. In the 1960's, there emerged the requirement for structural descriptions rather than categorization of pictorial data. Grammatical structures became the object of investigation toward this end. If a class of pictorial objects could be represented as a grammar, and images could be parsed using that grammar, then the resultant parse tree or trace of the application of productions could provide a structural description of the image.

The fundamental problems in the application of grammatical methods to images are:

(1) It is necessary to identify a set of image primitives which can act as terminal symbols of the grammar.

(2) It is necessary to develop a means of specifying and using the complex relations which exist among image elements. Grammars for languages utilize the implicit and uniform relation between symbols of the grammar, which is simply their ordered sequence of appearance.

Both of these issues were addressed in early picture grammar systems. Ledley's (1964) system traced around the perimeter of objects detected in an image.
Local characteristics of the line segments found during the trace were used as the basis for the development of a set of primitives to be used as the terminal symbols of a grammar (straight line, clockwise curve, etc.). This tracing also provided a strict ordering of grammatical structures, which allowed the direct use of the implicit concatenation relation as used in language based grammatical applications.

Shaw (1969) developed a picture description language (PDL) which also used line segments as terminal symbols. An inventory of connecting relations was given, which was used to construct descriptions of objects. These connecting relations were allowed not only among the terminal symbols, but also among larger scale picture objects. These relations were all connectivity relations, requiring every picture part or terminal symbol to designate a "head" and a "tail" part, in order that the relations could be specified.

Evans (1969) recognized the advantages of using more sophisticated relationships in the specification of objects. For example:

(TRI (X Y Z)
     ((PT X) (PT Y) (PT Z)
      (ELS X Y) (ELS X Z) (ELS Y Z)
      (NONCOLL X Y Z)))

This describes a triangle as three points. The "ELS" predicate requires that there exists a line segment between the points specified, and "NONCOLL" requires that the points specified as its arguments are not collinear.
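The meaning of such a relational description can be made concrete with a short sketch. The Python fragment below is a hypothetical rendering and not Evans's program: it assumes points are (x, y) pairs, that the image supplies a set of undirected line segments, and that collinearity is tested with a cross product. The names "els", "noncoll" and "tri" simply mirror the predicates in the description above.

```python
def els(segments, p, q):
    """True when an undirected line segment joins points p and q."""
    return frozenset((p, q)) in segments

def noncoll(p, q, r):
    """True when the three points do not lie on one straight line."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return cross != 0

def tri(points, segments, x, y, z):
    """Check the TRI description: three known points, pairwise joined by
    line segments, and not collinear."""
    return (
        all(p in points for p in (x, y, z))
        and els(segments, x, y)
        and els(segments, x, z)
        and els(segments, y, z)
        and noncoll(x, y, z)
    )

def segment_set(pairs):
    """Build the set of undirected segments from pairs of end points."""
    return {frozenset(pair) for pair in pairs}

a, b, c = (0, 0), (4, 0), (0, 3)
points = {a, b, c}
segments = segment_set([(a, b), (a, c), (b, c)])
is_triangle = tri(points, segments, a, b, c)

d, e, f = (0, 0), (1, 1), (2, 2)
is_degenerate = tri({d, e, f}, segment_set([(d, e), (d, f), (e, f)]), d, e, f)
print(is_triangle, is_degenerate)
```

The first configuration satisfies every predicate, while the second fails only on "NONCOLL", which shows how a relation beyond simple connectivity contributes to the specification.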
Stanton (1972) identified this progression towards more relational information in picture grammars. The purpose of such representations was to make explicit a class of objects which might be depicted in the image. Stanton recognized that no clear specification of the relations and predicates was being provided, and that the modularity and generality of the resulting systems was diminishing.

Stanton (1970) devised a system, RAMOS, which combines the descriptive structures of picture grammars with a set of primitive operations over a data base. Relations among image objects were expressed as combinations of the primitive operations in a way which anticipated the use of logic programming systems such as PROLOG in the analysis of visual information. Consider the example from Stanton (1972):

F( ... A:SQUARE B:TRI JOIN(A,B) ...)

This is a description for a situation in which "a square is joined to a triangle". The "F" operator indicates the instruction to find the situation, and the requirements for the situation are given within the parentheses. The predicate "JOIN" is given in terms of the primitive operations:

JOIN G(A,B) F(C.SIDE(A) D.SIDE(B) SAME(C,D))

This means that given A and B, find a side of A (call it C), and find a side of B (call it D) such that C and D are the same. The equivalent PROLOG description would be:

square-joined-to-triangle(*A,*B) <-
    square(*A) & triangle(*B) & join(*A,*B).
join(*A,*B) <-
    side-of(*A,*C) & side-of(*B,*C).
The use of the "same" predicate is not necessary in the PROLOG version because of the required unification on the variable "*C".

Stanton's requirement for the clear depiction of the structure of relations was taken as an indication of the need for a procedural component, though within the context of declarative systems such as PROLOG, this is not the only option. Stanton also identified the need for an ability to invert the predicates required among image structures to enable not only testing, but obtaining relations from an image. This remains an open problem which is independent of whether declarative or procedural methods are employed.

The declarative schemata system used for the body form information as described in this document may be viewed as an extension to the concept of a picture grammar. These extensions are identified and described below:

Logic programming connection: The declarative schemata system represents knowledge in a way which can easily be translated to logic programming systems (see section 4.2). This gives a direction for the further exploration of the issues of incorporating structural descriptions of relations and predicates as proposed by Stanton.

Multi-layer grammars: Parsing processes are inherent in vision systems which use component hierarchies as the basis for the structural descriptions of objects. Havens (1978) makes an argument for the similarity of such systems.
The declarative schemata system provides a grammatical counterpart for the specialization hierarchy through the incorporation of connections between the layers of the grammar whose basis is at different resolution levels.

Cue/model approach: Most of the research relating to picture grammars preceded the development of the idea of labelling image elements with interpretation possibilities (Huffman, 1971; Clowes, 1971). The declarative schemata system makes an explicit connection between these approaches through the provision of the ability to analyze the grammatical structure and automatically generate the cue/model structure necessary for the application of interpretation labelling methods.

Attribute structure: The declarative schemata provide a means of specifying attributes of the non-terminals of the grammar. These specification methods are associated with each schema in much the same way that attribute evaluation methods are associated with productions in attribute grammars (Knuth, 1968). The main difference is that, in the use of attribute grammars, a context-free parse tree is first generated, and the attribute values are developed afterwards. In the declarative schemata system, the attributes are evaluated immediately as the schema is applied [29]. The results (the attribute values) then enter into the decision as to the applicability of subsequent schemata by their involvement in required relations.
This attribute structure thus provides a context-sensitivity mechanism as well as a uniform means of developing the semantics associated with parsing an image.

Access to three dimensions: One problem with picture grammars was that they were only applicable to two-dimensional problem domains. Picture grammars provided no means of structuring the depictions such that coherent three-dimensional objects could be represented (see Stanton, 1972). The declarative schemata system enables the mapping from view-oriented descriptions of objects to underlying representations of the three-dimensional aspects of objects.

[29] This sometimes results in multiple value possibilities.

6.3. View Based Representation

The body form interpretation system represents problem domain knowledge in a way which is a hybrid of two and three-dimensional structures. The underlying three-dimensional representation of the human body gives possible ranges of orientation and relative length of the body segments. This formulation does not express the shape of the objects, and so it is not appropriate for matching operations such as those used by Marr and Nishihara (1976). As an alternative, specific prototypical views are given with mappings between these image domain structures and the underlying model. Constraints from both representations are used in the development of an interpretation for the image.
A related proposal is found in Minsky's (1975) frame system for the representation of knowledge. In its application to visual scene analysis, different viewpoints are represented separately, with transformations provided which take one such frame to the next, thereby encoding the effect of perspective change. Minsky also argues that the idea of dimensionality in a representation is not completely appropriate in the discussion of propositional systems. This notion is borne out by the body form representation system. The same basic propositional format is used to encode the three-dimensional requirements for the body parts as is used to describe the two-dimensional relations required among elements in the image constructions.

Pinker (1980; Pinker and Finke, 1980) has proposed a model for the representation of physical space which uses both underlying three-dimensional information and specific perspectives. Subjects were shown to be able to develop mental images of objects from angles which they had not experienced. Scanning and line-of-sight tasks indicated that subjects were utilizing emergent two-dimensional aspects of their images. The notion of a three-dimensional model for object knowledge which can be arbitrarily rotated, and from which perspectives may be generated, is an appealing idea. Caution must be taken, however, not to usurp an understanding of the visual processes being explained by assuming the ability to "look at" this internal model (see Pylyshyn, 1973; Minsky, 1975).
Pinker's model emphasizes the primacy of the three-dimensional structures. Other studies indicate that, for known objects, there exist special privileged perspectives, suggesting that particular views are not necessarily constructed from the three-dimensional model, but may have an "existence" of their own (Palmer, Rosch, and Chase, 1981). As an informal indication of this idea, consider that it would be more difficult to recognize an elephant from a photograph taken from above than from a photograph taken from the side. Yet the case would be reversed for an ant. It would be more difficult to recognize the ant from the side. This is because the familiarity of particular perspectives has a role in the structure of visual knowledge. Such differences are difficult to explain in a system which only projects perspectives from a three-dimensional structure.

In their experiments, Palmer, Rosch, and Chase established canonical views using a number of converging measures: goodness of view, selected angle to take a photograph, and imagined viewpoint. Subjects were shown to be faster in identifying photographs of these canonical views. The results were the same in a condition in which the subjects were told ahead of time what the viewpoint was to be.

The notion of canonical concepts in semantic memory has been demonstrated (Rosch, 1975; Mervis and Rosch, 1981).
Particularly familiar concepts such as "dog" appear to have a special status, forming a base for both generalizations ("animal") and specializations ("collie"). This idea that "the familiar" forms the basis of knowledge structures carries into the realm of visual knowledge in the notion of canonical views.

The structures which were devised for the knowledge about the body form line drawing problem domain are intended as a step towards solving the problems involved in maintaining both two and three-dimensional representations explicitly.

7. Conclusions

This document describes a computational vision system which interprets a set of line drawings of human body forms. The system was devised in response to two research goals:

(1) To develop declarative structures for the representation of the knowledge about specific objects, as required for visual interpretation, and to separate the process of interpretation from this knowledge.

(2) To incorporate fundamental aspects of human vision into a computational system. Specifically, to enable interpretation based interaction among levels of resolution, and to provide a means of intelligent selection of processing locations.

The first step in the development of the computer system was to devise a declarative schemata format for the representation of visual knowledge. This format is similar to, and extends, picture grammars in a number of ways.
The basic structure of the body form knowledge follows the component hierarchy for the domain. Two types of features are known to the system, each of which forms the basis for a separate representation layer. These layers are interconnected by links indicative of the specialization/generalization hierarchy. The declarative schemata system provides prototypical perspective information, with explicit mappings into an underlying three-dimensional model.

The second step accomplishes an analysis of the declarative schemata contents. This analysis generates an extensive cue table associating each generic object with a list of interpretation possibilities, or roles that it might play in some larger structure. These roles are conditional on the values of attributes of the objects. This cue table is devised to permit a set labelling mechanism for the maintenance of model possibilities, which permits the effective use of partial information about image features. This phase also relaxes the conditions on the prototypical view representations so that they cover a wider class of depictions.

During the interpretation process, features are available in concentric areas of limited diameter: fovea and periphery. At each fixation, feature based grouping and consistency provide a means of pruning the lists of interpretation possibilities associated with each feature.
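As a rough illustration of the cue table and set labelling summarized above, the following Python sketch attaches a set of candidate roles to each feature and then prunes those sets. Everything specific in it is an assumption made for the example: the feature types, the role names, the attribute conditions, and the COMPATIBLE relation are invented, and the simple neighbour-compatibility filter only stands in for the grouping and consistency mechanisms of the actual system, whose cue table is generated from the schemata.

```python
# Hypothetical cue table: feature type -> list of (role, condition on attributes).
CUE_TABLE = {
    "long_blob": [
        ("upper_arm", lambda a: a["length"] < 5),
        ("thigh", lambda a: a["length"] >= 4),
        ("trunk", lambda a: a["length"] >= 7),
    ],
    "round_blob": [
        ("head", lambda a: True),
        ("hand", lambda a: a["length"] < 2),
    ],
}

# Hypothetical consistency relation: pairs of roles that adjacent features may hold.
COMPATIBLE = {
    frozenset(("head", "trunk")),
    frozenset(("trunk", "upper_arm")),
    frozenset(("trunk", "thigh")),
    frozenset(("upper_arm", "hand")),
}


def initial_labels(feature_type, attrs):
    """Look up the roles whose attribute conditions hold for this feature."""
    return {role for role, cond in CUE_TABLE[feature_type] if cond(attrs)}


def prune(labels, adjacency):
    """Drop any role that has no compatible role on some adjacent feature."""
    changed = True
    while changed:
        changed = False
        for f, neighbours in adjacency.items():
            for role in list(labels[f]):
                for g in neighbours:
                    if not any(frozenset((role, other)) in COMPATIBLE
                               for other in labels[g]):
                        labels[f].discard(role)
                        changed = True
                        break
    return labels


# Two adjacent features: a round blob of length 3 and a long blob of length 8.
features = {"a": ("round_blob", {"length": 3}), "b": ("long_blob", {"length": 8})}
adjacency = {"a": ["b"], "b": ["a"]}
labels = {name: initial_labels(kind, attrs) for name, (kind, attrs) in features.items()}
pruned = prune(labels, adjacency)
```

In this toy case the long blob starts with the roles "thigh" and "trunk". Only "trunk" survives, because the adjacent round blob can only be a "head", and the invented relation lists no thigh and head pairing.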
This pruning constitutes the first phase of interpretation based interaction between levels of detail. A strictly bottom-up method for invoking the examination of specific schemata is employed, which allows systematic control of partially fulfilled schemata instances through the application of incremental consistency. The system maintains provision for multiple contexts by carrying several possible values only for those attributes which are affected by context.

An analysis of the results of the invocation of the schemata yields criteria for the intelligent selection of processing location. The notion of a correspondence between interpretations based on different levels of resolution is introduced. This is extended to the idea of inferred correspondence. The use of inferred correspondence of coarse layer interpretation as a goal of the system enables the propagation of detailed foveal based results into the peripheral area, removing the requirement for fine layer processing of the entire image.

Throughout the description of the system, attempts have been made to furnish the details of relations that operations might have to Cognitive Psychology research.

One of the main advantages of the separation of object knowledge and process is that it facilitates transferral to other problem domains, and permits experimentation with other interpretation techniques.
Some of the directions that these extensions might take are listed below:

(1) An interesting extension would be to devise a representation for the digits and the letters of the alphabet. The operation of the resulting system might then be aligned with the experimental results of Psychology research in visual attention, which often utilizes letters and digits as display items. Having a representation for letters might also allow computational based studies of the selectional processes in reading, and the examination of the role of peripheral information (see Rayner, 1978).

(2) Other problem domains may require enhancements of the declarative schemata format. For the level of description used for the body form knowledge, there is always a fixed number of components for each object (i.e., two arms, not three). In other domains objects may be specified with unknown numbers of components. For example, in the sketch map domain (Mackworth, 1977b), a mountain range is composed of an arbitrary number of mountain symbols. Extensions to other domains will be necessary to develop a robust and generally useful representational tool.

(3) The body drawing interpretation system is entirely "bottom-up" in its operation. "Top-down" components might be instituted in several ways. "Top-down" expectations could be formulated using prior knowledge of the expected position of the body in the image. Currently, the model invocations at the coarse and fine layers do not interact.
It would be possible to use a global to local "top-down" component by ordering the considerations at the fine layer on the basis of the results from the coarse layer. Preliminary interpretation, based on an even more coarse level of resolution, could be used to develop a preliminary context for the processing at any location (see Palmer, 1977). A third type of "top-down" control could be introduced as hypothesis testing. Once a schema is partially fulfilled, it could direct processing towards the discovery of its remaining requirements.

Another interesting direction would be the further examination of the use of logic programming in visual interpretation systems. The declarative schemata system maps well into such representations, but easier access to control structure would be required in order to allow the experimentation in the processes of interpretation.

The body drawing interpretation system uses very simple criteria for selecting processing locations. These criteria were intended only as an example of what might be accomplished. The schemata themselves could be examined in terms of expected areas of related objects, even without the support of coarse layer results. The nature of the task involved in vision is an obvious candidate as a determining factor in location selection.

References

Alpern, M. 1972, "Eye Movements," in Handbook of Sensory Physiology Vol VII/4: Visual Psychophysics, D. Jameson and L.M. Hurvich (eds.), Springer, Berlin.
Antes, J.R.
1974, "The Time Course of Picture Viewing," Journal of Experimental Psychology 103, 62-70.
American Academy of Orthopaedic Surgeons 1965, Joint Motion: Method of Measuring and Recording, Chicago.
Bajcsy, R. and Rosenthal, D.A. 1980, "Visual and Conceptual Focus of Attention," in Structured Computer Vision, S. Tanimoto and A. Klinger (eds.), Academic Press, New York, 133-149.
Barrow, H.G. and Tenenbaum, J.M. 1978, "Recovering Intrinsic Scene Characteristics From Images," in Computer Vision Systems, A.R. Hanson and E.M. Riseman (eds.), Academic Press, New York, 3-26.
Barrow, H.G. and Tenenbaum, J.M. 1981, "Computational Vision," Proc. IEEE 69, 572-595.
Bartlett, F.C. 1932, Remembering, A Study in Experimental and Social Psychology, Cambridge University Press, Cambridge.
Biederman, I. 1981, "On the Semantics of a Glance at a Scene," in Perceptual Organization, M. Kubovy and J.R. Pomerantz (eds.), Erlbaum, Hillsdale, New Jersey, 213-253.
Blakemore, C. and Campbell, F.W. 1969, "On the Existence of Neurons in the Human Visual System Selectively Sensitive to the Orientation and Size of Retinal Images," Journal of Physiology 203, 237-260.
Brady, M. 1982, "Computational Approaches to Image Understanding," ACM Computing Surveys 14(1), 3-72.
Breitmeyer, B. and Ganz, L. 1976, "Implications of Sustained and Transient Channels for Theories of Visual Pattern Masking, Saccadic Suppression, and Information Processing," Psychological Review 83, 1-36.
Brooks, R.A.
1981, "Model-Based Three Dimensional Interpretations of Two Dimensional Images," Proceedings of the Seventh International Joint Conference on Artificial Intelligence, Vancouver, 619-624.
Browse, R.A. 1980, "Mediation Between Central and Peripheral Processes: Useful Knowledge Structures," Proc. Third Conf. of the Canadian Society for the Computational Studies of Intelligence, Victoria, Canada, 166-171.
Browse, R.A. 1981, "Relations Between Schemata-Based Computational Vision and Aspects of Visual Attention," Proc. Fourth Annual Conference of the Cognitive Science Society, Berkeley.
Browse, R.A. 1982, "Interpretation-Based Interaction Between Levels of Detail," Proc. Fourth Conf. of the Canadian Society for the Computational Studies of Intelligence, Saskatoon, Canada, 27-32.
Bruner, J.S. and Potter, M.C. 1964, "Interference in Visual Recognition," Science 144, 424-425.
Buswell, G.T. 1935, How People Look At Pictures, University of Chicago Press.
Catanzariti, E. and Mackworth, A.K. 1978, "Forests and Pyramids: Using Image Hierarchies to Understand Landsat Images," Proc. 5th Canadian Symposium on Remote Sensing.
Chomsky, N. 1957, Syntactic Structures, Mouton and Co., The Hague.
Chomsky, N. 1965, Aspects of the Theory of Syntax, The M.I.T. Press, Cambridge.
Clowes, M.B. 1971, "On Seeing Things," Artificial Intelligence 2(1), 79-112.
Didday, R.C. and Arbib, M.A. 1975, "Eye Movements and Visual Perception: A Two Visual System Model," International Journal of Man-Machine Studies 7, 547-569.
Eriksen, C.W. and Hoffman, J.E.
1972, "Temporal and Spatial Characteristics of Selective Encoding from Visual Displays," Perception and Psychophysics 12, 201-204.
Eshkol, N. and Wachmann, A. 1958, Movement Notation, Arrowsmith, Bristol.
Evans, T.G. 1969, "Descriptive Pattern Analysis Techniques," in Automatic Interpretation and Classification of Images, A. Grasselli (ed.), NATO Advanced Study Institute, Pisa, 79-96.
Fahlman, S.E. 1979, NETL: A System for Representing and Using Real-World Knowledge, M.I.T. Press, Cambridge.
Farley, A.M. 1976, "A Computer Implementation of Constructive Visual Imagery and Perception," in Eye Movements and Psychological Processes, R.A. Monty and J.W. Senders (eds.), Halstead Press, New York, 473-490.
Fisher, D.F., Monty, R.A., and Senders, J.W. 1981, Eye Movements: Cognition and Visual Perception, Erlbaum, Hillsdale, New Jersey.
Friedman, A. 1979, "Framing Pictures: The Role of Knowledge in Automatized Encoding and Memory for Gist," Journal of Experimental Psychology: General 108(3), 316-355.
Funt, B.V. 1976, "Whisper: a Computer Implementation Using Analogues in Reasoning," TR-76-09, Computer Science Dept., University of British Columbia, Vancouver, Canada.
Garner, W.R. 1974, The Processing of Information and Structure, Erlbaum, Potomac, Maryland.
Gilchrist, A.L. 1977, "Perceived Lightness Depends on Perceived Spatial Arrangement," Science 195, 186-187.
Gilchrist, A.L. 1980, "When Does Perceived Lightness Depend on Perceived Spatial Arrangement?," Perception and Psychophysics 28, 527-538.
Gould, J.D. 1976, "Looking At Pictures," in Eye Movements and Psychological Processes, R.A.
Monty and J.W. Senders (eds.), Halstead Press, New York, 323-345.
Gould, J.D. and Schaffer, A. 1965, "Eye Movement Patterns During Visual Information Processing," Psychonomic Science 3, 317-318.
Graham, N. 1981, "Psychophysics of Spatial-Frequency Channels," in Perceptual Organization, M. Kubovy and J.R. Pomerantz (eds.), Erlbaum, Hillsdale, New Jersey, 1-25.
Gregory, R.L. 1966, Eye and Brain, McGraw-Hill, New York.
Guzman, A. 1968, "Decomposition of a Visual Scene Into Three-Dimensional Bodies," Proc. AFIPS 1968 Fall Joint Computer Conference, 291-304.
Haber, R.N. 1978, "Visual Perception," in Annual Review of Psychology, M.R. Rosenzweig and L.W. Porter (eds.), Erlbaum, Hillsdale, New Jersey.
Hanson, A.R. and Riseman, E.M. 1975, "The Design of a Semantically Directed Vision Processor," Technical Report No. 75C-1, University of Massachusetts, Amherst, Massachusetts.
Hanson, A.R. and Riseman, E.M. 1978, "Segmentation of Natural Scenes," in Computer Vision Systems, A.R. Hanson and E.M. Riseman (eds.), Academic Press, New York, 129-164.
Hanson, A.R. and Riseman, E.M. 1980, "Processing Cones: A Computational Structure for Image Analysis," in Structured Computer Vision, S. Tanimoto and A. Klinger (eds.), Academic Press, New York, 101-131.
Havens, W.S. 1976, "Can Frames Solve the Chicken and Egg Problem?," Proc. First Conf. of the Canadian Society for the Computational Studies of Intelligence, Vancouver, Canada, 232-242.
Havens, W.S. 1978, "A Procedural Model of Recognition for Machine Perception," TR-78-3, Computer Science Dept., University of British Columbia, Vancouver, Canada.
Hinton, G.E.
1981, "Shape Representation in Parallel Systems," Proceedings of the Seventh International Joint Conference on Artificial Intelligence, Vancouver, 1088-1096.
Hochberg, J. 1968, "In the Mind's Eye," in Contemporary Theory and Research in Visual Perception, R.N. Haber (ed.), Holt, Rinehart, and Winston, New York, 309-321.
Hochberg, J.E. and Brooks, V. 1978, "Film Cutting and Visual Momentum," in Eye Movements and the Higher Psychological Functions, J.W. Senders, D.F. Fisher, and R.A. Monty (eds.), Erlbaum, Hillsdale, New Jersey.
Hoffman, J.E. 1980, "Interaction Between Global and Local Levels of a Form," Journal of Experimental Psychology: Human Perception and Performance 6(2), 222-234.
Horn, B.K.P. 1975, "Obtaining Shape From Shading Information," in The Psychology of Computer Vision, P.H. Winston (ed.), McGraw-Hill, New York, 115-155.
Hubel, D.H. and Wiesel, T.N. 1979, "Brain Mechanisms of Vision," Scientific American 241(3), 150-162.
Huffman, D.A. 1971, "Impossible Objects as Nonsense Sentences," in Machine Intelligence 6, B. Meltzer and D. Michie (eds.), American Elsevier, New York, 295-323.
Just, M.A. and Carpenter, P.A. 1978, "Inference Processes During Reading: Reflections from Eye Fixations," in Eye Movements and the Higher Psychological Functions, J.W. Senders, D.F. Fisher, and R.A. Monty (eds.), Erlbaum, Hillsdale, New Jersey.
Kahneman, D. 1973, Attention and Effort, Prentice Hall, Englewood Cliffs, New Jersey.
Kahneman, D. and Henik, A. 1977, "Effects of Visual Grouping on Immediate Recall and Selective Attention," in Attention and Performance VI, S.
Dornic (ed.), Erlbaum, Hillsdale, New Jersey, 307-332.
Kahneman, D. and Treisman, A. 1983, "Changing Views of Attention and Automaticity," in Varieties of Attention, R. Parasuraman, R. Davies, and J. Beatty (eds.), Academic Press, New York.
Kelly, M.D. 1971, "Edge Detection in Pictures by Computer Using Planning," in Machine Intelligence 6, B. Meltzer and D. Michie (eds.), American Elsevier, New York, 397-409.
Kinchla, R. 1974, "Detecting Target Elements in Multielement Arrays: A Confusability Model," Perception and Psychophysics 15, 149-158.
Kinchla, R. and Wolfe, J. 1979, "The Order of Visual Processing: 'Top-Down', 'Bottom-Up', or 'Middle-Out'," Perception and Psychophysics 25, 225-231.
Klein, G.A. and Kurkowski, F. 1974, "Effect of Task Demand on the Relationship Between Eye Movement and Sentence Complexity," Perceptual and Motor Skills 39, 463-466.
Knuth, D.E. 1968, "Semantics of Context-Free Languages," Mathematical Systems Theory 2(2), 127-146.
Laberge, D. 1976, "Perceptual Learning and Attention," in Handbook of Learning and Cognitive Processes, Vol 4, W.K. Estes (ed.), Erlbaum, Hillsdale, New Jersey.
Latour, P.L. 1962, "Visual Threshold During Eye Movements," Vision Research 2, 261-262.
Ledley, R.S. 1964, "High-Speed Automatic Analysis of Biomedical Pictures," Science 146(9), 216-223.
Levine, M.D. 1980, "Region Analysis Using a Pyramid Data Structure," in Structured Computer Vision, S. Tanimoto and A. Klinger (eds.), Academic Press, New York, 57-100.
Lockhead, G.R. 1972, "Processing Dimensional Stimuli: A Note," Psychological Review 79, 410-419.
Loftus, G.R. 1972, "Eye Fixations and Recognition Memory for Pictures," Cognitive Psychology 3, 525-551.
Loftus, G.R. and Mackworth, N.H. 1978, "Cognitive Determinants of Fixation Location During Picture Viewing," Journal of Experimental Psychology: Human Perception and Performance 4, 565-572.
Lowe, D. 1975, "Processing Information About Location in Brief Visual Displays," Perception and Psychophysics 18, 309-316.
Leushina, L.I. 1965, "On Estimation of Position of Photostimulus and Eye Movement," Biofizika 10, 130-136.
Mackworth, A.K. 1973, "Interpreting Pictures of Polyhedral Scenes," Artificial Intelligence 4(2), 121-137.
Mackworth, A.K. 1975, "Model-Driven Interpretation in Intelligent Vision Systems," Perception 5, 349-370.
Mackworth, A.K. 1977a, "How to See a Simple World: an Exegesis of Some Computer Programs for Scene Analysis," in Machine Intelligence 8, E.W. Elcock and D. Michie (eds.), John Wiley, New York, 510-540.
Mackworth, A.K. 1977b, "On Reading Sketch Maps," Proceedings of the Fifth International Joint Conference on Artificial Intelligence, Cambridge, 598-606.
Mackworth, A.K. 1977c, "Consistency in Networks of Relations," Artificial Intelligence 8(1), 99-118.
Mackworth, A.K. 1978, "Vision Research Strategy: Black Magic, Metaphors, Mechanisms, Miniworlds and Maps," in Computer Vision Systems, A.R. Hanson and E.M. Riseman (eds.), Academic Press, New York, 53-61.
Mackworth, A.K. and Havens, W.S.
1981, "Structuring Domain Knowledge for Visual Perception," Proceedings of the Seventh International Joint Conference on Artificial Intelligence, Vancouver, 625-627.
Mackworth, N.H. and Morandi, A.J. 1967, "The Gaze Selects Informative Details Within Pictures," Perception and Psychophysics 2, 547-552.
Marcotty, M., Ledgard, H.F., and Bochmann, G.V. 1976, "A Sampler of Formal Definitions," ACM Computing Surveys 8(2), 191-276.
Marr, D. 1976, "Early Processing of Visual Information," Phil. Trans. Royal Society of London 275B(942), 483-524.
Marr, D. 1982, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information, W.H. Freeman, San Francisco.
Marr, D. and Hildreth, E. 1980, "Theory of Edge Detection," Proc. Royal Soc. London B(207), 187-217.
Marr, D. and Nishihara, H.K. 1976, "Representation of the Spatial Organization of Three Dimensional Shapes," Report 377, A.I. Lab, M.I.T.
Martin, M. 1979, "Local and Global Processing: The Role of Sparsity," Memory and Cognition 7, 476-484.
Mervis, C.B. and Rosch, E. 1981, "Categorization of Natural Objects," in Annual Review of Psychology, M.R. Rosenzweig and L.W. Porter (eds.).
Miller, J. 1981, "Global Precedence in Attention and Decision," Journal of Experimental Psychology: Human Perception and Performance 7, 1161-1174.
Minsky, M. 1975, "A Framework for Representing Knowledge," in The Psychology of Computer Vision, P.H. Winston (ed.), McGraw-Hill, New York.
Monty, R.A. and Senders, J.W. 1976, Eye Movements and Psychological Processes, Halstead Press, New York.
Navon, D.
1977, "Forest Before Trees: The Precedence of G l o b a l Features i n V i s u a l P e r c e p t i o n , " C o g n i t i v e Psychology 9, 353-383. 190 N e i s s e r , U. 1976, C o g n i t i o n and R e a l i t y , Freeman, San F r a n c i s c o . N e i s s e r , U. 1967, C o g n i t i v e Psychology, Appleton, Century, C r o f t s , New York. Newell, A. 1973, "Production Systems: Models of C o n t r o l S t r u c t u r e s , " i n V i s u a l Information P r o c e s s i n g , W.G. Chase (ed.), Academic Press, New York. N i s h i h a r a , H.K. 1981, " I n t e n s i t y , V i s i b l e - S u r f a c e , and Volumetric Represenatations," A r t i f i c i a l I n t e l l i g e n c e 17, 265-284. Noton, D. and L.Stark, 1971a, "Eye Movements and V i s u a l P e r c e p t i o n , " S c i e n t i f i c  American 224, 34-43. Noton, D. and Stark, L. 1971b, "Scanpaths in Eye Movements During P a t t e r n P e r c e p t i o n , " Science 171, 308-311. Noton, D. and Stark, L. 1971c, "Scanpaths in Saccadic Eye Movements Wh i l s t Viewing and Recognizing P a t t e r n s , " V i s i o n Research 11, 929-942. Oshima, M. and S h i r a i , Y. 1981, "Object R e c o g n i t i o n Using Three-Dimensional Information," Proceedings of the Seventh I n t e r n a t i o n a l J o i n t  Conference on A r t i f i c i a l I n t e l l i g e n c e , Vancouver, 601-606. Palmer, S.E. 1975, " V i s u a l P e r c e p t i o n and World Knowledge: Notes on a Model of S e n s o r y - C o g n i t i v e I n t e r a c t i o n , " i n E x p l o r a t i o n s in  C o g n i t i o n , D.A. Norman and D.E. Rumelhart (eds.), Freeman, San F r a n c i s c o , 279-307. Palmer, S.E. 1977, " H i e r a r c h i c a l S t r u c t u r e i n P e r c e p t u a l R e p r e s e n t a t i o n , " C o g n i t i v e Psychology 9, 441-474. Palmer, S.E., Rosch, E., and Chase, P. 1981, "Canonical P e r s p e c t i v e and the P e r c e p t i o n of O b j e c t s , " i n A t t e n t i o n and Performance IX, J . Long and A. 
Baddeley (eds . ) , Erlbaum, H i l l s d a l e , New J e r s e y . P a n t l e , A. and Sekuler, R. 1968, " V e l o c i t y S e n s i t i v e Mechanisms in Human V i s i o n , " V i s i o n  Research 8, 445-450. 191 Parker, R.E. 1978, " P i c t u r e P r o c e s s i n g During R e c o g n i t i o n , " J o u r n a l of  Experimental Psychology: Human Percept ion and Performance 4, 284-293. Parks, T. 1965, " P o s t - R e t i n a l V i s u a l Storage," American J o u r n a l of  Psychology 78, 145-147. P i a g e t , J . 1967, B i o l o g y and Knowledge, G a l l i m a r d Press, P a r i s . P i n k e r , S. 1980, "Mental Imagery and the T h i r d Dimension," J o u r n a l of  Experimental Psychology: General 109(3), 354-371. Pin k e r , S. and Fi n k e , R.A. 1980, "Emergent Two-Dimensional P a t t e r n s i n Images Rotated i n Depth," J o u r n a l of Experimental Psychology: Human Pe r c e p t i o n  and Performance 6 j 2 ) , 244-64. Posner, M.I. 1978, Chronometric E x p l o r a t i o n of Mind, Erlbaum, H i l l s d a l e , New J e r s e y . Pylyshyn, Z.W. 1973, "What the Mind's Eye T e l l s the Mind's B r a i n : a C r i t i q u e of Mental Images," Psychology B u l l e t i n 8_0(1), 1-24. Pylyshyn, Z., E l c o c k , E.W., Marmor, M., and Sander, P. 1978a, " E x p l o r a t i o n in Vi s u a l - M o t o r Space," Proc. Second  Conf. of the Canadian Soc i e t y f o r the Computational S t u d i e s  of I n t e l l i g e n c e , Toronto, Canada Pylyshyn, Z., E l c o c k , E.W., Marmor, M.M., and Sander, P.T. 1978b, A System f o r Perceptual-Motor Based Reasoning, Report 42, Department of Computer Scie n c e , U. of Western O n t a r i o , London, Canada. Rayner, K. 1975, "The P e r c e p t u a l Span and P e r i p h e r a l Cues i n Reading," C o g n i t i v e Psychology 1_, 65-81 . Rayner, K. 1978, "Eye Movements i n Reading and Information P r o c e s s i n g , " P s y c h o l o g i c a l B u l l e t i n 85(3), 618-660. 
Rayner, K., McConkie, G.W., and E r l i c h , S. 1978, "Eye Movements and I n t e g r a t i n g Information Across F i x a t i o n s , " J o u r n a l of Experimental Psychology: Human  Per c e p t i o n and Performance 4(4), 529-544. 1 92 Riggs, L.A. 1965, " V i s u a l A c u i t y , " i n V i s i o n and V i s u a l P e r c e p t i o n , C H . Graham (ed.), Wiley, New York. Riggs, L.A. . . . 1973, "Curvature as a Feature of P a t t e r n V i s i o n , Science  181, 1070-1072. Roberts, L.G. 1965, "Machine P e r c e p t i o n of Three Dimensional S o l i d s , " i n O p t i c a l and E l e c t r o - o p t i c a l Information P r o c e s s i n g , J.T. T i p p e t t , D. Berkowitz, L. Clapp, C. Koester, and A. Vanderburgh (e d s . ) , M.I.T. Pr e s s , Cambridge, Massachusetts, 159-197. Rosch, E. 1975, " C o g n i t i v e R e p r e s e n t a t i o n s of Semantic C a t e g o r i e s , " J o u r n a l of Experimental Psychology; General 104, 193-233. Rosenthal, D. and Bajcsy, R. 1978, "Conceptual and V i s u a l F o c u s s i n g i n the R e c o g n i t i o n Process as Induced by Querie s , " Proc. Fourth I n t e r n a t i o n a l  J o i n t Conference on P a t t e r n R e c o g n i t i o n , Kyoto, 417-420. Roy, R. and Sutro, L.L. 1982, " S i m u l a t i o n of Two Forms of Eye Motion and I t s P o s s i b l e I m p l i c a t i o n s f o r the Automatic R e c o g n i t i o n of Three-Dimensional O b j e c t s , " IEEE Transact ions on Systems, Man, and  C y b e r n e t i c s SMC~12(3), 276-288. Senders, J.W., F i s h e r , D.F., and Monty, R.A. 1978, Eye Movements and the Higher P s y c h o l o g i c a l Funct ions, Erlbaum, H i l l s d a l e , New J e r s e y . Shaw, A.C. 1969, "A Formal P i c t u r e D e s c r i p t i o n Scheme as a Ba s i s f o r P i c t u r e P r o c e s s i n g Systems," Information and C o n t r o l 14(1), 9-52. . S h i f f r i n , R.M. and Schneider, W. 1977, " C o n t r o l l e d and Automatic Human Information P r o c e s s i n g : I I . 
P e r c e p t u a l L e a r n i n g , Automatic A t t e n d i n g and General Theory," P s y c h o l o g i c a l Review 84, 127-190. S h i r a i , Y. 1975, "Analyzing I n t e n s i t y A r r a y s Using Knowledge About Scenes," i n The Psychology of Computer V i s i o n , P.H. Winston (ed.), McGraw-Hill, New York, 93-113. 1 93 Stanton, R.B. 1970, "Computer Graphics - the Recovery of D e s c r i p t i o n s i n G r a p h i c a l Communication," Phd T h e s i s , E l e c t r o n i c Computation, U n i v e r s i t y of New South Wales. Stanton, R.B. 1972, "The I n t e r p r e t a t i o n of Graphics and Graphic Languages," in Graphic Languages, F. Nake and A. Rosenfeld (eds.), North-Holland P u b l i s h i n g Co., Amsterdam, 144-159. Stevens, K. A. 1981, "The V i s u a l I n t e r p r e t a t i o n of Surface Contours," A r t i f i c i a l I n t e l l i g e n c e 17, 47-75. Tanimoto, S.L. 1976, " P i c t o r i a l Feature D i s t o r t i o n i n a Pyramid," Computer  Graphics and Image P r o c e s s i n g 5(3), 333-352. Tanimoto, S.L. 1980, "Image Data S t r u c t u r e s , " i n S t r u c t u r e d Computer V i s i o n , S. Tanimoto and A. K l i n g e r ( e d s . ) , Academic Press, New York, 31-55. Tanimoto, S.L. and P a v l i d i s , T. 1975, "A H i e r a r c h i c a l Data S t r u c t u r e f o r P i c t u r e P r o c e s s i n g , " Computer Graphics and Image P r o c e s s i n g 4^(2), 104-119. Treisman, A.M. 1982, " P e r c e p t u a l Grouping and A t t e n t i o n i n V i s u a l Search f o r Features and f o r O b j e c t s , " J o u r n a l of Experimental  Psychology: Human Percept ion and Performance 8, 194-214. Treisman, A.M. and Gelade, G. 1980, "A Feature I n t e g r a t i o n Theory of A t t e n t i o n , " C o g n i t i v e  Psychology 12, 97-136. Treisman, A.M. and Schmidt, H. 1981, " I l l u s o r y Conjunctions i n the P e r c e p t i o n of O b j e c t s , " C o g n i t i v e Psychology 14, 107-141. Treisman, A.M., Sykes, M., and Gelade, G. 
1977, " S e l e c t i v e A t t e n t i o n and Stimulus I n t e g r a t i o n , " i n A t t e n t i o n and Performance VI, S. Dornic ( e d . ) , Erlbaum, H i l l s d a l e , Nerw J e r s e y . Uhr, L. 1972, "Recognition Cone Networks that Preprocess, C l a s s i f y , and D e s c r i b e , " IEEE T r a n s a c t i o n s on Computers 21 , 758-768. 194 Uhr, L. 1980, " P s y c h o l o g i c a l M o t i v a t i o n and U n d e r l y i n g Concepts," i n S t r u c t u r e d Computer V i s i o n , S. Tanimoto and A. K l i n g e r (eds.), Academic Press, N.Y., 1-30. van Wijngaarden, A., M a i l l o u x , B.J., Peck, J.E., and Koster, C.H.A. 1969, "Report on the A l g o r i t h m i c Language ALGOL 68," Report MR 101, Mathematisch Centrum, Amsterdam. Walker-Smith, G.J. and Gale, A.G. 1977, "Eye Movement S t r a t e g i e s Involved i n Face P e r c e p t i o n s , " Percept ion 6, 313-326. Waltz, D.L. 1972, "Generating Semantic D e s c r i p t i o n s From Drawings of Scenes with Shadows," T e c h n i c a l Note TR-271, A l Lab, M.I.T. Warrington, E.K. and T a y l o r , A.M. 1973, "The C o n t r i b u t i o n of the Right P a r i e t a l Lobe to Object R e c o g n i t i o n , " Cortex 9, 152-164. Warrington, E.K. and T a y l o r , A.M. 1975, "The s e l e c t i v e Impairment of Semantic Memory," Q u a r t e r l y J o u r n a l of Experimental Psychology 27, 635-657. Watson, A.B. 1982, "Summation of G r a t i n g Patches I n d i c a t e s Many Types of Detector at One R e t i n a l L o c a t i o n , " V i s i o n Research 22, 17-25. W e i s s t e i n , N. and H a r r i s , C S . 1974, " V i s u a l D e t e c t i o n of L i n e Segments: an Object S u p e r i o r i t y E f f e c t , " Science 186, 752-755. Westheimer, G.H. 1954, "Eye Movement Responses to a H o r i z o n t a l l y Moving V i s u a l Stimulus," A r c h i v e s of Opthamology 52, 932-934. Westheimer, G.H. 
1982, "The S p a t i a l G r a i n of the P e r i f o v e a l V i s u a l F i e l d , " V i s i o n Research 22, 157-162. Wilson, H.R. and Bergen, J.R. 1979, "A Four Mechanism Model f o r Threshold S p a t i a l V i s i o n , " V i s i o n Research 19, 19-32. Winston, P.H. 1972, "The MIT Robot," i n Machine I n t e l l i g e n c e 7, B. M e l t z e r and D. M i c h i e (eds.), American E l s e v i e r , New York, 431-463. 195 W i t k i n , A.P. 1981, "Recovering Surface Shape and O r i e n t a t i o n from Texture," A r t i f i c i a l I n t e l l i g e n c e Vl_{\-3), 17-45 . Woodham, R.J. 1978, " R e f l e c t a n c e Map Techniques f o r A n a l y s i n g Surface D e f e c t s in Metal C a s t i n g s , " T e c h n i c a l Report 457, Al Lab, M.I.T. Woodham, R.J. 1981, "Analyzing Images of Curved S u r f a c e s , " A r t i f i c i a l  I n t e l l i g e n c e j _ 7 ( l - 3 ) , 117-140. Yarbus, A.L. 1967, Eye Movements and V i s i o n , Plenum Press, New York; Z e k i , S.M. 1978, "Uniformity and D i v e r s i t y of S t r u c t u r e i n Rhesus Monkey P r e s t r i a t e V i s u a l Cortex," J o u r n a l of Ph y s i o l o g y 277, 273-290. Reference Notes (1) Schmidt, H., Ph.D. T h e s i s ( i n p r e p a r a t i o n ) , Dept. of Psychology, U n i v e r s i t y of Pennsylvania, P h i l a d e l p h i a , P e n n s y l v a n i a . (2) Mulder, J . Ph.D. T h e s i s ( i n p r e p a r a t i o n ) , Dept. of Computer Sc i e n c e , U n i v e r s i t y of B r i t i s h Columbia, Vancouver, Canada. 196 Appendix A Angles at Connect ions In the body drawing i n t e r p r e t a t i o n system, the angle at the p o i n t of connection between curved l i n e s a c t s as a cue to the models that the l i n e s may play some pa r t i n . Since the system does not always have complete i n f o r m a t i o n about the image, i t would not seem reasonable to simply use the angle formed by the endpoints. A b e t t e r measure i s the l o c a l angle of c o n n e c t i o n . 
Since the body drawing interpretation system simulates the localized availability of features, a simple means of providing this local angle was sought. The diagram below shows the connection between two lines AB and AC. We wish to provide the angle $\theta'$, the local angle of connection, where $\theta$ is the angle formed by the endpoints of the two lines. The curvature of the line AB is known, and is represented as the change in the angle of the tangent, $\phi$. Thus the local angle at connection $\theta'$ is given as

$$\theta' = \theta + \phi/2$$

Appendix B. Body Form Knowledge

This appendix describes the declarative schemata system, and gives the details of its application to the body form problem domain. Examples of the use of these schemata are found in sections 3.2 and 4.2.

The Structure of a Schema

The system consists of a list of schemata, each of which defines a prototypical object. The general form of a schema is:

    <object-schema> => (name (<parameter>*) <description>+)

where the asterisk "*" indicates any number of repetitions, and the plus sign "+" indicates any number of repetitions, but at least one.

    <parameter> => (name value value+)

Parameters are special inherited attributes. In some cases there are variations of objects which are quite minor, and do not justify the development of separate objects because a simple parameter may be used to distinguish the variations.

    <description> => (<type> (<object-reference>+) (<relation-reference>*) <attribute-section>)

There are three types of descriptions: component, image, and specialization.
An indication of the type is retained in order to access the three different resulting hierarchies. The structure of the description is the same in all cases.

The typical description consists of three parts as shown above. The <object-reference> part specifies objects which are the components (or support) of the object which is being defined. The <relation-reference> part specifies relations that must hold over the components in order to constitute the object being defined. The <attribute-section> part specifies how attribute values can be obtained for the object through reference to attributes of the components and relations.

    <object-reference> => (ref object <attribute-value>*)

Each component object is specified. It is given a reference "ref", by which it can be referred to throughout the component description.

    <attribute-value> => (attribute value)
                      => (attribute (value1 value2))

Component objects can be further specified so as to require that they have certain attributes. The second form indicates that the attribute value must fall within a range.

    <relation-reference> => (ref relation (arg arg+) <attribute-value>*)

Relation references appear much the same as object references, except that they take on a number of arguments which are objects over which the relation must hold. Relations also take on attributes, over which requirements may be specified in the same way as they are for component objects.
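The two <attribute-value> forms can be read as a small matching rule. The following Python sketch illustrates one possible reading, under assumptions that are not taken from the thesis: a single value is compared for exact equality, a pair is treated as an inclusive range, and the names `satisfies` and `matches_reference`, the tuple encoding, and the range bounds are all hypothetical. The original system may well have used tolerances rather than exact comparison.

```python
# Hypothetical sketch: checking <attribute-value> constraints against the
# attributes of a candidate instance. Names and encoding are assumptions.

def satisfies(constraint, attributes):
    """Return True if `attributes` meets one (attribute value) constraint.

    `constraint` is (name, value) for an exact requirement, or
    (name, (low, high)) for a range requirement (assumed inclusive).
    """
    name, required = constraint
    if name not in attributes:
        return False
    actual = attributes[name]
    if isinstance(required, tuple):
        low, high = required
        return low <= actual <= high
    return actual == required


def matches_reference(reference, attributes):
    """Check every constraint of a (ref object <attribute-value>*) entry.

    Only the attribute constraints are checked here; the object type is
    ignored to keep the sketch short.
    """
    _ref, _object_type, *constraints = reference
    return all(satisfies(c, attributes) for c in constraints)


# ($1 line (curve 0)) written as a Python tuple
straight = ("$1", "line", ("curve", 0))
# A range form such as (curve (10 30)); these bounds are made up for illustration
curved = ("$2", "line", ("curve", (10, 30)))

print(matches_reference(straight, {"curve": 0, "lengthl": 42}))
print(matches_reference(curved, {"curve": 21}))
print(matches_reference(curved, {"curve": 45}))
```

Relation references could be checked the same way, since their attribute requirements use the same two forms.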
The inventory of available relations is not specified explicitly within the declarative schemata system. They are, in all cases, simple relations whose semantics is easily understood.

    <attribute-section> => (<attribute-definition>*)
    <attribute-definition> => (attribute <- method)

There may be any number of attribute sections. The first one specifies the means of developing those attribute values which are always developed in the same way, regardless of context. The attribute sections which follow are ways of developing alternatives for special clusters of attributes which are sensitive to context. A method for the development of the attribute value is specified as some simple construction over the attribute values of the component objects and relations.

Basic Objects and Relations

There are three basic object types in the domain of line-drawings of body-forms:

(1) The objects which make up the scene domain, such as ARM, HAND, UPPER-BODY, etc.
(2) The objects which are lines and line constructions, such as LINE, LINE-FOOT-1, LINE-HEAD-3, etc.
(3) The objects which are BLOBs.

For each of the image domains, there are scene objects which have descriptions based on that domain. For example:

    LIMB is a BLOB-based scene object
    ARM is a LINE-based scene object

At the upper levels of the scene domain definition, these LINE and BLOB-based objects are related to one another through the use of specialization descriptions.
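An attribute definition of the form (attribute <- method) can be pictured as a small expression evaluated over the components and relations already bound to their references. The Python sketch below is only an illustration: the nested-tuple encoding, the `evaluate` function, and the reading of `middle` as the point halfway between two points are assumptions, since the thesis does not spell out the semantics of its method constructions.

```python
# Hypothetical sketch of evaluating an (attribute <- method) definition.
# The meaning given to `middle` (halfway between two points) is an assumption.

def middle(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


OPERATORS = {"middle": middle}


def evaluate(method, bindings):
    """Evaluate a nested-tuple method against bound components and relations.

    (op arg ...)  with op in OPERATORS applies the operator to evaluated args;
    (attr ref)    looks up attribute `attr` of the instance bound to `ref`;
    anything else (for example a number) is returned unchanged.
    """
    if not isinstance(method, tuple):
        return method
    head, *args = method
    if head in OPERATORS:
        return OPERATORS[head](*(evaluate(a, bindings) for a in args))
    (ref,) = args
    return bindings[ref][head]


bindings = {
    "$2": {"midpoint": (4.0, 0.0)},   # stands in for a LINE instance
    "$6": {"location": (0.0, 2.0)},   # stands in for a CONNECT instance
}

# (location <- (middle (location $6) (midpoint $2)))
method = ("middle", ("location", "$6"), ("midpoint", "$2"))
print(evaluate(method, bindings))
```

Other constructions could be added to the `OPERATORS` table in the same way once their intended meaning is known.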
The domain definition includes a complete description of all image domain constructions (such as LINE-FOOT-3), and complete descriptions of all scene domain objects (whether LINE or BLOB based, or constructed). This definition includes a specification of all attributes, and the means of obtaining their values.

There are two aspects of the definition which are not made explicit in the representation. They are:

(1) The meaning of the attributes of the two basic objects LINE and BLOB.
(2) The semantics of the relations used to specify the construction of objects (such as NEAR, CONNECT, etc.).

Implicit Object: LINE

CURVE: the number of degrees that the line segment curves (positive value only).
LENGTHL: the length of the line segment (endpoint to endpoint distance).
MIDPOINT: the midpoint between the two ends.

Implicit Object: BLOB

COFG: the center of gravity of the blob.
PT1L: one of the endpoints of the long axis.
PT2L: the other endpoint of the long axis. If one axis cannot be distinguished as the longer, then the center of gravity is used for both endpoints.
LENGTHB: the length of the long axis.
RATIO: the ratio of long to short axes (x100).

Implicit Relation: CONNECT

This relation takes two arguments, which are both lines. Two lines are said to meet the CONNECT relation if they are distinct and have endpoints which are within a threshold of each other.

ANGLE: the deflection at the connection point, going from the first line to the second (positive angle for a right hand turn).
RATIO: the ratio between the lengths of the two lines (first over second, x100).
LOCATION: the point of the first line which is in the connection.
CTYPE: the indication of the nature of the lines at the connection. Most connections are convex by default, which includes the case of both lines being straight. If the first line curves to the left, then the "ctype" is "concave1". If the second line curves to the left, it is "concave2", and "concave3" indicates both these conditions.

Implicit Relation: NEAR

This relation takes four arguments, two body parts and two attributes of those body parts whose values are points. The two points thus specified are examined to see if they are close to each other. This is done by first obtaining a threshold (which is one quarter of the size of the larger of the two body parts).

ANGLE-X, ANGLE-Y, ANGLE-Z: All body parts have 3D orientations. Each of these three attributes is the relation between the orientations of the specified body parts in one of the three component axes of rotation. In each case, it is the orientation of the second specified body part minus the orientation of the first.
RATIO: the ratio of the lengths (sizes) of the two specified body parts (first over second, x100).
LOCATION: halfway between the two points that are found to be NEAR.

Implicit Relation: B-CONNECT

This relation takes four arguments which are two BLOB-based scene objects, and two attributes which have point values.
Each BLOB-based scene object has a value for the attribute DELTA which determines how close the point of another BLOB-based scene object must be before it can B-CONNECT. If the closeness range is met for both the specified objects, then the B-CONNECT will be found to hold. This arrangement is needed because scene objects may be based on BLOBs which have no distinct axis endpoints. In this case the center is used as the connection point, and a large DELTA is specified (based on the size of the BLOB). If the BLOB does have distinct axis endpoints, then they are used for connection points (with a small value for DELTA).

RATIO: the ratio (x100) between the sizes of the two specified scene objects.

Implicit Relation: LINE-NEAR

A relation which is similar to NEAR, except that it takes arguments which are LINE constructions, and attributes which have point values. This relation is used to test the proximity of LINE constructions in building up more complex LINE constructions (see LINE-TRUNK-5). The same criterion as for NEAR is used.

RATIO: the ratio (x100) between the sizes of the two specified line constructions.
ANGLE: the 2D angle formed between the two (the same as a single axis for NEAR).

Schemata for the Body Form Problem Domain

;***************************************************
; line-based image constructions.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * ( i n s t a l l ' ( l i n e - f o o t - 1 n i l (component (($1 l i n e (curve 0)) ($2 l i n e (curve 0)) ($3 l i n e (curve 0))) (($4 connect ($1 $2) (angle 120) ( r a t i o 193)) ($5 connect ($2 $3) (angle 90) ( r a t i o 60)) ($6 connect ($3 $1) (angle 150) ( r a t i o 86))) ((a2d <- (slope ( l o c a t i o n $6) ( l o c a t i o n $5))) (proximal-end <- ( l o c a t i o n $4)) ( l o c a t i o n <- (middle ( l o c a t i o n $6) (midpoint $2))) ( s i z e <- ( l e n g t h l $1 )) ) ))) ( i n s t a l l ' ( l i n e - f o o t - 2 n i l (component (($1 l i n e (curve 0) ) ($2 l i n e (curve 0)) . ($3 l i n e (curve 0) ) ) (($4 connect ($1 $2) (angle 150) ( r a t i o 116)) ($5 connect ($2 $3) (angle 90) ( r a t i o 167)) ($6 connect ($3 $1) (angle 120) ( r a t i o 52))) ((a2d <- (slope ( l o c a t i o n $5) ( l o c a t i o n $4))) (proximal-end <- ( l o c a t i o n $6)) ( l o c a t i o n <- (middle ( l o c a t i o n $4) (midpoint $3))) ( s i z e <- ( l e n g t h l $ 1 ) ) ) ) ) ) ( i n s t a l l ' ( l i n e - f o o t - 3 n i l (component (($1 l i n e (curve 0)) ($2 l i n e (curve 0) ) ($3 l i n e (curve 0) ) ) (($4 connect ($1 $2) (angle 124) ( r a t i o 180)) ($5 connect ($2 $3) (angle 90) ( r a t i o 67)) ($6 connect ($3 $1) (angle 146) ( r a t i o 83))) ((a2d <- (slope ( l o c a t i o n $5) ( l o c a t i o n $4))) (proximal-end <- ( l o c a t i o n $6)) ( l o c a t i o n <- (middle ( l o c a t i o n $5) (midpoint $1))) 206 ( s i z e <- ( l e n g t h l $ 1 ) ) ) ) ) ) ( i n s t a l l ' ( l i n e - f o o t - 4 n i l (component (($1 l i n e (curve 0)) ($2 l i n e (curve 0)) ($3 l i n e (curve 0))) (($4 connect ($1 $2) (angle 146) ( r a t i o 120)) ($5 connect ($2 $3) (angle 90) ( r a t i o 150)) ($6 connect ($3 $1) (angle 124) ( r a t i o 56))) ((a2d <- (slope ( l o c a t i o n $6) ( l o c a t i o n $5))) (proximal-end <- ( l o c a t i o n $4)) ( l o c a t 
i o n <- (middle ( l o c a t i o n $5) (midpoint $1))) ( s i z e <- ( l e n g t h l $ 1 ) ) ) ) ) ) ( i n s t a l l ' ( l i n e - l o w e r - l e g - 1 n i l (component (($1 l i n e (curve 0)) ($2 l i n e (curve 0)) ($3 l i n e (curve 21 ))) (($4 connect ($1 $2) (angle 120) ( r a t i o 362)) ($5 connect ($2 $3) (angle 65) ( r a t i o 31)) ($6 connect ($3 $1) (angle 155) ( r a t i o 89))) ( ( s i z e <- ( l e n g t h l $1 )) (a2d <- ( d i f f (slope ( l o c a t i o n $ 6 ) ( l o c a t i o n $4)) 90)) (proximal-end <- ( l o c a t i o n $4)) ( l o c a t i o n <- (middle ( l o c a t i o n $6) (midpoint $2))) ( d i s t a l - e n d <- ( l o c a t i o n $ 6 ) ) ) ) ) ) ( i n s t a l l ' (1ine-lower-leg-2 n i l (component (($1 l i n e (curve 0)) ($2 l i n e (curve 21 )) ($3 l i n e (curve 0)) ) (($4 connect ($1 $2) (angle 155) ( r a t i o 113)) ($5 connect ($2 $3) (angle 65) ( r a t i o 321)) ($6 connect ($3 $1) (angle 120) ( r a t i o 28))) ( ( s i z e <- ( l e n g t h l $1)) (a2d <- ( d i f f (slope ( l o c a t ion $ 4 ) ( l o c a t i o n $6)) 90)) (proximal-end <- ( l o c a t i o n $6)) ( l o c a t i o n <- (middle ( l o c a t i o n $4) (midpoint $3))) ( d i s t a l - e n d <- ( l o c a t i o n $ 4 ) ) ) ) ) ) 207 ( i n s t a l l ' ( l i n e - u p p e r - l e g - 1 n i l (component (($1 l i n e (curve 53)) ($2 l i n e (curve 0)) ($3 l i n e (curve 0))) (($4 connect ($1 $2) (angle 140) ( r a t i o 119)) ($5 connect ($2 $3) (angle 60) ( r a t i o 313)) ($6 connect ($3 $1) (angle 108) ( r a t i o 27))) ( ( s i z e <- ( l e n g t h l $1 )) (a2d <- ( d i f f (slope ( l o c a t ion $4) ( l o c a t i o n $5)) 90)) (proximal-end <- ( l o c a t i o n $6)) ( l o c a t i o n <- (middle ( l o c a t i o n $4) (midpoint $3))) ( d i s t a l - e n d <- ( l o c a t i o n $ 4 ) ) ) ) ) ) ( i n s t a l l ' ( l i n e - u p p e r - l e g - 2 n i l (component (($1 l i n e (curve 53)) ($2 l i n e (curve 0)) ($3 l i n e (curve 0))) (($4 connect ($1 $2) (angle 108) ( r a t i o 372)) ($5 connect 
($2 $3) (angle 60) ( r a t i o 32)) ($6 connect ($3 $1) (angle 140) ( r a t i o 84))) ( ( s i z e <- ( l e n g t h l $1)) (a2d <- ( d i f f (slope ( l o c a t i o n $6) ( l o c a t i o n $5)) 90)) (proximal-end <- ( l o c a t i o n $4)) ( l o c a t i o n <- (middle ( l o c a t i o n $6) (midpoint $2))) ( d i s t a l - e n d <- ( l o c a t i o n $ 6 ) ) ) ) ) ) ( i n s t a l l '(1ine-upper-leg-3 n i l (component (($1 l i n e (curve 0) ) ($2 l i n e (curve 0)) ($3 l i n e (curve 36))) (($4 connect ($1 $2) (angle 117) ( r a t i o 330)) ($5 connect ($2 $3) (angle 62) ( r a t i o 34)) ($6 connect ($3 $1) (angle 145) ( r a t i o 90))) ( ( s i z e <- ( l e n g t h l $1 ) ) (a2d <- ( d i f f (slope ( l o c a t i o n $6) ( l o c a t i o n $4)) 90)) (proximal-end <- ( l o c a t i o n $4)) ( l o c a t i o n <- (middle ( l o c a t i o n $6) (midpoint $2))) ( d i s t a l - e n d <- ( l o c a t i o n $ 6 ) ) ) ) ) ) 208 ( i n s t a l l '(1ine-upper-leg-4 n i l (component (($1 l i n e (curve 0)) ($2 l i n e (curve 36)) ($3 l i n e (curve 0))) (($4 connect ($1 $2) (angle 145) ( r a t i o 116)) ($5 connect ($2 $3) (angle 62) ( r a t i o 284)) ($6 connect ($3 $1) (angle 117) ( r a t i o 30))) ( ( s i z e <- ( l e n g t h l $1)) (a2d <- ( d i f f (slope ( l o c a t ion $4) ( l o c a t i o n $6)) 90)) (proximal-end <- ( l o c a t i o n $6)) ( l o c a t i o n <- (middle ( l o c a t i o n $4) (midpoint $3))) ( d i s t a l - e n d <- ( l o c a t i o n $ 4 ) ) ) ) ) ) ( i n s t a l l ' ( l i n e - u p p e r - l e g - 5 n i l (component (( $ 1 l i n e (curve 53)) ($2 l i n e (curve 0)) ($3 l i n e (curve 0))) (($4 connect ($1 $2) (angle 140) ( r a t i o 103)) ($5 connect ($2 $3) (angle 90) ( r a t i o 414)) ($6 connect ($3 $1) (angle 78) ( r a t i o 23))) ( ( s i z e <- ( l e n g t h l $1)) (a2d <- ( d i f f (slope ( l o c a t ion $4) ( l o c a t i o n $5)) 90)) (proximal-end <- ( l o c a t i o n $5)) ( l o c a t i o n <- (middle ( l o c a t i o n $4) (midpoint $3))) ( d i s t a l - e n d 
<- ( l o c a t i o n $ 4 ) ) ) ) ) ) ( i n s t a l l '(1ine-upper-leg-6 n i l (component ( ($ 1 l i n e (curve 53)) ($2 l i n e (curve 0)) ($3 l i n e (curve 0))) (($4 connect ($1 $2) (angle 78) ( r a t i o 426)) ($5 connect ($2 $3) (angle 90) ( r a t i o 24)) ($6 connect ($3 $1) (angle 140) ( r a t i o 97))) ( ( s i z e <- ( l e n g t h l $1 )) (a2d <- ( d i f f (slope ( l o c a t i o n $6) ( l o c a t i o n $5)) 90)) (proximal-end <- ( l o c a t i o n $5)) ( l o c a t i o n <- (middle ( l o c a t i o n $6) (midpoint $2))) ( d i s t a l - e n d <- ( l o c a t i o n $ 6 ) ) ) ) ) ) ( i n s t a l l ' ( l i n e - h i p s - 1 n i l (component (($1 l i n e (curve 0)) ($2 l i n e (curve 0)) ($3 l i n e (curve 0))) (($4 connect ($1 $2) (angle 117) ( r a t i o 90)) ($5 connect ($2 $3) (angle 126) ( r a t i o 100)) ($6 connect ($3 $1) (angle 117) ( r a t i o 111))) ( ( s i z e <- ( l e n g t h l $2)) (a2d <- (slope ( l o c a t ion $4) ( l o c a t i o n $6))) (top <- ( l o c a t i o n $5)) ( l o c a t i o n <- (middle ( l o c a t i o n $5) (midpoint $1))) (bottom <- (midpoint $ 1 ) ) ) ) ) ) ( i n s t a l l ' ( l i n e - h i p s - 2 n i l (component (($ 1 l i n e (curve 45 ) ) ($2 l i n e (curve 0) ) ($3 l i n e (curve 0) ) ) (($4 connect ($1 $2) (angle 92) ( r a t i o 211)) ($5 connect ($2 $3) (angle 95) ( r a t i o 51)) ($6 connect ($3 $1) (angle 129) ( r a t i o 92))) ( ( s i z e <- ( l e n g t h l $1)) (a2d <- (sl o p e ( l o c a t ion $5) ( l o c a t i o n $4))) (top <- ( l o c a t i o n $6)) ( l o c a t i o n <- (middle ( l o c a t i o n $5) (midpoint $1))) (bottom <- (midpoint $ 2 ) ) ) ) ) ) ( i n s t a l l ' ( l i n e - h i p s - 3 n i l (component (($1 l i n e (curve 45)) ($2 l i n e (curve 0)) ($3 l i n e (curve 0))) (($4 connect ($1 $2) (angle 129) ( r a t i o 109)) ($5 connect ($2 $3) (angle 95) ( r a t i o 194)) ($6 connect ($3 $1) (angle 92) ( r a t i o 47))) ( ( s i z e <- ( l e n g t h l $1 )) (a2d <- (slope ( l o c a t i o n $6) ( l o c a t i o n $5))) 
(top <- ( l o c a t i o n $4)) ( l o c a t i o n <- (middle ( l o c a t i o n $5) (midpoint $1))) (bottom <- (midpoint $3 ) ) ) ) ) ) ( i n s t a l l '(line-hand-1 n i l (component ( ($ 1 l i n e (curve 53)) 210 ($2 l i n e (curve 0)) ($3 l i n e (curve 0))) (($4 connect ($1 $2) (angle 134) ( r a t i o 108)) ($5 connect ($2 $3) (angle 90) ( r a t i o 225)) ($6 connect ($3 $1) (angle 92) ( r a t i o 41))) ( ( s i z e <- ( l e n g t h l $1 )) (a2d <- (slope ( l o c a t i o n $5) ( l o c a t i o n $6))) (proximal-end <- (midpoint $3)) ( l o c a t i o n <- (middle ( l o c a t i o n $4) (midpoint $3))) ( d i s t a l - e n d <- ( l o c a t i o n $ 4 ) ) ) ) ) ) ( i n s t a l l ' (line-hand-2 n i l (component ( ($1 l i n e (curve 53)) ($2 l i n e (curve 0)) ($3 l i n e (curve 0))) (($4 connect ($1 $2) (angle 92) ( r a t i o 244)) ($5 connect ($2 $3) (angle 90) ( r a t i o 44)) ($6 connect ($3 $1) (angle 134) ( r a t i o 92))) ( ( s i z e <- ( l e n g t h l $1)) (a2d <- (slope ( l o c a t i o n $4) ( l o c a t i o n $5))) (proximal-end <- (midpoint $2)) ( l o c a t i o n <- (middle ( l o c a t i o n $6) (midpoint $2))) ( d i s t a l - e n d <- ( l o c a t i o n $ 6 ) ) ) ) ) ) ( i n s t a l l ' (line-hand-3 n i l (component (($1 l i n e (curve 41)) ($2 l i n e (curve 15)) ($3 l i n e (curve 0)) ($4 1ine (curve 10)) ($5 l i n e (curve 0))) (($6 connect ($1 $2) (angle 115) ( r a t i o 189)) ($7 connect ($3 $2) (angle 98) ( r a t i o 37) (ctype concave2)) ($8 connect ($3 $4) (angle 127) ( r a t i o 30)) ($9 connect ($4 $5) (angle 69) ( r a t i o 153)) ($10 connect ($5 $1) (angle 82) ( r a t i o 42))) ( ( s i z e <- ( l e n g t h l $1)) (a2d <- (slope ( l o c a t i o n $9) ( l o c a t i o n $10))) (proximal-end <- (midpoint $5)) ( l o c a t i o n <- (middle ( l o c a t i o n $6) (midpoint $5))) ( d i s t a l - e n d <- ( l o c a t i o n $ 6 ) ) ) ) ) ) ( i n s t a l l 21 1 ' (line-hand-4 n i l (component ( ($ 1 l i n e (curve 41)) ($2 l i n e (curve 0)) ($3 l i n e (curve 10)) 
     ($4 line (curve 0))
     ($5 line (curve 15)))
    (($6 connect ($1 $2) (angle 82) (ratio 240))
     ($7 connect ($2 $3) (angle 69) (ratio 65))
     ($8 connect ($3 $4) (angle 127) (ratio 392))
     ($9 connect ($5 $4) (angle 98) (ratio 271) (ctype concave1))
     ($10 connect ($5 $1) (angle 115) (ratio 53)))
    ((size <- (lengthl $1))
     (a2d <- (slope (location $6) (location $7)))
     (proximal-end <- (midpoint $2))
     (location <- (middle (location $10) (midpoint $2)))
     (distal-end <- (location $10))))))

(install
 '(line-hand-5 nil
   (component
    (($1 line (curve 17))
     ($2 line (curve 0))
     ($3 line (curve 0))
     ($4 line (curve 0))
     ($5 line (curve 20))
     ($6 line (curve 0)))
    (($7 connect ($1 $2) (angle 40) (ratio 367))
     ($8 connect ($2 $3) (angle 51) (ratio 35))
     ($9 connect ($3 $4) (angle 45) (ratio 340))
     ($10 connect ($4 $5) (angle 47) (ratio 22))
     ($11 connect ($5 $6) (angle 67) (ratio 143))
     ($12 connect ($6 $1) (angle 70) (ratio 73)))
    ((size <- (lengthl $1))
     (a2d <- (slope (location $11) (location $12)))
     (proximal-end <- (midpoint $6))
     (location <- (middle (midpoint $3) (midpoint $6)))
     (distal-end <- (location $3))))))

(install
 '(line-lower-arm-1 nil
   (component
    (($1 line (curve 57))
     ($2 line (curve 0))
     ($3 line (curve 0)))
    (($4 connect ($1 $2) (angle 144) (ratio 136))
     ($5 connect ($2 $3) (angle 28) (ratio 257))
     ($6 connect ($3 $1) (angle 132) (ratio 29)))
    ((size <- (lengthl $1))
     (a2d <- (diff (slope (location $4) (location $5)) 90))
     (proximal-end <- (location $6))
     (location <- (middle (location $4) (midpoint $3)))
     (distal-end <- (location $4))))))

(install
 '(line-lower-arm-2 nil
   (component
    (($1 line (curve 57))
     ($2 line (curve 0))
     ($3 line
      (curve 0)))
    (($4 connect ($1 $2) (angle 132) (ratio 348))
     ($5 connect ($2 $3) (angle 28) (ratio 39))
     ($6 connect ($3 $1) (angle 144) (ratio 74)))
    ((size <- (lengthl $1))
     (a2d <- (diff (slope (location $6) (location $5)) 90))
     (proximal-end <- (location $4))
     (location <- (middle (location $6) (midpoint $2)))
     (distal-end <- (location $6))))))

(install
 '(line-lower-arm-3 nil
   (component
    (($1 line (curve 0))
     ($2 line (curve 0))
     ($3 line (curve 0)))
    (($4 connect ($1 $2) (angle 166) (ratio 100))
     ($5 connect ($2 $3) (angle 97) (ratio 421))
     ($6 connect ($3 $1) (angle 97) (ratio 24)))
    ((size <- (lengthl $1))
     (a2d <- (diff (slope (location $4) (location $6)) 83))
     (proximal-end <- (midpoint $3))
     (location <- (middle (location $4) (midpoint $3)))
     (distal-end <- (location $4))))))

(install
 '(line-upper-arm-1 nil
   (component
    (($1 line (curve 37))
     ($2 line (curve 0))
     ($3 line (curve 0)))
    (($4 connect ($1 $2) (angle 150) (ratio 146))
     ($5 connect ($2 $3) (angle 34) (ratio 198))
     ($6 connect ($3 $1) (angle 140) (ratio 35)))
    ((size <- (lengthl $1))
     (a2d <- (diff (slope (location $4) (location $5)) 90))
     (proximal-end <- (location $6))
     (location <- (middle (location $4) (midpoint $3)))
     (distal-end <- (location $4))))))

(install
 '(line-upper-arm-2 nil
   (component
    (($1 line (curve 37))
     ($2 line (curve 0))
     ($3 line (curve 0)))
    (($4 connect ($1 $2) (angle 140) (ratio 288))
     ($5 connect ($2 $3) (angle 34) (ratio 51))
     ($6 connect ($3 $1) (angle 150) (ratio 69)))
    ((size <- (lengthl $1))
     (a2d <- (diff (slope (location $6) (location $5)) 90))
     (proximal-end <- (location $4))
     (location <- (middle (location $6) (midpoint $2)))
     (distal-end <- (location $6))))))

(install
 '(line-upper-arm-3 nil
   (component
    (($1 line (curve 0))
     ($2 line (curve 0))
     ($3 line (curve 37)))
    (($4 connect ($1 $2) (angle 148) (ratio 283))
     ($5 connect ($2 $3) (angle 30) (ratio 49))
     ($6 connect ($3 $1) (angle 146) (ratio 73)))
    ((size <- (lengthl $1))
     (a2d <- (diff (slope (location $6) (location $4)) 90))
     (proximal-end <- (location $4))
     (location <- (middle (location $6) (midpoint $2)))
     (distal-end <- (location $6))))))

(install
 '(line-upper-arm-4 nil
   (component
    (($1 line (curve 0))
     ($2 line (curve 37))
     ($3 line (curve 0)))
    (($4 connect ($1 $2) (angle 146) (ratio 138))
     ($5 connect ($2 $3) (angle 30) (ratio 205))
     ($6 connect ($3 $1) (angle 148) (ratio 35)))
    ((size <- (lengthl $1))
     (a2d <- (diff (slope (location $4) (location $6)) 90))
     (proximal-end <- (location $6))
     (location <- (middle (location $4) (midpoint $3)))
     (distal-end <- (location $4))))))

(install
 '(line-upper-arm-5 nil
   (component
    (($1 line (curve 37))
     ($2 line (curve 0))
     ($3 line (curve 0)))
    (($4 connect ($1 $2) (angle 150) (ratio 102))
     ($5 connect ($2 $3) (angle 90) (ratio 450))
     ($6 connect ($3 $1) (angle 84) (ratio 216)))
    ((size <- (times 1.1 (lengthl $1)))
     (a2d <- (diff (slope (location $4) (location $5)) 90))
     (proximal-end <- (midpoint $3))
     (location <- (middle (location $4) (midpoint $3)))
     (distal-end <- (location $4))))))

(install
 '(line-upper-arm-6 nil
   (component
    (($1 line (curve 37))
     ($2 line (curve 0))
     ($3 line (curve 0)))
    (($4 connect ($1 $2) (angle 84) (ratio 459))
     ($5 connect ($2 $3) (angle 90) (ratio 22))
     ($6 connect ($3 $1) (angle 150) (ratio 98)))
    ((size <- (times 1.1 (lengthl $1)))
     (a2d <- (diff (slope (location $6) (location $5)) 90))
     (proximal-end <- (midpoint $2))
     (location <- (middle (location $6) (midpoint $2)))
     (distal-end <- (location $6))))))

(install
 '(line-torso nil
   (component
    (($1 line (curve 0))
     ($2 line (curve 0))
     ($3 line (curve 0))
     ($4 line (curve 0))
     ($5 line (curve 0)))
    (($6 connect ($1 $2) (angle 138) (ratio 100))
     ($7 connect ($2 $3) (angle 55) (ratio 342))
     ($8 connect ($3 $4) (angle 56) (ratio 72))
     ($9 connect ($4 $5) (angle 56) (ratio 139))
     ($10 connect ($5 $1) (angle 55) (ratio 29)))
    ((size <- (lengthl $1))
     (a2d <- (diff (slope (location $6) (location $7)) 101))
     (top <- (midpoint $4))
     (bottom <- (location $6))
     (left-side <- (midpoint $3))
     (location <- (middle (location $6) (location $4)))
     (right-side <- (midpoint $5))))))

(install
 '(line-trunk-1 nil
   (component
    (($1 line (curve 124))
     ($2 line (curve 69)))
    (($3 connect ($1 $2) (angle 84) (ratio 100))
     ($4 connect ($2 $1) (angle 84) (ratio 100)))
    ((size <- (lengthl $1))
     (a2d <- (diff (slope (location $4) (location $3)) 95))
     (top <- (location $3))
     (location <- (middle (location $3) (location $4)))
     (bottom <- (location $4))))))

(install
 '(line-trunk-2 nil
   (component
    (($1 line (curve 69))
     ($2 line (curve 124)))
    (($3 connect ($1 $2) (angle 84) (ratio 100))
     ($4 connect ($2 $1) (angle 84) (ratio 100)))
    ((size <- (lengthl $1))
     (a2d <- (diff (slope (location $4) (location $3)) 85))
     (top <- (location $3))
     (location <- (middle (location $4) (location $3)))
     (bottom <- (location $4
      ))))))

(install
 '(line-trunk-3 nil
   (component
    (($1 line (curve 96))
     ($2 line (curve 15))
     ($3 line (curve 93)))
    (($4 connect ($1 $2) (angle 128) (ratio 134) (ctype concave2))
     ($5 connect ($2 $3) (angle 4) (ratio 243) (ctype concave1))
     ($6 connect ($3 $1) (angle 58) (ratio 31)))
    ((size <- (lengthl $1))
     (a2d <- (diff (slope (location $4) (location $6)) 110))
     (top <- (location $6))
     (location <- (middle (location $4) (location $6)))
     (bottom <- (location $4))))))

(install
 '(line-trunk-4 nil
   (component
    (($1 line (curve 96))
     ($2 line (curve 93))
     ($3 line (curve 15)))
    (($4 connect ($1 $2) (angle 58) (ratio 325))
     ($5 connect ($2 $3) (angle 4) (ratio 41) (ctype concave2))
     ($6 connect ($3 $1) (angle 58) (ratio 75) (ctype concave1)))
    ((size <- (lengthl $1))
     (a2d <- (diff (slope (location $6) (location $4)) 70))
     (top <- (location $4))
     (location <- (middle (location $6) (location $4)))
     (bottom <- (location $6))))))

(install
 '(line-shoulder nil
   (component
    (($1 line (curve 0))
     ($2 line (curve 0))
     ($3 line (curve 0)))
    (($4 connect ($1 $2) (angle 123) (ratio 109))
     ($5 connect ($2 $3) (angle 114) (ratio 100))
     ($6 connect ($3 $1) (angle 123) (ratio 91)))
    ((size <- (lengthl $1))
     (a2d <- (slope (location $6) (location $4)))
     (left-side <- (midpoint $3))
     (location <- (middle (location $5) (midpoint $1)))
     (right-side <- (midpoint $2))))))

(install
 '(line-neck nil
   (component
    (($1 line (curve 0))
     ($2 line (curve 0))
     ($3 line (curve 0)))
    (($4 connect ($1 $2) (angle 124) (ratio 111))
     ($5 connect ($2 $3) (angle 112) (ratio 100))
     ($6 connect ($3 $1) (angle 124) (ratio 90)))
    (
     (size <- (lengthl $1))
     (a2d <- (slope (location $4) (location $6)))
     (top <- (location $5))
     (location <- (middle (location $5) (midpoint $1)))
     (bottom <- (midpoint $1))))))

(install
 '(line-head-1 nil
   (component
    (($1 line (curve 73))
     ($2 line (curve 73))
     ($3 line (curve 14))
     ($4 line (curve 14)))
    (($5 connect ($1 $2) (angle 20) (ratio 100))
     ($6 connect ($2 $3) (angle 33) (ratio 80))
     ($7 connect ($3 $4) (angle 96) (ratio 100))
     ($8 connect ($4 $1) (angle 33) (ratio 125)))
    ((size <- (lengthl $1))
     (a2d <- (slope (location $8) (location $6)))
     (top <- (location $5))
     (location <- (middle (location $8) (midpoint $6)))
     (bottom <- (location $7))))))

(install
 '(line-head-2 nil
   (component
    (($1 line (curve 101))
     ($2 line (curve 27))
     ($3 line (curve 27))
     ($4 line (curve 0))
     ($5 line (curve 0))
     ($6 line (curve 81)))
    (($7 connect ($1 $2) (angle 23) (ratio 390))
     ($8 connect ($2 $3) (angle 48) (ratio 64))
     ($9 connect ($4 $3) (angle 84) (ratio 24) (ctype concave2))
     ($10 connect ($4 $5) (angle 99) (ratio 23))
     ($11 connect ($5 $6) (angle 2) (ratio 95))
     ($12 connect ($6 $1) (angle 26) (ratio 45)))
    ((size <- (lengthl $1))
     (a2d <- (diff (slope (location $7) (location $12)) 90))
     (top <- (location $12))
     (location <- (middle (location $8) (location $12)))
     (bottom <- (midpoint $2))))))

(install
 '(line-head-3 nil
   (component
    (($1 line (curve 101))
     ($2 line (curve 81))
     ($3 line (curve 0))
     ($4 line (curve 0))
     ($5 line (curve 27))
     ($6 line (curve 27)))
    (($7 connect ($1 $2) (angle 26) (ratio 221))
     ($8 connect ($2 $3) (angle 2) (ratio 106))
     ($9 connect ($3 $4) (angle 99) (ratio 438))
     ($10 connect ($5 $4) (angle 84) (ratio 413) (ctype concave1))
     ($11 connect
      ($5 $6) (angle 48) (ratio 157))
     ($12 connect ($6 $1) (angle 23) (ratio 26)))
    ((size <- (lengthl $1))
     (a2d <- (slope (location $6) (location $7)))
     (top <- (location $7))
     (location <- (middle (location $7) (location $11)))
     (bottom <- (midpoint $6))))))

(install
 '(line-head-4 nil
   (component
    (($1 line (curve 138))
     ($2 line (curve 0))
     ($3 line (curve 138)))
    (($4 connect ($1 $2) (angle 26) (ratio 621))
     ($5 connect ($2 $3) (angle 26) (ratio 16))
     ($6 connect ($3 $1) (angle 32) (ratio 100)))
    ((size <- (lengthl $1))
     (a2d <- (diff (slope (location $4) (location $6)) 100))
     (top <- (location $6))
     (location <- (middle (location $6) (midpoint $2)))
     (bottom <- (midpoint $2))))))

;***********************************************************
; composed line-based constructions.
;***********************************************************

(install
 '(line-trunk-5 nil
   (component
    (($1 line-torso)
     ($2 line-shoulder)
     ($3 line-shoulder))
    (($4 line-near ($2 $1 right-side left-side) (angle (-10 10)) (ratio 28))
     ($5 line-near ($3 $1 left-side right-side) (angle (-10 10)) (ratio 28)))
    ((a2d <- (a2d $1))
     (size <- (size $1))
     (left-side <- (left-side $2))
     (right-side <- (right-side $3))
     (bottom <- (bottom $1))
     (location <- (location $1))
     (top <- (top $1))))))

;***********************************************************
; line-based scene objects
;***********************************************************

(install
 '(foot ((side left right))
   (image1 (($1 line-foot-1)) nil
    ((proximal-end <- (proximal-end $1))
     (size <- (size $1))
     (location <- (location $1))
     (a3d <- (list (neg (a2d $1)) 90 0)))
    ((side <- right))
    ((side <- left)))
   (image2 (($1 line-foot-2)) nil
    ((proximal-end <- (proximal-end $1))
     (size <- (size $1))
     (location <- (location $1))
     (a3d <- (list (neg (a2d $1)) 90 0)))
    ((side <- right))
    ((side <- left)))
   (image3 (($1 line-foot-3)) nil
    ((proximal-end <- (proximal-end $1))
     (size <- (times 1.6 (size $1)))
     (location <- (location $1)))
    ((side <- left) (a3d <- (list 0 0 (neg (a2d $1)))))
    ((side <- right) (a3d <- (list 0 180 (a2d $1)))))
   (image4 (($1 line-foot-4)) nil
    ((size <- (size $1))
     (proximal-end <- (proximal-end $1))
     (location <- (location $1)))
    ((side <- right) (a3d <- (list 0 0 (neg (a2d $1)))))
    ((side <- left) (a3d <- (list 0 180 (a2d $1)))))))

(install
 '(lower-leg ((side left right))
   (image1 (($1 line-lower-leg-1)) nil
    ((size <- (size $1))
     (location <- (location $1))
     (proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1)))
    ((side <- left) (a3d <- (list 0 0 (neg (a2d $1)))))
    ((side <- left) (a3d <- (list (neg (a2d $1)) 90 0)))
    ((side <- right) (a3d <- (list (neg (a2d $1)) 90 0)))
    ((side <- right) (a3d <- (list 0 180 (a2d $1)))))
   (image2 (($1 line-lower-leg-2)) nil
    ((size <- (size $1))
     (location <- (location $1))
     (proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1)))
    ((side <- left) (a3d <- (list 0 180 (neg (a2d $1))))) ; back
    ((side <- left) (a3d <- (list (a2d $1) -90 0)))       ; side
    ((side <- right) (a3d <- (list (a2d $1) -90 0)))      ; side
    (
     (side <- right) (a3d <- (list 0 0 (a2d $1))))))) ; front

(install
 '(upper-leg ((side left right))
   (image1 (($1 line-upper-leg-1)) nil
    ((size <- (size $1))
     (location <- (location $1))
     (proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1)))
    ((side <- left) (a3d <- (list 0 0 (neg (a2d $1))))) ; front
    ((side <- left) (a3d <- (list (a2d $1) -90 0)))     ; side
    ((side <- right) (a3d <- (list 0 180 (a2d $1))))    ; back
    ((side <- right) (a3d <- (list (a2d $1) -90 0))))   ; side
   (image2 (($1 line-upper-leg-2)) nil
    ((size <- (size $1))
     (location <- (location $1))
     (proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1)))
    ((side <- left) (a3d <- (list 0 180 (a2d $1))))
    ((side <- left) (a3d <- (list (neg (a2d $1)) 90 0)))
    ((side <- right) (a3d <- (list 0 0 (neg (a2d $1)))))
    ((side <- right) (a3d <- (list (neg (a2d $1)) 90 0))))
   (image3 (($1 line-upper-leg-3)) nil
    ((size <- (size $1))
     (location <- (location $1))
     (proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1)))
    ((side <- left) (a3d <- (list 0 0 (neg (a2d $1))))) ; front
    ((side <- left) (a3d <- (list (a2d $1) -90 0)))     ; side
    ((side <- right) (a3d <- (list 0 180 (a2d $1))))    ; back
    ((side <- right) (a3d <- (list (a2d $1) -90 0))))   ; side
   (image4 (($1 line-upper-leg-4)) nil
    ((size <- (size $1))
     (location <- (location $1))
     (proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1)))
    ((side <- left) (a3d <- (list 0 180 (a2d $1))))
    ((side <- left) (a3d <- (list (neg (a2d $1)) 90 0)))
    ((side <- right) (a3d <- (list 0 0 (neg (a2d $1)))))
    ((side <- right) (a3d <- (list (neg (a2d $1)) 90 0))))
   (image5 (($1 line-upper-leg-5)) nil
    ((size <- (size $1))
     (location <- (location $1))
     (proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1)))
    ((side <- left) (a3d <- (list 0 0 (neg (a2d $1))))) ; front
    ((side <- left) (a3d <- (list (a2d $1) -90 0)))     ; side
    ((side <- right) (a3d <- (list 0 180 (a2d $1))))    ; back
    ((side <- right) (a3d <- (list (a2d $1) -90 0))))   ; side
   (image6 (($1 line-upper-leg-6)) nil
    ((size <- (size $1))
     (location <- (location $1))
     (proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1)))
    ((side <- left) (a3d <- (list 0 180 (a2d $1))))
    ((side <- left) (a3d <- (list (neg (a2d $1)) 90 0)))
    ((side <- right) (a3d <- (list 0 0 (neg (a2d $1)))))
    ((side <- right) (a3d <- (list (neg (a2d $1)) 90 0))))))

(install
 '(hips nil
   (image1 (($1 line-hips-1)) nil
    ((bottom <- (bottom $1))
     (top <- (top $1))
     (location <- (location $1))
     (size <- (size $1)))
    ((a3d <- (list 0 0 (neg (a2d $1))))) ; front
    ((a3d <- (list 0 180 (a2d $1)))))    ; back
   (image2 (($1 line-hips-2)) nil
    ((a3d <- (list (neg (a2d $1)) 90 0))
     (top <- (top $1))
     (bottom <- (bottom $1))
     (location <- (location $1))
     (size <- (size $1))))
   (image3 (($1 line-hips-3)) nil
    ((a3d <- (list (a2d $1) -90 0))
     (top <- (top $1))
     (bottom <- (bottom $1))
     (location <- (location $1))
     (size <- (size $1))))))

(install
 '(hand ((side left right))
   (image1 (($1 line-hand-1)) nil
    ((posture <- open)
     (location <- (location $1))
     (proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1))
     (size <- (size $1)))
    ((side <- left) (a3d <- (list 0 0 (neg (a2d $1)))))
    ((side <- right) (a3d <- (list 0 180 (a2d $1)))))
   (image2 (($1 line-hand-2)) nil
    ((posture <- open)
     (location <- (location $1))
     (proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1))
     (size <- (size $1)))
    ((side <- right) (a3d <- (list 0 0 (neg (a2d $1)))))
    ((side <- left) (a3d <- (list 0 180 (a2d $1)))))
   (image3 (($1 line-hand-3)) nil
    ((a3d <- (list (neg (a2d $1)) 90 0))
     (posture <- open)
     (proximal-end <- (proximal-end $1))
     (location <- (location $1))
     (distal-end <- (distal-end $1))
     (size <- (size $1)))
    ((side <- left))
    ((side <- right)))
   (image4 (($1 line-hand-4)) nil
    ((a3d <- (list (a2d $1) -90 0))
     (posture <- open)
     (proximal-end <- (proximal-end $1))
     (location <- (location $1))
     (distal-end <- (distal-end $1))
     (size <- (size $1)))
    ((side <- left))
    ((side <- right)))
   (image5 (($1 line-hand-5)) nil
    ((posture <- closed)
     (location <- (location $1))
     (proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1))
     (size <- (times 1.75 (size $1))))
    ((a3d <- (list 0 0 (neg (a2d $1)))))
    ((a3d <- (list 0 180 (a2d $1))))
    ((a3d <- (list (a2d $1) -90 0)))
    ((a3d <- (list (neg (a2d $1)) 90 0))))))

(install
 '(lower-arm ((side left right))
   (image1 (($1 line-lower-arm-1)) nil
    ((proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1))
     (location <- (location $1))
     (size <- (size $1)))
    ((side <- left) (a3d <- (list 0 0 (neg (a2d $1)))))
    ((side <- left) (a3d <- (list (a2d $1) -90
                                0)))
    ((side <- right) (a3d <- (list 0 180 (a2d $1))))
    ((side <- right) (a3d <- (list (a2d $1) -90 0))))
    ; the four views of image1, in order:
    ; thumb side, palm side, finger side, knuckle side
   (image2 (($1 line-lower-arm-2)) nil
    ((proximal-end <- (proximal-end $1))
     (location <- (location $1))
     (distal-end <- (distal-end $1))
     (size <- (size $1)))
    ((side <- left) (a3d <- (list 0 180 (a2d $1))))        ; palm side
    ((side <- left) (a3d <- (list (neg (a2d $1)) 90 0)))   ; finger side
    ((side <- right) (a3d <- (list 0 0 (neg (a2d $1)))))   ; knuckle side
    ((side <- right) (a3d <- (list (neg (a2d $1)) 90 0)))) ; thumb side
   (image3 (($1 line-lower-arm-3)) nil
    ((proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1))
     (location <- (location $1))
     (size <- (times 1.15 (size $1))))
    ((side <- left) (a3d <- (list (neg (a2d $1)) -90 0)))  ; palm side
    ((side <- left) (a3d <- (list (a2d $1) 90 0)))         ; knuckle side
    ((side <- right) (a3d <- (list (neg (a2d $1)) 90 0)))  ; palm side
    ((side <- right) (a3d <- (list (a2d $1) -90 0))))))    ; knuckle side

(install
 '(upper-arm ((side left right))
   (image1 (($1 line-upper-arm-1)) nil
    ((proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1))
     (location <- (location $1))
     (size <- (size $1)))
    ; bulge triceps seen from outside
    ((side <- left) (a3d <- (list (neg (a2d $1)) 90 0)))
    ; bulge outside seen from front
    ((side <- left) (a3d <- (list 0 0 (neg (a2d $1)))))
    ; bulge biceps seen from inside
    ((side <- left) (a3d <- (list (a2d $1) -90 0)))
    ; bulge triceps seen from inside
    ((side <- right) (a3d <- (list (neg (a2d $1)) 90 0)))
    ; bulge outside seen from back
    ((side <- right) (a3d <- (list 0 180 (a2d $1))))
    ; bulge biceps seen from outside
    ((side <- right) (a3d <- (list (a2d $1) -90 0))))
   (image2 (($1 line-upper-arm-2)) nil
    ((proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1))
     (location <- (location $1))
     (size <- (size $1)))
    ; bulge triceps seen from inside
    ((side <- left) (a3d <- (list (a2d $1) -90 0)))
    ; bulge outside seen from back
    ((side <- left) (a3d <- (list 0 180 (a2d $1))))
    ; bulge biceps seen from outside
    ((side <- left) (a3d <- (list (neg (a2d $1)) 90 0)))
    ; bulge triceps seen from outside
    ((side <- right) (a3d <- (list (a2d $1) -90 0)))
    ; bulge outside seen from front
    ((side <- right) (a3d <- (list 0 0 (neg (a2d $1)))))
    ; bulge biceps seen from inside
    ((side <- right) (a3d <- (list (neg (a2d $1)) 90 0))))
   (image3 (($1 line-upper-arm-3)) nil
    ((proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1))
     (location <- (location $1))
     (size <- (size $1)))
    ; bulge triceps seen from outside
    ((side <- left) (a3d <- (list (neg (a2d $1)) 90 0)))
    ; bulge outside seen from front
    ((side <- left) (a3d <- (list 0 0 (neg (a2d $1)))))
    ; bulge biceps seen from inside
    ((side <- left) (a3d <- (list (a2d $1) -90 0)))
    ; bulge triceps seen from inside
    ((side <- right) (a3d <- (list (neg (a2d $1)) 90 0)))
    ; bulge outside seen from back
    ((side <- right) (a3d <- (list 0 180 (a2d $1))))
    ; bulge biceps seen from outside
    ((side <- right) (a3d <- (list (a2d $1) -90 0))))
   (image4 (($1 line-upper-arm-4)) nil
    ((proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1))
     (location <- (location $1))
     (size <- (size $1)))
    ; bulge triceps seen from inside
    ((side <- left) (a3d <- (list (a2d $1) -90 0)))
    ; bulge outside seen from back
    ((side <- left) (a3d <- (list 0 180 (a2d $1))))
    ; bulge biceps seen from outside
    ((side <- left) (a3d <- (list (neg (a2d $1)) 90 0)))
    ; bulge triceps seen from outside
    ((side <- right) (a3d <- (list (a2d $1) -90 0)))
    ; bulge outside seen from front
    ((side <- right) (a3d <- (list 0 0 (neg (a2d $1)))))
    ; bulge biceps seen from inside
    ((side <- right) (a3d <- (list (neg (a2d $1)) 90 0))))
   (image5 (($1 line-upper-arm-5)) nil
    ((proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1))
     (location <- (location $1))
     (size <- (size $1)))
    ; bulge triceps seen from outside
    ((side <- left) (a3d <- (list (neg (a2d $1)) 90 0)))
    ; bulge outside seen from front
    ((side <- left) (a3d <- (list 0 0 (neg (a2d $1)))))
    ; bulge biceps seen from inside
    ((side <- left) (a3d <- (list (a2d $1) -90 0)))
    ; bulge triceps seen from inside
    ((side <- right) (a3d <- (list (neg (a2d $1)) 90 0)))
    ; bulge outside seen from back
    ((side <- right) (a3d <- (list 0 180 (a2d $1))))
    ; bulge biceps seen from outside
    ((side <- right) (a3d <- (list (a2d $1) -90 0))))
   (image6 (($1 line-upper-arm-6)) nil
    ((proximal-end <- (proximal-end $1))
     (distal-end <- (distal-end $1))
     (location <- (location $1))
     (size <- (size $1)))
    ; bulge triceps seen from inside
    ((side <- left) (a3d <- (list (a2d $1) -90 0)))
    ; bulge outside seen from back
    ((side <- left) (a3d <- (list 0 180 (a2d $1))))
    ; bulge biceps seen from outside
    ((side <- left) (a3d <- (list (neg (a2d $1)) 90 0)))
    ; bulge triceps seen from outside
    ((side <- right) (a3d <- (list (a2d $1) -90 0)))
    ; bulge outside seen from front
    ((side <- right) (a3d <- (list 0 0 (neg (a2d $1)))))
    ; bulge biceps seen from inside
    ((side <- right) (a3d <- (list (neg (a2d $1)) 90 0))))))

(install
 '(trunk nil
   (image1 (($1 line-trunk-1)) nil
    ((size <- (times .84 (size $1)))
     (a3d <- (list (neg (a2d $1)) 90 0))
     (left-side <- (top $1))
     (top <- (top $1))
     (location <- (location $1))
     (bottom <- (bottom $1))
     (right-side <- (top $1))))
   (image2 (($1 line-trunk-2)) nil
    ((size <- (times .84 (size $1)))
     (a3d <- (list (a2d $1) -90 0))
     (left-side <- (top $1))
     (top <- (top $1))
     (bottom <- (bottom $1))
     (location <- (location $1))
     (right-side <- (top $1))))
   (image3 (($1 line-trunk-3)) nil
    ((size <- (times .86 (size $1)))
     (a3d <- (list (neg (a2d $1)) 90 0))
     (left-side <- (top $1))
     (top <- (top $1))
     (bottom <- (bottom $1))
     (location <- (location $1))
     (right-side <- (top $1))))
   (image4 (($1 line-trunk-4)) nil
    ((size <- (times .86 (size $1)))
     (a3d <- (list (a2d $1) -90 0))
     (left-side <- (top $1))
     (top <- (top $1))
     (bottom <- (bottom $1))
     (location <- (location $1))
     (right-side <- (top $1))))
   (image5 (($1 line-trunk-5)) nil
    ((size <- (size $1))
     (left-side <- (left-side $1))
     (top <- (top $1))
     (bottom <- (bottom $1))
     (location <- (location $1))
     (right-side <- (right-side $1)))
    ((a3d <- (list 0 0 (neg (a2d $1)))))
    ((a3d <- (list 0 180 (a2d $1)))))))

(install
 '(neck nil
   (image (($1 line-neck)) nil
    ((size <- (size $1))
     (top <- (top $1))
     (location <- (location $1))
     (bottom <- (bottom $1)))
    ((a3d <- (list 0 0 (neg (a2d $1)))))
    ((a3d <- (list 0 180 (a2d $1))))
    ((a3d <- (list (a2d $1) -90 0)))
    ((a3d <- (list (neg (a2d $1)) 90 0))))))

(install
 '(shoulder nil
   (image (($1 line-shoulder)) nil
    ((size <- (size $1))))))

(install
 '(head nil
   (image1 (($1 line-head-1)) nil
    ((a3d <- (list 0 0 (neg (a2d $1))))
     (size <- (size $1))
     (location <- (location $1))
     (bottom <- (bottom $1))))
   (image2 (($1 line-head-2)) nil
    ((a3d <- (list (neg (a2d $1)) 90 0))
     (size <- (times .72 (size $1)))
     (location <- (location $1))
     (bottom <- (bottom $1))))
   (image3 (($1 line-head-3)) nil
    ((a3d <- (list (a2d $1) -90 0))
     (size <- (times .72 (size $1)))
     (location <- (location $1))
     (bottom <- (bottom $1))))
   (image4 (($1 line-head-4)) nil
    ((a3d <- (list 0 180 (a2d $1)))
     (size <- (times .68 (size $1)))
     (location <- (location $1))
     (bottom <- (midpoint $2))))))

;***********************************************************
; composed line-based scene objects
;***********************************************************

(install
 '(left-leg nil
   (component
    (($1 foot (side left))
     ($2 lower-leg (side left))
     ($3 upper-leg (side left)))
    (($4 near ($1 $2 proximal-end distal-end)
         (angle-x (-50 20)) (angle-y (0 0)) (angle-z (-35 20))
         (ratio 55))
     ($5 near ($2 $3 proximal-end distal-end)
         (angle-x (-145 10)) (angle-y (0 0)) (angle-z (0 0))
         (ratio 72)))
    ((a3d <- (list (car (a3d $3)) (cadr (a3d $1)) (caddr (a3d $3))))
     (proximal-end <- (proximal-end $3))
     (size <- (times 2.5 (size $2)))
     (knee-location <- (distal-end $3))
     (knee-posture <- (diff (car (a3d $3)) (car (a3d $2))))
     (location <- (location $5))
     (foot-base <- (list (car (a3d $1)) (caddr (a3d $1))))
     (foot-posture <- (diff (car (a3d $2)) (car (a3d $1))))))))

(install
 '(right-leg nil
   (component
    (($1 foot (side right))
     ($2 lower-leg (side right))
     ($3 upper-leg (side right)))
    (($4 near ($1 $2 proximal-end distal-end)
         (angle-x (-50 20)) (angle-y (0 0)) (angle-z (-20 35))
         (ratio 55))
     ($5 near ($2 $3 proximal-end distal-end)
         (angle-x (-145 10)) (angle-y (0 0)) (angle-z (0 0))
         (ratio 72)))
    ((a3d <- (list (car (a3d $3)) (cadr (a3d $1)) (caddr (a3d $3))))
     (proximal-end <- (proximal-end $3))
     (size <- (times 2.5 (size $2)))
     (knee-location <- (distal-end $3))
     (knee-posture <- (diff (car (a3d $3)) (car (a3d $2))))
     (location <- (location $5))
     (foot-base <- (list (car (a3d $1)) (cadr (a3d $1))))
     (foot-posture <- (diff (car (a3d $2)) (car (a3d $1))))))))

(install
 '(lower-body nil
   (component
    (($1 right-leg)
     ($2 left-leg)
     ($3 hips))
    (($4 near ($1 $3 proximal-end bottom)
         (angle-x (-30 120)) (angle-y (-90 90)) (angle-z (-30 60))
         (ratio 340))
     ($5 near ($2 $3 proximal-end bottom)
         (angle-x (-30 120)) (angle-y (-90 90)) (angle-z (-30 60))
         (ratio 340)))
    ((top <- (top $3))
     (a3d <- (a3d $3))
     (location <- (location $3))
     (size <- (plus (size $1) (size $3)))))))

(install
 '(left-arm ((side left right))
   (component
    (($1 hand (side left))
     ($2 lower-arm (side left))
     ($3 upper-arm (side left)))
    (($4 near ($1 $2 proximal-end distal-end)
         (angle-x (-30 20)) (angle-y (0 0)) (angle-z (-90 90))
         (ratio 43))
     ($5 near ($2 $3 proximal-end distal-end)
         (angle-x (-150 150)) (angle-y (-90 90)) (angle-z (-150 150))
         (ratio 79)))
    ((a3d <- (a3d $3))
     (proximal-end <- (proximal-end $3))
     (size <- (times 2.1 (size $3)))
     (location <- (location $5))
     (elbow-location <- (distal-end $3))
     (elbow-posture <- (diff (caddr (a3d $3)) (caddr (a3d $2))))))))

(install
 '(right-arm ((side left right))
   (component
    (($1 hand (side right))
     ($2 lower-arm (side right))
     ($3 upper-arm (side right)))
    (($4 near ($1 $2 proximal-end distal-end)
         (angle-x (-30 20)) (angle-y (0 0)) (angle-z (-90 90))
         (ratio 43))
     ($5 near ($2 $3 proximal-end distal-end)
         (angle-x (-150 150)) (angle-y (-90 90)) (angle-z (-150 150))
         (ratio 79)))
    ((a3d <- (a3d $3))
     (proximal-end <- (proximal-end $3))
     (size <- (times 2.1 (s
i z e $3))) ( l o c a t i o n <- ( l o c a t i o n $5)) ( e l b o w - l o c a t i o n <- ( d i s t a l - e n d $3)) (elbow-posture <- ( d i f f (caddr (a3d $3)) (caddr (a3d $2) ))) )) )) ( i n s t a l l '(upper-body n i l (component (($1 trunk) ($2 l e f t - a r m ) ($3 right-arm) ($4 neck)) (($5 near ($2 $1 proximal-end l e f t - s i d e ) (angle-x (-60 180)) (angle-y (-90 90)) (angle-z (-180 75)) ( r a t i o 196)) ($6 near ($3 $1 proximal-end r i g h t - s i d e ) (angle-x (-60 180)) (angle-y (-90 90)) (angle-z (-75 180)) ( r a t i o 196)) ($7 near ($4 $1 bottom top) (angle-x (0 0)) (angle-y (0 0)) (angle-z (0 0)) ( r a t i o 15))) ((a3d <- (a3d $1) ) ( s i z e <- (plus ( s i z e $1) ( s i z e $2))) ( l o c a t i o n <- ( l o c a t i o n $1)) (top <- ( l o c a t i o n $4)) (bottom <- (bottom $1)) )) )) ( i n s t a l l 234 '(body n i l (component (($1 lower-body) ($2 upper-body) ($3 head)) (($4 near ($2 $1 bottom top) (angle-x (-90 45)) (angle-y (0 0)) (angle z (-45 45)) ( r a t i o 120)) ($5 near ($3 $2 bottom top) (angle-x (-60 60)) (angle-y (-90 90)) (angle-z (-45 45)) ( r a t i o 15))) ((a3d <- (a3d $2)) ( l o c a t i o n <- ( l o c a t i o n $5)) )) )) ******************************************* blob-based scene o b j e c t s *********************************************************** ( i n s t a l l '(extremi ty n i l (image 1 (($1 blob ( r a t i o (200 300)))) n i l ; open ((ends <- ( l i s t ( p t 1 l $1) (pt21 $1))) ( l o c a t i o n <- (cofg $1)) ( s i z e <- (lengthb $1)) ( d e l t a <- (times .25 (lengthb $1))) )) . 
   (image2 (($1 blob (ratio (100 190)))) nil    ; closed
     ((ends <- (list (cofg $1)))
      (location <- (cofg $1))
      (size <- (times 1.5 (lengthb $1)))
      (delta <- (times .5 (lengthb $1)))))
   (specialization (($1 foot) ($2 hand)) nil nil)))

(install
 '(lower-limb nil
   (image (($1 blob (ratio (225 500)))) nil
     ((ends <- (list (pt1l $1) (pt21 $1)))
      (location <- (cofg $1))
      (size <- (lengthb $1))
      (delta <- (times .2 (lengthb $1)))))
   (specialization (($1 lower-leg) ($2 lower-arm)) nil nil)))

(install
 '(upper-limb nil
   (image (($1 blob (ratio (310 550)))) nil
     ((ends <- (list (pt1l $1) (pt21 $1)))
      (location <- (cofg $1))
      (size <- (lengthb $1))
      (delta <- (times .2 (lengthb $1)))))
   (specialization (($1 upper-leg) ($2 upper-arm)) nil nil)))

(install
 '(blob-neck nil
   (image (($1 blob (ratio (100 160)))) nil
     ((location <- (cofg $1))))
   (specialization (($1 neck)) nil nil)))

(install
 '(blob-shoulder nil
   (image (($1 blob (ratio (100 140)))) nil
     ((location <- (cofg $1))))
   (specialization (($1 shoulder)) nil nil)))

(install
 '(head-part nil
   (image (($1 blob (ratio (120 165)))) nil
     ((ends <- (list (cofg $1)))
      (location <- (cofg $1))
      (size <- (lengthb $1))
      (delta <- (times .5 (lengthb $1)))))
   (specialization (($1 head)) nil nil)
   ))

(install
 '(central-body nil
   (image1 (($1 blob (ratio (200 350)))) nil
     ((ends <- (list (pt1l $1) (pt21 $1)))
      (location <- (cofg $1))
      (size <- (lengthb $1))
      (delta <- (times .35 (lengthb $1)))))
   (image2 (($1 blob (ratio (100 225)))) nil
     ((ends <- (list (cofg $1)))
      (location <- (cofg $1))
      (size <- (lengthb $1))
      (delta <- (times .5 (lengthb $1)))
      ))
   (specialization (($1 trunk) ($2 hips)) nil nil)
   ))

***********************************************************
constructed blob-based scene objects
***********************************************************

(install
 '(limb nil
   (component
     (($1 extremity)
      ($2 lower-limb)
      ($3 upper-limb))
     (($4 b-connect ($1 $2 nil nil) (ratio (25 60)))
      ($5 b-connect ($2 $3 nil nil) (ratio (60 80))))
     ((proximal-end <- (free2 $5))
      (distal-end <- (free1 $4))
      (location <- (location $5))
      (delta <- (delta $3))
      (size <- (times 2.3 (size $2)))))
   (specialization (($1 right-arm) ($2 right-leg) ($3 left-arm) ($4 left-leg)) nil nil)
   ))

(install
 '(body-half nil
   (component
     (($1 limb)
      ($2 limb)
      ($3 central-body))
     (($4 b-connect ($1 $3 proximal-end nil) (ratio (150 450)))
      ($5 b-connect ($2 $3 proximal-end nil) (ratio (150 450))))
     ((head-end <- (midpoint (proximal-end $1) (proximal-end $2)))
      (center-end <- (free2 $4))
      (location <- (location $3))
      (delta <- (delta $3))
      (size <- (plus (size $1) (size $3)))))
   (specialization (($1 upper-body) ($2 lower-body)) nil nil)
   ))

(install
 '(rough-body nil
   (component
     (($1 head-part)
      ($2 body-half)
      ($3 body-half))
     (($4 b-connect ($1 $2 nil head-end) (ratio (40 80)))
      ($5 b-connect ($3 $2 center-end center-end) (ratio (80 160))))
     ((location <- (location $5))
      (size <- (times 2.3 (size $2)))))
   (specialization (($1 body)) nil nil)
   ))

Appendix C

PROLOG body definitions

/* The definitions of objects as theorems */

line_foot_3(*tag) <-
    line(*1) & match_line(*1,0)
  & line(*2) & NE(*1,*2) & match_line(*2,0)
  & line(*3) & NE(*2,*3) & match_line(*3,0)
  & connect(*4,*1,*2) & match_connect(*4,124,180)
  & connect(*5,*2,*3) & match_connect(*5,90,67)
  & connect(*6,*3,*1) & match_connect(*6,146,83)
  & gen(line_foot_3_,*tag)
  & ADDAX(line_foot_3(*tag),1)
  & ADDAX(components(*tag,*1.*2.*3.NIL))
  & set1(orient_2d,*tag,slope,location,*5,location,*4)
  & set2(location,*tag,midpoint,location,*5,midpoint,*1)
  & set2(proximal_end,*tag,location,*6)
  & set1(size,*tag,length,*1).

line_lower_leg_1(*tag) <-
    line(*1) & match_line(*1,0)
  & line(*2) & NE(*1,*2) & match_line(*2,0)
  & line(*3) & NE(*2,*3) & match_line(*3,21)
  & connect(*4,*1,*2) & match_connect(*4,120,362)
  & connect(*5,*2,*3) & match_connect(*5,75,31)
  & connect(*6,*3,*1) & match_connect(*6,165,89)
  & gen(line_lower_leg_1_,*tag)
  & ADDAX(line_lower_leg_1(*tag),1)
  & ADDAX(components(*tag,*1.*2.*3.NIL))
  & location(*6,*t1,*t2)
  & location(*4,*t3,*t4)
  & slope(*t1,*t2,*t3,*t4,*t5)
  & DIFF(*t5,90,*t6)
  & ADDAX(orient_2d(*tag,*t6))
  & set2(distal_end,*tag,location,*6)
  & set2(location,*tag,midpoint,location,*6,midpoint,*2)
  & set2(proximal_end,*tag,location,*4)
  & set1(size,*tag,length,*1).
line_upper_leg_1(*tag) <-
    line(*1) & match_line(*1,53)
  & line(*2) & NE(*1,*2) & match_line(*2,0)
  & line(*3) & NE(*2,*3) & match_line(*3,0)
  & connect(*4,*1,*2) & match_connect(*4,166,119)
  & connect(*5,*2,*3) & match_connect(*5,60,313)
  & connect(*6,*3,*1) & match_connect(*6,134,27)
  & gen(line_upper_leg_1_,*tag)
  & ADDAX(line_upper_leg_1(*tag),1)
  & ADDAX(components(*tag,*1.*2.*3.NIL))
  & location(*4,*t1,*t2)
  & location(*5,*t3,*t4)
  & slope(*t1,*t2,*t3,*t4,*t5)
  & DIFF(*t5,90,*t6)
  & ADDAX(orient_2d(*tag,*t6))
  & set2(distal_end,*tag,location,*4)
  & set2(location,*tag,midpoint,location,*4,midpoint,*3)
  & set2(proximal_end,*tag,location,*6)
  & set1(size,*tag,length,*1).

foot(*tag,*side) <-
    line_foot_3(*1)
  & gen(foot,*tag)
  & set2(proximal_end,*tag,proximal_end,*1)
  & set2(location,*tag,location,*1)
  & size(*1,*tx1)
  & PROD(*tx1,16,*tx2)
  & QUOT(*tx2,10,*tx3)
  & ADDAX(size(*tag,*tx3))
  & orient_2d(*1,*orient)
  & DIFF(0,*orient,*negorient)
  & ADDAX(orient_3d(*tag,left,0,0,*negorient))
  & ADDAX(orient_3d(*tag,right,0,180,*orient))
  & ADDAX(components(*tag,*1.NIL))
  & ADDAX(foot(*tag,*side),1).
lower_leg(*tag,*side) <-
    line_lower_leg_1(*1)
  & gen(lower_leg,*tag)
  & set2(proximal_end,*tag,proximal_end,*1)
  & set2(location,*tag,location,*1)
  & set1(size,*tag,size,*1)
  & set2(distal_end,*tag,distal_end,*1)
  & orient_2d(*1,*orient)
  & DIFF(0,*orient,*negorient)
  & ADDAX(orient_3d(*tag,left,0,0,*negorient))
  & ADDAX(orient_3d(*tag,left,*negorient,90,0))
  & ADDAX(orient_3d(*tag,right,0,180,*orient))
  & ADDAX(orient_3d(*tag,right,*orient,90,0))
  & ADDAX(components(*tag,*1.NIL))
  & ADDAX(lower_leg(*tag,*side),1).

upper_leg(*tag,*side) <-
    line_upper_leg_1(*1)
  & gen(upper_leg,*tag)
  & set2(proximal_end,*tag,proximal_end,*1)
  & set2(location,*tag,location,*1)
  & set1(size,*tag,size,*1)
  & set2(distal_end,*tag,distal_end,*1)
  & orient_2d(*1,*orient)
  & DIFF(0,*orient,*negorient)
  & ADDAX(orient_3d(*tag,left,0,0,*negorient))
  & ADDAX(orient_3d(*tag,left,*orient,'-90',0))
  & ADDAX(orient_3d(*tag,right,0,180,*orient))
  & ADDAX(orient_3d(*tag,right,*orient,'-90',0))
  & ADDAX(components(*tag,*1.NIL))
  & ADDAX(upper_leg(*tag,*side),1).
leg(*tag,*side) <-
    foot(*1,*side)
  & lower_leg(*2,*side)
  & upper_leg(*3,*side)
  & near(*4,*1,*2,proximal_end,distal_end)
  & match_ratio(*4,55)
  & orient_3d(*1,*side,*x1,*y1,*z1)
  & orient_3d(*2,*side,*x2,*y2,*z2)
  & match_angle(*x2,*x1,20,'-50',*x4)
  & match_angle(*y2,*y1,0,0,*y4)
  & match_angle(*z2,*z1,35,'-35',*z4)
  & near(*5,*2,*3,proximal_end,distal_end)
  & match_ratio(*5,72)
  & orient_3d(*3,*side,*x3,*y3,*z3)
  & match_angle(*x3,*x2,10,'-145',*x5)
  & match_angle(*y3,*y2,0,0,*y5)
  & match_angle(*z3,*z2,0,0,*z5)
  & gen(leg,*tag)
  & ADDAX(leg(*tag,*side),1)
  & ADDAX(components(*tag,*1.*2.*3.NIL)).

/* relation which will be used to find all the connections
   in a line file and record their attribute values.
   It should be entered as the
       <- connect_each(*line1,*line2) & FAIL.
   the relation will make all connections between lines
   (only connections with positive deflection are
   considered (ie. >-2 degrees)). For each connection found,
   the axioms:
       connect(*tag,*line1,*line2)
       values(*tag,*deflection,*ratio)
       location(*tag,*x,*y)
   are added to the data-base. Thus, each connection is
   marked by a unique tag which can be used to retrieve its
   attribute values. */

connect_each(*line1,*line2) <-
    point(*point1,*line1,*x,*y)
  & point(*point2,*line2,*x,*y)
  & NE(*line1,*line2)
  /* find the deflection between line1 and line2.
     each line has two slopes, one going in the direction
     away from each point, they are measured in degrees,
     0 to 360. right hand deflection is positive so the
     formula is:
         deflection := slope(line1) - slope(line2);
         deflection := deflection +
             (if deflection < 0 then -180 else 180) */
  & slopef(*point1,*line1,*s1)
  & slopef(*point2,*line2,*s2)
  & DIFF(*s1,*s2,*s)
  & IF(GE(*s1,*s2),
       then, DIFF(*s,180,*deflection),
       else, SUM(*s,180,*deflection))
  & GT(*deflection,'-2')
  /* find the relation between the lengths of the lines */
  & length(*line1,*length1)
  & length(*line2,*length2)
  & PROD(*length1,100,*ltemp)
  & QUOT(*ltemp,*length2,*ratio)
  & setcon(*line1,*line2,*deflection,*ratio,*x,*y).

setcon(*line1,*line2,*deflection,*ratio,*x,*y) <-
    ¬connect(*any,*line1,*line2)
  & gen(connection,*tag)
  & ADDAX(values(*tag,*deflection,*ratio))
  & ADDAX(location(*tag,*x,*y))
  & ADDAX(connect(*tag,*line1,*line2),1)
  & /.

/* for the connection given by *tag, see if the values are
   within a certain range of the specified deflection and
   ratio */
match_connect(*tag,*deflection,*ratio) <-
    values(*tag,*dfound,*rfound)
  & within(*dfound,*deflection,3)

/* decomposes the component hierarchy for a given tagged
   object */
all_components(*tag) <-
    WRITE(*tag)
  & components(*tag,*list)
  & decompose(*list,2)
  & /.

/* decompose is iterative on the list of components it was
   passed, and is recursive on the lists found as values in
   the component relation for each list element */
decompose(*first.*rest,*level) <-
    ATOM(*first)
  & SUM(*level,2,*cur)
  & output(*first,*cur)
  & IF(components(*first,*nextlist),
       then, decompose(*nextlist,*cur))
  & IF(SKEL(*rest),
       then, decompose(*rest,*level)).

/* output will give the value of its first argument printed
   out *spacing spaces over */
output(*value,*spacing) <-
    space(*spacing)
  & WRITECH(*value)
  & NEWLINE.

space(*n) <-
    IF(GT(*n,0), then, WRITECH(' ') & DIFF(*n,1,*n1)

/* gensyms the id given with its next value so gen(line,*)
   gives line1 or whatever is next */
gen(*word,*result) <-
    increment(*word,*next)
  & concatenate(*word,*next,*result)
  & /.

increment(*id,*value) <-
    increment1(*id,*value)
  & ADDAX(value(*id,*value))
  & /.

increment1(*id,*value) <-
    value(*id,*v)
  & DELAX(value(*id,*v))
  & SUM(*v,1,*value)
  & /.

increment1(*id,*value) <- BIND(1,*value).

concatenate(*id1,*id2,*id1id2) <-
    STRING(*id1,*id1list)
  & STRING(*id2,*id2list)
  & combine(*id1list,*id2list,*id1id2list)
  & STRING(*id1id2,*id1id2list).

combine(*x.NIL,*y,*x.*y).
combine(*x.*rest,*y,*z) <-
    combine(*rest,*y,*u)

/* line predicates are like this:
       point(1,line1,100,140).
       point(2,line1,120,165).
       length(line1,42).
       curve(line1,0). */

/* utility routine: succeeds if value1 within *d of value2 */
within(*value1,*value2,*d) <-
    DIFF(*value2,*d,*t_low)
  & GE(*value1,*t_low)
  & SUM(*value2,*d,*t_high)
  & LE(*value1,*t_high)
  & /.
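
The arithmetic used by connect_each and by the tolerance test within can be restated outside PROLOG. The following is a minimal Python sketch, not part of the original system; the function names deflection, length_ratio and match_deflection are hypothetical, and lengths are assumed to be positive integers so that integer division matches QUOT. The deflection computation follows the IF goal in the clause body (subtract 180 when the slope difference is non-negative, add 180 otherwise), which keeps the result between -180 and 180. Only the deflection test of match_connect is sketched, because the ratio test is not legible in the listing.

```python
def deflection(slope1, slope2):
    """Signed deflection between two slopes given in degrees (0 to 360)."""
    s = slope1 - slope2
    # Follows IF(GE(*s1,*s2), then, DIFF(*s,180,..), else, SUM(*s,180,..)).
    return s - 180 if slope1 >= slope2 else s + 180


def length_ratio(length1, length2):
    """Length of line 1 as an integer percentage of line 2 (PROD, then QUOT)."""
    return (length1 * 100) // length2


def within(value1, value2, d):
    """True when value1 lies in the closed interval [value2 - d, value2 + d]."""
    return value2 - d <= value1 <= value2 + d


def match_deflection(found, expected, tolerance=3):
    """Deflection half of match_connect; 3 is the tolerance in the listing."""
    return within(found, expected, tolerance)


if __name__ == "__main__":
    d = deflection(270, 30)          # slope difference 240, shifted by -180
    print(d, d > -2)                 # connect_each keeps deflections above -2
    print(length_ratio(42, 60))      # integer percentage of the two lengths
    print(match_deflection(122, 124))
```

A connection would be recorded only when the deflection exceeds -2 degrees, which is the GT test in connect_each.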
/* utility routine: returns the midpoint of line */
midpoint(*line,*x,*y) <-
    point(1,*line,*x1,*y1)
  & point(2,*line,*x2,*y2)
  & aver2(*x1,*x2,*x)
  & aver2(*y1,*y2,*y).

midpoint(*x1,*y1,*x2,*y2,*x,*y) <-
    aver2(*x1,*x2,*x)
  & aver2(*y1,*y2,*y)
  & /.

aver2(*input1,*input2,*result) <-
    SUM(*input1,*input2,*temp)
  & QUOT(*temp,2,*result)
  & /.

/* just to confirm the existence */
line(*tag) <- curve(*tag,*1).

/* check to see if the line's curve is within some limit of
   the specified curve */
match_line(*tag,*curve) <-
    curve(*tag,*cfound)
  & within(*cfound,*curve,3)
  & /.

/* returns slope from given point of line */
slopef(*point,*line,*s) <-
    point(*point,*line,*x1,*y1)
  & point(*t,*line,*x2,*y2)
  & NE(*point,*t)

/* this relation tests for the closeness of the two
   specified scene domain objects *mov and *fix. The
   locations are given as the attributes *locmov and
   *locfix. the points must be as close as 1/4 of the size
   of the larger of the two objects */
near(*tag,*mov,*fix,*locmov,*locfix) <-
    size(*mov,*movsize)
  & size(*fix,*fixsize)
  & GO(*locmov.*mov.*xmov.*ymov.NIL)
  & GO(*locfix.*fix.*xfix.*yfix.NIL)
  & MAX(*movsize,*fixsize,*maxsize)
  & PROD(*maxsize,*maxsize,*maxsize2)
  & DIFF(*xmov,*xfix,*xdiff)
  & DIFF(*ymov,*yfix,*ydiff)
  & PROD(*xdiff,*xdiff,*xdiff2)
  & PROD(*ydiff,*ydiff,*ydiff2)
  & SUM(*xdiff2,*ydiff2,*space2)
  & QUOT(*maxsize2,4,*maxsize3)
  & LE(*space2,*maxsize3)
  & gen(near,*tag)
  & PROD(*movsize,100,*movsize1)
  & QUOT(*movsize1,*fixsize,*ratio)
  & ADDAX(ratio(*tag,*ratio)).

match_ratio(*tag,*val) <-
    ratio(*tag,*test)
  & within(*test,*val,15).

match_angle(*test1,*test2,*high,*low,*test) <-
    DIFF(*test1,*test2,*test)
  & LE(*test,*high)
  & GE(*test,*low).

MAX(*1,*2,*res) <-
    IF(GE(*1,*2), then, BIND(*1,*res), else,

/* these are utility routines which are used to generate
   attribute values out of the attribute values of other
   bound variables, the only reason for their existence is to
   make the PROLOG declarations look like the LISP
   declaration. */
set1(*attr,*tag,*fn,*arg) <-
    GO(*fn.*arg.*temp.NIL)
  & PUT(*attr.*tag.*temp.NIL).

set2(*attr,*tag,*fn,*arg) <-
    GO(*fn.*arg.*temp1.*temp2.NIL)
  & PUT(*attr.*tag.*temp1.*temp2.NIL).

set1(*attr,*tag,*fn,*fn1,*arg1,*fn2,*arg2) <-
    GO(*fn1.*arg1.*temp1.*temp2.NIL)
  & GO(*fn2.*arg2.*temp3.*temp4.NIL)
  & GO(*fn.*temp1.*temp2.*temp3.*temp4.*temp5.NIL)
  & PUT(*attr.*tag.*temp5.NIL).
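
The closeness test in near, together with match_ratio and match_angle, can be restated as a short Python sketch. This is an illustration only and not the original code: the names near_ratio and the tuple representation of locations are assumptions, and sizes and coordinates are taken to be integers so that integer division stands in for QUOT. As the clause is written, the comparison is made between squared quantities (squared distance against the squared larger size divided by 4), so the bound on the distance itself works out to half the larger size.

```python
def near_ratio(mov_size, fix_size, mov_xy, fix_xy):
    """Return the size ratio (percent) if the two points are close enough.

    Returns None when the closeness test fails, mirroring failure of near.
    """
    max_size = max(mov_size, fix_size)
    dx = mov_xy[0] - fix_xy[0]
    dy = mov_xy[1] - fix_xy[1]
    # LE(*space2,*maxsize3): squared distance <= (larger size squared) / 4
    if dx * dx + dy * dy > (max_size * max_size) // 4:
        return None
    return (mov_size * 100) // fix_size


def match_ratio(ratio, expected, tolerance=15):
    """within(*test,*val,15): ratio must lie within 15 of the expected value."""
    return expected - tolerance <= ratio <= expected + tolerance


def match_angle(test1, test2, high, low):
    """Return test1 - test2 when it lies in [low, high], otherwise None."""
    diff = test1 - test2
    return diff if low <= diff <= high else None


if __name__ == "__main__":
    r = near_ratio(55, 100, (0, 0), (30, 40))
    print(r, match_ratio(r, 55))
    print(match_angle(-10, 20, 20, -50))
```

In the leg theorem, for example, the foot and lower leg must satisfy near with a ratio close to 55, and the x-angle difference must fall between -50 and 20.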
set2(*attr,*tag,*fn,*fn1,*arg1,*fn2,*arg2) <-
    GO(*fn1.*arg1.*temp1.*temp2.NIL)
  & GO(*fn2.*arg2.*temp3.*temp4.NIL)
  & GO(*fn.*temp1.*temp2.*temp3.*temp4.*temp5.*temp6.NIL)

slope(*x1,*y1,*x2,*y2,*theta) <-
  /* returns the slope *theta between points 1 and 2 as an
     angle from 0 to 358 (by twos) */
    DIFF(*y2,*y1,*t1)
  & DIFF(*x2,*x1,*dx)
  & PROD(*t1,100,*dy)
  /* watch for zero divide which means 90 or 270 */
  & IF(EQ(0,*dx),
       then, slope1(*y1,*y2,90,*theta),
       else, IF(EQ(0,*dy),
                then, slope1(*x1,*x2,0,*theta),
                else, slope2(*dx,*dy,*y1,*y2,*theta))).

slope1(*y1,*y2,*tt,*theta) <-
    IF(LT(*y2,*y1),
       then, SUM(*tt,180,*theta),
       else, BIND(*tt,*theta)).

slope2(*dx,*dy,*y1,*y2,*theta) <-
    QUOT(*dy,*dx,*q)
  /* table out of range means 90 or 270 */
  & IF(outside(*q,'-5729',5728),
       then, slope1(*y1,*y2,90,*theta),
       else, arctan(*from,*to,*tt)
             & GE(*q,*from)
             & LE(*q,*to)
             & slope1(*y1,*y2,*tt,*theta)).

outside(*1,*2,*3) <- LT(*1,*2) | GT(*1,*3).

Appendix D

Interpretation Examples

This appendix consists of four working examples which further demonstrate the operations of the interpretation system. The first example is a complete interpretation of the body form in the rest position, accomplished in a single fixation. The second example is a part body at a much larger scale than used in the other examples. This example demonstrates the relation between the knowledge representation and the interpretation capability. The third and fourth examples use the same body as shown in Chapter Five, but different diameters of fovea and periphery are used.
Example One

Figure D.1 shows the body form in its rest position. In this position all body parts are in three dimensional relations of (0,0,0) to their reference points. The purpose of this example is simply to provide a complete parse tree which represents the interpretation. Only the fine layer is represented in this example because the blob interpretations are not very reliable in instances in which the body parts are close together. One more level of representation, using only the overall shape of the body, would be quite useful in cases such as these.

Figure D.1. The body form in rest position.

body-2
  lower-body-2
    right-leg-1
      foot-1
        line-foot-4-1 (line-43 line-45 line-44)
      lower-leg-1
        line-lower-leg-2-1 (line-40 line-42 line-41)
      upper-leg-1
        line-upper-leg-2-1 (line-33 line-32 line-31)
    left-leg-4
      foot-2
        line-foot-3-1 (line-37 line-38 line-39)
      lower-leg-2
        line-lower-leg-1-1 (line-34 line-35 line-36)
      upper-leg-2
        line-upper-leg-1-1 (line-24 line-22 line-23)
    hips-6
      line-hips-1-6 (line-29 line-30 line-28)
  upper-body-1
    trunk-1
      line-trunk-5-1
        line-torso-1 (line-9 line-5 line-6 line-7 line-8)
        line-shoulder-1 (line-49 line-50 line-51)
        line-shoulder-7 (line-13 line-14 line-15)
    left-arm-1
      hand-1
        line-hand-2-1 (line-57 line-56 line-55)
      lower-arm-1
        line-lower-arm-2-1 (line-52 line-54 line-53)
      upper-arm-1
        line-upper-arm-2-1 (line-46 line-48 line-47)
    right-arm-2
      hand-2
        line-hand-1-1 (line-27 line-25 line-26)
      lower-arm-2
        line-lower-arm-1-1 (line-19 line-20 line-21)
      upper-arm-2
        line-upper-arm-1-1 (line-16 line-17 line-18)
    neck-9
      line-neck-9 (line-12 line-11 line-10)
  head-1
    line-head-1-1 (line-2 line-3 line-4 line-1)

Figure D.2. A complete parse tree for a body form.

Example Two

This example demonstrates that the specification in the declarative schemata controls the legality of interpretations. The example uses an image which consists of only a part body, and at a different scale from the other drawings. Figure D.3 shows the original drawing. In this case, the fovea and periphery are both set to cover the entire image. The feature based operations and the details of the interpretation are not shown, but the result is, first of all, the detection of a "body-half" at the coarse layer. Actually two such "body-half" interpretations are generated. This is because the tolerances at the coarse level are set quite high, and the "hip" was used as an "extremity" to complete another "limb" node.

body-half-2
  limb-2
    extremity-2 (blob-8)
    lower-limb-2 (blob-9)
    upper-limb-2 (blob-7)
  limb-3
    extremity-4 (blob-1)
    lower-limb-4 (blob-2)
    upper-limb-4 (blob-3)
  central-body-5 (blob-5)
body-half-1
  limb-1
    extremity-1 (blob-10)
    lower-limb-2 (blob-9)
    upper-limb-2 (blob-7)
  limb-3
    extremity-4 (blob-1)
    lower-limb-4 (blob-2)
    upper-limb-4 (blob-3)
  central-body-5 (blob-5)

Figure D.3. A half body at large scale.
The initial interpretation based on the fine layer is shown below:

left-arm-1
  hand-2
    line-hand-1-1 (line-20 line-18 line-19)
  lower-arm-2
    line-lower-arm-1-2 (line-15 line-16 line-17)
  upper-arm-2
    line-upper-arm-1-1 (line-12 line-13 line-14)
right-arm-1
  hand-2
    line-hand-1-1 (line-20 line-18 line-19)
  lower-arm-2
    line-lower-arm-1-2 (line-15 line-16 line-17)
  upper-arm-2
    line-upper-arm-1-1 (line-12 line-13 line-14)

The interpretation was not successful. Only one arm was detected, seen as the two possibilities of a "right-arm" or a "left-arm". The arm that was found is the one on the right hand side of the image. The other arm appears legal, but it is not within the definition of the body form, and so has been rejected. In order to demonstrate this, the system was interrupted, and an explicit attempt was made to force the interpretation of a "left-arm" on the component body-parts:

-> (setq bind '(($1 . hand-1) ($2 . lower-arm-1) ($3 . upper-arm-1)))
-> (ELABORATE bind 'left-arm 'component)
attempt to elaborate left-arm from
    (($1 . hand-1) ($2 . lower-arm-1) ($3 . upper-arm-1))
verifying ($1 hand-1 E00005 E00006)
    with ((side left))
found= nil req= (side left)
trying elaboration: E00005 found=left
trying elaboration: E00006 found=right
elaboration deleted
verifying ($4 near-1 X00011 X00012 X00013 X00014)
    with ((angle-x (-30 20)) (angle-y (0 0)) (angle-z (-90 90)) (ratio 43))
found= nil req= (angle-x (-30 20))
trying elaboration: X00011 found=161
elaboration deleted
trying elaboration: X00012 found=312
elaboration deleted
trying elaboration: X00013 found=161
elaboration deleted
trying elaboration: X00014 found=312
elaboration deleted
relation not verified
nil

There are two legal interpretations for the bulged part of the image depiction. The only possible interpretation of the connection between the hand and the lower-arm is that a rotation of more than 180 degrees has taken place. Since this amount of rotation is outside of the body's capabilities, the interpretation as an arm was rejected. The portion of the schema which is responsible for this specification is shown below:

((proximal-end <- (proximal-end $1))
 (distal-end <- (distal-end $1))
 (location <- (location $1))
 (size <- (size $1)))
((side <- left)  (a3d <- (list 0 0 (neg (a2d $1)))))
((side <- left)  (a3d <- (list (a2d $1) -90 0)))
((side <- right) (a3d <- (list 0 180 (a2d $1))))
((side <- right) (a3d <- (list (a2d $1) -90 0))))

This is the attribute development portion of the "image1" schema for the object "lower-arm".
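
As a rough illustration of how these alternative attribute developments behave during verification, the following Python sketch generates the four (side, a3d) elaborations listed in the schema portion for a given image orientation and then filters them against a requirement, in the manner of the "trying elaboration ... elaboration deleted" lines of the trace. The names lower_arm_elaborations, verify and surviving_angles are hypothetical, and the plain range comparison on angles is an assumption; the trace does not show how the system itself compares angle values.

```python
def lower_arm_elaborations(a2d):
    """The four alternative developments of side and a3d from a2d."""
    return [
        {"side": "left", "a3d": (0, 0, -a2d)},
        {"side": "left", "a3d": (a2d, -90, 0)},
        {"side": "right", "a3d": (0, 180, a2d)},
        {"side": "right", "a3d": (a2d, -90, 0)},
    ]


def verify(elaborations, attribute, required):
    """Keep only the elaborations whose attribute equals the required value."""
    return [e for e in elaborations if e[attribute] == required]


def surviving_angles(found_values, low, high):
    """Keep the found angles that lie inside the required range (assumed test)."""
    return [v for v in found_values if low <= v <= high]


if __name__ == "__main__":
    elaborations = lower_arm_elaborations(30)
    print(verify(elaborations, "side", "left"))
    # angle-x values reported in the trace, against the required (-30 20)
    remaining = surviving_angles([161, 312, 161, 312], -30, 20)
    print("relation verified" if remaining else "relation not verified")
```

When no elaboration survives, as with the four angle-x values in the trace, the relation cannot be verified and the attempted "left-arm" is rejected.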
The specification maps attribute values from the image domain into the scene domain. We are interested in the three dimensional orientation of the lower-arm ("a3d") which is based on the two dimensional orientation of the image construction ("a2d"). Just for this example, we will modify the schema in order to allow an extra mapping from the image domain. After running the example again, the complete upper body is recognized as shown below:

upper-body-1
  trunk-2
    line-trunk-1-1 (line-10 line-11)
  left-arm-1
    hand-1
      line-hand-3-1 (line-32 line-33 line-34 line-30 line-31)
    lower-arm-1
      line-lower-arm-1-1 (line-27 line-28 line-29)
    upper-arm-1
      line-upper-arm-2-1 (line-24 line-26 line-25)
  right-arm-2
    hand-2
      line-hand-1-1 (line-20 line-18 line-19)
    lower-arm-2
      line-lower-arm-1-2 (line-15 line-16 line-17)
    upper-arm-2
      line-upper-arm-1-1 (line-12 line-13 line-14)
  neck-1
    line-neck-1 (line-7 line-9 line-8)
t
-> (print-models3 'upper-body-1)
node:upper-body-1
type         upper-body
description  component
bindings     ($7 near-9) ($6 near-8) ($5 near-5)
             ($4 neck-1 E00068 E00069 E00070 E00071)
             ($1 trunk-2)
             ($2 left-arm-1 E00033 E00034)
             ($3 right-arm-2 E00066 E00067)
a3d          (-19 90 0)
size         1161.3
location     (478 . 465)
top          (372 . 704)
bottom       (574 . 253)
uses         (near-12)
-> (print-models3 'extremity-3)
node:extremity-3
type         extremity
description  image2
bindings     (($1 blob-6))
ends         ((384 . 713))
location     (384 . 713)
size         59.34
delta        25.71

This example demonstrates the idea that line drawing images are convention based. The declarative schemata system provides a means of making these conventions explicit, and it allows experimentation with different depiction formats.

Example Three

This example consists of two complete interpretations of an image. The system is operating in the mode in which it selects the processing locations, as described in section 4.6. The same image example was used in Chapter Five. In that example, the foveal and peripheral diameters were set to 125 and 210 respectively (over a total image area of 1024x1024). For these examples, the periphery is larger. The effect of the propagation of fine level interpretation into the periphery is far greater in this case. For the two examples which follow, a complete inferred correspondence is obtained for the "rough-body" by the third or fourth fixation.

As indicated in the conclusions of Chapter Seven, there are a number of possible criteria for selection within the image. Two of these influences are demonstrated here: size of feature collection areas, and the starting location. Quite different results are obtained in the two different examples.

The examples include a minimum of annotation. The printout from the execution provides comments about the progress of inferring correspondences and the development of the parse trees. The system indicates the choice of new locations and provides the reason for the choice, either foveal or peripheral (see section 4.6).
    -> (setq $radius2$ 325)
    325
    -> (see 320 704)
    after feature collection at 320 704 (125/325)
    before grouping 472 models at line level
    after grouping 216 models at line level
    after 2-level 164 models at line level
    inferred right-arm for limb-2
        from component upper-limb-3 as corresponding upper-arm
    inferred upper-body for body-half-1
        from component limb-2 as inferred arm
    next location selected as (464 . 780) foveal

    body-half-1
      limb-1
        extremity-1 (blob-13)
        lower-limb-2 (blob-12)
        upper-limb-2 (blob-11)
      limb-2
        extremity-6 (blob-1)
        lower-limb-4 (blob-4)
        upper-limb-3 (blob-5)
      central-body-6 (blob-7)
    upper-arm-1
      line-upper-arm-3-1 (line-8 line-6 line-7)

    node:body-half-1
      type         body-half
      description  component
      bindings     ($5 b-connect-27) ($4 b-connect-26) ($2 limb-2)
                   ($3 central-body-6) ($1 limb-1)
      head-end     (378 . 818)
      center-end   (391 . 653)
      location     (384 . 755)
      delta        68.72
      size         464.0
      uses         (b-connect-29 b-connect-28)
      inferred     upper-body

[Figure: Fixation at 320 704]

    node:limb-1
      type          limb
      description   component
      bindings      ($5 b-connect-11) ($4 b-connect-10) ($3 upper-limb-2)
                    ($2 lower-limb-2) ($1 extremity-1)
      proximal-end  (420 . 832)
      distal-end    (528 . 520)
      location      (511 . 721)
      delta         26.73
      size          267.72
      uses          (b-connect-26 b-connect-15 b-connect-14
                    b-connect-13 b-connect-12)

    node:limb-2
      type          limb
      description   component
      bindings      ($5 b-connect-22) ($4 b-connect-21) ($1 extremity-6)
                    ($2 lower-limb-4) ($3 upper-limb-3)
      proximal-end  (336 . 804)
      distal-end    (112 . 665)
      location      (272 . 668)
      delta         28.09
      size          254.26
      uses          (b-connect-33 b-connect-32 b-connect-31 b-connect-30
                    b-connect-27 b-connect-25 b-connect-24 b-connect-23)
      inferred      arm

The foveal requirement for the next fixation is a result of there being no correspondence for "limb-1", even though the larger structure "body-half-1" does have a correspondence.

[Figure: Fixation at 464 780]

    after feature collection at 464 780 (125/325)
    before grouping 678 models at line level
    after grouping 428 models at line level
    after 2-level 383 models at line level
    inferred arm for limb-1
        from component upper-limb-2 as corresponding upper-arm
    next location selected as (320 . 320) peripheral

    body-half-1
      limb-1
        extremity-1 (blob-13)
        lower-limb-2 (blob-12)
        upper-limb-2 (blob-11)
      limb-2
        extremity-6 (blob-1)
        lower-limb-4 (blob-4)
        upper-limb-3 (blob-5)
      central-body-6 (blob-7)
    upper-arm-2
      line-upper-arm-1-1 (line-14 line-15 line-16)
    upper-arm-1
      line-upper-arm-3-1 (line-8 line-6 line-7)

    after feature collection at 320 320 (125/325)
    before grouping 641 models at line level
    after grouping 583 models at line level
    after 2-level 471 models at line level
    inferred body for rough-body-1
        from component body-half-1 as inferred upper-body
    next location selected as (400 . 512) foveal

    rough-body-1
      head-part-2 (blob-6)
      body-half-1
        limb-1
          extremity-1 (blob-13)
          lower-limb-2 (blob-12)
          upper-limb-2 (blob-11)
        limb-2
          extremity-6 (blob-1)
          lower-limb-4 (blob-4)
          upper-limb-3 (blob-5)
        central-body-6 (blob-7)
      body-half-2
        limb-3
          extremity-7 (blob-15)
          lower-limb-6 (blob-16)
          upper-limb-7 (blob-8)
        limb-4
          extremity-8 (blob-14)
          lower-limb-9 (blob-2)
          upper-limb-5 (blob-3)
        central-body-5 (blob-9)
    upper-arm-2
      line-upper-arm-1-1 (line-14 line-15 line-16)
    upper-arm-1
      line-upper-arm-3-1 (line-8 line-6 line-7)

[Figure: Fixation at 320 320]

    node:rough-body-1
      type         rough-body
      description  component
      bindings     ($5 b-connect-61) ($4 b-connect-63) ($3 body-half-2)
                   ($2 body-half-1) ($1 head-part-2)
      location     (386 . 632)
      size         1067.36
      inferred     body

With this last fixation, the coarse layer interpretation is complete, and the "rough-body" has obtained a correspondence at the fine layer.

    after feature collection at 400 512 (125/325)
    before grouping 929 models at line level
    after grouping 569 models at line level
    after 2-level 549 models at line level
    inferred leg for limb-3
        from component upper-limb-7 as corresponding upper-leg
    inferred lower-body for body-half-2
        from component limb-3 as inferred leg
    inferred lower-body for body-half-2
        from component central-body-5 as corresponding hips

The interpretation will continue to fill in the details by establishing more fine layer nodes until the entire image is processed.

[Figure: Fixation at 400 512]

    location selected as (152 . 469) foveal
    after feature collection at 152 469 (125/325)

[Figure: Fixation at 152 469]

Example Four

The processing is now initiated once more, but with a different starting location, and different values for the feature collection radii.

    -> (setq $radius1$ 100 $radius2$ 275)
    275
    -> (see 600 550)
    after feature collection at 600 550 (100/275)
    before grouping 292 models at line level
    after grouping 42 models at line level
    after 2-level 36 models at line level
    next location selected as (320 . 832) peripheral

    node:hand-1
      type          hand
      description   image3
      bindings      (($1 line-hand-3-1))
      a3d           (-2 90 0)
      posture       open
      proximal-end  (523 . 570)
      location      (523 . 543)
      distal-end    (524 . 517)
      size          55
      extra         (E00005 E00006)
    elaboration:E00005
      side          left
    elaboration:E00006
      side          right

    node:extremity-2
      type           extremity
      description    image1
      bindings       (($1 blob-4))
      ends           ((382 . 611) (400 . 512))
      location       (383 . 557)
      size           100.62
      delta          25.15
      uses           (b-connect-3)

    node:extremity-1
      type           extremity
      description    image1
      bindings       (($1 blob-7))
      ends           ((525 . 576) (528 . 520))
      location       (525 . 551)
      size           56.08
      delta          14.02
      uses           (b-connect-1)
      instantiation  (hand-1)

[Figure: Fixation at 600 550]

    after feature collection at 320 832 (100/275)
    before grouping 766 models at line level
    after grouping 464 models at line level
    after 2-level 382 models at line level
    inferred arm for limb-1
        from component extremity-1 as corresponding hand
    next location selected as (320 . 320) peripheral

    node:limb-1
      type          limb
      description   component
      bindings      ($5 b-connect-8) ($4 b-connect-7) ($3 upper-limb-4)
                    ($2 lower-limb-1) ($1 extremity-1)
      proximal-end  (420 . 832)
      distal-end    (528 . 520)
      location      (511 . 721)
      delta         26.73
      size          267.72
      uses          (b-connect-13 b-connect-12 b-connect-10 b-connect-9)
      inferred      arm

    after feature collection at 320 320 (100/275)
    before grouping 640 models at line level
    after grouping 582 models at line level
    after 2-level 470 models at line level
    next location selected as (383 . 557) foveal

    body-half-1
      limb-2
        extremity-5 (blob-15)
        lower-limb-4 (blob-16)
        upper-limb-7 (blob-3)
      limb-3
        extremity-6 (blob-14)
        lower-limb-6 (blob-13)
        upper-limb-8 (blob-1)
      central-body-1 (blob-4)

[Figure: Fixation at 320 832]

[Figure: Fixation at 383 557]

    node:body-half-1
      type         body-half
      description  component
      bindings     ($5 b-connect-46) ($4 b-connect-45) ($2 limb-3)
                   ($3 central-body-1) ($1 limb-2)
      head-end     (359 . 480)
      center-end   (382 . 611)
      location     (383 . 557)
      delta        35.21
      size         406.73

    after feature collection at 383 557 (100/275)
    before grouping 826 models at line level
    after grouping 598 models at line level
    after 2-level 536 models at line level
    after C-filter 536 models at line level
    inferred leg for limb-2
        from component upper-limb-7 as corresponding upper-leg
    inferred lower-body for body-half-1
        from component limb-2 as inferred leg
    inferred lower-body for body-half-1
        from component central-body-1 as corresponding hips
    next location selected as (152 . 469) foveal

    body-half-1
      limb-2
        extremity-5 (blob-15)
        lower-limb-4 (blob-16)
        upper-limb-7 (blob-3)
      limb-3
        extremity-6 (blob-14)
        lower-limb-6 (blob-13)
        upper-limb-8 (blob-1)
      central-body-1 (blob-4)

    after feature collection at 152 469 (100/275)
    before grouping 796 models at line level
    after grouping 594 models at line level
    after 2-level 562 models at line level
    inferred upper-body for body-half-2
        from component limb-1 as inferred arm
    inferred body for rough-body-1
        from component body-half-2 as inferred upper-body
    inferred body for rough-body-1
        from component body-half-1 as inferred lower-body
    next location selected as (336 . 804) foveal

[Figure: Fixation at 152 469]

    rough-body-1
      head-part-2 (blob-11)
      body-half-2
        limb-1
          extremity-1 (blob-7)
          lower-limb-1 (blob-6)
          upper-limb-4 (blob-5)
        limb-4
          extremity-7 (blob-8)
          lower-limb-3 (blob-9)
          upper-limb-2 (blob-10)
        central-body-4 (blob-2)
      body-half-1
        limb-2
          extremity-5 (blob-15)
          lower-limb-4 (blob-16)
          upper-limb-7 (blob-3)
        limb-3
          extremity-6 (blob-14)
          lower-limb-6 (blob-13)
          upper-limb-8 (blob-1)
        central-body-1 (blob-4)

Now the second example is complete to the extent that an entire coarse layer body is known, and it has a correspondence at the fine layer.
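The "inferred ... from component ..." messages in both examples follow one pattern: once a component of a coarse layer node has a fine layer correspondence, the node itself acquires the correspondence of which that component type is a part. The Python sketch below imitates that upward step. It is a hypothetical illustration, not the thesis's Lisp code. The table `PART_OF` and the function `infer_upward` are invented names, the table lists only pairs that appear in the printouts above, and any further conditions the system applies before accepting an inference are ignored here.

```python
# Fine layer part -> fine layer whole, restricted to pairs seen in the printouts.
PART_OF = {
    "upper-arm": "arm",
    "hand": "arm",
    "upper-leg": "leg",
    "arm": "upper-body",
    "leg": "lower-body",
    "hips": "lower-body",
    "upper-body": "body",
    "lower-body": "body",
}


def infer_upward(parents, known):
    """Propagate fine layer correspondences up a coarse layer parse tree.

    parents: maps each coarse node to the coarse node that contains it.
    known:   maps coarse nodes to fine layer types already established.
    Returns the extended mapping and a list of log messages.
    """
    inferred = dict(known)
    log = []
    changed = True
    while changed:  # repeat until no new correspondence can be added
        changed = False
        for child, parent in parents.items():
            if child in inferred and parent not in inferred:
                whole = PART_OF.get(inferred[child])
                if whole is not None:
                    inferred[parent] = whole
                    log.append(
                        f"inferred {whole} for {parent} from component {child}"
                    )
                    changed = True
    return inferred, log


# Coarse nodes from the first fixation of Example Three.
parents = {"upper-limb-3": "limb-2", "limb-2": "body-half-1"}
result, messages = infer_upward(parents, {"upper-limb-3": "upper-arm"})
for message in messages:
    print(message)
```

The sketch assigns each coarse node a single correspondence and never revises it, which is a simplification made for brevity.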
