Open Collections

UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Automatic interpretation of Landsat images using context sensitive region merging Starr, Dale William 1976

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
831-UBC_1976_A6_7 S73.pdf [ 4.89MB ]
Metadata
JSON: 831-1.0051777.json
JSON-LD: 831-1.0051777-ld.json
RDF/XML (Pretty): 831-1.0051777-rdf.xml
RDF/JSON: 831-1.0051777-rdf.json
Turtle: 831-1.0051777-turtle.txt
N-Triples: 831-1.0051777-rdf-ntriples.txt
Original Record: 831-1.0051777-source.json
Full Text
831-1.0051777-fulltext.txt
Citation
831-1.0051777.ris

Full Text

automatic I n t e r p r e t a t i o n of LANDSAT Images Using Context S e n s i t i v e Region Merging by Dale Wi l l i a m S t a r r B.Math, U n i v e r s i t y of Waterloo, 1974 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOB THE DEGREE OF MASTEB OF SCIENCE i n THE DEPABTMENT OF COMPUTEB SCIENCE We accept t h i s t h e s i s as conforming to the r e g u i r e d standard THE UNIVEESITY OF BBITISH COLUMBIA A p r i l , 1976 (c) Dale W i l l i a m S t a r r , 1976 '• •' > • In p r e s e n t i n g t h i s t h e s i s in p a r t i a l f u l f i l m e n t o f the r e q u i r e m e n t s f o r an advanced degree a t the U n i v e r s i t y o f B r i t i s h C o l u m b i a , I a g r ee t h a t the L i b r a r y s h a l l make i t f r e e l y a v a i l a b l e f o r r e f e r e n c e and s t u d y . I f u r t h e r ag ree t h a t p e r m i s s i o n f o r e x t e n s i v e c o p y i n g o f t h i s t h e s i s f o r s c h o l a r l y pu rpo se s may be g r a n t e d by the Head o f my Department o r by h i s r e p r e s e n t a t i v e s . I t i s u n d e r s t o o d t h a t c o p y i n g o r p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l g a i n s h a l l not be a l l o w e d w i t h o u t my w r i t t e n p e r m i s s i o n . Department o f ^X^JA/JS'f /^t^z^d?. The U n i v e r s i t y o f B r i t i s h Co l umb i a 2075 Wesbrook P l a c e Vancouver, Canada V6T 1W5 Date ^tft. 5^  H?6 i A b s t r a c t Automatic i n t e r p r e t a t i o n of images from Earth Resources Technology S a t e l l i t e - 1 (ERTS-1, now c a l l e d LANDSAT) can be used i n a v a r i e t y of a p p l i c a t i o n s with c o n s i d e r a b l e accuracy. Most systems, however, c l a s s i f y s t r i c t l y on a p o i n t by p o i n t b a s i s , making no use of any s p a t i a l knowledge. Standard p h o t o - i n t e r p r e t a t i o n technigues are combined with some techniques from a r t i f i c i a l i n t e l l i g e n c e to produce an i n c r e a s e i n accuracy over a p o i n t - b y - p o i n t c l a s s i f i c a t i o n method. T r a d i t i o n a l c l a s s i f i c a t i o n methods are used to o b t a i n an i n i t i a l segmentation of the image. Then, a c o n t r o l l e d region merging process a l l o w s the r e g i o n s with unambiguous i n t e r p r e t a t i o n s to i n f l u e n c e the i n t e r p r e t a t i o n o f neighbouring ambiguous r e g i o n s , thereby i n t r o d u c i n g c o n s i d e r a b l e context s e n s i t i v i t y i n t o the i n t e r p r e t a t i o n process. R e s u l t s are given of an experiment t o i n t e r p r e t areas of d i f f e r e n t f o r e s t cover. Acknowledgement I would l i k e to thank Dr. Alan Mackworth for his many ideas, his patience, and the f i n a n c i a l support provided by his NEC Operating Grant A9281. I would also l i k e to thank Dr. Bary Pollack for reading t h i s thesis and Dr. Murtha of the Faculty of Forestry f o r providing the ground truth data so e s s e n t i a l to t h i s study. i i i Table of Contents Introduction 1 Objectives of Remote Sensing 7 Uses of Remote Sensing ...................................... 9 Background 15 Region Merging 19 The Method 22 Results • • 36 Conclusions 41 Beyond the Present 43 Bibliography 44 Appendix 1: Description of Programs 46 Bead From Tape 46 Print Picture 48 Get S t a t i s t i c s 51 E l l i p s e 53 E l l i p s o i d 56 I n i t i a l C l a s s i f i e r 59 The Region Finder 61 The Region Merger 66 i v l i s t of Figures 1. The LANDSAT S a t e l l i t e and i t s F i e l d of View 2 2., A Sample LANDS AT Image 4 3. Ground-truth Data 24 4. Band 5-6 E l l i p s e s of Concentration 27 5. Band 5-7 E l l i p s e s of Concentration 28 6. Band 6-7 E l l i p s e s of Concentration 29 7. I n i t i a l C l a s s i f i c a t i o n with no Ambiguous Regions ....... 38 8. I n i t i a l C l a s s i f i c a t i o n with Ambiguous Regions .......... 39 9. F i n a l P a r t i t i o n After Merging , 40 V L i s t of Tables I . Suggested L e v e l One C l a s s i f i c a t i o n Features ............. 7 I I . Summary of LANDS AT Uses 13 I I I . I n i t i a l C l a s s i f i c a t i o n with Ambiguous Regions ........ 32 IV. I n i t i a l C l a s s i f i c a t i o n with no Ambiguous Regions 36 V. j O v e r p r i n t e d C h a r a c t e r s f o r each R e f l e c t a n c e Range 50 1 I n t r o d u c t i o n A p i c t u r e can be s t o r e d on the computer as an a r r a y on an NxM g r i d G. (Each poi n t on the g r i d i s c a l l e d a p i c t u r e element or p i x e l ) . The a r r a y i s a f u n c t i o n on G whose value a t each p o i n t i s some measure of l i g h t at t h a t p o i n t . I f a simple b l a c k and white photograph were s t o r e d the value would simply be the b r i g h t n e s s of the p o i n t . A c o l o u r photo would r e g u i r e more i n f o r m a t i o n so that the b r i g h t n e s s of the c o l o u r and l i g h t i n t e n s i t y of the p o i n t are known f o r the b a s i c c o l o u r s . In t h i s way, with the r i g h t eguipment, one i s a b l e to a c c u r a t e l y reproduce a p i c t u r e from the d i g i t a l data. T h i s i s what i s done f o r s a t e l l i t e p i c t u r e s . In 1972 Earth Resources Technology S a t e l l i t e 1 (ERTS-1, now c a l l e d LANDSAT) was launched by NASA i n the United States as p a r t of a program to demonstrate the f e a s i b i l i t y and p r a c t i c a l i t y of remote sensing from space. The s a t e l l i t e has an apogee of 917 km., a perigee of 898 km., with a r o t a t i o n a l p e r i o d of 103 minutes. T h i s o r b i t covers the e a r t h ' s s u r f a c e from a l a t i t u d e of 81 degrees S t o 81 degrees N, r e p e a t i n g the coverage every 18 days at the same l o c a l time. The b a s i c component of the s a t e l l i t e , as shown i n F i g u r e 1 (from Gower and Daniel (13) and B e r n s t e i n ( 4 ) ) , i s a m u l t i s p e c t r a l scanner which c o n s i s t s of an o s c i l l a t i n g m irror and an o p t i c a l system which r e f l e c t and d i r e c t scene r a d i a n c e values onto a d e t e c t o r a r r a y t h a t i s s e n s i t i v e to wavelengths i n f o u r s p e c t r a l bands as f o l l o w s : Figure 1 The LANDSAT S a t e l l i t e and i t s f i e l d of view. 3 Band 4: Green portion of spectrum .5 - .6 micrometers Band 5: Red portion .6 - .7 Band 6: Near infrared .7 - .8 Band 7: Further infrared .8 -1. 1 Six scan l i n e s are simultaneously swept i n each spectral band with each o s c i l l a t i o n of the mirror, the 11.56 degree f i e l d of view covering a swath of 185 km. on the earth's surface. The output of the detectors i s i n six b i t per word d i g i t a l form f o r transmission to ground receiving stations. Each point of the scene has associated with i t four pieces of data: the radiance values of the point i n each of the spectral ranges defined above. Hith the LANDSAT-1 eguipment a resolution of 4600 sguare metres per pix e l i s achieved; plans for future s a t e l l i t e s reduce t h i s s i g n i f i c a n t l y . A sample LANDSAT image i s given in Figure 2, showing southern Vancouver Island and the S t r a i t of Georgia. Unfortunately, there are many sources of error for the multispectral scanner which must be corrected before the data can be used. Some of these are (from Bernstein ( 4 ) ) : Al t i t u d e : Differences of the spacecraft from i t s nominal a l t i t u d e cause d i s t o r t i o n s of scale which must be corrected. Attitude: The axis of the platform i s maintained with one axis normal to the surface of the earth and another p a r a l l e l to the velocity vector of the c r a f t . Distortions r e s u l t when perturbations to the correct Figure 2 A Sample LANDSAT Image headings are experienced. Scan Skew: While the mirror i s scanning, the spacecraft i s moving along the ground track. Therefore the swath obtained i s not normal to the ground track vector but skewed s l i g h t l y . Velocity.: If the velocity changes from the nominal value, the ground track covered by a given number of successive mirror sweeps changes, producing d i s t o r t i o n s along the ground track. Earth Rotation: As the mirror sweeps, not only does the spacecraft move, but the earth also rotates beneath i t . Therefore, there i s a gradual westward s h i f t of the ground swath being scanned, causing along scan d i s t o r t i o n s . Mirror Sweep: Imperfections i n the c o n t r o l l i n g mechanism of the mirror cause the scanning rate to vary, creating along scan d i s t o r t i o n s . P § £ § f i Q g t i y e : For most applications the desired images represent the projection of points on the earth on a plane tangent to the earth at the point d i r e c t l y below the s a t e l l i t e . The data; however, represent a perspective projection as a re s u l t of the curvature of the surface. Atmospheric E f f e c t s : The l i g h t received by the s a t e l l i t e i s dispersed and attenuated by the atmosphere. Compensation f o r thi s e f f e c t i s d i f f i c u l t , but i f i t can be done i t enhances the 6 accuracy of the i n f o r m a t i o n r e c e i v e d . Tapes r e c e i v e d from the Canadian Centre f o r Remote Sensing have v a r i o u s c o r r e c t i v e o p e r a t i o n s performed on them to compensate f o r these i n p e r f e c t i o n s . The data a l s o i s put i n t o an e i g h t b i t word format and the s t r u c t u r e of the data on the tape i s completely changed f o r e a s i e r use. 7 O b j e c t i v e s of Remote Sensing The o b j e c t i v e of automatic i n t e r p r e t a t i o n i s t o produce a meaningful p a r t i t i o n of the scene presented, p l a c i n g each p i x e l i n t o one of a number of c l a s s e s , depending on the a p p l i c a t i o n of the study. For example, i n the p e r i o d i c t r a c i n g of the growth of a c i t y i t may be s u f f i c i e n t to have only three c a t e g o r i e s f o r a gross study -say r e s i d e n t i a l , i n d u s t r i a l - c o m m e r c i a l , and other while a more in-depth study might delve i n t o more s p e c i f i c s u b c l a s s e s of these l a r g e r groupings. In 1971 the Conference on Land Use Information and C l a s s i f i c a t i o n (15) provided some c l a s s i f i c a t i o n o b j e c t i v e s . The p a r t i c i p a n t s d i v i d e d t h e i r c l a s s e s i n t o two l e v e l s , the f i r s t as gi v e n i n Table I. 8 I. Urban and built-up. I I . Transportation, communication and u t i l i t i e s I I I . Farming IV. Grassland V. Forest land VI. Extractive (mining) VII. Water VIII. Marshland IX. Tundra X. Barren land XI. Permanent snow f i e l d s Table I Suggested l e v e l One C l a s s i f i c a t i o n Features Level two features were subsections of those of l e v e l one. For example, urban was s p l i t into r e s i d e n t i a l , commercial, i n d u s t r i a l , services, r e c r e a t i o n a l , transportation and other. Similar subsections were proposed for the other groupings where appropriate. The basic aim was to achieve l e v e l one interpretation through the use of s a t e l l i t e imagery alone and l e v e l two through a combination of s a t e l l i t e and air-photo techniques. These aims were s l i g h t l y conservative though, since success has been achieved i n i d e n t i f y i n g most l e v e l two classes s o l e l y from s a t e l l i t e imagery at much less cost than i f a i r c r a f t had been used. 9 Uses of Bemote Sensing from S a t e l l i t e s One of the main uses of remote s e n s i n g from space l i e s i n the area of land-use i n v e n t o r i e s . Present data that i s used to c o n s t r u c t an i n v e n t o r y map may be d e r i v e d from d i f f e r e n t years and c o l l e c t e d by d i v e r s e methods. T h i s o b v i o u s l y l e a d s to i n a c c u r a c y and high c o s t s , not to mention the time i t takes to gather such i n f o r m a t i o n . T h i s i s e s p e c i a l l y t rue i n developing c o u n t r i e s which may net have the f a c i l i t i e s f o r such data g a t h e r i n g , but to whom the r e s u l t s may be of extreme importance. With an e f f e c t i v e system of s a t e l l i t e remote s e n s i n g a resource i n v e n t o r y can be done much more g u i c k l y and at l e s s c o s t than by c o n v e n t i o n a l methods with the added b e n e f i t of a l l the data coming from a s i n g l e source. One area where t h i s c o u l d be important to Canada i s i n p r e d i c t i n g wheat crops throughout the world so that the export market can be estimated. Zwarun (24) r e p o r t e d t h a t i n 1975 most p r e d i c t i o n s were of a l a r g e S o v i e t crop, c a u s i n g world p r i c e s to f a l l and farmers t o make seeding plans i n a n t i c i p a t i o n of the Russian crop. American s c i e n t i s t s ; however, using s a t e l l i t e data c o r r e c t l y p r e d i c t e d t h a t the S o v i e t crop would be a f a i l u r e . Had the Canadians used the same remote s e n s i n g data more wheat could have been planted i n a n t i c i p a t i o n of i n c r e a s e d export demands. T h i s a b i l i t y to determine y i e l d s should be cf major importance to the world as a whole s i n c e most crops can be d i f f e r e n t i a t e d using s a t e l l i t e data. In a d d i t i o n t o t h i s , the h e a l t h of most cr o p s a l s o i s d e t e c t a b l e , e n a b l i n g b l i g h t s t o be 10 c o n t r o l l e d before they spread over too great an area (see r e f e r e n c e s 2 and 3). The f o r e s t i n d u s t r y a l s o should b e n e f i t from a p p l i c a t i o n s of remote s a t e l l i t e s e n s i n g . I n v e n t o r i e s of a v a i l a b l e wood w i l l be made much l e s s c o s t l y and e a s i e r t o o b t a i n than i t p r e s e n t l y i s , e s p e c i a l l y i n i n h o s p i t a b l e mountain areas. F o r e s t f i r e damage can be g u i c k l y mapped by s a t e l l i t e s , p r o v i d i n g companies with guick e s t i m a t e s of r e p l a n t i n g c o s t s . In much the same way urban land-use can be obtained from LANDSAT data, as demonstrated by E l l e f s e n et a l . (9) and Todd et a l . (22). The r e p e t i t i v e coverage i s u s e f u l f o r t r a c i n g the growth of c i t i e s and co u l d prove u s e f u l i n c i t y p l a n n i n g , although c i t i e s u s u a l l y have much of t h i s data r e a d i l y a v a i l a b l e . H o f f e r et a l . (16) have demonstrated the e f f e c t i v e n e s s of p r o v i d i n g areas of i n t e r e s t to mining companies even i n h e a v i l y vegetated areas. While they can't d e t e c t minerals they can i d e n t i f y c e r t a i n g e o l o g i c a l f e a t u r e s which give c l u e s t o high p o t e n t i a l areas. T h i s could be of p a r t i c u l a r importance i n the Canadian north where vast areas of uncharted l a n d c o u l d be viewed by s a t e l l i t e t o provide mining concerns with areas where f u r t h e r study should be c a r r i e d out. Another use i n the g e o l o g i c a l area i s i n earthquake s t u d i e s i S a t e l l i t e photos provide e x c e l l e n t views of l a r g e areas making f a u l t l i n e s r e a d i l y apparent. T h i s might have use i n p l a n n i n g c i t i e s so that major p o p u l a t i o n c e n t r e s or i n d u s t r i a l complexes are not placed on or near high r i s k areas. 11 V o l c a n i c d i s t u r b a n c e s a l s o can be detected but i n a d d i t i o n to t h i s i t may be p o s s i b l e to p r e d i c t when a c t i v i t y w i l l occur, as o u t l i n e d by Friedman et a l . (11). During p e r i o d s of repose of a c t i v e volcanoes, as we l l as durin g p e r i o d s of e r r u p t i c n , a s i z e a b l e heat f l u x reaches the s u r f a c e . H o p e f u l l y t h i s can be monitored and used as a p r e d i c i t i v e c l u e t o impending a c t i v i t y . The c u r r e n t energy c r i s i s and r a p i d i n c r e a s e i n world o i l and energy p r i c e s have had a great impact on the Canadian north i n terms of f o s s i l f u e l e x p l o i t a t i o n and i n development of h y d r o e l e c t r i c p r o j e c t s . There e x i s t s a need f o r i n f o r m a t i o n on which t o base d e c i s i o n s concerning r o u t i n g of roads and p i p e l i n e s . Resource s a t e l l i t e s can provide t h i s data i n a s h o r t time and a t a reasonable cost so t h a t d e t a i l e d mapping can be c a r r i e d out i n the areas of s p e c i f i c i n t e r e s t . Clough et a l . (7) r e p o r t t h a t a c o s t b e n e f i t study on remote sensing of sea i c e f o r e c a s t e d savings to A r c t i c s h i p p i n g of $4 m i l l i o n i n 1975, r i s i n g to $100 m i l l i o n by 1990. T h i s s a v i n g i s p r e d i c t i n g i n c r e a s e d use of o i l tankers through the A r c t i c , but i f t h i s i s not undertaken c o n s i d e r a b l e savings s t i l l could be obtained by wheat s h i p s going through C h u r c h i l l , Manitoba, where the n a v i g a t i o n season i s very s h o r t and any a i d s i n g e t t i n g through the i c e as g u i c k l y as p o s s i b l e would be a major help. Gerson (12) has demonstrated t h a t i t i s p o s s i b l e to map i c e f i e l d s by computer p r o c e s s i n g of s a t e l l i t e p i c t u r e s where the main d i f f i c u l t y l i e s i n d i f f e r e n t i a t i n g between cl o u d and the i c e . The mapping of i c e i s not only u s e f u l to n a v i g a t i o n but a l s o can be used to plan many a c t i v i t i e s such as 12 o f f s h o r e d r i l l i n g or c o n s t r u c t i o n of a r t i f i c i a l i s l a n d s which r e g u i r e good knowledge of freeze-up or break-up c o n d i t i o n s . Ocean c u r r e n t s can be detected because of the d i f f e r e n c e s i n temperature between the c u r r e n t and the surrounding water. By s t u d y i n g t h i s i t should be p o s s i b l e t o gain more knowledge i n t o d i s t r i b u t i o n s of f i s h p o p u l a t i o n s f o r more e f f e c t i v e and e c o l o g i c a l l y sound f i s h i n g . Taking t h i s f u r t h e r i n t o the f u t u r e i t might prove u s e f u l , once farming of the sea becomes a r e a l i t y , t o know areas where the temperature i s optimal f o r a c e r t a i n crop t o d e l i v e r i t s maximum y i e l d . Another area where s a t e l l i t e imagery could prove u s e f u l i s i n monitoring the amount of snow present i n s p r i n g t i m e so that r u n o f f can be p r e d i c t e d to prevent f l o o d i n g . At present though, the 18 day c y c l e time i s probably too long f o r t h i s purpose, but s e v e r a l s a t e l l i t e s c o u l d be u t i l i z e d . F i n a l l y , some p o l l u t i o n monitoring c o u l d be accomplished through the use of s a t e l l i t e data. On r i v e r s and l a k e s one can r e a d i l y d e t e c t sources of p o l l u t i o n as a r e s u l t of sudden changes i n c o l o u r or temperature of the water. O i l s p i l l s from p i p e l i n e s a l s o c o u l d be detected i n t h i s way as the o i l u s u a l l y covers a f a i r l y l a r g e area. T h i s i s p r e s e n t l y done by a i r c r a f t which p a t r o l the p i p e l i n e s each week or so but c o u l d probably be done using LANDSAT data s i n c e only a change from normal has to be detected i n s t e a d of a whole new c l a s s i f i c a t i o n . Table 2, from B e r n s t e i n (4) summarizes of most of the important uses f o r LANDSAT imagery. 1 3 A g r i c u l t u r e , F o r e s t r y Crop census Crop y i e l d estimates I d e n t i f i c a t i o n of v e g e t a t i o n d i s e a s e Land use i n v e n t o r y Hydrology water resource i n v e n t o r y Fresh water source i d e n t i f i c a t i o n Flood monitoring P o l l u t i o n monitoring of lakes and r i v e r s Oceanography, marine resources F i s h p r oduction Ship r o u t i n g Sea s t a t e Ice c o n d i t i o n s Geology Geologic and p h y s i o g r a p h i c mapping M i n e r a l e x p l o r a t i o n Earthguake area s t u d i e s G l a c i e r and v o l c a n i c s t u d i e s S h o r e l i n e e r o s i o n Geography Land use mapping P h y s i c a l geography Cartography Urban planning Demography Table I I Summary of LANDSAT Uses As i s r e a d i l y observable the uses of remote s e n s i n g from space are many and v a r i e d and l i k e l y w i l l i n c r e a s e as the r e s o l u t i o n of the imagery i n c r e a s e s . T h e i r g r e a t e s t impact may be f e l t i n those c o u n t r i e s which at present l a c k many of the data g a t h e r i n g f a c i l i t i e s common i n North America. But perhaps the most important use w i l l l i e i n the area of food p r o d u c t i o n , where s a t e l l i t e s can be used t o de t e c t unused a r a b l e l a n d , p r e d i c t y i e l d s to predetermine d i s t r i b u t i o n , spot b l i g h t areas, e t c . 14 Clough (6) r e p o r t s t h a t the estimated c o s t of the LANDSAT program i n the U.S. w i l l be about $20 to $50 m i l l i o n per annum while the estimated p o t e n t i a l b e n e f i t s should exceed $1.4 b i l l i o n per annum. As an example, minimizing f l o o d damage may save $300 m i l l i o n and improving f o r e c a s t s of a v a i l a b i l i t y of i r r i g a t i o n water may save $280 m i l l i o n each year i n the U.S. Alone. This makes the use of such s a t e l l i t e s not onl y f e a s i b l e but extremely economically v i a b l e . 15 Background Systems for automatically interpreting LANDSAT images generally can be divided into two basic groups -supervised and non-supervised c l a s s i f i c a t i o n systems. Supervised c l a s s i f i c a t i o n reguires some input by the experimenter to i d e n t i f y certain t r a i n i n g areas i n the scene. Each class under consideration has i t s own areas so that the necessary s t a t i s t i c s can be calculated. The means and covariance matrix are calculated for the class within these areas so i t i s obviously of great importance that the tr a i n i n g areas be as homogeneous as possible, consisting solely of that c l a s s ; otherwise the s t a t i s t i c s w i l l not be as accurate as one would l i k e . Once this i s carr i e d out for a l l classes the pa r t i t i o n i n g of the picture can be done by placing each point in that group which gives i t the highest probability of membership. This usually i s done using a maximum li k e l i h o o d function. Unsupervised c l a s s i f i c a t i o n usually employs cluster analysis (or factor analysis) to automatically group a given set of data into the most s p e c t r a l l y separable c l u s t e r s using several wavelength bands. Different features of the earth's surface i d e a l l y would have their own unigue spectral response, thereby creating d i s t i n c t clusters of points belonging to each feature i n the spectral space. Unfortunately, t h i s i s seldom the case as the clusters usually overlap to some extent, thereby causing errors in the c l a s s i f i e r . Many studies have used these approaches with considerable 16 suc c e s s . E l l e f s e n et a l . (9) and Todd et a l . (22) c a r r i e d out s i m i l a r s t u d i e s i n urban land-use mapping from s a t e l l i t e d a t a. Todd e t a l . used a c l u s t e r a n a l y t i c approach with the number of c l a s s e s set at 14. R e s u l t s were s a i d t o be poor so i t was necessary t o use ground-truth data to r e i n f o r c e the c l u s t e r i n g approach. O v e r a l l over 90$ of the p o i n t s were c o r r e c t l y i d e n t i f i e d with most e r r o r s o c c u r r i n g near the u r b a n - r u r a l f r i n g e where upper income r e s i d e n t i a l areas were c l a s s i f i e d , not u n n a t u r a l l y , as grassy areas. E l l e f s e n et a l . a l s o used c l u s t e r a n a l y s i s i n t h e i r study with an o v e r a l l accuracy of about 8735. Freguent e r r o r s o c c u r r e d i n m i s c l a s s i f y i n g o l d r e s i d e n t i a l areas as wooded, open space; o b v i o u s l y something which e a s i l y c o u l d be c o r r e c t e d i f s p a t i a l knowledge was i n c o r p o r a t e d i n the system. Robertson (18) used a d i f f e r e n t approach from most systems. The image was r e c u r s i v e l y p a r t i t i o n e d i n t o b l o c k s of image p o i n t s such t h a t each block contained p o i n t s from a s i n g l e c l a s s . These blocks were then c l a s s i f i e d as a whole, using t e x t u r e and other s p a t i a l c h a r a c t e r i s t i c s to i n f l u e n c e the c l a s s i f i c a t i o n . Accuracy was i n c r e a s e d s l i g h t l y over a p o i n t Jay point c l a s s i f i e r (with an average accuracy of 82%) at a c o s t of about ten times the computing time. T e s t s were c a r r i e d out on both a i r c r a f t and s a t e l l i t e imagery but i n e x p l i c a b l y the r e s u l t s from a i r p h o t o s were only comparable to those from s a t e l l i t e , whereas i n most cases a i r p h o t o s r e s u l t e d i n g r e a t e r accuracy because of the much s m a l l e r area covered by a s i n g l e p i x e l . 17 Gupta e t a l . (14) a l s o f e l t t h a t p o i n t by p o i n t c l a s s i f i c a t i o n was l e s s than o p t i m a l . They employed a boundary f i n d e r to produce homogeneous r e g i o n s i n the scene. The images under c o n s i d e r a t i o n were those of a g r i c u l t u r a l f i e l d s - o b v i o u s l y w e l l s u i t e d to the method used s i n c e one would expect the boundaries to be r e l a t i v e l y s t r a i g h t and t h e r e f o r e e a s i l y d e t e c t a b l e . Once the c l o s e d r e g i o n s were found c l a s s i f i c a t i o n was done using a maximum l i k e l i h o o d c l a s s i f i e r and a minimum d i s t a n c e c l a s s i f i e r . R e s u l t s of the study were extremely good. On a p o i n t by p o i n t b a s i s an e r r o r r a t e of only 4. 1% was observed but t h i s was reduced to 255 with t h e i r system. One must bear i n mind; however, that the t e s t s were c a r r i e d out on a i r c r a f t imagery so the e x c e l l e n t accuracy i s not a l l t h a t unexpected. Bajcsy and T a v a k o l i (1) c a r r i e d out an i n t e r e s t i n g study to i d e n t i f y b r i d g e s , i s l a n d s , r i v e r s , and l a k e s from s a t e l l i t e imagery. They used a world model to d e s c r i b e the o b j e c t s of i n t e r e s t . For example, a bridge has the s p e c t r a l response of the l a n d ; i t has a t h i n , elongated shape; i t i s connected to the l a n d on i t s s h o r t s i d e s and surrounded by water on i t s l o n g e r s i d e s . S i m i l a r i l y , every o b j e c t of i n t e r e s t to the study was d e s c r i b e d i n the program. I n i t i a l segmentation of the p i c t u r e was f a i r l y simple, needing only to d i f f e r e n t i a t e between land and water although i t had to be g u i t e s e n s i t i v e i n order to f i n d evidence of b r i d g e s which were narrower than the p i x e l s i z e . A s k e l e t o n operator was then used to f i n d the shape of the r e g i o n s i n the p i c t u r e which were then used i n 18 c o n n e c t i o n with the model. An hypothesis was made about a f i g u r e and then compared with the model i n an ordered way. For example, b r i d g e s were i d e n t i f i e d f i r s t and then removed from the p i c t u r e so t h a t r i v e r s and i s l a n d s can be i d e n t i f i e d . A f t e r a l l t h i s was done t h e i r program was able to i d e n t i f y a l l the b r i d g e s they had hoped f o r , p l u s some others which they d i d not expect t o f i n d because of t h e i r s m a l l width. The preceeding study made the most use of context to i n t e r p r e t the scene; However, i t had the disadvantage t h a t i n an u n s t r u c t u r e d scene one would be hard pressed to come up with an e g u i v a l e n t world model. However, scene a n a l y s i s technigues from a r t i f i c i a l i n t e l l i g e n c e seem to provide a good b a s i s f o r use i n s a t e l l i t e image i n t e r p r e t a t i o n . 19 Region Merging Brice and Fenema (5) described a method of scene analysis involving the use of regions and region merging to reduce the number of small regions i n a picture into larger, more meaningful pieces. (A region simply i s a connected set of p i x e l s ) . I n i t i a l l y the picture was partitioned into regions of egual grey value or radiance. This; however, did not generally r e s u l t i n a p a r t i t i o n allowing simple interpretation of the scene. Many f a l s e regions were created because of uneven i l l u m i n a t i o n , noise, etc. A f i r s t approach to solving t h i s would be to group points i f t h e i r features are not too d i f f e r e n t . However, this may tend to produce regions that extend across the natural boundaries of the scene i f two actual regions are not too d i f f e r e n t across the separating boundary. Their approach; therefore, was to s t a r t with atomic regions and use global c r i t e r i a to merge them. The f i r s t pass merged regions i n such a way that the resulting boundary was shortened. Even i f the boundary between two regions was weak, the two were joined only i f their r e s u l t i n g boundary did not grow too guickly. More precisely, t h e i r so-called phagocyte h e u r i s t i c was to merge adjacent regions R_i and R_j only i f H/Pm > THETA1, where Pm = min{Perim_i, Perim_ j) , THETA1 i s the threshold, and W i s the length of the weak boundary between the two regions. (A weak boundary i s defined as the boundary vector between two pixels having a difference i n gray value less than some strength threshold). 20 T h i s h e u r i s t i c removed many of the s u p e r f l u o u s r e g i o n s from the p i c t u r e , but many f a l s e s e p a r a t i o n s s t i l l e x i s t e d so another h e u r i s t i c ; the weakness h e u r i s t i c , was a p p l i e d . T h i s j o i n e d two r e g i o n s only i f W/I > Theta2, where 8 was again the weak p o r t i o n of the i n t e r s e c t i o n I of the two r e g i o n s . A f t e r these two h e u r i s t i c s have been a p p l i e d most of the n a t u r a l r e g i o n s were present, with very few r e g i o n s t h a t weren't a c t u a l l y i n the scene. Feldman and Yakimovsky (10) have done much work on a semantics based r e g i o n a n a l y z e r . While t h e i r aims were d i f f e r e n t (they were i n t e r e s t e d i n understanding everyday scenes) the i d e a s seem t o be r e a d i l y a p p l i c a b l e to LANDSAT imagery. A major problem i n r e g i o n a n a l y s i s i s t h a t most systems use ab s o l u t e c r i t e r i a f o r d e c i d i n g what c o n s t i t u t e s a r e g i o n F e l d m a n and Yakimovsky attempt to vary these c r i t e r i a with c o n t e x t . For example, c e r t a i n shades of green, brown and yellow might be merged i n t o a s i n g l e area of g r a s s i n one p i c t u r e , while remaining completely separate e n t i t i e s i n another, or even i n d i f f e r e n t areas of the same p i c t u r e . As i n B r i c e and Fenema, they began with atomic r e g i o n s which were s e g u e n t i a l l y merged i n t o l a r g e r r e g i o n s . I n i t i a l l y they d i d a pass which merged r e g i o n s based on a weakest boundary f i r s t c r i t e r i a . In other words, the p a i r of re g i o n s with the weakest boundary was merged i n t o one r e g i o n . T h i s process was stopped f a i r l y e a r l y with c o n s e r v a t i v e t h r e s h o l d s f o r merging so t h a t i t d i d not produce f a l s e merges. The semantic c o n t r o l l e d program was then c a l l e d t o produce 21 the f i n a l p i c t u r e using v a r i o u s h e u r i s t i c s to choose the r e g i o n s t o merge. T h i s process depended on semantic i n f o r m a t i o n known by the program which i n c l u d e d such items as shape, p o s i t i o n , boundary smoothness, e t c . with some knowledge of the r e l a t i o n s h i p s between p o s s i b l e o b j e c t s i n the scene with r e s p e c t to these f e a t u r e s . T h e i r f i n a l r e s u l t s were q u i t e i m p r e s s i v e i n the s i m p l i c i t y and c o r r e c t n e s s , with the system a b l e t o c o r r e c t l y i d e n t i f y most o b j e c t s i n the scene with few f a l s e r e g i o n s . H o p e f u l l y s a t e l l i t e imagery i n t e r p r e t a t i o n can be taken to such e l e g a n t h e i g h t s , Tennenbaum and Weyl ( 2 1 ) i n c o r p o r a t e d i d e a s from both the above to analyze everyday scenes. T h e i r system used i n p u t from the experimenter t o supply an i n t e r p r e t a t i o n when the system reached p o i n t s of u n c e r t a i n t y , U i t h t h i s approach a user could q u i c k l y o u t l i n e major r e g i o n s of a p i c t u r e which would then be used as an i n i t i a l p a r t i t i o n , e s p e c i a l l y i n scenes which are too complex to process by computer alone. While the above systems are used to analyze everyday scenes, the s t r u c t u r e s they produce are e x a c t l y what i s needed f o r s a t e l l i t e imagery. Feldman and Yakimovsky take a r e g i o n of h i g h e s t c o n f i d e n c e and a s s i g n t o i t the most l i k e l y i n t e r p r e t a t i o n . ' T h i s allows the p r o b a b i l i t y of adjacent r e g i o n s to be changed by c o n s i d e r i n g the i n t e r - r e l a t i o n s h i p s between v a r i o u s p o s s i b l e i n t e r p r e t a t i o n s . T h i s i s the i d e a I wished t o apply to s a t e l l i t e p h o t o - i n t e r p r e t a t i o n . 22 The Method The aim was to be able to show t h a t i n t e g r a t i o n of some scene a n a l y s i s i d e a s from a r t i f i c i a l i n t e l l i g e n c e c o u l d be a p p l i e d to the problem of c l a s s i f y i n g LANDSAT photos. H o p e f u l l y an i n c r e a s e i n accuracy and a s i m p l e r p i c t u r e would r e s u l t over the p o i n t by p o i n t method. Dr. Murtha of the F a c u l t y of F o r e s t r y provided a d e t a i l e d ground-truth map of a f o r e s t e d area on Vancouver I s l a n d so t h i s became the t e s t area, with the aim of c l a s s i f y i n g r e g i o n s of o l d growth, second growth, r e c e n t l o g g i n g , and water. T h i s ground-truth data provided t r a i n i n g areas where the s t a t i s t i c s of these f o u r c l a s s e s c o u l d be determined. T h i s i s necessary s i n c e a s u p e r v i s e d maximum l i k e l i h o o d c l a s s i f i e r i s used. In order to use t h i s ground-truth data, i t f i r s t must be transformed i n t o machine readable form. T h i s can be accomplished with the a i d of a d i g i t i z e r i n the Mechanical E n g i n e e r i n g Department. F i r s t though, some p o i n t s must be e s t a b l i s h e d as ground c o n t r o l p o i n t s . These are p o i n t s which are r e a d i l y i d e n t i f i a b l e on both the ground-truth data and the d i g i t a l data. In t h i s case there are s e v e r a l s m a l l l a k e s which stand out c l e a r l y from other f e a t u r e s on a computer p r i n t - o u t of the scene (see Appendix 1, P r i n t P i c t u r e ) . Therefore i t i s a simple matter to determine the exact p o s i t i o n of a number of p i x e l s i n the p i c t u r e matrix. Once the ground-truth data i s put i n t o matrix form i t then becomes a matter of t r a n s f o r m i n g the p o i n t s i n the ground t r u t h t o the corresponding p o i n t of the d i g i t a l data. 23 I t i s p o s s i b l e on the d i g i t a l data to f i n d two ground c o n t r o l p o i n t s which l i e i n the same row of the matrix. T h i s d e f i n e s a l i n e on the v i s u a l ground t r u t h data which can be used as a base l i n e f o r d i g i t i z i n g . Assume the area of i n t e r e s t covers x cm. on the v i s u a l data and y rows of p i x e l s i n the a c t u a l p i c t u r e matrix. Then each row of the matrix corresponds to x/y cm. of v i s u a l data. T h e r e f o r e , i n order to d i g i t i z e the v i s u a l data a scan i s s t a r t e d at the o r i g i n and sweeps a c r o s s the base l i n e , r e c o r d i n g the p o s i t i o n of each boundary encountered plus a number i n d i c a t i n g the type of r e g i o n to the r i g h t cf the boundary. Once t h i s i s done f o r the base l i n e the next sweep i s made along a p a r a l l e l l i n e x/y cm. above the l a s t one, which corresponds to the next row i n t h e matrix. T h i s i s done f o r as many rows as necessary, thereby p r o v i d i n g a matrix of ground t r u t h data. However, these p o i n t s are determined by an (x,y) p o s i t i o n which i s measured i n inches from the o r i g i n . T h i s must be transformed to the c o - o r d i n a t e system used by the a c t u a l data. T h i s t r a n s f o r m a t i o n was done using a l e a s t squares f i t p r o v i d i n g r e s u l t s which were at most 1% o f f the c o r r e c t values f o r the ground c o n t r o l p o i n t s . The r e s u l t s are shown i n Figure 3, showing only the area of the study. Once t h i s i s accomplished i t i s p o s s i b l e t o provide c e r t a i n areas where the s u r f a c e f e a t u r e s are known, i n order to e s t a b l i s h means and the c o v a r i a n c e matrix f o r the r e f l e c t a n c e v e c t o r s f o r the f e a t u r e s . Then, assuming a multi-normal d i s t r i b u t i o n about the c l a s s means, a p o i n t i s a s s i g n e d to a 24 •1 • Figure 3 Ground-truth Data | * 1 Second Growth • Old Growth Recent Logging Water 25 p a r t i c u l a r c l a s s on the b a s i s of a multi-normal p r o b a b i l i t y f u n c t i o n . In p a r t i c u l a r , there w i l l be a f u n c t i o n f o r each c l a s s , each t a k i n g the form as f o l l o w s : (see S t e i n e r (19)) f(X,S)= _ L {-ZZA ( X - y r ) ( X - l i ; )} where |S| i s the determinant of S, which i s the covar i a n c e matrix f o r the c l a s s . 2 o„ ... 2 ' 2 a 1 m. 3 m i s the c o f a c t o r of S i n the i t h row and j t h column. • y y. and y, are the means of the i t h and j t h elements of the f e a t u r e v e c t o r s r e s p e c t i v e l y . The f e a t u r e v e c t o r i s simply the data known f o r each p o i n t of the p i c t u r e . From the a c t u a l data, f o u r p i e c e s of i n f o r m a t i o n are known -the r e f l e c t a n c e v a l u e s i n each of the four bands. Only bands f i v e , s i x , and seven were used though, s i n c e the values f o r band fo u r were extremely c l o s e f o r a l l p i x e l s and provided l i t t l e a d d i t i o n a l i n f o r m a t i o n . I n order t o a s s i g n a p o i n t t o a c l a s s , a t r a d i t i o n a l p o i n t by p o i n t method would determine the p r o b a b i l i t y t h a t the p o i n t belongs t o each c l a s s and then a s s i g n i t to the c l a s s which gave i t the h i g h e s t p r o b a b i l i t y of membership. T h i s b a s i c technique was augmented as f o l l o w s . I f p1, p2, p3, and pU are 26 the p r o b a b i l i t i e s r e c e i v e d by a p o i n t from the f o u r p r o b a b i l i t y f u n c t i o n s , the g r e a t e s t of them, pmax, must be g r e a t e r than some t h r e s h o l d before the p o i n t w i l l be assigned t o the c l a s s which produced pmax. More p r e c i s e l y , i f pmax > k (p1+p2+p3+p4)/100 (where k i s the t h r e s h o l d value) the p o i n t i s placed i n the c l a s s producing pmax. I f not, i t i s assigned to a new c l a s s which depends on pmax and the second highest p r o b a b i l i t y . These new c l a s s e s ( c a l l e d ambiguous c l a s s e s ) are used when the c l a s s i f i e r i s not sure about the c l a s s i n which t o place a p o i n t . For example, i f a p o i n t r e c e i v e s i t s highest p r o b a b i l i t y of membership from c l a s s 1, but a l s o r e c e i v e s a f a i r l y high one from c l a s s 2, i t i s assigned to a c l a s s which c o n t a i n s a l l p o i n t s which are l i k e l y c l a s s 1 or 2, but about which some doubt e x i s t s . T h i s method i n c r e a s e s the number of c l a s s e s from the b a s i c f o u r t o a maximum of ten; s p e c i f i c a l l y type 1, type 2, type 3, type 4, type 1 or 2, type 1 or 3, or type 2 or 3. There are other p o s s i b i l i t i e s but one need i n c l u d e only those combinations f o r which some s t a t i s t i c a l o v e r l a p o c c u r s . For example, i t i s not necessary to i n c l u d e c l a s s e s which could be water (type 4) and any other type s i n c e t h i s c o n f u s i o n seldom, i f e ver, occurred. F i g u r e s 4, 5, and 6 show t h i s more c l e a r l y . These three f i g u r e s i l l u s t r a t e the extent of the s t a t i s t i c a l o v e r l a p . a p r o b a b i l i t y e l l i p s o i d i s one way of showing the c l u s t e r of p o i n t s belonging to a p a r t i c u l a r c l a s s when the f e a t u r e v e c t o r has th r e e elements. The motivation f o r t h i s i n two dimensions (which e a s i l y can be extended to more) 2.7 New growth Figure 4 Band 5-6 Ellipses of Concentration 28 Figure 5 Band 5-7 E l l ip ses of Concentration Figure 6 Band 6-7 E l l ip ses of Concentration 30 i s to f i n d a geometric r e p r e s e n t a t i o n of the c o n c e n t r a t i o n of a given d i s t r i b u t i o n about the c e n t r e of g r a v i t y (ml, m2). In other words; one wants a curve e n c l o s i n g that p o i n t such t h a t i f a mass i s u n i f o r m l y d i s t r i b u t e d over the area bounded by the curve, i t w i l l have the same f i r s t and second moments as the given d i s t r i b u t i o n . In g e n e r a l t h a t curve i s undetermined so one i s r e s t r i c t e d to f i n d i n g an e l l i p s e with the r e g u i r e d p r o p e r t y . The e l l i p s e shows a contour of constant f(X,S)=p where p i n c r e a s e s as one gets c l o s e r to the c e n t r e . Given a c e r t a i n mean vector and c o v a r i a n c e matrix these e l l i p s o i d s t h e r e f o r e show the g e n e r a l d i s t r i b u t i o n of p o i n t s belonging to the c l a s s . The c l o s e r a p o i n t f a l l s to the centre of the e l l i p s o i d , the more l i k e l y i t i s an element of t h a t c l a s s . These e l l i p s o i d s are d e f i n e d by the eguation z l x.-,- x.x. = n+2 . . . |S| where n i s the dimension. In two dimensions t h i s i s 1 { (x - v t ) 2 - 2p(x -y,)(y -u a ) + (x - y » ) 2 =4 l ~ p l a 2 a,a a * 2 W H S R E P = oJa,al <see C r a n , e ^ However, three dimensions are somewhat d i f f i c u l t to p l o t so the p r o j e c t i o n s of the e l l i p s o i d s on the three axes were taken and t h a t i s what appears i n the f i g u r e s . They show r e s p e c t i v e l y the p r o j e c t i o n on the band 5-6 plane, the band 5-7 plane, and the band 6-7 plane. As i s r e a d i l y observable there i s much o v e r l a p among the 31 three types of f o r e s t cover, but e s p e c i a l l y between types 1 and 2 (second growth and o l d growth), and types 1 and 3 (second growth and r e c e n t l o g g i n g ) . However, the area represented by water i s completely separable from the o t h e r s , i n d i c a t i n g l i t t l e c o n f u s i o n i n c l a s s i f y i n g t h a t p a r t i c u l a r f e a t u r e . The t h r e s h o l d value k, f o r c l a s s i f y i n g i n t o the f o u r main c l a s s e s , was determined e x p e r i m e n t a l l y by c l a s s i f y i n g a c c o r d i n g to the above c r i t e r i a and examining the r e s u l t s . With too high a v a l u e , very few p o i n t s are placed i n c l a s s e s 1 through 4, although one c o u l d be almost c e r t a i n they belonged to the a s s i g n e d c l a s s . As the value i s lowered, the number of p o i n t s a s s i g n e d to the f i r s t f o u r groups i n c r e a s e s and the percentage c o r r e c t n e s s drops. A value of 0 f o r k would c l a s s i f y a l l p o i n t s i n t o the f o u r main c a t e g o r i e s alone, which i s j u s t what a s t r a i g h t p o i n t by p o i n t c l a s s i f i e r does. c l e a r l y i t i s d e s i r o u s to have enough p o i n t s i n the s t r o n g r e g i o n s (regions whose c l a s s i f i c a t i o n i s f a i r l y c e r t a i n ) to provide a good base f o r merging ambiguous ones, but at the same time keep the c o r r e c t n e s s of the p o i n t s so c l a s s i f i e d at a high l e v e l . The value of k was f i n a l l y set at 701, with a r e s u l t a n t i n i t i a l c l a s s i f i c a t i o n as f e l l o w s . 32 C l a s s i f i e d as C o r r e c t c l a s s i f i c a t i o n J%1 J 2 3 4 1 63. 1 27.1 9.6 0.2 2 13. 1 81.3 5.5 0 3 11.8 11.1 76.9 0. 1 4 1.5 0 3. 1 95.4 1 or 2 38.6 50.4 10.9 0. 1 1 or 3 45.1 14. 8 39.9 0. 1 Number i n each group: 1: 1474 2: 3550 3: 2693 4: 65 1 or 2: 1416 1 or 3: 802 Table I I I I n i t i a l C l a s s i f i c a t i o n with Ambiguous Regions In the i n i t i a l c l a s s i f i c a t i o n , of a l l the p o i n t s c l a s s i f i e d as being type 1, 63.IX were a c t u a l l y t h i s type (according to the ground-truth) , 27i15J were type 2, and 9.6% were type 3. O v e r a l l c o r r e c t n e s s f o r the p o i n t s c l a s s i f i e d i n the f o u r main groups was 77%. Once the i n i t i a l c l a s s i f i c a t i o n i s completed the r e g i o n s must be found, with each r e g i o n c o n t a i n i n g connected p o i n t s which were assigned to the same c l a s s . The r e g i o n f i n d e r i s a f a i r l y s t r a i g h t - f o r w a r d program which p r o v i d e s i n p u t to the merging program. I t begins i n the upper l e f t corner of the matrix of p o i n t s which are l a b e l e d by c l a s s number. The program then f o l l o w s the outer boundary of a l l p o i n t s with the 33 same c l a s s number u n t i l i t reaches the point where i t began. The l i s t of boundary elements i s w r i t t e n i n t o a f i l e f o r l a t e r i n p u t i n t o the re g i o n merger. As the program goes around the boundary of a r e g i o n i t a l s o e n t e r s a r e g i o n number i n t o a matrix which i t uses to keep t r a c k of where i t has been. A l l po i n t s w i t h i n the re g i o n are f i l l e d i n with the same region number and the program w i l l proceed to search f o r the next region by moving along the row i t i s p r o c e s s i n g u n t i l a p o i n t i s found which has no region number and t h e r e f o r e has not y e t been touched. T h i s i s continued u n t i l the whole matrix i s scanned, producing a l i s t of a l l the r e g i o n s i n the scene. The merging program i s then c a l l e d t o get r i d of the ambiguous regions i n the best p o s s i b l e way. Se v e r a l passes are used, each pass being s l i g h t l y l e s s s t r i n g e n t than i t s predecessor i n i t s c r i t e r i a f o r merging. In the f i r s t pass ambiguous r e g i o n s w i l l be merged i n t o unambiguous ones only i f a) the ambiguous r e g i o n r e c e i v e d the h i g h e s t p r o b a b i l i t y cf membership from the same c l a s s as the strong r e g i o n ; b) the weak common boundary between the two r e g i o n s i s g r e a t e r than some t h r e s h o l d (a weak boundary element w i l l have the r e l a t i v e d i f f e r e n c e s of each component of the f e a t u r e v e c t o r f o r both p o i n t s on the boundary l e s s t h a t some s t r e n g t h t h r e s h o l d ) ; c) i f the two r e g i o n s are merged, the new r e g i o n formed by merger must remain a strong r e g i o n (which means t h a t i t s hig h e s t p r o b a b i l i t y of 34 membership i n t o one of the f o u r groups must be gr e a t e r than k% of the t o t a l p r o b a b i l i t i e s , where k i s the same t h r e s h o l d as e s t a b l i s h e d above); and d) the average r e f l e c t a n c e values of the three s p e c t r a l bands used are c l o s e f o r both r e g i o n s . {Close here means t h a t the r e l a t i v e d i f f e r e n c e between each component of the average v e c t o r i s l e s s than some s t r e n g t h t h r e s h o l d ) . • The a c t u a l a c t i o n of the program i s as f o l l o w s . Two l i n k e d l i s t s of r e g i o n d e s c r i p t o r s are kept -the ambiguous r e g i o n s and the strong r e g i o n s . The program s t a r t s at the beginning of the ambiguous r e g i o n l i s t and works i t s way to the end f o r as many times as there are re g i o n s merged i n the process. The f i r s t t h i n g which i s done i s t o o b t a i n a l i s t of a l l the str o n g r e g i o n s which have some weak boundary elements i n common with the ambiguous r e g i o n . T h i s i s accomplished by t r a c i n g around the boundary of the r e g i o n , checking each p a i r of o p p o s i t e boundary p o i n t s to see i f the boundary between them i s weak. I f so, the number of weak elements i n common f o r the ambiguous r e g i o n and the str o n g r e g i o n to which the point belongs i s incremented by one. Once t h i s i s done f o r the complete boundary of the ambiguous r e g i o n , the program w i l l merge the ambiguous region with the strong r e g i o n which i s the most l i k e l y c a n d i d a t e . T h i s f a c t o r i s determined by the c l o s e n e s s of the average f e a t u r e v e c t o r s f o r the two r e g i o n s and a l s o by the percentage of the boundary which the two have i n common. For example, i f 35 two s t r o n g regions have average f e a t u r e v e c t o r s which are e g u a l l y c l o s e to that of the ambiguous r e g i o n , the one chosen f o r merging w i l l be the one which has a g r e a t e r percentage of weak boundary elements i n common. In t h i s way the strong r e g i o n which i s most l i k e the ambiguous one w i l l be chosen to merge. Hhen the a c t u a l merging i s done the program goes through the. matrix i n which each element r e f l e c t s the r e g i o n t o which the p a r t i c u l a r p o i n t belongs and changes a l l p o i n t s belonging to the ambiguous r e g i o n to now belong to the newly expanded s t r o n g r e g i o n . The boundary of the new r e g i o n a l s o i s determined by t r a c i n g i t out. As mentioned e a r l i e r , t h i s process con t i n u e s u n t i l no more r e g i o n s are merged i n a scan through the ambiguous region l i s t . A second pass i s then c a r r i e d out i n which the same c r i t e r i a f o r merging hold as i n pass one, with the e x c e p t i o n t h a t the f i r s t c r i t e r i o n i s r e l a x e d somewhat. Ihereas i n pass one a weak r e g i o n had to have i t s g r e a t e s t p r o b a b i l i t y of membership come from the same c l a s s as the s t r o n g r e g i o n , pass two r e g u i r e s only the highest or second h i g h e s t p r o b a b i l i t y to come from the same c l a s s as the str o n g r e g i o n . T h i s then w i l l merge any r e g i o n s which were not " c l o s e " enough t o a strong r e g i o n i n the f i r s t pass, but which are s t i l l f a i r l y c l o s e i n t h e i r average v e c t o r s to an adjacent s t r o n g r e g i o n . T h i s pass a l s o i s done so long as re g i o n s are merged. 36 R e s u l t s When a simple p o i n t by p o i n t c l a s s i f i c a t i o n method was used, p l a c i n g each p o i n t i n one of four c l a s s e s (no ambiguous c l a s s e s ) , a success r a t e of 711 was achieved as given below. C l a s s i f i e d as C o r r e c t c l a s s i f i c a t i o n 1 2 3 a 1 56.0 28.0 15. 7 0.2 2 16.7 77.2 6.0 0 3 13.5 11.5 74.9 0. 1 4 1.5 0 3. 1 95.4 Table IV I n i t i a l C l a s s i f i c a t i o n with no Ambiguous Regions The success f i g u r e may seem a l i t t l e low, but t h i s probably i s • a r e s u l t of the c o n s i d e r a b l e s t a t i s t i c a l o v e r l a p among the three types of f o r e s t cover. N e v e r t h e l e s s , the merging a l g o r i t h m i n c r e a s e d the accuracy to 79%, which r e p r e s e n t s a 28% decrease i n the e r r o r r a t e . A second p o s i t i v e r e s u l t was the s i m p l i f i c a t i o n of the f i n a l p i c t u r e over the p o i n t by p o i n t c l a s s i f i c a t i o n . T y p i c a l l y about 150 regions were found using t h a t technigue i n a p i c t u r e of 2500 p i x e l s , as shown i n F i g u r e 6, p l a c i n g each p o i n t i n one of the four main c a t e g o r i e s of o l d growth, second growth, re c e n t l o g g i n g , and water. F i g u r e 7 shows the r e s u l t s when the ambiguous r e g i o n s were i n c l u d e d i n the i n i t i a l 37 c l a s s i f i c a t i o n , r e s u l t i n g i n over 350 r e g i o n s . The f i n a l p a r t i t i o n , shown i n Fig u r e 8, r e s u l t e d i n about 70 r e g i o n s . The system c o n s i s t s of a number of programs mostly w r i t t e n i n ALGGLW-F with a few FOBTBAN s e r v i c e and p l o t programs. These programs are de s c r i b e d i n Appendi-x 1. The p o i n t by p o i n t c l a s s i f i c a t i o n f o r 2500 p i x e l s r e g u i r e s about 10 seconds of CPU time on the IBM 370/168. The merging program takes about 55 sec. In t o t a l and 150k of s t o r a g e , i n c l u d i n g the i n i t i a l c l a s s i f i c a t i o n and re g i o n f i n d i n g . These values i n c r e a s e s i g n i f i c a n t l y as the s i z e of the p i c t u r e being processed i n c r e a s e s s i n c e with more r e g i o n s the program w i l l have to spend more time s e a r c h i n g through l i s t s . T h e r e f o r e , i f a l a r g e p i c t u r e i s t o be processed i t would be a d v i s a b l e to break i t up i n t o s m a l l e r p i e c e s and process each one i n t u r n . Figure 7 I n i t i a l C l a s s i f i c a t i on with no Ambiguous Regions 39 • Second Growth (1) Old Growth (2) Recent Logging (3) JLJ Water (4) / Type 1 or 2 ^ Type 1 or 3 Ml Type 2 or 3 Figure 8 I n i t i a l C l a s s i f i c a t i on with Ambiguous Regions 40 Figure 9 Final Pa r t i t i on Af ter Merging 41 £2S£lu£ions Even with the minimal semantic knowledge employed, a s i g n i f i c a n t i n c r e a s e i n accuracy over the p o i n t by p o i n t method was achieved by the r e g i o n merger. One could o n l y hope f o r a g r e a t e r improvement i n areas where there i s more semantic knowledge a v a i l a b l e . For example, parks w i t h i n a c i t y would be c l a s s i f i e d , c o n t e x t - f r e e , as a g r i c u l t u r a l areas i n a l l l i k e l i h o o d , while the mere f a c t of t h e i r p o s i t i o n p r e c l u d e s t h i s p o s s i b i l i t y . For another example, a highway may become blended i n t o surrounding v e g e t a t i o n as i t becomes narrower than the area covered by one p i x e l , but i f the program knew that a road should be c o n t i n u i n g i n the a r e a , such p o i n t s c o u l d be i d e n t i f i e d c o r r e c t l y s i n c e the p i x e l s would be darker than the surrounding ones. Of course, the amount of semantic knowledge one i s able to i n c l u d e i s h e a v i l y dependent on the p i c t u r e domain being s t u d i e d as w e l l as the intended a p p l i c a t i o n , but one can only hope to in p r o v e accuracy with such technigues. Future LANDSAT s a t e l l i t e s should provide even b e t t e r performance s i n c e the r e s o l u t i o n w i l l be f i n e r , with each p i x e l c o v e r i n g l e s s area. As t h i s happens there i s l e s s chance of a p i x e l h i t t i n g two d i s t i n c t f e a t u r e s such as occurs when a road becomes narrower than the p i x e l s i z e . Of c o u r s e , with s m a l l e r p i x e l s there w i l l be much more data per u n i t area of the e a r t h ' s s u r f a c e and a corresponding i n c r e a s e i n computing time. However, i t would not always be necessary to look a t i n d i v i d u a l p i x e l s , but r a t h e r group them together i n t o l a r g e r u n i t s , going 42 t o p i x e l r e s o l u t i o n only near boundaries and other areas where accuracy demands. 43 Beyond the Present I f one were t o work on a p i c t u r e where there i s c o n s i d e r a b l e semantic knowledge the program should be improved g r e a t l y by i n f u s i n g some of t h i s knowledge. For example, i n mapping wheat f i e l d s on the p r a i r i e s one can be f a i r l y sure t h a t the f i e l d w i l l be r e c t a n g u l a r with s t r a i g h t boundaries. T h i s f a c t alone could be u s e f u l i n i n f l u e n c i n g the i n t e r p r e t a t i o n of ambiguous r e g i o n s w i t h i n an area known f a i r l y s u r e l y to be wheat. U n f o r t u n a t e l y , t h i s knowledge i s not always as easy to determine as i t i s when i n t e r p r e t i n g everyday scenes. In them f o r example, one might know th a t sky i s above the t r e e s , t r e e s are u s u a l l y found growing on grass or ground, and c a r s are found on roads. In many cases though, the l i t t l e semantic i n f o r m a t i o n i s known about the v a r i o u s elements of the scene covered by a s a t e l l i t e photo. P o s s i b l y a c l u s t e r a n a l y t i c approach may y i e l d s l i g h t l y b e t t e r • r e s u l t s than the s u p e r v i s e d maximum l i k e l i h o o d c l a s s i f i e r used. With c l u s t e r a n a l y s i s the data forms i t s own s t a t i s t i c a l c l u s t e r s while a s u p e r v i s e d approach f i r s t determines the c l a s s e s and then gathers the s t a t i s t i c s , which may form some u n n a t u r a l groupings. However, the semantic-based r e g i o n segmentation approach would be e q u a l l y as v a l i d f o r such an i n i t i a l c l a s s i f i c a t i o n . 44 1. B a j c s y , B., and T a v a k o l i , M. A computer r e c o g n i t i o n of b r i d g e s i s l a n d s , r i v e r s and l a k e s from s a t e l l i t e p i c t u r e s . Machine P r o c e s s i n g of gemotely Sensed Data, Purdue Univ. . (1973),~2a54-2a687 2. Becker, M.A., Freden, S.C. and Mercanti, E.P. (ed.) The T h i r d Earth Resources Technology S a t e l l i t g - 1 Symposium. Washington, D.C. (Dec. 1973) 3. Becker, M.A., Freden, S.C. and M e r c a n t i , E.P. (ed.) Symposium on S i g n i f i c a n t fiesuits Obtained from the E a r t h Resources Technology S a t e l l i t g - 1 . New C a r r o l t o n , Md. (Mar. , 1973) 4. B e r n s t e i n , fi. D i g i t a l image p r o c e s s i n g of Earth o b s e r v a t i o n sensor data. IBM J o u r n a l of Research and Development, 20, 1. (Jan. 1976), 40-57. ~ 5 . , B r i c e , C.R., and Fenema, C.I. Scene a n a l y s i s using r e g i o n s . A r t i f i c i a l I n t e l l i g e n c e 1, 3 (1970), 20 5-226. 6. Clough, D.J. P r e l i m i n a r y b e n e f i t / c o s t a n a l y s i s of Canadian s a t e l l i t e / a i r c r a f t remote sensing a p p l i c a t i o n s , groceedings of the F i r s t Canada.am Symposium on Rejote Sensing. Ottawa. ~(Feb~ 1972) , 3-28. 7. Clough, D.J., Henein, J.C. and M c Q u i l l a n , A.K. Case S t u d i e s of Canadian b e n e f i t s of remote s e n s i n g . Proceedings of the Second Canadian Symposium on Remote Sensing. Guelph, Ont. (May 1974)",~28-40.~ 8. Cramer, H. Mathematical Methods of S t a t i s t i c s . P r i n c e t o n U n i v e r s i t y P r e s s , " P r i n c e t o n N.57 (19667, 260-301. 9. E l l e f s e n , R., Swain P.H., and Hray, J.R. Urban land-use mapping by machine p r o c e s s i n g of L1NDSAT-1 m u l t i s p e c t r a l data: A San F r a n c i s c o Bay area example. Machine P r o c e s s i n g of Remotely Sensed Data, Purdue Univ. (1973), 2a7-2a22. 10. Feldman,'J.A., and Yakimovsky, Y. D e c i s i o n theory and A r t i f i c i a l I n t e l l i g e n c e : I . A semantics-based r e g i o n a n a l y z e r . A r t i f i c i a l I n t e l l i g e n c e 5 (1974), 34 9-371. 11. Friedman, J.D., Frank, D.G.,- P r e b l e , D. and P a i n t e r , J.E. Thermal s u r v e i l l a n c e of Cascade Eange volcanoes using EBTS-1 m u l t i s p e c t r a l scanner, a i r c r a f t imaging systems and ground-based data communication p l a t f o r m s . Symposium on S i g n i f i c a n t R e s u l t s Obtained from the E a r t h Resources Technology S a t e l l i t e - ! . New C a r r o l t o n , Md. ( l a r . 1973), 1549-1560. 12. Gerson, D.J. Computer e s t i m a t i o n of the presence of sea i c e 45 i n s a t e l l i t e p i c t u r e s . Computer Science T e c h n i c a l Report S e r i e s #366. Oniv. of Maryland. (Apr. 1975). 13. Gower, J.F.R., and D a n i e l , I . The use of ERTS-1 computer compatible tapes with r e s p e c t to marine r e s e a r c h . Unpublished Manuscript. Marine Sciences D i r e c t o r a t e , P a c i f i c Region, V i c t o r i a , B.C. (1974) . 14. Gupta, J.N., K e t t i g , R.L., Landgrebe, D.A., and Wintz, P.A. Machine boundary f i n d i n g and sample c l a s s i f i c a t i o n of remotely sensed a g r i c u l t u r a l data. Machine Pro c e s s i n g of Remotely Sensed Data, Purdue Univ. (1973), 4b25-4b35. 15. Hardy, E., and Anderson, J.R. A l a n d use c l a s s i f i c a t i o n system f o r use with remote-sensor data. Machine P r o c e s s i n g of Remotely Sensed Data, Purdue Univ. . (1973), 2a1-2a6. 16. H o f f e r , R.M. and LARS S t a f f . Technigues f o r computer aided a n a l y s i s of ERTS-1 data u s e f u l i n g e o l o g i c f o r e s t and water resource surveys. LARS Information Note 121073, Purdue Univ. (1973). 17. MacLeod, I.D. P i c t o r i a l output on a l i n e p r i n t e r . IEEE T r a n s a c t i o n s on Computers. V o l . C-19, No. 2. (Feb. 1970), 160-162. 18. Robertson, T.V. E x t r a c t i o n and c l a s s i f i c a t i o n of o b j e c t s i n m u l t i s p e c t r a l images. Machine p r o c e s s i n g of Remotely Sensed Data, Purdue Univ. (19737,~3b27-3b34. 19. S t e i n e r , R. A methodology f o r the automated p h o t o - i n t e r p r e t a t i o n of r u r a l l a n d use types. Automatic £fiiS£E£§£atign and C l a s s i f i c a t i o n o f Images, G r a s s e l i , A. (Ed.) Academic Press, New York, 1969. 20. Strome, H.M. Canadian ERTS data ha n d l i n g system. Canadian Comfiuter Conference Se s s i o n 72. (1972) 423101-423118 21. Tenenbaum, J.M. and fieyl, S. A r e g i o n - a n a l y s i s subsystem f o r i n t e r a c t i v e scene a n a l y s i s . S t a n f o r d Research I n s t i t u t e T e c h n i c a l Note 104 (1975). 22. Todd, W., Mausel, P., and Baumgardner, M.F. An a n a l y s i s of Milwaukee County l a n d use by machine-processing of ERTS data. Laboratory f o r A p p l i c a t i o n of Remote Sensing (LARS) Information Note 022773, Purdue Univ. (1973).. 23. Yakimovsky, Y., Scene a n a l y s i s u s i n g a semantic base f o r r e g i o n growing. S t a n f o r d A r t i f i c i a l I n t e l l i g e n c e Laboratory Memo AIM-209, S t a n f o r d Univ. (1973). 24. Zwarun, S. Keeping an eye on the crops. Macleans Magazine. (Feb. 23, 1976). 46 A^gendix J , : D e s c r i p t i o n of Programs Read From Tang T h i s FORTRAN program i s used t o read the data from a p o r t i o n of a LANDSAT tape to a f i l e AREA2. F i r s t , the tape must be mounted by running the program DLIB:DATA and s u p p l y i n g the d e s i r e d tape i d e n t i f i e r . The tape then w i l l be mounted under the pseudo d e v i c e name *DLIB*. The user i s then asked to supply a s t a r t i n g row and column f o r the p o r t i o n he wishes to save. I t a l s o asks f o r the s i z e of the p i c t u r e i n terms of the number of rows and columns. A s k i p value i s then reguested.. T h i s value i s used to i n d i c a t e the spacing t o be used when read i n g i n the data. For example, a value of one w i l l read i n every p o i n t of the a r e a , while a value of two reads i n every second p o i n t of every second row, and so f o r t h . No matter the value of t h i s v a r i a b l e though, i t does not a f f e c t the number of rows which are read i n . In other words, i f a s i z e of 100 rows was s p e c i f i e d , and a s k i p value of two was used, you would s t i l l have 100 rows of data i n p u t , which would represent 200 rows of the a c t u a l p i c t u r e . On the tape three f i l e s are f i r s t skipped. These f i l e s c o n t a i n i n f o r m a t i o n records which g e n e r a l l y are of l i t t l e concern to the user. D e t a i l s can be found i n the ERTS Data Osers Handbook, a v a i l a b l e f o r use i n the Data L i b r a r y . Then, the a p p r o p r i a t e number of records are skipped so that the tape i s p o s i t i o n e d at the c o r r e c t r e c o r d of the p i c t u r e . As a check, f o r the f i r s t r e c o r d of data read, the row 47 number from the tape i s printed. This value i s always found i n bytes 71 and 72 of the f i r s t record for the row. Bytes 107 and 108 hold a number which indicates the start of data i n each of the records for that row of the picture. Values previous to th i s position i n the record are a l l zero. For some reason, the values for band 6 occur two bytes e a r l i e r than band 5, while those of band 7 are 4 bytes e a r l i e r , so care must be taken to ensure that the correct data i s read i n for each pi x e l . The records for each of bands 5, 6, and 7 f i n a l l y are read into the LOGICAL*1 vectors PIXEL1, PIXEL2, PIXEL3 (using the MTS read routine) and the appropriate columns placed i n array BUF, which i s then output. Only 25 points are output to any l i n e of the f i l e AREA2 because of a l i m i t a t i o n of ALGCLW-F. Hhile - I could have put the data into a sequential f i l e of appropriate l i n e length, ALGOLW-F doesn't allow you to read or write a l i n e longer than 255 characters. 48 P r i n t P i c t u r e T h i s FORTRAN program produces an o v e r p r i n t e d c h a r a c t e r r e p r e s e n t a t i o n of a LANDSAT scene u s i n g data from band 6 s i n c e i t seems t o give the best • v i s u a l r e s u l t s . Because of the dimensions of a sguare on a page of computer output each p i x e l i s not- p r i n t e d or e l s e the p i c t u r e would appear longer than i t a c t u a l l y i s . More s p e c i f i c a l l y , only s i x rows are p r i n t e d per i n c h while ten columns are p r i n t e d i n the same space. T h e r e f o r e when p r i n t i n g the p i c t u r e , f o r every f i v e rows of a c t u a l data, only three w i l l be p r i n t e d . A l i m i t i s a l s o placed on the width of the p i c t u r e produced although t h i s c ould e a s i l y be changed. The p i c t u r e only can be 780 p i x e l s i n width, which r e s u l t s i n a maximum width of s i x pages of output s i n c e 130 are put on each page. F i r s t , the v a r i a b l e s ROWBEG, ROHEND, COLBEG, and COLEND are i n p u t t o d e f i n e the dimensions of the p i c t u r e to be p r i n t e d i n terms of the o r i g i n a l p i c t u r e ' s rows and columns. The program then goes i n t o a l o o p , c o v e r i n g f i v e rows of data at a time, but only p r o c e s s i n g every other one i n a c a l l t o GETLIN, a r o u t i n e which takes care of most of the work. GETLIN: T h i s r o u t i n e processes one l i n e of data. The f i r s t r e c o r d i s read from tape i n order to determine the s t a r t i n g p o i n t of the data (held i n bytes 107 and 108 of the f i r s t r e c o r d f o r each row of the p i c t u r e ) . For band 6, the s t a r t i n g p o i n t of the data i s f o u r l e s s than i n band 4, but the r e are always two counter bytes at the beginning of each H9 record so that explains the c a l c u l a t i o n of the actual starting position of the area of i n t e r e s t , indicated i n MSTABT. The data for band 5 i s skipped and then the band 6 record read into the array PIXEL. Each point of i n t e r e s t in this array i s then assigned a certain combination of overprinted characters by the routine ASSIGN, which places the characters in the array PICT. The data i s then printed out into separate f i l e s , one f i l e holding a s t r i p of the picture 130 characters wide. F i r s t , one l i n e i s printed with a "9" control character, which w i l l suppress page skips, followed by seven l i n e s with a n+ n control character to overprint. ASSIGN: This routine decides the overprinting to be done for a s p e c i f i c reflectance value given i n the variable SUM. The character printed for each value i s given i n the following table from MacLeod (17). Value Range Characters Percent of box covered > 125 AX'.HBVC 100 120-125 .XO'HBV 97 115-120 .XO'HB 93 109-114 .XO'HC 89 103-108 .-XC 85 97-102 . = +0» 79 91-96 . +0 • 67 85-90 +0' 64 77-84 0+ 60 73-78 0= 56 67-72 0- 53 61-66 M 45 55-60 A 42 49-54 X 40 43-48 Z 37 37-42 1 33 31-36 ) 29 25-30 + 25 19-24 — 22 13-18 - 15 1-12 0 Table V Overprinted Characters f o r each Reflectance Range 51 Get s t a t i s t i c s T h i s program, w r i t t e n i n ALGOLH-F, c a l c u l a t e s s t a t i s t i c s f o r the areas of the p i c t u r e which are to be the l e a r n i n g areas of ground-truth data i n order t o determine the s t a t i s t i c s f o r the p r o b a b i l i t y f u n c t i o n s . F i r s t , as u s u a l , the beginning row and column values f o r the data h e l d i n f i l e AREA2 must be i n p u t . The r e f l e c t a n c e data i s then read i n t o the a r r a y VALUES, three values per p i x e l . . Now the form of input data must be d e s c r i b e d . (The i n p u t i s i n a f i l e a t tached to SCABDS). The f i r s t row w i l l be the value of the type of the area which i s d e l i n e a t e d i n the f o l l o w i n g rows. T h i s value must be negative though, so the program can t e l l when i t gets t o the next type. Each row f o l l o w i n g the type value ( u n t i l the next negative number) c o n s i s t s of a row value, ( i n terms of the ground t r u t h data t h i s time, so the values w i l l range from 1 to the number of rows) f o l l o w e d by a s t a r t i n g and f i n i s h i n g column va l u e . The program simply goes to the i n d i c a t e d row of the ground t r u t h data and e x t r a c t s the p o i n t s from the given column range and keeps t r a c k of the t o t a l s f o r each p o s i t i o n of the f e a t u r e v e c t o r . T h i s i s done u n t i l a value of -5 i s encountered, i n d i c a t i n g the end of the data. At t h i s p o i n t the mean values are c a l c u l a t e d and the program goes back to the top of the i n p u t data to begin c a l c u l a t i n g the covariances and standard d e v i a t i o n s . T h i s i s accomplished by p u t t i n g a SCONTINUE HITH FILENAME statement at the end of the data f i l e . Once the c o v a r i a n c e s and standard d e v i a t i o n s have been 52 c a l c u l a t e d f o r a l l the c l a s s e s , they are output to a temporary f i l e (-STATS) i n the f o l l o w i n g order f o r each row. 1. Standard d e v i a t i o n of f i r s t element of f e a t u r e v e c t o r 2. Standard d e v i a t i o n of second element of f e a t u r e v e c t o r 3. Standard d e v i a t i o n of t h i r d element of f e a t u r e v e c t o r 4. Covariance between the f i r s t and second elements 5. Covariance between the f i r s t and t h i r d elements 6. Covariance between the second and t h i r d elements 7. Mean value of f i r s t element of f e a t u r e v e c t o r 8. Mean value of the second element 9. Mean value of the t h i r d element 53 E l l i p s e T h i s program, w r i t t e n i n FOHTRAN, i s used to draw the two dimensional p r o j e c t i o n s of the e l l i p s o i d s of c o n c e n t r a t i o n onto one of the t h r e e p o s s i b l e planes. The program f i r s t reads i n the s t a t i s t i c s d e f i n i n g the e l l i p s e s and begins at the l e f t s i d e and c a l c u l a t e s the two Y values f o r t h a t X value (or one i f the d e r i v a t i v e i s zero at that p o i n t ) . These values are kept i n an a r r a y POINTS with two Y values f o r each X value. I t steps moves to the r i g h t and r e p e a t s the o p e r a t i o n u n t i l the X value reaches i t s maximum. The equation f o r an e l l i p s e i n two space i s - i — {(x - n , ) 2 - 2p(x -u,)(y -u.) + (y - y 3 ) 2 } =4 1-p2 ? 2 of a, a 2 of which becomes x 2 - 2pxy_ + _y£ - 4 ( l - p 2 ) =0 a 2 o,a3 a 2 when centered at the o r i g i n . To f i n d the maximum X values the d e r i v a t i v e with r e s p e c t to Y i s taken and set to zero as 2px - 2py_ dx + 2y_ = 0 a, a 2 a, a2 dy a 2 dxi ( 2x - 2py ) = 2 Px - 2y dy a 2 a, ax a, a, a 2 d x = ( 2 £ x - ^ ) / ( 2 x - 2 p y _ ) = Q dy a, a2 a 2 a 2 a, at f o l l o w s . 2x dx -a 2 dy 54 T h i s i s then used to s o l v e f o r Y. 2 Px _ 2y a, a2 al y = g2px S u b s t i t u t i n g i n t o the o r i g i n a l equation to s o l v e f o r the minimum and maximum values of X. The s t a t i s t i c s must be i n a f i l e a t t a c h e d to u n i t 1 i n the program* with a number of rows corresponding to the number of c l a s s e s . Each row has the f o l l o w i n g data: Standard d e v i a t i o n of f i r s t v a r i a b l e Standard d e v i a t i o n of second v a r i a b l e Covariance between the two The mean of the f i r s t v a r i a b l e The mean of the second v a r i a b l e To p l o t the e l l i p s e s the program s t a r t s at the beginning of the array POINTS and moves to p o s i t i o n (POINTS ( 1 , 1 ), POINTS ( 1 , 2 )). I t then draws t o each subsequent p o i n t (POINTS ( i , 1 ) , POINTS (i, 2 ) ) u n t i l i t reaches the end. T h i s w i l l draw h a l f of the e l l i p s e so i t s t a r t s a t the end and draws back t o the beginning of the f i g u r e , j o i n i n g p o i n t s (POINTS ( i , 1) , POINTS ( i , 3)) s i n c e the t h i r d column of t h i s a r r a y t. 2 £ X ( £!£A ) + P 2 X 2 _ 4 ( 1 _ p 2 ) = o a* a} , 1 ^ = 4(l-p 2) x = ±2a, Ids the other y value f o r each x. 56 T h i s FOETBAN program d i s p l a y s three dimensional p r o b a b i l i t y e l l i p s o i d s of c o n c e n t r a t i o n on the Adage g r a p h i c s t e r m i n a l , These images may then be r o t a t e d by the user to give some i d e a of the s t a t i s t i c a l o v e r l a p among the c l a s s e s . The r o u t i n e s used are from UBC G (Three Dimensional Graphics S u b r o u t i n e s ) . Since only l i n e drawings can be d i s p l a y e d on the Adage i t i s i m p o s s i b l e to show s u r f a c e s . I n s t e a d , e q u a l l y spaced e l l i p s e s are drawn to g i v e the e f f e c t , as i f one were to take a c l e a r b a l l o o n and draw p a r a l l e l l i n e s on i t s s u r f a c e . The user i s asked to input the number of e l l i p s e s to draw per e l l i p s o i d , and the number of p o i n t s per e l l i p s e . Of course, the more p o i n t s used the longer i t takes but the smoother the p i c t u r e . The s t a t i s t i c s are read i n from the f i l e STATISTICS, c r e a t e d by the s t a t i s t i c s g a t h e r i n g program. The matrix SIGMA holds the covar i a n c e values while MO i s used to hold the average r e f l e c t a n c e v a l u e s . The v a r i a b l e s LAM11, LAM22, e t c . are the c o f a c t o r s of the i n d i c a t e d element of the c o v a r i a n c e matrix. For example, LAM23 i s the c o f a c t o r of the (2,3) p o s i t i o n of the matrix. DET holds the determinant f o r the matrix of each c l a s s . I n order to do the a c t u a l p l o t t i n g the program works as f o l l o w s . F i r s t , an e l l i p s o i d i s used with a centre of g r a v i t y a t the o r i g i n , s i n c e t h i s makes the eguations e a s i e r to work with. T h i s i s then t r a n s l a t e d t o the point (MU_X, M0_Y, MU_Z), which i s the a c t u a l c e n t r e . The minimum and maximum Z values are then e a s i l y c a l c u l a t e d as 5*DET/LAM33 and -5*DET/LAM33 57 r e s p e c t i v e l y . The program then moves along the Z-axis, drawing e l l i p s e s i n successive planes Z=Zval. The Procedures E l i j J s e : This subroutine c a l c u l a t e s an e l l i p s e i n the Z=Zval plane where Zval i s the current value of Z which i s passed to the r o u t i n e . F i r s t , the maximum and minimum X values are c a l c u l a t e d i n order to determine the s t a r t i n g and ending points f o r the e l l i p s e i n the X d i r e c t i o n . The o r i g i n a l eguation i s *„x2 + X y 2 + X z 2 + 2x xy + 2X xz + 2x yz = 5D where x - i s the c o f a c t o r of the ( i , j ) p o s i t i o n of the matrix and D i s the determinant. D i f f e r e n t i a t i n g with respect to Y gives 2X x dx + 2X y + 2X y + 2X z + 2X y dx = 0 " — 'V ax as >z — dy dy S o l v i n g f o r the d e r i v a t i v e and s e t t i n g i t to zero to get the extrema gives fy = ("V - v - \3z)/(x<.x + \.y) = 0 y= X„z + XJtx S u b s t i t u t i n g i n t o the o r i g i n a l eguation to f i n d the extreme values f o r X one gets x 2fX , - X 2)+x(2X ) 3z-X , , X„z)+ X^z 2-X 2z 2 - 5D=0 x u Xzi 58 which e a s i l y can be s o l v e d as a q u a d r a t i c equation. The program then s t a r t s a t the c a l c u l a t e d minimum value f o r X and c a l c u l a t e s the Y value at t h i s p o i n t . An increment i s added to the X value and the process repeated u n t i l X gets to the maximum value. At that point the e l l i p s e i s p l o t t e d , or r a t h e r placed i n t o the Adage b u f f e r . 59 I f i i t i a l C l a s s i f i e r T h i s program, w r i t t e n i n FORTBAN, i s used t o produce a f i l e where each p o i n t w i l l be a type number. A t h r e s h o l d value i s f i r s t read i n (from SCABDS) which w i l l determine how l a r g e the g r e a t e s t p r o b a b i l i t y must be i n order to c l a s s i f y a p o i n t i n t o one of the f i r s t f o u r c l a s s e s . I f a p o i n t r e c e i v e s i t s highest p r o b a b i l i t y of membership from c l a s s k, and t h i s p r o b a b i l i t y i s g r e a t e r that the t h r e s h o l d percentage of the t o t a l p r o b a b i l i t i e s , the po i n t i s ass i g n e d to the c l a s s . Otherwise i t i s assigned to a c l a s s depending on the two g r e a t e s t p r o b a b i l i t i e s . I f a point by p o i n t c l a s s i f i c a t i o n i s d e s i r e d , a t h r e s h o l d value of 0 would be used. The r e f l e c t a n c e values are f i r s t read i n t o the matrix VALS, ground-truth t o matrix TROTH, and the s t a t i s t i c s read i n from the f i l e STATISTICS. Then the program runs through each p o i n t of the p i c t u r e matrix, c a l c u l a t i n g the p r o b a b i l i t i e s i n matrix TEMP. T h i s i s s o r t e d and i f the highest p r o b a b i l i t y i s l a r g e enough the p o i n t w i l l be assigned t o one of the f o u r main c l a s s e s . Otherwise, the c l a s s number assigned w i l l depend on the highest and second highest p r o b a b i l i t y . At the end, a matrix o f c o r r e c t n e s s percentages i s p r i n t e d out. P o s i t i o n ( i , j ) of t h i s matrix i n d i c a t e s how many of the p o i n t s c l a s s i f i e d as type i are a c t u a l l y type j when compared to ground t r u t h data. Perhaps i t should be poi n t e d out t h a t the ground-truth data used had seven c l a s s i f i c a t i o n s which were merged i n t o f o u r . T h i s e x p l a i n s the o d d i t i e s of c a l c u l a t i n g these va l u e s . 60 GETS: T h i s s u b r o u t i n e c a l c u l a t e s the determinants and c o f a c t o r s f o r the f o u r c o v a r i a n c e matrices f o r use i n the p r o b a b i l i t y f u n c t i o n . F: T h i s i s the p r o b a b i l i t y f u n c t i o n which c a l c u l a t e s the p r o b a b i l i t y t h a t a p o i n t with r e f l e c t a n c e vector X belongs to c l a s s I. 61 IhS. Region F i n d e r The purpose of t h i s ALGOLW-F program i s to t r a c e out the re g i o n s contained i n the scene. The program begins i n the top l e f t c o r n e r of the matrix and runs around the boundary of the r e g i o n , which w i l l i n c l u d e a l l p o i n t s of the same type which are connected to one another. When the program r e t u r n s to the s t a r t i n g p o i n t t h i s r e g i o n i s f i n i s h e d so i t scans along the row u n t i l the next p o i n t which i t hasn't processed i s found. G l o b a l V a r i a b l e s BEGIN_R08, BEGIN_COI: These two v a r i a b l e s i n d i c a t e , i n terms of rows and columns of the a c t u a l p i c t u r e , the beginning row and column of the area of i n t e r e s t . NUMBEfi_OF_ROHS, NUMBER_QF_COLUHNS: These i n t e g e r s are the s i z e of the area of i n t e r e s t . VALUES: T h i s i s a STRING (1) a r r a y which c o n t a i n s the r e f l e c t a n c e values of each ' point i n the area. VALUES ( i , j , 1 ) i s be the r e f l e c t a n c e value of the p o i n t ( i , j ) i n band 5, VALUES(i,j,2) i s the value f o r band 6, and VALUES(i,j,3) f o r band 7. PICT: T h i s i n t e g e r matrix holds the p i c t u r e i n terms of the c l a s s type of each p o i n t . In other words, P I C T ( i , j ) i s the number of the c l a s s to which the p o i n t ( i , j ) has been as s i g n e d . REGIONS: REGIONS ( i , j) i s the number of the r e g i o n ( i n t e r n a l to the program) to which the po i n t ( i , j ) has been as s i g n e d . 62 The Procedures £I|jD.IEG,IONS: T h i s i s the c o n t r o l l i n g process. I t s t a r t s i n the upper l e f t corner and c a l l s FOLLOW_BO0NDARY t o get the boundary of the r e g i o n t o which t h a t p o i n t belongs. When i t r e t u r n s i t steps along the same row, asking each p o i n t i f i t i s i n a r e g i o n or not (accomplished by a c a l l to NOT_IN_A_REGI0N). When i t f i n d s one which i t hasn't processed yet, the process begins anew. At the end, a n u l l l i n e i s output to the f i l e REGIONS to i n d i c a t e the end of the l i s t of re g i o n d e s c r i p t o r s and the matrix REGIONS i s output'to the same f i l e . FOLLOW,BODNDAHY: Th i s procedure f o l l o w s the boundary of a r e g i o n . F i r s t i t moves to l e f t s i d e of the r e g i o n s i n c e the r o u t i n e always s t a r t s by having the f i r s t boundary vector going downwards. The r o u t i n e then e n t e r s a loop so lo n g as the Boolean v a r i a b l e FINISHED i s f a l s e . T h i s l o o p adds the present region number t o the point of the r e g i o n p r e s e n t l y being processed and c a l l s GET_NEW_POINT to f i n d the next p o i n t along the boundary and to determine the necessary boundary v e c t o r s . , POINT_I and POINT_J are the c o - o r d i n a t e s of the present p o i n t on the boundary, while BOUNDABY_POINT_I and BGONDABY_POINT_J are the c o - o r d i n a t e s of the end of the l a s t boundary vect o r . The d i f f e r e n c e i n these two values should perhaps be i l l u s t r a t e d . Imagine two adjacent p o i n t s of the same row, ( i , j-1) and ( i , j ) . The top of the v e r t i c a l boundary ve c t o r between these two would be (i#j) and the bottom 63 would be (i+1,j). One t h e r e f o r e can co n s i d e r two g r i d s i n v o l v e d , one which i s the a c t u a l p i x e l s , and another which l i e s between them, r e p r e s e n t i n g the boundary g r i d . But g e t t i n g back to the procedure, once the boundary has been completely t r a c e d FILI_IN_REGION i s c a l l e d to do j u s t t h a t . I t f i l l s i n a l l p o i n t s of the r e g i o n with the same re g i o n number (in the matrix REGIONS). F i n a l l y , the d e s c r i p t o r of the r e g i o n i s output i n a c a l l to ADD_TO_REGION_LIST. - I h i s procedure outputs the end of a boundary ve c t o r t o the f i l e BQUNDABYS and a d j u s t s the extreme p o i n t s of the r e g i o n i f necessary. ADD_TOau REGION ..LIST: T h i s outputs a d e s c r i p t o r of the r e g i o n to the f i l e REGIONS. MOVE_TQ_EBGE_0F_REGIGN: T h i s procedure, as one might imagine, moves from the present p o s i t i o n to the l e f t edge of the r e g i o n . T h i s i s done i n order t o begin the boundary with a v e r t i c a l v e c t o r o r i e n t e d downwards. £I2_HIJ«£0INT: Assuming the boundary t r a c e r i s p r e s e n t l y p o s i t i o n e d at P0INT_I and POINT_J, t h i s procedure w i l l move to the next p o i n t along the boundary and c r e a t e any boundary v e c t o r s necessary. T h i s i s done by c a l l i n g one of the procedures NEW_POINT_RIGHT, NE«_POINT_LEFT, NEW_POINT_UP, or NE«_POINT_DOWN, depending on the d i r e c t i o n the boundary t r a c e r i s p r e s e n t l y f o l l o w i n g . MW_E01JNT_UP: T h i s procedure determines the next p o i n t along the boundary, given t h a t the t r a c e r i s p r e s e n t l y heading 64 upwards. The other three r o u t i n e s are very s i m i l a r and w i l l not he d e s c r i b e d . There are four cases t o c o n s i d e r . F i r s t , the boundary may take a t u r n to the r i g h t ; second, i t may continue heading upwards; t h i r d , i t may be a t a r i g h t hand c o r n e r , i n which case i t turns to the l e f t ; and l a s t l y , there may be a s i n g u l a r p o i n t j u t t i n g above i t s neighbours. At s e v e r a l p o i n t s i t i s p o s s i b l e t h a t the boundary might reach i t s s t a r t i n g p o i n t , at which time the t r a c e r must cease. T h i s i s detected whenever the end of the boundary v e c t o r becomes egual t o (RCw_GBG, CCL_OBG). FILL_IN_BEGION: T h i s procedure f i l l s i n a l l i n t e r i o r p o i n t s of a r e g i o n with the same r e g i o n number ( i n the matrix BEGIONS). While the boundary f i n d e r was doing i t s job, a l l the boundary p o i n t s encountered were assigned the value of the r e g i o n number, so t h i s p r o v i d e s a s t a r t f o r f i l l i n g i n the i n t e r i o r p o i n t s . The method i s to scan a box formed by the extreme values of the r e g i o n . I f a p o i n t i s found which i s of the c o r r e c t type, and e i t h e r : the p o i n t to the top or to the l e f t belong to the c o r r e c t r e g i o n , then the point i s a l s o assigned t o the r e g i o n . However, some p o i n t s s t i l l may be missed i n t h i s scan so i t s t a r t s again a t the bottom cf the box and moves upwards i n the same manner. , A f t e r both scans the area and average f e a t u r e v e c t o r s are known f o r the r e g i o n . GETPICTORE; T h i s procedure reads i n the data necessary to 65 produce the r e g i o n s . T h i s data i s found i n the f i l e AREA, i n which the value of a p o i n t ( i , j ) w i l l be the c l a s s to which the point belongs. A l s o , the a c t u a l r e f l e c t a n c e values are read i n from f i l e AREA2 i n order t h a t the average r e f l e c t a n c e values f o r the r e g i o n s can be determined. F i r s t , an a p p r o p r i a t e number of rows are skipped i n both f i l e s s i n c e EEGIN_ROW, the beginning of the area of i n t e r e s t may not be the same as RQW_START, which i s the beginning row of the data held i n the f i l e . The program then reads i n 100 rows and columns of data. Any more than t h i s r e s u l t s i n a horrendously l a r g e number of r e g i o n s to process. ODTPDT_SEGION_MATRIX: T h i s i s used t o p r i n t out the matrix of r e g i o n s numbers i n the f i l e REGIONS. Again, i f there are more than 80 columns i n the matrix i t w i l l be necessary to con t i n u e on the next l i n e because of ALGQLW-F l i n e l e n g t h r e s t r i c t i o n s . ADD_TO_LIST: T h i s p r i n t s the arguments ROW and COL i n the f i l e BOONDARIS. Twelve p o i n t s per l i n e are allowed before moving to the next l i n e . 6 6 The Region Merger T h i s program, w r i t t e n i n ALGOIW-F, does the r e g i o n merging t o produce the f i n a l p a r t i t i o n of the scene i n v o l v e d . I w i l l f i r s t give a g e n e r a l d e s c r i p t i o n of the program and then a d e t a i l e d d e s c r i p t i o n of the a c t u a l procedures used. The b a s i c s t r u c t u r e s used i n the program begin with two l i n k e d l i s t s of r e g i o n s . One, c a l l e d STR0NG_BEGION_LIST, w i l l i n c l u d e a l l r e g i o n s whose i n t e r p r e t a t i o n i s f a i r l y c e r t a i n . The oth e r , the WEAK_REGION_LISI, w i l l hold a l l the ambiguous r e g i o n s . There are two types of r e g i o n merging a v a i l a b l e . The f i r s t steps through the STRONG_REGION_LIST, one r e g i o n at a time, and then goes through WEAK_8EGI0N_LIST s e a r c h i n g f c r the f i r s t r e g i o n which adheres to the merging c r i t e r i a of that p a r t i c u l a r pass. When i t f i n d s one, the two regions are merged and the WEAK_REGION_LIST has a l i n k removed. In the second type of merging, the r o u t i n e steps through the WEAK_REGION_LIST r a t h e r than the STRONG_REGION_LIST. The STRONG_REGIO»_LIST i s then searched f o r a l l r e g i o n s which are adj a c e n t to t h i s weak region and the best p o s s i b l e c h o i c e from these i s taken as the re g i o n to merge. One of these two processes i s completed f o r each pass f o r as many times as there are r e g i o n s being merged. As becomes c l e a r though, i f the number of r e g i o n s becomes too l a r g e the program spends most of i t s time l i s t s e a r c h i n g . T h e r e f o r e , i f a l a r g e p i c t u r e i s t o be processed i t would be b e t t e r t o d i v i d e i t i n t o s e c t i o n s and process each s e c t i o n i n t u r n . 67 G l o b a l V a r i a b l e s »UMBER_OF_BOWS, NUMBEE_OF_CCIUHNS: These are number of rows and columns i n the area being processed. BEGIN_BOW, BEGIN_COL: These two values are input from the f i l e produced by the re g i o n f i n d i n g program. They i n d i c a t e the row and column of the a c t u a l p i c t u r e matrix where the area of i n t e r e s t begins. CEBTAIN_THBESHOLD: T h i s r e a l v a r i a b l e i s the t h r e s h o l d f o r determining a stro n g r e g i o n . POINT: T h i s r e c o r d c o n t a i n s the row and column values of a p o i n t , along with a l i n k . f i e l d to c r e a t e a l i s t . The values BOW_F and COL_F (the F i s used to i n d i c a t e a f i e l d of a record) are STRING (1) v a r i a b l e s t o save c o r e , so the values must be l e s s than 255. BEGION: T h i s r e c o r d d e s c r i b e s the re g i o n s . The f i e l d s are as f o l l o w s . MIN_ROH_F, MAX_ROW_F, MIN_COL_F, MAX_COL_F: These values are the extreme p o i n t s of the r e g i o n . PERIM_F: T h i s i s the l e n g t h of the perimeter. TYPE_F: T h i s f i e l d holds the type of the r e g i o n . T h i s w i l l be an i n t e g e r value known to the user. In my case 1 repr e s e n t s second growth, 2 i s o l d growth, 3 i s recent l o g g i n g , 4 i s water, 5 i s type 1 or 2, 6 i s type 1 or 3, and 8 i s type 2 or 3 R0H_OBG_F, COL_OEG_F: These i n d i c a t e the row and column where the boundary o f the r e g i o n begins. REGION_NUMBER_F: T h i s f i e l d i s the number of the region 6 8 ( i n t e r n a l to the program}, ABEA_F: T h i s i s the number of p i x e l s i n the r e g i o n . BAND_5_AVG_F,_BAND_6_AVG_F, BAND_7_AVG_F: These values g i v e the averages of the three s p e c t r a l bands f o r the p o i n t s i n the r e g i o n . POINTEB_F: T h i s p o i n t s to a l i s t of vec t o r s which form the boundary of the r e g i o n . NEXT: T h i s i s the l i n k to the next r e g i o n i n the l i s t . REGIONS: T h i s i s a matrix which holds the r e g i o n number of each p o i n t . In other words, by l o o k i n g at the value of BEGIONS(i,j) you know t o what r e g i o n the p o i n t belongs. VALUES: T h i s i s a three dimensional array which holds the a c t u a l s p e c t r a l values f o r each p o i n t . VALUES{i,j, 1) i s the value of p o i n t ( i , j ) f o r band 5, VALUES ( i , j , 2 ) f o r band 6, and VALUES(i,j,3) f o r band 7. SIGMA: T h i s matrix holds the c o v a r i a n c e s f o r each c l a s s . The matrix has s i x columns s i n c e there are f o u r c l a s s e s . The elements are as f o l l o w s : SIGMA(i,1): The standard d e v i a t i o n cf the f i r s t element of the f e a t u r e vector f o r c l a s s i . SIGMA ( i , 2): The standard d e v i a t i o n f o r the second element. SIGMA ( i , 3 ) : The standard d e v i a t i o n f o r the t h i r d element. SIGMA ( i , 4 ) : The c o v a r i a n c e between the f i r s t and second elements. . SIGMA ( i , 5 ) : The co v a r i a n c e between the f i r s t and t h i r d 69 elements. SIGMA ( i , 6 ) : The c o v a r i a n c e between the second and t h i r d elements. MU: T h i s matrix holds the mean values of each element of the f e a t u r e v e c t o r f o r each of the main c l a s s e s . DET: T h i s a r r a y holds the determinant of the c o v a r i a n c e matrix f o r each c l a s s . S11, S12, S13, S22, S23, S33: These a r r a y s are the c o f a c t o r s of the i n d i c a t e d p o s i t i o n c f the c o v a r i a n c e matrix. For example, S23<i) i s the c o f a c t o r of the element (2,3) f o r the matrix of c l a s s i . GBOUND_TEUTH: T h i s matrix holds the ground-truth values f o r the area of i n t e r e s t . At present, t h i s matrix c o n s i s t s of one of seven numbers s i n c e the ground-truth data had seven c l a s s e s . Types one and two are the same as type one i n my program, types 3 and 4 were merged t o type 2, types 5 and 6 to type 3, and type 7 became type 4. TYPE: T h i s i s an a r r a y which holds the type of each r e g i o n . For example, TYPE(i) w i l l be the type of the region numbered i . T h i s i s done so t h a t the type can be g u i c k l y known, without having to go through the r e g i o n l i s t u n t i l the d e s i r e d r e g i o n i s found. T h i s a r r a y i s a STBING{1) v a r i a b l e s i n c e the number of types should never reach 255. At present room i s made f o r 500 r e g i o n s so i f the i n i t i a l p a r t i t i o n has more than t h i s the program w i l l have to be changed. 70 Th§ Procedures SI^ION^ELEMENT: T h i s l o g i c a l procedure determines whether or not the point at the arguments (EOS, COL) i s a member of REGIGN_NUMBEB1 or BEGI0N_NUMBEB2 (which are a l s o arguments). FILL_IN_REGION: T h i s procedure f i l l s i n the i n t e r i o r p o i n t s of a r e g i o n with the same re g i o n number. See the d e s c r i p t i o n i n the Region Finder f o r more d e t a i l s . FOLLOW_NEH_BOUNDASY: T h i s procedure t r a c e s and r e c o r d s the boundary of a newly merged r e g i o n . Given two region p o i n t e r s TEMPI and TEMP2, the r e s u l t a n t boundary w i l l be the boundary of the merged r e g i o n . The r o u t i n e begins at the extreme l e f t s i d e of the r e g i o n , determined by 1START and JSTABI. The f i r s t t h i n g to do i s c a l c u l a t e the average r e f l e c t a n c e values f o r the merged r e g i o n and determine the type from these values i n a c a l l to GET_TYPE, the v a r i a b l e SURE used to i n d i c a t e i f the new region i s a s t r o n g one. I f the r e g i o n i s s t r o n g ( i e . SURE i s t r u e ) , the second r e g i o n (TEMP2) i s removed from LIST2. (TEMP2_PREV i s the r e c o r d i n the l i s t which comes immediately before TEMP2, used f o r easy removal). Then the two boundary l i s t s are j o i n e d s i n c e the new boundary w i l l l i k e l y be longer than e i t h e r of the o l d ones, but has to be s h o r t e r than the sum of the two. FOLLCH_BOUNDARY i s then c a l l e d t o do the a c t u a l t r a c i n g of the boundary. F i n a l l y , the var i o u s d e s c r i p t o r s of the new re g i o n are e s t a b l i s h e d . 71 MEBGE_.fiEGION: This procedure does one cornplete pass of the simpler merging process. One outer loop continues u n t i l the pointer TEMP 1 i s n u l l by stepping through LIST1 one region at a time. TEMP2 then s t a r t s at the top of the second l i s t and a second loop i s entered so as long as TEMP2 i s not n u l l , or the f i r s t region (TEMPI) i s not removed i n a merge. If the two regions are mergeable (in theory), ADJACENT i s c a l l e d to see i f they have enough of th e i r boundaries i n common. If they have, FOLLGW_NEw_BOUNDARY i s c a l l e d to get the new boundary of the merged region. REMOVE: This routine removes the region TEMP2 from the l i s t headed by the reference variable LIST. I f the region i s already the head of the l i s t , the head i s changed to the next element, otherwise the l i n k f i e l d of the previous node i s changed to the l i n k f i e l d of the node being removed. ADJACENT: This l o g i c a l procedure determines i f the two regions denoted by LIST 1 and LIST2 are adjacent. This means that a certain number of boundary elements are not only adjacent i n the physical sense, but are also weak boundary vectors. In ether words the r e l a t i v e difference i n reflectance of the two pixels across the boundary i s less than STRENGTH_THRESHOLD. The procedure simply determines which region has the shortest perimeter and c a l l s ADJACENT2. This i s done since i t i s quicker to move around the outside of the smaller region. 72 ADJACENT2: T h i s procedure does most of the work i n determining adjacency. I t begins by s e t t i n g the r e f e r e n c e v a r i a b l e POINTER to the f i r s t p o i n t of the boundary l i s t of the r e g i o n TEHP1. Two b a s i c p o i n t s are used f o r the boundary v e c t o r , (11, J1) being one end and (12, J2) the other. I f the vector l i e s o u t s i d e of the extreme p o i n t s of the second r e g i o n i t o b v i o u s l y can't be a d j a c e n t , so nothing i s done. Otherwise ADJACENT3 i s c a l l e d s i n c e adjacency i s now p o s s i b l e . I t w i l l r e t u r n with v a r i a b l e s INTERSECT and WEAK_CODNT p o s s i b l y changed. INTERSECT w i l l be the number of boundary v e c t o r s the two r e g i o n have i n common. HEAK_CODNT i s the number of these v e c t o r s which are "weak", acc o r d i n g t o the p r e v i o u s l y s t a t e d d e f i n i t i o n of the term. I f the WEAK^COUNT becomes g r e a t e r than a c e r t a i n percentage of the perimeter (the percentage i s h e l d i n the v a r i a b l e PHAGCCYTE_THRESHCLD) the two r e g i o n s are s a i d to be adjacent. ADJACENT3; T h i s procedure determines i f the boundary vector given i t (passed i n parameters 11, J1, 12, and J2) i s adjacent to the region given i n TEMP2. The f i r s t t h i n g to do i s to determine the two p i x e l s on opposite s i d e s of t h i s boundary v e c t o r . To do t h i s an example might be u s e f u l . Suppose one has two p o i n t s , ( i , j ) and the point to i t s r i g h t ( i , j+1). The top of the v e r t i c a l boundary between them i s ( i , j+1) and the bottom i s ( i + 1, j+1) . T h e r e f o r e , given any boundary v e c t o r , the p i x e l s on e i t h e r s i d e can be 7 3 c a l c u l a t e d i n a simple f a s h i o n . I f the p o i n t o p p o s i t e the v e c t o r i s an element of the d e s i r e d r e g i o n i t i s then checked f o r weakness. I f not weak, t h i s v e c t o r i s not adjacent to t d e s i r e d r e g i o n . MEBGJE2: T h i s procedure pro v i d e s the second type of r e g i o n merging. The two l i s t s used are passed to the procedure i n the v a r i a b l e s IIST1 and 1IST2. (LIST2 g e n e r a l l y w i l l be the WEAK_REGION_LIST). The procedure e n t e r s a loop which w i l l look at each element of LIST2 i n t u r n , s e a r c h i n g f o r a l l the r e g i o n s i n IIST1 which are adjacent to i t . T h i s i s done by going through the boundary l i s t s of the r e g i o n s and r e c o r d i n g each region which i s adjacent and keeping t r a c k of the number of adjacent p o i n t s i n the a r r a y INTERSECT. INTERSECT(i) w i l l then ho l d the number of adjacent p o i n t s which the r e g i o n TEMP2 has i n common with the region numbered i . Once t h i s i s done f o r the complete boundary l i s t of TEMP2 a l l adjacent r e g i o n s are known. T h i s l i s t i s then checked t o get the r e g i o n which i s most l i k e the weak r e g i o n and has the g r e a t e s t boundary i n common. T h i s r e g i o n w i l l be the one to merge with. NQ_CHANGE_IN_INTERPRE T h i s l o g i c a l procedure takes as parameters two regions and w i l l r e t u r n TRUE i f the r e g i o n TEMPI w i l l not have i t s i n t e r p r e t a t i o n changed i f i t were merged with r e g i o n TEMP2, and FALSE otherwise. READ_IN_REGION: T h i s procedure does some of the i n p u t f o r the program. F i r s t , the necessary s t a t i s t i c s are read i n from 74 a f i l e STATISTICS, which i s c r e a t e d by the program GET_SIATS. The form of the data i n t h i s f i l e i s i n f o u r rows, one f o r each c l a s s , where each row has the f o l l o w i n g i n f o r m a t i o n i n the order i n d i c a t e d . Standard d e v i a t i o n of the f i r s t element of the f e a t u r e v e c t o r . Standard d e v i a t i o n of the second element of the f e a t u r e v e c t o r . Standard d e v i a t i o n of the t h i r d element of the f e a t u r e v e c t o r . Covariance between the f i r s t and second elements. Covariance between the f i r s t and t h i r d elements. Covariance between the second and t h i r d elements. Mean value c f the f i r s t element. Mean value of the second element. Mean value of the t h i r d element. Once these are read i n the four determinants are c a l c u l a t e d , as w e l l as the s i x c o f a c t o r s . (There are onl y s i x because the matrix i s symmetric). Next, the regions and t h e i r boundaries are read i n . T h i s data i s h e l d i n the two f i l e s REGIONS and BOUNDABYS, which are f i l l e d by the r e g i o n f i n d e r . I f the type of the region read i n i s l e s s than 5 i t w i l l be added t o the STBONG_BEGION_LIST, otherwise i t i s added to the HEAK_BEGION_LIST. AD_IN_BOUNDABY: Th i s procedure reads i n the boundary of a r e g i o n , c r e a t i n g a l i n k e d l i s t of the po i n t s as i t does so. The form of the boundary f i l e has twelve p o i n t s per row. 75 each poi n t c o n s i s t i n g of a row and column value. BEAD^IN^JEGIONS: T h i s procedure i s used to read i n the matrix of r e g i o n numbers and ground t r u t h v a l u e s . The r e g i o n number matrix i s found at the end of the f i l e REGIONS; the ground t r u t h i n the f i l e GROOND_TRUTH. The constant FIRST_R0W_OF_DATA i n d i c a t e s the row of the a c t u a l p i c t u r e which i s the f i r s t row of the data i n the f i l e AREA. T h i s i s done s i n c e I had no d e s i r e to read from tape every time I wanted to use the program. Instead, another program i s used to read i n an n by m matrix of p i x e l s from a c e r t a i n area of the p i c t u r e . BEGIN_ROW i s the beginning row f o r the area of i n t e r e s t . I f i t i s not the same as FIRST_ROw_OF_DATA, the c o r r e c t number of l i n e s must be skipped to get to the c o r r e c t s t a r t i n g l i n e . A s i m i l a r f u n c t i o n i s c a r r i e d out f o r the columns. READ_IN .DATA: T h i s r o u t i n e reads i n the a c t u a l r a d i a n c e values from a f i l e c a l l e d AREA2. Because of ALG0L8-F l i n e l e n g t h r e s t r i c t i o n s only 25 p o i n t s are held per l i n e of the f i l e , each p o i n t having three r e f l e c t a n c e values which may have up to t h r e e d i g i t s . The data i s placed i n the matrix STRING (1)' a r r ay VALUES. ZQiiOW_EOUNDARY: T h i s procedure and the ones i t c a l l s are i d e n t i c a l t o those i n the r e g i o n f i n d e r and w i l l not be de s c r i b e d here. GET_TYPE: T h i s procedure r e t u r n s the type of the po i n t (or region) whose radiance values are given i n the parameters X1, X2, and X3. The f u n c t i o n F, which c a l c u l a t e s the 76 p r o b a b i l i t y of a point with f e a t u r e v e c t o r (X1, X2, X3) belonging to c l a s s i , i s c a l l e d f o r a l l four c l a s s e s . The values obtained are s o r t e d and TYPE i s s e t to the number of the : c l a s s which gives the point the highest p r o b a b i l i t y of membership; F: T h i s r e a l procedure c a l c u l a t e s the p r o b a b i l i t y t h a t a p o i n t with r a d i a n c e v e c t o r (X1, X2, X3) belongs to c l a s s i . I t i s dene'according to the formula given p r e v i o u s l y . MIL.IG_BO0NJDARY: T h i s procedure simply adds a po i n t t o the boundary l i s t and changes the extreme p o i n t s of a r e g i o n i f necessary. flEBGEABIiE: T h i s l o g i c a l procedure determines i f the two re g i o n s TEMP 1 and TEMP2 are mergeable i n theory. In pass one, they are mergeable i f the two types are the same and the region TEMP 1 w i l l not change i t s i n t e r p r e t a t i o n i f they were merged. In pass two, e i t h e r the types are the same or the second most l i k e l y type of TEMP2 i s the same as the type of TEMPI. A l s o , the average r e f l e c t a n c e value d i f f e r e n c e must be l e s s than the STBENGTH_THRESHOLD and once a g a i n , r e g i o n TEMPI must not change i n t e r p r e t a t i o n i f they were merged. In pass t h r e e , only the d i f f e r e n c e i n average r e f l e c t a n c e f o r the two r e g i o n s have to be l e s s than the STB£NGTH_THRESHOLD. OOTPyT^MERGED^REGIOMS: T h i s i s a procedure used f o r debugging. I f the user wishes to see d e s c r i p t o r s of the f i n a l r e g i o n s , t h i s procedure w i l l place them i n a temporary f i l e named -MERGED-BEGIONS. 77 GgAPH.REGIONS: I f a graph i s of the p i c t u r e i s c a l l e d f o r t h i s procedure draws the o u t l i n e of every r e g i o n i n the l i s t i t i s g i v e n , 1 2 £ _ M f £ I ! I I 2 _ M R K I N G S : T h i s procedure adds the markings f o r the d i f f e r e n t types of r e g i o n s , as w e l l as adding boxes f o r the legend; I f the v a r i a b l e PLOT has the value one, i n d i c a t i n g only a graph of the o r i g i n a l p i c t u r e i s d e s i r e d , a l l legends are drawn, otherwise only the legends f o r the f o u r main c l a s s e s are done. No t e x t i s drawn though, s i n c e the g u a l i t y of t e x t from the p l o t t e r i s l e s s than p l e a s i n g . To p l a c e the markings on the graph, the procedure simply looks at each point of the p i c t u r e and c a l l s ADD_J3ARKINGS with the type of r e g i o n to which the p o i n t belongs. SQUARE: T h i s w i l l draw a sguare at (x, y) with a s i z e of SCALE; 4fiD_MABRINGS: T h i s adds an i d e n t i f y i n g mark to the p o i n t (x, y) which has type value given i n TYPE. At present o n l y seven types are a v a i l a b l e . Type four w i l l have a c r o s s drawn; type t h r e e has a l i n e from top l e f t t o bottom r i g h t ; type two i s blank; type one has a dot i n the c e n t r e ; type s i x , which i s e i t h e r type one or t h r e e , gets an X shape while type e i g h t , which i s e i t h e r type two or t h r e e , has both the above markings. PJ.OT^ENIJ: T h i s must be c a l l e d t o end a p l o t c o r r e c t l y . PLOT: T h i s i s the FORTRAN p l o t r o u t i n e . See OBC PLOT f o r d e t a i l s . START_REGION: T h i s moves to the p o i n t ( i , j ) i n the matrix. Those matrix values are not very u s e f u l to the p l o t t e r though, so they must be transformed i n t o C a r t e s i a n c o - o r d i n a t e s . §8APH_REGIONi T h i s procedure c a l l s GRAPH f o r each p o i n t i n the boundary l i s t given i t . GRAPH: T h i s c a l c u l a t e s the C a r t e s i a n . c o - o r d i n a t e s of the matrix p o i n t ( i , j ) and draws a l i n e to t h a t p o i n t . COMPARE,HITH.TROE.pATA: In order t o e s t a b l i s h the c o r r e c t n e s s of the f i n a l p a r t i t i o n , t h i s r o u t i n e i s c a l l e d . For example, i f p o i n t ( i , j ) belongs t o a region of type k i n the f i n a l p a r t i t i o n , and i s of type m i n the ground-truth d a t a , p o s i t i o n (k,m) i n the matrix CORRECT i s incremented. I d e a l l y , k and m would be e g u a l , i n d i c a t i n g the ground-truth agrees with the c a l c u l a t e d r e s u l t s . The matrix i s then output i n a percentage form. The Main Program: F i r s t the data i s read i n and the parameters f o r the program e s t a b l i s h e d . I f p l o t t i n g i s to be done f o r the o r i g i n a l p i c t u r e i t i s done before s t a r t i n g the merge. Then the merging process i s done f o r the f i r s t two passes f o r so long as r e g i o n s continue to be merged. (This i s i n d i c a t e d by the l o g i c a l f l a g SWITCH). For the t h i r d pass the WEAK_REGION_LIST i s done away with by j o i n i n g i t to the STRGNG_REGION_LIST a f t e r changing the type of each r e g i o n to be the type of the c l a s s which g i v e s i t the h i g h e s t p r o b a b i l i t y of membership. T h i s i s done s i n c e the f i n a l pass i s r a t h e r weak because most of the weak r e g i o n s should have been merged by then. 

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.831.1-0051777/manifest

Comment

Related Items